Quantum Mechanics 1-3. Basic Concepts, Tools, and Applications; Angular Momentum, Spin, and Approximation; Fermions, Bosons, Photons, Correlations, and Entanglement [2 ed.] 9783527345533, 9783527345540, 9783527345557

English Pages 2354 [2425] Year 2019


Table of contents:
Claude Cohen-Tannoudji, Bernard Diu, Frank Laloë - Quantum Mechanics Volume 1. 2nd Edition. Wiley (2019)
Cover
Title Page
Copyright Page
Directions for Use
Foreword
Acknowledgments
Volume I
Table of contents
Chapter I. Waves and particles. Introduction to the fundamental ideas of quantum mechanics
A. Electromagnetic waves and photons
A-1. Light quanta and the Planck-Einstein relations
A-2. Wave-particle duality
A-3. The principle of spectral decomposition
B. Material particles and matter waves
B-1. The de Broglie relations
B-2. Wave functions. Schrödinger equation
C. Quantum description of a particle. Wave packets
C-1. Free particle
C-2. Form of the wave packet at a given time
C-3. Heisenberg relations
C-4. Time evolution of a free wave packet
D. Particle in a time-independent scalar potential
D-1. Separation of variables. Stationary states
D-2. One-dimensional “square” potentials. Qualitative study
COMPLEMENTS OF CHAPTER I, READER’S GUIDE
Complement AI Order of magnitude of the wavelengths associated with material particles
Complement BI Constraints imposed by the uncertainty relations
1. Macroscopic system
2. Microscopic system
Complement CI Heisenberg relation and atomic parameters
Complement DI An experiment illustrating the Heisenberg relations
Complement EI A simple treatment of a two-dimensional wave packet
1. Introduction
2. Angular dispersion and lateral dimensions
3. Discussion
Complement FI The relationship between one- and three-dimensional problems
1. Three-dimensional wave packet
1-a. Simple case
1-b. General case
2. Justification of one-dimensional models
Complement GI One-dimensional Gaussian wave packet: spreading of the wave packet
1. Definition of a Gaussian wave packet
2. Calculation of ΔX and ΔP; uncertainty relation
3. Evolution of the wave packet
3-a. Calculation of ψ(x,t)
3-b. Velocity of the wave packet
3-c. Spreading of the wave packet
Complement HI Stationary states of a particle in one-dimensional square potentials
1. Behavior of a stationary wave function
1-a. Regions of constant potential energy
1-b. Behavior of ϕ(x ) at a potential energy discontinuity
1-c. Outline of the calculation
2. Some simple cases
2-a. Potential steps
2-b. Potential barriers
2-c. Bound states: square well potential
Complement JI Behavior of a wave packet at a potential step
1. Total reflection: E < V0
2. The {|p>} representation
2-a. The P operator and functions of P
2-b. The R operator and functions of R
2-c. The Schrödinger equation in the {|p>} representation
Complement EII Some general properties of two observables, Q and P, whose commutator is equal to iħ
1. The operator S(λ): definition, properties
2. Eigenvalues and eigenvectors of Q
2-a. Spectrum of Q
2-b. Degree of degeneracy
2-c. Eigenvectors
3. The {|q>} representation
3-a. The action of Q in the {|q>} representation
3-b. The action of S(λ) in the {|q>} representation; the translation operator
3-c. The action of P in the {|q>} representation
4. The {|p>} representation. The symmetric nature of the P and Q observables
Complement FII The parity operator
1. The parity operator
1-a. Definition
1-b. Simple properties of Π
1-c. Eigensubspaces of Π
2. Even and odd operators
2-a. Definitions
2-b. Selection rules
2-c. Examples
2-d. Functions of operators
3. Eigenstates of an even observable B+
4. Application to an important special case
Complement GII An application of the properties of the tensor product: the two-dimensional infinite well
1. Definition; eigenstates
2. Study of the energy levels
2-a. Ground state
2-b. First excited states
2-c. Systematic and accidental degeneracies
Complement HII Exercises
Chapter III. The postulates of quantum mechanics
A. Introduction
B. Statement of the postulates
B-1. Description of the state of a system
B-2. Description of physical quantities
B-3. The measurement of physical quantities
B-4. Time evolution of systems
B-5. Quantization rules
C. The physical interpretation of the postulates concerning observables and their measurement
C-1. The quantization rules are consistent with the probabilistic interpretation of the wave function
C-2. Quantization of certain physical quantities
C-3. The measurement process
C-4. Mean value of an observable in a given state
C-5. The root mean square deviation
C-6. Compatibility of observables
D. The physical implications of the Schrödinger equation
D-1. General properties of the Schrödinger equation
D-2. The case of conservative systems
E. The superposition principle and physical predictions
E-1. Probability amplitudes and interference effects
E-2. Case in which several states can be associated with the same measurement result
COMPLEMENTS OF CHAPTER III, READER’S GUIDE
Complement AIII Particle in an infinite potential well
1. Distribution of the momentum values in a stationary state
1-a. Calculation of the function φn(p), of ⟨P⟩ and of ΔP
1-b. Discussion
2. Evolution of the particle’s wave function
2-a. Wave function at the instant t
2-b. Evolution of the shape of the wave packet
2-c. Motion of the center of the wave packet
3. Perturbation created by a position measurement
Complement BIII Study of the probability current in some special cases
1. Expression for the current in constant potential regions
2. Application to potential step problems
2-a. Case where E > V0
2-b. Case where E < V0
3. Probability current of incident and evanescent waves, in the case of reflection from a two-dimensional potential step
Complement CIII Root mean square deviations of two conjugate observables
1. The Heisenberg relation for P and Q
2. The “minimum” wave packet
Complement DIII Measurements bearing on only one part of a physical system
1. Calculation of the physical predictions
2. Physical meaning of a tensor product state
3. Physical meaning of a state that is not a tensor product
Complement EIII The density operator
1. Outline of the problem
2. The concept of a statistical mixture of states
3. The pure case. Introduction of the density operator
3-a. Description by a state vector
3-b. Description by a density operator
3-c. Properties of the density operator in a pure case
4. A statistical mixture of states (non-pure case)
4-a. Definition of the density operator
4-b. General properties of the density operator
4-c. Populations; coherences
5. Use of the density operator: some applications
5-a. System in thermodynamic equilibrium
5-b. Separate description of part of a physical system. Concept of a partial trace
Complement FIII The evolution operator
1. General properties
2. Case of conservative systems
Complement GIII The Schrödinger and Heisenberg pictures
Complement HIII Gauge invariance
1. Outline of the problem: scalar and vector potentials associated with an electromagnetic field; concept of a gauge
2. Gauge invariance in classical mechanics
2-a. Newton’s equations
2-b. The Hamiltonian formalism
3. Gauge invariance in quantum mechanics
3-a. Quantization rules
3-b. Unitary transformation of the state vector; form invariance of the Schrödinger equation
3-c. Invariance of physical predictions under a gauge transformation
Complement JIII Propagator for the Schrödinger equation
1. Introduction
2. Existence and properties of a propagator K(2,1)
2-a. Existence of a propagator
2-b. Physical interpretation of K(2,1)
2-c. Expression for K(2,1) in terms of the eigenstates of H
2-d. Equation satisfied by K(2,1)
3. Lagrangian formulation of quantum mechanics
3-a. Concept of a space-time path
3-b. Decomposition of K(2,1) into a sum of partial amplitudes
3-c. Feynman’s postulates
3-d. The classical limit and Hamilton’s principle
Complement KIII Unstable states. Lifetime
1. Introduction
2. Definition of the lifetime
3. Phenomenological description of the instability of a state
Complement LIII Exercises
Complement MIII Bound states in a “potential well” of arbitrary shape
1. Quantization of the bound state energies
2. Minimum value of the ground state energy
Complement NIII Unbound states of a particle in the presence of a potential well or barrier of arbitrary shape
1. Transmission matrix M(k)
1-a. Definition of M(k)
1-b. Properties of M(k)
2. Transmission and reflection coefficients
3. Example
Complement OIII Quantum properties of a particle in a one-dimensional periodic structure
1. Passage through several successive identical potential barriers
1-a. Notation
1-b. Matching conditions
1-c. Iteration matrix Q(α)
1-d. Eigenvalues of Q(α)
2. Discussion: the concept of an allowed or forbidden energy band
2-a. Behavior of the wave function φα(x)
2-b. Bragg reflection; possible energies for a particle in a periodic potential
3. Quantization of energy levels in a periodic potential; effect of boundary conditions
3-a. Conditions imposed on the wave function
3-b. Allowed energy bands: stationary states of the particle inside the lattice
3-c. Forbidden bands: stationary states localized on the edges
Chapter IV. Application of the postulates to simple cases: spin 1/2 and two-level systems
A. Spin 1/2 particle: quantization of the angular momentum
A-1. Experimental demonstration
A-2. Theoretical description
B. Illustration of the postulates in the case of a spin 1/2
B-1. Actual preparation of the various spin states
B-2. Spin measurements
B-3. Evolution of a spin 1/2 particle in a uniform magnetic field
C. General study of two-level systems
C-1. Outline of the problem
C-2. Static aspect: effect of coupling on the stationary states of the system
C-3. Dynamical aspect: oscillation of the system between the two unperturbed states
COMPLEMENTS OF CHAPTER IV, READER’S GUIDE
Complement AIV The Pauli matrices
1. Definition; eigenvalues and eigenvectors
2. Simple properties
3. A convenient basis of the 2x2 matrix space
Complement BIV Diagonalization of a 2x2 Hermitian matrix
1. Introduction
2. Changing the eigenvalue origin
3. Calculation of the eigenvalues and eigenvectors
3-a. Angles Φ and φ
3-b. Eigenvalues of K
3-c. Eigenvalues of H
3-d. Normalized eigenvectors of H
Complement CIV Fictitious spin 1/2 associated with a two-level system
1. Introduction
2. Interpretation of the Hamiltonian in terms of fictitious spin
3. Geometrical interpretation of the various effects discussed in § C of Chapter IV
3-a. Fictitious magnetic fields associated with H0, W and H
3-b. Effect of coupling on the eigenvalues and eigenvectors of the Hamiltonian
3-c. Geometrical interpretation of P12(t)
Complement DIV System of two spin 1/2 particles
1. Quantum mechanical description
1-a. State space
1-b. Complete sets of commuting observables
1-c. The most general state
2. Prediction of the measurement results
2-a. Measurements bearing simultaneously on the two spins
2-b. Measurements bearing on one spin alone
Complement EIV Spin 1/2 density matrix
1. Introduction
2. Density matrix of a perfectly polarized spin (pure case)
3. Example of a statistical mixture: unpolarized spin
4. Spin 1/2 at thermodynamic equilibrium in a static field
5. Expansion of the density matrix in terms of the Pauli matrices
Complement FIV Spin 1/2 particle in a static and a rotating magnetic field: magnetic resonance
1. Classical treatment; rotating reference frame
1-a. Motion in a static field; Larmor precession
1-b. Influence of a rotating field; resonance
2. Quantum mechanical treatment
2-a. The Schrödinger equation
2-b. Changing to the rotating frame
2-c. Transition probability; Rabi’s formula
2-d. Case where the two levels are unstable
3. Relation between the classical treatment and the quantum mechanical treatment: evolution of
4. Bloch equations
4-a. A concrete example
4-b. Solution in the case of a rotating field
Complement GIV A simple model of the ammonia molecule
1. Description of the model
2. Eigenfunctions and eigenvalues of the Hamiltonian
2-a. Infinite potential barrier
2-b. Finite potential barrier
2-c. Evolution of the molecule. Inversion frequency
3. The ammonia molecule considered as a two-level system
3-a. The state space
3-b. Energy levels. Removal of the degeneracy due to the transparency of the potential barrier
3-c. Influence of a static electric field
Complement HIV Effects of a coupling between a stable state and an unstable state
1. Introduction. Notation
2. Influence of a weak coupling on states of different energies
3. Influence of an arbitrary coupling on states of the same energy
Complement JIV Exercises
Chapter V. The one-dimensional harmonic oscillator
A. Introduction
A-1. Importance of the harmonic oscillator in physics
A-2. The harmonic oscillator in classical mechanics
A-3. General properties of the quantum mechanical Hamiltonian
B. Eigenvalues of the Hamiltonian
B-1. Notation
B-2. Determination of the spectrum
B-3. Degeneracy of the eigenvalues
C. Eigenstates of the Hamiltonian
C-1. The {|φn>} representation
C-2. Wave functions associated with the stationary states
D. Discussion
D-1. Mean values and root mean square deviations of X and P in a state |φn>
D-2. Properties of the ground state
D-3. Time evolution of the mean values
COMPLEMENTS OF CHAPTER V, READER’S GUIDE
Complement AV Some examples of harmonic oscillators
1. Vibration of the nuclei of a diatomic molecule
1-a. Interaction energy of two atoms
1-b. Motion of the nuclei
1-c. Experimental observations of nuclear vibration
2. Vibration of the nuclei in a crystal
2-a. The Einstein model
2-b. The quantum mechanical nature of crystalline vibrations
3. Torsional oscillations of a molecule: ethylene
3-a. Structure of the ethylene molecule C2H4
3-b. Classical equations of motion
3-c. Quantum mechanical treatment
4. Heavy muonic atoms
4-a. Comparison with the hydrogen atom
4-b. The heavy muonic atom treated as a harmonic oscillator
4-c. Order of magnitude of the energies and spread of the wave functions
Complement BV Study of the stationary states in the {|x>} representation. Hermite polynomials
1. Hermite polynomials
1-a. Definition and simple properties
1-b. Generating function
1-c. Recurrence relations; differential equation
1-d. Examples
2. The eigenfunctions of the harmonic oscillator Hamiltonian
2-a. Generating function
2-b. φn(x) in terms of the Hermite polynomials
2-c. Recurrence relations
Complement CV Solving the eigenvalue equation of the harmonic oscillator by the polynomial method
1. Changing the function and the variable
2. The polynomial method
2-a. The asymptotic form of φ(x)
2-b. The calculation of h(x) in the form of a series expansion
2-c. Quantization of the energy
2-d. Stationary wave functions
Complement DV Study of the stationary states in the {|p>} representation
1. Wave functions in momentum space
1-a. Changing the variable and the function
1-b. Determination of φn(p)
1-c. Calculation of the phase factor
2. Discussion
Complement EV The isotropic three-dimensional harmonic oscillator
1. The Hamiltonian operator
2. Separation of the variables in Cartesian coordinates
3. Degeneracy of the energy levels
Complement FV A charged harmonic oscillator in a uniform electric field
1. Eigenvalue equation of H'(E) in the {|x>} representation
2. Discussion
2-a. Electrical susceptibility of an elastically bound electron
2-b. Interpretation of the energy shift
3. Use of the translation operator
Complement GV Coherent “quasi-classical” states of the harmonic oscillator
1. Quasi-classical states
1-a. Introducing the parameter α to characterize a classical motion
1-b. Conditions defining quasi-classical states
1-c. Quasi-classical states are eigenvectors of the operator a
2. Properties of the |α> states
2-a. Expansion of |α> on the basis of the stationary states |φn>
2-b. Possible values of the energy in an |α> state
2-c. Calculation of ⟨X⟩, ⟨P⟩, ΔX and ΔP in an |α> state
2-d. The operator D(α): the wave functions ψα(x)
2-e. The scalar product of two |α> states. Closure relation
3. Time evolution of a quasi-classical state
3-a. A quasi-classical state always remains an eigenvector of a
3-b. Evolution of physical properties
3-c. Motion of the wave packet
4. Example: quantum mechanical treatment of a macroscopic oscillator
Complement HV Normal vibrational modes of two coupled harmonic oscillators
1. Vibration of the two coupled oscillators in classical mechanics
1-a. Equations of motion
1-b. Solving the equations of motion
1-c. The physical meaning of each of the modes
1-d. Motion of the system in the general case
2. Vibrational states of the system in quantum mechanics
2-a. Commutation relations
2-b. Transformation of the Hamiltonian operator
2-c. Stationary states of the system
2-d. Evolution of the mean values
References and suggestions for further reading:
Complement JV Vibrational modes of an infinite linear chain of coupled harmonic oscillators; phonons
1. Classical treatment
1-a. Equations of motion
1-b. Simple solutions of the equations of motion
1-c. Normal variables
1-d. Total energy and energy of each of the modes
2. Quantum mechanical treatment
2-a. Stationary states in the absence of coupling
2-b. Effects of the coupling
2-c. Normal operators. Commutation relations
2-d. Stationary states in the presence of coupling
3. Application to the study of crystal vibrations: phonons
3-a. Outline of the problem
3-b. Normal modes. Speed of sound in the crystal
Complement KV Vibrational modes of a continuous physical system. Application to radiation; photons
1. Outline of the problem
2. Vibrational modes of a continuous mechanical system: example of a vibrating string
2-a. Notation. Dynamical variables of the system
2-b. Classical equations of motion
2-c. Introduction of the normal variables
2-d. Classical Hamiltonian
2-e. Quantization
3. Vibrational modes of radiation: photons
3-a. Notation. Equations of motion
3-b. Introduction of the normal variables
3-c. Classical Hamiltonian
3-d. Quantization
References and suggestions for further reading:
Complement LV One-dimensional harmonic oscillator in thermodynamic equilibrium at a temperature T
1. Mean value of the energy
1-a. Partition function
1-b. Calculation of ⟨H⟩
2. Discussion
2-a. Comparison with the classical oscillator
2-b. Comparison with a two-level system
3. Applications
3-a. Blackbody radiation
3-b. Bose-Einstein distribution law
3-c. Specific heats of solids at constant volume
4. Probability distribution of the observable X
4-a. Definition of the probability density p(x)
4-b. Calculation of p(x)
4-c. Discussion
4-d. Bloch’s theorem
Complement MV Exercises
Chapter VI. General properties of angular momentum in quantum mechanics
A. Introduction: the importance of angular momentum
B. Commutation relations characteristic of angular momentum
B-1. Orbital angular momentum
B-2. Generalization: definition of an angular momentum
B-3. Statement of the problem
C. General theory of angular momentum
C-1. Definitions and notation
C-2. Eigenvalues of J2 and Jz
C-3. “Standard” {|k, j, m>} representations
D. Application to orbital angular momentum
D-1. Eigenvalues and eigenfunctions of L2 and Lz
D-2. Physical considerations
COMPLEMENTS OF CHAPTER VI, READER’S GUIDE
Complement AVI Spherical harmonics
1. Calculation of spherical harmonics
1-a. Determination of Yll (θ, φ)
1-b. General expression for Ylm (θ, φ)
1-c. Explicit expressions for l=0,1 and 2
2. Properties of spherical harmonics
2-a. Recurrence relations
2-b. Orthonormalization and closure relations
2-c. Parity
2-d. Complex conjugation
2-e. Relation between the spherical harmonics and the Legendre polynomials and associated Legendre functions
Complement BVI Angular momentum and rotations
1. Introduction
2. Brief study of geometrical rotations R
2-a. Definition. Parametrization
2-b. Infinitesimal rotations
3. Rotation operators in state space. Example: a spinless particle
3-a. Existence and definition of rotation operators
3-b. Properties of rotation operators
3-c. Expression for rotation operators in terms of angular momentum observables
4. Rotation operators in the state space of an arbitrary system
4-a. System of several spinless particles
4-b. An arbitrary system
5. Rotation of observables
5-a. General transformation law
5-b. Scalar observables
5-c. Vector observables
6. Rotation invariance
6-a. Invariance of physical laws
6-b. Consequence: conservation of angular momentum
6-c. Applications
Complement CVI Rotation of diatomic molecules
1. Introduction
2. Rigid rotator. Classical study
2-a. Notation
2-b. Motion of the rotator. Angular momentum and energy
2-c. The fictitious particle associated with the rotator
3. Quantization of the rigid rotator
3-a. The quantum mechanical state and observables of the rotator
3-b. Eigenstates and eigenvalues of the Hamiltonian
3-c. Study of the observable Z
4. Experimental evidence for the rotation of molecules
4-a. Heteropolar molecules. Pure rotational spectrum
4-b. Homopolar molecules. Raman rotational spectra
Complement DVI Angular momentum of stationary states of a two-dimensional harmonic oscillator
1. Introduction
1-a. Review of the classical problem
1-b. The problem in quantum mechanics
2. Classification of the stationary states by the quantum numbers nx and ny
2-a. Energies; stationary states
2-b. Hxy does not constitute a C.S.C.O. in Exy
3. Classification of the stationary states in terms of their angular momenta
3-a. Significance and properties of the operator Lz
3-b. Right and left circular quanta
3-c. Stationary states of well-defined angular momentum
3-d. Wave functions associated with the eigenstates common to Hxy and Lz
4. Quasi-classical states
4-a. Definition of the states |αx, αy> and |αr, αl>
4-b. Mean values and root mean square deviations of the various observables
Complement EVI A charged particle in a magnetic field: Landau levels
1. Review of the classical problem
1-a. Motion of the particle
1-b. The vector potential. The classical Lagrangian and Hamiltonian
1-c. Constants of the motion in a uniform field
2. General quantum mechanical properties of a particle in a magnetic field
2-a. Quantization. Hamiltonian
2-b. Commutation relations
2-c. Physical consequences
3. Case of a uniform magnetic field
3-a. Eigenvalues of the Hamiltonian
3-b. The observables in a particular gauge
3-c. The stationary states
3-d. Time evolution
Complement FVI Exercises
Chapter VII. Particle in a central potential. The hydrogen atom
A. Stationary states of a particle in a central potential
A-1. Outline of the problem
A-2. Separation of variables
A-3. Stationary states of a particle in a central potential
B. Motion of the center of mass and relative motion for a system of two interacting particles
B-1. Motion of the center of mass and relative motion in classical mechanics
B-2. Separation of variables in quantum mechanics
C. The hydrogen atom
C-1. Introduction
C-2. The Bohr model
C-3. Quantum mechanical theory of the hydrogen atom
C-4. Discussion of the results
COMPLEMENTS OF CHAPTER VII, READER’S GUIDE
Complement AVII Hydrogen-like systems
1. Hydrogen-like systems with one electron
1-a. Electrically neutral systems
1-b. Hydrogen-like ions
2. Hydrogen-like systems without an electron
2-a. Muonic atoms
2-b. Hadronic atoms
Complement BVII A soluble example of a central potential: the isotropic three-dimensional harmonic oscillator
1. Solving the radial equation
2. Energy levels and stationary wave functions
Complement CVII Probability currents associated with the stationary states of the hydrogen atom
1. General expression for the probability current
2. Application to the stationary states of the hydrogen atom
2-a. Structure of the probability current
2-b. Effect of a magnetic field
Complement DVII The hydrogen atom placed in a uniform magnetic field. Paramagnetism and diamagnetism. The Zeeman effect
1. The Hamiltonian of the problem. The paramagnetic term and the diamagnetic term
1-a. Expression for the Hamiltonian
1-b. Order of magnitude of the various terms
1-c. Interpretation of the paramagnetic term
1-d. Interpretation of the diamagnetic term
2. The Zeeman effect
2-a. Energy levels of the atom in the presence of a magnetic field
2-b. Electric dipole oscillations
2-c. Frequency and polarization of emitted radiation
Complement EVII Some atomic orbitals. Hybrid orbitals
1. Introduction
2. Atomic orbitals associated with real wave functions
2-a. orbitals (l=1)
2-b. orbitals (l=1)
2-c. Other values of l
3. sp hybridization
3-a. Introduction of sp hybrid orbitals
3-b. Properties of sp hybrid orbitals
3-c. Example: the structure of acetylene
4. sp2 hybridization
4-a. Introduction of sp2 hybrid orbitals
4-b. Properties of sp2 hybrid orbitals
4-c. Example: the structure of ethylene
5. sp3 hybridization
5-a. Introduction of sp3 hybrid orbitals
5-b. Properties of sp3 hybrid orbitals
5-c. Example: The structure of methane
Complement FVII Vibrational-rotational levels of diatomic molecules
1. Introduction
2. Approximate solution of the radial equation
2-a. The zero angular momentum states (l=0)
2-b. General case (l any positive integer)
2-c. The vibrational-rotational spectrum
3. Evaluation of some corrections
3-a. More precise study of the form of the effective potential Veff(r)
3-b. Energy levels and wave functions of the stationary states
3-c. Interpretation of the various corrections
Complement GVII Exercises
Index
EULA
Claude Cohen-Tannoudji, Bernard Diu, Frank Laloë - Quantum Mechanics Volume 2. 2nd Edition. Wiley (2019)
Cover
Title Page
Copyright Page
Directions for Use
Foreword
VOLUME II
Table of contents
Chapter VIII. An elementary approach to the quantum theory of scattering by a potential
A Introduction
A-1. Importance of collision phenomena
A-2. Scattering by a potential
A-3. Definition of the scattering cross section
A-4. Organization of this chapter
B. Stationary scattering states. Calculation of the cross section
B-1. Definition of stationary scattering states
B-2. Calculation of the cross section using probability currents
B-3. Integral scattering equation
B-4. The Born approximation
C. Scattering by a central potential. Method of partial waves
C-1. Principle of the method of partial waves
C-2. Stationary states of a free particle
C-3. Partial waves in the potential V(r)
C-4. Expression of the cross section in terms of phase shifts
COMPLEMENTS OF CHAPTER VIII, READER’S GUIDE
Complement AVIII The free particle: stationary states with well-defined angular momentum
1. The radial equation
2. Free spherical waves
2-a. Recurrence relations
2-b. Calculation of free spherical waves
2-c. Properties
3. Relation between free spherical waves and plane waves
Complement BVIII Phenomenological description of collisions with absorption
1. Principle involved
2. Calculation of the cross sections
2-a. Elastic scattering cross section
2-b. Absorption cross section
2-c. Total cross section. Optical theorem
Complement CVIII Some simple applications of scattering theory
1. The Born approximation for a Yukawa potential
1-a. Calculation of the scattering amplitude and cross section
1-b. The infinite-range limit
2. Low energy scattering by a hard sphere
3. Exercises
3-a. Scattering of the p wave by a hard sphere
3-b. “Square spherical well”: bound states and scattering resonances
Chapter IX. Electron spin
A. Introduction of electron spin
A-1. Experimental evidence
A-2. Quantum description: postulates of the Pauli theory
B. Special properties of an angular momentum 1/2
C. Non-relativistic description of a spin 1/2 particle
C-1. Observables and state vectors
C-2. Probability calculations for a physical measurement
COMPLEMENTS OF CHAPTER IX, READER’S GUIDE
Complement AIX Rotation operators for a spin 1/2 particle
1. Rotation operators in state space
1-a. Total angular momentum
1-b. Decomposition of rotation operators into tensor products
2. Rotation of spin states
2-a. Explicit calculation of the rotation operators in
2-b. Operator associated with a rotation through an angle of 2π
2-c. Relationship between the vectorial nature of S and the behavior of a spin state upon rotation
3. Rotation of two-component spinors
Complement BIX Exercises
Chapter X. Addition of angular momenta
A. Introduction
A-1. Total angular momentum in classical mechanics
A-2. The importance of total angular momentum in quantum mechanics
B. Addition of two spin 1/2’s. Elementary method
B-1. Statement of the problem
B-2. The eigenvalues of Sz and their degrees of degeneracy
B-3. Diagonalization of S2
B-4. Results: triplet and singlet
C. Addition of two arbitrary angular momenta. General method
C-1. Review of the general theory of angular momentum
C-2. Statement of the problem
C-3. Eigenvalues of J2 and Jz
C-4. Common eigenvectors of J2 and Jz
COMPLEMENTS OF CHAPTER X, READER’S GUIDE
Complement AX Examples of addition of angular momenta
1. Addition of j1 = 1 and j2 = 1
1-a. The subspace Ԑ(J = 2)
1-b. The subspace Ԑ(J = 1)
1-c. The vector |J = 0, M = 0>
2. Addition of an integral orbital angular momentum l and a spin 1/2
2-a. The subspace ɛ(J = l + 1/2)
2-b. The subspace ɛ(J = l - 1/2)
Complement BX Clebsch-Gordan coefficients
1. General properties of Clebsch-Gordan coefficients
1-a. Selection rules
1-b. Orthogonality relations
1-c. Recurrence relations
2. Phase conventions. Reality of Clebsch-Gordan coefficients
2-a. The coefficients: phase of the ket |J, J>
2-b. Other Clebsch-Gordan coefficients
3. Some useful relations
3-a. The signs of some coefficients
3-b. Changing the order of j1 and j2
3-c. Changing the sign of M ,m1 and m2
3-d. The coefficients
Complement CX Addition of spherical harmonics
1. The functions ΦMJ (Ω1; Ω2)
2. The functions Fml (Ω)
3. Expansion of a product of spherical harmonics; the integral of a product of three spherical harmonics
Complement DX Vector operators: the Wigner-Eckart theorem
1. Definition of vector operators; examples
2. The Wigner-Eckart theorem for vector operators
2-a. Non-zero matrix elements of V in a standard basis
2-b. Proportionality between the matrix elements of J and V inside a subspace Ԑ(k, j)
2-c. Calculation of the proportionality constant; the projection theorem
3. Application: calculation of the Landé gJ factor of an atomic level
3-a. Rotational degeneracy; multiplets
3-b. Removal of the degeneracy by a magnetic field; energy diagram
Complement EX Electric multipole moments
1. Definition of multipole moments
1-a. Expansion of the potential on the spherical harmonics
1-b. Physical interpretation of multipole operators
1-c. Parity of multipole operators
1-d. Another way to introduce multipole moments
2. Matrix elements of electric multipole moments
2-a. General expression for the matrix elements
2-b. Selection rules
Complement FX Two angular momenta J1 and J2 coupled by an interaction aJ1 · J2
1. Classical review
1-a. Equations of motion
2. Quantum mechanical evolution of the average values ⟨J1⟩ and ⟨J2⟩
2-a. Calculation of d⟨J1⟩/dt and d⟨J2⟩/dt
2-b. Discussion
3. The special case of two spin 1/2’s
3-a. Stationary states of the two-spin system
3-b. Calculation of S1 (t)
3-c. Discussion. Polarization of the magnetic dipole transitions
4. Study of a simple model for the collision of two spin 1/2 particles
4-a. Description of the model
4-b. State of the system after collision
4-c. Discussion. Correlation introduced by the collision
Complement GX Exercises
Chapter XI. Stationary perturbation theory
A. Description of the method
A-1. Statement of the problem
A-2. Approximate solution of the H (λ ) eigenvalue equation
B. Perturbation of a non-degenerate level
B-1. First-order corrections
B-2. Second-order corrections
C. Perturbation of a degenerate state
COMPLEMENTS OF CHAPTER XI, READER’S GUIDE
Complement AXI A one-dimensional harmonic oscillator subjected to a perturbing potential in x, x2, x3
1. Perturbation by a linear potential
1-a. The exact solution
1-b. The perturbation expansion
2. Perturbation by a quadratic potential
3. Perturbation by a potential in x3
3-a. The anharmonic oscillator
3-b. The perturbation expansion
3-c. Application: the anharmonicity of the vibrations of a diatomic molecule
Complement BXI Interaction between the magnetic dipoles of two spin 1/2 particles
1. The interaction Hamiltonian W
1-a. The form of the Hamiltonian W. Physical interpretation
1-b. An equivalent expression for W
1-c. Selection rules
2. Effects of the dipole-dipole interaction on the Zeeman sublevels of two fixed particles
2-a. Case where the two particles have different magnetic moments
2-b. Case where the two particles have equal magnetic moments
2-c. Example: the magnetic resonance spectrum of gypsum
3. Effects of the interaction in a bound state
Complement CXI Van der Waals forces
1. The electrostatic interaction Hamiltonian for two hydrogen atoms
1-a. Notation
1-b. Calculation of the electrostatic interaction energy
2. Van der Waals forces between two hydrogen atoms in the 1s ground state
2-a. Existence of a -C/R6 attractive potential
2-b. Approximate calculation of the constant C
3. Van der Waals forces between a hydrogen atom in the 1s state and a hydrogen atom in the 2p state
3-a. Energies of the stationary states of the two-atom system. Resonance effect
3-b. Transfer of the excitation from one atom to the other
4. Interaction of a hydrogen atom in the ground state with a conducting wall
Complement DXI The volume effect: the influence of the spatial extension of the nucleus on the atomic levels
1. First-order energy correction
1-a. Calculation of the correction
1-b. Discussion
2. Application to some hydrogen-like systems
2-a. The hydrogen atom and hydrogen-like ions
2-b. Muonic atoms
Complement EXI The variational method
1. Principle of the method
1-a. A property of the ground state of a system
1-b. Generalization: the Ritz theorem
1-c. A special case where the trial functions form a subspace
2. Application to a simple example
2-a. Exponential trial functions
2-b. Rational wave functions
3. Discussion
Complement FXI Energy bands of electrons in solids: a simple model
1. A first approach to the problem: qualitative discussion
2. A more precise study using a simple model
2-a. Calculation of the energies and stationary states
2-b. Discussion
Complement GXI A simple example of the chemical bond: the H2+ ion
1. Introduction
1-a. General method
1-b. Notation
1-c. Principle of the exact calculation
2. The variational calculation of the energies
2-a. Choice of the trial kets
2-b. The eigenvalue equation of the Hamiltonian H in the trial ket vector subspace ϝ
2-c. Overlap, Coulomb and resonance integrals
2-d. Bonding and antibonding states
3. Critique of the preceding model. Possible improvements
3-a. Results for small R
3-b. Results for large R
4. Other molecular orbitals of the H2+ ion
4-a. Symmetries and quantum numbers. Spectroscopic notation
4-b. Molecular orbitals constructed from the 2p atomic orbitals
5. The origin of the chemical bond; the virial theorem
5-a. Statement of the problem
5-b. Some useful theorems
5-c. The virial theorem applied to molecules
5-d. Discussion
Complement HXI Exercises
Chapter XII. An application of perturbation theory: the fine and hyperfine structure of hydrogen
A. Introduction
B. Additional terms in the Hamiltonian
B-1. The fine-structure Hamiltonian
B-2. Magnetic interactions related to proton spin: the hyperfine Hamiltonian
C. The fine structure of the n = 2 level
C-1. Statement of the problem
C-2. Matrix representation of the fine-structure Hamiltonian Wf inside the n= 2 level
C-3. Results: the fine structure of the n = 2 level
D. The hyperfine structure of the n = 1 level
D-1. Statement of the problem
D-2. Matrix representation of Whf in the 1s level
D-3. The hyperfine structure of the 1s level
E. The Zeeman effect of the 1s ground state hyperfine structure
E-1. Statement of the problem
E-2. The weak-field Zeeman effect
E-3. The strong-field Zeeman effect
E-4. The intermediate-field Zeeman effect
COMPLEMENTS OF CHAPTER XII, READER’S GUIDE
Complement AXII The magnetic hyperfine Hamiltonian
1. Interaction of the electron with the scalar and vector potentials created by the proton
2. The detailed form of the hyperfine Hamiltonian
2-a. Coupling of the magnetic moment of the proton with the orbital angular momentum of the electron
2-b. Coupling with the electron spin
3. Conclusion: the hyperfine-structure Hamiltonian
Complement BXII Calculation of the average values of the fine-structure Hamiltonian in the 1s, 2s and 2p states
1. Calculation of the average values
B-1. The Schrödinger equation in the {|φn>} representation
B-2. Perturbation equations
B-3. Solution to first order in λ
C. An important special case: a sinusoidal or constant perturbation
C-1. Application of the general equations
C-2. Sinusoidal perturbation coupling two discrete states: the resonance phenomenon
C-3. Coupling with the states of the continuous spectrum
D. Random perturbation
D-1. Statistical properties of the perturbation
D-2. Perturbative computation of the transition probability
D-3. Validity of the perturbation treatment
E. Long-time behavior for a two-level atom
E-1. Sinusoidal perturbation
E-2. Random perturbation
E-3. Broadband optical excitation of an atom
COMPLEMENTS OF CHAPTER XIII, READER’S GUIDE
Complement AXIII Interaction of an atom with an electromagnetic wave
1. The interaction Hamiltonian. Selection rules
1-a. Fields and potentials associated with a plane electromagnetic wave
1-b. The interaction Hamiltonian at the low-intensity limit
1-c. The electric dipole Hamiltonian
1-d. The magnetic dipole and electric quadrupole Hamiltonians
2. Non-resonant excitation. Comparison with the elastically bound electronmodel
2-a. Classical model of the elastically bound electron
2-b. Quantum mechanical calculation of the induced dipole moment
2-c. Discussion. Oscillator strength
3. Resonant excitation. Absorption and induced emission
3-a. Transition probability associated with a monochromatic wave
3-b. Broad-line excitation. Transition probability per unit time
Complement BXIII Linear and non-linear responses of a two-level system subject to a sinusoidal perturbation
1. Description of the model
1-a. Bloch equations for a system of spin 1/2’s interacting with a radiofrequency field
1-b. Some exactly and approximately soluble cases
1-c. Response of the atomic system
2. The approximate solution of the Bloch equations of the system
2-a. Perturbation equations
2-b. The Fourier series expansion of the solution
2-c. The general structure of the solution
3. Discussion
3-a. Zeroth-order solution: competition between pumping and relaxation
3-b. First-order solution: the linear response
3-c. Second-order solution: absorption and induced emission
3-d. Third-order solution: saturation effects and multiple-quanta transitions
4. Exercises: applications of this complement
Complement CXIII Oscillations of a system between two discrete states under the effect of a sinusoidal resonant perturbation
1. The method: secular approximation
2. Solution of the system of equations
3. Discussion
Complement DXIII Decay of a discrete state resonantly coupled to a continuum of final states
1. Statement of the problem
2. Description of the model
2-a. Assumptions about the unperturbed Hamiltonian H0
2-b. Assumptions about the coupling W
2-c. Results of first-order perturbation theory
2-d. Integrodifferential equation equivalent to the Schrödinger equation
3. Short-time approximation. Relation to first-order perturbation theory
4. Another approximate method for solving the Schrödinger equation
5. Discussion
5-a. Lifetime of the discrete state
5-b. Shift of the discrete state due to the coupling with the continuum
5-c. Energy distribution of the final states
Complement EXIII Time-dependent random perturbation, relaxation
1. Evolution of the density operator
1-a. Coupling Hamiltonian, correlation times
1-b. Evolution of a single system
1-c. Evolution of the ensemble of systems
1-d. General equations for the relaxation
2. Relaxation of an ensemble of spin 1/2’s
2-a. Characterization of the operators, isotropy of the perturbation
2-b. Longitudinal relaxation
2-c. Transverse relaxation
3. Conclusion
Complement FXIII Exercises
Chapter XIV. Systems of identical particles
A. Statement of the problem
A-1. Identical particles: definition
A-2. Identical particles in classical mechanics
A-3. Identical particles in quantum mechanics: the difficulties of applying the general postulates
B. Permutation operators
B-1. Two-particle systems
B-2. Systems containing an arbitrary number of particles
C. The symmetrization postulate
C-1. Statement of the postulate
C-2. Removal of exchange degeneracy
C-3. Construction of physical kets
C-4. Application of the other postulates
D. Discussion
D-1. Differences between bosons and fermions. Pauli’s exclusion principle
D-2. The consequences of particle indistinguishability on the calculation of physical predictions
COMPLEMENTS OF CHAPTER XIV, READER’S GUIDE
Complement AXIV Many-electron atoms. Electronic configurations
1. The central-field approximation
1-a. Difficulties related to electron interactions
1-b. Principle of the method
1-c. Energy levels of the atom
2. Electron configurations of various elements
Complement BXIV Energy levels of the helium atom. Configurations, terms, multiplets
1. The central-field approximation. Configurations
1-a. The electrostatic Hamiltonian
1-b. The ground state configuration and first excited configurations
1-c. Degeneracy of the configurations
2. The effect of the inter-electron electrostatic repulsion: exchange energy, spectral terms
2-a. Choice of a basis of Ԑ(n, l; n', l') adapted to the symmetries of W
2-b. Spectral terms. Spectroscopic notation
2-c. Discussion
3. Fine-structure levels; multiplets
Complement CXIV Physical properties of an electron gas. Application to solids
1. Free electrons enclosed in a box
1-a. Ground state of an electron gas; Fermi energy EF
1-b. Importance of the electrons with energies close to EF
1-c. Periodic boundary conditions
2. Electrons in solids
2-a. Allowed bands
2-b. Position of the Fermi level and electric conductivity
Complement DXIV Exercises
Appendix I: Fourier series and Fourier transforms
1. Fourier series
1-a. Periodic functions
1-b. Expansion of a periodic function in a Fourier series
1-c. The Bessel-Parseval relation
2. Fourier transforms
2-a. Definitions
2-b. Simple properties
2-c. The Parseval-Plancherel formula
2-d. Examples
2-e. Fourier transforms in three-dimensional space
Appendix II: The Dirac δ -“function”
1. Introduction; principal properties
1-a. Introduction of the δ-“function”
1-b. Functions that approach δ
1-c. Properties of δ
2. The δ-''function” and the Fourier transform
2-a. The Fourier transform of δ
2-b. Applications
3. Integral and derivatives of the δ -“function”
3-a. δ is the derivative of the “unit step-function”
3-b. Derivatives of δ
4. The δ-“function” in three-dimensional space
Appendix III: Lagrangian and Hamiltonian in classical mechanics
1. Review of Newton’s laws
1-a. Dynamics of a point particle
1-b. Systems of point particles
1-c. Fundamental theorems
2. The Lagrangian and Lagrange’s equations
3. The classical Hamiltonian and the canonical equations
3-a. The conjugate momenta of the coordinates
3-b. The Hamilton-Jacobi canonical equations
4. Applications of the Hamiltonian formalism
4-a. A particle in a central potential
4-b. A charged particle placed in an electromagnetic field
5. The principle of least action
5-a. Geometrical representation of the motion of a system
5-b. The principle of least action
5-c. Lagrange’s equations as a consequence of the principle of least action
BIBLIOGRAPHY OF VOLUMES I AND II
INDEX
EULA
Claude Cohen-Tannoudji, Bernard Diu, Frank Laloë - Quantum Mechanics Volume 3. 2nd Edition. Wiley (2019)
Cover
Title Page
Copyright Page
Directions for Use
Foreword
VOLUME III
Table of contents
Chapter XV. Creation and annihilation operators for identical particles
A. General formalism
A-1. Fock states and Fock space
A-2. Creation operators
A-3. Annihilation operators
A-4. Occupation number operators (bosons and fermions)
A-5. Commutation and anticommutation relations
A-6. Change of basis
B. One-particle symmetric operators
B-1. Definition
B-2. Expression in terms of the operators a and a†
B-3. Examples
B-4. Single particle density operator
C. Two-particle operators
C-1. Definition
C-2. A simple case: factorization
C-3. General case
C-4. Two-particle reduced density operator
C-5. Physical discussion; consequences of the exchange
COMPLEMENTS OF CHAPTER XV, READER’S GUIDE
Complement AXV Particles and holes
1. Ground state of a non-interacting fermion gas
2. New definition for the creation and annihilation operators
3. Vacuum excitations
Complement BXV Ideal gas in thermal equilibrium; quantum distribution functions
1. Grand canonical description of a system without interactions
1-a. Density operator
1-b. Grand canonical partition function, grand potential
2. Average values of symmetric one-particle operators
2-a. Fermion distribution function
2-b. Boson distribution function
2-c. Common expression
2-d. Characteristics of Fermi-Dirac and Bose-Einstein distributions
3. Two-particle operators
3-a. Fermions
3-b. Bosons
3-c. Common expression
4. Total number of particles
4-a. Fermions
4-b. Bosons
5. Equation of state, pressure
5-a. Fermions
5-b. Bosons
Complement CXV Condensed boson system, Gross-Pitaevskii equation
1. Notation, variational ket
1-a. Hamiltonian
1-b. Choice of the variational ket (or trial ket)
2. First approach
2-a. Trial wave function for spinless bosons, average energy
2-b. Variational optimization
3. Generalization, Dirac notation
3-a. Average energy
3-b. Energy minimization
3-c. Gross-Pitaevskii equation
4. Physical discussion
4-a. Energy and chemical potential
4-b. Healing length
4-c. Another trial ket: fragmentation of the condensate
Complement DXV Time-dependent Gross-Pitaevskii equation
1. Time evolution
1-a. Functional variation
1-b. Variational computation: the time-dependent Gross-Pitaevskii equation
1-c. Phonons and Bogolubov spectrum
2. Hydrodynamic analogy
2-a. Probability current
2-b. Velocity evolution
3. Metastable currents, superfluidity
3-a. Toroidal geometry, quantization of the circulation, vortex
3-b. Repulsive potential barrier between states of different
3-c. Critical velocity, metastable flow
3-d. Generalization; topological aspects
Complement EXV Fermion system, Hartree-Fock approximation
1. Foundation of the method
1-a. Trial family and Hamiltonian
1-b. Energy average value
1-c. Optimization of the variational wave function
1-d. Equivalent formulation for the average energy stationarity
1-e. Variational energy
1-f. Hartree-Fock equations
2. Generalization: operator method
2-a. Average energy
2-b. Optimization of the one-particle density operator
2-c. Mean field operator
2-d. Hartree-Fock equations for electrons
2-e. Discussion
Complement FXV Fermions, time-dependent Hartree-Fock approximation
1. Variational ket and notation
2. Variational method
2-a. Definition of a functional
2-b. Stationarity
2-c. Particular case of a time-independent Hamiltonian
3. Computing the optimizer
3-a. Average energy
3-b. Hartree-Fock potential
3-c. Time derivative
3-d. Functional value
4. Equations of motion
4-a. Time-dependent Hartree-Fock equations
4-b. Particles in a single spin state
4-c. Discussion
Complement GXV Fermions or Bosons: Mean field thermal equilibrium
1. Variational principle
1-a. Notation, statement of the problem
1-b. A useful inequality
1-c. Minimization of the thermodynamic potential
2. Approximation for the equilibrium density operator
2-a. Trial density operators
2-b. Partition function, distributions
2-c. Variational grand potential
2-d. Optimization
3. Temperature dependent mean field equations
3-a. Form of the equations
3-b. Properties and limits of the equations
3-c. Differences with the zero-temperature Hartree-Fock equations (fermions)
3-d. Zero-temperature limit (fermions)
3-e. Wave function equations
Complement HXV Applications of the mean field method for non-zero temperature (fermions and bosons)
1. Hartree-Fock for non-zero temperature, a brief review
2. Homogeneous system
2-a. Calculation of the energies
2-b. Quasi-particles
3. Spontaneous magnetism of repulsive fermions
3-a. A simple model
3-b. Resolution of the equations by graphical iteration
3-c. Physical discussion
4. Bosons: equation of state, attractive instability
4-a. Repulsive bosons
4-b. Attractive bosons
Chapter XVI. Field operator
A. Definition of the field operator
A-1. Definition
A-2. Commutation and anticommutation relations
B. Symmetric operators
B-1. General expression
B-2. Simple examples
B-3. Field spatial correlation functions
B-4. Hamiltonian operator
C. Time evolution of the field operator (Heisenberg picture)
C-1. Contribution of the kinetic energy
C-2. Contribution of the potential energy
C-3. Contribution of the interaction energy
C-4. Global evolution
D. Relation to field quantization
COMPLEMENTS OF CHAPTER XVI, READER’S GUIDE
Complement AXVI Spatial correlations in an ideal gas of bosons or fermions
1. System in a Fock state
1-a. Two-point correlations
1-b. Four-point correlations
2. Fermions in the ground state
2-a. Two-point correlations
2-b. Correlations between two particles
3. Bosons in a Fock state
3-a. Ground state
3-b. Fragmented state
3-c. Other states
Complement BXVI Spatio-temporal correlation functions, Green’s functions
1. Green’s functions in ordinary space
1-a. Spatio-temporal correlation functions
1-b. Two- and four-point Green’s functions
1-c. An example, the ideal gas
2. Fourier transforms
2-a. General definition
2-b. Ideal gas example
2-c. General expression in the presence of interactions
2-d. Discussion
3. Spectral function, sum rule
3-a. Expression of the one-particle correlation functions
3-b. Sum rule
3-c. Expression of various physical quantities
Complement CXVI Wick’s theorem
1. Demonstration of the theorem
1-a. Statement of the problem
1-b. Recurrence relation
1-c. Contractions
1-d. Statement of the theorem
2. Applications: correlation functions for an ideal gas
2-a. First order correlation function
2-b. Second order correlation functions
2-c. Higher order correlation functions
Chapter XVII. Paired states of identical particles
A. Creation and annihilation operators of a pair of particles
A-1. Spinless particles, or particles in the same spin state
A-2. Particles in different spin states
B. Building paired states
B-1. Well determined particle number
B-2. Undetermined particle number
B-3. Pairs of particles and pairs of individual states
C. Properties of the kets characterizing the paired states
C-1. Normalization
C-2. Average value and root mean square deviation of particle number
C-3. “Anomalous” average values
D. Correlations between particles, pair wave function
D-1. Particles in the same spin state
D-2. Fermions in a singlet state
E. Paired states as a quasi-particle vacuum; Bogolubov-Valatin transformations
E-1. Transformation of the creation and annihilation operators
E-2. Effect on the kets
E-3. Basis of excited states, quasi-particles
COMPLEMENTS OF CHAPTER XVII, READER’S GUIDE
Complement AXVII Pair field operator for identical particles
1. Pair creation and annihilation operators
1-a. Particles in the same spin state
1-b. Pairs in a singlet spin state
2. Average values in a paired state
2-a. Average value of a field operator; pair wave function, and order parameter
2-b. Average value of a product of two field operators; factorization of the order parameter
2-c. Application to the computation of the correlation function (singlet pairs)
3. Commutation relations of field operators
3-a. Particles in the same spin state
3-b. Singlet pairs
Complement BXVII Average energy in a paired state
1. Using states that are not eigenstates of the total particle number
1-a. Computation of the average values
1-b. A good approximation
2. Hamiltonian
2-a. Operator expression
2-b. Simplifications due to pairing
3. Spin 1/2 fermions in a singlet state
3-a. Different contributions to the energy
3-b. Total energy
4. Spinless bosons
4-a. Choice of the variational state
4-b. Different contributions to the energy
4-c. Total energy
Complement CXVII Fermion pairing, BCS theory
1. Optimization of the energy
1-a. Function to be optimized
1-b. Cancelling the total variation
1-c. Short-range potential, study of the gap
2. Distribution functions, correlations
2-a. One-particle distribution
2-b. Two-particle distribution, internal pair wave function
2-c. Properties of the pair wave function, coherence length
3. Physical discussion
3-a. Modification of the Fermi surface and phase locking
3-b. Gain in energy
3-c. Non-perturbative character of the BCS theory
4. Excited states
4-a. Bogolubov-Valatin transformation
4-b. Broken pairs and excited pairs
4-c. Stationarity of the energies
4-d. Excitation energies
Complement DXVII Cooper pairs
1. Cooper model
2. State vector and Hamiltonian
3. Solution of the eigenvalue equation
4. Calculation of the binding energy for a simple case
Complement EXVII Condensed repulsive bosons
1. Variational state, energy
1-a. Variational ket
1-b. Total energy
1-c.
approximation
2. Optimization
2-a. Stationarity conditions
2-b. Solution of the equations
3. Properties of the ground state
3-a. Particle number, quantum depletion
3-b. Energy
3-c. Phase locking; comparison with the BCS mechanism
3-d. Correlation functions
4. Bogolubov operator method
4-a. Variational space, restriction on the Hamiltonian
4-b. Bogolubov Hamiltonian
4-c. Constructing a basis of excited states, quasi-particles
Chapter XVIII. Review of classical electrodynamics
A. Classical electrodynamics
A-1. Basic equations and relations
A-2. Description in the reciprocal space
A-3. Elimination of the longitudinal fields from the expression of the physical quantities
B. Describing the transverse field as an ensemble of harmonic oscillators
B-1. Brief review of the one-dimensional harmonic oscillator
B-2. Normal variables for the transverse field
B-3. Discrete modes in a box
B-4. Generalization of the mode concept
COMPLEMENT OF CHAPTER XVIII, READER’S GUIDE
Complement AXVIII Lagrangian formulation of electrodynamics
1. Lagrangian with several types of variables
1-a. Lagrangian formalism with discrete and real variables
1-b. Extension to complex variables
1-c. Lagrangian with continuous variables
2. Application to the free radiation field
2-a. Lagrangian densities in real and reciprocal spaces
2-b. Lagrange’s equations
2-c. Conjugate momentum of the transverse potential vector
2-d. Hamiltonian; Hamilton-Jacobi equations
2-e. Field commutation relations
2-f. Creation and annihilation operators
2-g. Discrete momentum variables
3. Lagrangian of the global system field + interacting particles
3-a. Choice for the Lagrangian
3-b. Lagrange’s equations
3-c. Conjugate momenta
3-d. Hamiltonian
3-e. Commutation relations
Chapter XIX. Quantization of electromagnetic radiation
A. Quantization of the radiation in the Coulomb gauge
A-1. Quantization rules
A-2. Radiation contained in a box
A-3. Heisenberg equations
B. Photons, elementary excitations of the free quantum field
B-1. Fock space of the free quantum field
B-2. Corpuscular interpretation of states with fixed total energy and momentum
B-3. Several examples of quantum radiation states
C. Description of the interactions
C-1. Interaction Hamiltonian
C-2. Interaction with an atom. External and internal variables
C-3. Long wavelength approximation
C-4. Electric dipole Hamiltonian
C-5. Matrix elements of the interaction Hamiltonian; selection rules
COMPLEMENTS OF CHAPTER XIX, READER’S GUIDE
Complement AXIX Momentum exchange between atoms and photons
1. Recoil of a free atom absorbing or emitting a photon
1-a. Conservation laws
1-b. Doppler effect, Doppler width
1-c. Recoil energy
1-d. Radiation pressure force in a plane wave
2. Applications of the radiation pressure force: slowing and cooling atoms
2-a. Deceleration of an atomic beam
2-b. Doppler laser cooling of free atoms
2-c. Magneto-optical trap
3. Blocking recoil through spatial confinement
3-a. Assumptions concerning the external trapping potential
3-b. Intensities of the vibrational lines
3-c. Effect of the confinement on the absorption and emission spectra
3-d. Case of a one-dimensional harmonic potential
3-e. Mössbauer effect
4. Recoil suppression in certain multi-photon processes
Complement BXIX Angular momentum of radiation
1. Quantum average value of angular momentum for a spin 1 particle
1-a. Wave function, spin operator
1-b. Average value of the spin angular momentum
1-c. Average value of the orbital angular momentum
2. Angular momentum of free classical radiation as a function of normal variables
2-a. Calculation in position space
2-b. Reciprocal space
2-c. Difference between the angular momenta of massive particles and of radiation
3. Discussion
3-a. Spin angular momentum of radiation
3-b. Experimental evidence of the radiation spin angular momentum
3-c. Orbital angular momentum of radiation
Complement CXIX Angular momentum exchange between atoms and photons
1. Transferring spin angular momentum to internal atomic variables
1-a. Electric dipole transitions
1-b. Polarization selection rules
1-c. Conservation of total angular momentum
2. Optical methods
2-a. Double resonance method
2-b. Optical pumping
2-c. Original features of these methods
3. Transferring orbital angular momentum to external atomic variables
3-a. Laguerre-Gaussian beams
3-b. Field expansion on Laguerre-Gaussian modes
Chapter XX. Absorption, emission and scattering of photons by atoms
A. A basic tool: the evolution operator
A-1. General properties
A-2. Interaction picture
A-3. Positive and negative frequency components of the field
B. Photon absorption between two discrete atomic levels
B-1. Monochromatic radiation
B-2. Non-monochromatic radiation
C. Stimulated and spontaneous emissions
C-1. Emission rate
C-2. Stimulated emission
C-3. Spontaneous emission
C-4. Einstein coefficients and Planck’s law
D. Role of correlation functions in one-photon processes
D-1. Absorption process
D-2. Emission process
E. Photon scattering by an atom
E-1. Elastic scattering
E-2. Resonant scattering
E-3. Inelastic scattering, Raman scattering
COMPLEMENTS OF CHAPTER XX, READER’S GUIDE
Complement AXX A multiphoton process: two-photon absorption
1. Monochromatic radiation
2. Non-monochromatic radiation
2-a. Probability amplitude, probability
2-b. Probability per unit time when the radiation is in a Fock state
3. Discussion
3-a. Conservation laws
3-b. Case where the relay state becomes resonant for one-photon absorption
Complement BXX Photoionization
1. Brief review of the photoelectric effect
1-a. Interpretation in terms of photons
1-b. Photoionization of an atom
2. Computation of photoionization rates
2-a. A single atom in monochromatic radiation
2-b. Stationary non-monochromatic radiation
2-c. Non-stationary and non-monochromatic radiation
2-d. Correlations between photoionization rates of two detector atoms
3. Is a quantum treatment of radiation necessary to describe photoionization?
3-a. Experiments with a single photodetector atom
3-b. Experiments with two photodetector atoms
4. Two-photon photoionization
4-a. Differences with the one-photon photoionization
4-b. Photoionization rate
4-c. Importance of fluctuations in the radiation intensity
5. Tunnel ionization by intense laser fields
Complement CXX Two-level atom in a monochromatic field. Dressed-atom method
1. Brief description of the dressed-atom method
1-a. State energies of the atom + photon system in the absence of coupling
1-b. Coupling matrix elements
1-c. Outline of the dressed-atom method
1-d. Physical meaning of photon number
1-e. Effects of spontaneous emission
2. Weak coupling domain
2-a. Eigenvalues and eigenvectors of the effective Hamiltonian
2-b. Light shifts and radiative broadening
2-c. Dependence on incident intensity and detuning
2-d. Semiclassical interpretation in the weak coupling domain
2-e. Some extensions
3. Strong coupling domain
3-a. Eigenvalues and eigenvectors of the effective Hamiltonian
3-b. Variation of dressed state energies with detuning
3-c. Fluorescence triplet
3-d. Temporal correlations between fluorescent photons
4. Modifications of the field. Dispersion and absorption
4-a. Atom in a cavity
4-b. Frequency shift of the field in the presence of the atom
4-c. Field absorption
Complement DXX Light shifts: a tool for manipulating atoms and fields
1. Dipole forces and laser trapping
2. Mirrors for atoms
3. Optical lattices
4. Sub-Doppler cooling. Sisyphus effect
4-a. Laser configurations with space-dependent polarization
4-b. Atomic transition
4-c. Light shifts
4-d. Optical pumping
4-e. Sisyphus effect
5. Non-destructive detection of a photon
Complement EXX Detection of one- or two-photon wave packets, interference
1. One-photon wave packet, photodetection probability
1-a. Photoionization of a broadband detector
1-b. Detection probability amplitude
1-c. Temporal variation of the signal
2. One- or two-photon interference signals
2-a. How should one compute photon interference?
2-b. Interference signal for a one-photon wave packet in two modes
2-c. Interference signals for a product of two one-photon wave packets
3. Absorption amplitude of a photon by an atom
3-a. Computation of the amplitude
3-b. Properties of that amplitude
4. Scattering of a wave packet
4-a. Absorption amplitude by atom B of the photon scattered by atom A
4-b. Wave packet scattered by atom A
5. Example of wave packets with two entangled photons
5-a. Parametric down-conversion
5-b. Temporal correlations between the two photons generated in parametric down-conversion
Chapter XXI. Quantum entanglement, measurements, Bell’s inequalities
A. Introducing entanglement, goals of this chapter
B. Entangled states of two spin-1/2 systems
B-1. Singlet state, reduced density matrices
B-2. Correlations
C. Entanglement between more general systems
C-1. Pure entangled states, notation
C-2. Presence (or absence) of entanglement: Schmidt decomposition
C-3. Characterization of entanglement: Schmidt rank
D. Ideal measurement and entangled states
D-1. Ideal measurement scheme (von Neumann)
D-2. Coupling with the environment, decoherence; “pointer states”
D-3. Uniqueness of the measurement result
E. “Which path” experiment: can one determine the path followed by the photon in Young’s double slit experiment?
E-1. Entanglement between the photon states and the plate states
E-2. Prediction of measurements performed on the photon
F. Entanglement, non-locality, Bell’s theorem
F-1. The EPR argument
F-2. Bohr’s reply, non-separability
F-3. Bell’s inequality
COMPLEMENTS OF CHAPTER XXI, READER’S GUIDE
Complement AXXI Density operator and correlations; separability
1. Von Neumann statistical entropy
1-a. General definition
1-b. Physical system composed of two subsystems
2. Differences between classical and quantum correlations
2-a. Two levels of correlations
2-b. Quantum monogamy
3. Separability
3-a. Separable density operator
3-b. Two spins in a singlet state
Complement BXXI GHZ states, entanglement swapping
1. Sign contradiction in a GHZ state
1-a. Quantum calculation
1-b. Reasoning in the local realism framework
1-c. Discussion; contextuality
2. Entanglement swapping
2-a. General scheme
2-b. Discussion
Complement CXXI Measurement induced relative phase between two condensates
1. Probabilities of single, double, etc. position measurements
1-a. Single measurement (one particle)
1-b. Double measurement (two particles)
1-c. Generalization: measurement of any number of positions
2. Measurement induced enhancement of entanglement
2-a. Measuring the single density P (x1)
2-b. Entanglement between the two modes after the first detection
2-c. Measuring the double density P (x2, x1)
2-d. Discussion
3. Detection of a large number Q of particles
3-a. Probability of a multiple detection sequence
3-b. Discussion; emergence of a relative phase
Complement DXXI Emergence of a relative phase with spin condensates; macroscopic non-locality and the EPR argument
1. Two condensates with spins
1-a. Spin 1/2: a brief review
1-b. Projectors associated with the measurements
2. Probabilities of the different measurement results
2-a. A first expression for the probability
2-b. Introduction of the phase and the quantum angle
3. Discussion
3-a. Measurement number Q


QUANTUM MECHANICS Volume I Basic Concepts, Tools, and Applications

Claude Cohen-Tannoudji, Bernard Diu, and Franck Laloë Translated from the French by Susan Reid Hemley, Nicole Ostrowsky, and Dan Ostrowsky

Authors

Second Edition

Prof. Dr. Claude Cohen-Tannoudji Laboratoire Kastler Brossel (ENS) 24 rue Lhomond 75231 Paris Cedex 05 France

All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Prof. Dr. Bernard Diu 4 rue du Docteur Roux 91440 Bures-sur-Yvette France

Library of Congress Card No.: applied for

Prof. Dr. Franck Laloë Laboratoire Kastler Brossel (ENS) 24 rue Lhomond 75231 Paris Cedex 05 France

British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library.

Cover Image © antishock/Getty Images

Bibliographic information published by the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2020 WILEY-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany

All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.

Print ISBN 978-3-527-34553-3
ePDF ISBN 978-3-527-82270-6
ePub ISBN 978-3-527-82271-3

Cover Design: Tata Consulting Services
Printing and Binding: CPI Ebner & Spiegel
Printed on acid-free paper.

Directions for Use

This book is composed of chapters and their complements:

– The chapters contain the fundamental concepts. Except for a few additions and variations, they correspond to a course given in the last year of a typical undergraduate physics program (Volume I) or of a graduate program (Volumes II and III). The 21 chapters are complete in themselves and can be studied independently of the complements.

– The complements follow the corresponding chapter. Each is labelled by a letter followed by a subscript, which gives the number of the chapter (for example, the complements of Chapter V are, in order, AV, BV, CV, etc.). They can be recognized immediately by the symbol that appears at the top of each of their pages.

The complements vary in character. Some are intended to expand the treatment of the corresponding chapter or to provide a more detailed discussion of certain points. Others describe concrete examples or introduce various physical concepts. One of the complements (usually the last one) is a collection of exercises.

The difficulty of the complements varies. Some are very simple examples or extensions of the chapter. Others are more difficult and at the graduate level or close to current research. In any case, the reader should have studied the material in the chapter before using the complements.

The complements are generally independent of one another. The student should not try to study all the complements of a chapter at once. In accordance with his/her aims and interests, he/she should choose a small number of them (two or three, for example), plus a few exercises. The other complements can be left for later study. To help with the choice, the complements are listed at the end of each chapter in a "reader's guide", which discusses the difficulty and importance of each.

Some passages within the book have been set in small type, and these can be omitted on a first reading.

Foreword


Quantum mechanics is a branch of physics whose importance has continually increased over the last decades. It is essential for understanding the structure and dynamics of microscopic objects such as atoms, molecules and their interactions with electromagnetic radiation. It is also the basis for understanding the functioning of numerous new systems with countless practical applications. These include lasers (in communications, medicine, milling, etc.), atomic clocks (essential in particular for the GPS), transistors (communications, computers), magnetic resonance imaging, energy production (solar panels, nuclear reactors), etc. Quantum mechanics also makes it possible to understand surprising physical properties such as superfluidity or superconductivity. There is currently a great interest in entangled quantum states, whose non-intuitive properties of nonlocality and nonseparability make it possible to conceive of remarkable applications in the emerging field of quantum information. Our civilization is increasingly impacted by technological applications based on quantum concepts. This is why a particular effort should be made in the teaching of quantum mechanics, which is the object of these three volumes.

The first contact with quantum mechanics can be disconcerting. Our work grew out of the authors' experiences while teaching quantum mechanics for many years. It was conceived with the objective of easing a first approach, and then aiding the reader to progress to a more advanced level of quantum mechanics. The first two volumes, first published more than forty years ago, have been used throughout the world. They remain, however, at an intermediate level. They have now been completed with a third volume treating more advanced subjects. Throughout we have used a progressive approach to problems, where no difficulty goes untreated and each aspect of the diverse questions is discussed in detail (often starting with a classical review). This willingness to go further "without cheating or taking shortcuts" is built into the book structure, using two distinct linked texts: chapters and complements. As we just outlined in the "Directions for use", the chapters present the general ideas and basic concepts, whereas the complements illustrate both the methods and concepts just presented.

Volume I presents a general introduction to the subject, followed by a second chapter describing the basic mathematical tools used in quantum mechanics. While this chapter can appear long and dense, the teaching experience of the authors has shown that such a presentation is the most efficient. In the third chapter the postulates are stated and illustrated in many of the complements. We then go on to certain important applications of quantum mechanics, such as the harmonic oscillator, which leads to numerous applications (molecular vibrations, phonons, etc.). Many of these are the object of specific complements.

Volume II pursues this development, while expanding its scope at a slightly higher level. It treats collision theory, spin, addition of angular momenta, and both time-dependent and time-independent perturbation theory. It also presents a first approach to the study of identical particles. In this volume as in the previous one, each theoretical concept is immediately illustrated by diverse applications presented in the complements. Both volumes I and II have benefited from several recent corrections, but there have also been additions.
Chapter XIII now contains two sections §§ D and E that treat random perturbations, and a complement concerning relaxation has been added.


Volume III extends the first two volumes to a slightly higher level. It is based on the use of the creation and annihilation operator formalism (second quantization), which is commonly used in quantum field theory. We start with a study of systems of identical particles, fermions or bosons. The properties of ideal gases in thermal equilibrium are presented. For fermions, the Hartree-Fock method is developed in detail. It is the basis of many studies in chemistry, atomic physics, solid state physics, etc. For bosons, the Gross-Pitaevskii equation and the Bogolubov theory are discussed. An original presentation that treats the pairing effect of both fermions and bosons permits obtaining the BCS (Bardeen-Cooper-Schrieffer) and Bogolubov theories in a unified framework.

The second part of Volume III treats quantum electrodynamics: a general introduction, the study of interactions between atoms and photons, and various applications (spontaneous emission, multiphoton transitions, optical pumping, etc.). The dressed atom method is presented and illustrated for concrete cases. A final chapter discusses the notion of quantum entanglement and certain fundamental aspects of quantum mechanics, in particular the Bell inequalities and their violations.

Finally, note that we have treated neither the philosophical implications of quantum mechanics nor the diverse interpretations of this theory, despite the great interest of these subjects. We have in fact limited ourselves to presenting what is commonly called the "orthodox point of view". It is only in Chapter XXI that we touch on certain questions concerning the foundations of quantum mechanics (nonlocality, etc.). We have made this choice because we feel that one can address such questions more efficiently after mastering the manipulation of the quantum mechanical formalism as well as its numerous applications. These subjects are addressed in the book Do we really understand quantum mechanics? (F. Laloë, Cambridge University Press, 2019); see also section 5 of the bibliography of volumes I and II.


Acknowledgments: Volumes I and II: The teaching experiences out of which this text grew were group efforts, pursued over several years. We wish to thank all the members of the various groups and particularly Jacques Dupont-Roc and Serge Haroche, for their friendly collaboration, for the fruitful discussions we have had in our weekly meetings and for the ideas for problems and exercises that they have suggested. Without their enthusiasm and valuable help, we would never have been able to undertake and carry out the writing of this book. Nor can we forget what we owe to the physicists who introduced us to research, Alfred Kastler and Jean Brossel for two of us and Maurice Levy for the third. It was in the context of their laboratories that we discovered the beauty and power of quantum mechanics. Neither have we forgotten the importance to us of the modern physics taught at the C.E.A. by Albert Messiah, Claude Bloch and Anatole Abragam, at a time when graduate studies were not yet incorporated into French university programs. We wish to express our gratitude to Ms. Aucher, Baudrit, Boy, Brodschi, Emo, Heywaerts, Lemirre, Touzeau for the preparation of the manuscript.

Volume III: We are very grateful to Nicole and Daniel Ostrowsky, who, as they translated this Volume from French into English, proposed numerous improvements and clarifications. More recently, Carsten Henkel also made many useful suggestions during his translation of the text into German; we are very grateful for the improvements of the text that resulted from this exchange. There are actually many colleagues and friends who greatly contributed, each in his own way, to finalizing this book. All their complementary remarks and suggestions have been very helpful and we are in particular thankful to: Pierre-François Cohadon, Jean Dalibard, Sébastien Gleyzes, Markus Holzmann, Thibaut Jacqmin, Philippe Jacquier, Amaury Mouchet, Jean-Michel Raimond, and Félix Werner.

Some delicate aspects of LaTeX typography have been resolved thanks to Marco Picco, Pierre Cladé and Jean Hare. Roger Balian, Edouard Brézin and William Mullin have offered useful advice and suggestions. Finally, our sincere thanks go to Geneviève Tastevin, Pierre-François Cohadon and Samuel Deléglise for their help with a number of figures.


Table of contents

VOLUME I

I  WAVES AND PARTICLES. INTRODUCTION TO THE BASIC IDEAS OF QUANTUM MECHANICS   1
   A  Electromagnetic waves and photons   3
   B  Material particles and matter waves   10
   C  Quantum description of a particle. Wave packets   13
   D  Particle in a time-independent scalar potential   23

READER'S GUIDE FOR COMPLEMENTS   33
AI  Order of magnitude of the wavelengths associated with material particles   35
BI  Constraints imposed by the uncertainty relations   39
    (1. Macroscopic system; 2. Microscopic system)
CI  Heisenberg relation and atomic parameters   41
DI  An experiment illustrating the Heisenberg relations   45
EI  A simple treatment of a two-dimensional wave packet   49
    (1. Introduction; 2. Angular dispersion and lateral dimensions; 3. Discussion)
FI  The relationship between one- and three-dimensional problems   53
    (1. Three-dimensional wave packet; 2. Justification of one-dimensional models)
GI  One-dimensional Gaussian wave packet: spreading of the wave packet   57
    (1. Definition of a Gaussian wave packet; 2. Calculation of ⟨X⟩ and ΔX; uncertainty relation; 3. Evolution of the wave packet)
HI  Stationary states of a particle in one-dimensional square potentials   63
    (1. Behavior of a stationary wave function φ(x); 2. Some simple cases)
JI  Behavior of a wave packet at a potential step   75
    (1. Total reflection: E < V0; 2. Partial reflection: E > V0)
KI  Exercises   83

II  THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS   87
   A  Space of the one-particle wave function   88
   B  State space. Dirac notation   102
   C  Representations in state space   116
   D  Eigenvalue equations. Observables   126
   E  Two important examples of representations and observables   139
   F  Tensor product of state spaces   147

READER'S GUIDE FOR COMPLEMENTS   159
AII  The Schwarz inequality   161
BII  Review of some useful properties of linear operators   163
    (1. Trace of an operator; 2. Commutator algebra; 3. Restriction of an operator to a subspace; 4. Functions of operators; 5. Derivative of an operator)
CII  Unitary operators   173
    (1. General properties of unitary operators; 2. Unitary transformations of operators; 3. The infinitesimal unitary operator)
DII  A more detailed study of the {|r⟩} and {|p⟩} representations   181
    (1. The {|r⟩} representation; 2. The {|p⟩} representation)
EII  Some general properties of two observables, Q and P, whose commutator is equal to iħ   187
    (1. The operator S(λ): definition, properties; 2. Eigenvalues and eigenvectors of Q; 3. The {|q⟩} representation; 4. The {|p⟩} representation. The symmetric nature of the P and Q observables)
FII  The parity operator   193
    (1. The parity operator; 2. Even and odd operators; 3. Eigenstates of an even observable B+; 4. Application to an important special case)
GII  An application of the properties of the tensor product: the two-dimensional infinite well   201
    (1. Definition; eigenstates; 2. Study of the energy levels)
HII  Exercises   205

III  THE POSTULATES OF QUANTUM MECHANICS   213
   A  Introduction   214
   B  Statement of the postulates   215
   C  The physical interpretation of the postulates concerning observables and their measurement   226
   D  The physical implications of the Schrödinger equation   237
   E  The superposition principle and physical predictions   253

READER'S GUIDE FOR COMPLEMENTS   267
AIII  Particle in an infinite one-dimensional potential well   271
    (1. Distribution of the momentum values in a stationary state; 2. Evolution of the particle's wave function; 3. Perturbation created by a position measurement)
BIII  Study of the probability current in some special cases   283
    (1. Expression for the current in constant potential regions; 2. Application to potential step problems; 3. Probability current of incident and evanescent waves, in the case of reflection from a two-dimensional potential step)
CIII  Root mean square deviations of two conjugate observables   289
    (1. The Heisenberg relation for P and X; 2. The "minimum" wave packet)
DIII  Measurements bearing on only one part of a physical system   293
    (1. Calculation of the physical predictions; 2. Physical meaning of a tensor product state; 3. Physical meaning of a state that is not a tensor product)
EIII  The density operator   299
    (1. Outline of the problem; 2. The concept of a statistical mixture of states; 3. The pure case. Introduction of the density operator; 4. A statistical mixture of states (non-pure case); 5. Use of the density operator: some applications)
FIII  The evolution operator   313
    (1. General properties; 2. Case of conservative systems)
GIII  The Schrödinger and Heisenberg pictures   317
HIII  Gauge invariance   321
    (1. Outline of the problem: scalar and vector potentials associated with an electromagnetic field; concept of a gauge; 2. Gauge invariance in classical mechanics; 3. Gauge invariance in quantum mechanics)
JIII  Propagator for the Schrödinger equation   335
    (1. Introduction; 2. Existence and properties of a propagator K(2, 1); 3. Lagrangian formulation of quantum mechanics)
KIII  Unstable states. Lifetime   343
    (1. Introduction; 2. Definition of the lifetime; 3. Phenomenological description of the instability of a state)
LIII  Exercises   347
MIII  Bound states in a "potential well" of arbitrary shape   359
    (1. Quantization of the bound state energies; 2. Minimum value of the ground state energy)
NIII  Unbound states of a particle in the presence of a potential well or barrier   367
    (1. Transmission matrix M(k); 2. Transmission and reflection coefficients; 3. Example)
OIII  Quantum properties of a particle in a one-dimensional periodic structure   375
    (1. Passage through several successive identical potential barriers; 2. Discussion: the concept of an allowed or forbidden energy band; 3. Quantization of energy levels in a periodic potential; effect of boundary conditions)

IV  APPLICATIONS OF THE POSTULATES TO SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS   393
   A  Spin 1/2 particle: quantization of the angular momentum   394
   B  Illustration of the postulates in the case of a spin 1/2   401
   C  General study of two-level systems   411

READER'S GUIDE FOR COMPLEMENTS   423
AIV  The Pauli matrices   425
    (1. Definition; eigenvalues and eigenvectors; 2. Simple properties; 3. A convenient basis of the 2 × 2 matrix space)
BIV  Diagonalization of a 2 × 2 Hermitian matrix   429
    (1. Introduction; 2. Changing the eigenvalue origin; 3. Calculation of the eigenvalues and eigenvectors)
CIV  Fictitious spin 1/2 associated with a two-level system   435
    (1. Introduction; 2. Interpretation of the Hamiltonian in terms of fictitious spin; 3. Geometrical interpretation)
DIV  System of two spin 1/2 particles   441
    (1. Quantum mechanical description; 2. Prediction of the measurement results)
EIV  Spin 1/2 density matrix   449
    (1. Introduction; 2. Density matrix of a perfectly polarized spin (pure case); 3. Example of a statistical mixture: unpolarized spin; 4. Spin 1/2 at thermodynamic equilibrium in a static field; 5. Expansion of the density matrix in terms of the Pauli matrices)
FIV  Spin 1/2 particle in a static and a rotating magnetic field: magnetic resonance   455
    (1. Classical treatment; rotating reference frame; 2. Quantum mechanical treatment; 3. Relation between the classical treatment and the quantum mechanical treatment: evolution of ⟨M⟩; 4. Bloch equations)
GIV  A simple model of the ammonia molecule   469
    (1. Description of the model; 2. Eigenfunctions and eigenvalues of the Hamiltonian; 3. The ammonia molecule considered as a two-level system)
HIV  Effects of a coupling between a stable state and an unstable state   485
    (1. Introduction. Notation; 2. Influence of a weak coupling on states of different energies; 3. Influence of an arbitrary coupling on states of the same energy)
JIV  Exercises   491

V  THE ONE-DIMENSIONAL HARMONIC OSCILLATOR   497
   A  Introduction   497
   B  Eigenvalues of the Hamiltonian   503
   C  Eigenstates of the Hamiltonian   510
   D  Discussion   518

READER'S GUIDE FOR COMPLEMENTS   525
AV  Some examples of harmonic oscillators   527
    (1. Vibration of the nuclei of a diatomic molecule; 2. Vibration of the nuclei in a crystal; 3. Torsional oscillations of a molecule: ethylene; 4. Heavy muonic atoms)
BV  Study of the stationary states in the {|x⟩} representation. Hermite polynomials   547
    (1. Hermite polynomials; 2. The eigenfunctions of the harmonic oscillator Hamiltonian)
CV  Solving the eigenvalue equation of the harmonic oscillator by the polynomial method   555
    (1. Changing the function and the variable; 2. The polynomial method)
DV  Study of the stationary states in the momentum representation   563
    (1. Wave functions in momentum space; 2. Discussion)
EV  The isotropic three-dimensional harmonic oscillator   569
    (1. The Hamiltonian operator; 2. Separation of the variables in Cartesian coordinates; 3. Degeneracy of the energy levels)
FV  A charged harmonic oscillator in a uniform electric field   575
    (1. Eigenvalue equation of H′(ℰ) in the {|x⟩} representation; 2. Discussion; 3. Use of the translation operator)
GV  Coherent "quasi-classical" states of the harmonic oscillator   583
    (1. Quasi-classical states; 2. Properties of the |α⟩ states; 3. Time evolution of a quasi-classical state; 4. Example: quantum mechanical treatment of a macroscopic oscillator)
HV  Normal vibrational modes of two coupled harmonic oscillators   599
    (1. Vibration of the two coupled oscillators in classical mechanics; 2. Vibrational states of the system in quantum mechanics)
JV  Vibrational modes of an infinite linear chain of coupled harmonic oscillators; phonons   611
    (1. Classical treatment; 2. Quantum mechanical treatment; 3. Application to the study of crystal vibrations: phonons)
KV  Vibrational modes of a continuous physical system. Photons   631
    (1. Outline of the problem; 2. Vibrational modes of a continuous mechanical system: example of a vibrating string; 3. Vibrational modes of radiation: photons)
LV  One-dimensional harmonic oscillator in thermodynamic equilibrium at a temperature T   647
    (1. Mean value of the energy; 2. Discussion; 3. Applications; 4. Probability distribution of the observable X)
MV  Exercises   661

VI  GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS   667
   A  Introduction: the importance of angular momentum   667
   B  Commutation relations characteristic of angular momentum   669
   C  General theory of angular momentum   671
   D  Application to orbital angular momentum   685

READER'S GUIDE FOR COMPLEMENTS   703
AVI  Spherical harmonics   705
    (1. Calculation of spherical harmonics; 2. Properties of spherical harmonics)
BVI  Angular momentum and rotations   717
    (1. Introduction; 2. Brief study of geometrical rotations R; 3. Rotation operators in state space. Example: a spinless particle; 4. Rotation operators in the state space of an arbitrary system; 5. Rotation of observables; 6. Rotation invariance)
CVI  Rotation of diatomic molecules   739
    (1. Introduction; 2. Rigid rotator. Classical study; 3. Quantization of the rigid rotator; 4. Experimental evidence for the rotation of molecules)
DVI  Angular momentum of stationary states of a two-dimensional harmonic oscillator   755
    (1. Introduction; 2. Classification of the stationary states by the quantum numbers nx and ny; 3. Classification of the stationary states in terms of their angular momenta; 4. Quasi-classical states)
EVI  A charged particle in a magnetic field: Landau levels   771
    (1. Review of the classical problem; 2. General quantum mechanical properties of a particle in a magnetic field; 3. Case of a uniform magnetic field)
FVI  Exercises   795

VII  PARTICLE IN A CENTRAL POTENTIAL, HYDROGEN ATOM   803
   A  Stationary states of a particle in a central potential   804
   B  Motion of the center of mass and relative motion for a system of two interacting particles   812
   C  The hydrogen atom   818

READER'S GUIDE FOR COMPLEMENTS   831
AVII  Hydrogen-like systems   833
    (1. Hydrogen-like systems with one electron; 2. Hydrogen-like systems without an electron)
BVII  A soluble example of a central potential: the isotropic three-dimensional harmonic oscillator   841
    (1. Solving the radial equation; 2. Energy levels and stationary wave functions)
CVII  Probability currents associated with the stationary states of the hydrogen atom   851
    (1. General expression for the probability current; 2. Application to the stationary states of the hydrogen atom)
DVII  The hydrogen atom placed in a uniform magnetic field. Paramagnetism and diamagnetism. The Zeeman effect   855
    (1. The Hamiltonian of the problem. The paramagnetic term and the diamagnetic term; 2. The Zeeman effect)
EVII  Some atomic orbitals. Hybrid orbitals   869
    (1. Introduction; 2. Atomic orbitals associated with real wave functions; 3. sp hybridization; 4. sp2 hybridization; 5. sp3 hybridization)
FVII  Vibrational-rotational levels of diatomic molecules   885
    (1. Introduction; 2. Approximate solution of the radial equation; 3. Evaluation of some corrections)
GVII  Exercises   899
    (1. Particle in a cylindrically symmetric potential; 2. Three-dimensional harmonic oscillator in a uniform magnetic field)

INDEX   901

VOLUME II

VIII  AN ELEMENTARY APPROACH TO THE QUANTUM THEORY OF SCATTERING BY A POTENTIAL   923
READER'S GUIDE FOR COMPLEMENTS   957
AVIII  The free particle: stationary states with well-defined angular momentum   959
BVIII  Phenomenological description of collisions with absorption   971
CVIII  Some simple applications of scattering theory   977

IX  ELECTRON SPIN   985
READER'S GUIDE FOR COMPLEMENTS   999
AIX  Rotation operators for a spin 1/2 particle   1001
BIX  Exercises   1009

X  ADDITION OF ANGULAR MOMENTA   1015
READER'S GUIDE FOR COMPLEMENTS   1041
AX  Examples of addition of angular momenta   1043
BX  Clebsch-Gordan coefficients   1051
CX  Addition of spherical harmonics   1059
DX  Vector operators: the Wigner-Eckart theorem   1065
EX  Electric multipole moments   1077
FX  Two angular momenta J1 and J2 coupled by an interaction aJ1 · J2   1091
GX  Exercises   1107

XI  STATIONARY PERTURBATION THEORY   1115
READER'S GUIDE FOR COMPLEMENTS   1129
AXI  A one-dimensional harmonic oscillator subjected to a perturbing potential in x, x2, x3   1131
BXI  Interaction between the magnetic dipoles of two spin 1/2 particles   1141
CXI  Van der Waals forces   1151
DXI  The volume effect: the influence of the spatial extension of the nucleus on the atomic levels   1162
EXI  The variational method   1169
FXI  Energy bands of electrons in solids: a simple model   1177
GXI  A simple example of the chemical bond: the H2+ ion   1189
HXI  Exercises   1221

XII  AN APPLICATION OF PERTURBATION THEORY: THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM   1231
READER'S GUIDE FOR COMPLEMENTS   1265
AXII  The magnetic hyperfine Hamiltonian   1267
BXII  Calculation of the average values of the fine-structure Hamiltonian in the 1s, 2s and 2p states   1276
CXII  The hyperfine structure and the Zeeman effect for muonium and positronium   1281
DXII  The influence of the electronic spin on the Zeeman effect of the hydrogen resonance line   1289
EXII  The Stark effect for the hydrogen atom   1298

XIII  APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS   1303
READER'S GUIDE FOR COMPLEMENTS   1337
AXIII  Interaction of an atom with an electromagnetic wave   1339
BXIII  Linear and non-linear responses of a two-level system subject to a sinusoidal perturbation   1357
CXIII  Oscillations of a system between two discrete states under the effect of a sinusoidal resonant perturbation   1374
DXIII  Decay of a discrete state resonantly coupled to a continuum of final states   1378
EXIII  Time-dependent random perturbation, relaxation   1390
FXIII  Exercises   1409

XIV  SYSTEMS OF IDENTICAL PARTICLES   1419
READER'S GUIDE FOR COMPLEMENTS   1457
AXIV  Many-electron atoms. Electronic configurations   1459
BXIV  Energy levels of the helium atom. Configurations, terms, multiplets   1467
CXIV  Physical properties of an electron gas. Application to solids   1481
DXIV  Exercises   1496

APPENDICES
I  Fourier series and Fourier transforms   1505
II  The Dirac δ-"function"   1515
III  Lagrangian and Hamiltonian in classical mechanics   1527

BIBLIOGRAPHY OF VOLUMES I AND II   1545
INDEX   1569

VOLUME III

XV  CREATION AND ANNIHILATION OPERATORS FOR IDENTICAL PARTICLES   1591
READER'S GUIDE FOR COMPLEMENTS   1617
AXV  Particles and holes   1621
BXV  Ideal gas in thermal equilibrium; quantum distribution functions   1625
CXV  Condensed boson system, Gross-Pitaevskii equation   1643
DXV  Time-dependent Gross-Pitaevskii equation   1657
EXV  Fermion system, Hartree-Fock approximation   1677
FXV  Fermions, time-dependent Hartree-Fock approximation   1701
GXV  Fermions or Bosons: Mean field thermal equilibrium   1711
HXV  Applications of the mean field method for non-zero temperature   1733

XVI  FIELD OPERATOR   1751
READER'S GUIDE FOR COMPLEMENTS   1767
AXVI  Spatial correlations in an ideal gas of bosons or fermions   1769
BXVI  Spatio-temporal correlation functions, Green's functions   1781
CXVI  Wick's theorem   1799

XVII  PAIRED STATES OF IDENTICAL PARTICLES   1811
READER'S GUIDE FOR COMPLEMENTS   1843
AXVII  Pair field operator for identical particles   1845
BXVII  Average energy in a paired state   1869
CXVII  Fermion pairing, BCS theory   1889
DXVII  Cooper pairs   1927
EXVII  Condensed repulsive bosons   1933

XVIII  REVIEW OF CLASSICAL ELECTRODYNAMICS   1957
READER'S GUIDE FOR COMPLEMENTS   1977
AXVIII  Lagrangian formulation of electrodynamics   1979

XIX  QUANTIZATION OF ELECTROMAGNETIC RADIATION   1997
READER'S GUIDE FOR COMPLEMENTS   2017
AXIX  Momentum exchange between atoms and photons   2019
BXIX  Angular momentum of radiation   2043
CXIX  Angular momentum exchange between atoms and photons   2055

XX  ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS   2067
READER'S GUIDE FOR COMPLEMENTS   2095
AXX  A multiphoton process: two-photon absorption   2097
BXX  Photoionization   2109
CXX  Two-level atom in a monochromatic field. Dressed-atom method   2129
DXX  Light shifts: a tool for manipulating atoms and fields   2151
EXX  Detection of one- or two-photon wave packets, interference   2163

XXI  QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL'S INEQUALITIES   2187
READER'S GUIDE FOR COMPLEMENTS   2215
AXXI  Density operator and correlations; separability   2217
BXXI  GHZ states, entanglement swapping   2227
CXXI  Measurement induced relative phase between two condensates   2237
DXXI  Emergence of a relative phase with spin condensates; macroscopic non-locality and the EPR argument   2253

APPENDICES
IV  Feynman path integral   2267
V  Lagrange multipliers   2281
VI  Brief review of Quantum Statistical Mechanics   2285
VII  Wigner transform   2297

BIBLIOGRAPHY OF VOLUME III   2325
INDEX   2333

Chapter I

Waves and particles. Introduction to the fundamental ideas of quantum mechanics

A  Electromagnetic waves and photons   3
   A-1  Light quanta and the Planck-Einstein relations   3
   A-2  Wave-particle duality   3
   A-3  The principle of spectral decomposition   7
B  Material particles and matter waves   10
   B-1  The de Broglie relations   10
   B-2  Wave functions. Schrödinger equation   11
C  Quantum description of a particle. Wave packets   13
   C-1  Free particle   14
   C-2  Form of the wave packet at a given time   15
   C-3  Heisenberg relations   19
   C-4  Time evolution of a free wave packet   20
D  Particle in a time-independent scalar potential   23
   D-1  Separation of variables. Stationary states   24
   D-2  One-dimensional "square" potentials. Qualitative study   26

In the present state of scientific knowledge, quantum mechanics plays a fundamental role in the description and understanding of natural phenomena. In fact, phenomena that occur on a very small (atomic or subatomic) scale cannot be explained outside the framework of quantum physics. For example, the existence and the properties of atoms, the chemical bond and the propagation of an electron in a crystal cannot be understood
in terms of classical mechanics. Even when we are concerned only with macroscopic physical objects (that is, whose dimensions are comparable to those encountered in everyday life), it is necessary, in principle, to begin by studying the behavior of their various constituent atoms, ions, electrons, in order to arrive at a complete scientific description. Actually, there are many phenomena that reveal, on a macroscopic scale, the quantum behaviour of nature. It is in this sense that it can be said that quantum mechanics is the basis of our present understanding of all natural phenomena, including those traditionally treated in chemistry, biology, etc... From a historical point of view, quantum ideas contributed to a remarkable unification of the concepts of fundamental physics by treating material particles and radiation on the same footing. At the end of the nineteenth century, people distinguished between two entities in physical phenomena: matter and radiation. Completely different laws were used for each one. To predict the motion of material bodies, the laws of Newtonian mechanics (cf. Appendix III) were utilized. Their success, though of long standing, was none the less impressive. With regard to radiation, the theory of electromagnetism, thanks to the introduction of Maxwell’s equations, had produced a unified interpretation of a set of phenomena which had previously been considered as belonging to different domains: electricity, magnetism and optics. In particular, the electromagnetic theory of radiation had been spectacularly confirmed experimentally by the discovery of Hertzian waves. Finally, interactions between radiation and matter were well explained by the Lorentz force. This set of laws had brought physics to a point which could be considered satisfactory, in view of the experimental data at the time. However, at the beginning of the twentieth century, physics was to be marked by the profound upheaval that led to the introduction of relativistic mechanics and quantum mechanics. The relativistic “revolution” and the quantum “revolution” were, to a large extent, independent, since they challenged classical physics on different points. Classical laws cease to be valid for material bodies travelling at very high speeds, comparable to that of light (relativistic domain). In addition, they are also found to be wanting on an atomic or subatomic scale (quantum domain). However, it is important to note that classical physics, in both cases, can be seen as an approximation of the new theories, an approximation which is valid for most phenomena on an everyday scale. For example, Newtonian mechanics enables us to predict correctly the motion of a solid body, providing it is non-relativistic (speeds much smaller than that of light) and macroscopic (dimensions much greater than atomic ones). Nevertheless, from a fundamental point of view, quantum theory remains indispensable. It is the only theory which enables us to understand the very existence of a solid body and the values of the macroscopic parameters (density, specific heat, elasticity, etc...) associated with it. It is possible to develop a theory that is at the same time quantum and relativistic, but such a theory is relatively complex. However, most atomic and molecular phenomena are well explained by the non-relativistic quantum mechanics that we intend to examine here. This chapter is an introduction to quantum ideas and “vocabulary”. No attempt is made here to be rigorous or complete. 
The essential goal is to awaken the curiosity of the reader. Phenomena will be described which unsettle ideas as firmly anchored in our intuition as the concept of a trajectory. We want to render the quantum theory "plausible" for the reader by showing simply and qualitatively how it enables us to solve the problems which are encountered on an atomic scale. We shall later return to the various ideas introduced in this chapter and go into further detail, either from the point of view of the mathematical formalism (Chap. II) or from the physical point of view (Chap. III).

In the first section (§ A), we introduce the basic quantum ideas (wave-particle duality, the measurement process), relying on well-known optical experiments. Then we show (§ B) how these ideas can be extended to material particles (wave function, Schrödinger equation). We next study in more detail the characteristics of the "wave packet" associated with a particle, and we introduce the Heisenberg relations (§ C). Finally, we discuss some simple examples of typical quantum effects (§ D).

A. Electromagnetic waves and photons

A-1. Light quanta and the Planck-Einstein relations

Newton considered light to be a beam of particles, able, for example, to bounce back upon reflection from a mirror. During the first half of the nineteenth century, the wavelike nature of light was demonstrated (interference, diffraction). This later enabled optics to be integrated into electromagnetic theory. In this framework, the speed of light, c, is related to electric and magnetic constants, and light polarization phenomena can be interpreted as manifestations of the vectorial character of the electric field. However, the study of blackbody radiation, which electromagnetic theory could not explain, led Planck to suggest the hypothesis of the quantization of energy (1900): for an electromagnetic wave of frequency ν, the only possible energies are integral multiples of the quantum hν, where h is a new fundamental constant. Generalizing this hypothesis, Einstein proposed a return to the particle theory (1905): light consists of a beam of photons, each possessing an energy hν. Einstein showed how the introduction of photons made it possible to understand, in a very simple way, certain as yet unexplained characteristics of the photoelectric effect. Twenty years had to elapse before the photon was actually shown to exist, as a distinct entity, by the Compton effect (1924).

These results lead to the following conclusion: the interaction of an electromagnetic wave with matter occurs by means of elementary indivisible processes, in which the radiation appears to be composed of particles, the photons. Particle parameters (the energy E and the momentum p of a photon) and wave parameters (the angular frequency ω = 2πν and the wave vector k, where |k| = 2π/λ, with ν the frequency and λ the wavelength) are linked by the fundamental relations:

    E = hν = ℏω
    p = ℏk        (Planck-Einstein relations)        (A-1)

where ℏ = h/2π is defined in terms of the Planck constant h:

    h ≃ 6.62 × 10⁻³⁴ Joule · second        (A-2)

During each elementary process, energy and total momentum must be conserved.
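As a quick numerical illustration of relations (A-1) and (A-2) (this short sketch is not part of the original text; the 0.5 µm wavelength is simply an arbitrary value, typical of visible light), one can compute the energy and momentum carried by a single photon:

    # Illustrative numbers only: one photon of visible light (wavelength 0.5 micrometer),
    # using the Planck-Einstein relations (A-1) and the value of h given in (A-2).
    import math

    h = 6.62e-34            # Planck constant, Joule * second
    hbar = h / (2 * math.pi)
    c = 3.0e8               # speed of light, m/s

    wavelength = 0.5e-6     # m (arbitrary choice, typical of visible light)
    nu = c / wavelength     # frequency
    omega = 2 * math.pi * nu
    k = 2 * math.pi / wavelength

    energy = h * nu         # E = h nu = hbar omega  ->  about 4.0e-19 J (roughly 2.5 eV)
    momentum = hbar * k     # p = hbar k             ->  about 1.3e-27 kg m/s

    print(energy, hbar * omega)   # the two expressions for E coincide
    print(momentum)

The smallness of these numbers on an everyday scale is what makes the granular structure of light so difficult to observe directly.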

A-2. Wave-particle duality

Thus we have returned to a particle conception of light. Does this mean that we must abandon the wave theory? Certainly not. We shall see that typical wave phenomena such as interference and diffraction could not be explained in a purely particle framework. Analyzing Young's well-known double-slit experiment will lead us to the following conclusion: a complete interpretation of the phenomena can be obtained only by conserving both the wave aspect and the particle aspect of light (although they seem a priori irreconcilable). We shall then show how this paradox can be resolved by the introduction of the fundamental quantum concepts.

A-2-a. Analysis of Young's double-slit experiment

The device used in this experiment is shown schematically in Figure 1. The monochromatic light emitted by the source S falls on an opaque screen P pierced by two narrow slits F₁ and F₂, which illuminate the observation screen E (a photographic plate, for example). If we block F₂, we obtain on E a light intensity distribution I₁(x) which is the diffraction pattern of F₁. In the same way, when F₁ is obstructed, the diffraction pattern of F₂ is described by I₂(x). When the two slits F₁ and F₂ are open at the same time, we observe a system of interference fringes on the screen. In particular, we note that the corresponding intensity I(x) is not the sum of the intensities produced by F₁ and F₂ separately:

    I(x) ≠ I₁(x) + I₂(x)    (A-3)

How could one conceive of explaining, in terms of a particle theory (seen, in the preceding section, to be necessary), the experimental results just described? The existence of a diffraction pattern when only one of the two slits is open could, for example, be explained as being due to photon collisions with the edges of the slit. Such an explanation would, of course, have to be developed more precisely, and a more detailed study would show it to be insufficient. Instead, let us concentrate on the interference phenomenon. We could attempt to explain it by an interaction between the photons which pass through the slit F₁ and those which pass through the slit F₂. Such an explanation would lead to the following prediction: if the intensity of the source S (the number of photons emitted per second) is diminished until the photons strike the screen practically one by one, the interaction between the photons must diminish and, eventually, vanish. The interference fringes should therefore disappear.

Before we indicate the answer given by experiment, recall that the wave theory provides a completely natural interpretation of the fringes. The light intensity at a point x of the screen E is proportional to the square of the amplitude of the electric field at this point. If E₁(x) and E₂(x) represent, in complex notation, the electric fields produced at x by slits F₁ and F₂ respectively (the slits behave like secondary sources), the total resultant field at this point when F₁ and F₂ are both open is¹:

    E(x) = E₁(x) + E₂(x)    (A-4)

Using complex notation, we then have:

    I(x) ∝ |E(x)|² = |E₁(x) + E₂(x)|²    (A-5)

Since the intensities I₁(x) and I₂(x) are proportional, respectively, to |E₁(x)|² and |E₂(x)|², formula (A-5) shows that I(x) differs from I₁(x) + I₂(x) by an interference term which depends on the phase difference between E₁(x) and E₂(x) and whose presence explains the fringes.

¹ Since the experiment studied here is performed with unpolarized light, the vectorial character of the electric field does not play an essential role. For the sake of simplicity, we ignore it in this paragraph.

Figure 1: Diagram of Young’s double-slit light interference experiment (fig. a). Each of the slits F₁ and F₂ produces a diffraction pattern on the screen E. The corresponding intensities are I₁(x) and I₂(x) (solid lines in figure b). When the two slits F₁ and F₂ are open simultaneously, the intensity I(x) observed on the screen is not the sum I₁(x) + I₂(x) (dashed lines in figure b), but shows oscillations due to the interference between the electric fields radiated by F₁ and F₂ (solid line in figure c).

The wave theory thus predicts that diminishing the intensity of the source S will simply cause the fringes to diminish in intensity but not vanish.

What actually happens when S emits photons practically one by one? Neither the predictions of the wave theory nor those of the particle theory are verified. In fact:

(i) If we cover the screen E with a photographic plate and increase the exposure time so as to capture a large number of photons on each photograph, we observe when we develop them that the fringes have not disappeared. Therefore, the purely corpuscular interpretation, according to which the fringes are due to an interaction between photons, must be rejected.

(ii) On the other hand, we can expose the photographic plate during a time so short that it can only receive a few photons. We then observe that each photon produces a localized impact on E and not a very weak interference pattern. Therefore, the purely wave interpretation must also be rejected.

In reality, as more and more photons strike the photographic plate, the following phenomenon occurs. Their individual impacts seem to be distributed in a random manner, and only when a great number of them have reached E does the distribution of the impacts begin to have a continuous aspect. The density of the impacts at each point of E corresponds to the interference fringes: maximum on a bright fringe and zero on a dark fringe. It can thus be said that the photons, as they arrive, build up the interference pattern.

The result of this experiment therefore leads, apparently, to a paradox. Within the framework of the particle theory, for example, it can be expressed in the following way. Since photon-photon interactions are excluded, each photon must be considered separately. But then it is not clear why the phenomena should change drastically according to whether only one slit or both slits are open. For a photon passing through one of the slits, why should the fact that the other is open or closed have such a critical importance?

Before we discuss this problem, note that in the preceding experiment we did not seek to determine through which slit each photon passed before it reached the screen. In order to obtain this information, we can imagine placing detectors (photomultipliers) behind F₁ and F₂. It will then be observed that, if the photons arrive one by one, each one passes through a well-determined slit (a signal is recorded either by the detector placed behind F₁ or by the one covering F₂, but not by both at once). But, obviously, the photons detected in this way are absorbed and do not reach the screen. Remove the photomultiplier which blocks F₁, for example. The one which remains behind F₂ tells us that, out of a large number of photons, about half pass through F₂. We conclude that the others (which can continue as far as the screen) pass through F₁. But the pattern that they gradually construct on the screen is not an interference pattern, since F₂ is blocked. It is only the diffraction pattern of F₁.

A-2-b. Quantum unification of the two aspects of light

The preceding analysis shows that it is impossible to explain all the phenomena observed if only one of the two aspects of light, wave or particle, is considered. Now these two aspects seem to be mutually exclusive. To overcome this difficulty, it thus becomes indispensable to reconsider in a critical way the concepts of classical physics. We must accept the possibility that these concepts, although our everyday experience leads us to consider them well-founded, may not be valid in the new (“microscopic”) domain which we are entering. For example, an essential characteristic of this new domain appeared when we placed counters behind Young’s slits: when one performs a measurement on a microscopic system, one disturbs it in a fundamental fashion. This is a new property since, in the macroscopic domain, we always have the possibility of conceiving measurement devices whose influence on the system is practically as weak as one might wish. This critical revision of classical physics is imposed by experiment and must of course be guided by experiment.

Let us reconsider the “paradox” stated above concerning the photon which passes through one slit but behaves differently depending on whether the other slit is open or closed. We saw that if we try to detect the photons when they cross the slits, we prevent them from reaching the screen. More generally, a detailed experimental analysis shows that it is impossible to observe the interference pattern and to know at the same time through which slit each photon has passed (cf. Complement DI). Thus it is necessary, in order to resolve the paradox, to give up the idea that a photon inevitably passes through a particular slit. We are then led to question the concept, which is a fundamental one of classical physics, of a particle’s trajectory.

Moreover, as the photons arrive one by one, their impacts on the screen gradually build up the interference pattern. This implies that, for a particular photon, we are not certain in advance where it will strike the screen. Now these photons are all emitted under the same conditions. Thus another classical idea has been destroyed: that the initial conditions completely determine the subsequent motion of a particle. We can only say, when a photon is emitted, that the probability of its striking the screen at x is proportional to the intensity I(x) calculated using wave theory, that is, to |E(x)|².

After many tentative efforts that we shall not describe here, the concept of wave-particle duality was formulated. We can summarize it schematically as follows²:

(i) The particle and wave aspects of light are inseparable. Light behaves simultaneously like a wave and like a flux of particles, the wave enabling us to calculate the probability of the manifestation of a particle.

(ii) Predictions about the behavior of a photon can only be probabilistic.

(iii) The information about a photon at time t is given by the wave E(r, t), which is a solution of Maxwell’s equations. We say that this wave characterizes the state of the photons at time t; E(r, t) is interpreted as the probability amplitude of a photon appearing, at time t, at the point r. This means that the corresponding probability is proportional to |E(r, t)|².

² It is worth noting that this interpretation of physical phenomena, generally considered to be “orthodox”, is not unique; other interpretations have been proposed, which are still discussed among physicists.

Comments:

(i) Since Maxwell’s equations are linear and homogeneous, we can use a superposition principle: if E₁ and E₂ are two solutions of these equations, then E = λ₁E₁ + λ₂E₂, where λ₁ and λ₂ are constants, is also a solution. It is this superposition principle which explains wave phenomena in classical optics (interference, diffraction). In quantum physics, the interpretation of E(r, t) as a probability amplitude is thus essential to the persistence of such phenomena.

(ii) The theory merely allows one to calculate the probability of the occurrence of a given event. Experimental verifications must thus be founded on the repetition of a large number of identical experiments. In the above experiment, a large number of photons, all produced in the same way, are emitted successively and build up the interference pattern, according to the calculated probabilities.

(iii) We are talking here about “the photon state” so as to be able to develop in § B an analogy between E(r, t) and the wave function ψ(r, t) that characterizes the quantum state of a material particle. This “optical analogy” is very fruitful. In particular, as we shall see in § D, it allows us to understand, simply and without recourse to calculation, various quantum properties of material particles. However, we should not push it too far and let it lead us to believe that it is rigorously correct to consider E(r, t) as characterizing the quantum state of a photon. Furthermore, we shall see that the fact that ψ(r, t) is complex is essential in quantum mechanics, while the complex notation E(r, t) is used in optics purely for convenience (only its real part has a physical meaning). The precise definition of the (complex) quantum state of radiation can only be given in the framework of quantum electrodynamics, a theory which is simultaneously quantum mechanical and relativistic. We shall not consider these problems here (they will briefly be discussed in Complement KV).

A-3. The principle of spectral decomposition

Armed with the ideas introduced in § A-2, we are now going to discuss another simple optical experiment, whose subject is the polarization of light. This will permit us to introduce the fundamental concepts which concern the measurement of physical quantities.

Figure 2: A simple measurement experiment relating to the polarization of a light wave. A beam of light propagates along the direction Oz and crosses successively the polarizer P and the analyzer A; θ is the angle between Ox and the electric field of the wave transmitted by P. The vibrations transmitted by A are parallel to Ox.

The experiment consists of directing a polarized monochromatic plane light wave onto an analyzer A. Oz designates the direction of propagation of this wave and e_p, the unit vector describing its polarization (cf. Fig. 2). The analyzer A transmits light polarized parallel to Ox and absorbs light polarized parallel to Oy.

The classical description of this experiment (a description which is valid for a sufficiently intense light beam) is the following. The polarized plane wave is characterized by an electric field of the form:

    E(r, t) = E₀ e_p e^{i(kz − ωt)}    (A-6)

where E₀ is a constant. The light intensity I is proportional to |E₀|². After its passage through the analyzer A, the plane wave is polarized along Ox:

    E′(r, t) = E₀′ e_x e^{i(kz − ωt)}    (A-7)

and its intensity I′, proportional to |E₀′|², is given by Malus’ law:

    I′ = I cos²θ    (A-8)

[e_x is the unit vector of the Ox axis and θ is the angle between e_p and e_x].

What will happen on the quantum level, that is, when I is weak enough for the photons to reach the analyzer one by one? (We then place a photon detector behind this analyzer.) First of all, the detector never registers a “fraction of a photon”. Either the photon crosses the analyzer or it is entirely absorbed by it. Next (except in special cases that we shall examine in a moment), we cannot predict with certainty whether a given incident photon will pass or be absorbed. We can only know the corresponding probabilities. Finally, if we send out a large number N of photons one after the other, the result will correspond to the classical law, in the sense that about N cos²θ photons will be detected after the analyzer.

We shall retain the following ideas from this description:

(i) The measurement device (the analyzer, in this case) can give only certain privileged results, which we shall call eigen (or proper) results³. In the above experiment, there are only two possible results: the photon crosses the analyzer or it is stopped. One says that there is quantization of the result of the measurement, in contrast to the classical case [cf. formula (A-8)] where the transmitted intensity I′ can vary continuously, according to the value of θ, between 0 and I.

(ii) To each of these eigen results corresponds an eigenstate. Here, the two eigenstates are characterized by:

    e_p = e_x  or  e_p = e_y    (A-9)

(e_y is the unit vector of the Oy axis). If e_p = e_x, we know with certainty that the photon will traverse the analyzer; if e_p = e_y, it will, on the contrary, definitely be stopped. The correspondence between eigen results and eigenstates is therefore the following. If the particle is, before the measurement, in one of the eigenstates, the result of this measurement is certain: it can only be the associated eigen result.

(iii) When the state before the measurement is arbitrary, only the probabilities of obtaining the different eigen results can be predicted. To find these probabilities, one decomposes the state of the particles into a linear combination of the various eigenstates. Here, for an arbitrary e_p, we write:

    e_p = e_x cos θ + e_y sin θ    (A-10)

The probability of obtaining a given eigen result is then proportional to the square of the absolute value of the coefficient of the corresponding eigenstate. The proportionality factor is determined by the condition that the sum of all these probabilities must be equal to 1. We thus deduce from (A-10) that each photon has a probability cos²θ of traversing the analyzer and a probability sin²θ of being absorbed by it (we know that cos²θ + sin²θ = 1). This is indeed what was stated above. This rule is called in quantum mechanics the principle of spectral decomposition. Note that the decomposition to be performed depends on the type of measurement device being considered, since one must use the eigenstates which correspond to it: in formula (A-10), the choice of the axes Ox and Oy is fixed by the analyzer.

(iv) After passing through the analyzer, the light is completely polarized along e_x. If we place, after the first analyzer A, a second analyzer A′, having the same axis, all the photons which traversed A will also traverse A′. According to what we have just seen in point (ii), this means that, after they have crossed A, the state of the photons is the eigenstate characterized by e_x. There has therefore been an abrupt change in the state of the particles. Before the measurement, this state was defined by a vector E(r, t) which was collinear with e_p. After the measurement, we possess an additional piece of information (the photon has passed) which is incorporated by describing the state by a different vector, which is now collinear with e_x. This expresses the fact, already pointed out in § A-2, that the measurement disturbs the microscopic system (here, the photon) in a fundamental fashion.

³ The reason for this name will appear in Chapter III.

Comment:

The certain prediction of the result when e_p = e_x or e_p = e_y is only a special case. The probability of one of the possible events is then indeed equal to 1. But, in order to verify this prediction, one must perform a large number of experiments. One must be sure that all the photons pass (or are stopped), since the fact that a particular photon crosses the analyzer (or is absorbed) is not characteristic of e_p = e_x (or e_p = e_y).
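The probabilistic content of the spectral decomposition rule is easy to illustrate numerically. The following sketch (added here for illustration, not part of the original text; the sample size and angle are arbitrary choices) simulates photons sent one by one onto the analyzer, each transmitted with probability cos²θ, and checks that the detected fraction approaches Malus’ law:

```python
import math
import random

theta = math.radians(30)       # arbitrary polarization angle
p_pass = math.cos(theta) ** 2  # transmission probability for one photon, from (A-10)
N = 100_000                    # number of photons sent one by one

random.seed(0)
passed = sum(1 for _ in range(N) if random.random() < p_pass)

print(f"fraction transmitted = {passed / N:.4f}")
print(f"cos^2(theta)         = {p_pass:.4f}")   # classical Malus law recovered on average
```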

B. Material particles and matter waves

B-1. The de Broglie relations

Parallel to the discovery of photons, the study of atomic emission and absorption spectra uncovered a fundamental fact, which classical physics was unable to explain: these spectra are composed of narrow lines. In other words, a given atom emits or absorbs only photons having well-determined frequencies (that is, energies). This fact can be interpreted very easily if one accepts that the energy of the atom is quantized, that is, it can take on only certain discrete values Eᵢ (i = 1, 2, ...): the emission or absorption of a photon is then accompanied by a “jump” in the energy of the atom from one permitted value Eᵢ to another Eⱼ. Conservation of energy implies that the photon has a frequency νᵢⱼ such that:

    hνᵢⱼ = |Eᵢ − Eⱼ|    (B-1)

Only frequencies which obey (B-1) can therefore be emitted or absorbed by the atom.

The existence of such discrete energy levels was confirmed independently by the Franck-Hertz experiment. Bohr interpreted this in terms of privileged electronic orbits and stated, with Sommerfeld, an empirical rule which permitted the calculation of these orbits for the case of the hydrogen atom. But the fundamental origin of these quantization rules remained mysterious.

In 1923, however, de Broglie put forth the following hypothesis: material particles, just like photons, can have a wavelike aspect. He then derived the Bohr-Sommerfeld quantization rules as a consequence of this hypothesis, the various permitted energy levels appearing as analogues of the normal modes of a vibrating string. Electron diffraction experiments (Davisson and Germer, 1927) strikingly confirmed the existence of a wavelike aspect of matter by showing that interference patterns could be obtained with material particles such as electrons.

One therefore associates with a material particle of energy E and momentum p, a wave whose angular frequency ω = 2πν and wave vector k are given by the same relations as for photons (cf. § A-1):

    E = hν = ℏω
    p = ℏk    (B-2)

In other words, the corresponding wavelength is:

    λ = 2π/|k| = h/|p|    (de Broglie relation)    (B-3)

Comment:

The very small value of the Planck constant h explains why the wavelike nature of matter is very difficult to demonstrate on a macroscopic scale. Complement AI of this chapter discusses the orders of magnitude of the de Broglie wavelengths associated with various material particles.
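As an order-of-magnitude sketch in the spirit of Complement AI (added here; the masses and speeds below are illustrative choices, not values from the text), one can compare the de Broglie wavelength (B-3) of an electron with that of a macroscopic object:

```python
import math

h = 6.62e-34                      # Planck constant (J s)

def de_broglie_wavelength(m, v):
    """lambda = h / p for a non-relativistic particle of mass m and speed v."""
    return h / (m * v)

# electron accelerated through ~100 V: v = sqrt(2 e U / m_e)
m_e, e_charge, U = 9.1e-31, 1.6e-19, 100.0
v_e = math.sqrt(2 * e_charge * U / m_e)

# a 1 g dust grain moving at 1 mm/s (macroscopic example)
m_dust, v_dust = 1e-3, 1e-3

print(f"electron (100 eV):   lambda = {de_broglie_wavelength(m_e, v_e):.2e} m")  # ~ 1 Angstrom
print(f"1 g grain at 1 mm/s: lambda = {de_broglie_wavelength(m_dust, v_dust):.2e} m")
```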

B-2. Wave functions. Schrödinger equation

In accordance with de Broglie’s hypothesis, we shall apply the ideas introduced in § A for the case of the photon to all material particles. Recalling the conclusions of this paragraph, we are led to the following formulation:

(i) For the classical concept of a position and a momentum, we must substitute the concept of a time-varying state. The quantum state of a particle such as the electron⁴ is characterized by a wave function ψ(r, t), which contains all the information it is possible to obtain about the particle.

(ii) ψ(r, t) is interpreted as a probability amplitude of the particle’s presence. Since the possible positions of the particle form a continuum, the probability dP(r, t) of the particle being, at time t, in a volume element d³r = dx dy dz situated at the point r must be proportional to d³r. It is therefore infinitesimal, and |ψ(r, t)|² is interpreted as the corresponding probability density, with:

    dP(r, t) = C |ψ(r, t)|² d³r    (B-4)

where C is a normalization constant [see comment (i) at the end of § B-2].

(iii) The principle of spectral decomposition applies to the measurement of an arbitrary physical quantity:

– The result found must belong to a set of eigen results {a}.

– With each eigenvalue a is associated an eigenstate, that is, an eigenfunction ψₐ(r). This function is such that, if ψ(r, t₀) = ψₐ(r) (where t₀ is the time at which the measurement is performed), the measurement will always yield a.

– For any ψ(r, t), the probability Pₐ of finding the eigenvalue a for a measurement at time t₀ is found by decomposing ψ(r, t₀) in terms of the functions ψₐ(r):

    ψ(r, t₀) = Σₐ cₐ ψₐ(r)    (B-5)

Then:

    Pₐ = |cₐ|² / Σₐ |cₐ|²    (B-6)

(the presence of the denominator ensures that the total probability is equal to unity: Σₐ Pₐ = 1).

– If the measurement indeed yields a, the wave function of the particle immediately after the measurement is:

    ψ′(r, t₀) = ψₐ(r)    (B-7)

(iv) The equation describing the evolution of the function ψ(r, t) remains to be written. It is possible to introduce it in a very natural way, using the Planck and de Broglie relations. Nevertheless, we have no intention of proving this fundamental equation, which is called the Schrödinger equation. We shall simply assume it. Later, we shall discuss some of its consequences (whose experimental verification will prove its validity). Besides, we shall consider this equation in much more detail in Chapter III. When the particle (of mass m) is subjected to the influence of a potential⁵ V(r, t), the Schrödinger equation takes on the form:

    iℏ ∂ψ(r, t)/∂t = −(ℏ²/2m) Δψ(r, t) + V(r, t) ψ(r, t)    (B-8)

where Δ is the Laplacian operator ∂²/∂x² + ∂²/∂y² + ∂²/∂z². We notice immediately that this equation is linear and homogeneous in ψ. Consequently, for material particles, there exists a superposition principle which, combined with the interpretation of ψ as a probability amplitude, is the source of wavelike effects. Note, moreover, that the differential equation (B-8) is first-order with respect to time. This condition is necessary if the state of the particle at a time t₀, characterized by ψ(r, t₀), is to determine its subsequent state.

Thus there exists a fundamental analogy between matter and radiation: in both cases, a correct description of the phenomena necessitates the introduction of quantum concepts, and, in particular, the idea of wave-particle duality.

⁴ We shall not take into account here the existence of electron spin (cf. Chap. IX).

⁵ V(r, t) designates a potential energy here. For example, it may be the product of an electric potential and the particle’s charge. In quantum mechanics, V(r, t) is commonly called a potential.

Comments:

(i) For a system composed of only one particle, the total probability of finding the particle anywhere in space, at time t, is equal to 1:

    ∫ dP(r, t) = 1    (B-9)

Since dP(r, t) is given by formula (B-4), we conclude that the wave function ψ(r, t) must be square-integrable:

    ∫ |ψ(r, t)|² d³r  is finite    (B-10)

The normalization constant C that appears in (B-4) is then given by the relation:

    1/C = ∫ |ψ(r, t)|² d³r    (B-11)

(we shall later see that the form of the Schrödinger equation implies that C is time-independent). One often uses wave functions which are normalized, such that:

    ∫ |ψ(r, t)|² d³r = 1    (B-12)

The constant C is then equal to 1.
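As a small numerical sketch of (B-11) and (B-12) (added here, not part of the original text; the one-dimensional Gaussian profile below is an arbitrary example), one can normalize a wave function sampled on a grid:

```python
import numpy as np

# arbitrary, un-normalized one-dimensional wave function psi(x) ~ exp(-x^2 / (2 a^2))
a = 1.5
x = np.linspace(-20, 20, 4001)
psi = np.exp(-x**2 / (2 * a**2)).astype(complex)

norm = np.trapz(np.abs(psi)**2, x)     # plays the role of 1/C in (B-11)
psi_normalized = psi / np.sqrt(norm)

print("before:", np.trapz(np.abs(psi)**2, x))             # finite, as required by (B-10)
print("after: ", np.trapz(np.abs(psi_normalized)**2, x))  # equal to 1, as in (B-12)
```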

(ii) Note the important difference between the concepts of classical states and quantum states. The classical state of a particle is determined at time t by the specification of six parameters characterizing its position and its velocity at time t: x, y, z; vₓ, v_y, v_z. The quantum state of a particle is determined by an infinite number of parameters: the values at the various points in space of the wave function ψ(r, t) which is associated with it. For the classical idea of a trajectory (the succession in time of the various states of the classical particle), we must substitute the idea of the propagation of the wave associated with the particle. Consider, for example, Young’s double-slit experiment, previously described for the case of photons, but which in principle can also be performed with material particles such as electrons. When the interference pattern is observed, it makes no sense to ask through which slit each particle has passed, since the wave associated with it passed through both.

(iii) It is worth noting that, unlike photons, which can be emitted or absorbed during an experiment, material particles can neither be created nor destroyed. The electrons emitted by a heated filament, for example, already existed in the filament. In the same way, an electron absorbed by a counter does not disappear; it becomes part of an atom or an electric current. Actually, the theory of relativity shows that it is possible to create and annihilate material particles: for example, a photon having sufficient energy, passing near an atom, can materialize into an electron-positron pair. Inversely, the positron, when it collides with an electron, annihilates with it, emitting photons. However, we pointed out in the beginning of this chapter that we would limit ourselves here to the non-relativistic quantum domain, and we have indeed treated time and space coordinates asymmetrically. In the framework of non-relativistic quantum mechanics, material particles can neither be created nor annihilated. This conservation law, as we shall see, plays a role of primary importance. The need to abandon it is one of the important difficulties encountered when one tries to construct a relativistic quantum mechanics.

C. Quantum description of a particle. Wave packets

In the preceding paragraph, we introduced the fundamental concepts necessary for the quantum description of a particle. In this paragraph, we are going to familiarize ourselves with these concepts and deduce from them several very important properties. We start with the very simple case of a free particle.

C-1. Free particle

Consider a particle whose potential energy is zero (or has a constant value) at every point in space. The particle is thus not subjected to any force; it is said to be free.

When V(r, t) = 0, the Schrödinger equation becomes:

    iℏ ∂ψ(r, t)/∂t = −(ℏ²/2m) Δψ(r, t)    (C-1)

This differential equation is obviously satisfied by solutions of the form:

    ψ(r, t) = A e^{i(k·r − ωt)}    (C-2)

(where A is a constant), on the condition that k and ω satisfy the relation:

    ω = ℏk²/2m    (C-3)

Observe that, according to the de Broglie relations [see (B-2)], condition (C-3) expresses the fact that the energy E and the momentum p of a free particle satisfy the equation, which is well-known in classical mechanics:

    E = p²/2m    (C-4)
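A quick symbolic check of (C-1)-(C-3) (a sketch added here, not part of the original text) confirms that a one-dimensional plane wave of the form (C-2) solves the free Schrödinger equation exactly when the dispersion relation ω = ℏk²/2m holds:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
k, m, hbar = sp.symbols('k m hbar', positive=True)

omega = hbar * k**2 / (2 * m)                  # dispersion relation (C-3)
psi = sp.exp(sp.I * (k * x - omega * t))       # one-dimensional plane wave (C-2)

lhs = sp.I * hbar * sp.diff(psi, t)            # i hbar d(psi)/dt
rhs = -hbar**2 / (2 * m) * sp.diff(psi, x, 2)  # -(hbar^2 / 2m) d^2(psi)/dx^2

print(sp.simplify(lhs - rhs))                  # prints 0: equation (C-1) is satisfied
```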

We shall come back later (§ C-3) to the physical interpretation of a state of the form (C-2). We already see that, since:

    |ψ(r, t)|² = |A|²    (C-5)

a plane wave of this type represents a particle whose probability of presence is uniform throughout all space (see comment below).

The principle of superposition tells us that every linear combination of plane waves satisfying (C-3) will also be a solution of equation (C-1). Such a superposition can be written:

    ψ(r, t) = 1/(2π)^{3/2} ∫ g(k) e^{i[k·r − ω(k)t]} d³k    (C-6)

(d³k represents, by definition, the infinitesimal volume element in k-space: dkₓ dk_y dk_z). g(k), which can be complex, must be sufficiently regular to allow differentiation inside the integral. It can be shown, moreover, that any square-integrable solution can be written in the form (C-6).

A wave function such as (C-6), a superposition of plane waves, is called a three-dimensional “wave packet”. For the sake of simplicity, we shall often be led to study the case of a one-dimensional wave packet⁶ of plane waves all propagating parallel to Ox. The wave function then depends only on x and t:

    ψ(x, t) = 1/√(2π) ∫₋∞^{+∞} g(k) e^{i[kx − ω(k)t]} dk    (C-7)

⁶ A simple model of a two-dimensional wave packet is presented in Complement EI. Some general properties of three-dimensional wave packets are studied in Complement FI, which also shows how, in certain cases, a three-dimensional problem can be reduced to several one-dimensional problems.

In the following paragraph, we shall be interested in the form of the wave packet at a given instant. If we choose this instant as the time origin, the wave function is written:

    ψ(x, 0) = 1/√(2π) ∫ g(k) e^{ikx} dk    (C-8)

We see that g(k) is simply the Fourier transform (cf. Appendix I) of ψ(x, 0):

    g(k) = 1/√(2π) ∫ ψ(x, 0) e^{−ikx} dx    (C-9)

Consequently, the validity of formula (C-8) is not limited to the case of the free particle: whatever the potential, ψ(x, 0) can always be written in this form. The consequences that we shall derive from this in §§ C-2 and C-3 below are thus perfectly general. It is not until § C-4 that we shall return explicitly to the free particle.

Comment:

A plane wave of type (C-2), whose modulus is constant throughout all space [cf. (C-5)], is not square-integrable. Therefore, rigorously, it cannot represent a physical state of the particle (in the same way as, in optics, a monochromatic plane wave is not physically realizable). On the other hand, a superposition of plane waves like (C-7) can be square-integrable.

C-2. Form of the wave packet at a given time

The form of the wave packet is given by the x-dependence of ψ(x, 0) defined by equation (C-8). Imagine that |g(k)| has the shape depicted in Figure 3, with a pronounced peak situated at k = k₀ and a width (defined, for example, at half its maximum value) of Δk.

Figure 3: Shape of the function |g(k)|, modulus of the Fourier transform of ψ(x, 0): we assume that it is centered at k = k₀, where it reaches a maximum, and has a width of Δk.

Let us begin by trying to understand qualitatively the behavior of ψ(x, 0) through the study of a very simple special case. Let ψ(x, 0), instead of being the superposition of an infinite number of plane waves e^{ikx} as in formula (C-8), be the sum of only three plane waves. The wave vectors of these plane waves are k₀, k₀ − Δk/2, k₀ + Δk/2, and their amplitudes are proportional, respectively, to 1, 1/2 and 1/2. We then have:

    ψ(x) = g₀/√(2π) [ e^{ik₀x} + ½ e^{i(k₀ − Δk/2)x} + ½ e^{i(k₀ + Δk/2)x} ]
         = g₀/√(2π) e^{ik₀x} [ 1 + cos(Δk x/2) ]    (C-10)

We see that |ψ(x)| is maximum when x = 0. This result is due to the fact that, when x takes on this value, the three waves are in phase and interfere constructively, as shown in Figure 4. As one moves away from the value x = 0, the waves become more and more out of phase, and |ψ(x)| decreases. The interference becomes completely destructive when the phase shift between e^{ik₀x} and e^{i(k₀ ± Δk/2)x} is equal to ±π: |ψ(x)| goes to zero when x = ±Δx/2, Δx being given by:

    Δx · Δk = 4π    (C-11)

This formula shows that the smaller the width Δk of the function |g(k)|, the larger the width Δx of the function |ψ(x)| (the distance between two zeros of |ψ(x)|).

Comment:

Formula (C-10) shows that |ψ(x)| is periodic in x and therefore has a series of maxima and minima. This arises from the fact that ψ(x) is the superposition of a finite number of waves (here, three). For a continuous superposition of an infinite number of waves, as in formula (C-8), such a phenomenon does not occur, and |ψ(x, 0)| can have only one maximum.
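The three-wave model (C-10) is easy to check numerically. The sketch below (added here for illustration; the values of k₀ and Δk are arbitrary) builds the superposition and verifies that |ψ(x)| vanishes at x = ±Δx/2 with Δx · Δk = 4π:

```python
import numpy as np

k0, dk = 5.0, 1.0                        # arbitrary central wave vector and spread
x = np.linspace(-30, 30, 6001)

# sum of the three plane waves with amplitudes 1, 1/2, 1/2, as in (C-10)
psi = (np.exp(1j * k0 * x)
       + 0.5 * np.exp(1j * (k0 - dk / 2) * x)
       + 0.5 * np.exp(1j * (k0 + dk / 2) * x))

dx = 4 * np.pi / dk                      # width predicted by (C-11)
i_half = np.argmin(np.abs(x - dx / 2))   # grid point closest to x = +dx/2

print(f"|psi| at x = 0    : {np.abs(psi[np.argmin(np.abs(x))]):.3f}")  # maximum (= 2)
print(f"|psi| at x = dx/2 : {np.abs(psi[i_half]):.3e}")                # ~ 0 (destructive)
print(f"predicted width dx: {dx:.3f}")
```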

Let us now return to the general wave packet of formula (C-8). Its form also results from an interference phenomenon: |ψ(x, 0)| is maximum when the different plane waves interfere constructively.

Let α(k) be the argument of the function g(k):

    g(k) = |g(k)| e^{iα(k)}    (C-12)

Assume that α(k) varies sufficiently smoothly within the interval [k₀ − Δk/2, k₀ + Δk/2] where |g(k)| is appreciable; then, when Δk is sufficiently small, one can expand α(k) in the neighborhood of k = k₀:

    α(k) ≃ α(k₀) + (k − k₀) [dα/dk]_{k = k₀}    (C-13)

which enables us to rewrite (C-8) in the form:

    ψ(x, 0) ≃ e^{i[k₀x + α(k₀)]}/√(2π) ∫₋∞^{+∞} |g(k)| e^{i(k − k₀)(x − x₀)} dk    (C-14)

Figure 4: The real parts of the three waves whose sum gives the function ψ(x) of (C-10). At x = 0, the three waves are in phase and interfere constructively. As one moves away from x = 0, they go out of phase and interfere destructively for x = ±Δx/2. In the lower part of the figure, Re{ψ(x)} is shown. The dashed-line curve corresponds to the function 1 + cos(Δk x/2), which, according to (C-10), gives |ψ(x)| (and therefore, the form of the wave packet).

with:

    x₀ = −[dα/dk]_{k = k₀}    (C-15)

The form (C-14) is useful for studying the variations of |ψ(x, 0)| in terms of x. When |x − x₀| is large, the function of k which is to be integrated oscillates a very large number of times within the interval Δk. We then see (cf. Fig. 5-a, in which the real part of this function is depicted) that the contributions of the successive oscillations cancel each other out, and the integral over k becomes negligible. In other words, when x is fixed at a value far from x₀, the phases of the various waves which make up ψ(x, 0) vary very rapidly in the domain Δk, and these waves destroy each other by interference. On the other hand, if x ≃ x₀, the function to be integrated over k oscillates hardly at all (cf. Fig. 5-b), and |ψ(x, 0)| is maximum. The position x_M(0) of the center of the wave packet is therefore:

    x_M(0) = x₀ = −[dα/dk]_{k = k₀}    (C-16)

Actually the result (C-16) can be obtained very easily. An integral such as the one appearing in (C-8) will be maximum (in absolute value) when the waves having the

Figure 5: Real part of the function g(k) e^{i(k − k₀)(x − x₀)} to be integrated over k in (C-14), together with |g(k)|: in figure a, |x − x₀| > 1/Δk and the function oscillates many times within the width Δk; in figure b, |x − x₀| is small and it hardly oscillates at all.
~ 2 which, since is fixed, imposes a lower limit on . This corresponds to what can be seen in Figure 2. Comments:

(i) The spreading of a packet of free waves is a general phenomenon which is not limited to the special case studied here. It can be shown that, for an arbitrary free wave packet, the variation in time of its width has the shape shown in Figure 2 (cf. exercise 4 of Complement LIII).

(ii) In Chapter I, a simple argument led us in (C-17) to Δx · Δk ≳ 1, without making any particular hypothesis about g(k). We simply assumed that |g(k)| has a peak of width Δk whose shape is that of Figure 3 of Chapter I (which is indeed the case in this complement). Then how did we obtain Δx · Δk ≫ 1 (for example, for a Gaussian wave packet when t is large)? Of course, this is only an apparent contradiction.

Figure 2: Variation in time of the width Δx of the wave packet of Figure 1. For large t, Δx approaches the dispersion δx_cl of the positions of a group of classical particles which left x = 0 at time t = 0 with a velocity dispersion Δv.
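As a numerical sketch of the behavior shown in Figure 2 (added here for illustration; it assumes the standard result for a minimal Gaussian packet, Δx(t) = Δx(0) √(1 + (ℏt/2mΔx(0)²)²), and arbitrary electron-scale values), one can compare Δx(t) with the classical dispersion δx_cl = Δv · t:

```python
import numpy as np

hbar = 1.055e-34
m = 9.1e-31                      # electron mass (kg), arbitrary illustrative choice
dx0 = 1e-10                      # initial width Delta x(0) = 1 Angstrom

t = np.linspace(0, 5e-15, 6)     # a few femtoseconds

dx_t = dx0 * np.sqrt(1 + (hbar * t / (2 * m * dx0**2))**2)  # Gaussian packet width
dv = hbar / (2 * m * dx0)        # velocity dispersion from Delta x . Delta p = hbar/2
dx_cl = dv * t                   # classical dispersion, the asymptote of Figure 2

for ti, w, c in zip(t, dx_t, dx_cl):
    print(f"t = {ti:.1e} s   Delta x(t) = {w:.2e} m   classical = {c:.2e} m")
```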

In Chapter I, in order to find Δx · Δk ≃ 1, we assumed in (C-13) that the argument α(k) of g(k) could be approximated by a linear function in the domain Δk. Thus we implicitly assumed a supplementary hypothesis: that the nonlinear terms make a negligible contribution to the phase of g(k) in the domain Δk. For example, for the terms which are of second order in (k − k₀), it is necessary that:

    (Δk/2)² |d²α/dk²|_{k = k₀} ≪ 2π    (22)

If, on the contrary, the phase α(k) cannot be approximated in the domain Δk by a linear function with an error much smaller than 2π, we find when we return to the argument of Chapter I that the wave packet is larger than was predicted by (C-17). In the case of the Gaussian wave packet studied in the present complement, we have Δk ≃ 1/a and α(k) = −ℏk²t/2m, so that condition (22) sets an upper bound on |t| of the order of ma²/ℏ. Indeed, we can verify from (20) that, as long as this condition is fulfilled, the product Δx · Δk is approximately equal to 1.



Complement HI
Stationary states of a particle in one-dimensional square potentials

1. Behavior of a stationary wave function φ(x)
   1-a. Regions of constant potential energy
   1-b. Behavior of φ(x) at a potential energy discontinuity
   1-c. Outline of the calculation
2. Some simple cases
   2-a. Potential steps
   2-b. Potential barriers
   2-c. Bound states: square well potential

We saw in Chapter I (cf. § D-2) the interest in studying the motion of a particle in a “square potential” whose rapid spatial variations for certain values of x introduce purely quantum effects. The shape of the wave functions associated with the stationary states of the particle was predicted by considering an optical analogy which enabled us to understand very simply how these new physical effects appear. In this complement, we outline the quantitative calculation of the stationary states of the particle. We shall give the results of this calculation for a certain number of simple cases, and discuss their physical implications. We limit ourselves to one-dimensional models (cf. Complement FI).

1. Behavior of a stationary wave function φ(x)

1-a. Regions of constant potential energy

In the case of a square potential, V(x) is a constant function V(x) = V in certain regions of space. In such a region, equation (D-8) of Chapter I can be written:

    d²φ(x)/dx² + (2m/ℏ²)(E − V) φ(x) = 0    (1)

We shall distinguish between several cases:

(i) E > V. Let us introduce the positive constant k, defined by:

    E − V = ℏ²k²/2m    (2)

The solution of equation (1) can then be written:

    φ(x) = A e^{ikx} + A′ e^{−ikx}    (3)

where A and A′ are complex constants.

(ii) E < V. This condition corresponds to regions of space which would be forbidden to the particle by the laws of classical mechanics. In this case, we introduce the positive constant ρ defined by:

    V − E = ℏ²ρ²/2m    (4)

and the solution of (1) can be written:

    φ(x) = B e^{ρx} + B′ e^{−ρx}    (5)

where B and B′ are complex constants.

(iii) E = V. In this special case, φ(x) is a linear function of x.

1-b. Behavior of φ(x) at a potential energy discontinuity

How does the wave function behave at a point x = x₁, where the potential V(x) is discontinuous? One might expect the wave function φ(x) to behave strangely at this point, becoming itself discontinuous, for example. The aim of this section is to show that this is not the case: φ(x) and dφ/dx are continuous, and it is only the second derivative d²φ/dx² that is discontinuous at x = x₁.

Without giving a rigorous proof, let us try to understand this property. To do this, recall that a square potential must be considered (cf. Chap. I, § D-2-a) as the limit, when ε → 0, of a potential Vε(x) equal to V(x) outside the interval [x₁ − ε, x₁ + ε], and varying continuously within this interval. Then consider the equation:

    d²φε(x)/dx² + (2m/ℏ²)[E − Vε(x)] φε(x) = 0    (6)

where Vε(x) is assumed to be bounded, independently of ε, within the interval [x₁ − ε, x₁ + ε]. Choose a solution φε(x) which, for x ≤ x₁ − ε, coincides with a given solution of (1). The problem is to show that, when ε → 0, φε(x) tends towards a function φ(x) which is continuous and differentiable at x = x₁. Let us grant that φε(x) remains bounded¹, whatever the value of ε, in the neighborhood of x = x₁. Physically, this means that the probability density remains finite. Integrating (6) between x₁ − η and x₁ + η, we obtain:

    dφε/dx (x₁ + η) − dφε/dx (x₁ − η) = (2m/ℏ²) ∫_{x₁−η}^{x₁+η} [Vε(x) − E] φε(x) dx    (7)

At the limit where ε → 0, the function to be integrated on the right-hand side of this expression remains bounded, owing to our previous assumption. Consequently, if η tends towards zero, the integral also tends towards zero, and:

    dφ/dx (x₁ + η) − dφ/dx (x₁ − η) → 0  when  η → 0    (8)

Thus, at this limit, dφ/dx is continuous at x = x₁, and so is φ(x) (since it is the integral of a continuous function). On the other hand, d²φ/dx² is discontinuous, and, as can be seen directly from (1), makes a jump at x = x₁ which is equal to (2m/ℏ²) σ φ(x₁) [where σ represents the change in V(x) at x = x₁].

¹ This point could be proved mathematically from the properties of the differential equation (1).

Comment:

It is essential, in the preceding argument, that Vε(x) remain bounded. In certain exercises of Complement KI, for example, the case is considered for which V(x) = α δ(x), an unbounded function whose integral remains finite. In this case, φ(x) remains continuous, but dφ/dx does not.

1-c. Outline of the calculation

The procedure for determining the stationary states in a “square potential” is therefore the following: in all regions where V(x) is constant, write φ(x) in whichever of the two forms (3) or (5) is applicable; then “match” these functions by requiring the continuity of φ(x) and of dφ/dx at the points where V(x) is discontinuous.

2. Some simple cases

Let us now carry out the quantitative calculation of the stationary states, performed according to the method described above, for all the forms of the potential V(x) considered in § D-2-c of Chapter I. This will ensure that the form of the solutions is indeed the one predicted by the optical analogy.

2-a. Potential steps

Figure 1: Potential step.

α. Case where E > V₀; partial reflection

Set:

    k₁ = √(2mE)/ℏ    (9)

    k₂ = √(2m(E − V₀))/ℏ    (10)

COMPLEMENT HI

The solution of (1) has the form (3) in the two regions I (

I(

)=

II (

e

1

)=

e

2

+

1

+

2

1

2

e

e

0) and II (

0):

(11)

1

(12)

2

Since equation (1) is homogeneous, the calculation method of § 1-c can only enable us to determine the ratios 1 1 , 2 1 and 2 1 . In fact, the two matching conditions at = 0 do not suffice for the determination of these three ratios. This is why we shall choose 2 = 0, which amounts to limiting ourselves to the case of an incident particle coming from = . The matching conditions then give: 1

=

1

2

1

2

1+

2

2 1 +

=

1

(13)

(14)

1

2

I ( ) is the superposition of two waves. The first one (the term in 1 ) corresponds to an incident particle, with momentum = ~ 1 , propagating from left to right. The second one (the term in 1 ) corresponds to a reflected particle, with momentum ~ 1 , propagating in the opposite direction. Since we have chosen 2 = 0, II ( ) consists of only one wave, which is associated with a transmitted particle. We shall see in Chapter III (cf. § D-1-c- ) how it is possible, using the concept of a probability current, to define the transmission coefficient and the reflection coefficient of the potential step (see also § 2 of Complement BIII ). These coefficients give the probability for the particle, arriving from = , to pass the potential step at = 0 or to turn back. Thus we find: 2 1

=

(15)

1

and, for2

: 2

=

2

2

1

1

(16)

Taking (13) and (14) into account, we then have: 4

=1

= 2 The

66

( 4

(

1

1

1 2

+

(17)

2 2)

1 2

+

2)

(18)

2

physical origin of the factor

2

1,

appearing in

is discussed in § 2 of Complement JI .



STATIONARY STATES OF A PARTICLE IN ONE-DIMENSIONAL SQUARE POTENTIALS

It is easy to verify that R + T = 1: it is certain that the particle will be either transmitted or reflected. Contrary to the predictions of classical mechanics, the incident particle has a non-zero probability of turning back. This point was explained in Chapter I, using the optical analogy and considering the reflection of a light wave from a plane interface (with n₁ > n₂). Furthermore, we know that in optics, no phase delay is created by such a reflection; equations (13) and (14) do indeed show that the ratios A₁′/A₁ and A₂/A₁ are real. Therefore, the quantum particle is not slowed down by its reflection or transmission (cf. Complement JI, § 2). Finally, using (9), (10) and (18) it is easy to verify that, if E ≫ V₀, we have T ≃ 1: when the energy of the particle is sufficiently large compared to the height of the potential step, the particle clears this step as if it did not exist.
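As a quick numerical check (a sketch added here, not part of the original text; the energy values are arbitrary), one can evaluate the coefficients R = 1 − 4k₁k₂/(k₁ + k₂)² and T = 4k₁k₂/(k₁ + k₂)² given in (17) and (18), and verify that R + T = 1 and that T → 1 for E ≫ V₀:

```python
import numpy as np

def step_coefficients(E, V0):
    """R and T for a potential step of height V0, case E > V0 (formulas (17)-(18));
    k1 and k2 are proportional to sqrt(E) and sqrt(E - V0), so hbar and m drop out."""
    k1 = np.sqrt(E)
    k2 = np.sqrt(E - V0)
    T = 4 * k1 * k2 / (k1 + k2) ** 2
    return 1 - T, T

V0 = 1.0                                   # step height (arbitrary energy units)
for E in (1.1, 2.0, 10.0, 100.0):          # arbitrary sample energies above V0
    R, T = step_coefficients(E, V0)
    print(f"E/V0 = {E/V0:6.1f}   R = {R:.4f}   T = {T:.4f}   R+T = {R+T:.4f}")
```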

.

Case where

0;

total reflection

We then replace (10) and (12) by: 2 (

)

0

=

~2 II (

)=

2

e

+

2

(19)

2

2

e

(20)

2

For the solution to remain bounded when 2

=0

=

1

2

1

1

= 0 yield in this case:

2

+

2

2 1 1+

2

1

=

(22)

(23)

The reflection coefficient 2

=

, it is necessary that: (21)

The matching conditions at 1

+

1 1

is then equal to:

2

=

1 1

2

+

=1

(24)

2

As in classical mechanics, the particle is always reflected (total reflection). Nevertheless, there is an important difference, which has already been pointed out in Chapter I. Because of the existence of the evanescent wave e^{−ρ₂x}, the particle has a non-zero probability of presence in the region of space which, classically, would be forbidden to it. This probability decreases exponentially with x and becomes negligible when x is greater than the “range” 1/ρ₂ of the evanescent wave. Note also that the coefficient A₁′/A₁ is complex. A certain phase shift appears upon reflection, which, physically, is due to the fact that the particle is delayed when it penetrates the x > 0 region (cf. Complement JI, § 1 and also BIII, § 3). This phase shift is analogous to the one that appears when light is reflected from a metallic type of substance; however, there is no analogue in classical mechanics.



COMPLEMENT HI

Comment:

When

+

0

,

+ , so that (22) and (23) yield:

2

1

1

(25)

0

2

In the 0 region, the wave, whose range decreases without bound, tends towards zero. Since ( 1 + 1 ) 0, the wave function ( ) goes to zero at = 0, so that it remains continuous at this point. On the other hand, its derivative, which changes abruptly from the value 2 1 to zero, is no longer continuous. This is due to the fact that since the potential jump is infinite at = 0, the integral of (7) no longer tends towards zero when tends towards 0. 2-b.

Potential barriers

V(x) V0 I

II

0

.

Figure 2: Square potential barrier.

III

x

l

Case where

0;

resonances3

Using notations (9) and (10), we find, in the three regions I ( and III ( ) shown in Fig. 2: I(

) =

II (

) =

III (

)=

1

e

1

+

2

e

2

+

3

e

1

+

0), II (0

)

1

e

1

(26-a)

2

e

2

(26-b)

3

e

1

(26-c)

Let us choose, as above, 3 = 0 (incident particle coming from = ). The matching conditions at = then give 2 and 2 in terms of 3 , and those at = 0 give 1 and 1 in terms of 2 and 2 (and, consequently, in terms of 3 ). Thus we find: 1

1

= cos =

2 2

2

2 2 1

sin

2 1

+

2

1 2

2

e

2 2

1

sin 3

2

e

1

3

(27)

1 2

3 V can be either positive (the case of a potential barrier like the one shown in Figure 2) or negative 0 (a potential well).

68



STATIONARY STATES OF A PARTICLE IN ONE-DIMENSIONAL SQUARE POTENTIALS

Figure 3: Variations of the transmission coefficient T of the barrier as a function of its width l (the height V₀ of the barrier and the energy E of the particle are fixed). Resonances appear each time that l is an integral multiple of the half-wavelength π/k₂ in region II.

1 and 3 1 enable us to calculate the reflection coefficient 1 coefficient of the barrier: 2

=

1

=

1

( 4

2 2 1 2

2 1

+(

2

=

3

=

1

4

2 2 1 2

+(

2 2 2) 2 1 4 12 2 1

It is then easy to verify that 4 (

= 4 (

0)

+

2 0

sin

sin2 2 2 2 2 2 ) sin

2

2 2 2 2 2)

sin2

2

+

= 1. Taking (9) and (10) into account, we have:

(28-a) (28-b)

0) 2

and the transmission

2 (

(29) 0)

~

The variations with respect to l of the transmission coefficient T are shown in Figure 3 (with E and V₀ fixed): T oscillates periodically between its minimum value, [1 + V₀²/4E(E − V₀)]⁻¹, and its maximum value, which is 1. This function is the analogue of the one describing the transmission of a Fabry-Perot interferometer. As in optics, the resonances (obtained when T = 1, that is, when k₂l = nπ) correspond to the values of l which are integral multiples of the half-wavelength of the particle in region II. When E > V₀, the reflection of the particle at each of the potential discontinuities occurs without a phase shift of the wave function (cf. § 2-a-α). This is why the resonance condition k₂l = nπ corresponds to the values of l for which a system of standing waves can exist in region II. On the other hand, far from the resonances, the various waves which are reflected at x = 0 and x = l destroy each other by interference, so that the values of the wave function are small. A study of the propagation of a wave packet (analogous to the one in Complement JI) would show that, if the resonance condition is satisfied, the wave packet spends a relatively long time in region II. In quantum mechanics this phenomenon is called resonance scattering.
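A short numerical sketch (added here; the particle and barrier parameters are arbitrary electron-scale values) makes the resonances visible by evaluating the barrier transmission coefficient of formula (29), T = 4E(E − V₀)/[4E(E − V₀) + V₀² sin²(k₂l)], as a function of the width l:

```python
import numpy as np

hbar = 1.055e-34
m = 9.1e-31                     # electron mass (kg), arbitrary choice
eV = 1.6e-19
E, V0 = 3.0 * eV, 2.0 * eV      # E > V0: oscillating wave in region II

k2 = np.sqrt(2 * m * (E - V0)) / hbar
l = np.linspace(1e-12, 2e-9, 2000)

# transmission coefficient of the square barrier, formula (29)
T = 4 * E * (E - V0) / (4 * E * (E - V0) + V0**2 * np.sin(k2 * l)**2)

T_min = 1 / (1 + V0**2 / (4 * E * (E - V0)))   # minimum value quoted in the text
print(f"half-wavelength pi/k2 = {np.pi/k2:.3e} m (spacing of the resonances)")
print(f"min of T on the grid  = {T.min():.4f}  vs  predicted {T_min:.4f}")
print(f"max of T on the grid  = {T.max():.4f}  (resonances, T = 1)")
```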



COMPLEMENT HI

.

Case where

0;

tunnel effect

We must now replace (26-b) by (20), ρ₂ still being given by (19). The matching conditions at x = 0 and x = l enable us to calculate the transmission coefficient of the barrier. In fact, it is unnecessary to perform the calculations again: all we must do is replace, in the equations obtained in § α, the wave vector k₂ by −iρ₂. We then have:

    T = |A₃/A₁|² = 4E(V₀ − E) / [4E(V₀ − E) + V₀² sinh²(ρ₂l)]    (30)

with, of course, ρ₂ = √(2m(V₀ − E))/ℏ. When ρ₂l ≫ 1, we have:

    T ≃ [16E(V₀ − E)/V₀²] e^{−2ρ₂l}    (31)

We have already seen in Chapter I why, contrary to the classical predictions, the particle has a non-zero probability of crossing the potential barrier. The wave function in region II is not zero, but has the behavior of an “evanescent wave” of range 1/ρ₂. When l ≲ 1/ρ₂, the particle has a considerable probability of crossing the barrier by the “tunnel effect”. This effect has numerous physical applications: the inversion of the ammonia molecule (cf. Complement GIV), the tunnel diode, the Josephson effect, the α-decay of certain nuclei, etc... For an electron, the range of the evanescent wave is:

    1/ρ₂ ≃ 1.96/√(V₀ − E)  Å    (32)

where E and V₀ are expressed in electron-volts (this formula is easily obtained by replacing, in formula (8) of Complement AI, λ = 2π/k by 1/ρ₂). Now consider an electron of energy 1 eV which encounters a barrier for which V₀ = 2 eV and l = 1 Å. The range of the evanescent wave is then 1.96 Å, that is, of the order of l: the electron must then have a considerable probability of crossing the barrier. Indeed, formula (30) gives in this case:

    T ≃ 0.78    (33)

The quantum result is radically different from the classical result: the electron has approximately 8 chances out of 10 of crossing the barrier.

Let us now assume that the incident particle is a proton (whose mass is about 1 840 times that of the electron). The range 1/ρ₂ then becomes:

    1/ρ₂ ≃ 1.96/√(1 840 (V₀ − E))  Å ≃ 4.6 × 10⁻² Å    (34)

If we retain the same values: E = 1 eV, V₀ = 2 eV, l = 1 Å, we find a range 1/ρ₂ much smaller than l. Formula (31) then gives:

    T ≃ 4 × 10⁻¹⁹    (35)

Under these conditions, the probability of the proton’s crossing the potential barrier is negligible. This is all the more true if we apply (31) to macroscopic objects, for which we find such small probabilities that they cannot possibly play any role in physical phenomena.
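The two numerical estimates (33) and (35) are easy to reproduce. The following sketch (added here for illustration, not part of the original text) evaluates formula (30) for the electron and for the proton with E = 1 eV, V₀ = 2 eV, l = 1 Å:

```python
import math

hbar = 1.055e-34
eV = 1.6e-19
E, V0, l = 1.0 * eV, 2.0 * eV, 1.0e-10    # values used in the text

def barrier_transmission(m):
    """Transmission coefficient of a square barrier for E < V0, formula (30)."""
    rho2 = math.sqrt(2 * m * (V0 - E)) / hbar
    return 4 * E * (V0 - E) / (4 * E * (V0 - E) + V0**2 * math.sinh(rho2 * l)**2)

m_e = 9.1e-31
m_p = 1840 * m_e

print(f"electron: T = {barrier_transmission(m_e):.2f}")      # ~ 0.78, as in (33)
print(f"proton:   T = {barrier_transmission(m_p):.1e}")      # ~ 4e-19, as in (35)
```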

70

• 2-c.

STATIONARY STATES OF A PARTICLE IN ONE-DIMENSIONAL SQUARE POTENTIALS

Bound states: square well potential

.

Well of finite depth V(x) a 2

+

0

a 2

x

Figure 4: Square well potential. I

II

III

V0

We shall limit ourselves to studying the case 0 included in the calculations of the preceding section 2-b- ). In regions I have respectively:

2

, II

2

6

6

2

, and III

0 (the case

2

0 was

shown in Fig. 4, we

I(

) =

1

e

+

1

e

(36-a)

II (

) =

2

e

+

2

e

(36-b)

III (

)=

3

e

+

3

e

(36-c)

with 2

=

=

(37)

~2

2 (

+

0)

(38)

~2

Since ( ) must be bounded in region I, we must have: 1

=0

(39)

The matching conditions at

2

= e(

2

=

+

e

and those at

)

( +

+ 2

2

)

2

then give:

1

2

2 =

=

1

(40)

2: 71



COMPLEMENT HI

3

=

1

e 4 2

3 1

)2 e

(

2

+ 2

=

)2 e

( + sin

(41)

But ( ) must also be bounded in region III. Therefore, it is necessary that that is:

3

= 0,

2

= e2

+

(42)

Since and depend on , equation (42) can only be satisfied for certain values of . Imposing a bound on ( ) in all regions of space thus entails the quantization of energy. More precisely, two cases are possible: (i) if: =

+

e

(43)

we have: = tan

(44)

2

Set: 0

2

=

0

~2

2

=

2

+

(45)

We then obtain: 2

1 cos2

= 1 + tan2

2

=

+ 2

2

2

=

0

(46)

2

Equation (43) is thus equivalent to the system of equations:

cos tan

2 2

=

(47a) 0

0

(47b)

The energy levels are determined by the intersection of a straight line, having a slope 1 0 , with sinusoidal arcs (long dashed lines in Figure 5). Thus we obtain a certain number of energy levels, whose wave functions are even. This becomes clear if we substitute (43) into (40) and (41); it is easy to verify that 3 = 1 and that 2 = 2 , so that ( ) = ( ). 72



STATIONARY STATES OF A PARTICLE IN ONE-DIMENSIONAL SQUARE POTENTIALS

y P I P I P 0

/a

2 /a

3 /a

4 /a

k0 5 /a

k

Figure 5: Graphic solution of equation (42), giving the energies of the bound states of a particle in a square well potential. In the case shown in the figure, there exist five bound states, three even (associated with the points of the figure), and two odd (points ). (ii) if: =e

+

(48)

a calculation of the same type leads to: sin

=

2

tan

(49a) 0

0

2

(49b)

The energy levels are then determined by the intersection of the same straight line as before with other sinusoidal arcs (cf. short dashed lines in Figure 5). The levels thus obtained fall between those found in (i). It can easily be shown that the corresponding wave functions are odd. Comment: If

6

0

, that is, if: 2 2

0

6

1

=

~

2

(50)

2

Figure 5 shows that there exists only one bound state of the particle, and this state has an even wave function. Then, if 1 6 0 4 1 , a first odd level appears, and so on: when 0 increases, there appear alternatively even and odd levels. If 0 1 , the slope 1 0 of the straight line of Figure 5 is very small: for the lowest energy levels, we practically have: = where =

(51) is an integer, and consequently: 2

2 2

2

2

~

0

(52)

73



COMPLEMENT HI

.

Infinitely deep well

Assume

( ) to be zero for 0

and infinite everywhere else. Set:

2

=

(53)

~2

According to the comment made at the end of § 2-a- of this complement, ( ) must be zero outside the interval [0 ], and continuous at = 0, as well as at = . Now for 0 : ( )=

e

+

e

Since (0) = 0, it can be deduced that ( )=2

(54) =

, which leads to:

sin

(55)

Moreover, ( ) = 0, so that: =

(56)

where n is an arbitrary positive integer. If we normalize function (55), taking (56) into account, we then obtain the stationary wave functions:

    φₙ(x) = √(2/a) sin(nπx/a)    (57)

with energies:

    Eₙ = n²π²ℏ²/(2ma²)    (58)

The quantization of the energy levels is thus, in this case, particularly simple. Comments:

(i) Relation (56) simply expresses the fact that the stationary states are determined by the condition that the width of the well must contain an integral number of half-wavelengths, . This is not the case when the well has a finite depth (cf. § 2-c- ); the difference between the two cases arises from the phase shift of the wave function that occurs upon reflection from a potential step (cf. § 2-a- ). (ii) It can easily be verified from (51) and (52) that, if the depth approaches infinity, we find the energy levels of an infinite well.

0

of a finite well

References and suggestions for further reading:

Eisberg and Resnick (1.3), Chap. 6; Ayant and Belorizky (1.10), Chap. 4; Messiah (1.17), Chap. III; Merzbacher (1.16), Chap. 6; Valentin (16.1), annex V.

74



BEHAVIOR OF A WAVE PACKET AT A POTENTIAL STEP

Complement JI Behavior of a wave packet at a potential step

1 2

Total reflection: Partial reflection:

0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0

75 79

In Complement HI , we determined the stationary states of a particle in various “square” potentials. For certain cases (a step potential, for example), the stationary states obtained consist of unbounded plane waves (incident, reflected and transmitted). Of course, since they cannot be normalized, such wave functions cannot really represent a physical state of the particle. However, they can be linearly superposed to form normalizable wave packets. Moreover, since such a wave packet is expanded directly in terms of stationary wave functions, its time evolution is very simple to determine. All we need to do is multiply each of the coefficients of the expansion by an imaginary exponential e

~

with a well-defined frequency

(chap. I, § D-1-b).

We intend, in this complement, to construct such wave packets and study their time evolution for the case where the potential presents a “step” of height 0 , as in Figure 1 of Complement HI . In this way, we shall be able to describe precisely the quantum behavior of the particle when it arrives at the potential step by determining the motion and the deformation of its associated wave packet. This will also enable us to confirm various results obtained in HI through the study of the stationary states alone (reflection and transmission coefficients, delay upon reflection, etc...). We shall set: 2

=

~2 2

0

~2

=

(1)

0

and, as in Complement HI , we shall distinguish between two cases, corresponding to smaller or greater than 0 . 1.

Total reflection:

0

In this case, the stationary wave functions are given by formulas (11) and (20) of Complement HI ( 1 will be simply called here), the coefficients 1 , 1 , 2 and 2 of these formulas being related by equations (21), (22) and (23) of HI . We are going to construct a wave packet from these stationary wave functions by linearly superposing them. We shall choose only values of less than 0 , so that the waves forming the packet undergo total reflection. To ensure this, we shall choose a function ( ) (which characterizes the wave packet) which is zero for 0 . We 75



COMPLEMENT JI

are going to focus our attention on the negative region of the -axis, to the left of the potential barrier. In Complement HI , relation (22) shows that the coefficients 1 and 1 of expression (11) for a stationary wave in this region have the same modulus. Therefore, we can set: 1(

) =e ( 1 )

2 ( )

(2)

with [cf. formula (19) of HI ]: 2 0

tan ( ) =

2

(3)

Finally, the wave packet which we are going to consider can be written, at time for negative : (

0) =

1 2

= 0,

0

d

( )[e

+e

2 ( )

e

]

(4)

0

As in § C of Chapter I, we assume that ( ) has a pronounced peak of width about the value = 0 . 0 In order to obtain the expression for the wave function ( ) at any time , we simply use the general relation (D-14) of Chapter I: (

)=

1 2

0

( )e[

d

( ) ]

0

1 2

+

0

d

( )e

[

+ ( ) +2 ( )]

(5)

0

where ( ) = ~ 2 2 . By construction, this expression is valid only for negative . Its first term represents the incident wave packet; its second term, the reflected packet. For simplicity, we shall assume ( ) to be real. The stationary phase condition (cf. Chap. I, § C-2) then enables us to calculate the position of the center of the incident wave packet. If, at = 0 , we set the derivative with respect to of the argument of the first exponential equal to zero, we obtain: =

d d

= =

~

0

(6)

0

In the same way, the position of the center of the reflected packet is obtained by differentiating the argument of the second exponential. Differentiating equation (3), we find: [1 + tan2 ] d = 1 + =

2 0

2

d

2

d

2 0

2

2

d 2 0

2

(7)

that is: 2 0 2

76

d =

2 0 2

1 2 0

2

d

(8)



BEHAVIOR OF A WAVE PACKET AT A POTENTIAL STEP

Thus we have: d d +2 d d

=

= =

~

0

+

0

2 2 0

2 0

(9)

Formulas (6) and (9) enable us to describe more precisely the motion of the particle, localized in a region of small width centered at or . First of all, let us consider what happens for negative . The center of the incident wave packet propagates from left to right with a constant velocity ~ 0 . On the other hand, we see from formula (9) that is positive, that is, situated outside the region 0 where expression (5) for the wave function is valid. This means that, for all negative values of , the various waves of the second term of (5) interfere destructively: for negative , there is no reflected wave packet, but only an incident wave packet like those we studied in § C of Chapter I. The center of the incident wave packet arrives at the barrier at time = 0. During a certain interval of time around = 0, the wave packet is localized in the region 0 where the barrier is, and its form is relatively complicated. But, when is sufficiently large, we see from (6) and (9) that it is the incident wave packet which has disappeared, and we are left with only the reflected wave packet. It is now which is positive, while has become negative: the waves of the incident packet interfere destructively for all negative values of , while those of the reflected packet interfere constructively for = 0. The reflected wave packet propagates towards the left at a speed of ~ 0 , opposite to that of the incident packet, whose mirror image it is; its form is unchanged1 . Moreover, formula (9) shows that the reflection has introduced a delay , given by: =

2

d d d d

2

= =

0

~

0

2 0

2 0

(10)

Contrary to what is predicted by classical mechanics, the particle is not instantaneously reflected. Note that the delay is related to the phase shift 2 ( ) between the incident wave and the reflected wave for a given value of . Nevertheless, it should be observed that the delay of the wave packet is not simply proportional to ( 0 ), as would be the case for an unbounded plane wave, but to the derivative d d evaluated at = 0 . Physically, this delay is due to the fact that, for close to zero, the probability of presence of the particle in the region 0, which is classically forbidden, is not zero [evanescent wave, see comment (i) below]. It can be said, metaphorically, that the particle spends a time of the order of in this region before retracing its steps. Formula (10) shows that the ~2 02 closer the average energy of the wave packet is to the height 0 of the barrier, the 2 longer the delay . Comments: (i) Here we have focussed on the behavior of the wave packet for 0, but it is also possible to study what happens for 0. In this region, the wave packet can be written: 1 We assume to be small enough for the spreading of the wave packet to be negligible during the time interval considered.

77



COMPLEMENT JI

(

)=

0

1 2

d

( )

2(

)e

( )

( )

e

(11)

0

where: ( )=

2 0

2

(12)

1 by 1, 1 by 2 ( ) is given by equation (23) of Complement HI when we replace and 2 by . An argument analogous to the one in § C-2 of Chapter I then shows that the modulus ( ) of expression (11) is maximum when the phase of the function to be integrated over is stationary. Now, according to expressions (22) and (23) of HI , the argument of 2 is half that of 1 , which, according to (2), is equal to 2 ( ). Consequently, if we expand ( ) and ( ) in the neighborhood of = 0 , we obtain, for the phase of the function to be integrated over in (11):

d d

d d

= 0

(

0)

=

~

0

(

0)

(13)

2

= 0

[we have used (10) and the fact that ( ) is assumed real]. From this we can deduce that ( ) is maximum in the 0 region for2 = . The time at which the wave 2 packet turns back is therefore 2, which gives us the same delay upon reflection that we obtained above. We also see from expression (13) that, as soon as time ~

2

exceeds the

defined by: 0

1

(14)

where is the width of ( ), the waves go out of phase and expression (11) for ( ) becomes negligible. Thus, the wave packet as a whole remains in the 0 region during an interval of time of the order of: =

1 ~ 0

(15)

which corresponds approximately to the time it takes, in the distance comparable to its width 1 . (ii) Since is assumed to be much smaller than (15) shows that:

0

and

0,

0 region, to travel a

the comparison of (10) and

(16) The delay upon reflection thus involves, for the reflected wave packet, a displacement which is much smaller than its width.

2 Note that the phase (13) does not depend on , contrary to what we found in Chapter I for a free wave packet. It follows that, in the 0 region, ( ) does not have a pronounced peak that moves with respect to time.

78

• 2.

Partial reflection:

BEHAVIOR OF A WAVE PACKET AT A POTENTIAL STEP

0

We now consider a function ( ) of width , centered at a value = 0 0 , which is zero for . The wave packet is formed in this case by superposing, with coeffi0 cients ( ), the stationary wave functions whose expressions are given by formulas (11) and (12) of Complement HI . We shall choose 2 = 0 so as to have the particle arrive at the barrier from the negative region of the axis, and we shall take 1 = 1. The coefficients 1 ( ) and 2 ( ) are obtained from formulas (13) and (14) of Complement HI 2 2 (in which 1 is replaced by 1, 1 by , and 2 by 0 ). In order to describe the wave packet by a single expression, valid for all values of , we can use the Heaviside “step function” ( ) defined by: ( )=0

if

0

( )=1

if

0

(17)

The wave packet we are studying can then be written: 1 2 1 + ( ) 2 1 + ( ) 2

(

)= (

+

)

d

( )e[

d

( )

( ) ]

0

+ 1(

)e

[

+ ( ) ]

0

+

d

( )

2(

)e[

2

2 0

( ) ]

(18)

0

It is composed of three wave packets: incident, reflected and transmitted. As in § 1 above, the stationary phase condition gives the position of their respective centers , and . Since 1 ( ) and 2 ( ) are real, we find: =

~

~

= =

0

~

(19-a) 0 2 0

(19-b) 2 0

(19-c)

A discussion analogous to that of (6) and (9) leads to the following conclusions: for negative , only the incident wave packet exists; for sufficiently large positive , only the reflected and transmitted wave packets exist (Fig. 1). Note that there is no delay, either upon reflection or upon transmission (this is due to the fact that the coefficients 1 ( ) and 2 ( ) are real). The incident and reflected wave packets propagate with velocities of ~ 0 and ~ 0 respectively. Let us assume to be sufficiently small that, within the interval , we can neglect the variation of 1 ( ) compared to that of ( ). 0+ 2 2 We can then, in the second term of (18), replace 1 ( ) by 1 ( 0 ) and take it outside the integral. It is then easy to see that the reflected wave packet has the same form as the incident wave packet, being its mirror image. Its amplitude is smaller, however, 0

79

COMPLEMENT JI

• V(x)

a x

0 ψ(x) 2

b

x

0 ψ(x) 2

c

x

0 ψ(x) 2

d

x

0

Figure 1: Behavior of a wave packet at a potential step, in the case 0 . The potential is shown in figure a. In figure b, the wave packet is moving towards the step. Figure c shows the wave packet during the transitory period in which it splits in two. Interference between the incident and reflected waves are responsible for the oscillations of the wave packet in the 0 region. After a certain time (fig. d), we find two wave packets. The first one (the reflected wave packet) is returning towards the left; its amplitude is smaller than that of the incident wave packet, and its width is the same. The second one (the transmitted wave packet) propagates towards the right; its amplitude is slightly greater than that of the incident wave packet, but it is narrower.

since, according to formula (13) of Complement HI , 1 ( 0 ) is less than 1. The reflection coefficient is, by definition, the ratio between the probabilities of finding the particle in the reflected wave packet and in the incident packet. Therefore, we have = 1 ( 0 ) 2 , which indeed corresponds to equation (15) of Complement HI [recall that we have chosen 1 ( 0 ) = 1]. that and

The situation is different for the transmitted wave packet. We can still use the fact is very small in order to simplify its expression: we replace 2 ( ) by 2 ( 0 ), 2 2 0 by the approximation:

2

2 0

2 0

0

+(

2 0

+( 0)

0)

0 0

80

d

2

d

2 0 =

0

(20)



BEHAVIOR OF A WAVE PACKET AT A POTENTIAL STEP

with: 0

2 0

=

2 0

(21)

The transmitted wave packet can then be written: (

)

2( 0) e

0

+

1 2

(

d

( )e

0)

0 0

( )

(22)

0

Let us compare this expression to the one for the incident wave packet: (

)=e

+

1 2

0

d

( ) e [(

0)

( ) ]

(23)

0

We see that: (

)

0

2( 0)

(24)

0

The transmitted wave packet thus has a slightly greater amplitude than that of the incident packet: according to formula (14) of Complement HI , 2 ( 0 ) is greater than 1. However, its width is smaller, since, if ( ) has a width , formula (24) shows that the width of ( ) is: (

) =

0

(25)

0

The transmission coefficient (the ratio between the probabilities of finding the particle in the transmitted packet and in the incident packet) is thus the product of two factors: =

0

2( 0)

2

(26)

0

This indeed corresponds to formula (16) of Complement HI , since 1 ( 0 ) = 1. Finally, note that, taking into account the contraction of the transmitted wave packet along the axis, we can find its velocity: =

~

0

0

=

~

0

(27)

0

References and suggestions for further reading:

Schiff (1.18), Chap. 5, Figs. 16, 17, 18, 19; Eisberg and Resnick (1.3), § 6-3, Fig. 6-8; also see reference (1.32).

81



EXERCISES

Complement KI Exercises 1. A beam of neutrons of mass ( 1 67 10 27 kg), of constant velocity and energy , is incident on a linear chain of atomic nuclei, arranged in a regular fashion as shown in the figure (these nuclei could be, for example, those of a long linear molecule). We call the distance between two consecutive nuclei, and , their size ( ). A neutron detector is placed far away, in a direction which makes an angle of with the direction of the incident neutrons. l θ D

a) Describe qualitatively the phenomena observed at when the energy of the incident neutrons is varied. b) The counting rate, as a function of , presents a resonance about = 1. Knowing that there are no other resonances for 1 , show that one can determine . Calculate for = 30 and 1 = 1 3 10 20 joule. c) At about what value of must we begin to take the finite size of the nuclei into account? 2.

Bound state of a particle in a “delta function potential”

Consider a particle whose Hamiltonian Chapter I] is: =

~2 d2 2 d 2

[operator defined by formula (D-10) of

( )

where

is a positive constant whose dimensions are to be found. a) Integrate the eigenvalue equation of between and + . Letting approach 0, show that the derivative of the eigenfunction ( ) presents a discontinuity at = 0 and determine it in terms of , and (0). b) Assume that the energy of the particle is negative (bound state). ( ) can then be written: 0

( )=

1

e

+

1

e

0

( )=

2

e

+

2

e

Express the constant calculate the matrix 2 2

=

in terms of and defined by:

. Using the results of the preceding question,

1 1

83



COMPLEMENT KI

Then, using the condition that ( ) must be square-integrable, find the possible values of the energy. Calculate the corresponding normalized wave functions. c) Plot these wave functions on a graph. Give an order of magnitude for their width . d) What is the probability dP( ) that a measurement of the momentum of the particle in one of the normalized stationary states calculated above will give a result included between and + d ? For what value of is this probability maximum? In what domain, of dimension , does it take on non-negligible values? Give an order of magnitude for the product .

3.

Transmission of a “delta function” potential barrier

Consider a particle placed in the same potential as in the preceding exercise. The particle is now propagating from left to right along the axis, with a positive energy . a) Show that a stationary state of the particle can be written: if if

0 0

where ,

( )=e + ( )= e

e

are constants which are to be calculated in terms of the energy , of d and of (watch out for the discontinuity in at = 0). d 2 2 b) Set = 2~ (bound state energy of the particle). Calculate, in terms of the dimensionless parameter , the reflection coefficient and the transmission coefficient of the barrier. Study their variations with respect to ; what happens when ? How can this be interpreted? Show that, if the expression of is extended for negative values of , it diverges when , and discuss this result.

4.

and

Return to exercise 2, using this time the Fourier transform.

a) Write the eigenvalue equation of and the Fourier transform of this equation. Deduce directly from this the expression for ( ), the Fourier transform of ( ), in terms of , , and (0). Then show that only one value of , a negative one, is possible. Only the bound state of the particle, and not the ones in which it propagates, is found by this method; why? Then calculate ( ) and show that one can find in this way all the results of exercise 2. b) The average kinetic energy of the particle can be written (cf. Chap. III): =

1 2

+ 2

( )2d

Show that, when ( ) is a “sufficiently smooth” function, we also have: = 84

~2 2

+

( )

d2 d d 2



EXERCISES

These formulas enable us to obtain, in two different ways, the energy for a particle in the bound state calculated in a). What result is obtained? Note that, in this case, ( ) is not “regular” at = 0, where its derivative is discontinuous. It is then necessary to differentiate ( ) in the sense of distributions, which introduces a contribution of the point = 0 to the average value we are looking for. Interpret this contribution physically: consider a square well, centered at = 0, whose width approaches 0 and whose depth 0 approaches infinity (so that 0 = ), and study the behavior of the wave function in this well. 5.

Well consisting of two delta functions Consider a particle of mass ( )=

( )

(

)

whose potential energy is 0

where is a constant length. a) Calculate the bound states of the particle, setting possible energies are given by the relation e where

=

1

=

~2 2

2

. Show that the

2

is defined by

=

2

. Give a graphic solution of this equation. ~2 ( ) Ground state. Show that this state is even (invariant with respect to reflection about the point = 2), and that its energy is less than the energy introduced in problem 3. Interpret this result physically. Represent graphically the corresponding wave function. ( ) Excited state. Show that, when is greater than a value to be specified, there exists an odd excited state, of energy greater than . Find the corresponding wave function. ( ) Explain how the preceding calculations enable us to construct a model which represents an ionized diatomic molecule ( 2+ , for example) whose nuclei are separated by a distance . How do the energies of the two levels vary with respect to ? What happens at the limit where 0 and at the limit where ? If the repulsion of the two nuclei is taken into account, what is the total energy of the system? Show that the curve that gives the variation with respect to of the energies thus obtained enables us to predict in certain cases the existence of bound states of 2+ , and to determine the value of at equilibrium. The calculation provides a very elementary model of the chemical bond. b) Calculate the reflection and transmission coefficients of the system of two delta function barriers. Study their variations with respect to . Do the resonances thus obtained occur when is an integral multiple of the de Broglie wavelength of the particle? Why?

85



COMPLEMENT KI

6. Consider a square well potential of width and depth 0 (in this exercise, we shall use systematically the notation of § 2-c- of Complement HI ). We intend to study the properties of the bound state of a particle in this well when its width approaches zero. a) Show that there indeed exists only one bound state and calculate its energy

(we find

2 2 0 , 2~2

that is, an energy which varies with the square of the area

0

)

of the well . b) Show that 0 and that 2 = 2 1 2. Deduce from this that, in the bound state, the probability of finding the particle outside the well approaches 1. c) How can the preceding considerations be applied to a particle placed, as in exercise 2, in the potential ( ) = ( )?

7.

Consider a particle placed in the potential ( )=0

if

( )=

0

if

0

with ( ) infinite for negative . Let ( ) be a wave function associated with a stationary state of the particle. Show that ( ) can be extended to give an odd wave function which corresponds to a stationary state for a square well of width 2 and depth 0 (cf. Complement HI , § 2-c- ). Discuss, with respect to and 0 , the number of bound states of the particle. Is there always at least one such state, as for the symmetric square well?

8. Consider, in a two-dimensional problem, the oblique reflection of a particle from a potential step defined by: (

)=0

(

)=

0

if

0

if

0

Study the motion of the center of the wave packet. In the case of total reflection, interpret physically the differences between the trajectory of this center and the classical trajectory (lateral shift upon reflection). Show that, when 0 + , the quantum trajectory becomes asymptotic to the classical trajectory.

86

Chapter II

The mathematical tools of quantum mechanics A

B

C

D

E

F

Space of the one-particle wave function . . . . . A-1 Structure of the wave function space F . . . . . A-2 Discrete orthonormal bases in F : (r) . . . . A-3 Introduction of “bases” not belonging to F . . . State space. Dirac notation . . . . . . . . . . . . B-1 Introduction . . . . . . . . . . . . . . . . . . . . B-2 “Ket” vectors and “bra” vectors . . . . . . . . . B-3 Linear operators . . . . . . . . . . . . . . . . . . B-4 Hermitian conjugation . . . . . . . . . . . . . . . Representations in state space . . . . . . . . . . C-1 Introduction . . . . . . . . . . . . . . . . . . . . C-2 Relations characteristic of an orthonormal basis . C-3 Representation of kets and bras . . . . . . . . . . C-4 Representation of operators . . . . . . . . . . . . C-5 Change of representations . . . . . . . . . . . . . Eigenvalue equations. Observables . . . . . . . . D-1 Eigenvalues and eigenvectors of an operator . . . D-2 Observables . . . . . . . . . . . . . . . . . . . . . D-3 Sets of commuting observables . . . . . . . . . . Two important examples of representations and E-1 The r and p representations . . . . . . E-2 The R and P operators . . . . . . . . . . . . . . Tensor product of state spaces . . . . . . . . . . F-1 Introduction . . . . . . . . . . . . . . . . . . . . F-2 Definition and properties of the tensor product . F-3 Eigenvalue equations in the product space . . . . F-4 Applications . . . . . . . . . . . . . . . . . . . .

. . . . . . . 88 . . . . . . . 89 . . . . . . . 91 . . . . . . . 94 . . . . . . . 102 . . . . . . . 102 . . . . . . . 103 . . . . . . . 108 . . . . . . . 111 . . . . . . . 116 . . . . . . . 116 . . . . . . . 116 . . . . . . . 119 . . . . . . . 121 . . . . . . . 124 . . . . . . . 126 . . . . . . . 126 . . . . . . . 130 . . . . . . . 133 observables139 . . . . . . . 139 . . . . . . . 143 . . . . . . . 147 . . . . . . . 147 . . . . . . . 147 . . . . . . . 151 . . . . . . . 154

Quantum Mechanics, Volume I, Second Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

This chapter is intended to be a general survey of the basic mathematical tools used in quantum mechanics. We shall give a simple condensed presentation aimed at facilitating the study of subsequent chapters for readers unfamiliar with these tools. We make no attempt to be mathematically complete or rigorous. We feel it preferable to limit ourselves to a practical point of view, uniting in a single chapter the various concepts useful in quantum mechanics. In particular, we wish to stress the convenience of the Dirac notation for carrying out the various calculations to be performed. In this spirit, we shall try to simplify the discussion as much as possible. Neither the general definitions nor the rigorous proofs which would be required by a mathematician are to be found here. For example, we shall sometimes speak of infinite-dimensional spaces and reason as if they had a finite number of dimensions. Moreover, many terms (square-integrable function, basis, etc...) will be employed with a meaning which, although commonly used in physics, is not exactly the one used in pure mathematics. We begin in § A by studying the wave functions introduced in Chapter I. We show that these wave functions belong to an abstract vector space, which we call the “wave function space F ”. This study will be carried out in detail as it introduces some basic concepts of the mathematical formalism of quantum mechanics: scalar products, linear operators, bases, etc... Starting in § B, we shall develop a more general formalism, characterizing the state of a system by a “state vector” belonging to a vector space: the “state space E ”. Dirac notation, which greatly simplifies calculations in this formalism, is introduced. § C is intended to study the idea of a representation. § D is particularly recommended to the reader who is unfamiliar with the diagonalization of an operator: this operation will be constantly useful to us in what follows. In § E, we treat two important examples of representations. In particular, we show how the wave functions studied in § A are the “components” of state vectors in a particular representation. Finally, we introduce in § F the concept of a tensor product. This concept will be illustrated more concretely by a simple example in Complement DIV . A.

Space of the one-particle wave function

The probabilistic interpretation of the wave function (r ) of a particle was given in the preceding chapter: (r ) 2 d3 represents the probability of finding, at time , the particle in a volume d3 = d d d about the point r. As the total probability of finding the particle somewhere in space is equal to 1, we must have: d3

(r ) 2 = 1

(A-1)

where the integration extends over all space. Thus, we are led to studying the set of square-integrable functions. These are functions for which the integral (A-1) converges 1 . From a physical point of view, it is clear that the set 2 is too wide in scope: given the meaning attributed to (r ) 2 , the wave functions that are actually used possess certain properties of regularity. We can only retain the functions (r ) which are everywhere defined, continuous, and infinitely differentiable (for example, to state that a function is really discontinuous at a given point in space has no physical meaning, 1 This

88

set is called

2

by mathematicians and it has the structure of a Hilbert space

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

since no experiment enables us to have access to real phenomena on a very small scale, say of 10 30 m). It is also possible to confine ourselves to wave functions that have a bounded domain (which makes it certain that the particle can be found within a finite region of space, for example inside the laboratory). We shall not try to give a precise, general list of these supplementary conditions: we shall call F the set of wave functions composed of sufficiently regular functions of 2 (F is a subspace of 2 ). Structure of the wave function space F

A-1.

F is a vector space

A-1-a.

It can easily be shown that F satisfies all the criteria of a vector space. As an example, we demonstrate that if 1 (r) and 2 (r) F , then2 : (r) = where

1

1

1 (r)

and

2

+

2

(A-2)

are two arbitrary complex numbers.

In order to show that (r) 2 =

F

2 (r)

2

1 (r)

1

2

+

(r) is square-integrable, expand 2

2 (r)

2

2

+

1

2

1 (r)

2 (r)

+

(r) 2 : 1 2

1 (r)

2 (r)

(A-3)

The first two terms of the right-hand side of (A-3) are square integrable, since 1 and 2 belong to F . The sum of the third and fourth terms is real, and its modulus has an upper limit: 2 2 2 1 2 1 (r) 2 (r) 1 2 1 (r) + 2 (r) 2 (this inequality is obtained by writing that [ 1 (r) + e 2 (r)] is necessarily positive, whatever 2 the real value of ). The function (r) is therefore smaller that a sum of functions whose integrals converge, so that its integral also converges. A-1-b.

.

The scalar product

Definition

With each pair of elements of F , (r) and (r), taken in this order, we associate a complex number, denoted by ( ), which, by definition, is equal to: (

d3

)=

(r) (r)

(A-4)

( ) is the scalar product of belong to F ]. .

(r) by

(r) [this integral always converges if

and

Properties They follow from definition (A-4): (

)=(

(

1

1

+

2

1

+

2

2

(

1

2 The

)

symbol

(A-5) 2)

=

)=

1( 1( 1

1)

+

)+

2( 2( 2

2)

(A-6)

)

(A-7)

signifies: “belongs to”.

89

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

The scalar product is linear with respect to the second function of the pair, antilinear with respect to the first one. If ( ) = 0, (r) and (r) are said to be orthogonal.

(

d3

)=

(r) 2

(A-8)

is a real, positive number, which is zero if and only if (r) 0. ( ) is called the norm of (r) [it can easily be verified that this number has all the properties of a norm]. The scalar product chosen above thus permits the definition of a norm in F . Let us finally mention the Schwarz inequality (cf. Complement AII ): (

2)

1

6

(

1)

1

(

2)

2

(A-9)

This becomes an equality if and only if the two functions A-1-c.

.

1

and

2

are proportional.

Linear operators

Definition

A linear operator is, by definition, a mathematical entity which associates with every function (r) F another function (r), the correspondence being linear: (r) = [

1 (r)

1

(r) +

(A-10 -a) 2 (r)]

2

=

1

1 (r)

+

2

2 (r)

(A-10 -b)

Let us cite some simple examples of linear operators: – the parity operator (

, whose definition is:

)= (

)

(A-11)

– the operator that performs a multiplication by , which we shall call which is defined by: (

)=

(

)

– finally, the operator that we shall call and whose definition is: (

(

)=

(A-12) , which differentiates with respect to ,

)

[the two operators and , acting on a function (r) function which is no longer necessarily square-integrable]. .

(A-13) F , can transform it into a

Product of operators Let (

90

, and

and

) (r) =

be two linear operators. Their product [

(r)]

is defined by: (A-14)

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

is first allowed to act on (r), which gives (r) = (r), then function (r). In general, = . We call the commutator of and [ ] and defined by [

operates on the new the operator written

]=

(A-15)

We shall calculate, as an example, the commutator [ take an arbitrary function (r): [

]

(r) =

(r)

=

(r)

[

=

(r)

(r)

Since this is true for all [

]=

(r)] (r) =

A-2-a.

(r)

(A-16)

(r), it can be deduced that:

1

(A-17)

Discrete orthonormal bases in F :

A-2.

]. In order to do this, we shall

(r)

Definition

Consider a countable set of functions of F , labeled by a discrete index ( = 1, 2, ..., , ...): F

1 (r)

– The set (

)=

F

2 (r)

d3

(r)

F

(r) is orthonormal if: (r)

(r) =

(A-18)

where

, the Kronecker delta function, is equal to 1 for = and to 0 for = . – It constitutes a basis3 if every function (r) F can be expanded in one and only one way in terms of the (r) : (r) =

A-2-b.

(r)

Components of a wave function in the

Multiply the two sides of (A-19) by and (A-18)4 :

(A-19)

(r)

basis

(r) and integrate over all space. From (A-6)

3 When the set (r) constitutes a basis, it is sometimes said to be a complete set of functions. It must be noted that the word complete is used with a meaning different from the one it usually has in mathematics. 4 To

be totally rigorous, one should make certain that one can interchange

and

d3 . We shall

systematically ignore this kind of problem.

91

CHAPTER II

(

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

)=

=

=

(

)

=

(A-20)

that is: =(

d3

)=

(r) (r)

(A-21)

The component of (r) on (r) is therefore equal to the scalar product of (r) by (r). Once the (r) basis has been chosen, it is equivalent to specify (r) or the set of its components with respect to the basis functions. The set of numbers is said to represent (r) in the (r) basis. Comments: (i) Note the analogy with an orthonormal basis e1 e2 e3 of the ordinary threedimensional space, 3 . The fact that e1 , e2 and e3 are orthogonal and unitary can indeed be expressed by: e =

e

(

Any vector V of

3

= 1 2 3)

(A-22)

can be expanded in this basis:

3

V=

(A-23)

e =1

with =e

(A-24)

V

Formulas (A-18), (A-19) and (A-21) thus generalize, as it were, the well-known formulas, (A-22), (A-23) and (A-24). However, it must be noted that the are real numbers, while the are complex numbers. (ii) The same function (r) obviously has different components in two different bases. We shall study the problem of a change in basis later. (iii) We can also, in the (r) basis, represent a linear operator by a set of numbers which can be arranged in the form of a matrix. We shall take up this question again in § C, after we have introduced Dirac notation.

A-2-c.

Expression for the scalar product in terms of the components

Let (r) and

92

(r) be two wave functions which can be expanded as follows:

(r) =

(r)

(r) =

(r)

(A-25)

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

Their scalar product can be calculated by using (A-6), (A-7) and (A-18):

(

)=

=

(

)

= that is: (

)=

(A-26)

In particular: (

)=

2

(A-27)

The scalar product of two wave functions (or the square of the norm of a wave function) can thus be very simply expressed in terms of the components of these functions in the (r) basis.

Comment: Let V and W be two vectors of 3 , with components of their scalar product is well-known:

and

. The analytic expression

3

V W=

(A-28) =1

Formula (A-26) can therefore be considered to be a generalization of (A-28). A-2-d.

Closure relation

Relation (A-18), called the orthonormalization relation, expresses the fact that the functions of the set (r) are normalized to 1 and orthogonal with respect to each other. We are now going to establish another relation, called the closure relation, which expresses the fact that this set constitutes a basis. If (r) is a basis of F , there exists an expansion such as (A-19) for every function (r) F . Substitute into (A-19) expression (A-21) for the various components [the name of the integration variable must be changed, since r already appears in (A-19)]: (r) = =

(r) = d3

(

)

(r)

(r ) (r )

(r)

(A-29)

93

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Interchanging

d3

(r) =

d3 , we obtain:

and

(r )

(r)

(r )

(r) (r ) is therefore a function (r), we have: d3

(r) =

(r )

(r r ) of r and of r such that, for every function

(r r )

Equation (A-31) is characteristic of the function (r it can be deduced that: (r)

(r ) = (r

(A-30)

r)

(A-31) r ) (cf. Appendix II). From this

(A-32)

Reciprocally, if an orthonormal set (r) satisfies the closure relation (A-32), it constitutes a basis. Any function (r) can indeed be written in the form: (r) =

d3

(r ) (r

r)

(A-33)

Substituting (A-32) for (r r ) into this expression, we obtain formula (A-30). To return to (A-29), all we must do is again interchange summation and integration. This equation then expresses the fact that (r) can always be expanded in terms of the (r) and gives the coefficients of this expansion.

Comment: We shall re-examine the closure relation using the Dirac notation in § C, and we shall see that it can be given a simple geometric interpretation. A-3.

Introduction of “bases” not belonging to F

The (r) bases studied above are composed of square-integrable functions. It can also be convenient to introduce “bases” of functions not belonging to either F or 2 , but in terms of which any wave function (r) can nevertheless be expanded. We are going to give examples of such bases and we shall show how it is possible to extend to them the important formulas established in the preceding section. A-3-a.

Plane waves

For simplicity, we treat the one-dimensional case. We shall therefore study squareintegrable functions ( ) which depend only on the variable. In Chapter I we saw the 94

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

advantage of using the Fourier transform ( )= ( )=

( ):

+

1 2 ~ 1 2 ~

d

( )e

d

( )e

~

(A-34 -a)

+

Consider the function ( )=

( ) of

1 e 2 ~

~

(A-34 -b)

( ), defined by:

~

(A-35)

( ) is a plane wave, with the wave vector ~. The integral over the whole axis of 1 ( )2= diverges. Therefore ( ) F . We shall designate by ( ) the set 2 ~ of all plane waves, that is, of all functions ( ) corresponding to the various values of . The number , which varies continuously between and + , will be considered as a continuous index which permits us to label the various functions of the set ( ) [recall that the index used for the set (r) considered above was discrete]. Formulas (A-34) can be rewritten using (A-35): +

( )=

d

( )

( )=(

)=

( )

(A-36)

+

d

( )

( )

(A-37)

These two formulas can be compared to (A-19) and (A-21). Relation (A-36) expresses the idea that every function ( ) F can be expanded in one and only one way in terms of the ( ), that is, the plane waves. Since the index varies continuously and not discretely, the summation appearing in (A-19) must be replaced by an integration over . Relation (A-37), like (A-21), gives the component ( ) of ( ) on ( ) in the form of a scalar product5 ( ). The set of these components, which correspond to the various possible values of , constitutes a function of , ( ), the Fourier transform of ( ). Thus, ( ) is the analogue of . These two complex numbers, which depend either on or on , represent the components of the same function ( ) in two different bases: ( ) and ( ) . This point also appears clearly if we calculate the square of the norm of ( ). According to Parseval’s relation [Appendix I, formula (45)], we have: +

(

)=

d

( )2

(A-38)

5 We have only defined the scalar product for two square-integrable functions, but this definition can easily be extended to cases like this one, provided that the corresponding integral converges.

95

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

a formula which resembles (A-27), if we replace Let us show that the pendix II, equation (34)]:

by

( ) and

by

d .

( ) satisfy a closure relation. Using the formula [cf. Ap-

+

1 2

d e

= ( )

(A-39)

we find: +

d

( )

( )=

1 2

d e ~

~(

)

= (

)

This formula is the analogue of (A-32) with, again, the substitution of

(A-40)

d for

.

Finally, let us calculate the scalar product ( ) in order to see if there exists an equivalent of the orthonormalization relation. Again using (A-39), we obtain: +

(

)=

d

( )

( )

that is: (

)=

1 2

d e ~

~(

)

= (

)

(A-41)

Compare (A-41) and (A-18). Instead of having two discrete indices and and a Kronecker delta , we now have two continuous indices and and a delta function of the difference between the indices, ( ). Note that if we set = , the scalar product ( ) diverges; again we see that ( ) F . Although this constitutes a misuse of the term, we shall call (A-41) an “orthonormalization” relation. It is also sometimes said that the ( ) are “orthonormalized in the Dirac sense”. The generalization to three dimensions presents no difficulties. We consider the plane waves: 3 2

1

p (r) =

epr

2 ~

~

(A-42)

The functions of the (r) basis now depend on the three continuous indices , condensed into the notation p. It is then easy to show that:

(

(r) =

d3

(p) = (

p

)= d3

96

(p) )=

d3 p (r)

p (r)

d3

) = (r

,

(A-43)

p (r)

(p) (p) p (r

,

(r)

(A-44) (A-45)

r)

(A-46)

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

(

p

p

) = (p

p)

(A-47)

They represent the generalizations of (A-36), (A-37), (A-38), (A-40) and (A-41). Thus the (r) can be considered to constitute a “continuous basis”. All the formulas established above for the discrete basis (r) can be extended to this continuous basis, using the correspondence rules summarized in table (II-1). p d3

Table (II-1)

(p

A-3-b.

p)

“Delta functions”

In the same way, let us introduce a set of functions of r, r0 (r) , labeled by the continuous index r0 (condensed notation for 0 , 0 , 0 ) and defined by: r0 (r)

= (r

r0 )

(A-48)

r0 (r)

represents the set of delta functions centered at the various points r0 of space; (r) is obviously not square-integrable: r0 (r) F . r0 Then consider the following relations, which are valid for every function (r) F : (r) = (r0 ) =

d3

(r0 ) (r

0

d3

(r0

r0 )

(A-49)

r) (r)

(A-50)

They can be rewritten, using (A-48), in the form: (r) =

d3

(r0 ) = (

r0

(r0 )

0

)=

r0 (r)

d3

r0 (r)

(A-51) (r)

(A-52)

(A-51) expresses the fact that every function (r) F can be expanded in one and only one way in terms of the r0 (r). (A-52) shows that the component of (r) on the function r0 (r) (we are dealing here with real basis functions) is precisely the value (r0 ) of (r) at the point r0 . (A-51) and (A-52) are analogous to (A-19) and (A-21): we simply replace the discrete index

by the continuous index r0 , and

by

d3 0 .

(r0 ) is therefore the equivalent of : these two complex numbers, which depend either on r0 or on , represent the components of the same function (r) in two different bases: (r) . r0 (r) and 97

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Formula (A-26) becomes here: (

d3

)=

(A-53)

(r0 ) (r0 )

0

We see that the application of (A-26) to the case of the continuous basis r0 (r) results in the definition (A-4) of the scalar product. Finally, note that the r0 (r) satisfy “orthonormalization” and closure relations of the same type as those for the p (r). Thus we have [formula (28) of Appendix II]: d3

0

r0 (r)

r0 )

=

r0 (r

d3

)=

0

(r

r0 ) (r

r0 ) = (r

r)

(A-54)

and: (

r0

d3

(r

r0 ) (r

r0 ) = (r0

r0 )

(A-55)

All the formulas established for the discrete basis (r) can be generalized for the continuous basis r0 (r) , using the correspondence rules summarized in Table (II-2). r0 d3

0

(r0

Table (II-2)

r0 )

Important comment: The usefulness of the continuous bases that we have just introduced is revealed more clearly in what follows. However, we must not lose sight of the following point: a physical state must always correspond to a square-integrable wave function. In no case can p (r) or r0 (r) represent the state of a particle. These functions are nothing more than intermediaries, very useful in calculations involving operations on the wave functions (r) which are used to describe a physical state. An analogous situation is encountered in classical optics, where the plane monochromatic wave is a mathematically very useful, but physically unrealizable, idealization. Even the most selective filters always permit the passage of a frequency band , which may be very small but is never exactly zero. The same holds true for the functions r0 (r). We can imagine a squareintegrable wave function, localized about r0 , for example: ( ) r0 (r)

=

where the

( )

( )

(r

r0 ) =

( )

(

0)

( )

(

0)

( )

(

0)

are functions which have a peak of width

and amplitude

1

,

+

centered at 98

0,

0

or

0,

such that

( )

(

0) d

= 1 (see § 1-b of Appendix II

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

( )

for examples of such functions). When 0, r0 (r) r0 (r), which is no longer square-integrable. However, it is impossible to have a physical state that corresponds to this limit: as localized as the physical state of a particle may be, is never exactly zero.

A-3-c.

.

Generalization: continuous “orthonormal” bases

Definition

Generalizing the results obtained in the two preceding paragraphs, we shall call a continuous “orthonormal” basis, a set of functions of r, (r) , labeled by a continuous index , which satisfy the two following relations, called orthonormalization and closure relations: (

d3

)= d

(r)

(r)

(r) = (

(r ) = (r

)

(A-56)

r)

(A-57)

Comments: (i) If (ii)

=

,(

) diverges. Therefore,

(r)

F.

can represent several indices, as is the case for r0 and p in the above examples.

(iii) It is possible to imagine a basis that includes both functions (r), labeled by a discrete index, and functions (r), labeled by a continuous index. In this case, the set of (r) does not form a basis; the set of (r) must be added to it. Let us cite an example of this situation. Consider the case of the square well studied in § D-2-c of Chapter I (see also Complement HI ). As we shall see later, the set of stationary states of a particle in a time-independent potential constitutes a basis. For 0, we have discrete energy levels, to which correspond square-integrable wave functions labeled by a discrete index. But these are not the only possible stationary states. Equation (D17) of Chapter I is also satisfied, for all 0, by solutions which are bounded but which extend over all space and are thus not square-integrable. In the case of a “mixed” (discrete and continuous) basis, thonormalization relations are: ( ( (

(r)

(r) , the or-

)= )= (

)

(A-58)

)=0

And the closure relation becomes: (r)

(r ) +

d

(r)

(r ) = (r

r)

(A-59)

99

CHAPTER II

.

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Components of a wave function

(r)

We can always write: d3

(r) =

(r )

(r

r)

Using the expression for (r 3

order of

d

(r) =

and

r ) given by (A-57), and assuming that we can reverse the

d , we obtain: d3

d

(A-60)

(r )

(r )

(r)

(A-61)

that is: (r) =

d

( )

(r)

(A-62)

with: ( )=(

d3

)=

(r )

(r )

(A-63)

(A-62) expresses the fact that every wave function (r) has a unique expansion in terms of the (r). The component ( ) of (r) on (r) is equal, according to (A-63), to the scalar product ( ).

.

Expression for the scalar product and the norm in terms of the components

of the

Let (r) and (r) be two square-integrable functions whose components in terms (r) are known:

(r) =

d

(r) =

d

( )

(r)

( )

(A-64)

(r)

(A-65)

Calculate their scalar product: (

)= =

d3 d

(r) d

(r) ( ) ( )

d3

(r)

The last integral is given by (A-56): ( 100

)=

d

d

( ) ( ) (

)

(r)

(A-66)

A. SPACE OF THE ONE-PARTICLE WAVE FUNCTION

that is: (

)=

d

( ) ( )

(A-67)

In particular: (

)=

( )2

d

(A-68)

All the formulas of § A-2 can thus be generalized, using the correspondence rules of table (II-3).

d (

Table (II-3) )

The most important formulas established in this section are assembled in table (II-4). Actually, it is not necessary to remember them in this form: we shall see that the introduction of Dirac notation enables us to rederive them very simply.

101

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Table (II-4) Discrete basis Orthonormalization

(

(r)

Continuous basis

)=

(

(r)

)= (

)

relation Closure relation

(r)

Expansion of a wave function (r) Expression for the components of (r)

(r ) = (r

(r) =

) = d3

=(

Scalar product

(

)=

Square of the norm

(

)=

B. B-1.

r)

d

(r)

(r) (r)

(r)

(r) =

2

d

)=

(

( )

) = d3

( )=(

(

(r ) = (r r )

)=

d

(r)

(r) (r)

( ) ( )

d

( )2

State space. Dirac notation Introduction

In Chapter I, we stated the following postulate: the quantum state of a particle is defined, at a given instant, by a wave function (r). The probabilistic interpretation of this wave function requires that it be square-integrable. This requirement led us to study the F -space (§ A). We then found, in particular, that the same function (r) can be represented by several distinct sets of components, each one corresponding to the choice of a basis [table (II-5)]. This result can be interpreted in the following manner: , or (p), or ( ), characterizes the state of a particle just as well as the wave function (r) [if the basis being used has been specified previously]. Furthermore, (r) itself appears, in table (II-5), on the same footing as , (p) and ( ): the value (r0 ) which the wave function takes on at a point r0 of space can be considered as its component with respect to a specific function r0 (r) of a particular basis (the function basis). 102

B. STATE SPACE. DIRAC NOTATION

Basis

Components of

(r) p (r) r0 (r) (r)

(r)

=1 2 (p) (r0 ) ( ) Table (II-5)

We thus find ourselves in a situation which is analogous to the one encountered in ordinary space, 3 : the position of a point in space can be described by a set of three numbers, which are its coordinates with respect to a system of axes defined in advance. If one changes axes, another set of coordinates corresponds to the same point. But the geometrical vector concept and vector calculation enable us to avoid referring to a system of axes; this considerably simplifies both formulas and reasoning. We are going to use a similar approach here: each quantum state of a particle will be characterized by a state vector, belonging to an abstract space, Er , called the state space of a particle. The fact that the space F is a subspace of 2 means that Er is a subspace of a Hilbert space. We are going to define the notation and the rules of vector calculation in Er . Actually, the introduction of state vectors and the state space does more than merely simplify the formalism. It also permits a generalization of the formalism. Indeed, there exist physical systems whose quantum description cannot be given by a wave function: we shall see in Chapters IV and IX that this is the case when the spin degrees of freedom are taken into account, even for a single particle. Consequently, the first postulate that we shall set forth in Chapter III will be the following: the quantum state of any physical system is characterized by a state vector, belonging to a space E which is the state space of the system. Therefore, in the rest of this chapter, we are going to develop a vector calculus in E . The concepts which we are going to introduce and the results which we shall obtain are valid for whatever physical system we might consider. Nevertheless, to illustrate these concepts and results, we shall apply them to the simple case of a (spinless) particle, since this is the case we have previously considered. We shall begin, in this paragraph, by defining the Dirac notation, which will prove to be very useful in the formal manipulations which we shall have to perform. B-2. B-2-a.

.

“Ket” vectors and “bra” vectors Elements of E : kets

Notation

Any element, or vector, of space E is called a ket vector, or, more simply, a ket. It is represented by the symbol , inside which is placed a distinctive sign which enables us to distinguish the corresponding ket from all others, for example: . In particular, since the concept of a wave function is now familiar to us, we shall define the space Er of the states of a particle by associating with every square-integrable 103

CHAPTER II

function (r)

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

of Er :

(r) a ket vector F

Er

(B-1)

Afterwards, we shall transpose into Er the different operations that we introduced for F . Although F and Er are isomorphic, we shall carefully distinguish between them in order to avoid confusion and to reserve the possibilities of generalization mentioned above in § B-1. We stress the fact that an r-dependence no longer appears in ; only the letter appears, to remind us with which function it is associated. (r) will be interpreted (§ E) as the set of the components of the ket in a particular basis, r playing the role of an index [cf. § A-3-b and table (II-5)]. Consequently, the procedure which we are adopting here consists in initially characterizing a vector by its components in a privileged coordinate system, which will later be treated on the same footing as all other coordinate systems. We shall designate by E the state space of a (spinless) particle in only one dimension, that is, the abstract space constructed as in (B-1), but using wave functions that depend only on the variable.

.

Scalar product

With each pair of kets and , taken in this order, we associate a complex number, which is their scalar product, ( ), and which satisfies the various properties described by equations (A-5), (A-6) and (A-7). We shall later rewrite these formulas in Dirac notation after we have introduced the concept of a “bra”. In Er , the scalar product of two kets will coincide with the scalar product defined above for the associated wave functions. Elements of the dual space E

B-2-b.

of E : bras

Definition of the dual space E

.

Recall, first of all, the definition of a linear functional defined on the kets of E . A linear functional is a linear operation which associates a complex number with every ket : E (

1

1

number ( +

2

2

)=

1

) (

1

)+

2

(

2

)

(B-2)

Linear functional and linear operator must not be confused. In both cases, one is dealing with linear operations, but the former associates each ket with a complex number, while the latter associates another ket.

It can be shown that the set of linear functionals defined on the kets E constitutes a vector space, which is called the dual space of E and which will be symbolized by E . .

Bra notation for the vectors of E

Any element, or vector, of the space E is called a bra vector, or, more simply, a bra. It is symbolized by . For example, the bra designates the linear functional 104

B. STATE SPACE. DIRAC NOTATION

and we shall henceforth use the notation to denote the number obtained by causing the linear functional E to act on the ket E: (

)=

(B-3)

The origin of this terminology is the word “bracket”, used to denote the symbol Hence the name “bra” for the left-hand side , and the name “ket” for the right-hand side of this symbol. B-2-c.

.

Correspondence between kets and bras

.

To every ket corresponds a bra

The existence of a scalar product in E will now enable us to show that we can associate, with every ket E , an element of E , that is, a bra, which will be denoted by . The ket does indeed enable us to define a linear functional: the one that associates (in a linear way), with each ket E , a complex number equal to the scalar product ( ) of by . Let be this linear functional; it is thus defined by the relation: =( .

)

(B-4)

This correspondance is antilinear

In the space E , the scalar product is antilinear with respect to the first vector. In the notation of (B-4), this is expressed by: (

1

1

+

2

2

)=

1

=

1

=( bra

(

)+

1

+

1

+

1

1

2

2

(

)

2

2

2 2

)

(B-5)

It appears from (B-5) that the bra associated with the ket 1 + 2 2 :

1

1

+

2

2

is the

1 1

1

The ket =

+

2

2

=

1

1

+

2

(B-6)

2

bra correspondence is therefore antilinear.

Comment:

If is a complex number, and are sometimes led to write it as =

a ket, :

is a ket (E is a vector space). We (B-7)

One must then be careful to remember that represents the bra associated with the ket . Since the correspondence between a bra and a ket is antilinear, we have: =

(B-8) 105

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

ξ(ε) x0 ( )

Figure 1: 0 ( ) is a function having a peak at = 0 (of width and amplitude 1 ), whose integral between and + is equal to 1.

1

ε

ε

0

x

x0

.

Dirac notation for the scalar product

We now have at our disposal two distinct notations for designating the scalar product of by : ( ) or , being the bra associated with the ket . Henceforth we shall use only the (Dirac) notation: . Table (II-6) summarizes, in Dirac notation, the properties of the scalar product, already given in § A-1-b.

= 1 1

1

1

+

(B-9) +

2

2

2

2

=

1

=

1

1 1

+

2

+

2

(B-10)

2

(B-11)

2

real, positive; zero if and only if

=0

(B-12)

Table (II-6) .

Is there a ket to correspond to every bra?

Although to every ket there corresponds a bra, we shall see, in two examples chosen in F , that it is possible to find bras that have no corresponding kets. We shall later show why this difficulty does not hinder us in quantum mechanics. (i) Counter-examples chosen in F For simplicity, we shall reason in one dimension. +

Let

( )

0 ( ) be a sufficiently regular real function, such that

d

( ) 0

( ) = 1, and

having the form of a peak of width and amplitude 1 , centered at = 0 [see Fig. 1; ( ) is, for example, one of the functions considered in § 1-b of Appendix II]. If = 0, 0 ( ) ( ) (the square of its norm is of the order of 1 ). Denote by the corresponding ket: 0

106

( )

( )

0

0

( )

( ) 0

( ) F

(B-13)

B. STATE SPACE. DIRAC NOTATION

( )

If = 0, have:

( )

E . Let

0

E , we

be the bra associated with this ket; for every

0

+ ( )

( )

=(

0

Now let 0

0

d

( )=

( ) ( )

(B-14)

(B-15) ( ) 0

( ), which is of the order of 1 , diverges when

0]; therefore:

E

( ) 0

0

0

F

( )

0

[the square of the norm of Lim

( )

approach zero. On the one hand:

( )

Lim

)=

0

(B-16)

On the other hand, when 0, integral (B-14) approaches a perfectly well-defined limit, ( 0 ) [since, for sufficiently small , ( ) can be replaced in (B-14) by ( 0 ) and removed from ( ) the integral]. Consequently, approaches a bra which we shall denote by is 0 : 0 0 the linear functional which associates, with every ket of E , the value ( 0 ) taken on by the associated wave function at the point 0 : ( )

Lim

=

0

0

E

If

E

0

= (

0

0)

(B-17)

Thus we see that the bra exists, but no ket corresponds to it. 0 In the same way, let us consider a plane wave which is truncated outside an interval of width : ( ) 0

1 e 2 ~

( )=

~

0

if

2

6

6+

(B-18)

2

( )

with the function 0 ( ) going rapidly to zero outside this interval (while remaining continuous ( ) ( ) and differentiable). We shall denote by the ket associated with 0 ( ): 0 ( ) 0

F

( )

( ) 0

The square of the norm of Therefore: Lim

E ( ) 0

(B-19) , which is practically equal to

E

( ) 0

=(

0

When for = Lim If

. (B-20)

( )

Now let us consider the bra

( )

2 ~, diverges if

( )

1 2 ~

)

0

( )

0

+

associated with

( ) 0

. For every

E , we have:

2

d e

0

~

( )

(B-21)

2

, has a limit: the value ( 0 ) of the Fourier transform ( ) of 0 ( ) , tends towards a perfectly well-defined bra 0 . Therefore, when 0 ( ) 0

E

=

0

0

( ) 0 :

E = ( 0)

(B-22)

107

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Here again, no ket corresponds to the bra

0

.

(ii) Physical resolution of the preceding difficulties This dissymmetry of the correspondence between kets and bras is related, as the preceding examples show, to the existence of “continuous bases” for F . Since the functions constituting these “bases” do not belong to F , we cannot associate a ket of E with them. However, their scalar product with an arbitrary function of F is defined, and this permits us to associate with them a linear functional in E , that is, a bra belonging to E . The reason for using such “continuous bases” lies in their usefulness in certain practical calculations. The same reason (which will become more apparent in what follows) leads us here to reestablish the symmetry between kets and bras by introducing “generalized kets”, defined using functions that are not square-integrable, but whose scalar product with every function of F exists. In what follows, we shall work with “kets” such as or 0 0 , associated with 0 ( ) or 0 ( ). It must not be forgotten that these generalized “kets” cannot, strictly speaking, represent physical states. They are merely intermediaries, useful in calculations involving certain operations to be performed on the true kets of the space E , which actually characterize realizable quantum states. This method poses a certain number of mathematical problems, which can be avoided ( ) by adopting the following physical point of view: (or (or 0 0 ) actually denotes 0 ( ) ) where is very small (or is very large) compared to all the other lengths in the problem 0 ( ) ( ) we are considering. In all the intermediary calculations where (or ) appears, the 0 0 limit = 0 (or ) is never attained, so that one is always working in E . The physical result obtained at the end of the calculation depends very little on the value of , as long as is sufficiently small with respect to all the other lengths: it is then possible to neglect , that is, to set = 0, in the final result (the procedure to be used for is analogous). ( ) ( ) The objection could be raised that, unlike and and 0( ) 0( ) , 0 ( ) 0 ( ) are not orthonormal bases, insofar as they do not rigorously satisfy the closure relation. In fact, they fulfill it approximately. For example, the expression

d

( ) 0

0

( )

( ) 0

( ) is a function of

(

) which can serve as an excellent approximation for ( ). Its graphical representation 1 is practically a triangle of base 2 and height , centered at = 0 (Appendix II, § 1-c- ). If is negligible compared to all the other lengths in the problem, the difference between this expression and ( ) is physically inappreciable.

In general, the dual space E and the state space E are not isomorphic, except, of course, if E is finite-dimensional6 : although to each ket of E there corresponds a bra in E , the converse is not true. Nevertheless, we shall agree to use, in addition to vectors belonging to E (whose norm is finite), generalized kets with infinite norms but whose scalar product with every ket of E is finite. Thus, to each bra of E , there will correspond a ket. But generalized kets do not represent physical states of the system. B-3. B-3-a.

Linear operators Definitions

They are the same as those of § A-1-c. A linear operator associates with every ket

E another ket

E , the

6 It is true that the Hilbert space 2 and its dual space are isomorphic; however, we have taken for the wave function space F a subspace of 2 , which explains why F is “larger” than F .

108

B. STATE SPACE. DIRAC NOTATION

correspondence being linear: = (

1

(B-23) +

1

2

2

)=

1

1

+

2

The product of two linear operators lowing way: (

)

=

(

and

, written

, is defined in the fol-

)

(B-25)

first acts on to give the ket = . The commutator [ ] of [

(B-24)

2

; then acts on the ket and is, by definition:

]=

(B-26)

Let and be two kets. We call the matrix element of , the scalar product: (

)

B-3-b.

between

and (B-27)

Consequently, this is a number which depends linearly on

.

. In general,

and antilinearly on

.

Examples of linear operators: projectors

Important comment about Dirac notation

We have begun to sense, in the preceding, the simplicity and convenience of the Dirac formalism. For example, denotes a linear functional (a bra), and 1 2 , the scalar product of two kets 1 and 2 . The number associated by the linear functional with an arbitrary ket is then written simply by juxtaposing the symbols and : . This is the scalar product of by the ket corresponding to (which is why it is useful to have a one-to-one correspondence between kets and bras). Now assume that we write and in the opposite order: (B-28) We shall see that if we abide by the rule of juxtaposition of symbols, this expression represents an operator. Choose an arbitrary ket and consider: (B-29) We already know that is a complex number; consequently,(B-29) is a ket, obtained by multiplying by the scalar . , applied to an arbitrary ket, gives another ket: it is an operator. Thus we see that the order of the symbols is of critical importance. Only complex numbers can be moved about with impunity, because of the linearity of the space E and of the operators which we shall use. Indeed, if is a number: = = =

(where =

is a linear operator)

= 109

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

(B-30) But, for kets, bras and operators, the order must always be carefully respected in writing the formulas: this is the price that must be paid for the simplicity of the Dirac formalism. .

The projector Let

onto a ket

be a ket which is normalized to one: =1

(B-31)

Consider the operator

, defined by:

=

(B-32)

and apply it to an arbitrary ket

:

=

(B-33)

, acting on an arbitrary ket , gives a ket proportional to . The coefficient of proportionality is the scalar product of by . The “geometrical” significance of is therefore clear: it is the “orthogonal projection” operator onto the ket . This interpretation is confirmed by the fact that 2 = (projecting twice in succession onto a given vector is equivalent to projecting a single time). To see this, we write: 2

=

=

(B-34)

In this expression, 2

=

.

is a number, which is equal to 1 [formula (B-31)]. Therefore:

=

(B-35)

Projector onto a subspace Let

,

1

2

, ...,

, be

normalized vectors which are orthogonal to each

other: =

;

=1 2

We denote by E the subspace of E spanned by these Let be the linear operator defined by: =

(B-36) vectors.

(B-37) =1

Calculating 2

2

:

=

(B-38) =1 =1

110

B. STATE SPACE. DIRAC NOTATION

we get, using (B-36): 2

=

= =1 =1

=

(B-39)

=1

is therefore a projector. It is easy to see that since for any E:

projects onto the subspace E ,

=

(B-40) =1

acting on gives the linear superposition of the projections of , that is, the projection of onto the subspace E . B-4.

onto the various

Hermitian conjugation

B-4-a.

Action of a linear operator on a bra

Until now, we have only defined the action of a linear operator on kets. We are now going to see that it is also possible to define the action of on bras. Let be a well-defined bra, and consider the set of all kets . With each of these kets can be associated the complex number ( ), already defined above as the matrix element of between and . Since is linear and the scalar product depends linearly on the ket, the number ( ) depends linearly on . Thus, for fixed and , we can associate with every ket a number which depends linearly on . The specification of and therefore defines a new linear functional on the kets of E , that is, a new bra belonging to E . We shall denote this new bra by . The relation which defines can thus be written: (

)

=

(

)

(B-41)

The operator associates with every bra a new bra . Let us show that the correspondence is linear. In order to do this, consider a linear combination of bras 1 and 2 : =

1

1

+

2

(which means that (

Since

)

(B-42)

2

=

=

(

=

1

=

1

1

1

+

2

2

(

). From (B-41), we have:

2

) (

1

(

)

1

)+

2

+

2

(

2

) )

)

(B-43)

is arbitrary, it follows that: =( =

1 1

1 1

+ +

2

2 2

) 2

(B-44) 111

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

ψ

ψ

A

ψ

A†

ψ

Figure 2: Definition of the adjoint operator between kets and bras.

of an operator

A ψ

ψ A† using the correspondence

Equation (B-41) therefore defines a linear operation on bras. The bra bra which results from the action of the linear operator on the bra .

is the

Commments:

(i) From definition (B-41) of , we see that the place of the parenthesis in the symbol defining the matrix element of between and is of no importance. Therefore, we shall henceforth designate this matrix element by the notation : =(

)

=

(

)

(B-45)

(ii) The relative order of and is very important in the notation (cf. § 3-b-a above). One must write and not : acting on a ket gives a number ; is therefore indeed a bra. On the other hand, , acting on a ket , would give , that is, an operator (the operator multiplied by the number ). We have not defined any mathematical object of this sort: therefore has no meaning. B-4-b.

The adjoint operator

of a linear operator

We are now going to see that the correspondence between kets and bras, studied in § B-2-c, enables us to associate with every linear operator another linear operator , called the adjoint operator (or Hermitian conjugate) of . Let then be an arbitrary ket of E . The operator associates with it another ket = of E (Fig. 2). To the ket corresponds a bra ; in the same way, to corresponds . This correspondence between kets and bras thus permits us to define the action of the operator on the bras: the operator associates with the bra corresponding to the ket , the bra corresponding to the ket = . We write: = . Let us show that the relation = is linear. We know that, to the bra 1 1 + 2 2 , corresponds the ket 1 1 + 2 2 (the correspondence between a 112

B. STATE SPACE. DIRAC NOTATION

bra and a ket is antilinear). The operator transforms 1 1 + 2 2 into = 1 1 + 2 2 . Finally, to this ket corresponds the bra: 2 2 = + 2 2 . From this we conclude that: 2 1 1 2 (

1

+

1

2

2

)

=

1

1

+

2

1

1 1

1

+ +

(B-46)

2

is therefore a linear operator, defined by the formula: =

=

(B-47)

From (B-47), it is easy to deduce another important relationship satisfied by the operator . Using the properties of the scalar product, one can always write: =

(B-48)

is an arbitrary ket of E . Using expressions (B-47) for

where obtain:

=

and

, we

(B-49)

a relation which is valid for all

and

.

Comment about notation:

We have already mentioned a notation which can lead to confusion: and , where is a scalar [formulas (B-7) and (B-8)]. The same problem arises with the expressions and , where is a linear operator. is another way of designating the ket : =

(B-50)

is the bra associated with the ket

. Using (B-50) and (B-57), we see

that: =

(B-51)

When a linear operator is taken outside the bra symbol, it must be replaced by its adjoint (and placed to the right of the bra). B-4-c.

Correspondence between an operator and its adjoint

By using (B-47) or (B-49), it is easy to show that: (

)

=

(

)

=

( +

) =

(B-52) (where +

is a number)

(B-53) (B-54) 113

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Now let us calculate ( ) . To do this, consider the ket in the form = , setting = . Then: =

(

since

=

(

) =

) =

=

. Write it

=

. From this, we deduce that: (B-55)

Note that the order changes when one takes the adjoint of a product of operators. Comment:

Since (

) =

, we can write, using (B-51):

=

(

) =

Thus the left-hand side of (B-41) can be rewritten in the form . In the same way, the right-hand side of this same equation can be put, with the notation of (B-50), into the form . From this results the following relation, sometimes used to define the adjoint operator of : = B-4-d.

(B-56)

Hermitian conjugation in Dirac notation

In the preceding section, we introduced the concept of an adjoint operator by using the correspondence between kets and bras. A ket and its corresponding bra are said to be “Hermitian conjugates” of each other. The operation of Hermitian conjugation is represented by the wavy arrows in Figure 2; we see that it associates with . This is the reason why is also called the Hermitian conjugate operator of . The operation of Hermitian conjugation changes the order of the objects to which it is applied. Thus we see in Figure 2 that becomes . The ket is changed into , the operator into , and the order is reversed. In the same way, we saw in (B-55) that the Hermitian conjugate of a product of two operators is equal to the product of the Hermitian conjugates taken in the opposite order. Finally, let us show that: (

) =

(B-57)

( is replaced by , and by , and the order is changed). Applying relation (B-49) to the operator , we find: (

)

=[

(

)

]

(B-58)

Now, if we use property (B-9) of the scalar product: [

(

)

] = =

114

= (

)

(B-59)

B. STATE SPACE. DIRAC NOTATION

By comparing (B-58) and (B-59), we can derive (B-57). The result of the operation of Hermitian conjugation on a constant remains to be found. We see from (B-6) and (B-53) that this operation simply transforms into (complex conjugation). This is in agreement with the fact that = . To summarize, the Hermitian conjugate of a ket is a bra, and vice versa; that of an operator is its adjoint; that of a number, its complex conjugate. In Dirac notation, the operation of Hermitian conjugation is very simple to perform; it suffices to apply the following rule: RULE

To obtain the Hermitian conjugate (or the adjoint) of any expression composed of constants, kets, bras and operators, one must: the constants by their complex conjugates – Replace the kets by the bras associated with them the bras by the kets associated with them the operators by their adjoints – Reverse the order of the factors (the position of the constants, nevertheless, is of no importance).

EXAMPLES

is an operator ( and are numbers). The adjoint of this operator is obtained by using the preceding rule: , which can also be written , changing the position of the numbers and . In the same way, is a ket ( and are constants). The conjugate bra is , which can also be written . B-4-e.

Hermitian operators

An operator

is said to be Hermitian if it is equal to its adjoint, that is, if:

=

(B-60)

Combining (B-60) and (B-49), we see that a Hermitian operator satisfies the relation: =

(B-61)

which is valid for all and . Finally, for a Hermitian operator, (B-56) becomes =

(B-62)

We shall treat Hermitian operators in more detail later, when we consider the problem of eigenvalues and eigenvectors. Moreover, we shall see in Chapter III that Hermitian operators play a fundamental role in quantum mechanics. 115

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

If formula (B-57) is applied to the case where projector = is Hermitian: =

=

=

, we see that the

=

(B-63)

Comment: The product of two Hermitian operators and is Hermitian only if [ if = and = , it can be shown using (B-55) that ( ) = is equal to only if [ ] = 0.

C. C-1. C-1-a.

] = 0. Indeed, = , which

Representations in state space Introduction Definition of a representation

Choosing a representation means choosing an orthonormal basis, either discrete or continuous, in the state space E . Vectors and operators are then represented in this basis by numbers: components for the vectors, matrix elements for the operators. The vectorial calculus introduced in § B then becomes a matrix calculus with these numbers. The choice of a representation is, in theory, arbitrary. Actually, it obviously depends on the particular problem being studied: in each case, one chooses the representation that leads to the simplest calculations. C-1-b.

Aim of section C

Using the Dirac notation, and for any arbitrary E space, we are going to treat again all the concepts introduced in §§ A-2 and A-3 for discrete and continuous bases of F. We shall write the two characteristic relations of a basis in Dirac notation: the orthonormalization and closure relations. Then we shall show how, using these two relations, it is possible to solve all specific problems involving a representation and the transformation from one representation to another. C-2. C-2-a.

Relations characteristic of an orthonormal basis Orthonormalization relation

A set of kets, discrete or continuous , is said to be orthonormal if the kets of this set satisfy the orthonormalization relation: (C-1)

= or = ( 116

)

(C-2)

C. REPRESENTATIONS IN STATE SPACE

It can be seen that, for a continuous set, does not exist: the have an infinite norm and therefore do not belong to E . Nevertheless, the vectors of E can be expanded on the . It is useful, consequently, to accept the as generalized kets (see the discussions in §§ A-3 and B-2-c). C-2-b.

ket

Closure relation

A discrete set, , or a continuous one, belonging to E has a unique expansion on the

, constitutes a basis if every or the :

=

(C-3)

=

d

( )

(C-4)

Let us assume, moreover, that the basis is orthonormal. Then perform the scalar multiplication on both sides of (C-3) with , and on both sides of (C-4) with . We obtain, using (C-1) or (C-2), expressions for the components or ( ): =

(C-5)

= ( )

(C-6)

Then replace in (C-3) =

by

, and in (C-4) ( ) by

:

=

=

=

=

d

=

d

( )

=

(C-7)

d =

d

(C-8)

[since, in (C-7), we can place the number after the ket (C-8), we can place the number after the ket ]. Thus, we see two operators appear, belonging to E to give the same ket

on every ket follows that: = =

= d

and

; in the same way, in d

. Since

. They act is arbitrary, it

(C-9) =

(C-10)

where denotes the identity operator in E . Relation (C-9), or (C-10), is called the closure relation. Conversely, let us show that relations (C-9) and (C-10) express the fact 117

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

that the sets can write: =

and =

constitute bases. For every

belonging to E , we

=

=

(C-11)

with: =

(C-12)

In the same way: =

=

=

d

=

d

( )

(C-13)

with: ( )=

(C-14)

Thus, every ket has a unique expansion on the or on the . Each of these two sets therefore forms a basis, a discrete one or a continuous one. We also see that relation (C-9), or (C-10), spares us the need of memorizing expressions (C-12) and (C-14) for the components and ( ). Comments: (i) We shall see later (§ E) that, in the case of the F -space, relations (A-32) and (A-57) can easily be deduced from (C-9) and (C-10). (ii) Geometrical interpretation of the closure relation. From the discussion of § B-3-b, we see that projector onto the subspace E spanned by basis, every ket of E can be expanded on the

1

is a projector: the

If the form a 2 ; the subspace E is then identical

with the E -space itself. Consequently, it is reasonable for

to be equal to

the identity operator: projecting onto E a ket which belongs to E does not modify this ket. The same argument can be applied to

d

.

We can now find an equivalent of the closure relation for the three-dimensional space of ordinary geometry, 3 . If e1 , e2 and e3 are three orthonormal vectors of this space, and 1 , 2 and 3 are the projectors onto these three vectors, the fact that { e1 , e2 , e3 } forms a basis in 3 is expressed by the relation 1

+

2

+

3

=

(C-15)

On the other hand, { e1 , e2 } constitutes an orthonormal set but not a basis of 3 . This is expressed by the fact that the projector 1 + 2 (which projects onto the plane spanned by e1 and e2 ) is not equal to ; for example: ( 1 + 2 ) e3 = 0.

118

C. REPRESENTATIONS IN STATE SPACE

in the

Table (II-7) summarizes the only fundamental formulas required for any calculation or representation. representation

representation

= =

= ( =

=

d

) =

Table (II-7)

C-3.

Representation of kets and bras

C-3-a.

Representation of kets

In the basis, the ket is represented by the set of its components, that is, by the set of numbers = . These numbers can be arranged vertically to form a one-column matrix (with, in general, a countable infinity of rows):

1 2

.. .

(C-16)

.. . In the continuous basis, the ket is represented by a continuous infinity of numbers, ( ) = , that is, by a function of . It is then possible to draw a vertical axis, along which are placed the various possible values of . To each of these values corresponds a number, : .. . .. . (C-17)

.. . .. . C-3-b.

Representations of bras

Let =

be an arbitrary bra. In the =

=

basis, we can write: (C-18)

has a unique expansion on the bras . The components of , , are the complex conjugates of the components = of the ket associated with . 119

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

In the same way, we obtain, in the =

=

=

basis:

d

(C-19)

The components of , , are the complex conjugates of the components ( ) = of the ket associated with . We have agreed to arrange the components of a ket vertically. Before describing how to arrange the components of a bra, let us show how the closure relation enables us to find simply the expression for the scalar product of two kets in terms of their components. We know that we can always place between and in the expression for the scalar product: =

=

=

=

(C-20)

In the same way: = =

= d

=

d

( ) ( )

(C-21)

Let us arrange the components of the bra horizontally, to form a row matrix (having one row and an infinite number of columns): 1

2

(C-22)

Using this convention, is the matrix product of the column matrix which represents and the row matrix which represents . The result is a matrix having one row and one column, that is, a number: In the basis, has a continuous infinity of components . The various values of are placed along a horizontal axis. To each of these values corresponds a component of bra : (C-23)

Comment: In a given representation, the matrices which represent a ket and the associated bra are Hermitian conjugates of each other (in the matrix sense): one passes from one matrix to the other by interchanging rows and columns and taking the complex conjugate of each element.

120

C. REPRESENTATIONS IN STATE SPACE

C-4.

Representation of operators

C-4-a.

Representation of

by a “square” matrix

Given a linear operator , we can, in a it a series of numbers defined by:

or

(

or

basis, associate with

=

(C-24)

)=

(C-25)

These numbers depend on two indices and can therefore be arranged in a “square” matrix having a countable or continuous infinity of rows and columns. The usual convention is to have the first index fix the rows and the second, the columns. Thus, in the basis, the operator is represented by the matrix: 11

12

21

22

2

.. .

.. .

.. .

1

.. .

1

(C-26)

2

.. .

.. .

We see that the th column is made up of the components, in the basis, of the transform of the basis vector . For a continuous basis, we draw two perpendicular axes. To a point that has for its abscissa and for its ordinate, there corresponds the number ( ):

.. . .. . (

)

(C-27) Let us use the closure relation to calculate the matrix which represents the operator in the basis: = = =

(C-28)

The convention chosen above for the arrangement of the elements [or ( )] is therefore consistent with the one relating to the product of two matrices: (C-28) expresses the fact that the matrix representing the operator is the product of the matrices associated with and . 121

CHAPTER II

C-4-b.

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Matrix representation of the ket

=

The problem is the following: knowing the components of and the matrix elements of in a given representation, how can we calculate the components of = in the same representation? In the basis, the coordinate of are given by: =

=

(C-29)

If we simply insert the closure relation between =

and

, we obtain:

=

= =

(C-30)

For the

basis, we obtain, in the same way:

( )=

=

=

d

=

d

(

) ( )

(C-31)

The matrix expression for = is therefore very simple. We see, for example from (C-30), that the column matrix representing is equal to the product of the column matrix representing and the square matrix representing : 1

11

12

1

2

21

22

2

2

.. .

.. .

.. .

.. . .. .

.. .

=

1

.. . .. . C-4-c.

.. . .. .

2

.. . .. .

.. . .. .

1

(C-32)

.. .

Expression for the number

By inserting the closure relation between , we obtain: – for the basis:

and

and again between

and

= = =

122

(C-33)

C. REPRESENTATIONS IN STATE SPACE

– for the

basis: = =

d d

=

d d

( )

(

) ( )

(C-34)

In the matrix formalism, the interpretation of these formulas is as follows: is a number, that is, a matrix with one row and one column, obtained by multiplying the column matrix representing first by the square matrix representing and then by the row matrix representing . For example, in the basis:

=

1

11

12

1

21

22

2

2

.. .

.. .

.. .

.. . .. .

2

1

1

2

.. . .. .

.. . .. .

.. . .. .

(C-35)

.. .

Comments: (i) It can be shown in the same way that the bra is represented by a row matrix, the product of the square matrix representing by the row matrix representing [the first two matrices of the right-hand side of (C-35)]. Again we see the importance of the order of the symbols: the expression would lead to a matrix operation which is undefined (the product of a row matrix by a square matrix). (ii) From a matrix point of view, equation (B-41) which defines merely expresses the associativity of the product of the three matrices that appear in (C-35). (iii) Using the preceding conventions, we express 1

1 1

1 2

1

2

2 1

2 2

2

.. .

.. .

.. .

1

=

2

1

.. .

(C-36)

2

.. .

.. .

This is indeed an operator, while matrix, is a number. C-4-d.

.. .

by a square matrix:

.. . , the product of a column matrix by a row

Matrix representation of the adjoint

of

Using (B-49), we obtain easily: (

) =

=

=

(C-37) 123

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

or (

)=

=

=

(

)

(C-38)

Therefore, the matrices representing and in a given representation are Hermitian conjugates of each other, in the matrix sense: one passes from one to the other by interchanging rows and columns and then taking the complex conjugate. If is Hermitian, = , and we can then replace ( ) by in (C-37), and ( ) by ( ) in (C-38): = (

)=

(C-39) (

)

(C-40)

A Hermitian operator is therefore represented by a Hermitian matrix, that is, one in which any two elements which are symmetric with respect to the principal diagonal are complex conjugates of each other. In particular, for = or = , (C-39) and (C-40) become: = (

)=

(C-41) (

)

(C-42)

The diagonal elements of a Hermitian matrix are therefore always real numbers. C-5.

Change of representations

C-5-a.

Outline of the problem

In a given representation, a ket (or a bra, or an operator) is represented by a matrix. If we change representations, that is, bases, the same ket (or bra, or operator) will be represented by a different matrix. How are these two matrices related? For the sake of simplicity, we shall assume here that we are going from one discrete orthonormal basis to another discrete orthonormal basis . In § E, we shall study an example of changing from one continuous basis to another continuous basis. The change of basis is defined by specifying the components of each of the kets of the new basis in terms of each of the kets of the old one. We shall set: =

(C-43)

is the matrix of the basis change (transformation matrix). If conjugate: (

)

=(

) =

denotes its Hermitian (C-44)

The following calculations can be performed very easily, and without memorization, by using the two closure relations:

124

=

=

(C-45)

=

=

(C-46)

C. REPRESENTATIONS IN STATE SPACE

and the two orthonormalization relations: =

(C-47)

=

(C-48)

Comment: The transformation matrix,

= where (

, is unitary (Complement CII ). That is, it satisfies:

=

(C-49)

is the unit matrix. Indeed, we see that: )

=

=

=

=

(C-50)

In the same way: (

)

=

=

=

C-5-b.

nents

=

(C-51)

Transformation of the components of a ket

To obtain the components of a ket in the new basis from its compoin the old basis, one simply inserts (C-45) between and : =

=

= =

(C-52)

The inverse expressions can be derived in the same way, using (C-46): =

=

= =

(C-53) 125

CHAPTER II

C-5-c.

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Transformation of the components of a bra

The principle of the calculation is exactly the same. For example: =

=

= =

C-5-d.

(C-54)

Transformation of the matrix elements of an operator

If, in , we obtain:

, we insert (C-45) between

and

, and again between

and

= =

(C-55)

that is: =

(C-56)

In the same way: =

=

= =

D.

(C-57)

Eigenvalue equations. Observables

D-1.

Eigenvalues and eigenvectors of an operator

D-1-a.

Definitions

is said to be an eigenvector (or eigenket) of the linear operator =

if: (D-1)

where is a complex number. We are going to study a certain number of properties of equation (D-1), the eigenvalue equation of the linear operator . In general, this equation possesses solutions only when takes on certain values, called eigenvalues of . The set of the eigenvalues is called the spectrum of . Note that, if is an eigenvector of with the eigenvalue , (where is an arbitrary complex number) is also an eigenvector of with the same eigenvalue: ( 126

)=

=

= (

)

(D-2)

D. EIGENVALUE EQUATIONS. OBSERVABLES

To rid ourselves of this ambiguity, we could agree to normalize the eigenvectors to 1: =1

(D-3)

But this does not completely remove the ambiguity, since e , where is an arbitrary real number, has the same norm as . We shall see below that, in quantum mechanics, the physical predictions obtained using

or e

are identical.

The eigenvalue is called non-degenerate (or simple) when its corresponding eigenvector is unique to within a constant factor, that is, when all its associated eigenkets are collinear. On the other hand, if there exist at least two linearly independent kets which are eigenvectors of with the same eigenvalue, this eigenvalue is said to be degenerate. Its degree (or order) of degeneracy is then the number of linearly independent eigenvectors associated with it (the degree of degeneracy of an eigenvalue can be finite or infinite). For example, if is -fold degenerate, there correspond to it independent kets ( =1 2 ) such that: =

(D-4)

But then every ket

of the form:

=

(D-5) =1

is an eigenvector of =

with the eigenvalue , whatever the coefficients =

=1

, since:

=

(D-6)

=1

Consequently, the set of eigenkets of associated with constitutes a -dimensional vector space (which can be infinite-dimensional), called the “eigensubspace” of the eigenvalue . In particular, it is equivalent to say that is non-degenerate or to say that its degree of degeneracy is = 1. (with

To illustrate these definitions, let us choose the example of a projector (§ B-3-b): = 1). Its eigenvalue equation is written:

=

= that is: =

(D-7)

The ket on the left-hand side is always collinear with , or zero. Consequently, the eigenvectors of are: on the one hand, itself, with an eigenvalue of = 1; on the other hand, all the kets orthogonal to , for which the associated eigenvalue is = 0. The spectrum of therefore includes only two values: 1 and 0. The first one is simple, the second, infinitely degenerate (if the state space considered is infinite-dimensional). The eigensubspace associated with = 0 is the supplement7 of (see § D-2-c). 7 In a vector space E , two subspaces E and E are said to be supplementary if all kets of E can 1 2 be written = 1 + 2 where 1 and 2 belong, respectively, to E1 and E2 , and if E1 and E2 are disjoint (no common non-zero ket; the expansion = 1 + 2 is then unique). Actually, there

127

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Comments:

(i) Taking the Hermitian conjugate of both sides of equation (D-1), we obtain: =

(D-8)

Therefore, if is an eigenket of with an eigenvalue , it can also be said that is an eigenbra of with an eigenvalue . However, let us stress the fact that, except in the case where is Hermitian (§ D-2-a), nothing can be said a priori about . (ii) To be completely rigorous, one should solve the eigenvalue equation (D-1) in the space E . That is, one should consider only those eigenvectors which have a finite norm. In fact, we shall be obliged to use operators for which the eigenkets do not satisfy this condition (§ E). Therefore, we shall grant that vectors which are solutions of (D-1) can be “generalized kets”. D-1-b.

Finding the eigenvalues and eigenvectors of an operator

Given a linear operator , how does one find all its eigenvalues and the corresponding eigenvectors? We are concerned with this question from a purely practical point of view. We shall consider the case where the state space is of finite dimension , and we shall grant that the results can be generalized to an infinite-dimensional state space. Let us choose a representation, for example , and let us project the vector equation (D-1) onto the various orthonormal basis vectors : =

(D-9)

Inserting the closure relation between =

and

, we obtain: (D-10)

With the usual notation: = =

(D-11)

equations (D-10) can be written: =

(D-12)

or [

]

=0

(D-13)

(D-13) can be considered to be a system of equations where the unknowns are the , the components of the eigenvector in the chosen representation. This system is linear and homogeneous. exists an infinity of sub-subspaces E2 supplementary to a given subspace E1 . One can fix E2 by forcing it to be orthogonal to E1 . This shall be done throughout this book, even though the word “orthogonal” will not be explicitly written before supplement. Example: In ordinary three-dimensional space, if E1 is a plane , E2 can be any arbitrary straight line, not contained in . The orthogonal supplement of E1 is the straight line passing through the origin and orthogonal to .

128

D. EIGENVALUE EQUATIONS. OBSERVABLES

.

The characteristic equation

The system (D-13) consists of equations ( = 1 2 ) with unknowns ( =1 2 ). Since it is linear and homogeneous, it has a non-trivial solution (the trivial solution is the one for which all the are zero) if and only if the determinant of the coefficients is zero. This condition is written: Det [A

]=0

(D-14)

where A is the matrix of elements and is the unit matrix. Equation (D-14), called the characteristic equation (or secular equation), enables us to determine all the eigenvalues of the operator , that is, its spectrum. (D-14) can be written explicitly in the form: 11

12 21

22

.. .

.. . 1

2

13

1

23

2

.. .

.. .

=0

(D-15)

3

This is an th order equation in ; consequently, it has roots, real or complex, distinct or identical. It is easy to show, by performing an arbitrary change of basis, that the characteristic equation is independent of the representation chosen. Therefore, the eigenvalues of an operator are the roots of its characteristic equation. .

Determination of the eigenvectors

Now let us choose an eigenvalue 0 , a solution of the characteristic equation (D-14), and let us look for the corresponding eigenvectors. We are going to distinguish between two cases: (i) First, assume that 0 is a simple root of the characteristic equation. We can then show that the system (D-13), when = 0 , is comprised of ( 1) independent equations, the th one following from the preceding ones and hence redundant. But we have unknowns; there is therefore an infinite number of solutions, but all the can be determined in a unique way in terms of one of them, say 1 . If we fix 1 , we obtain for the ( 1) other a system of ( 1) linear, inhomogeneous equations (the “right-hand side” of each equation is the term in 1 ) with a non-zero determinant [the ( 1) equations are independent]. The solution of this system is of the form: =

0

(D-16)

1

since the initial system (D-13) is linear and homogeneous. 10 is, of course, equal to 1 by definition, and the ( 1) coefficients 0 for = 1 are determined from the matrix elements and 0 . The eigenvectors associated with 0 differ only by the value chosen for 1 . They are therefore all given by: 0( 1)

0

=

1

=

1

0

(D-17)

with: 0

=

0

(D-18)

129

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Therefore, when 0 is a simple root of the characteristic equation, only one eigenvector corresponds to it (to within a constant factor): it is a non-degenerate eigenvalue. (ii) When 0 is a multiple root of order 1 of the characteristic equation, there are two possibilities: – in general, when = 0 , the system (D-13) is still composed of ( 1) independent equations. Only one eigenvector then corresponds to the eigenvalue 0 . The operator cannot be diagonalized in this case: the eigenvectors of are not sufficiently numerous for one to be able to construct with them alone a basis of the state space. – nevertheless, when = 0 , it may happen that the system (D-13) has only ( ) independent equations (where is a number greater than 1 but not larger than ). To the eigenvalue 0 there then corresponds an eigensubspace of dimension , and 0 is a -fold degenerate eigenvalue. Let us assume, for example, that, for = 0 , (D-13) is composed of ( 2) linearly independent equations. These equations enable us to calculate the coefficients in terms of any two of them, for example 1 and 2 : 0

=

1

+

(obviously: of the form: 0( 1

2)

0 1

0

(D-19)

2

=

0 2

=1;

=

1

1 0

0 1

+

=

0 2

2

2 0

= 0). All the eigenvectors associated with

0

are then (D-20)

with: 1 0

=

0

2 0

=

0

(D-21)

The vectors 0 ( 1 2 ) do indeed constitute a two-dimensional vector space, this being characteristic of a two-fold degenerate eigenvalue. When an operator is Hermitian, it can be shown that the degree of degeneracy of an eigenvalue is always equal to the multiplicity of the corresponding root in the characteristic equation. Since, in most cases, we shall be studying only Hermitian operators, we shall only need to know the multiplicity of each root of (D-14) to obtain immediately the dimension of the corresponding eigensubspace. Thus, in a space of finite dimension , a Hermitian operator always has linearly independent eigenvectors (we shall see later that they can be chosen to be orthonormal): this operator can therefore be diagonalized (§ D-2-b). D-2.

Observables

D-2-a.

Properties of the eigenvalues and eigenvectors of a Hermitian operator

We shall now consider the very important case in which the operator

is Hermi-

tian: = (i) The eigenvalues of a Hermitian operator are real. 130

(D-22)

D. EIGENVALUE EQUATIONS. OBSERVABLES

Taking the scalar product of the eigenvalue equation (D-1) by

, we obtain:

= But

(D-23) is a real number if

=

is Hermitian, as we see from:

=

(D-24)

where the last equation follows from hypothesis (D-22). Since real, equation (D-23) implies that must also be real. If is Hermitian, we can, in (D-8), replace just shown that is real. Thus we obtain:

by

and

and

by

are

, since we have

=

(D-25)

which shows that whatever the ket

is also an eigenbra of

with the real eigenvalue . Therefore,

:

=

(D-26)

The Hermitian operator

is said to act on the left in (D-26).

(ii) Two eigenvectors of a Hermitian operator corresponding to two different eigenvalues are orthogonal. Consider two eigenvectors

Since

and

of the Hermitian operator

:

=

(D-27-a)

=

(D-27-b)

is Hermitian, (D-27-b) can be written in the form: =

(D-28)

Then multiply (D-27-a) by

on the left and (D-28) by

on the right:

=

(D-29-a)

=

(D-29-b)

Subtracting (D-29-b) from (D-29-a), we find: (

)

=0

Consequently, if ( D-2-b.

(D-30) ) = 0,

and

are orthogonal.

Definition of an observable

When E is finite-dimensional, we have seen (§ D-1-b) that it is always possible to form a basis with the eigenvectors of a Hermitian operator. When E is infinitedimensional, this is no longer necessarily the case. This is why it is useful to introduce a new concept, that of an observable. 131

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

Consider a Hermitian operator . For simplicity, we shall assume that the set of its eigenvalues forms a discrete spectrum { ; = 1, 2, ... }, and we shall indicate later the modifications that must be made when all or part of this spectrum is continuous. The degree of degeneracy of the eigenvalue will be denoted by (if = 1, is non-degenerate). We shall denote by ( = 1, 2, ... ) linearly independent vectors chosen in the eigensubspace E of : =

;

=1 2

(D-31)

We have just shown that every vector belonging to E is orthogonal to every vector of another subspace E , associated with = ; therefore: = 0 for

=

and arbitrary

Inside each subspace E , the

and

(D-32)

can always be chosen orthonormal, that is, such

that: = the

(D-33)

If such a choice is made, the result is an orthonormal system of eigenvectors of satisfy the relations: =

:

(D-34)

obtained by regrouping (D-32) and (D-33). By definition, the Hermitian operator is an observable if this orthonormal system of vectors forms a basis in the state space. This can be expressed by the closure relation: =

(D-35)

=1 =1

Comments:

(i) Since the vectors ( = 1, 2, ..., ) which span the eigensubspace E of are orthonormal, the projector onto this subspace E can be written (cf. § B-3-b- ): =

(D-36-a) =1

The observable =

is then given by: (D-36-b)

(it is easy to verify that the action of both sides of this equation on all the kets gives the same result). 132

D. EIGENVALUE EQUATIONS. OBSERVABLES

(ii) Relation (D-35) can be generalized to include cases where the spectrum of eigenvalues is continuous by using the rules given in table (II-3). For example, consider a Hermitian operator whose spectrum is composed of a discrete part { (degree of degeneracy )} and a continuous part ( ) (assumed to be non-degenerate): =

;

=1 2 =1 2

= ( )

;

1

(D-37-a) (D-37-b)

2

The vectors can always be chosen in such a way that they form an “orthonormal” system: = = (

)

=0

(D-38)

will be said to be an observable if this system forms a basis, that is, if: 2

+

d

D-2-c.

=

(D-39)

1

=1

Example: the projector

Let us show that = (with = 1) is an observable. We have already pointed out (§ B-4-e) that it is Hermitian, and that its eigenvalues are 1 and 0 (§ D-1-a); the first one is simple (associated eigenvector: ), the second one is infinitely degenerate (associated eigenvectors: all kets orthogonal to ). Consider an arbitrary ket in the state space. It can always be written in the form: =

+(

Now,

)

(D-40)

is an eigenket of (

)=

Moreover, ( from: (

2

) )

with the eigenvalue 1. Since

2

=

, we have:

=

(D-41)

is also an eigenket of

=(

2

)

, but with the eigenvalue 0, as we see

=0

(D-42)

Every ket can thus be expanded on these eigenkets of ; therefore, servable. We shall see in § E-2 two other important examples of observables. D-3. D-3-a.

.

is an ob-

Sets of commuting observables Important theorems

Theorem I

If two operators and commute, and if also an eigenvector of , with the same eigenvalue.

is an eigenvector of

,

is

133

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

We know that, if

is an eigenvector of

, we have:

=

(D-43)

Applying

to both sides of this equation, we obtain: =

(D-44)

Since we assumed that side by : (

)= (

and

commute, we also have, replacing

)

on the left-hand

(D-45)

This equation expresses the fact that is an eigenvector of , with the eigenvalue ; the theorem is therefore proved. Two cases may then arise: (i) If is a nondegenerate eigenvalue, all the eigenvectors associated with it are by definition colinear, and is necessarily proportional to . Thus is also an eigenvector of . (ii) If is a degenerate eigenvalue, it can only be said that belongs to the eigensubspace E of , corresponding to the eigenvalue . Therefore, for any E , we have: E

(D-46)

E is said to be globally invariant (or stable) under the action of be stated in another form: Theorem I’: If two operators globally invariant under the action of . of

and .

. Theorem I can also

commute, every eigensubspace of

is

Theorem II If two observables and commute, and if 1 and 2 are two eigenvectors with different eigenvalues, the matrix element 1 2 is zero. If 1 and 2 are eigenvectors of , we can write: 1

=

1

1

2

=

2

2

(D-47)

According to theorem I, the fact that and commute means that is an 2 eigenvector of , with the eigenvalue 2 . 2 is therefore (cf. § D-2-a) orthogonal to 1 (eigenvector of eigenvalue 1 = 2 ), which can be written: 1

2

=0

(D-48)

The theorem is then proved. Another proof can be given, which does not involve theorem I: since the operator [ ] is zero, we have: 1

134

(

)

2

=0

(D-49)

D. EIGENVALUE EQUATIONS. OBSERVABLES

Using (D-47) and the Hermiticity of 1 1

2

=

1

1

2

2

=

2

1

2

[cf. equation (D-25)], we obtain: (D-50)

and (D-49) can be rewritten in the form: (

1

2)

1

Since, by hypothesis, ( .

=0

2

(D-51) 2)

1

is not zero, we can deduce (D-48) from this equality.

Theorem III (fundamental)

If two observables and commute, one can construct an orthonormal basis of the state space with eigenvectors common to and . Consider two commuting observables, and . In order to simplify the notation, we shall assume that their spectra are entirely discrete. Since is an observable, there exists at least one orthonormal system of eigenvectors of which forms a basis in the state space. We shall denote these vectors by : =

;

=1 2 =1 2

(D-52)

is the degree of degeneracy of the eigenvalue sponding eigensubspace E . We have: =

, that is, the dimension of the corre-

(D-53)

What does the matrix look like which represents in the basis? We know (cf. theorem II) that the matrix elements are zero when = (on the other hand, we can say nothing a priori about what happens for = and = ). Let us arrange the basis vectors in the order: 1 2 1 2 1 1 ; ; 1 1 2 3 1 2 We then obtain for a “block-diagonal” matrix, that is, of the form:

1

2

3

1

2

.. .. . . .. .. . .

0

0

.. .. . . .. .. . . .. .. . .

3

..

.

..

.

..

.

0

0

0

0

0

0

..

0

0

0

.

(D-54)

0 .. .. . . .. .. . . 135

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

(only the dotted square parts contain non-zero matrix elements). The fact that the eigensubspaces E are globally invariant under the action of (cf. § D-3-a- ) is evident from this matrix. Two cases can then arise: (i) When is a nondegenerate eigenvalue of , there exists only one eigenvector of , of eigenvalue (the index in is then unnecessary): the dimension of E is then equal to 1. In the matrix (D-54), the corresponding “block”, then reduces to a 1 1 matrix, that is, to a simple number. In the column associated with , all the other matrix elements are zero. This expresses the fact (cf. § D-3-a- -i) that is an eigenvector common to and . (ii) When is a degenerate eigenvalue of ( 1), the “block” which represents in E is not, in general, diagonal: the are not, in general, eigenvectors of . It can be seen, nevertheless, that, since the action of on each of the vectors reduces to a simple multiplication by , the matrix representing the restriction of to within E is equal to (where is the unit matrix). This expresses the fact that an arbitrary ket of E is an eigenvector of with the eigenvalue . The choice in E of a basis such as ; = 1 2 is therefore arbitrary. Whatever this basis, the matrix representing the operator in E is always diagonal and equal to . We shall use this property to obtain a basis of E composed of vectors that are also eigenvectors of . The matrix representing in E , when the basis chosen is: ; = 1 2 , has for its elements: ( )

=

(D-55) ( )

( )

This matrix is Hermitian = , since is a Hermitian operator. It is therefore diagonalizable, that is, one can find in E a new basis ; = 1 2 in which is represented by a diagonal matrix: =

( )

(D-56)

This means that the new basis vectors in E are eigenvectors of =

:

( )

(D-57)

As we saw above, these vectors are automatically eigenvectors of with an eigenvalue since they belong to E . Let us stress the fact that the eigenvectors of associated with degenerate eigenvalues are not necessarily eigenvectors of . What we have just shown is that it is always possible to choose, in every eigensubspace of , a basis of eigenvectors common to and . If we perform this operation in all the subspaces E , we obtain a basis of E , formed by eigenvectors common to and . The theorem is therefore proved. Comments:

(i) From now on, we shall denote by

the eigenvectors common to

and

.

= = 136

(D-58)

D. EIGENVALUE EQUATIONS. OBSERVABLES

The indices and which appear in enable us to specify the eigenvalues and of and . The additional index will eventually be used to distinguish between the different basis vectors which correspond to the same eigenvalues and (§ D-3-b below). (ii) The converse of theorem III is very simple to prove: if there exists a basis of eigenvectors common to and , these two observables commute. From (D-58), it is easy to deduce: =

=

=

=

(D-59)

and, subtracting these equations: [

]

=0

(D-60)

This relation is valid for all , form a basis, (D-60) entails [

and . Since, by hypothesis, the vectors ] = 0.

(iii) We shall occasionally solve the eigenvalue equation of an observable that: = where

+

avec [

]=0

and

are also observables.

such

(D-61)

When one has found a basis of eigenvectors common to and , the problem is solved, since we see immediately that is also an eigenvector of , with an eigenvalue + . The fact that constitutes a basis is obviously essential: this allows us, for example, to show simply that all the eigenvalues of are of the form + . D-3-b.

Complete8 sets of commuting observables (C.S.C.O.)9

Consider an observable and a basis of E composed of eigenvectors of . If none of the eigenvalues of is degenerate, the various basis vectors of E can be labelled by the eigenvalue (the index in being in this case unnecessary). All the eigensubspaces E are then one-dimensional. Therefore, specifying the eigenvalue determines in a unique way the corresponding eigenvector (to within a constant factor). In other words, there exists only one basis of E formed by the eigenvectors of (we shall not consider here as distinct two bases whose vectors are proportional). It is then said that the observable constitutes, by itself, a C.S.C.O. If, on the other hand, one or several eigenvalues of are degenerate, the situation is different. Specifying is no longer always sufficient to characterize a basis vector, since several independent vectors correspond to any degenerate eigenvalue. In this case, 8 The word “complete” is used here in a sense which is totally unrelated to those referred to in the note 3 of § A-2-a, p. 91. This use of the word “complete” is customary in quantum mechanics. 9 To have a good understanding of the important concepts introduced in this section, the reader should apply them to a concrete example such as the one discussed in Complement HII (solved exercises 11 and 12).

137

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

the basis of eigenvectors of is obviously not unique. One can choose any basis inside each of the eigensubspaces E of dimension greater than 1. Let us then choose another observable which commutes with , and let us construct an orthonormal basis of eigenvectors common to and . By definition, and form a C.S.C.O. if this basis is unique (to within a phase factor for each of the basis vectors), that is, if, to each of the possible pairs of eigenvalues , there corresponds only one basis vector. Comment: In § D-3-a, we constructed a basis of eigenvectors common to and by solving the eigenvalue equation of inside each eigensubspace E of E . For and to constitute a C.S.C.O., it is necessary and sufficient that, inside each of these subspaces, all the eigenvalues of be distinct. Since all the vectors of E correspond to the same eigenvalue of , the vectors can then be distinguished by the eigenvalue of which is associated with them. Note that it is not necessary that all the eigenvalues of be non-degenerate: vectors belonging to two distinct subspaces E can have the same eigenvalue for . Moreover, if all the eigenvalues of were non-degenerate, alone would constitute a C.S.C.O.

If, for at least one of the possible pairs , there exist several independent vectors which are eigenvectors of and with these eigenvalues, the set is not complete. Let us add to it, then, a third observable , which commutes with both and . We can then use the same argument as in § D-3-a above, generalizing it in the following way. When to a pair , there corresponds only one vector, this vector is necessarily an eigenvector of . If there are several vectors, they form an eigensubspace E , in which it is possible to choose a basis formed by vectors which are also eigenvectors of . One can thus construct, in the state space, an orthonormal basis formed by eigenvectors common to , and . , and form a C.S.C.O. if this basis is unique (to within multiplicative factors). Specifying a possible set of eigenvalues of , , then characterizes only one of the vectors of this basis. If this is not the case, one adds to , , an observable which commutes with each of these three operators, and so on. In general, we are thus led to the following: By definition, a set of observables , , ... is called a complete set of commuting observables if (i) all the observables , , ... commute by pairs, (ii) specifying the eigenvalues of all the operators , , ... determines a unique (to within a multiplicative factor) common eigenvector. An equivalent way of saying this is the following: A set of observables , , ... is a complete set of commuting observables if there exists a unique orthonormal basis of common eigenvectors (to within phase factors). C.S.C.O.’s play an important role in quantum mechanics. We shall see numerous examples of them (see, in particular, § E-2-d).

138

E. TWO IMPORTANT EXAMPLES OF REPRESENTATIONS AND OBSERVABLES

Comments:

(i) If is a C.S.C.O., another C.S.C.O. can be obtained by adding to it any observable , on the condition, of course, that it commutes with and . However, it is generally understood that one is confined to “minimal” sets, that is, those which cease to be complete when any one of the observables is omitted. (ii) Let be a C.S.C.O.. Since the specification of the eigenvalues , , ... determines a unique ket of the corresponding basis (to within a constant factor), this ket is sometimes denoted by . (iii) For a given physical system, there exist several distinct complete sets of commuting observables. We shall see a particular example of this in § E-2-d. E.

Two important examples of representations and observables

In this paragraph, we shall return to the space $\mathcal{F}$ of wave functions of a particle, or, more exactly, to the state space $\mathcal{E}_{\mathbf r}$ which is associated with it, and which we shall define in the following way. Let there correspond to every wave function $\psi(\mathbf r)$ a ket $|\psi\rangle$ belonging to $\mathcal{E}_{\mathbf r}$; this correspondence is linear. Moreover, the scalar product of two kets coincides with that of the functions which are associated with them:

$$\langle\varphi|\psi\rangle = \int \mathrm d^3r\; \varphi^*(\mathbf r)\,\psi(\mathbf r) \tag{E-1}$$

$\mathcal{E}_{\mathbf r}$ is thus the state space of a (spinless) particle. We are going to define and study, in this space, two representations and two operators which are particularly important. In Chapter III we shall associate them with the position and the momentum of the particle under consideration. They will enable us, moreover, to apply and illustrate the concepts which we have introduced in the preceding sections.

E-1. The $\{|\mathbf r\rangle\}$ and $\{|\mathbf p\rangle\}$ representations

E-1-a. Definition

In §§ A-3-a and A-3-b, we introduced two particular "bases" of $\mathcal{F}$: $\{\xi_{\mathbf r_0}(\mathbf r)\}$ and $\{v_{\mathbf p_0}(\mathbf r)\}$. They are not composed of functions belonging to $\mathcal{F}$:

$$\xi_{\mathbf r_0}(\mathbf r) = \delta(\mathbf r - \mathbf r_0) \tag{E-2-a}$$
$$v_{\mathbf p_0}(\mathbf r) = (2\pi\hbar)^{-3/2}\, e^{\frac{i}{\hbar}\mathbf p_0\cdot\mathbf r} \tag{E-2-b}$$

However, every sufficiently regular square-integrable function can be expanded in one or the other of these "bases". This is why we shall remove the quotation marks and associate a ket with each of the functions of these bases (cf. § B-2-c). The ket associated with $\xi_{\mathbf r_0}(\mathbf r)$ will be denoted simply by $|\mathbf r_0\rangle$, and that associated with $v_{\mathbf p_0}(\mathbf r)$, by $|\mathbf p_0\rangle$:

$$\xi_{\mathbf r_0}(\mathbf r)\ \Longleftrightarrow\ |\mathbf r_0\rangle \tag{E-3-a}$$
$$v_{\mathbf p_0}(\mathbf r)\ \Longleftrightarrow\ |\mathbf p_0\rangle \tag{E-3-b}$$


Using the bases $\{\xi_{\mathbf r_0}(\mathbf r)\}$ and $\{v_{\mathbf p_0}(\mathbf r)\}$ of $\mathcal{F}$, we thus define in $\mathcal{E}_{\mathbf r}$ two representations: the $\{|\mathbf r_0\rangle\}$ representation and the $\{|\mathbf p_0\rangle\}$ representation. A basis vector of the first one is characterized by three "continuous indices" $x_0$, $y_0$ and $z_0$, which are the coordinates of a point in three-dimensional space; for the second, the three indices are also the components of an ordinary vector $\mathbf p_0$.

E-1-b. Orthonormalization and closure relations

Let us calculate $\langle\mathbf r_0|\mathbf r_0'\rangle$. Using the definition of the scalar product in $\mathcal{E}_{\mathbf r}$:

$$\langle\mathbf r_0|\mathbf r_0'\rangle = \int \mathrm d^3r\; \xi^*_{\mathbf r_0}(\mathbf r)\,\xi_{\mathbf r_0'}(\mathbf r) = \delta(\mathbf r_0 - \mathbf r_0') \tag{E-4-a}$$

where relation (A-55) has been used. In the same way:

$$\langle\mathbf p_0|\mathbf p_0'\rangle = \int \mathrm d^3r\; v^*_{\mathbf p_0}(\mathbf r)\,v_{\mathbf p_0'}(\mathbf r) = \delta(\mathbf p_0 - \mathbf p_0') \tag{E-4-b}$$

using (A-47). The bases which we have just defined are therefore orthonormal in the extended sense. The fact that the set of the $|\mathbf r_0\rangle$ or that of the $|\mathbf p_0\rangle$ constitutes a basis in $\mathcal{E}_{\mathbf r}$ can be expressed by a closure relation in $\mathcal{E}_{\mathbf r}$. This is written in an analogous manner to (C-10), integrating here, however, over three indices instead of one. We therefore have the fundamental relations:

$$\begin{aligned}
&\langle\mathbf r_0|\mathbf r_0'\rangle = \delta(\mathbf r_0 - \mathbf r_0') &&(a) \qquad &&\int \mathrm d^3r_0\, |\mathbf r_0\rangle\langle\mathbf r_0| = \mathbb 1 &&(b)\\
&\langle\mathbf p_0|\mathbf p_0'\rangle = \delta(\mathbf p_0 - \mathbf p_0') &&(c) \qquad &&\int \mathrm d^3p_0\, |\mathbf p_0\rangle\langle\mathbf p_0| = \mathbb 1 &&(d)
\end{aligned} \tag{E-5}$$

E-1-c. Components of a ket

Consider an arbitrary ket $|\psi\rangle$, corresponding to the wave function $\psi(\mathbf r)$. The preceding closure relations enable us to write it in either of these two forms:

$$|\psi\rangle = \int \mathrm d^3r_0\; |\mathbf r_0\rangle\langle\mathbf r_0|\psi\rangle \tag{E-6-a}$$
$$|\psi\rangle = \int \mathrm d^3p_0\; |\mathbf p_0\rangle\langle\mathbf p_0|\psi\rangle \tag{E-6-b}$$

The coefficients $\langle\mathbf r_0|\psi\rangle$ and $\langle\mathbf p_0|\psi\rangle$ can be calculated using the formulas:

$$\langle\mathbf r_0|\psi\rangle = \int \mathrm d^3r\; \xi^*_{\mathbf r_0}(\mathbf r)\,\psi(\mathbf r) \tag{E-7-a}$$
$$\langle\mathbf p_0|\psi\rangle = \int \mathrm d^3r\; v^*_{\mathbf p_0}(\mathbf r)\,\psi(\mathbf r) \tag{E-7-b}$$

We then find:

$$\langle\mathbf r_0|\psi\rangle = \psi(\mathbf r_0) \tag{E-8-a}$$
$$\langle\mathbf p_0|\psi\rangle = \bar\psi(\mathbf p_0) \tag{E-8-b}$$


where $\bar\psi(\mathbf p)$ is the Fourier transform of $\psi(\mathbf r)$. The value $\psi(\mathbf r_0)$ of the wave function at the point $\mathbf r_0$ is thus shown to be the component of the ket $|\psi\rangle$ on the basis vector $|\mathbf r_0\rangle$ of the $\{|\mathbf r_0\rangle\}$ representation. The "wave function in momentum space" $\bar\psi(\mathbf p)$ can be interpreted analogously. The possibility of characterizing $|\psi\rangle$ by $\psi(\mathbf r)$ is thus simply a special case of the results of § C-3-a. For example, for $|\psi\rangle = |\mathbf p_0\rangle$, formula (E-8-a) gives:

$$\langle\mathbf r_0|\mathbf p_0\rangle = v_{\mathbf p_0}(\mathbf r_0) = (2\pi\hbar)^{-3/2}\, e^{\frac{i}{\hbar}\mathbf p_0\cdot\mathbf r_0} \tag{E-9}$$

For $|\psi\rangle = |\mathbf r_0'\rangle$, the result is indeed in agreement with the orthonormalization relation (E-5-a):

$$\langle\mathbf r_0|\mathbf r_0'\rangle = \xi_{\mathbf r_0'}(\mathbf r_0) = \delta(\mathbf r_0 - \mathbf r_0') \tag{E-10}$$

Now that we have reinterpreted the wave function $\psi(\mathbf r)$ and its Fourier transform $\bar\psi(\mathbf p)$, we shall denote the basis vectors of the two representations we are studying here by $|\mathbf r\rangle$ and $|\mathbf p\rangle$, instead of $|\mathbf r_0\rangle$ and $|\mathbf p_0\rangle$. Formulas (E-8) can then be written:

$$\langle\mathbf r|\psi\rangle = \psi(\mathbf r) \tag{E-8-a}$$
$$\langle\mathbf p|\psi\rangle = \bar\psi(\mathbf p) \tag{E-8-b}$$

and the orthonormalization and closure relations (E-5) become:

$$\begin{aligned}
&\langle\mathbf r|\mathbf r'\rangle = \delta(\mathbf r - \mathbf r') &&(a) \qquad &&\int \mathrm d^3r\, |\mathbf r\rangle\langle\mathbf r| = \mathbb 1 &&(b)\\
&\langle\mathbf p|\mathbf p'\rangle = \delta(\mathbf p - \mathbf p') &&(c) \qquad &&\int \mathrm d^3p\, |\mathbf p\rangle\langle\mathbf p| = \mathbb 1 &&(d)
\end{aligned} \tag{E-5}$$

Of course, $\mathbf r$ and $\mathbf p$ are still considered to represent two sets of continuous indices, $\{x, y, z\}$ and $\{p_x, p_y, p_z\}$, which fix the basis kets of the $\{|\mathbf r\rangle\}$ and $\{|\mathbf p\rangle\}$ representations respectively.

Now let $\{u_i(\mathbf r)\}$ be an orthonormal basis of $\mathcal{F}$. With each $u_i(\mathbf r)$ is associated a ket $|u_i\rangle$ of $\mathcal{E}_{\mathbf r}$. The set $\{|u_i\rangle\}$ forms an orthonormal basis in $\mathcal{E}_{\mathbf r}$; it therefore satisfies the closure relation:

$$\sum_i |u_i\rangle\langle u_i| = \mathbb 1 \tag{E-11}$$

Evaluate the matrix element of both sides of (E-11) between $\langle\mathbf r|$ and $|\mathbf r'\rangle$:

$$\sum_i \langle\mathbf r|u_i\rangle\langle u_i|\mathbf r'\rangle = \langle\mathbf r|\mathbb 1|\mathbf r'\rangle = \langle\mathbf r|\mathbf r'\rangle \tag{E-12}$$

According to (E-8-a) and (E-5-a), this relation can be written:

$$\sum_i u_i(\mathbf r)\, u_i^*(\mathbf r') = \delta(\mathbf r - \mathbf r') \tag{E-13}$$

The closure relation for the $u_i(\mathbf r)$ [formula (A-32)] is therefore simply the expression in the $\{|\mathbf r\rangle\}$ representation of the vectorial closure relation (E-11).

E-1-d. The scalar product of two vectors

We have defined the scalar product of two kets of $\mathcal{E}_{\mathbf r}$ as being equal to that of the associated wave functions in $\mathcal{F}$ [equation (E-1)]. In light of the discussion in § E-1-c, this definition appears simply as a special case of formula (C-21). (E-1) can, in fact, be derived by inserting the closure relation (E-5-b) between $\langle\varphi|$ and $|\psi\rangle$:

$$\langle\varphi|\psi\rangle = \int \mathrm d^3r\; \langle\varphi|\mathbf r\rangle\langle\mathbf r|\psi\rangle \tag{E-14}$$

and by interpreting the components $\langle\mathbf r|\psi\rangle$ and $\langle\varphi|\mathbf r\rangle = \langle\mathbf r|\varphi\rangle^*$ as in (E-8-a). If we place ourselves in the $\{|\mathbf p\rangle\}$ representation, a well-known property of the Fourier transform is demonstrated (Appendix I, § 2-c):

$$\langle\varphi|\psi\rangle = \int \mathrm d^3r\; \varphi^*(\mathbf r)\,\psi(\mathbf r) = \int \mathrm d^3p\; \bar\varphi^*(\mathbf p)\,\bar\psi(\mathbf p) \tag{E-15}$$

E-1-e. Changing from the $\{|\mathbf r\rangle\}$ representation to the $\{|\mathbf p\rangle\}$ representation

This is accomplished using the method indicated in § C-5, the only difference arising from the fact that we are dealing here with two continuous bases. Changing from one basis to the other brings in the numbers:

$$\langle\mathbf r|\mathbf p\rangle = \langle\mathbf p|\mathbf r\rangle^* = (2\pi\hbar)^{-3/2}\, e^{\frac{i}{\hbar}\mathbf p\cdot\mathbf r} \tag{E-16}$$

A given ket $|\psi\rangle$ is represented by $\langle\mathbf r|\psi\rangle = \psi(\mathbf r)$ in the $\{|\mathbf r\rangle\}$ representation and by $\langle\mathbf p|\psi\rangle = \bar\psi(\mathbf p)$ in the $\{|\mathbf p\rangle\}$ representation. We already know [relations (E-2-b) and (E-7-b)] that $\psi(\mathbf r)$ and $\bar\psi(\mathbf p)$ are related by a Fourier transform. This is indeed what the formulas for the representation change yield:

$$\langle\mathbf r|\psi\rangle = \int \mathrm d^3p\; \langle\mathbf r|\mathbf p\rangle\langle\mathbf p|\psi\rangle$$

that is:

$$\psi(\mathbf r) = (2\pi\hbar)^{-3/2} \int \mathrm d^3p\; e^{\frac{i}{\hbar}\mathbf p\cdot\mathbf r}\, \bar\psi(\mathbf p) \tag{E-17}$$

Inversely:

$$\langle\mathbf p|\psi\rangle = \int \mathrm d^3r\; \langle\mathbf p|\mathbf r\rangle\langle\mathbf r|\psi\rangle$$

that is:

$$\bar\psi(\mathbf p) = (2\pi\hbar)^{-3/2} \int \mathrm d^3r\; e^{-\frac{i}{\hbar}\mathbf p\cdot\mathbf r}\, \psi(\mathbf r) \tag{E-18}$$

By applying the general formula (C-56), one can easily pass from the matrix elements $\langle\mathbf r|A|\mathbf r'\rangle = A(\mathbf r, \mathbf r')$ of an operator $A$ in the $\{|\mathbf r\rangle\}$ representation to the matrix elements $\langle\mathbf p|A|\mathbf p'\rangle = \bar A(\mathbf p, \mathbf p')$ of the same operator in the $\{|\mathbf p\rangle\}$ representation:

$$\bar A(\mathbf p, \mathbf p') = (2\pi\hbar)^{-3} \int \mathrm d^3r\,\mathrm d^3r'\; e^{-\frac{i}{\hbar}(\mathbf p\cdot\mathbf r - \mathbf p'\cdot\mathbf r')}\, A(\mathbf r, \mathbf r') \tag{E-19}$$

An analogous formula enables one to calculate $A(\mathbf r, \mathbf r')$ from $\bar A(\mathbf p, \mathbf p')$.
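As an added numerical illustration of this change of representation (a one-dimensional Python sketch with $\hbar = 1$ and an arbitrary Gaussian wave packet), the components $\langle p|\psi\rangle$ obtained by discretizing (E-18) reproduce the known Fourier transform of the Gaussian:

```python
import numpy as np

hbar = 1.0
x = np.linspace(-40.0, 40.0, 4096)
dx = x[1] - x[0]

# Gaussian wave packet with mean momentum p0 (illustrative values)
a, p0 = 2.0, 1.5
psi_x = (1.0 / (np.pi * a**2))**0.25 * np.exp(-x**2 / (2 * a**2) + 1j * p0 * x / hbar)

# One-dimensional analogue of (E-18):
# psi_bar(p) = (2*pi*hbar)^(-1/2) * integral dx exp(-i p x / hbar) psi(x)
p = np.linspace(-6.0, 6.0, 200)
kernel = np.exp(-1j * np.outer(p, x) / hbar) / np.sqrt(2 * np.pi * hbar)
psi_p = kernel @ psi_x * dx

# Analytic Fourier transform of the Gaussian, centered at p0
psi_p_exact = (a**2 / (np.pi * hbar**2))**0.25 * np.exp(-a**2 * (p - p0)**2 / (2 * hbar**2))

# negligible difference: the two sets of components describe the same ket
print(np.max(np.abs(psi_p - psi_p_exact)))
```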


E-2. The R and P operators

E-2-a. Definition

Let $|\psi\rangle$ be an arbitrary ket of $\mathcal{E}_{\mathbf r}$ and let $\langle\mathbf r|\psi\rangle = \psi(\mathbf r) = \psi(x, y, z)$ be the corresponding wave function. Using the definition of the operator $X$, the ket:

$$|\psi'\rangle = X\,|\psi\rangle \tag{E-20}$$

is represented, in the $\{|\mathbf r\rangle\}$ basis, by the function $\psi'(x, y, z) = \langle\mathbf r|\psi'\rangle$ such that:

$$\psi'(x, y, z) = x\,\psi(x, y, z) \tag{E-21}$$

In the $\{|\mathbf r\rangle\}$ representation, the operator $X$ therefore coincides with the operator which multiplies by $x$. Although we characterize $X$ by the way in which it transforms the wave functions, it is an operator which acts in the state space $\mathcal{E}_{\mathbf r}$. We can introduce two other operators, $Y$ and $Z$, in an analogous manner. Thus we define $X$, $Y$ and $Z$ by the formulas:

$$\langle\mathbf r|X|\psi\rangle = x\,\langle\mathbf r|\psi\rangle \tag{E-22-a}$$
$$\langle\mathbf r|Y|\psi\rangle = y\,\langle\mathbf r|\psi\rangle \tag{E-22-b}$$
$$\langle\mathbf r|Z|\psi\rangle = z\,\langle\mathbf r|\psi\rangle \tag{E-22-c}$$

where the numbers $x$, $y$, $z$ are precisely the three indices which label the ket $|\mathbf r\rangle$. $X$, $Y$ and $Z$ will be considered to be the "components" of a "vector operator" $\mathbf R$: for the moment, we shall treat this simply as a condensed notation, suggested by the fact that $x$, $y$, $z$ are the components of the ordinary vector $\mathbf r$.

Manipulation of the $X$, $Y$, $Z$ operators is particularly simple in the $\{|\mathbf r\rangle\}$ representation. For example, in order to calculate the matrix element $\langle\varphi|X|\psi\rangle$, all we need to do is insert the closure relation (E-5-b) between $\langle\varphi|$ and $X$ and use definition (E-22-a):

$$\langle\varphi|X|\psi\rangle = \int \mathrm d^3r\; \langle\varphi|\mathbf r\rangle\langle\mathbf r|X|\psi\rangle = \int \mathrm d^3r\; \varphi^*(\mathbf r)\, x\,\psi(\mathbf r) \tag{E-23}$$

Similarly, we define the vector operator $\mathbf P$ by its components $P_x$, $P_y$, $P_z$, whose action, in the $\{|\mathbf p\rangle\}$ representation, is given by:

$$\langle\mathbf p|P_x|\psi\rangle = p_x\,\langle\mathbf p|\psi\rangle \tag{E-24-a}$$
$$\langle\mathbf p|P_y|\psi\rangle = p_y\,\langle\mathbf p|\psi\rangle \tag{E-24-b}$$
$$\langle\mathbf p|P_z|\psi\rangle = p_z\,\langle\mathbf p|\psi\rangle \tag{E-24-c}$$

where $p_x$, $p_y$, $p_z$ are the three indices which appear in the ket $|\mathbf p\rangle$.

Let us ascertain how the $\mathbf P$ operator acts in the $\{|\mathbf r\rangle\}$ representation. To do so (cf. § C-5-d), we use the closure relation (E-5-d) and the transformation matrix (E-16)


to obtain:

$$\langle\mathbf r|P_x|\psi\rangle = \int \mathrm d^3p\; \langle\mathbf r|\mathbf p\rangle\langle\mathbf p|P_x|\psi\rangle = (2\pi\hbar)^{-3/2} \int \mathrm d^3p\; e^{\frac{i}{\hbar}\mathbf p\cdot\mathbf r}\, p_x\, \bar\psi(\mathbf p) \tag{E-25}$$

We recognize in (E-25) the Fourier transform of $p_x\,\bar\psi(\mathbf p)$, that is, $\dfrac{\hbar}{i}\dfrac{\partial}{\partial x}\psi(\mathbf r)$ [Appendix I, relation (38a)]. Therefore:

$$\langle\mathbf r|\mathbf P|\psi\rangle = \frac{\hbar}{i}\,\boldsymbol\nabla\,\langle\mathbf r|\psi\rangle \tag{E-26}$$

In the $\{|\mathbf r\rangle\}$ representation, the $\mathbf P$ operator coincides with the differential operator $(\hbar/i)\boldsymbol\nabla$ applied to the wave functions. The calculation of a matrix element such as $\langle\varphi|P_x|\psi\rangle$ in the $\{|\mathbf r\rangle\}$ representation is therefore performed in the following manner:

$$\langle\varphi|P_x|\psi\rangle = \int \mathrm d^3r\; \langle\varphi|\mathbf r\rangle\langle\mathbf r|P_x|\psi\rangle = \int \mathrm d^3r\; \varphi^*(\mathbf r)\,\frac{\hbar}{i}\frac{\partial}{\partial x}\psi(\mathbf r) \tag{E-27}$$

Placing ourselves in the $\{|\mathbf r\rangle\}$ representation, we can also calculate the commutators between the $X$, $Y$, $Z$, $P_x$, $P_y$, $P_z$ operators. For example:

$$\begin{aligned}
\langle\mathbf r|[X, P_x]|\psi\rangle &= \langle\mathbf r|(X P_x - P_x X)|\psi\rangle\\
&= x\,\frac{\hbar}{i}\frac{\partial}{\partial x}\langle\mathbf r|\psi\rangle - \frac{\hbar}{i}\frac{\partial}{\partial x}\bigl[x\,\langle\mathbf r|\psi\rangle\bigr]\\
&= i\hbar\,\langle\mathbf r|\psi\rangle
\end{aligned} \tag{E-28}$$

This calculation is valid for any ket $|\psi\rangle$ and for any bra $\langle\mathbf r|$ of the $\{|\mathbf r\rangle\}$ basis. Thus one finds¹⁰:

$$[X, P_x] = i\hbar \tag{E-29}$$

In the same way, we find all the other commutators between the components of $\mathbf R$ and those of $\mathbf P$. The result can be written in the form:

$$\begin{aligned}
[R_i, R_j] &= 0\\
[P_i, P_j] &= 0\\
[R_i, P_j] &= i\hbar\,\delta_{ij}
\end{aligned} \qquad i, j = 1, 2, 3 \tag{E-30}$$

where $R_1$, $R_2$, $R_3$ and $P_1$, $P_2$, $P_3$ designate respectively $X$, $Y$, $Z$ and $P_x$, $P_y$, $P_z$. Formulas (E-30) are called canonical commutation relations.

¹⁰ The commutator $[X, P_x]$ is an operator, and it should, actually, be written $[X, P_x] = i\hbar\,\mathbb 1$. However, we shall often replace the identity operator by the number 1, except when it is important to make the distinction.
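A word of caution if one wishes to check (E-30) numerically: finite matrices can never satisfy $[X, P_x] = i\hbar$ exactly (the trace of a commutator vanishes, while $\mathrm{Tr}(i\hbar\,\mathbb 1) \neq 0$). The added Python sketch below (with $\hbar = 1$, an arbitrary grid and an arbitrary wave packet) shows that the discretized operators nevertheless reproduce $i\hbar\,\psi$ when applied to a smooth function that vanishes at the edges of the grid.

```python
import numpy as np

hbar = 1.0
N = 800
x = np.linspace(-20.0, 20.0, N)
dx = x[1] - x[0]

X = np.diag(x)                                        # X: multiplication by x
D = (np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1)) / (2 * dx)
P = (hbar / 1j) * D                                   # P: (hbar/i) d/dx (central differences)

psi = np.exp(-x**2 / 4) * np.exp(1j * 1.3 * x)        # smooth packet, negligible at the edges
lhs = (X @ P - P @ X) @ psi                           # [X, P] |psi>
rhs = 1j * hbar * psi

# agreement at interior points, up to a small O(dx^2) finite-difference error
print(np.max(np.abs(lhs[2:-2] - rhs[2:-2])))
```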


E-2-b. R and P are Hermitian

In order to show that $X$, for example, is a Hermitian operator, we can use formula (E-23):

$$\langle\varphi|X|\psi\rangle = \int \mathrm d^3r\; \varphi^*(\mathbf r)\, x\,\psi(\mathbf r) = \left[\int \mathrm d^3r\; \psi^*(\mathbf r)\, x\,\varphi(\mathbf r)\right]^* = \langle\psi|X|\varphi\rangle^* \tag{E-31}$$

From § B-4-e, we know that equation (E-31) is characteristic of a Hermitian operator. Similar proofs show that $Y$ and $Z$ are also Hermitian. For $P_x$, $P_y$ and $P_z$, the $\{|\mathbf p\rangle\}$ representation can be used, and the calculations are then analogous to the preceding ones.

It is interesting to show that $\mathbf P$ is Hermitian by using equation (E-26), which gives its action in the $\{|\mathbf r\rangle\}$ representation. Consider, for example, formula (E-27) and integrate it by parts:

$$\langle\varphi|P_x|\psi\rangle = \frac{\hbar}{i}\int \mathrm dy\,\mathrm dz\left\{\Bigl[\varphi^*(\mathbf r)\,\psi(\mathbf r)\Bigr]_{x=-\infty}^{x=+\infty} - \int_{-\infty}^{+\infty}\mathrm dx\;\frac{\partial\varphi^*}{\partial x}(\mathbf r)\,\psi(\mathbf r)\right\} \tag{E-32}$$

Since the integral which yields the scalar product $\langle\varphi|\psi\rangle$ is convergent, $\varphi^*(\mathbf r)\,\psi(\mathbf r)$ approaches zero when $x \to \pm\infty$. The first term on the right-hand side of (E-32) is therefore equal to zero, and:

$$\langle\varphi|P_x|\psi\rangle = -\frac{\hbar}{i}\int \mathrm d^3r\; \frac{\partial\varphi^*}{\partial x}(\mathbf r)\,\psi(\mathbf r) = \left[\int \mathrm d^3r\; \psi^*(\mathbf r)\,\frac{\hbar}{i}\frac{\partial\varphi}{\partial x}(\mathbf r)\right]^* = \langle\psi|P_x|\varphi\rangle^* \tag{E-33}$$

It can be seen that the presence of the imaginary number $i$ is essential. The differential operator $\dfrac{\partial}{\partial x}$, acting on the functions of $\mathcal{F}$, is not Hermitian, because of the sign change which is introduced by the integration by parts. However, $\dfrac{\hbar}{i}\dfrac{\partial}{\partial x}$ is Hermitian, as is $\dfrac{1}{i}\dfrac{\partial}{\partial x}$.
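This sign change can be made concrete with a small added sketch (Python, $\hbar = 1$, arbitrary grid parameters): the central-difference matrix representing $\partial/\partial x$ is antisymmetric, hence not Hermitian, while $(\hbar/i)\,\partial/\partial x$ is.

```python
import numpy as np

hbar, N, dx = 1.0, 500, 0.05
# central-difference approximation of d/dx on a uniform grid
D = (np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1)) / (2 * dx)

P = (hbar / 1j) * D
print(np.allclose(P, P.conj().T))     # True:  (hbar/i) d/dx is Hermitian
print(np.allclose(D, D.conj().T))     # False: d/dx alone is not
```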

E-2-c. Eigenvectors of R and P

Consider the action of the $X$ operator on the ket $|\mathbf r_0\rangle$; according to (E-22-a), we have:

$$\langle\mathbf r|X|\mathbf r_0\rangle = x\,\langle\mathbf r|\mathbf r_0\rangle = x\,\delta(\mathbf r - \mathbf r_0) = x_0\,\delta(\mathbf r - \mathbf r_0) = x_0\,\langle\mathbf r|\mathbf r_0\rangle \tag{E-34}$$

This equation expresses the fact that the components, in the $\{|\mathbf r\rangle\}$ representation, of the ket $X|\mathbf r_0\rangle$ are equal to those of the ket $|\mathbf r_0\rangle$ multiplied by $x_0$. We therefore have:

$$X\,|\mathbf r_0\rangle = x_0\,|\mathbf r_0\rangle \tag{E-35}$$

An analogous argument shows that the kets $|\mathbf r_0\rangle$ are also eigenvectors of the $Y$ and $Z$ operators. Omitting the index zero, which then becomes unnecessary, we can write:

$$X\,|\mathbf r\rangle = x\,|\mathbf r\rangle \qquad Y\,|\mathbf r\rangle = y\,|\mathbf r\rangle \qquad Z\,|\mathbf r\rangle = z\,|\mathbf r\rangle \tag{E-36}$$

The kets $|\mathbf r\rangle$ are therefore the eigenkets common to $X$, $Y$ and $Z$. Thus the notation $|\mathbf r\rangle$ which we chose above is justified: each eigenvector is labelled by a vector $\mathbf r$, whose components $x$, $y$, $z$ represent three continuous indices which correspond to the eigenvalues of $X$, $Y$, $Z$.

Similar arguments can be elaborated for the $\mathbf P$ operator, placing ourselves, this time, in the $\{|\mathbf p\rangle\}$ representation. We then obtain:

$$P_x\,|\mathbf p\rangle = p_x\,|\mathbf p\rangle \qquad P_y\,|\mathbf p\rangle = p_y\,|\mathbf p\rangle \qquad P_z\,|\mathbf p\rangle = p_z\,|\mathbf p\rangle \tag{E-37}$$

Comment:

This result can also be derived from equation (E-26), which gives the action of $\mathbf P$ in the $\{|\mathbf r\rangle\}$ representation. Using (E-9), we find:

$$\langle\mathbf r|P_x|\mathbf p\rangle = \frac{\hbar}{i}\frac{\partial}{\partial x}\langle\mathbf r|\mathbf p\rangle = \frac{\hbar}{i}\frac{\partial}{\partial x}\Bigl[(2\pi\hbar)^{-3/2}\,e^{\frac{i}{\hbar}\mathbf p\cdot\mathbf r}\Bigr] = p_x\,(2\pi\hbar)^{-3/2}\,e^{\frac{i}{\hbar}\mathbf p\cdot\mathbf r} = p_x\,\langle\mathbf r|\mathbf p\rangle \tag{E-38}$$

All the components of the ket $P_x|\mathbf p\rangle$ in the $\{|\mathbf r\rangle\}$ representation can be obtained by multiplying those of $|\mathbf p\rangle$ by the constant $p_x$: $|\mathbf p\rangle$ is an eigenket of $P_x$ with the eigenvalue $p_x$.

E-2-d. R and P are observables

Relations (E-5-b) and (E-5-d) express the fact that the $|\mathbf r\rangle$ vectors and the $|\mathbf p\rangle$ vectors constitute bases in $\mathcal{E}_{\mathbf r}$. Therefore, $\mathbf R$ and $\mathbf P$ are observables. Moreover, the specification of the three eigenvalues $x_0$, $y_0$, $z_0$ of $X$, $Y$, $Z$ uniquely determines the corresponding eigenvector $|\mathbf r_0\rangle$: in the $\{|\mathbf r\rangle\}$ representation, its coordinates are $\delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0)$. The set of the three operators $X$, $Y$, $Z$ therefore constitutes a C.S.C.O. in $\mathcal{E}_{\mathbf r}$. It can be shown in the same way that the three components $P_x$, $P_y$, $P_z$ of $\mathbf P$ also constitute a C.S.C.O. in $\mathcal{E}_{\mathbf r}$.

Note that, in $\mathcal{E}_{\mathbf r}$, $X$ does not constitute a C.S.C.O. by itself. When the $x_0$ index is fixed, $y_0$ and $z_0$ can take on any real values. Thus, each eigenvalue $x_0$ is infinitely degenerate. On the other hand, in the state space $\mathcal{E}_x$ of a one-dimensional problem, $X$ constitutes a C.S.C.O.: the eigenvalue $x_0$ uniquely determines the corresponding eigenket $|x_0\rangle$, its coordinates being $\delta(x - x_0)$ in the $\{|x\rangle\}$ representation.


Comment:

We have found two C.S.C.O.'s in $\mathcal{E}_{\mathbf r}$: $\{X, Y, Z\}$ and $\{P_x, P_y, P_z\}$. We shall encounter others later. Consider, for example, the set $\{X, P_y, P_z\}$: these three observables commute (equations (E-30)); moreover, if the three eigenvalues $x_0$, $p_{y0}$ and $p_{z0}$ are fixed, there corresponds to them only one ket, whose associated wave function is written:

$$\psi_{x_0,\,p_{y0},\,p_{z0}}(x, y, z) = \delta(x - x_0)\,\frac{1}{2\pi\hbar}\,e^{\frac{i}{\hbar}(p_{y0}\,y + p_{z0}\,z)} \tag{E-39}$$

The set $\{X, P_y, P_z\}$ therefore also constitutes a C.S.C.O. in $\mathcal{E}_{\mathbf r}$.

F. Tensor product of state spaces¹¹

F-1. Introduction

We introduced the state space of a physical system using the concept of a one-particle wave function. However, our reasoning has involved sometimes one- and sometimes three-dimensional wave functions. Now it is clear that the space of square-integrable functions is not the same for functions of one variable $\psi(x)$ as for functions of three variables $\psi(\mathbf r)$: $\mathcal{E}_{\mathbf r}$ and $\mathcal{E}_x$ are therefore different spaces. Nevertheless, $\mathcal{E}_{\mathbf r}$ appears to be essentially a generalization of $\mathcal{E}_x$. Does there exist a more precise relation between these two spaces?

In this section, we are going to define and study the operation of taking the tensor product of vector spaces¹², and apply it to state spaces. This will answer, in particular, the question we have just asked: $\mathcal{E}_{\mathbf r}$ can be constructed from $\mathcal{E}_x$ and two other spaces, $\mathcal{E}_y$ and $\mathcal{E}_z$, which are isomorphic to it (§ F-4-a below).

In the same way, we shall be concerned later (Chapters IV and IX) with the existence, for certain particles, of an intrinsic angular momentum or spin. In addition to the external degrees of freedom (position, momentum), which are treated using the observables $\mathbf R$ and $\mathbf P$ defined in $\mathcal{E}_{\mathbf r}$, it will be necessary to take into account the internal degrees of freedom and to introduce spin observables which act in a spin state space $\mathcal{E}_s$. The state space of a particle with spin will then be seen to be the tensor product of $\mathcal{E}_{\mathbf r}$ and $\mathcal{E}_s$.

Finally, the concept of a tensor product of state spaces allows us to solve the following problem. Let (1) and (2) be two isolated physical systems (they are, for example, sufficiently far apart that their interactions are perfectly negligible). The state spaces which correspond to (1) and (2) are, respectively, $\mathcal{E}_1$ and $\mathcal{E}_2$. Now let us assume that we consider the set of these two systems to form one physical system (this becomes indispensable when they are close enough to interact). What is then the state space $\mathcal{E}$ of the global system? It can be seen from these examples how useful the definitions and results of this section are in quantum mechanics.

F-2. Definition and properties of the tensor product

Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be two¹³ spaces, of dimension $N_1$ and $N_2$ respectively ($N_1$ and $N_2$ can be finite or infinite). Vectors and operators of these spaces will be assigned an index,

¹¹ This section is not necessary for the understanding of Chapter III. One can study it later when it becomes necessary to use tensor products (Complement D_IV, or Chapter IX).
¹² This operation is sometimes called the "Kronecker product".
¹³ The following definitions can easily be extended to the tensor product of a finite number of spaces.


(1) or (2), depending on whether they belong to $\mathcal{E}_1$ or $\mathcal{E}_2$.

F-2-a. Tensor product space $\mathcal{E}$

α. Definition

By definition, the vector space $\mathcal{E}$ is called the tensor product of $\mathcal{E}_1$ and $\mathcal{E}_2$:

$$\mathcal{E} = \mathcal{E}_1 \otimes \mathcal{E}_2 \tag{F-1}$$

if there is associated with each pair of vectors, $|\varphi(1)\rangle$ belonging to $\mathcal{E}_1$ and $|\chi(2)\rangle$ belonging to $\mathcal{E}_2$, a vector of $\mathcal{E}$, denoted by¹⁴:

$$|\varphi(1)\rangle \otimes |\chi(2)\rangle \tag{F-2}$$

which is called the tensor product of $|\varphi(1)\rangle$ and $|\chi(2)\rangle$, this correspondence satisfying the following conditions:

(i) It is linear with respect to multiplication by complex numbers:

$$\begin{aligned}
\bigl[\lambda\,|\varphi(1)\rangle\bigr] \otimes |\chi(2)\rangle &= \lambda\,\bigl[|\varphi(1)\rangle \otimes |\chi(2)\rangle\bigr]\\
|\varphi(1)\rangle \otimes \bigl[\mu\,|\chi(2)\rangle\bigr] &= \mu\,\bigl[|\varphi(1)\rangle \otimes |\chi(2)\rangle\bigr]
\end{aligned} \tag{F-3}$$

(ii) It is distributive with respect to vector addition:

$$\begin{aligned}
|\varphi(1)\rangle \otimes \bigl[|\chi_1(2)\rangle + |\chi_2(2)\rangle\bigr] &= |\varphi(1)\rangle \otimes |\chi_1(2)\rangle + |\varphi(1)\rangle \otimes |\chi_2(2)\rangle\\
\bigl[|\varphi_1(1)\rangle + |\varphi_2(1)\rangle\bigr] \otimes |\chi(2)\rangle &= |\varphi_1(1)\rangle \otimes |\chi(2)\rangle + |\varphi_2(1)\rangle \otimes |\chi(2)\rangle
\end{aligned} \tag{F-4}$$

(iii) When a basis has been chosen in each of the spaces $\mathcal{E}_1$ and $\mathcal{E}_2$, $\{|u_i(1)\rangle\}$ for $\mathcal{E}_1$ and $\{|v_l(2)\rangle\}$ for $\mathcal{E}_2$, the set of vectors $\{|u_i(1)\rangle \otimes |v_l(2)\rangle\}$ constitutes a basis in $\mathcal{E}$. If $N_1$ and $N_2$ are finite, the dimension of $\mathcal{E}$ is consequently $N_1 N_2$.

β. Vectors of $\mathcal{E}$

(i) Let us first consider a tensor product vector, $|\varphi(1)\rangle \otimes |\chi(2)\rangle$. Whatever $|\varphi(1)\rangle$ and $|\chi(2)\rangle$ may be, they can be expressed in the $\{|u_i(1)\rangle\}$ and $\{|v_l(2)\rangle\}$ bases respectively:

$$|\varphi(1)\rangle = \sum_i a_i\,|u_i(1)\rangle \qquad |\chi(2)\rangle = \sum_l b_l\,|v_l(2)\rangle \tag{F-5}$$

Using the properties described in § F-2-a-α, the expansion of the vector $|\varphi(1)\rangle \otimes |\chi(2)\rangle$ in the $\{|u_i(1)\rangle \otimes |v_l(2)\rangle\}$ basis can be written:

$$|\varphi(1)\rangle \otimes |\chi(2)\rangle = \sum_{i,\,l} a_i\,b_l\;|u_i(1)\rangle \otimes |v_l(2)\rangle \tag{F-6}$$

¹⁴ This vector can be written either $|\varphi(1)\rangle \otimes |\chi(2)\rangle$ or $|\chi(2)\rangle \otimes |\varphi(1)\rangle$; the order of the two vectors is of no importance.


Therefore, the components of a tensor product vector are the products of the components of the two vectors of the product.

(ii) There exist in $\mathcal{E}$ vectors which are not tensor products of a vector of $\mathcal{E}_1$ by a vector of $\mathcal{E}_2$. Since $\{|u_i(1)\rangle \otimes |v_l(2)\rangle\}$ constitutes by hypothesis a basis in $\mathcal{E}$, the most general vector of $\mathcal{E}$ is expressed by:

$$|\psi\rangle = \sum_{i,\,l} c_{i l}\;|u_i(1)\rangle \otimes |v_l(2)\rangle \tag{F-7}$$

Given $N_1 N_2$ arbitrary complex numbers $c_{i l}$, it is not always possible to put them in the form of products, $c_{i l} = a_i\,b_l$, of $N_1$ numbers $a_i$ and $N_2$ numbers $b_l$. Therefore, in general, vectors $|\varphi(1)\rangle$ and $|\chi(2)\rangle$ of which $|\psi\rangle$ is the tensor product do not exist. However, an arbitrary vector of $\mathcal{E}$ can always be decomposed into a linear combination of tensor product vectors, as is shown by formula (F-7).
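The distinction between tensor product vectors and general vectors of $\mathcal{E}$ can be checked numerically (an added Python sketch with arbitrary dimensions $N_1 = 3$, $N_2 = 4$): arranging the components $c_{il}$ into an $N_1 \times N_2$ matrix, a vector is a tensor product exactly when this matrix has rank 1.

```python
import numpy as np

rng = np.random.default_rng(1)
N1, N2 = 3, 4

# A tensor product vector: its N1*N2 components are the products a_i * b_l
a = rng.normal(size=N1) + 1j * rng.normal(size=N1)
b = rng.normal(size=N2) + 1j * rng.normal(size=N2)
product_vec = np.kron(a, b)

# A generic vector of E, with arbitrary components c_il
generic_vec = rng.normal(size=N1 * N2) + 1j * rng.normal(size=N1 * N2)

def is_tensor_product(v):
    # the N1 x N2 matrix of components has rank 1 exactly for product vectors
    return np.linalg.matrix_rank(v.reshape(N1, N2), tol=1e-10) == 1

print(is_tensor_product(product_vec))   # True
print(is_tensor_product(generic_vec))   # False (generically rank > 1)
```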

γ. The scalar product in $\mathcal{E}$

The existence of scalar products in $\mathcal{E}_1$ and $\mathcal{E}_2$ permits us to define one in $\mathcal{E}$ as well. We first define the scalar product of $|\varphi(1)\rangle \otimes |\chi(2)\rangle$ by $|\varphi'(1)\rangle \otimes |\chi'(2)\rangle$ by setting:

$$\bigl[\langle\varphi'(1)| \otimes \langle\chi'(2)|\bigr]\,\bigl[|\varphi(1)\rangle \otimes |\chi(2)\rangle\bigr] = \langle\varphi'(1)|\varphi(1)\rangle\,\langle\chi'(2)|\chi(2)\rangle \tag{F-8}$$

For two arbitrary vectors of $\mathcal{E}$, we simply use the fundamental properties of the scalar product [equations (B-9), (B-10) and (B-11)], since each of these vectors is a linear combination of tensor product vectors. Notice, in particular, that the basis $\{|u_i(1)\rangle \otimes |v_l(2)\rangle\}$ is orthonormal if each of the bases $\{|u_i(1)\rangle\}$ and $\{|v_l(2)\rangle\}$ is:

$$\bigl[\langle u_i(1)| \otimes \langle v_l(2)|\bigr]\,\bigl[|u_{i'}(1)\rangle \otimes |v_{l'}(2)\rangle\bigr] = \langle u_i(1)|u_{i'}(1)\rangle\,\langle v_l(2)|v_{l'}(2)\rangle = \delta_{i i'}\,\delta_{l l'} \tag{F-9}$$

F-2-b. Tensor product of operators

(i) First, consider a linear operator $A(1)$ defined in $\mathcal{E}_1$. We associate with it a linear operator $\tilde A(1)$ acting in $\mathcal{E}$, which we call the extension of $A(1)$ in $\mathcal{E}$, and which is characterized in the following way: when $\tilde A(1)$ is applied to a tensor product vector $|\varphi(1)\rangle \otimes |\chi(2)\rangle$, one obtains, by definition:

$$\tilde A(1)\,\bigl[|\varphi(1)\rangle \otimes |\chi(2)\rangle\bigr] = \bigl[A(1)\,|\varphi(1)\rangle\bigr] \otimes |\chi(2)\rangle \tag{F-10}$$

The hypothesis that $\tilde A(1)$ is linear is then sufficient for determining it completely. An arbitrary vector $|\psi\rangle$ of $\mathcal{E}$ can be written in the form (F-7). Definition (F-10) then gives the action of $\tilde A(1)$ on $|\psi\rangle$:

$$\tilde A(1)\,|\psi\rangle = \sum_{i,\,l} c_{i l}\,\bigl[A(1)\,|u_i(1)\rangle\bigr] \otimes |v_l(2)\rangle \tag{F-11}$$

We obtain in an analogous manner the extension $\tilde B(2)$ of an operator $B(2)$ initially defined in $\mathcal{E}_2$.


(ii) Now let $A(1)$ and $B(2)$ be two linear operators acting respectively in $\mathcal{E}_1$ and $\mathcal{E}_2$. Their tensor product $A(1) \otimes B(2)$ is the linear operator in $\mathcal{E}$ defined by the following relation, which describes its action on the tensor product vectors:

$$\bigl[A(1) \otimes B(2)\bigr]\,\bigl[|\varphi(1)\rangle \otimes |\chi(2)\rangle\bigr] = \bigl[A(1)\,|\varphi(1)\rangle\bigr] \otimes \bigl[B(2)\,|\chi(2)\rangle\bigr] \tag{F-12}$$

Here also, this definition is sufficient for characterizing $A(1) \otimes B(2)$.

Comments:

(i) The extensions of operators are special cases of tensor products: if $\mathbb 1(1)$ and $\mathbb 1(2)$ are the identity operators in $\mathcal{E}_1$ and $\mathcal{E}_2$ respectively, $\tilde A(1)$ and $\tilde B(2)$ can be written:

$$\tilde A(1) = A(1) \otimes \mathbb 1(2) \qquad\quad \tilde B(2) = \mathbb 1(1) \otimes B(2) \tag{F-13}$$

Inversely, the tensor product $A(1) \otimes B(2)$ coincides with the ordinary product of the two operators $\tilde A(1)$ and $\tilde B(2)$ of $\mathcal{E}$:

$$A(1) \otimes B(2) = \tilde A(1)\,\tilde B(2) \tag{F-14}$$

(ii) It is easy to show that two operators such as $\tilde A(1)$ and $\tilde B(2)$ commute in $\mathcal{E}$:

$$\bigl[\tilde A(1),\,\tilde B(2)\bigr] = 0 \tag{F-15}$$

We must verify that $\tilde A(1)\,\tilde B(2)$ and $\tilde B(2)\,\tilde A(1)$ yield the same result when they act on an arbitrary vector of the $\{|u_i(1)\rangle \otimes |v_l(2)\rangle\}$ basis:

$$\tilde A(1)\,\tilde B(2)\,\bigl[|u_i(1)\rangle \otimes |v_l(2)\rangle\bigr] = \bigl[A(1)\,|u_i(1)\rangle\bigr] \otimes \bigl[B(2)\,|v_l(2)\rangle\bigr] \tag{F-16}$$

$$\tilde B(2)\,\tilde A(1)\,\bigl[|u_i(1)\rangle \otimes |v_l(2)\rangle\bigr] = \bigl[A(1)\,|u_i(1)\rangle\bigr] \otimes \bigl[B(2)\,|v_l(2)\rangle\bigr] \tag{F-17}$$
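Relations (F-12) to (F-15) are conveniently checked with Kronecker products of matrices (an added Python sketch; the operators and vectors are random and serve purely as an illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
N1, N2 = 3, 4
A1 = rng.normal(size=(N1, N1))           # operator acting in E1
B2 = rng.normal(size=(N2, N2))           # operator acting in E2
I1, I2 = np.eye(N1), np.eye(N2)

A_ext = np.kron(A1, I2)                  # extension of A(1) to E = E1 (x) E2   [cf. (F-13)]
B_ext = np.kron(I1, B2)                  # extension of B(2) to E

# (F-14): the tensor product A(1) (x) B(2) is the ordinary product of the extensions
print(np.allclose(np.kron(A1, B2), A_ext @ B_ext))     # True

# (F-15): the two extensions commute
print(np.allclose(A_ext @ B_ext, B_ext @ A_ext))       # True

# (F-12): the action on a tensor product vector factorizes
phi, chi = rng.normal(size=N1), rng.normal(size=N2)
print(np.allclose(np.kron(A1, B2) @ np.kron(phi, chi),
                  np.kron(A1 @ phi, B2 @ chi)))        # True
```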

(iii) The projector onto the tensor product vector $|\psi\rangle = |\varphi(1)\rangle \otimes |\chi(2)\rangle$, which is an operator acting in $\mathcal{E}$, is obtained by taking the tensor product of the projectors onto $|\varphi(1)\rangle$ and $|\chi(2)\rangle$:

$$|\psi\rangle\langle\psi| = \bigl[|\varphi(1)\rangle\langle\varphi(1)|\bigr] \otimes \bigl[|\chi(2)\rangle\langle\chi(2)|\bigr] \tag{F-18}$$

This relation follows immediately from the definition of the scalar product in $\mathcal{E}$.

(iv) Just as with vectors, there exist operators in $\mathcal{E}$ which are not tensor products of an operator of $\mathcal{E}_1$ and an operator of $\mathcal{E}_2$.

F-2-c. Notations

In quantum mechanics, the notation generally used is a simplified version of the one which we have defined here. This is the one we shall adopt, but it is important to interpret it correctly in the light of the preceding discussion.

First of all, the symbol $\otimes$ which indicates the tensor product is omitted, and the vectors or operators which are to be multiplied tensorially are simply juxtaposed:

$$|\varphi(1)\rangle\,|\chi(2)\rangle \quad\text{means}\quad |\varphi(1)\rangle \otimes |\chi(2)\rangle \tag{F-19}$$
$$A(1)\,B(2) \quad\text{means}\quad A(1) \otimes B(2) \tag{F-20}$$

Moreover, the extension in $\mathcal{E}$ of an operator of $\mathcal{E}_1$ or $\mathcal{E}_2$ is written in the same way as this operator itself:

$$A(1) \quad\text{means}\quad \tilde A(1) \;\text{ or }\; A(1) \tag{F-21}$$

No confusion is possible in (F-19): until now we have never written two kets one after the other as we do here. Notice in particular that the expression $|\varphi\rangle\,|\chi\rangle$, where $|\varphi\rangle$ and $|\chi\rangle$ belong to the same space $\mathcal{E}$, is not defined in this space: it represents a vector of the space which is the tensor product of $\mathcal{E}$ by itself. On the other hand, the notation in (F-20) and (F-21) is slightly ambiguous, especially in the latter, where two different operators are represented by the same symbol. However, it will be possible in practice to distinguish between them by the vector to which this symbol is applied: depending on whether it is a vector of $\mathcal{E}$ or of $\mathcal{E}_1$, we shall be dealing with $\tilde A(1)$, or with $A(1)$ in a strict sense. As for formula (F-20), it poses no problem when $\mathcal{E}_1$ and $\mathcal{E}_2$ are different, since we have, until now, defined only products of operators which act in the same space. Moreover, $A(1)\,B(2)$ can be considered to be an ordinary product of operators of $\mathcal{E}$, if $A(1)$ and $B(2)$ are interpreted as designating, in fact, $\tilde A(1)$ and $\tilde B(2)$ [equation (F-14)].

F-3. Eigenvalue equations in the product space

The vectors of $\mathcal{E}$ which are tensor products of a vector of $\mathcal{E}_1$ and a vector of $\mathcal{E}_2$ play an important role in the discussion above. We shall see that this is also the case for the extensions to $\mathcal{E}$ of operators acting in $\mathcal{E}_1$ and $\mathcal{E}_2$.

F-3-a. Eigenvalues and eigenvectors of extended operators

α. Eigenvalue equation of $\tilde A(1)$

Consider an operator $A(1)$, for which we know, in $\mathcal{E}_1$, all the eigenstates and eigenvalues. We shall assume, for example, that the whole spectrum of $A(1)$ is discrete:

$$A(1)\,|\varphi_n(1)\rangle = a_n\,|\varphi_n(1)\rangle\,; \qquad n = 1, 2, \dots \tag{F-22}$$

We want to solve the eigenvalue equation of the extension $\tilde A(1)$ of $A(1)$ in $\mathcal{E}$:

$$\tilde A(1)\,|\psi\rangle = \lambda\,|\psi\rangle\,; \qquad |\psi\rangle \in \mathcal{E} \tag{F-23}$$

It can immediately be seen, from (F-10), that every vector of the form $|\varphi_n(1)\rangle \otimes |\chi(2)\rangle$ is an eigenvector of $\tilde A(1)$ with the eigenvalue $a_n$, whatever $|\chi(2)\rangle$ may be, since:

$$\tilde A(1)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi(2)\rangle\bigr] = \bigl[A(1)\,|\varphi_n(1)\rangle\bigr] \otimes |\chi(2)\rangle = a_n\,\bigl[|\varphi_n(1)\rangle \otimes |\chi(2)\rangle\bigr] \tag{F-24}$$


Let us show that, when $A(1)$ is an observable in $\mathcal{E}_1$, all the solutions of (F-23) can be obtained in this way. The set of the $|\varphi_n(1)\rangle$ then forms a basis in $\mathcal{E}_1$. Consequently, the orthonormal system of vectors $|\psi_{n l}\rangle$ such that:

$$|\psi_{n l}\rangle = |\varphi_n(1)\rangle \otimes |v_l(2)\rangle \tag{F-25}$$

(where $\{|v_l(2)\rangle\}$ is a basis of $\mathcal{E}_2$) forms a basis in $\mathcal{E}$. We therefore have an orthonormal basis of $\mathcal{E}$ constituted by eigenvectors of $\tilde A(1)$, which means that equation (F-23) is solved. The following conclusions can be drawn:

– If $A(1)$ is an observable in $\mathcal{E}_1$, it is also an observable in $\mathcal{E}$. This results from the fact that the extension of $A(1)$ is Hermitian and from the fact that $\{|\psi_{n l}\rangle\}$ constitutes a basis in $\mathcal{E}$.

– The spectrum of $A(1)$ is the same in $\mathcal{E}$ as in $\mathcal{E}_1$: the same eigenvalues $a_n$ appear in (F-22) and in (F-24).

– Nevertheless, an eigenvalue $a_n$ which is $g_n$-fold degenerate in $\mathcal{E}_1$ has, in $\mathcal{E}$, a degree of degeneracy $g_n N_2$. We know that the eigensubspace associated with $a_n$ is spanned in $\mathcal{E}$ by the kets $|\varphi_n^i(1)\rangle \otimes |v_l(2)\rangle$ with $n$ fixed and $i = 1, 2, \dots, g_n$; $l = 1, 2, \dots, N_2$. Therefore, even if $a_n$ is simple in $\mathcal{E}_1$, it is ($N_2$-fold) degenerate in $\mathcal{E}$. The projector onto the eigensubspace corresponding to an eigenvalue $a_n$ is written, in $\mathcal{E}$ [cf. (F-18)]:

$$\tilde P_n(1) = \sum_{i=1}^{g_n} \sum_l \bigl[|\varphi_n^i(1)\rangle\langle\varphi_n^i(1)|\bigr] \otimes \bigl[|v_l(2)\rangle\langle v_l(2)|\bigr] = \left[\sum_{i=1}^{g_n} |\varphi_n^i(1)\rangle\langle\varphi_n^i(1)|\right] \otimes \mathbb 1(2) \tag{F-26}$$

using in $\mathcal{E}_2$ the closure relation relative to the $\{|v_l(2)\rangle\}$ basis. It is therefore the extension of the projector $P_n(1) = \sum_i |\varphi_n^i(1)\rangle\langle\varphi_n^i(1)|$ which is associated with $a_n$ in $\mathcal{E}_1$.

β. Eigenvalue equation of $\tilde A(1) + \tilde B(2)$

We shall often need to solve, in a tensor product space such as $\mathcal{E}$, eigenvalue equations for operators of the form:

$$C = \tilde A(1) + \tilde B(2) \tag{F-27}$$

where $A(1)$ and $B(2)$ are observables whose eigenvalues and eigenvectors are known in $\mathcal{E}_1$ and $\mathcal{E}_2$ respectively:

$$A(1)\,|\varphi_n(1)\rangle = a_n\,|\varphi_n(1)\rangle \qquad\quad B(2)\,|\chi_p(2)\rangle = b_p\,|\chi_p(2)\rangle \tag{F-28}$$

[to simplify the notation, we assume the spectra of $A(1)$ and $B(2)$ to be discrete and non-degenerate in $\mathcal{E}_1$ and $\mathcal{E}_2$].


$\tilde A(1)$ and $\tilde B(2)$ commute [formulas (F-16) and (F-17)], and the kets $|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle$, which form a basis in $\mathcal{E}$, are eigenvectors common to $\tilde A(1)$ and $\tilde B(2)$:

$$\begin{aligned}
\tilde A(1)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr] &= a_n\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr]\\
\tilde B(2)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr] &= b_p\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr]
\end{aligned} \tag{F-29}$$

They are also eigenvectors of $C$:

$$C\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr] = (a_n + b_p)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle\bigr] \tag{F-30}$$

This gives us directly the solution of the eigenvalue equation of $C$.

Therefore: the eigenvalues of $C = \tilde A(1) + \tilde B(2)$ are the sums of an eigenvalue of $A(1)$ and an eigenvalue of $B(2)$. One can find a basis of eigenvectors of $C$ which are tensor products of an eigenvector of $A(1)$ and an eigenvector of $B(2)$.

Comment:

Equation (F-30) shows that the eigenvalues of $C$ are all of the form $c_{n p} = a_n + b_p$. If two different pairs of values of $n$ and $p$ which give the same value for $c_{n p}$ do not exist, $c_{n p}$ is non-degenerate (recall that we have assumed $a_n$ and $b_p$ to be non-degenerate in $\mathcal{E}_1$ and $\mathcal{E}_2$ respectively). The corresponding eigenvector of $C$ is necessarily the tensor product $|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle$. If, on the other hand, the eigenvalue is, for example, twofold degenerate (there exist $n$, $p$ and $n'$, $p'$ such that $a_n + b_p = a_{n'} + b_{p'}$), all that can be asserted is that every eigenvector of $C$ corresponding to this eigenvalue is written:

$$\lambda\,|\varphi_n(1)\rangle \otimes |\chi_p(2)\rangle + \mu\,|\varphi_{n'}(1)\rangle \otimes |\chi_{p'}(2)\rangle \tag{F-31}$$

where $\lambda$ and $\mu$ are arbitrary complex numbers. In this case, therefore, there exist eigenvectors of $C$ which are not tensor products.
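The rule just stated is easy to verify numerically (an added Python sketch; $A(1)$ and $B(2)$ are random symmetric matrices used only as an illustration): the spectrum of $C = \tilde A(1) + \tilde B(2)$, built with Kronecker products, is exactly the set of sums $a_n + b_p$.

```python
import numpy as np

rng = np.random.default_rng(3)
N1, N2 = 3, 4
A1 = rng.normal(size=(N1, N1)); A1 = (A1 + A1.T) / 2       # observable in E1
B2 = rng.normal(size=(N2, N2)); B2 = (B2 + B2.T) / 2       # observable in E2

# C = A(1) + B(2), written with the extensions A(1) (x) 1 and 1 (x) B(2)
C = np.kron(A1, np.eye(N2)) + np.kron(np.eye(N1), B2)

a = np.linalg.eigvalsh(A1)
b = np.linalg.eigvalsh(B2)
sums = np.sort([an + bp for an in a for bp in b])           # all sums a_n + b_p

print(np.allclose(np.sort(np.linalg.eigvalsh(C)), sums))    # True: spectrum of C = {a_n + b_p}
```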

F-3-b. Complete sets of commuting observables in $\mathcal{E}$

We are finally going to show that if a C.S.C.O. has been chosen in both spaces $\mathcal{E}_1$ and $\mathcal{E}_2$, obtaining one in $\mathcal{E}$ is straightforward. As an example, let us consider the case where $A(1)$ constitutes a C.S.C.O. by itself in $\mathcal{E}_1$, and the C.S.C.O. in $\mathcal{E}_2$ is composed of two observables, $B(2)$ and $C(2)$. This means (cf. § D-3-b) that all the eigenvalues of $A(1)$ are non-degenerate in $\mathcal{E}_1$:

$$A(1)\,|\varphi_n(1)\rangle = a_n\,|\varphi_n(1)\rangle \tag{F-32}$$

the ket $|\varphi_n(1)\rangle$ being unique to within a constant factor. On the other hand, in $\mathcal{E}_2$, some of the eigenvalues $b_p$ of $B(2)$ are degenerate, as are some of the eigenvalues $c_r$ of $C(2)$. Nevertheless, the basis of eigenvectors common to $B(2)$ and $C(2)$ is unique in $\mathcal{E}_2$, since there exists only one ket (to within a constant factor) which is an eigenvector of $B(2)$ and of $C(2)$ with the eigenvalues $b_p$ and $c_r$ fixed:

$$\left.\begin{aligned}
B(2)\,|\chi_{p r}(2)\rangle &= b_p\,|\chi_{p r}(2)\rangle\\
C(2)\,|\chi_{p r}(2)\rangle &= c_r\,|\chi_{p r}(2)\rangle
\end{aligned}\right\} \quad |\chi_{p r}(2)\rangle \text{ unique to within a constant factor} \tag{F-33}$$


In $\mathcal{E}$, each of the eigenvalues $a_n$ is $N_2$-fold degenerate (cf. § F-3-a). Therefore, $A(1)$ no longer forms a C.S.C.O. by itself. Similarly, there exist $N_1$ linearly independent kets which are eigenvectors of $B(2)$ and $C(2)$ with the eigenvalues $b_p$ and $c_r$ respectively, and the set $\{B(2), C(2)\}$ is not complete either. However, we saw in § F-3-a that the eigenvectors which are common to the three commuting observables $A(1)$, $B(2)$ and $C(2)$ are the $|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle$:

$$\begin{aligned}
A(1)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr] &= a_n\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr]\\
B(2)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr] &= b_p\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr]\\
C(2)\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr] &= c_r\,\bigl[|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\bigr]
\end{aligned} \tag{F-34}$$

The system $\{|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle\}$ constitutes a basis in $\mathcal{E}$, since this is the case for $\{|\varphi_n(1)\rangle\}$ in $\mathcal{E}_1$ and $\{|\chi_{p r}(2)\rangle\}$ in $\mathcal{E}_2$ respectively. Moreover, if a set of three eigenvalues $\{a_n, b_p, c_r\}$ is chosen, only one vector $|\varphi_n(1)\rangle \otimes |\chi_{p r}(2)\rangle$ corresponds to it. $A(1)$, $B(2)$ and $C(2)$ therefore constitute a C.S.C.O. in $\mathcal{E}$.

The preceding argument can be generalized without difficulty: by joining two sets of commuting observables which are complete in $\mathcal{E}_1$ and $\mathcal{E}_2$ respectively, one obtains a complete set of commuting observables in $\mathcal{E}$.

F-4. Applications

F-4-a. One- and three-dimensional particle states

α. State spaces

Consider again, in the light of the preceding discussion, the problem posed in the introduction (§ F-1): how are $\mathcal{E}_x$ and $\mathcal{E}_{\mathbf r}$ related?

$\mathcal{E}_x$ is the state space of a particle moving in one dimension, that is, the state space associated with the wave functions $\psi(x)$. In $\mathcal{E}_x$, the observable $X$ which was studied in § E-2 constitutes a C.S.C.O. by itself (§ E-2-d); its eigenvectors $|x\rangle$ are the basis kets of the $\{|x\rangle\}$ representation. A vector $|\psi\rangle$ of $\mathcal{E}_x$ is characterized, in this representation, by a wave function $\psi(x) = \langle x|\psi\rangle$; in particular, the basis ket $|x_0\rangle$ corresponds to the wave function $\xi_{x_0}(x) = \delta(x - x_0)$.

In the same way, it is possible to introduce the spaces $\mathcal{E}_y$ and $\mathcal{E}_z$ associated with the wave functions $\psi(y)$ and $\psi(z)$. The observable $Y$ forms a C.S.C.O. in $\mathcal{E}_y$, as does $Z$ in $\mathcal{E}_z$. The corresponding eigenvectors are the basis kets of the $\{|y\rangle\}$ and $\{|z\rangle\}$ representations of $\mathcal{E}_y$ and $\mathcal{E}_z$ respectively. A vector of $\mathcal{E}_y$ (or of $\mathcal{E}_z$) is characterized in the $\{|y\rangle\}$ (or $\{|z\rangle\}$) representation by a function $\psi(y) = \langle y|\psi\rangle$ (or $\psi(z) = \langle z|\psi\rangle$). The function which corresponds to the basis ket $|y_0\rangle$ (or $|z_0\rangle$) is $\delta(y - y_0)$ (or $\delta(z - z_0)$).

Let us then form the tensor product:

$$\mathcal{E}_{x y z} = \mathcal{E}_x \otimes \mathcal{E}_y \otimes \mathcal{E}_z \tag{F-35}$$

We obtain a basis in $\mathcal{E}_{x y z}$ from the tensor product of the $\{|x\rangle\}$, $\{|y\rangle\}$ and $\{|z\rangle\}$ bases. We shall denote it by $\{|x, y, z\rangle\}$, with:

$$|x, y, z\rangle = |x\rangle \otimes |y\rangle \otimes |z\rangle \tag{F-36}$$

The basis kets $|x, y, z\rangle$ are simultaneous eigenvectors of the $X$, $Y$ and $Z$ operators extended into $\mathcal{E}_{x y z}$:

$$X\,|x, y, z\rangle = x\,|x, y, z\rangle \qquad Y\,|x, y, z\rangle = y\,|x, y, z\rangle \qquad Z\,|x, y, z\rangle = z\,|x, y, z\rangle \tag{F-37}$$

Therefore, $\mathcal{E}_{x y z}$ coincides with $\mathcal{E}_{\mathbf r}$, the state space of a three-dimensional particle, and $|x, y, z\rangle$ with $|\mathbf r\rangle$:

$$|\mathbf r\rangle = |x, y, z\rangle \tag{F-38}$$

where $x$, $y$, $z$ are precisely the cartesian coordinates of $\mathbf r$.

There exist in $\mathcal{E}_{\mathbf r}$ kets $|\psi\rangle = |\varphi\rangle \otimes |\chi\rangle \otimes |\omega\rangle$ that are the tensor products of three kets, one of $\mathcal{E}_x$, one of $\mathcal{E}_y$ and one of $\mathcal{E}_z$. The components in the $\{|\mathbf r\rangle\}$ representation are then [cf. formula (F-8)]:

$$\langle\mathbf r|\psi\rangle = \varphi(x)\,\chi(y)\,\omega(z) \tag{F-39}$$

The associated wave functions are thus factorized: $\psi(\mathbf r) = \varphi(x)\,\chi(y)\,\omega(z)$. This is the case for the basis vectors themselves:

$$\langle\mathbf r|\mathbf r_0\rangle = \delta(\mathbf r - \mathbf r_0) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0) \tag{F-40}$$

Note that the most general state of $\mathcal{E}_{\mathbf r}$ is not such a product. It is written:

$$|\psi\rangle = \int \mathrm dx\,\mathrm dy\,\mathrm dz\; \psi(x, y, z)\,|x, y, z\rangle \tag{F-41}$$

In $\psi(x, y, z) = \langle x, y, z|\psi\rangle$, the $x$-, $y$- and $z$-dependences cannot, in general, be factorized: each of the wave functions associated with the kets of $\mathcal{E}_{\mathbf r}$ is a wave function with three variables.

The results of § F-3 thus enable us to understand why $X$, which constitutes a C.S.C.O. by itself in $\mathcal{E}_x$, no longer has this property in $\mathcal{E}_{\mathbf r}$ (cf. § E-2-d): the eigenvalues of its extension in $\mathcal{E}_{\mathbf r}$ are the same as in $\mathcal{E}_x$, but they become infinitely degenerate because $\mathcal{E}_y$ and $\mathcal{E}_z$ are infinite-dimensional. Starting with a C.S.C.O. in $\mathcal{E}_x$, $\mathcal{E}_y$ and $\mathcal{E}_z$, we construct one for $\mathcal{E}_{\mathbf r}$: $\{X, Y, Z\}$, for example, but also $\{X, P_y, Z\}$ since $P_y$ forms a C.S.C.O. in $\mathcal{E}_y$, or $\{P_x, P_y, P_z\}$, etc.

An important application Let us try to solve in Er the eigenvalue equation of an operator =

+

+

such that: (F-42)

where , and are the extensions of observables acting respectively in E , E and E . In practice, one recognizes that , for example, is the extension of an observable of E because it is constructed using only the operators and . Using the reasoning of 155

CHAPTER II

THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS

§ F-3-a- , one first looks for the eigenvalues and eigenvectors of in E :

in E ,

in E and

= = =

(F-43)

The eigenvalues of =

are then all of the form:

+

+

(F-44)

with an eigenvector that is the tensor product ciated with this vector is the product: ( )

( )

; the wave function asso-

( )=

This is the type of situation that was considered in Complement FI (§ 2) for the justification of the study of one-dimensional models. There, we were dealing with differential operators acting on wave functions: ~2 + (r) (F-45) 2 This equation can be decomposed as in (F-42) in the particular case where the potential can be written: =

(r) = F-4-b.

1(

)+

2(

)+

3(

)

(F-46)

States of a two-particle system

Consider a physical system which is made up of two (spinless) particles. We shall distinguish between them by numbering them (1) and (2). To describe the system quantum mechanically, we can generalize the concept of a wave function, introduced for the case of one particle. A state of the system can be characterized, at a given time, by a function of six spatial variables (r1 r2 ) ( 1 1 1 ; 2 2 2 ). The probabilistic interpretation of such a two-particle wave function is the following: the probability dP(r1 r2 ), at the given time, of finding particle (1) in the volume d3 1 = d 1 d 1 d 1 situated at the point r1 , and particle (2) in the volume d3 2 = d 2 d 2 d 2 about r2 , is: (r1 r2 ) 2 d3

dP(r1 r2 ) =

1

d3

2

(F-47)

The normalization constant is obtained by imposing the condition that the total probability must be equal to 1 (conservation of the number of particles; cf. § B-2 of Chapter I): 1

=

d3

1

d3

2

(r1 r2 ) 2

(F-48)

and the observables 1 , 1 , 1 can be defined in Er1 . Similarly, in the state space Er2 of particle (2), we introduce the r2 representation and the observables 2 , 2 , 2 . Take the tensor product: Er 1 r 2 = Er 1 156

Er 2

(F-49)

F. TENSOR PRODUCT OF STATE SPACES

The set of vectors:

$$|\mathbf r_1, \mathbf r_2\rangle = |\mathbf r_1\rangle \otimes |\mathbf r_2\rangle \tag{F-50}$$

forms a basis in $\mathcal{E}_{\mathbf r_1 \mathbf r_2}$. Consequently, every ket $|\psi\rangle$ of this space can be written:

$$|\psi\rangle = \int \mathrm d^3r_1\,\mathrm d^3r_2\; \psi(\mathbf r_1, \mathbf r_2)\,|\mathbf r_1, \mathbf r_2\rangle \tag{F-51}$$

with:

$$\psi(\mathbf r_1, \mathbf r_2) = \langle\mathbf r_1, \mathbf r_2|\psi\rangle \tag{F-52}$$

Moreover, the square of the norm of $|\psi\rangle$ is equal to:

$$\langle\psi|\psi\rangle = \int \mathrm d^3r_1\,\mathrm d^3r_2\; |\psi(\mathbf r_1, \mathbf r_2)|^2 \tag{F-53}$$

For it to be finite, $\psi(\mathbf r_1, \mathbf r_2)$ must be square-integrable. Therefore, a wave function $\psi(\mathbf r_1, \mathbf r_2)$ is associated with each ket of $\mathcal{E}_{\mathbf r_1 \mathbf r_2}$: the state space of a two-particle system is the tensor product of the spaces which correspond to each of the particles. A C.S.C.O. is obtained in $\mathcal{E}_{\mathbf r_1 \mathbf r_2}$ by joining, for example, $\{X_1, Y_1, Z_1\}$ and $\{X_2, Y_2, Z_2\}$.

Assume that the state of the system is described by a tensor product ket:

$$|\psi\rangle = |\varphi_1\rangle \otimes |\chi_2\rangle \tag{F-54}$$

The corresponding wave function can then be factorized:

$$\psi(\mathbf r_1, \mathbf r_2) = \langle\mathbf r_1, \mathbf r_2|\psi\rangle = \langle\mathbf r_1|\varphi_1\rangle\,\langle\mathbf r_2|\chi_2\rangle = \varphi_1(\mathbf r_1)\,\chi_2(\mathbf r_2) \tag{F-55}$$

In this case, one says that there is no correlation between the two particles. We shall analyze later (Complement DIII ) the physical consequences of such a situation.
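The absence of correlations in a product state can be illustrated with a small added sketch (Python; each particle is restricted to a few discrete positions, an assumption made only for the illustration): for a tensor product state the joint position distribution factorizes into the product of its marginals, whereas it does not for a state that is not a tensor product.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5                                           # "positions" available to each particle

def joint_probability(psi):
    """psi[i, j] = <r1_i, r2_j | psi>; returns the normalized joint distribution."""
    p = np.abs(psi)**2
    return p / p.sum()

# Product (uncorrelated) state:  psi(r1, r2) = phi(r1) * chi(r2)       [cf. (F-55)]
phi = rng.normal(size=N) + 1j * rng.normal(size=N)
chi = rng.normal(size=N) + 1j * rng.normal(size=N)
p_prod = joint_probability(np.outer(phi, chi))

# Correlated state: an equal superposition of |r1_i>|r2_i> ("both particles at the same site")
p_corr = joint_probability(np.eye(N) / np.sqrt(N))

def factorizes(p):
    # does the joint distribution equal the product of its two marginals?
    return np.allclose(p, np.outer(p.sum(axis=1), p.sum(axis=0)))

print(factorizes(p_prod))   # True:  no correlation between the two particles
print(factorizes(p_corr))   # False: finding particle (1) tells us where particle (2) is
```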

The preceding can be generalized: when a physical system is composed of the union of two or several simpler systems, its state space is the tensor product of the spaces which correspond to each of the component systems.

References and suggestions for further reading:

Section 10 of the bibliography contains references to a certain number of mathematical texts, listed by subject. Under each heading, they are ranked, as much as possible, in order of increasing difficulty. See also the quantum mechanics texts (sections 1 and 2 of the bibliography), which treat the mathematical problems at many different levels. They also contain other references.

For a very simple approach to the fundamental mathematical concepts needed to understand Chapter II (vector spaces, operators, diagonalization of matrices, etc.), the reader can consult, for example: Arfken (10.4), Chap. 4; Bak and Lichtenberg (10.3), Chap. I; Bass (10.1), vol. I, Chap. II to V. A more explicit application to quantum mechanics can be found in Jackson (10.5) (see, in particular, Chap. 5), Butkov (10.8), Chap. 10 (finite-dimensional linear spaces) and Chap. 11 (infinite-dimensional vector spaces, spaces of functions). See also Meijer and Bauer (2.18), Chap. 1, particularly the table at the end of this chapter.

COMPLEMENTS OF CHAPTER II, READER'S GUIDE

A_II: THE SCHWARZ INEQUALITY
B_II: REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS
  Review of some definitions and useful mathematical results (elementary level) intended for readers unfamiliar with these concepts; will serve as a reference later (especially B_II).

C_II: UNITARY OPERATORS
D_II: A MORE DETAILED STUDY OF THE $\{|\mathbf r\rangle\}$ AND $\{|\mathbf p\rangle\}$ REPRESENTATIONS
E_II: SOME GENERAL PROPERTIES OF TWO OBSERVABLES, Q AND P, WHOSE COMMUTATOR IS EQUAL TO iħ
  Complete § E of Chapter II. D_II remains at the level of Chapter II and can be read immediately after it. E_II adopts a more general and a slightly more formal point of view; it introduces, in particular, the translation operator. May be reserved for later study.

F_II: THE PARITY OPERATOR
  Discussion of the parity operator, particularly important in quantum mechanics; at the same time, a simple illustration of the concepts of Chapter II; recommended for these two reasons.

G_II: AN APPLICATION OF THE PROPERTIES OF THE TENSOR PRODUCT: THE TWO-DIMENSIONAL INFINITE WELL
  A simple application of the tensor product (§ F of Chapter II); can be considered as a worked exercise.

H_II: EXERCISES
  Solutions are given for exercises 11 and 12; their aim is to familiarize the reader with the properties of commuting observables and the concept of a C.S.C.O. in a very simple special case. It is recommended that these exercises be done during the reading of § D-3 of Chapter II.




Complement A_II
The Schwarz inequality

For any ket $|\varphi\rangle$ belonging to the state space $\mathcal{E}$, we have:

$$\langle\varphi|\varphi\rangle \text{ real}, \qquad \langle\varphi|\varphi\rangle \geq 0 \tag{1}$$

$\langle\varphi|\varphi\rangle$ being equal to zero only when $|\varphi\rangle$ is the null vector [cf. equation (B-12) of Chapter II]. Using inequality (1), we shall derive the Schwarz inequality. This inequality states that, if $|\psi_1\rangle$ and $|\psi_2\rangle$ are any arbitrary vectors of $\mathcal{E}$, then:

$$|\langle\psi_1|\psi_2\rangle|^2 \leq \langle\psi_1|\psi_1\rangle\,\langle\psi_2|\psi_2\rangle \tag{2}$$

the equality being realized if and only if $|\psi_1\rangle$ and $|\psi_2\rangle$ are proportional.

Given $|\psi_1\rangle$ and $|\psi_2\rangle$, consider the ket $|\varphi\rangle$ defined by:

$$|\varphi\rangle = |\psi_1\rangle + \lambda\,|\psi_2\rangle \tag{3}$$

where $\lambda$ is an arbitrary parameter. Whatever $\lambda$ may be:

$$\langle\varphi|\varphi\rangle = \langle\psi_1|\psi_1\rangle + \lambda\,\langle\psi_1|\psi_2\rangle + \lambda^*\,\langle\psi_2|\psi_1\rangle + |\lambda|^2\,\langle\psi_2|\psi_2\rangle \geq 0 \tag{4}$$

Let us choose for $\lambda$ the value:

$$\lambda = -\,\frac{\langle\psi_2|\psi_1\rangle}{\langle\psi_2|\psi_2\rangle} \tag{5}$$

In (4), the second and third terms of the right-hand side are then equal, and opposite in value to the fourth term, so that (4) reduces to:

$$\langle\psi_1|\psi_1\rangle - \frac{\langle\psi_1|\psi_2\rangle\,\langle\psi_2|\psi_1\rangle}{\langle\psi_2|\psi_2\rangle} \geq 0 \tag{6}$$

Since $\langle\psi_2|\psi_2\rangle$ is positive, we can multiply this inequality by $\langle\psi_2|\psi_2\rangle$, to obtain:

$$\langle\psi_1|\psi_1\rangle\,\langle\psi_2|\psi_2\rangle \geq \langle\psi_1|\psi_2\rangle\,\langle\psi_2|\psi_1\rangle = |\langle\psi_1|\psi_2\rangle|^2 \tag{7}$$

Bass I (10.1), § 5-3; Arfken (10.4), § 9-4.

161



REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS

Complement BII Review of some useful properties of linear operators

1

Trace of an operator . . . . . . . . . . . . . . . . . . . . . . . . 163 1-a

Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

1-b

The trace is invariant . . . . . . . . . . . . . . . . . . . . . . 164

1-c

Important properties . . . . . . . . . . . . . . . . . . . . . . . 164

2

Commutator algebra . . . . . . . . . . . . . . . . . . . . . . . 165 2-a

Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

2-b

Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

3

Restriction of an operator to a subspace . . . . . . . . . . . 165

4

Functions of operators . . . . . . . . . . . . . . . . . . . . . . 166 4-a

Definition and simple properties . . . . . . . . . . . . . . . . 166

4-b

An important example: the potential operator . . . . . . . . 168

4-c

Commutators involving functions of operators . . . . . . . . . 168

5

Derivative of an operator . . . . . . . . . . . . . . . . . . . . . 169 5-a

Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

5-b

Differentiation rules . . . . . . . . . . . . . . . . . . . . . . . 170

5-c

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

5-d

An application: a useful formula . . . . . . . . . . . . . . . . 171

The aim of this complement is to review a certain number of definitions and useful properties of linear operators. 1.

Trace of an operator

1-a.

Definition

The trace of an operator , written Tr , is the sum of its diagonal matrix elements. When a discrete orthonormal basis, , is chosen for the space , one has, by definition: Tr

=

(1)

For the case of a continuous orthonormal basis Tr

=

, one has:

d

When is an infinite-dimensional space, the trace of the operator expressions (1) and (2) converge.

(2) is defined only if

163



COMPLEMENT BII

1-b.

The trace is invariant

The sum of the diagonal elements of the matrix which represents an operator in an arbitrary basis does not depend on this basis. Let us derive this property for the case of a change from one discrete orthonormal basis to another discrete orthonormal basis . We have: =

(3)

(where we have used the closure relation for the is equal to:

states). The right-hand side of (3)

=

(4)

(since it is possible to change the order of two numbers in a product). We can then replace in (4) by (closure relation for the states), and we obtain, finally: =

(5)

We have therefore demonstrated the property of invariance for this case. Comment:

If the operator is an observable, Tr can therefore be calculated in a basis of eigenvectors of . The diagonal matrix elements are then the eigenvalues of (degree of degeneracy ) and the trace can be written: Tr 1-c.

=

(6)

Important properties

Tr

= Tr

Tr

= Tr

(7a) = Tr

(7b)

In general, the trace of the product of any number of operators is invariant when a cyclic permutation is performed on these operators. Let us prove, for example, relation (7a): Tr

= =

= =

(twice using the closure relation on the generalization (7b) presents no difficulty. 164

= Tr

(8)

basis). Relation (7a) is thus proved; its

• 2.

REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS

Commutator algebra

2-a.

Definition

The commutator [ , [

] of two operators is, by definition:

]=

2-b.

(9)

Properties

[

]=

[

(

[

+

[

]

)] = [

]=[

[

[

]+[ ]

]] + [

[

(10) +

[

] =[

]

[

]

]] + [

[

(11) (12) ]] = 0

(13)

]

(14)

The derivation of these properties is straightforward: it suffices to compare both sides of each equation after having written them out explicitly. 3.

Restriction of an operator to a subspace

Let be the projector onto the -dimensional subspace vectors :

spanned by the orthonormal

=

(15) =1

By definition, the restriction ˆ of the operator

to the subspace

ˆ = If

is: (16)

is an arbitrary ket, it follows from this definition that: ˆ

ˆ

=

(17)

where: ˆ

=

(18)

is the orthogonal projection of onto . Consequently, to make ˆ act on an arbitrary ket , one begins by projecting this ket onto ; then one lets the operator act on this projection, retaining only the projection in of the resulting ket. The operator ˆ , which transforms any ket of into a ket belonging to this same subspace, is therefore an operator whose action has been restricted to . What can be said about the matrix which represents ˆ ? Let us choose a basis whose first vectors belong to (they are, for example, the ), the others belonging to the supplementary subspace. We have: ˆ

=

(19) 165

COMPLEMENT BII



that is: ˆ

=

0

if if one of the two indices

or

is greater than

(20)

Therefore, the matrix which represents ˆ is, as it were, “cut out” of the one which represents . One retains only the matrix elements of associated with basis vectors and , both belonging to , the other matrix elements being replaced by zeros. 4.

Functions of operators

4-a.

Definition and simple properties

Consider an arbitrary linear operator . It is not difficult to define the operator : it is the operator which corresponds to successive applications of the operator 1 1 . The definition of the operator , the inverse of , is also well known: is the operator (if it exists) which satisfies the relations: 1

1

=

=

(21)

How can we define, in a more general way, an arbitrary function of an operator? To do this, let us consider a function of a variable . Assume that, in a certain domain, can be expanded in a power series in : ( )=

(22) =0

By definition, the corresponding function of the operator by a series which has the same coefficients : ( )=

is the operator

( ) defined

(23) =0

For example, the operator e is defined by: e = =0

!

=

+

+

2

2! +

+

!+

(24)

We shall not consider the problems concerning the convergence of the series (23), which depends on the eigenvalues of and on the radius of convergence of the series (22). Note that if ( ) is a real function, the coefficients are real. If, moreover, is Hermitian, we see from (23) that ( ) is Hermitian. Let be an eigenvector of with eigenvalue : = Applying the operator = 166

(25) times in succession, we obtain: (26)



REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS

Now let us apply series (23) to ( )

=

=

; we obtain: ( )

(27)

=0

This leads to the following rule: when is an eigenvector of with the eigenvalue , is also an eigenvector of ( ), with the eigenvalue ( ). This property leads to a second definition of a function of an operator. Let us consider a diagonalizable operator (this is always the case if is an observable), and let us choose a basis where the matrix associated with is actually diagonal (its elements are then the eigenvalues of ). ( ) is, by definition, the operator which is represented, in this same basis, by the diagonal matrix whose elements are ( ). For example, if , is the matrix =

1 0

0

(28)

1

it follows directly that: e

e 0

=

0 1 e

(29)

Comment:

Attention should be given, when functions of operators are used, with respect to the order of the operators. For example, the operators e e , e e , and e + are not, in general, equal when and are operators and not numbers. Consider: e e =

e e =

e

+

=

!

!

!

!

=

=

(30)

! !

(31)

! !

( + ) !

(32)

When and are arbitrary, the right-hand sides of (30), (31) and (32) have no reason to be equal (see exercise 7 of Complement HII ). However, when and commute, we have: [

]=0=

e e =e e =e

+

(an obvious relation if the diagonal matrices that represent e sidered in a basis of eigenvectors common to and ).

(33) and e

are con-

167



COMPLEMENT BII

4-b.

An important example: the potential operator

In one-dimensional problems, we shall often have to consider “potential” operators ( ) (so called because they correspond to the classical potential energy ( ) of a particle placed in a force field), where ( ) is a function of the position operator . It follows from the preceding section that ( ) has as eigenvectors the eigenvectors of , and we have simply: ( )

=

( )

(34)

The matrix elements of ( )

=

( ) in the

( ) (

representation are therefore:

)

(35)

Applying (34) and using the fact that real), we obtain: ( )

=

( )

=

( ) is Hermitian (the function

( ) ( )

( ) is (36)

This equation shows that in the representation, the action of the operator ( ) is simply multiplication by ( ). The generalization of (34), (35) and (36) to three-dimensional problems can be performed without difficulty; in this case, we obtain: (R) r =

(r) r

(37)

r

(R) r =

(r) (r

r

(R)

(r) (r)

4-c.

=

r)

(38) (39)

Commutators involving functions of operators

Definition (23) shows that [

( )] = 0

Similarly, if [

commutes with every function of

(40)

and

]=0=

:

[

commute, so do

( ) and

( )] = 0

: (41)

What will be the commutator of an operator with a function of another operator that does not commute with it? We shall restrict ourselves here to the case of the and operators, whose commutator is equal to: [

]= ~

(42)

Using relation (12), we can calculate: [

2

]=[

]=[

]

+

[

]=2~

(43)

More generally, let us show that: [ 168

]= ~

1

(44)



REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS

If we assume that this equation is verified, we obtain: +1

[

]=[

]=[

= ~

] 1

+ ~

+

[

]

= ~( + 1)

(45)

Relation (44) is therefore established by recurrence. Now let us calculate the commutator [ ( )]: 1

(46)

If ( ) denotes the derivative of the function of the operator ( ). Therefore:

( ), we recognize in (46) the definition

[

( )] =

[

[

]=

( )] = ~

~

( )

(47)

An analogous argument would have enabled us to obtain the symmetric relation: [

( )] =

~

( )

(48)

Comments:

( ) The preceding argument is based on the fact that ( ) (or ( )) depends only on (or on ). It is more difficult to calculate a commutator such as [ Φ( )], where Φ( ) is an operator which depends on both and : the difficulties arise from the fact that and do not commute. ( ) Equations (47) and (48) can be generalized to the case of two operators and which both commute with their commutator. An argument modeled on the preceding one shows that, if we have: [

]=[

]=0

(49)

with: =[

]

(50)

then: [ 5.

( )] = [

]

( )

(51)

Derivative of an operator

5-a.

Definition

Let

( ) be an operator which depends on an arbitrary variable . By definition, d the derivative of ( ) with respect to is given by the limit (if it exists): d d = lim ∆ 0 d

( +∆ ) ∆

()

(52) 169



COMPLEMENT BII

The matrix elements of functions of : =

()

d d

Let us call

( ) in an arbitrary basis of -independent vectors

are (53)

d d

=

the matrix elements of

d . It is easy to verify the d

relation: d d

=

d d

(54) d , all we d and differentiate each of its elements (without

Thus we obtain a very simple rule: to obtain the matrix elements representing must do is take the matrix representing changing their places). 5-b.

Differentiation rules

They are analogous to the ones for ordinary functions: d ( d

+

d ( d

)=

d d + d d

)=

d d

+

(55) d d

(56)

Nevertheless, care must be taken not to modify the order of the operators in formula (56). Let us prove, for example, the second of these equations. The matrix elements of are: =

(57)

We have seen that the matrix elements of d( )/d are the derivatives with respect to of those of ( ). Thus we have, taking the derivative of the right-hand side of (57): d ( d

)

d d

=

=

d d

This equation is valid for any 5-c.

+

+

d d

d d

(58)

and . Formula (56) is thus established.

Examples

Let us calculate the derivative of the operator e . By definition, we have: e

(

= =0

170

) !

(59)



REVIEW OF SOME USEFUL PROPERTIES OF LINEAR OPERATORS

Taking the derivative of the series term by term, we obtain: d e d

1

=

!

=0

( ( =1

=

=

( ( =1

)

1

1)! 1

)

(60)

1)!

We recognize inside the brackets the series that defines e index = 1). The result is therefore: d e d

=

e

=e

(taking as the summation

(61)

In this simple case involving only one operator, it is unnecessary to pay attention to the order of the factors: e and commute. This is not the case if one is interested in taking the derivative of an operator such as e e . Applying (56) and (61), we obtain: d (e e ) = d

e e

+e

e

(62)

The right-hand side of this equation can be transformed into e e +e e or e e +e e , for example. However, we can never obtain (unless, of course, and commute) an expression such as ( + )e e . In this case, the order of the operators is therefore important. Comment:

Even when the function involves only one operator, taking the derivative cannot always be performed according to the rules valid for ordinary functions. For d () example, when ( ) has an arbitrary time-dependence, the derivative e is d d generally not equal to e ( ) . It can be seen by expanding e ( ) in a power series d d in ( ) that ( ) and must commute for the equality to hold. d 5-d.

An application: a useful formula

Consider two operators and which, by hypothesis, both commute with their commutator. In this case, we shall derive the relation: e e =e

+

1

e2[

]

(63)

(sometimes called Glauber’s formula). 171



COMPLEMENT BII

Let us define the operator ()=e

( ), a function of the real variable , by:

e

(64)

We have: d = d

e

e

Since and calculate: [e

+e

e

=( +e

e

) ()

(65)

commute with their commutator, formula (51) can be applied in order to

]= [

]e

(66)

Therefore: e

=

e

+ [

]e

(67)

Multiply both sides of this equation on the right by e obtained into (65), we obtain: d = d

+

+ [

]

. Substituting the relation so

()

(68)

The operators + and [ ] commute by hypothesis. We can therefore integrate the differential equation (68) as if + and [ ] were numbers. This yields: ()=

(0)e(

+ ) + 12 [

]

2

Setting = 0 in (64), we see that ( ) = e(

+ ) + 12 [

]

(69) (0) = , and we obtain for any time :

2

(70)

Let us then set = 1; we obtain equation (63), which is thus proven. Comment:

When the operators and are arbitrary, equation (63) is not in general valid: it is necessary that both and commute with [ ]. This condition may seem very restrictive. Actually, in quantum mechanics, one often encounters operators whose commutator is a number: for example, and , or the operators and of the harmonic oscillator (cf. Chap. V). References:

See the subsections “General texts” and “Linear algebra – Hilbert spaces” of section 10 of the bibliography.

172



UNITARY OPERATORS

Complement CII Unitary operators

1

General properties of unitary operators . . . . . . 1-a Definition and simple properties . . . . . . . . . . 1-b Unitary operators and change of bases . . . . . . . 1-c Unitary matrices . . . . . . . . . . . . . . . . . . . 1-d Eigenvalues and eigenvectors of a unitary operator Unitary transformations of operators . . . . . . . The infinitesimal unitary operator . . . . . . . . .

2 3

1.

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

173 173 174 176 176 177 178

General properties of unitary operators

1-a.

Definition and simple properties

By definition, an operator =

is unitary if its inverse

1

is equal to its adjoint

=

(1)

Consider two arbitrary vectors ˜2 under the action of : ˜1 = ˜2 =

:

1

and

2

of , and their transforms ˜1 and

1

(2)

2

Let us calculate the scalar product ˜1 ˜2 ; we obtain: ˜1 ˜2 =

1

2

=

1

(3)

2

The unitary transformation associated with the operator therefore conserves the scalar product (and, consequently, the norm) in . When is finite-dimensional, moreover, this property is characteristic of a unitary operator. Comments:

( ) If

is a Hermitian operator, the operator =e

=e

=e

is unitary, since: (4)

and therefore: =e =e e (obviously,

e

= = commutes with

(5) ).

173



COMPLEMENT CII

( ) The product of two unitary operators is also unitary. If we have: =

=

=

=

and

are unitary,

(6)

Let us now calculate: (

) (

)=

=

=

(

)(

) =

=

=

(7)

These equations indeed show that the product operator is unitary. This property, moreover, was foreseeable: when two transformations conserve the scalar product, so does the successive application of these two transformations. (

1-b.

) In the ordinary three-dimensional space of real vectors, we are familiar with operators which conserve the norm and the scalar product: rotations, symmetry operations with respect to a point, to a plane, etc. In this case, where the space is real, these operators are said to be orthogonal. Unitary operators constitute the generalization of orthogonal operators to complex spaces (with an arbitrary number of dimensions). Unitary operators and change of bases

Call ˜

Let be an orthonormal basis of the state space , assumed to be discrete. the transform of the vector under the action of a unitary operator :

˜ =

(8)

Since the operator ˜ ˜

=

is unitary, we have: =

(9)

The ˜ vectors are therefore orthonormal. Let us show that they constitute a basis of . To do so, consider an arbitrary vector of . Since the set constitutes a basis, the vector can be expanded on the : =

(10)

Applying the operator =

to this equation, we obtain: (11)

and, therefore: = 174

˜

(12)



UNITARY OPERATORS

This equation expresses the fact that any vector can be expanded on the vectors ˜ , which therefore constitute a basis. Thus we can state the following result: a necessary condition for an operator to be unitary is that the vectors of an orthonormal basis of , transformed by , constitute another orthonormal basis. . Conversely, let us show that this condition is sufficient. By hypothesis, we then have: ˜ = ˜ ˜

= ˜

˜ =

(13)

and therefore: ˜

= ˜

(14)

Let us calculate: =

˜ =

=

˜

˜ ˜ =

=

(15)

Relation (15), which is valid for all , expresses the fact that the operator is the identity operator. Let us show, in the same way, that = . To do this, consider the action of on a vector : =

=

˜

(16)

We then have: =

=

˜

˜

˜

= We deduce from this that

(17) = : the operator

is therefore unitary. 175



1-c.   Unitary matrices

Let:

Uᵢⱼ = ⟨vᵢ|U|vⱼ⟩    (18)

be the matrix elements of U. How can one see from the matrix representing U if this operator is unitary? Relation (1) gives us:

⟨vᵢ|U†U|vⱼ⟩ = ⟨vᵢ|vⱼ⟩ = δᵢⱼ    (19)

that is:

Σₖ U*ₖᵢ Uₖⱼ = δᵢⱼ    (20)

When a matrix is unitary, the sum of the products of the elements of one column and the complex conjugates of the elements of another column is:
– zero if the two columns are different,
– equal to 1 if the two columns are the same.

Let us cite some examples in which this rule can be easily verified.

Examples:

(i) The matrix which represents a rotation through an angle α about Oz, in ordinary three-dimensional space:

R(α) = ⎡ cos α   −sin α   0 ⎤
       ⎢ sin α    cos α   0 ⎥    (21)
       ⎣   0        0     1 ⎦

(ii) The rotation matrix in the state space of a spin-1/2 particle (cf. Chap. IX), for a rotation defined by the angles (α, β, γ):

R^{(1/2)} = ⎡ e^{−i(α+γ)/2} cos(β/2)    −e^{−i(α−γ)/2} sin(β/2) ⎤
            ⎣ e^{ i(α−γ)/2} sin(β/2)     e^{ i(α+γ)/2} cos(β/2) ⎦    (22)

1-d.   Eigenvalues and eigenvectors of a unitary operator

Let |ψ⟩ be a normalized eigenvector of the unitary operator U, with eigenvalue α:

U|ψ⟩ = α|ψ⟩    (23)

The square of the norm of the vector U|ψ⟩ is:

⟨ψ|U†U|ψ⟩ = |α|² ⟨ψ|ψ⟩ = |α|²    (24)



Since the unitary operator U conserves the norm, we have, necessarily, |α|² = 1. The eigenvalues of a unitary operator must therefore be complex numbers of modulus 1:

α = e^{iθₐ}   where θₐ is real    (25)

Consider two eigenvectors |φ₁⟩ and |φ₂⟩ of U:

U|φ₁⟩ = e^{iθ₁}|φ₁⟩
U|φ₂⟩ = e^{iθ₂}|φ₂⟩

We then have:

⟨φ₁|φ₂⟩ = ⟨φ₁|U†U|φ₂⟩ = e^{i(θ₂−θ₁)} ⟨φ₁|φ₂⟩    (26)

When the eigenvalues e^{iθ₁} and e^{iθ₂} are different, we see from (26) that the scalar product ⟨φ₁|φ₂⟩ must be zero: two eigenvectors of a unitary operator corresponding to different eigenvalues are orthogonal.
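A short numerical check of relations (25) and (26) — not part of the text, and assuming the Euler-angle form written above for the spin-1/2 rotation matrix of example (ii): the eigenvalues of a unitary matrix have modulus 1, and eigenvectors belonging to different eigenvalues are orthogonal.

```python
import numpy as np

alpha, beta, gamma = 0.7, 1.1, -0.4
# Spin-1/2 rotation matrix, example (ii) above (Euler-angle form assumed).
R = np.array([
    [np.exp(-1j*(alpha+gamma)/2)*np.cos(beta/2), -np.exp(-1j*(alpha-gamma)/2)*np.sin(beta/2)],
    [np.exp( 1j*(alpha-gamma)/2)*np.sin(beta/2),  np.exp( 1j*(alpha+gamma)/2)*np.cos(beta/2)],
])

w, v = np.linalg.eig(R)
print(np.allclose(np.abs(w), 1.0))        # eigenvalues have modulus 1, relation (25)
print(abs(np.vdot(v[:, 0], v[:, 1])))     # ~0: eigenvectors of distinct eigenvalues are orthogonal
```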

2.   Unitary transformations of operators

We saw in § 1-b that a unitary operator U permits the construction, starting with one orthonormal basis {|vᵢ⟩} of ℰ, of another one, {|ṽᵢ⟩}. In this section, we are going to define a transformation that acts, not on the vectors, but on the operators. By definition, the transform Ã of the operator A will be the operator which, in the {|ṽᵢ⟩} basis, has the same matrix elements as the operator A in the {|vᵢ⟩} basis:

⟨ṽᵢ|Ã|ṽⱼ⟩ = ⟨vᵢ|A|vⱼ⟩    (27)

Substituting (8) into this equation, we obtain:

⟨vᵢ|U†ÃU|vⱼ⟩ = ⟨vᵢ|A|vⱼ⟩    (28)

Since |vᵢ⟩ and |vⱼ⟩ are arbitrary, we have:

U†ÃU = A    (29)

or, multiplying this equation on the left by U and on the right by U†:

Ã = U A U†    (30)

Equation (30) can be taken to be the definition of the transform Ã of the operator A by the unitary transformation U. In quantum mechanics, such transformations are often used: a first example is given in Complement FII of this chapter (§ 2-a).

How can the eigenvectors of Ã be obtained from those of A? Let us consider an eigenvector |φ⟩ of A, with an eigenvalue a:

A|φ⟩ = a|φ⟩    (31)

Let |φ̃⟩ = U|φ⟩ be the transform of |φ⟩ by the operator U. We then have:

Ã|φ̃⟩ = (U A U†) U|φ⟩ = U A|φ⟩ = a U|φ⟩ = a|φ̃⟩    (32)

|φ̃⟩ is therefore an eigenvector of Ã, with eigenvalue a. This can be generalized to the following rule: the eigenvectors of the transform Ã of A are the transforms of the eigenvectors of A; the eigenvalues are unchanged.

Comments:

(i) The adjoint of the transform Ã of A by U is the transform of A† by U:

(Ã)† = (U A U†)† = U A† U†    (33)

In particular, it follows from this relation that, if A is Hermitian, Ã is also.

(ii) Similarly, we have:

(Ã)² = U A U† U A U† = U A² U†

and, in general:

(Ã)ⁿ = U Aⁿ U†    (34)

Using definition (23) of Complement BII, we can easily show that:

U F(A) U† = F(Ã)    (35)

where F(A) is a function of the operator A.
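The following sketch (not from the text; NumPy/SciPy assumed, with an arbitrarily chosen random Hermitian A and unitary U) illustrates the rule just stated: Ã = U A U† has the same eigenvalues as A, and U|ψ⟩ is an eigenvector of Ã whenever |ψ⟩ is an eigenvector of A.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                  # a Hermitian operator A
G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U = expm(1j * (G + G.conj().T) / 2)       # a unitary operator U

A_tilde = U @ A @ U.conj().T              # transform (30)

a, v = np.linalg.eigh(A)
a_tilde, _ = np.linalg.eigh(A_tilde)
print(np.allclose(a, a_tilde))            # eigenvalues are unchanged
print(np.allclose(A_tilde @ (U @ v[:, 0]), a[0] * (U @ v[:, 0])))   # U|psi> is an eigenvector of A~, eq. (32)
```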

3.   The infinitesimal unitary operator

Let U(ε) be a unitary operator which depends on an infinitely small real quantity ε; by hypothesis, U(ε) → 𝟙 when ε → 0. Expand U(ε) in a power series in ε:

U(ε) = 𝟙 + εF + ...    (36)

We then have:

U†(ε) = 𝟙 + εF† + ...    (37)

and:

U†(ε) U(ε) = U(ε) U†(ε) = 𝟙 + ε(F + F†) + ...    (38)

Since U(ε) is unitary, the first-order terms in ε on the right-hand side of (38) are zero; we therefore have:

F + F† = 0    (39)

This relation expresses the fact that the operator F is anti-Hermitian. It is convenient to set:

F = −iG    (40)

so as to obtain the equation:

G − G† = 0    (41)

which states that G is Hermitian. An infinitesimal unitary operator can therefore be written in the form:

U(ε) = 𝟙 − iεG    (42)

where G is a Hermitian operator. Substituting (42) into (30), we obtain:

Ã = (𝟙 − iεG) A (𝟙 + iεG) = A − iε[G, A] + ...    (43)

and, therefore:

Ã − A = −iε[G, A]    (44)

The variation of the operator A under the transformation U(ε) is, to first order in ε, proportional to the commutator [G, A].
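A minimal numerical sketch (not from the text) of relation (44): for a small ε, the variation of A under U(ε) = e^{-iεG} agrees with −iε[G, A] up to terms of order ε².

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

G = random_hermitian(3)          # Hermitian generator, eq. (42)
A = random_hermitian(3)          # operator to be transformed
eps = 1e-5

U = expm(-1j * eps * G)          # infinitesimal unitary operator U(eps)
A_tilde = U @ A @ U.conj().T     # transform (30)

first_order = -1j * eps * (G @ A - A @ G)             # -i eps [G, A], eq. (44)
print(np.max(np.abs((A_tilde - A) - first_order)))    # ~ eps**2, as expected
```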



Complement DII
A more detailed study of the {|r⟩} and {|p⟩} representations

1    The {|r⟩} representation
     1-a   The R operator and functions of R
     1-b   The P operator and functions of P
     1-c   The Schrödinger equation in the {|r⟩} representation
2    The {|p⟩} representation
     2-a   The P operator and functions of P
     2-b   The R operator and functions of R
     2-c   The Schrödinger equation in the {|p⟩} representation

1.   The {|r⟩} representation

1-a.   The R operator and functions of R

Let us calculate the matrix elements, in the {|r⟩} representation, of the X, Y, Z operators. Using formula (E-2-c) of Chapter II and the orthogonality relations of the kets |r⟩, we immediately obtain:

⟨r|X|r′⟩ = x′ δ(r − r′)
⟨r|Y|r′⟩ = y′ δ(r − r′)
⟨r|Z|r′⟩ = z′ δ(r − r′)    (1)

These three equations can be condensed into one:

⟨r|R|r′⟩ = r′ δ(r − r′)    (2)

The matrix elements, in the {|r⟩} representation, of a function F(R) are also very simple [cf. equation (27) of Complement BII]:

⟨r|F(R)|r′⟩ = F(r′) δ(r − r′)    (3)



1-b.   The P operator and functions of P

Let us calculate the matrix element ⟨r|Pₓ|r′⟩:

⟨r|Pₓ|r′⟩ = ∫d³p ⟨r|p⟩ ⟨p|Pₓ|r′⟩
          = ∫d³p ⟨r|p⟩ pₓ ⟨p|r′⟩
          = (2πħ)^{-3} ∫d³p pₓ e^{ip·(r−r′)/ħ}
          = [1/(2πħ) ∫dpₓ pₓ e^{ipₓ(x−x′)/ħ}] [1/(2πħ) ∫dp_y e^{ip_y(y−y′)/ħ}] [1/(2πħ) ∫dp_z e^{ip_z(z−z′)/ħ}]    (4)

From this, it follows that, using the integral form of the “delta function” and its derivative [cf. Appendix II, equations (34) and (53)]:

⟨r|Pₓ|r′⟩ = (ħ/i) δ′(x − x′) δ(y − y′) δ(z − z′)    (5)

The matrix elements of the other components of P could be obtained in an analogous fashion. Let us verify that the action of Pₓ in the {|r⟩} representation can indeed be derived from formula (5). To do so, let us calculate:

⟨r|Pₓ|ψ⟩ = ∫d³r′ ⟨r|Pₓ|r′⟩ ⟨r′|ψ⟩    (6)

From (5):

⟨r|Pₓ|ψ⟩ = (ħ/i) ∫dx′ δ′(x − x′) ∫dy′ δ(y − y′) ∫dz′ δ(z − z′) ψ(x′, y′, z′)    (7)

Using the relation (48) of Appendix II:

∫ δ′(u) f(u) du = −∫ δ(u) f′(u) du = −f′(0)    (8)

and taking u = x′ − x, we obtain:

⟨r|Pₓ|ψ⟩ = (ħ/i) ∂ψ/∂x (r)    (9)

which is indeed equation (E-26) of Chapter II.



What is the value of the matrix element ⟨r|F(P)|r′⟩ of a function F(P) of the P operator? An analogous calculation gives us:

⟨r|F(P)|r′⟩ = ∫d³p ⟨r|F(P)|p⟩ ⟨p|r′⟩
            = (2πħ)^{-3} ∫d³p F(p) e^{ip·(r−r′)/ħ}
            = (2πħ)^{-3/2} F̃(r − r′)    (10)

where F̃(r) is the inverse Fourier transform of the function F(p):

F̃(r) = (2πħ)^{-3/2} ∫d³p e^{ip·r/ħ} F(p)    (11)

1-c.   The Schrödinger equation in the {|r⟩} representation

In Chapter III, we shall introduce the Schrödinger equation, which is of fundamental importance in quantum mechanics:

iħ d/dt |ψ(t)⟩ = H |ψ(t)⟩    (12)

where H is the Hamiltonian operator, which we shall define in that chapter. For a (spinless) particle in a scalar potential V(r) [cf. equation (B-42) of Chapter III]:

H = P²/2m + V(R)    (13)

Let us write this equation in the {|r⟩} representation, that is, using the wave function ψ(r, t), defined by:

ψ(r, t) = ⟨r|ψ(t)⟩    (14)

Projecting (12) onto ⟨r|, in the case where H is given by formula (13), we obtain:

iħ ∂/∂t ψ(r, t) = 1/2m ⟨r|P²|ψ(t)⟩ + ⟨r|V(R)|ψ(t)⟩    (15)

The quantities involved in this equation can be expressed in terms of ψ(r, t), since:

⟨r|ψ(t)⟩ = ψ(r, t)    (16)

⟨r|V(R)|ψ(t)⟩ = V(r) ψ(r, t)    (17)

The matrix element ⟨r|P²|ψ(t)⟩ can be calculated by using the fact that P acts like (ħ/i)∇ in the {|r⟩} representation:

⟨r|P²|ψ(t)⟩ = ⟨r|(Pₓ² + P_y² + P_z²)|ψ(t)⟩ = −ħ² (∂²/∂x² + ∂²/∂y² + ∂²/∂z²) ψ(r, t) = −ħ² ∆ψ(r, t)    (18)

The Schrödinger equation then becomes:

iħ ∂/∂t ψ(r, t) = [−ħ²/2m ∆ + V(r)] ψ(r, t)    (19)

This is indeed the wave equation introduced in Chapter I (§ B-2).
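As an illustration (not from the text), the stationary form of equation (19), Hφ = Eφ, can be solved numerically by discretizing −(ħ²/2m)∆ + V on a grid; the sketch below uses ħ = m = 1 and a harmonic potential chosen purely as an example.

```python
import numpy as np

# 1D discretization of H = -(1/2) d^2/dx^2 + V(x) (units hbar = m = 1).
N, L = 400, 20.0
x = np.linspace(-L/2, L/2, N)
dx = x[1] - x[0]
V = 0.5 * x**2                                   # harmonic potential, purely as an example

# Second derivative (Laplacian) by finite differences.
lap = (np.diag(np.ones(N - 1), 1) - 2*np.eye(N) + np.diag(np.ones(N - 1), -1)) / dx**2
H = -0.5 * lap + np.diag(V)

E = np.linalg.eigvalsh(H)
print(E[:4])   # close to 0.5, 1.5, 2.5, 3.5: the harmonic-oscillator eigenvalues of eq. (19)
```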

2.   The {|p⟩} representation

2-a.   The P operator and functions of P

We obtain without difficulty formulas analogous to (2) and (3):

⟨p|P|p′⟩ = p′ δ(p − p′)    (20)

⟨p|G(P)|p′⟩ = G(p′) δ(p − p′)    (21)

2-b.   The R operator and functions of R

Arguments analogous to those of § 1 yield the formulas corresponding to (5) and (10):

⟨p|X|p′⟩ = iħ δ′(pₓ − pₓ′) δ(p_y − p_y′) δ(p_z − p_z′)    (22)

and:

⟨p|F(R)|p′⟩ = (2πħ)^{-3/2} F̄(p − p′)    (23)

with:

F̄(p) = (2πħ)^{-3/2} ∫d³r e^{-ip·r/ħ} F(r)    (24)

2-c.   The Schrödinger equation in the {|p⟩} representation

Let us introduce the “wave function in the {|p⟩} representation” by:

ψ̄(p, t) = ⟨p|ψ(t)⟩    (25)

Using (12), we shall look for the equation giving the time evolution of ψ̄(p, t). Projecting (12) onto the ket |p⟩, we obtain:

iħ ∂/∂t ψ̄(p, t) = 1/2m ⟨p|P²|ψ(t)⟩ + ⟨p|V(R)|ψ(t)⟩    (26)

Now we have:

⟨p|ψ(t)⟩ = ψ̄(p, t)    (27)

⟨p|P²|ψ(t)⟩ = p² ψ̄(p, t)    (28)

The quantity which remains to be calculated is:

⟨p|V(R)|ψ(t)⟩ = ∫d³p′ ⟨p|V(R)|p′⟩ ⟨p′|ψ(t)⟩    (29)

Using (23), we find:

⟨p|V(R)|ψ(t)⟩ = (2πħ)^{-3/2} ∫d³p′ V̄(p − p′) ψ̄(p′, t)    (30)

where V̄(p) is the Fourier transform of V(r):

V̄(p) = (2πħ)^{-3/2} ∫d³r e^{-ip·r/ħ} V(r)    (31)

The Schrödinger equation in the {|p⟩} representation is therefore written:

iħ ∂/∂t ψ̄(p, t) = p²/2m ψ̄(p, t) + (2πħ)^{-3/2} ∫d³p′ V̄(p − p′) ψ̄(p′, t)    (32)

Comment:

Since ψ̄(p, t) is the Fourier transform of ψ(r, t) [cf. formula (E-18) of Chapter II], it would have been possible to find equation (32) by taking the Fourier transforms of both sides of equation (19).

Complement EII
Some general properties of two observables, Q and P, whose commutator is equal to iħ

1    The operator S(λ): definition, properties
2    Eigenvalues and eigenvectors of Q
     2-a   Spectrum of Q
     2-b   Degree of degeneracy
     2-c   Eigenvectors
3    The {|q⟩} representation
     3-a   The action of Q in the {|q⟩} representation
     3-b   The action of S(λ) in the {|q⟩} representation; the translation operator
     3-c   The action of P in the {|q⟩} representation
4    The {|p⟩} representation. The symmetric nature of the P and Q observables

In quantum mechanics, one often encounters two operators whose commutator is equal to iħ. This is the case, for example, when these two operators correspond to the two classical conjugate quantities q and p (q, the coordinate in a system of orthonormal axes, and p = ∂ℒ/∂q̇ the conjugate momentum of q; cf. § 3-a of Appendix III). In quantum mechanics, one associates with q and p operators Q and P that satisfy the relation:

[Q, P] = iħ    (1)

In § E of Chapter II, we encountered such operators: X and Pₓ. In this complement, we shall take a more general point of view and show that it is possible to establish a whole series of important properties relative to two observables Q and P whose commutator is equal to iħ. All these properties are just consequences of the commutation relation (1).

1.   The operator S(λ): definition, properties

We shall consider two observables Q and P, satisfying the relation:

[Q, P] = iħ    (2)

and we shall define the operator S(λ), which depends on the real parameter λ, by:

S(λ) = e^{-iλP/ħ}    (3)

This operator is unitary; it is easy to verify the relations:

S†(λ) = S⁻¹(λ) = S(−λ)    (4)
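Anticipating the result of § 3-b below — that S(λ) translates wave functions — here is a numerical sketch (not from the text, units ħ = 1): e^{-iλP/ħ} is applied in the momentum representation, where P is diagonal, and the wave packet indeed moves by λ.

```python
import numpy as np

hbar = 1.0
N, L = 512, 40.0
q = np.linspace(-L/2, L/2, N, endpoint=False)
p = 2*np.pi*hbar*np.fft.fftfreq(N, d=q[1] - q[0])

psi = np.exp(-(q - 1.0)**2)          # arbitrary wave packet centered at q = 1
lam = 3.0

# Apply S(lam) = exp(-i*lam*P/hbar): multiplication by exp(-i*lam*p/hbar) in the p representation.
psi_translated = np.fft.ifft(np.exp(-1j*lam*p/hbar) * np.fft.fft(psi))

print(q[np.argmax(np.abs(psi))], q[np.argmax(np.abs(psi_translated))])   # ~1.0 then ~4.0
```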

Let us calculate the commutator [Q, S(λ)]. We can apply formula (51) of Complement BII, since [Q, P] = iħ commutes with Q and P:

[Q, S(λ)] = [Q, e^{-iλP/ħ}] = iħ (−iλ/ħ) e^{-iλP/ħ} = λ S(λ)    (5)

This relation can also be written:

Q S(λ) = S(λ) [Q + λ]    (6)

Finally, note that:

S(λ) S(μ) = S(λ + μ)    (7)

2.   Eigenvalues and eigenvectors of Q

2-a.   Spectrum of Q

Assume that Q has a non-zero eigenvector |q⟩, with eigenvalue q:

Q|q⟩ = q|q⟩    (8)

Apply equation (6) to the vector |q⟩. This yields:

Q S(λ)|q⟩ = S(λ)(Q + λ)|q⟩ = S(λ)(q + λ)|q⟩ = (q + λ) S(λ)|q⟩    (9)

This equation expresses the fact that S(λ)|q⟩ is another non-zero eigenvector of Q, with an eigenvalue of (q + λ) (S(λ)|q⟩ is non-zero because S(λ) is unitary). Thus, starting with an eigenvector |q⟩ of Q, one can, by applying S(λ), construct another eigenvector of Q, with any real eigenvalue (λ can indeed take on any real value). The spectrum of Q is therefore a continuous spectrum, composed of all possible values on the real axis¹.

2-b.   Degree of degeneracy

From now on, we shall assume, for simplicity, that the eigenvalue q of Q is non-degenerate (the results which we shall derive can be generalized to the case where q is degenerate). Let us show that if q is non-degenerate, all the other eigenvalues of Q are also non-degenerate. Let us assume, for example, that the eigenvalue q + λ is two-fold degenerate, and we shall show that we arrive at a contradiction. There would then exist two orthogonal eigenvectors, |q + λ, 1⟩ and |q + λ, 2⟩, corresponding to the eigenvalue q + λ:

⟨q + λ, 1|q + λ, 2⟩ = 0    (10)

____________
¹ This shows that in a space of finite dimension N, there are no observables Q and P whose commutator is equal to iħ. The number of eigenvalues of Q could not be simultaneously less than or equal to N and infinite. This result can be derived directly, moreover, by taking the trace of relation (2): Tr(QP) − Tr(PQ) = Tr(iħ𝟙). When N is finite, the traces on the left-hand side of this equation exist: they are finite and equal numbers [cf. Complement BII, formula (7a)]. The equation becomes 0 = iħN, which is impossible.

Consider the two vectors S(−λ)|q + λ, 1⟩ and S(−λ)|q + λ, 2⟩. They are, according to (9), two eigenvectors of Q, with an eigenvalue of q + λ − λ = q. They are not collinear, since they are orthogonal; their scalar product can be written, using the fact that S(−λ) is unitary:

⟨q + λ, 1|S†(−λ) S(−λ)|q + λ, 2⟩ = ⟨q + λ, 1|q + λ, 2⟩ = 0    (11)

We reach the conclusion that q is at least two-fold degenerate, which is contrary to the initial hypothesis. Consequently, all the eigenvalues of Q must have the same degree of degeneracy.

2-c.   Eigenvectors

We shall fix the relative phases of the different eigenvectors of Q with respect to the eigenvector |0⟩, of eigenvalue 0, by setting:

|q⟩ = S(q)|0⟩    (12)

Applying S(λ) to both sides of (12) and using (7), we obtain:

S(λ)|q⟩ = S(λ) S(q)|0⟩ = S(λ + q)|0⟩ = |q + λ⟩    (13)

The adjoint expression of (13) is written:

⟨q|S†(λ) = ⟨q + λ|    (14)

or, using (4) and replacing λ by −λ:

⟨q|S(λ) = ⟨q − λ|    (15)

3.   The {|q⟩} representation

Since Q is an observable, the set of its eigenvectors constitutes a basis of the state space. It is possible to characterize each ket |ψ⟩ by its “wave function in the {|q⟩} representation”:

ψ(q) = ⟨q|ψ⟩    (16)

3-a.   The action of Q in the {|q⟩} representation

Let us calculate, in the {|q⟩} representation, the wave function associated with the ket Q|ψ⟩. It is written:

⟨q|Q|ψ⟩ = q ⟨q|ψ⟩ = q ψ(q)    (17)

[using (8) and the fact that Q is Hermitian]. The action of Q in the {|q⟩} representation is therefore simply a multiplication by q.

3-b.   The action of S(λ) in the {|q⟩} representation; the translation operator

The wave function in the {|q⟩} representation associated with the ket S(λ)|ψ⟩ is written [formula (15)]:

⟨q|S(λ)|ψ⟩ = ⟨q − λ|ψ⟩ = ψ(q − λ)    (18)

The action of the operator S(λ) in the {|q⟩} representation is therefore a translation of the wave function over a distance λ parallel to the q-axis². For this reason, S(λ) is called the translation operator.

____________
² The function ψ(q − λ) is the function which, at the point q = q₀ + λ, takes on the value ψ(q₀). It is therefore the function obtained from ψ(q) by a translation of +λ.

3-c.   The action of P in the {|q⟩} representation

When dλ is an infinitely small quantity, we have:

S(−dλ) = e^{i dλ P/ħ} = 𝟙 + (i/ħ) dλ P + O(dλ²)    (19)

Consequently:

⟨q|S(−dλ)|ψ⟩ = ψ(q) + (i/ħ) dλ ⟨q|P|ψ⟩ + O(dλ²)    (20)

On the other hand, equation (18) yields:

⟨q|S(−dλ)|ψ⟩ = ψ(q + dλ)    (21)

Comparison of (20) and (21) shows that:

ψ(q + dλ) = ψ(q) + (i/ħ) dλ ⟨q|P|ψ⟩ + O(dλ²)    (22)

It follows that:

⟨q|P|ψ⟩ = (ħ/i) lim_{dλ→0} [ψ(q + dλ) − ψ(q)]/dλ = (ħ/i) dψ(q)/dq    (23)

The action of P in the {|q⟩} representation is therefore that of (ħ/i) d/dq. Equation (E-26) of Chapter II is thus generalized.

4.   The {|p⟩} representation. The symmetric nature of the P and Q observables

Relation (23) enables us to obtain easily the wave function vₚ(q) associated, in the {|q⟩} representation, with the eigenvector |p⟩ of P with an eigenvalue of p:

vₚ(q) = ⟨q|p⟩ = (2πħ)^{-1/2} e^{ipq/ħ}    (24)

We can therefore write:

|p⟩ = (2πħ)^{-1/2} ∫dq e^{ipq/ħ} |q⟩    (25)

A ket |ψ⟩ can be defined by its “wave function in the {|p⟩} representation”:

ψ̄(p) = ⟨p|ψ⟩    (26)

Using the adjoint relation of (25), we obtain:

ψ̄(p) = (2πħ)^{-1/2} ∫dq e^{-ipq/ħ} ψ(q)    (27)

ψ̄(p) is therefore the Fourier transform of ψ(q). The action of the operator P in the {|p⟩} representation corresponds to a multiplication by p; that of the operator Q corresponds, as can easily be shown using (27), to the operation iħ d/dp.

Thus we obtain symmetrical results in the {|q⟩} and {|p⟩} representations. This is not surprising: in our hypotheses, it is possible to exchange the Q and P operators, simply changing the sign of the commutator in relation (2). Instead of introducing the operator S(λ), we could therefore have considered the operator T(μ) defined by:

T(μ) = e^{iμQ/ħ}    (28)

and we could have developed the same arguments, replacing Q by P and P by Q everywhere.

References:

Messiah (1.17), Vol. I, § VIII-6; Dirac (1.13), § 25; Merzbacher (1.16), Chap. 14, § 7.



Complement FII
The parity operator

1    The parity operator
     1-a   Definition
     1-b   Simple properties of Π
     1-c   Eigensubspaces of Π
2    Even and odd operators
     2-a   Definitions
     2-b   Selection rules
     2-c   Examples
     2-d   Functions of operators
3    Eigenstates of an even observable B₊
4    Application to an important special case

1.   The parity operator

1-a.   Definition

Consider a physical system whose state space is ℰ_r. The parity operator Π is defined by its action on the basis vectors¹ |r⟩ of ℰ_r:

Π|r⟩ = |−r⟩    (1)

The matrix elements of Π are therefore, in the {|r⟩} representation:

⟨r′|Π|r⟩ = ⟨r′|−r⟩ = δ(r′ + r)    (2)

Consider an arbitrary vector |ψ⟩ of ℰ_r:

|ψ⟩ = ∫d³r ψ(r) |r⟩    (3)

Now calculate Π|ψ⟩:

Π|ψ⟩ = ∫d³r ψ(r) Π|r⟩ = ∫d³r ψ(r) |−r⟩    (4)

If the variable change r′ = −r is performed, Π|ψ⟩ can be written:

Π|ψ⟩ = ∫d³r′ ψ(−r′) |r′⟩    (5)

____________
¹ Care must be taken not to confuse |−r₀⟩ and −|r₀⟩. The former is an eigenvector of R, with eigenvalue −r₀ and wave function δ(r + r₀). The latter is an eigenvector of R with eigenvalue r₀ and wave function −δ(r − r₀).

Comparison of (3) and (5) shows that the action of Π in the {|r⟩} representation is to change r to −r:

⟨r|Π|ψ⟩ = ψ(−r)    (6)

Now let us consider a physical system S whose state vector is |ψ⟩; Π|ψ⟩ describes the physical system obtained from S by reflection through the origin of the axes.

1-b.   Simple properties of Π

The operator Π² is the identity operator. From (1) we have:

Π²|r⟩ = Π(Π|r⟩) = Π|−r⟩ = |r⟩    (7)

that is, since the kets |r⟩ form a basis of ℰ_r:

Π² = 𝟙    (8a)

or:

Π = Π⁻¹    (8b)

It is easy to show by recurrence that the operator Πⁿ is:
– equal to 𝟙 when n is even,
– equal to Π when n is odd.

We can rewrite (6) in the form:

⟨r|Π|ψ⟩ = ⟨−r|ψ⟩    (9)

Since this equation is valid for all |ψ⟩, it can be deduced that:

⟨r|Π = ⟨−r|    (10)

Moreover, the Hermitian conjugate expression of (1) is written:

⟨r|Π† = ⟨−r|    (11)

Since the kets |r⟩ form a basis, it follows from (10) and (11) that Π is Hermitian:

Π† = Π    (12)

Combining this equation with (8b), we obtain:

Π† = Π⁻¹    (13)

Π is therefore unitary as well.

1-c.   Eigensubspaces of Π

Let |φ⟩ be an eigenvector of Π, with an eigenvalue of ε:

Π|φ⟩ = ε|φ⟩

Applying (8a), we obtain:

|φ⟩ = Π²|φ⟩ = ε²|φ⟩    (14)

We therefore have ε² = 1: the eigenvalues of Π are limited to +1 and −1. Since the space ℰ_r is infinite-dimensional, we immediately see that these eigenvalues are degenerate. An eigenvector of Π with the eigenvalue +1 will be said to be even; an eigenvector with the eigenvalue −1, odd.

Consider the two operators P₊ and P₋ defined by:

P₊ = ½ (𝟙 + Π)
P₋ = ½ (𝟙 − Π)    (15)

These operators are Hermitian; using (8a), it is easy to show that:

P₊² = P₊
P₋² = P₋    (16)

P₊ and P₋ are thus the projectors onto two subspaces of ℰ_r, which we shall call ℰ₊ and ℰ₋. Let us calculate the products P₊P₋ and P₋P₊; we obtain:

P₊P₋ = ¼ (𝟙 + Π − Π − Π²) = 0
P₋P₊ = ¼ (𝟙 − Π + Π − Π²) = 0    (17)

The two subspaces ℰ₊ and ℰ₋ are therefore orthogonal. Let us show that they are also supplementary. We see immediately from definition (15) that:

P₊ + P₋ = 𝟙    (18)

For all |ψ⟩, we have, therefore:

|ψ⟩ = (P₊ + P₋)|ψ⟩ = |ψ₊⟩ + |ψ₋⟩    (19)

with:

|ψ₊⟩ = P₊|ψ⟩
|ψ₋⟩ = P₋|ψ⟩    (20)

Let us calculate the products ΠP₊ and ΠP₋; we obtain:

ΠP₊ = ½ Π(𝟙 + Π) = ½ (Π + 𝟙) = P₊
ΠP₋ = ½ Π(𝟙 − Π) = ½ (Π − 𝟙) = −P₋    (21)

These equations enable us to show that the vectors |ψ₊⟩ and |ψ₋⟩ introduced in (20) are even and odd, respectively:

Π|ψ₊⟩ = ΠP₊|ψ⟩ = P₊|ψ⟩ = |ψ₊⟩
Π|ψ₋⟩ = ΠP₋|ψ⟩ = −P₋|ψ⟩ = −|ψ₋⟩    (22)

The spaces ℰ₊ and ℰ₋ are therefore the eigensubspaces of Π, with the eigenvalues +1 and −1. In the {|r⟩} representation, equations (22) can be written:

ψ₊(−r) = ψ₊(r)
ψ₋(−r) = −ψ₋(r)    (23)

The wave functions ψ₊(r) and ψ₋(r) are even and odd, respectively. Relation (19) expresses the fact that any ket of ℰ_r can be decomposed into a sum of two eigenvectors of Π, |ψ₊⟩ and |ψ₋⟩, belonging respectively to the even subspace ℰ₊ and the odd subspace ℰ₋. Therefore, Π is an observable.
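A small numerical illustration (not from the text) of the projectors (15) and the decomposition (19): on a grid symmetric about the origin, the parity operator simply reverses the order of the samples.

```python
import numpy as np

x = np.linspace(-5, 5, 501)              # symmetric grid, so parity is just x -> -x
psi = np.exp(-(x - 1.0)**2)              # arbitrary wave function with no definite parity

Pi_psi = psi[::-1]                       # action of the parity operator: psi(x) -> psi(-x)
psi_plus = 0.5 * (psi + Pi_psi)          # even part, P_plus |psi>, eq. (20)
psi_minus = 0.5 * (psi - Pi_psi)         # odd part,  P_minus |psi>

print(np.allclose(psi_plus + psi_minus, psi))        # decomposition (19)
print(np.allclose(psi_plus[::-1], psi_plus))         # psi_plus is even
print(np.allclose(psi_minus[::-1], -psi_minus))      # psi_minus is odd
```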

2.   Even and odd operators

2-a.   Definitions

In § 2 of Complement CII, we defined the concept of a unitary transformation of operators. In the case of Π [which is indeed unitary; see (13)], the transformed operator of an arbitrary operator B is written:

B̃ = Π B Π    (24)

and satisfies the relation [cf. equation (27) of Complement CII]:

⟨r|B̃|r′⟩ = ⟨−r|B|−r′⟩    (25)

The operator B̃ is said to be the parity transform of B. In particular, if:

B̃₊ = + B₊

the operator B₊ is said to be even; if:

B̃₋ = − B₋

the operator B₋ is said to be odd. An even operator B₊ therefore satisfies:

Π B₊ Π = B₊    (26)

or, multiplying this equation on the left by Π and using (8a):

B₊ Π = Π B₊    (27)

that is:

[Π, B₊] = 0    (28)

An even operator is therefore an operator that commutes with Π. It can be seen, similarly, that an odd operator is an operator that anticommutes with Π:

Π B₋ + B₋ Π = 0    (29)

2-b.   Selection rules

Let B₊ be an even operator. Let us calculate the matrix element ⟨φ|B₊|χ⟩; by hypothesis, we have:

⟨φ|B₊|χ⟩ = ⟨φ|Π B₊ Π|χ⟩ = ⟨φ′|B₊|χ′⟩    (30)

with:

|φ′⟩ = Π|φ⟩
|χ′⟩ = Π|χ⟩    (31)

If one of the two kets, |φ⟩ and |χ⟩, is even and the other odd (|φ′⟩ = ±|φ⟩, |χ′⟩ = ∓|χ⟩), relation (30) yields:

⟨φ|B₊|χ⟩ = −⟨φ|B₊|χ⟩ = 0    (32)

Hence the rule: the matrix elements of an even operator are zero between vectors of opposite parity. If, now, B₋ is odd, relation (30) becomes:

⟨φ|B₋|χ⟩ = −⟨φ′|B₋|χ′⟩    (33)

which is zero when |φ⟩ and |χ⟩ are both either even or odd. Hence the rule: the matrix elements of an odd operator are zero between vectors of the same parity. In particular, the diagonal matrix element ⟨φ|B₋|φ⟩ (the mean value of B₋ in the state |φ⟩; cf. Chapter III, § C-4) is zero if |φ⟩ has a definite parity.

2-c.   Examples

α.   The X, Y, Z operators

In this case, we have:

Π X|r⟩ = Π x|r⟩ = x Π|r⟩ = x|−r⟩    (34)

and:

X Π|r⟩ = X|−r⟩ = −x|−r⟩    (35)

Adding these two equations together, we obtain:

(Π X + X Π)|r⟩ = 0    (36)

or, since the vectors |r⟩ form a basis:

Π X + X Π = 0    (37)

X is therefore odd. The proofs are the same for Y and Z; R is therefore an odd operator.

β.   The Pₓ, P_y, P_z operators

Let us calculate the ket Π|p⟩; we obtain:

Π|p⟩ = (2πħ)^{-3/2} ∫d³r e^{ip·r/ħ} Π|r⟩
     = (2πħ)^{-3/2} ∫d³r e^{ip·r/ħ} |−r⟩
     = (2πħ)^{-3/2} ∫d³r′ e^{-ip·r′/ħ} |r′⟩
     = |−p⟩    (38)

We then have, using an argument analogous to the one developed in α:

Π Pₓ|p⟩ = pₓ Π|p⟩ = pₓ|−p⟩
Pₓ Π|p⟩ = Pₓ|−p⟩ = −pₓ|−p⟩    (39)

and:

Π Pₓ + Pₓ Π = 0    (40)

The P operator is odd.

γ.   The parity operator Π obviously commutes with itself; it is an even operator.

2-d.   Functions of operators

Let B₊ be an even operator. Using relation (8a), we obtain:

Π B₊ⁿ Π = (Π B₊ Π)(Π B₊ Π) ... (Π B₊ Π)   [n factors]   = B₊ⁿ    (41)

An even operator raised to the nth power is even. Hence, any operator F(B₊) is even.

Let B₋ be an odd operator; let us calculate the operator Π B₋ⁿ Π:

Π B₋ⁿ Π = (Π B₋ Π)(Π B₋ Π) ... (Π B₋ Π)   [n factors]   = (−1)ⁿ B₋ⁿ    (42)

An odd operator raised to the nth power is even if n is even, odd if n is odd. Consider an operator F(B₋); this operator is even if the corresponding function F(x) is even, odd if it is odd. In general, F(B₋) has no definite parity.

3.   Eigenstates of an even observable B₊

Let us consider an arbitrary even observable B₊ and an eigenvector |φ⟩ of B₊ with an eigenvalue b. Since B₊ is even, it commutes with Π. Applying the theorems of § D-3-a of Chapter II, we obtain the following results:

α. If b is a non-degenerate eigenvalue, |φ⟩ is necessarily an eigenvector of Π; it is therefore either an even or an odd vector. The mean value of any odd observable B₋, such as R, P, etc., is then zero.

β. If b is a degenerate eigenvalue corresponding to the eigensubspace ℰ_b, the vectors of ℰ_b do not all necessarily have a definite parity. Π|φ⟩ may be a vector which is non-collinear with |φ⟩; it is nevertheless a vector which has the same eigenvalue b. Moreover, it is possible to find a basis of eigenvectors common to Π and B₊ in every subspace ℰ_b.

4.   Application to an important special case

We shall often need to find the eigenstates of a Hamiltonian operator H, acting in ℰ_r, of the form:

H = P²/2m + V(R)    (43)

Since the P operator is odd, the P² operator is even. When, in addition, the function V(r) is even (V(r) = V(−r)), the operator H is even. According to what we have just seen, it is then possible to look for the eigenstates of H among the even or odd states. This often simplifies the calculations considerably. We have already encountered a certain number of cases where the Hamiltonian is even: the square well, the infinite well (cf. Complement HI). We shall study others: the harmonic oscillator, the hydrogen atom, etc. It is easy to verify in all these special cases the properties which we have derived.

Comment:

If H is even, and if one of its eigenstates |φ⟩ which has no definite parity (i.e. Π|φ⟩ is non-collinear to |φ⟩) has been found, it can be asserted that the corresponding eigenvalue is degenerate: since Π commutes with H, Π|φ⟩ is an eigenvector of H with the same eigenvalue as |φ⟩.
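A numerical sketch of this section (not from the text, units ħ = m = 1, with an arbitrarily chosen symmetric double-well potential): the low-lying eigenstates of an even one-dimensional Hamiltonian of the form (43) come out with a definite parity, which can be read off from ⟨φ|Π|φ⟩ = ±1.

```python
import numpy as np

N, L = 401, 20.0
x = np.linspace(-L/2, L/2, N)
dx = x[1] - x[0]
V = 0.05 * (x**2 - 9.0)**2                               # even potential: V(x) = V(-x)

lap = (np.diag(np.ones(N - 1), 1) - 2*np.eye(N) + np.diag(np.ones(N - 1), -1)) / dx**2
H = -0.5 * lap + np.diag(V)                              # even Hamiltonian, eq. (43)
E, phi = np.linalg.eigh(H)

for n in range(4):
    parity = np.sign(np.vdot(phi[:, n], phi[::-1, n]).real)   # <phi|Pi|phi> = +1 or -1
    print(f"E_{n} = {E[n]:.4f}   parity = {parity:+.0f}")     # alternating +1, -1, +1, -1
```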

References and suggestions for further reading:

Schiff (1.18), § 29; Roman (2.3), § 5-3 d; Feynman I (6.3), Chap. 52; Sakurai (2.7), Chap. 3; articles by Morrison (2.28), Feinberg and Goldhaber (2.29), Wigner (2.30).

Complement GII
An application of the properties of the tensor product: the two-dimensional infinite well

1    Definition; eigenstates
2    Study of the energy levels
     2-a   Ground state
     2-b   First excited states
     2-c   Systematic and accidental degeneracies

In Complement HI (§ 2-c) we have already studied, in a one-dimensional problem, the stationary states of a particle placed in an infinite potential well. By using the concept of a tensor product (cf. Chap. II, § F), we shall be able to generalize this discussion to the case of a two-dimensional infinite well (the introduction of a third dimension would not involve any additional theoretical difficulty).

1.   Definition; eigenstates

We shall consider a particle of mass m, restricted to a plane xOy, inside a “square box” of edge a: its potential energy V(x, y) becomes infinite when one of its coordinates x or y leaves the interval [0, a]:

V(x, y) = V(x) + V(y)    (1)

with:

V(x) = 0     if 0 ≤ x ≤ a
V(x) = +∞    if x < 0 or x > a    (2)

The Hamiltonian of the quantum particle is then (Chap. III, § B-5):

H = 1/2m (Pₓ² + P_y²) + V(X) + V(Y)    (3)

which can be written:

H = Hₓ + H_y    (4)

with:

Hₓ = Pₓ²/2m + V(X)
H_y = P_y²/2m + V(Y)    (5)

We thus find ourselves in the important special case mentioned in Chapter II (§ F-4-a-β), and we can consider the eigenstates of H in the form:

|Φ⟩ = |φ⟩ ⊗ |χ⟩    (6)

with:

Hₓ|φ⟩ = Eₓ|φ⟩ ;   H_y|χ⟩ = E_y|χ⟩    (7)

We then have:

H|Φ⟩ = E|Φ⟩   with   E = Eₓ + E_y    (8)

We have therefore reduced a two-dimensional problem to a one-dimensional problem, which, moreover, has already been solved (cf. Complement HI). Applying the results of this complement, and formulas (7) and (8), we therefore see that:

– the eigenvalues of H are of the form:

E_{n,p} = (n² + p²) π²ħ²/2ma²    (9)

where n and p are positive integers;

– to these energies correspond eigenstates |Φ_{n,p}⟩ of H which can be written in the form of tensor products:

|Φ_{n,p}⟩ = |φ_n⟩ ⊗ |χ_p⟩    (10)

whose normalized wave function is:

Φ_{n,p}(x, y) = φ_n(x) χ_p(y) = (2/a) sin(nπx/a) sin(pπy/a)    (11)

It is easy to verify that these wave functions vanish at the edges of the “square box” (x or y = 0 or a), where the potential energy becomes infinite.

2.   Study of the energy levels

2-a.   Ground state

n and p are strictly positive integers¹. The ground state is therefore obtained when n = 1, p = 1. Its energy is:

E_{1,1} = π²ħ²/ma²    (12)

This value is attained only for n = p = 1. The ground state is therefore not degenerate.

____________
¹ The values n = 0 or p = 0 are excluded as they give null wave functions (therefore impossible to normalize).

2-b.   First excited states

The first excited state is obtained either for n = 1 and p = 2 or for n = 2 and p = 1. Its energy is:

E_{1,2} = E_{2,1} = 5π²ħ²/2ma²    (13)

This state is two-fold degenerate, since |Φ_{1,2}⟩ and |Φ_{2,1}⟩ are independent. The second excited state corresponds to n = p = 2; it is not degenerate, and its energy is:

E_{2,2} = 4π²ħ²/ma²    (14)

The third excited state corresponds to n = 1, p = 3 and n = 3, p = 1, etc.

2-c.   Systematic and accidental degeneracies

The general observation can be made that all levels for which n ≠ p are degenerate, since:

E_{n,p} = E_{p,n}    (15)

This degeneracy is related to a symmetry of the problem. The square well under consideration is symmetric with respect to the first bisectrix of the xOy plane. This is expressed by the fact that the Hamiltonian H is invariant under the exchange²:

x ⟷ y    (16)

If an eigenstate of H is known whose wave function is Φ(x, y), the state which corresponds to Φ′(x, y) = Φ(y, x) is also an eigenstate of H with the same eigenvalue. Consequently, if the function Φ(x, y) is not symmetric with respect to x and y, the eigenvalue associated with it is necessarily degenerate. This is the origin of the degeneracy (15): for n ≠ p, Φ_{n,p}(x, y) is not symmetric with respect to x and y [formula (11)]. This interpretation is corroborated by the fact that if the symmetry is destroyed by choosing a well whose widths along Ox and along Oy are different (being equal to a and b respectively), the corresponding degeneracy disappears, and formula (9) becomes:

E_{n,p} = π²ħ²/2m (n²/a² + p²/b²)    (17)

which implies:

E_{n,p} ≠ E_{p,n}    (18)

Such degeneracies, whose origin lies in a symmetry of the problem, are called systematic degeneracies.

____________
² In the state space, an operator could be defined to correspond to a reflection about the first bisectrix. It could then be shown that, in the present case, this operator commutes with H.

Comment:

The other symmetries of the two-dimensional square well do not create systematic degeneracies because the eigenstates of H are all invariant with respect to them. For example, for arbitrary n and p, Φ_{n,p}(x, y) is simply multiplied by a phase factor if x is replaced by (a − x) and y by (a − y) (symmetry with respect to the center of the well). Degeneracies may also arise which are not directly related to the symmetry of the problem. They are called accidental degeneracies. For example, in the case which we have discussed, it so happens that E_{5,5} = E_{7,1} and E_{7,4} = E_{8,1}.
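The degeneracies discussed above can be listed directly (a sketch, not from the text): grouping the pairs (n, p) by the value of n² + p², which is proportional to E_{n,p}, exhibits both the systematic and the accidental coincidences.

```python
from collections import defaultdict

# Group (n, p) pairs by n**2 + p**2, i.e. by energy E_{n,p} of eq. (9).
levels = defaultdict(list)
for n in range(1, 10):
    for p in range(1, 10):
        levels[n**2 + p**2].append((n, p))

print(levels[5])     # [(1, 2), (2, 1)]                   : systematic degeneracy, eq. (15)
print(levels[50])    # [(1, 7), (5, 5), (7, 1)]           : accidental degeneracy E_55 = E_71
print(levels[65])    # [(1, 8), (4, 7), (7, 4), (8, 1)]   : E_74 = E_81
```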




Complement HII
Exercises

Dirac notation. Commutators. Eigenvectors and eigenvalues

1. |φₙ⟩ are the eigenstates of a Hermitian operator H (H is, for example, the Hamiltonian of some physical system). Assume that the states |φₙ⟩ form a discrete orthonormal basis. The operator U(m, n) is defined by:

U(m, n) = |φₘ⟩⟨φₙ|

a. Calculate the adjoint U†(m, n) of U(m, n).

b. Calculate the commutator [H, U(m, n)].

c. Prove the relation:

U(m, n) U†(p, q) = δ_{nq} U(m, p)

d. Calculate Tr{U(m, n)}, the trace of the operator U(m, n).

e. Let A be an operator, with matrix elements A_{mn} = ⟨φₘ|A|φₙ⟩. Show that:

A = Σ_{m,n} A_{mn} U(m, n)

f. Prove the relation:

A_{pq} = Tr{A U†(p, q)}

2. In a two-dimensional vector space, consider the operator whose matrix, in an orthonormal basis 1 2 , is written: 0

=

0

Is Hermitian? Calculate its eigenvalues and eigenvectors (giving their normalized expansion in terms of the 1 2 basis). Calculate the matrices that represent the projectors onto these eigenvectors. Then verify that they satisfy the orthogonality and closure relations. Same questions for the matrices: 2

=

2 2

3

and, in a three-dimensional space =

~ 2

0 2 0

2 0 2

0 2 0 205



COMPLEMENT HII

3. The state space of a certain physical system is three-dimensional. Let be an orthonormal basis of this space. The kets 0 and 1 are defined by: 0

=

1 2

1

+

1

=

1 3

1

+

2

2 3

+

1 2

1

2

3

3

Are these kets normalized? Calculate the matrices 0 and 1 representing, in the 1 projection operators onto the state 0 and onto the state matrices are Hermitian.

4. Let be the operator defined by vectors of the state space. Under what condition is Calculate

2

=

, where

basis, the 3 . Verify that these

2 1

and

are two

Hermitian?

. Under what condition is

a projector ?

Show that can always be written in the form to be calculated and 1 and 2 are projectors.

=

1 2

where

is a constant

5. Let P₁ be the orthogonal projector onto the subspace ℰ₁, P₂ the orthogonal projector onto the subspace ℰ₂. Show that, for the product P₁P₂ to be an orthogonal projector as well, it is necessary and sufficient that P₁ and P₂ commute. In this case, what is the subspace onto which P₁P₂ projects?

6. The σₓ matrix is defined by:

σₓ = ⎡ 0  1 ⎤
     ⎣ 1  0 ⎦

Prove the relation:

e^{iασₓ} = 𝟙 cos α + i σₓ sin α

where 𝟙 is the 2 × 2 unit matrix.
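A quick numerical check of this identity (not part of the exercise, which asks for a proof): since σₓ² = 𝟙, the exponential series splits into cosine and sine parts.

```python
import numpy as np
from scipy.linalg import expm

sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
alpha = 0.8

lhs = expm(1j * alpha * sigma_x)
rhs = np.cos(alpha) * np.eye(2) + 1j * np.sin(alpha) * sigma_x
print(np.allclose(lhs, rhs))   # True, because sigma_x squared is the identity
```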

7. Establish, for the matrix given in exercise 2, a relation analogous to the one proved for in the preceding exercise. Generalize for all matrices of the form: =

+

with: 2

206

+

2

=1

3

(e

Calculate the matrices representing e2 e ? ) ? e ( + ) to e

, (e

)2 and e (

+

)



EXERCISES

. Is e2

equal to

2

8. Consider the Hamiltonian

of a particle in a one-dimensional problem, defined

by: 1 2 + ( ) 2 where and are the operators defined in § E of Chapter II, which satisfy the relation: [ ] = ~. The eigenvectors of are denoted by : = , where is a discrete index. =

Show that: = where is a coefficient which depends on the difference between Calculate (hint: consider the commutator [ ]).

and

.

From this, deduce, using the closure relation, the equation: )2

(

2

=

~2

2

2

9. Let be the Hamiltonian operator of a physical system. Denote by eigenvectors of , with eigenvalues :

the

= For an arbitrary operator [

]

, prove the relation:

=0

Consider a one-dimensional problem, where the physical system is a particle of mass with potential energy ( ). In this case, is written: =

1 2

2

In terms of

+

,

( ) and

( ), find the commutators: [

], [

] and [

].

Show that the matrix element (which we shall interpret in Chapter III as the mean value of the momentum in the state ) is zero. 2

Establish a relation between energy in the state

) and

energy in the state kinetic energy when: ( )=

0

( =2 4 6

;

0

is

= d d ( )

2

(the mean value of the kinetic . Since the mean value of the potential

, how is it related to the mean value of the

0)? 207



COMPLEMENT HII

~ 10. Using the relation = (2 ~) 1 2 e , find the expressions of and in terms of ( ). Can these results be found directly by using the fact ~ d that in the representation, acts like ? d

Complete sets of commuting observables, C.S.C.O. 11. Consider a physical system whose three-dimensional state space is spanned by the orthonormal basis formed by the three kets 1 , 2 , 3 . In the basis of these three vectors, taken in this order, the two operators and are defined by: =~ where

0

1 0 0 0 1 0 0 0 1

0

and

Are

100 001 010

=

are real constants.

and

Show that

Hermitian? and

commute. Give a basis of eigenvectors common to

Of the sets of operators:

,

,

2

,

and

.

, which form a C.S.C.O.?

12. In the same state space as that of the preceding exercise, consider two operators and defined by: 1 1

= =

1

2

3

2

=

=0 2

3 3

Write the matrices that represent, in the 1 2 , , 2 . Are these operators observables?

= =

3 1 2

3

basis, the operators

,

Give the form of the most general matrix which represents an operator which commutes with . Same question for 2 , then for 2 . Do

2

and

form a C.S.C.O.? Give a basis of common eigenvectors.

Solution of exercise 11 and are Hermitian because the matrices which correspond to them are symmetric and real. and . We therefore have 1 is an eigenvector common to 1 = 1 . We see, then, that for and to commute, it is sufficient that the restrictions of these operators to the subspace 2 , spanned by 2 and 3 , commute. Now, in this subspace, the matrix representing is equal to ~ 0 (where is the 2 2 unit matrix), which commutes with all 2 2 matrices. and therefore commute 208



EXERCISES

(this result could, of course, be obtained by calculating directly the matrices and ). The restriction of to 2 is written:

2

2

01 10

=

The normalized eigenvectors of this 2 2

=

1 [ 2

2

3

=

1 [ 2

2

+

2 matrix are easy to obtain; they are:

3

] (eigenvalue + )

3

] (eigenvalue – )

These vectors are automatically eigenvectors of since 2 is the eigensubspace of corresponding to the eigenvalue ~ 0 . To summarize, the eigenvectors common to and are given by: eigenvalue of 1 2

3

=

~

1

1 = [ 2 1 = [ 2

2

2

+

eigenvalue of

0

3

]

~

0

3

]

~

0

These vectors are the only (to within, of course, a phase factor) normalized eigenvectors common to and . It can be seen from the above table that has a two-fold degenerate eigenvalue; it is therefore not a C.S.C.O. Similarly, also has a two-fold degenerate eigenvalue and is therefore not a C.S.C.O.: an eigenvector of with the eigenvalue can be 1 1 1 1 , or 2 , or 1 + 2 + 3 , for example. On the other hand, the 3 3 3 set of the two operators and does constitute a C.S.C.O. We see from the above table that no two vectors have the same eigenvalues for both and . This is why, as has already been pointed out, the system of normalized eigenvectors common to and is unique (to within phase factors). Note that within the eigensubspace 2 of associated with the eigenvalue ~ 0 , the eigenvalues of are distinct ( and ). Similarly, in the eigensubspace of spanned by 1 and are distinct (~ 0 and ~ 0 ). 2 , the eigenvalues of 2

has three eigenvectors with the eigenvalue ~2 02 , 1 , 2 and 3 . It is easy to see that 2 and do not constitute a C.S.C.O., since two linearly independent eigenvectors 1 and 2 correspond to the pair of eigenvalues ~2 02 .

209

COMPLEMENT HII



Solution of exercise 12 Let us use the rule for constructing the matrix of an operator: “in the th column of the matrix, write the components of the operator transform of the th basis vector”. We obtain easily: 10 0 00 0 00 1

=

2

100 000 001

=

2

=

0 0 1 0 1 0 1 0 0

=

1 0 0 0 1 0 0 0 1

These matrices are symmetric and real, and therefore Hermitian. Since the space is finite-dimensional, they can be diagonalized and therefore represent observables. Let be an operator that commutes with . cannot (cf. Chap. II, § D-3a) have any matrix elements between 1 and 2 , or between 2 and 3 , or between 1 and 3 (eigenvectors of with different eigenvalues). The matrix which represents is therefore necessarily diagonal, that is, of the form: 0

11

[

]=0

=

0 0

0 0

22

0

33

Let be an operator that commutes with 2 . The matrix representing can have elements between 1 and 3 (eigenvectors of 2 with the same eigenvalue), but none between 2 and 1 or 3 . is therefore written: 2

[

]=0

=

11

0

13

0

22

0

31

0

33

It is therefore less restrictive to impose the condition that an operator commute with 2 than with : is not necessarily a diagonal matrix. It can only be said that does not mix the vectors of the subspace 2 spanned by 1 and 3 with those of the one-dimensional subspace spanned by 2 . This property, moreover, appears very clearly if the matrix which represents the operator is written in the basis (changing the order of the basis vectors): 1 3 2 =

11

13

31

33

0 0

0

0

22

Finally, since 2 is the identity operator, any 3 its most general form is: [

210

2

]=0

=

11

12

13

21

22

23

31

32

33

3 matrix commutes with

2

, and

• is an eigenvector common to and 3 , 2 and are written: 2

2

=

10 01

2

=

01 10

2 2

2

2

and

. In the subspace

2

EXERCISES

spanned by

1

The eigenvectors of the latter matrix are: 2

=

1 [ 2

1

3

=

1 [ 2

1

+

3

]

3

]

and the basis of eigenvectors common to vector 1

=

2

=

3

1

1

and

eigenvalue of

2

1 [ 2 1 = [ 2

2

+

is: 2

eigenvalue of

0

1

3

]

1

1

3

]

1

1

No two lines are alike in the table of eigenvalues of 2 and : these two operators therefore form a C.S.C.O. (this is not, however, the case for either one of them taken alone).

211

Chapter III

The postulates of quantum mechanics

A     Introduction
B     Statement of the postulates
      B-1   Description of the state of a system
      B-2   Description of physical quantities
      B-3   The measurement of physical quantities
      B-4   Time evolution of systems
      B-5   Quantization rules
C     The physical interpretation of the postulates concerning observables and their measurement
      C-1   The quantization rules are consistent with the probabilistic interpretation of the wave function
      C-2   Quantization of certain physical quantities
      C-3   The measurement process
      C-4   Mean value of an observable in a given state
      C-5   The root mean square deviation
      C-6   Compatibility of observables
D     The physical implications of the Schrödinger equation
      D-1   General properties of the Schrödinger equation
      D-2   The case of conservative systems
E     The superposition principle and physical predictions
      E-1   Probability amplitudes and interference effects
      E-2   Case in which several states can be associated with the same measurement result

Quantum Mechanics, Volume I, Second Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

A.   Introduction

In classical mechanics, the motion of any physical system is determined if the position r(t) and velocity v(t) of each of its points are known as a function of time. In general (Appendix III), to describe such a system, one introduces generalized coordinates qᵢ(t) (i = 1, 2, ..., N), whose derivatives with respect to time, q̇ᵢ(t), are the generalized velocities. Specifying the qᵢ(t) and q̇ᵢ(t) enables us to calculate, at any given instant, the position and velocity of any point of the system. Using the Lagrangian ℒ(qᵢ, q̇ᵢ, t), one defines the conjugate momentum pᵢ of each of the generalized coordinates qᵢ:

pᵢ = ∂ℒ/∂q̇ᵢ    (A-1)

The qᵢ(t) and pᵢ(t) (i = 1, 2, ..., N) are called the fundamental dynamical variables. All the physical quantities associated with the system (energy, angular momentum, etc.) can be expressed in terms of the fundamental dynamical variables. For example, the total energy of the system is given by the classical Hamiltonian ℋ(qᵢ, pᵢ, t). The motion of the system can be studied by using either Lagrange’s equations or the Hamilton-Jacobi canonical equations, which are written:

dqᵢ/dt = ∂ℋ/∂pᵢ    (A-2a)
dpᵢ/dt = −∂ℋ/∂qᵢ    (A-2b)

In the special case of a system consisting of a single physical point of mass m, the qᵢ are simply the three coordinates of this point, and the q̇ᵢ are the components of its velocity v. If the forces acting on this particle can be derived from a scalar potential V(r, t), the three conjugate momenta of its position r (that is, the components of its linear momentum p) are equal to the components of its mechanical momentum m v. The total energy is then written:

ℋ = p²/2m + V(r, t)    (A-3)

and the angular momentum with respect to the origin:

𝓛 = r × p    (A-4)

Since ℋ(r, p, t) = p²/2m + V(r, t), the Hamilton-Jacobi equations (A-2) take on the well-known form:

dr/dt = p/m    (A-5a)
dp/dt = −∇V(r, t)    (A-5b)

B. STATEMENT OF THE POSTULATES

The classical description of a physical system can therefore be summarized as follows: ( ) The state of the system at a fixed time 0 is defined by specifying coordinates ( 0 ) and their conjugate momenta ( 0 ).

generalized

( ) The value, at a given time, of the various physical quantities is completely determined when the state of the system at this time is known: knowing the state of the system, one can predict with certainty the result of any measurement performed at time 0 (

) The time evolution of the state of the system is given by the Hamilton-Jacobi equations. Since these are first-order differential equations, their solution () () is unique if the value of these functions at a given time 0 is fixed, ( 0) ( 0) . The state of the system is known for all time if its initial state is known.

In this chapter, we shall study the postulates on which the quantum description of physical systems is based. We have already introduced them, in a qualitative and partial way, in Chapter I. Here we shall discuss them explicitly, within the framework of the formalism developed in Chapter II. These postulates will provide us with an answer to the following questions (which correspond to the three points enumerated above for the classical description): ( ) How is the state of a quantum system at a given time described mathematically? ( ) Given this state, how can we predict the results of the measurement of various physical quantities? (

) How can the state of the system at an arbitrary time time 0 is known?

be found when the state at

We shall begin by stating the postulates of quantum mechanics (§ B). Then we shall analyze their physical content and discuss their consequences (§§ C, D, E). B. B-1.

Statement of the postulates Description of the state of a system

In Chapter I, we introduced the concept of the quantum state of a particle. We first characterized this state at a given time by a square-integrable wave function. Then, in Chapter II, we associated a ket of the state space r with each wave function: choosing belonging to r is equivalent to choosing the corresponding function (r) = r . Therefore, the quantum state of a particle at a fixed time is characterized by a ket of the space r . In this form, the concept of a state can be generalized to any physical system. First Postulate: At a fixed time 0 , the state of an isolated physical system is defined by specifying a ket ( 0 ) belonging to the state space . It is important to note that, since is a vector space, this first postulate implies a superposition principle: a linear combination of state vectors is a state vector. We shall discuss this fundamental point and its relations to the other postulates in § E. 215

CHAPTER III

B-2.

THE POSTULATES OF QUANTUM MECHANICS

Description of physical quantities

We have already used, in § D-1 of Chapter I, a differential operator related to the total energy of a particle in a scalar potential. This is simply a special case of the second postulate. Second Postulate: Every measurable physical quantity operator acting in ; this operator is an observable.

is described by an

Comments:

( ) The fact that is an observable (cf. Chap. II, § D-2) will be seen below (§ B-3) to be essential. ( ) Unlike classical mechanics (cf. § A), quantum mechanics describes in a fundamentally different manner the state of a system and the associated physical quantities: a state is represented by a vector, a physical quantity by an operator. B-3.

The measurement of physical quantities

B-3-a.

Possible results

The connection between the operator and the total energy of the particle appeared in § D-1 of Chapter I in the following form: the only energies possible are the eigenvalues of the operator . Here as well, this relation can be extended to all physical quantities. Third Postulate: The only possible result of the measurement of a physical quantity is one of the eigenvalues of the corresponding observable .

Comments:

( ) A measurement of mitian.

always gives a real value, since

is by definition Her-

( ) If the spectrum of is discrete, the results that can be obtained by measuring are quantized (§ C-2). B-3-b.

Principle of spectral decomposition

We are going to generalize and discuss in more detail the conclusions of § A-3 of Chapter I, where we analyzed a simple experiment performed on polarized photons. Consider a system whose state is characterized, at a given time, by the ket , assumed to be normalized to 1: =1 216

(B-1)

B. STATEMENT OF THE POSTULATES

We want to predict the result of the measurement, at this time, of a physical quantity associated with the observable . This prediction, as we already know, is of a probabilistic sort. We are now going to give the rules that allow us to calculate the probability of obtaining any given eigenvalue of . .

Case of a discrete spectrum

First, let us assume that the spectrum of is entirely discrete. If all the eigenvalues of are non-degenerate, there is associated with each of them a unique (to within a constant factor) eigenvector : =

(B-2)

Since is an observable, the set of the , which we shall take to be normalized, constitutes a basis in , and the state vector can be written: =

(B-3)

We postulate that the probability (

2

)=

(

) of finding

when

is measured is:

2

=

(B-4)

Fourth Postulate (case of a discrete non-degenerate spectrum): When the physical quantity is measured on a system in the normalized state , the probability ( ) of obtaining the non-degenerate eigenvalue of the corresponding observable is: (

2

)=

where

is the normalized eigenvector of

If, now, some of the eigenvalues vectors correspond to them: =

;

associated with the eigenvalue

are degenerate, several orthonormalized eigen-

=1 2

(B-5)

can still be expanded on the orthonormal basis =

: (B-6)

=1

In this case, the probability (

2

)= =1

(

) becomes: 2

=

(B-7)

=1

217

CHAPTER III

THE POSTULATES OF QUANTUM MECHANICS

(B-4) is then seen to be a special case of (B-7), which can therefore be considered to be the general formula.

Fourth Postulate (case of a discrete spectrum): When the physical quantity is measured on a system in the normalized state , the probability ( ) of obtaining the eigenvalue of the corresponding observable is: (

2

)= =1

where is the degree of degeneracy of and ( =1 2 ) is an orthonormal set of vectors which forms a basis in the eigensubspace associated with the eigenvalue of . For this postulate to make sense, it is obviously necessary that, if the eigenvalue is degenerate, the probability ( ) be independent of the choice of the basis in . To verify this, consider the vector: =

(B-8) =1

where the coefficients

are the same as those appearing in the expansion (B-6) of

=

: (B-9)

is the part of which belongs to , that is, the projection of is, moreover, what we find when we substitute (B-9) into (B-8): =

onto

. This

=

(B-10)

=1

where: =

(B-11) =1

is the projector onto (§ B-3-b of Chapter II). Let us now calculate the square of the norm of . From (B-8): 2

=

(B-12)

=1

Therefore, ( ) is the square of the norm of = , the projection of onto . From this expression, it is clear that a change in the basis in does not affect ( ). This probability is written: (

)=

or, using the fact that ( 218

)=

(B-13) is Hermitian (

=

) and that it is a projector (

2

=

):

(B-14)

B. STATEMENT OF THE POSTULATES

.

Case of a continuous spectrum

Now let us assume that the spectrum of is continuous and, for the sake of simplicity, non-degenerate. The system, orthonormal in the extended sense, of eigenvectors of : =

(B-15)

forms a continuous basis in , in terms of which =

d

can be expanded:

( )

(B-16)

Since the possible results of a measurement of form a continuous set, we must define a probability density, just as we did for the interpretation of the wave function of a particle (§ B-2 of Chapter I). The probability d ( ) of obtaining a value included between and + d is given by: d ( ) = ( )d with: ( )= ( )2=

2

(B-17)

Fourth Postulate (case of a continuous non-degenerate spectrum): When the physical quantity is measured on a system in the normalized state , the probability d ( ) of obtaining a result included between and + d is equal to: 2

d ( )=

d

where is the eigenvector corresponding to the eigenvalue observable associated with .

of the

Comments:

( ) It can be verified explicitly, in each of the cases considered above, that the total probability is equal to 1. For example, starting with formula (B-7), we find: (

2

)=

=

=1

(B-18)

=1

since is normalized. This last condition is therefore indispensable if the statements we have made are to be coherent. Nevertheless, it is not essential: if it is not fulfilled, it suffices to replace (B-7) and (B-17), respectively, by: (

)=

1

2

(B-19)

=1

219

CHAPTER III

THE POSTULATES OF QUANTUM MECHANICS

and: 1

( )=

( )2

(B-20)

( ) For the fourth postulate to be coherent, it is necessary for the operator associated with any physical quantity to be an observable: it must be possible to expand any state on the eigenvectors of . (

.

) We have not given the fourth postulate in its most general form. Starting with the discussion of the cases we have envisaged, it is simple to extend the principle of spectral decomposition to any situation (continuous degenerate spectrum, partially continuous and partially discrete spectrum, etc...). In § E, and later in Chapter IV, we shall apply this fourth postulate to a certain number of examples, pointing out certain implications of the superposition principle mentioned in § B-1.

An important consequence Consider two kets

and

such that:

=e where

(B-21)

is a real number. If =

e

e

is normalized, so is

:

=

(B-22)

The probabilities predicted for an arbitrary measurement are the same for since, for any : 2

2

= e

2

=

Similarly, we can multiply

and

(B-23)

by a constant factor:

= e

(B-24)

without changing any of the physical results: since each coefficient , or ( ), is multipled by the same factor in both the numerator and denominator of (B-19) and (B-20), there appear two factors of 2 that cancel each other. Therefore, two proportional state vectors represent the same physical state. Care must be taken to interpret this result correctly. For example, let us assume that: =

1

1

+

2

(B-25)

2

where 1 and 2 are complex numbers. It is true that e 1 1 represents, for all real 1 , the same physical state as 1 , and e 2 2 represents the same state as 2 . But, in general: = 220

1

e

1

1

+

2

e

2

2

(B-26)

B. STATEMENT OF THE POSTULATES

does not describe the same state as (we shall see in § E-1 that the relative phases of the expansion coefficients of the state vector play an important role). This is not true for the special case where 1 = 2 + 2 , that is, where: =e

1

[

1

1

+

2

2

]=e

1

(B-27)

In other words: a global phase factor does not affect the physical predictions, but the relative phases of the coefficients of an expansion are significant. B-3-c.

Reduction of the wave packet

We have already introduced this concept in speaking of the measurement of the polarization of photons in the experiment described in § A-3 of Chapter I. We are now going to generalize it, confining ourselves, nevertheless, to the case of a discrete spectrum (we shall take up the case of a continuous spectrum in § E).

Assume that we want to measure, at a given time, the physical quantity $\mathscr{A}$. If the ket $|\psi\rangle$, which represents the state of the system immediately before the measurement, is known, the fourth postulate allows us to predict the probabilities of obtaining the various possible results. But when the measurement is actually performed, it is obvious that only one of these possible results is obtained. Immediately after this measurement, we cannot speak of the "probability of having obtained" this or that value: we know which one was actually obtained. We therefore possess additional information, and it is understandable that the state of the system after the measurement, which must incorporate this information, should be different from $|\psi\rangle$.

Let us first consider the case where the measurement of $\mathscr{A}$ yields a simple eigenvalue $a_n$ of the observable $A$. We then postulate that the state of the system immediately after this measurement is the eigenvector $|u_n\rangle$ associated with $a_n$:

$$|\psi'(a_n)\rangle = |u_n\rangle \tag{B-28}$$

Comments:

(i) We have been speaking about states "immediately before" the measurement ($|\psi\rangle$) and "immediately after" ($|u_n\rangle$). The precise meaning of these expressions is the following: assume that the measurement takes place at the time $t_0$, and that we know the state $|\psi(0)\rangle$ of the system at the time $t = 0$. The sixth postulate (see § B-4) describes how the system evolves over time, that is, enables us to calculate from $|\psi(0)\rangle$ the state $|\psi(t_0)\rangle$ "immediately before" the measurement. If the measurement has yielded the non-degenerate eigenvalue $a_n$, the state $|\psi(t_1)\rangle$ at a time $t_1 > t_0$ must be calculated from $|\psi'(t_0)\rangle = |u_n\rangle$, the state "immediately after" the measurement, using the sixth postulate to determine the evolution of the state vector between the times $t_0$ and $t_1$ (Fig. 1).

(ii) If we perform a second measurement of $\mathscr{A}$ immediately after the first one (that is, before the system has had time to evolve), we shall always find the same result $a_n$, since the state of the system immediately before the second measurement is $|u_n\rangle$, and no longer $|\psi\rangle$.


[Figure 1: When a measurement at time $t_0$ of the observable $A$ gives the result $a_n$, the state vector of the system undergoes an abrupt modification and becomes $|u_n\rangle$. This new initial state then evolves.]

When the eigenvalue $a_n$ given by the measurement is degenerate, postulate (B-28) can be generalized as follows. If the expansion of the state $|\psi\rangle$ immediately before the measurement is written, with the same notation as in § B-3-b:

$$|\psi\rangle = \sum_n\sum_{i=1}^{g_n} c_n^i\,|u_n^i\rangle \tag{B-29}$$

the modification of the state vector due to the measurement is written:

$$|\psi\rangle \;\xrightarrow{\;(a_n)\;}\; \frac{1}{\sqrt{\displaystyle\sum_{i=1}^{g_n}\left|c_n^i\right|^2}}\;\sum_{i=1}^{g_n} c_n^i\,|u_n^i\rangle \tag{B-30}$$

$\sum_{i=1}^{g_n} c_n^i\,|u_n^i\rangle$ is the vector defined above [formula (B-8)], that is, the projection of $|\psi\rangle$ onto the eigensubspace associated with $a_n$. In (B-30), we normalized this vector since it is always more convenient to use state vectors of norm 1 [comment (i) of § B-3-b above]. With the notation of (B-10) and (B-11), we can therefore write (B-30) in the form:

$$|\psi\rangle \;\xrightarrow{\;(a_n)\;}\; \frac{P_n|\psi\rangle}{\sqrt{\langle\psi|P_n|\psi\rangle}} \tag{B-31}$$

Fifth Postulate: If the measurement of the physical quantity $\mathscr{A}$ on the system in the state $|\psi\rangle$ gives the result $a_n$, the state of the system immediately after the measurement is the normalized projection, $\dfrac{P_n|\psi\rangle}{\sqrt{\langle\psi|P_n|\psi\rangle}}$, of $|\psi\rangle$ onto the eigensubspace associated with $a_n$.

The state of the system immediately after the measurement is therefore always an eigenvector of $A$ with the eigenvalue $a_n$. We stress the fact, however, that it is not an arbitrary ket of the subspace $\mathscr{E}_n$, but the part of $|\psi\rangle$ that belongs to $\mathscr{E}_n$ (suitably normalized, for convenience).

In the light of § B-3-b above, equation (B-28) can be seen to be a special case of (B-30). When $g_n = 1$, the summation over $i$ disappears from (B-30), which becomes:

$$\frac{c_n^1}{\left|c_n^1\right|}\,|u_n^1\rangle = e^{i\,\mathrm{Arg}(c_n^1)}\,|u_n^1\rangle \tag{B-32}$$

This ket indeed describes the same physical state as $|u_n^1\rangle$.
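The normalized projection of the fifth postulate is easy to illustrate numerically. The following is a minimal sketch, not taken from the text: the three-dimensional space, the projector onto a doubly degenerate eigensubspace, and the state vector are all illustrative choices.

```python
import numpy as np

# Illustrative 3-dimensional example: the eigenvalue a_n is doubly degenerate and
# its eigensubspace is spanned by the first two basis vectors u_n^1, u_n^2.
P_n = np.diag([1.0, 1.0, 0.0])                 # projector onto the eigensubspace of a_n

psi = np.array([0.5, 0.5j, np.sqrt(0.5)], dtype=complex)   # normalized state before measurement

# Fourth postulate: probability of obtaining a_n is <psi|P_n|psi>
prob_an = np.vdot(psi, P_n @ psi).real
print(prob_an)                                 # 0.5

# Fifth postulate, formula (B-31): state immediately after a measurement that gave a_n
psi_after = P_n @ psi / np.sqrt(prob_an)
print(np.linalg.norm(psi_after))               # 1.0 -- the normalized projection onto the subspace
```

A second measurement of the same observable on `psi_after` would give $a_n$ with probability 1, in agreement with comment (ii) above.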

B-4. Time evolution of systems

We have already presented, in § B-2 of Chapter I, the Schrödinger equation for one particle. Here we shall write it in the general case.

Sixth Postulate: The time evolution of the state vector $|\psi(t)\rangle$ is governed by the Schrödinger equation:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle = H(t)\,|\psi(t)\rangle$$

where $H(t)$ is the observable associated with the total energy of the system.

$H$ is called the Hamiltonian operator of the system, as it is obtained from the classical Hamiltonian (Appendix III and § B-5 below).

B-5. Quantization rules

We are finally going to discuss how to construct, for a physical quantity $\mathscr{A}$ already defined in classical mechanics, the operator $A$ which describes it in quantum mechanics.

B-5-a. Statement

Let us first consider a system composed of a single particle, without spin, subjected to a scalar potential. In this case:

With the position $\mathbf{r}\,(x, y, z)$ of the particle is associated the observable $\mathbf{R}\,(X, Y, Z)$.

With the momentum $\mathbf{p}\,(p_x, p_y, p_z)$ of the particle is associated the observable $\mathbf{P}\,(P_x, P_y, P_z)$.

Recall that the components of $\mathbf{R}$ and $\mathbf{P}$ satisfy the canonical commutation relations [Chap. II, equations (E-30)]:

$$[R_i, R_j] = [P_i, P_j] = 0\,; \qquad [R_i, P_j] = i\hbar\,\delta_{ij} \tag{B-33}$$

Any physical quantity $\mathscr{A}$ related to this particle is expressed in terms of the fundamental dynamical variables $\mathbf{r}$ and $\mathbf{p}$: $\mathscr{A}(\mathbf{r}, \mathbf{p}, t)$. To obtain the corresponding observable $A$, one could simply replace¹, in the expression $\mathscr{A}(\mathbf{r}, \mathbf{p}, t)$, the variables $\mathbf{r}$ and $\mathbf{p}$ by the observables $\mathbf{R}$ and $\mathbf{P}$:

$$A(t) = \mathscr{A}(\mathbf{R}, \mathbf{P}, t) \tag{B-34}$$

However, this mode of action would be, in general, ambiguous. Assume, for example, that in $\mathscr{A}(\mathbf{r}, \mathbf{p}, t)$ there appears a term of the form:

$$\mathbf{r}\cdot\mathbf{p} = x\,p_x + y\,p_y + z\,p_z \tag{B-35}$$

In classical mechanics, the scalar product $\mathbf{r}\cdot\mathbf{p}$ is commutative, and one can just as well write:

$$\mathbf{p}\cdot\mathbf{r} = p_x\,x + p_y\,y + p_z\,z \tag{B-36}$$

But when $\mathbf{r}$ and $\mathbf{p}$ are replaced by the corresponding observables $\mathbf{R}$ and $\mathbf{P}$, the operators obtained from (B-35) and (B-36) are not identical [see relations (B-33)]:

$$\mathbf{R}\cdot\mathbf{P} \neq \mathbf{P}\cdot\mathbf{R} \tag{B-37}$$

Moreover, neither $\mathbf{R}\cdot\mathbf{P}$ nor $\mathbf{P}\cdot\mathbf{R}$ is Hermitian:

$$(\mathbf{R}\cdot\mathbf{P})^\dagger = (X P_x + Y P_y + Z P_z)^\dagger = \mathbf{P}\cdot\mathbf{R} \neq \mathbf{R}\cdot\mathbf{P} \tag{B-38}$$

To the preceding postulates, therefore, must be added a symmetrization rule. For example, the observable associated with $\mathbf{r}\cdot\mathbf{p}$ will be:

$$\frac{1}{2}\left(\mathbf{R}\cdot\mathbf{P} + \mathbf{P}\cdot\mathbf{R}\right) \tag{B-39}$$

which is indeed Hermitian. For an observable which is more complicated than $\mathbf{r}\cdot\mathbf{p}$, an analogous symmetrization is to be performed:

The observable $A$ which describes a classically defined physical quantity $\mathscr{A}$ is obtained by replacing, in the suitably symmetrized expression for $\mathscr{A}$, $\mathbf{r}$ and $\mathbf{p}$ by the observables $\mathbf{R}$ and $\mathbf{P}$ respectively.

We shall see, however, that there exist quantum physical quantities that have no classical equivalent and which are therefore defined directly by the corresponding observables (this is the case, for example, for particle spin).

Comment:

The preceding rules, and commutation rules (B-33) in particular, are valid only in Cartesian coordinates. It would be possible to generalize them to other coordinate systems; however, they would no longer have the same simple form as they do above.

¹ See, in Complement B_II, the definition of a function of an operator.
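The effect of the symmetrization rule (B-39) can be checked numerically. The sketch below is not from the text: it represents $X$ and $P$ in one dimension on a finite grid (an illustrative discretization, with $\hbar = 1$) and verifies that $XP$ alone is not Hermitian while the symmetrized combination is.

```python
import numpy as np

# One-dimensional grid representation of X and P (illustrative finite-difference model).
N, L = 200, 10.0
x = np.linspace(-L/2, L/2, N)
dx = x[1] - x[0]
hbar = 1.0

X = np.diag(x)
# Central-difference approximation of P = (hbar/i) d/dx
D = (np.diag(np.ones(N - 1), 1) - np.diag(np.ones(N - 1), -1)) / (2 * dx)
P = (hbar / 1j) * D

XP = X @ P
sym = 0.5 * (X @ P + P @ X)

print(np.allclose(XP, XP.conj().T))     # False: X.P by itself is not Hermitian, cf. (B-38)
print(np.allclose(sym, sym.conj().T))   # True: the symmetrized form (B-39) is Hermitian
```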

B-5-b. Important examples

α. The Hamiltonian of a particle in a scalar potential

Consider a (spinless) particle of charge $q$ and mass $m$, placed in an electric field derived from a scalar potential $U(\mathbf{r})$. The potential energy of the particle is therefore $V(\mathbf{r}) = qU(\mathbf{r})$, and the corresponding classical Hamiltonian is written [Appendix III, formula (29)]:

$$\mathscr{H}(\mathbf{r}, \mathbf{p}) = \frac{\mathbf{p}^2}{2m} + V(\mathbf{r}) \tag{B-40}$$

with:

$$\mathbf{p} = m\,\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t} = m\mathbf{v} \tag{B-41}$$

where $\mathbf{v}$ is the particle's velocity. No difficulties are presented by the construction of the quantum operator $H$ corresponding to $\mathscr{H}$. No symmetrization is necessary, since neither $\mathbf{P}^2 = P_x^2 + P_y^2 + P_z^2$ nor $V(\mathbf{R})$ involves products of non-commuting operators. We therefore have:

$$H = \frac{\mathbf{P}^2}{2m} + V(\mathbf{R}) \tag{B-42}$$

$V(\mathbf{R})$ is the operator obtained by replacing $\mathbf{r}$ by $\mathbf{R}$ in $V(\mathbf{r})$ (cf. Complement B_II, § 4). In this particular case, the Schrödinger equation, given in the sixth postulate, becomes:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle = \left[\frac{\mathbf{P}^2}{2m} + V(\mathbf{R})\right]|\psi(t)\rangle \tag{B-43}$$

β. The Hamiltonian of a particle in a vector potential

If the particle is now placed in an arbitrary electromagnetic field, the classical Hamiltonian becomes [Appendix III, relation (66)]:

$$\mathscr{H}(\mathbf{r}, \mathbf{p}, t) = \frac{1}{2m}\left[\mathbf{p} - q\mathbf{A}(\mathbf{r}, t)\right]^2 + qU(\mathbf{r}, t) \tag{B-44}$$

where $U(\mathbf{r}, t)$ and $\mathbf{A}(\mathbf{r}, t)$ are the scalar and vector potentials which describe the electromagnetic field, and where $\mathbf{p}$ is given by:

$$\mathbf{p} = m\,\frac{\mathrm{d}\mathbf{r}}{\mathrm{d}t} + q\mathbf{A}(\mathbf{r}, t) = m\mathbf{v} + q\mathbf{A}(\mathbf{r}, t) \tag{B-45}$$

Once again, since $\mathbf{A}(\mathbf{r}, t)$ depends only on $\mathbf{r}$ and the parameter $t$ (and not on $\mathbf{p}$), construction of the corresponding quantum operator $\mathbf{A}(\mathbf{R}, t)$ presents no problem. The Hamiltonian operator is then given by:

$$H(t) = \frac{1}{2m}\left[\mathbf{P} - q\mathbf{A}(\mathbf{R}, t)\right]^2 + V(\mathbf{R}, t) \tag{B-46}$$

with:

$$V(\mathbf{R}, t) = q\,U(\mathbf{R}, t) \tag{B-47}$$

and the Schrödinger equation is written:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle = \left\{\frac{1}{2m}\left[\mathbf{P} - q\mathbf{A}(\mathbf{R}, t)\right]^2 + V(\mathbf{R}, t)\right\}|\psi(t)\rangle \tag{B-48}$$

Comment:

Care must be taken not to confuse $\mathbf{p}$ (the momentum of the particle, also called the conjugate momentum of $\mathbf{r}$) with $m\mathbf{v}$ (the mechanical momentum of the particle): the difference between these two quantities appears clearly in (B-45). In quantum mechanics, there of course exists an operator associated with the velocity of the particle, which is written here:

$$\mathbf{V} = \frac{1}{m}\left[\mathbf{P} - q\mathbf{A}(\mathbf{R}, t)\right] \tag{B-49}$$

$H(t)$ is then given by:

$$H(t) = \frac{1}{2}\,m\mathbf{V}^2 + V(\mathbf{R}, t) \tag{B-50}$$

It is the sum of two terms, one corresponding to the kinetic energy and the other to the potential energy of the particle. However, it is the conjugate momentum $\mathbf{p}$, and not the mechanical momentum $m\mathbf{v}$, that becomes in quantum mechanics the operator $\mathbf{P}$ satisfying the canonical commutation relations (B-33).

C. The physical interpretation of the postulates concerning observables and their measurement

C-1. The quantization rules are consistent with the probabilistic interpretation of the wave function

It is natural to associate the observables $\mathbf{R}$ and $\mathbf{P}$, whose action was defined in § E of Chapter II, with the position and momentum of a particle. First of all, each of the observables $X, Y, Z$ and $P_x, P_y, P_z$ possesses a continuous spectrum, and experiments indeed show that all real values are possible for the six position and momentum variables. Moreover, we shall see that applying the fourth postulate to the case of these observables enables us to re-derive the probabilistic interpretation of the wave function as well as that of its Fourier transform (see §§ B-2 and C-3 of Chapter I).

Let us consider, for simplicity, the one-dimensional problem. If the particle is in the normalized state $|\psi\rangle$, the probability that a measurement of its position will yield a result included between $x$ and $x + \mathrm{d}x$ is equal to [formula (B-17)]:

$$\mathrm{d}\mathscr{P}(x) = \left|\langle x|\psi\rangle\right|^2\mathrm{d}x \tag{C-1}$$

where $|x\rangle$ is the eigenket of $X$ with the eigenvalue $x$. We again find that the square of the modulus of the wave function $\psi(x) = \langle x|\psi\rangle$ is the particle's position probability density.

Now, to the eigenvector $|p\rangle$ of the observable $P$ corresponds the plane wave:

$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\,e^{ipx/\hbar} \tag{C-2}$$

and we have seen (§ C-3 of Chapter I) that the de Broglie relations associate with this wave a well-defined momentum, which is precisely $p$. In addition, the probability of finding, for a particle in the state $|\psi\rangle$, a momentum between $p$ and $p + \mathrm{d}p$ is:

$$\mathrm{d}\mathscr{P}(p) = \left|\langle p|\psi\rangle\right|^2\mathrm{d}p = \left|\overline{\psi}(p)\right|^2\mathrm{d}p \tag{C-3}$$

This is indeed what we found in § C-3 of Chapter I.

C-2. Quantization of certain physical quantities

As we have already pointed out, the third postulate enables us to explain the quantization observed for certain quantities, such as the energy of atoms. But it does not imply that all quantities are quantized, since observables exist whose spectrum is continuous. The physical predictions based on the third postulate are therefore not at all obvious a priori. For example, when we study the hydrogen atom (Chap. VII), we shall start from the total energy of the electron in the Coulomb potential of the proton, from which we shall deduce the Hamiltonian operator. Solving its eigenvalue equation, we shall find that the bound states of the system can only correspond to certain discrete energies which we shall calculate. Thus we shall not only explain the quantization of the levels of the hydrogen atom, but also predict the possible energy values, which can be measured experimentally. We stress the fact that these results will be obtained using the same fundamental interaction law used in classical mechanics in the macroscopic domain. C-3.

The measurement process

The fourth and fifth postulates pose a certain number of fundamental problems which we shall not consider here (see section 5 of the bibliography of volumes I and II, or for instance Do we really understand quantum mechanics? by F. Laloë, Cambridge University Press, 2019). There is, in particular, the question of the “fundamental” perturbation involved in the observation of a quantum system (cf. Chap. I, §§ A-2 and A-3). The origin of these problems lies in the fact that the system under study is treated independently from the measurement device, although their interaction is essential to the observation process. One should actually consider the system and the measurement device together as a whole. This raises delicate questions concerning the details of the measurement process. We shall content ourselves with pointing out that the nondeterministic formulation of the fourth and fifth postulates is related to the problems that we have just mentioned. For example, the abrupt change from one state vector to another due to the measurement corresponds to the fundamental perturbation of which we have spoken. But it is impossible to predict what this perturbation will be, since it depends on the measurement result, which is not known with certainty in advance2 . We shall consider here only ideal measurements. To understand this concept, let us return, for example, to the experiment of § A-3 of Chapter I on polarized photons. It is clear that when we grant that all photons polarized in a certain direction traverse the analyzer, we assume that the analyzer is perfect. In practice, obviously, it also absorbs some of the photons that it should let through. We shall therefore make the hypothesis, 2 Except, obviously, in the case where one is sure of the result that will be found (probability equal to 1: the measurement does not modify the state of the system).


in the general case, that the measurement devices used are perfect: this amounts to assuming that the perturbation they provoke is due only to the quantum mechanical aspect of the measurement. Of course, real devices always present imperfections that affect the measurement and the system; but one can, in principle, constantly ameliorate them and thus approach the ideal limit defined by the postulates which we have stated. C-4.

Mean value of an observable in a given state

The predictions deduced from the fourth postulate are expressed in terms of probabilities. To verify them, it would be necessary to perform a large number of measurements under identical conditions. This means measuring the same quantity in a large number $N$ of systems which are all in the same quantum state. If these predictions are correct, the proportion of identical experiments resulting in a given event will approach, as $N \to \infty$, the theoretically predicted probability of this event. Such a verification can only be carried out in the limit where $N \to \infty$; in practice, $N$ is of course finite, and statistical techniques must be used to interpret the results.

The mean value of the observable³ $A$ in the state $|\psi\rangle$, which we shall denote by $\langle A\rangle_\psi$, or more simply by $\langle A\rangle$, is defined as the average of the results obtained when a large number $N$ of measurements of this observable are performed on systems which are all in the state $|\psi\rangle$. When $|\psi\rangle$ is given, the probabilities of finding all the possible results are known. The mean value $\langle A\rangle_\psi$ can therefore be predicted. We shall show that, if $|\psi\rangle$ is normalized, $\langle A\rangle_\psi$ is given by the formula:

$$\langle A\rangle_\psi = \langle\psi|A|\psi\rangle \tag{C-4}$$

First consider the case where the entire spectrum of $A$ is discrete. Out of $N$ measurements of $\mathscr{A}$ (the system being in the state $|\psi\rangle$ each time), the eigenvalue $a_n$ will be obtained $\mathcal{N}(a_n)$ times, with:

$$\frac{\mathcal{N}(a_n)}{N} \xrightarrow[N\to\infty]{} \mathscr{P}(a_n) \tag{C-5}$$

and:

$$\sum_n \mathcal{N}(a_n) = N \tag{C-6}$$

The mean value of the results of these $N$ experiments is the sum of the values found, divided by $N$ (when $\mathcal{N}(a_n)$ experiments have yielded the same result $a_n$, this result will clearly appear $\mathcal{N}(a_n)$ times in this sum). It is therefore equal to:

$$\frac{1}{N}\sum_n a_n\,\mathcal{N}(a_n) \tag{C-7}$$

Using (C-5), we see that when $N \to \infty$, this mean value approaches:

$$\sum_n a_n\,\mathscr{P}(a_n) \tag{C-8}$$

³ We shall henceforth use the word "observable" to designate a physical quantity as well as the associated operator.

Now substitute into this formula expression (B-7) for $\mathscr{P}(a_n)$:

$$\langle A\rangle_\psi = \sum_n a_n\sum_{i=1}^{g_n}\langle\psi|u_n^i\rangle\langle u_n^i|\psi\rangle \tag{C-9}$$

Since:

$$a_n\,\langle\psi|u_n^i\rangle = \langle\psi|A|u_n^i\rangle \tag{C-10}$$

(C-9) can be written in the form:

$$\langle A\rangle_\psi = \sum_n\sum_{i=1}^{g_n}\langle\psi|A|u_n^i\rangle\langle u_n^i|\psi\rangle = \langle\psi|A\left[\sum_n\sum_{i=1}^{g_n}|u_n^i\rangle\langle u_n^i|\right]|\psi\rangle \tag{C-11}$$

Since the $|u_n^i\rangle$ form an orthonormal basis of $\mathscr{E}$, the expression in brackets is equal to the identity operator (closure relation), and we obtain formula (C-4).

The argument is completely analogous for the case where the spectrum of $A$ is continuous (for simplicity, we shall continue to assume it to be non-degenerate). Consider $N$ identical experiments, and call $\mathrm{d}\mathcal{N}(\alpha)$ the number of experiments which have yielded a result included between $\alpha$ and $\alpha + \mathrm{d}\alpha$. We have, similarly:

$$\frac{\mathrm{d}\mathcal{N}(\alpha)}{N} \xrightarrow[N\to\infty]{} \mathrm{d}\mathscr{P}(\alpha) \tag{C-12}$$

The mean value of the results obtained is $\dfrac{1}{N}\displaystyle\int\alpha\,\mathrm{d}\mathcal{N}(\alpha)$, which, when $N \to \infty$, approaches:

$$\langle A\rangle_\psi = \int\alpha\,\mathrm{d}\mathscr{P}(\alpha) \tag{C-13}$$

Substitute into (C-13) the expression for $\mathrm{d}\mathscr{P}(\alpha)$ given by (B-17):

$$\langle A\rangle_\psi = \int\alpha\,\langle\psi|v_\alpha\rangle\langle v_\alpha|\psi\rangle\,\mathrm{d}\alpha \tag{C-14}$$

We can use the equation:

$$\alpha\,\langle\psi|v_\alpha\rangle = \langle\psi|A|v_\alpha\rangle \tag{C-15}$$

to transform (C-14) into:

$$\langle A\rangle_\psi = \int\mathrm{d}\alpha\,\langle\psi|A|v_\alpha\rangle\langle v_\alpha|\psi\rangle = \langle\psi|A\left[\int\mathrm{d}\alpha\,|v_\alpha\rangle\langle v_\alpha|\right]|\psi\rangle \tag{C-16}$$

Using the closure relation satisfied by the states $|v_\alpha\rangle$, we again find formula (C-4).
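The equivalence just derived between the probability-weighted average (C-8) and the matrix element (C-4) is easy to verify numerically. The sketch below is illustrative only: the Hermitian matrix standing in for $A$ and the state vector are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                      # a Hermitian "observable"

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)                    # normalized state

a, U = np.linalg.eigh(A)                      # eigenvalues a_n, eigenvectors u_n (columns of U)
probs = np.abs(U.conj().T @ psi) ** 2         # fourth postulate: P(a_n) = |<u_n|psi>|^2

mean_from_probs = np.sum(a * probs)           # sum over a_n P(a_n), cf. (C-8)
mean_direct = np.vdot(psi, A @ psi).real      # <psi|A|psi>, formula (C-4)
print(np.isclose(mean_from_probs, mean_direct))   # True
```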

Comments:

(i) $\langle A\rangle_\psi$, the average over a set of identical measurements, must not be confused with the time averages sometimes taken when dealing with time-dependent phenomena.

(ii) If the ket $|\psi\rangle$ representing the state of the system is not normalized, formula (C-4) becomes [cf. comment (i) of § B-3-b]:

$$\langle A\rangle_\psi = \frac{\langle\psi|A|\psi\rangle}{\langle\psi|\psi\rangle} \tag{C-17}$$

(iii) In practice, to calculate $\langle A\rangle_\psi$ explicitly, one often places oneself in a particular representation. For example:

$$\langle X\rangle_\psi = \langle\psi|X|\psi\rangle = \int\mathrm{d}^3r\,\langle\psi|X|\mathbf{r}\rangle\langle\mathbf{r}|\psi\rangle = \int\mathrm{d}^3r\;x\,\psi^*(\mathbf{r})\,\psi(\mathbf{r}) \tag{C-18}$$

using the definition of the $X$ operator [cf. Chap. II, relations (E-22)]. Similarly:

$$\langle P_x\rangle_\psi = \langle\psi|P_x|\psi\rangle = \int\mathrm{d}^3p\;p_x\,\overline{\psi}^*(\mathbf{p})\,\overline{\psi}(\mathbf{p}) \tag{C-19}$$

or, using the $\{|\mathbf{r}\rangle\}$ representation:

$$\langle P_x\rangle_\psi = \int\mathrm{d}^3r\,\langle\psi|\mathbf{r}\rangle\langle\mathbf{r}|P_x|\psi\rangle = \int\mathrm{d}^3r\;\psi^*(\mathbf{r})\,\frac{\hbar}{i}\,\frac{\partial}{\partial x}\,\psi(\mathbf{r}) \tag{C-20}$$

since $P_x$ is then represented by $\dfrac{\hbar}{i}\dfrac{\partial}{\partial x}$ [formula (E-26) of Chapter II].

C-5. The root mean square deviation

$\langle A\rangle_\psi$ indicates the order of magnitude of the values of the observable $A$ when the system is in the state $|\psi\rangle$. However, this mean value does not give any idea of the dispersion of the results we expect when measuring $\mathscr{A}$. Assume, for example, that the spectrum of $A$ is continuous and that, for a given state $|\psi\rangle$, the curve representing the variation with respect to $\alpha$ of the probability density $\rho(\alpha) = |\langle v_\alpha|\psi\rangle|^2$ has the shape shown in Figure 2. For a system in the state $|\psi\rangle$, nearly all the values that can be found when $\mathscr{A}$ is measured are included in an interval of width $\delta A$ containing $\langle A\rangle$.

[Figure 2: Variation with respect to $\alpha$ of the probability density $\rho(\alpha)$. The mean value $\langle A\rangle$ is the abscissa of the center of gravity of the area under the curve (it does not necessarily coincide with the abscissa of the maximum of the function).]

The quantity $\delta A$ characterizes the width of the curve: the smaller $\delta A$, the more the measurement results are concentrated about $\langle A\rangle$. How can we define, in a general way, a quantity which characterizes the dispersion of the measurement results about $\langle A\rangle$? We might envisage the following method: for each measurement, take the difference between the value obtained and $\langle A\rangle$; then calculate the average of these deviations, dividing their sum by the number of experiments. It is easy to see, however, that the result obtained would be zero; we have, obviously:

$$\left\langle A - \langle A\rangle\right\rangle = \langle A\rangle - \langle A\rangle = 0 \tag{C-21}$$

By the very definition of $\langle A\rangle$, the average of the negative deviations balances exactly the average of the positive ones. To avoid this compensation, it suffices to define $\Delta A$ such that $(\Delta A)^2$ is the mean of the squares of the deviations:

$$(\Delta A)^2 = \left\langle\left(A - \langle A\rangle\right)^2\right\rangle \tag{C-22}$$

By definition, we therefore introduce the root mean square deviation $\Delta A$ by setting:

$$\Delta A = \sqrt{\left\langle\left(A - \langle A\rangle\right)^2\right\rangle} \tag{C-23}$$

Using the expression for the mean value given in (C-4), we then have:

$$\Delta A = \sqrt{\langle\psi|\left(A - \langle A\rangle\right)^2|\psi\rangle} \tag{C-24}$$

This relation can also be written in a slightly different way:

$$\left\langle\left(A - \langle A\rangle\right)^2\right\rangle = \left\langle A^2 - 2A\langle A\rangle + \langle A\rangle^2\right\rangle = \langle A^2\rangle - 2\langle A\rangle\langle A\rangle + \langle A\rangle^2 = \langle A^2\rangle - \langle A\rangle^2 \tag{C-25}$$

The root mean square deviation $\Delta A$ is therefore also given by:

$$\Delta A = \sqrt{\langle A^2\rangle - \langle A\rangle^2} \tag{C-26}$$

For example, in the case of the continuous spectrum of the observable $A$ considered above, $\Delta A$ is given by:

$$(\Delta A)^2 = \int_{-\infty}^{+\infty}\left[\alpha - \langle A\rangle\right]^2\rho(\alpha)\,\mathrm{d}\alpha = \int_{-\infty}^{+\infty}\alpha^2\,\rho(\alpha)\,\mathrm{d}\alpha - \left[\int_{-\infty}^{+\infty}\alpha\,\rho(\alpha)\,\mathrm{d}\alpha\right]^2 \tag{C-27}$$

If definition (C-23) is applied to the observables $\mathbf{R}$ and $\mathbf{P}$, it can be shown (Complement C_III), using their commutation relations, that for any state $|\psi\rangle$, one has:

$$\Delta X\cdot\Delta P_x \geq \frac{\hbar}{2}\,; \qquad \Delta Y\cdot\Delta P_y \geq \frac{\hbar}{2}\,; \qquad \Delta Z\cdot\Delta P_z \geq \frac{\hbar}{2} \tag{C-28}$$

In other words, we find the Heisenberg relations (§ C-3 of Chapter I) again, but with a precise lower limit, which arises from the precise definition of the uncertainties.
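As a numerical illustration of (C-26) and (C-28), the following sketch evaluates $\Delta X$ and $\Delta P$ for a discretized Gaussian wave packet. The grid, the width $\sigma$ and the units ($\hbar = 1$) are assumptions made for the example; it is not part of the text.

```python
import numpy as np

hbar = 1.0
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
sigma = 1.3
psi = np.exp(-x**2 / (4 * sigma**2))
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)          # normalize on the grid

def mean(op_psi):                                    # <psi|O|psi> evaluated on the grid
    return (np.sum(np.conj(psi) * op_psi) * dx).real

x_mean, x2_mean = mean(x * psi), mean(x**2 * psi)
dpsi = np.gradient(psi, dx)
p_psi = (hbar / 1j) * dpsi                           # P in the {|x>} representation
p2_psi = -hbar**2 * np.gradient(dpsi, dx)

dX = np.sqrt(x2_mean - x_mean**2)                    # formula (C-26) for X
dP = np.sqrt(mean(p2_psi) - mean(p_psi)**2)          # formula (C-26) for P
print(dX * dP)                                       # ~ 0.5 = hbar/2: a Gaussian saturates (C-28)
```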

C-6. Compatibility of observables

C-6-a. Compatibility and commutation rules

Consider two observables $A$ and $B$ which commute:

$$[A, B] = 0 \tag{C-29}$$

We shall assume for simplicity that both of their spectra are discrete. According to the theorem proved in § D-3-a of Chapter II, there exists a basis of the state space composed of eigenkets common to $A$ and $B$, which we shall denote by $\{|a_n b_p i\rangle\}$:

$$A|a_n b_p i\rangle = a_n|a_n b_p i\rangle\,; \qquad B|a_n b_p i\rangle = b_p|a_n b_p i\rangle \tag{C-30}$$

(the index $i$ allows us to distinguish, if necessary, between the different vectors corresponding to one pair of eigenvalues). Therefore, for any $a_n$ and $b_p$ (chosen, respectively, in the spectra of $A$ and $B$), there exists at least one state for which a measurement of $\mathscr{A}$ will always give $a_n$ and a measurement of $\mathscr{B}$ will always give $b_p$. Two such observables $A$ and $B$ which can be simultaneously determined are said to be compatible. On the other hand, if $A$ and $B$ do not commute, a state cannot in general⁴ be a simultaneous eigenvector of these two observables. They are said to be incompatible.

Let us examine more closely the measurement of two compatible observables $A$ and $B$ on a system which is initially in an arbitrary (normalized) state $|\psi\rangle$. This state can always be written:

$$|\psi\rangle = \sum_{n,p,i} c_{n,p,i}\,|a_n b_p i\rangle \tag{C-31}$$

⁴ Some kets may be simultaneous eigenvectors of $A$ and $B$. But there would not be a sufficient number of them to form a basis, as would be the case if $A$ and $B$ commuted.


First assume that we measure and then, immediately afterwards, (before the system has had time to evolve). Let us calculate the probability ( ) of obtaining in the first measurement and in the second one. We begin by measuring in the state ; the probability of finding is therefore: (

2

)=

(C-32)

When we then measure the state :

, the system is no longer in the state

but, if we have found

1

=

, in

(C-33) 2

The probability of obtaining therefore equal to: 1

( )=

2

)=

(

is

(C-34)

2

The probability ( we must first find (

when it is known that the first measurement has yielded

) sought corresponds to a “composite event”: to be in a favorable case, and then, having satisfied this first condition, find . Therefore: )

( )

(C-35)

Substituting into this formula expressions (C-32) and (C-34), we obtain: (

2

)=

(C-36)

Moreover, the state of the system becomes, immediately after the second measurement: 1

(C-37)

2

Therefore, if we decide to measure either or again, we are sure of the result ( or ): is an eigenvector common to and with the eigenvalues and respectively. Let us now return to the system in the state , and let us measure the two observables in the opposite order ( , then ). What is the probability ( ) of obtaining the same results as before? The reasoning is the same. We have here: (

)=

( )

(

)

(C-38)

From (C-31), we see that: 2

( )=

and that, after a measurement of =

1

(C-39) which yields

, the state of the system becomes: (C-40)

2


Therefore: (

)=

1

2

(C-41)

2

and finally: (

2

)=

If we have indeed found

(C-42) and then

, the system has gone into the state:

1

=

(C-43)

2

When two observables are compatible, the physical predictions are the same, whatever the order of performing the two measurements (provided that the time interval which separates them is sufficiently small). The probabilities of obtaining either then or then are identical: (

)=

(

2

)=

=

2

(C-44)

Moreover, the state of the system immediately after the two measurements is in both cases (if the results are and for and respectively): =

=

1

(C-45) 2

New measurements of or will yield the same values again without fail. The preceding discussion thus leads to the following result: when two observables and are compatible, the measurement of does not cause any loss of information previously obtained from a measurement of (and vice versa) but, on the contrary, adds to it. Moreover, the order of measuring the two observables and is of no importance. This last point, furthermore, enables us to envisage the simultaneous measurement of and . The fourth and fifth postulates can be generalized to the case of such a simultaneous measurement, as can be seen from formulas (C-44) and (C-45). To the result correspond the orthonormal eigenvectors . From this, (C-44) and (C-45) can be seen to be applications of postulates (B-7) and (B-30). On the other hand, if and do not commute, the preceding arguments are no longer valid. To understand this in a simple way, imagine that the state space is replaced by the two-dimensional space of real vectors. The vectors 1 and 2 in Figure 3 are eigenvectors of with eigenvalues 1 and 2 respectively; 1 and 2 are eigenvectors of with eigenvalues 1 and 2 respectively. Each of the two sets 1 2 and forms an orthonormal basis in . We shall therefore represent them in 1 2 Figure 3 by two pairs of perpendicular unit vectors. The fact that and do not commute implies that these two pairs do not coincide. The physical system under study is initially in the normalized state , which is represented in the figure by an arbitrary 234


[Figure 3: Diagram associated with the successive measurement of two non-compatible observables $A$ and $B$. The state vector of the system is $|\psi\rangle$. The eigenvectors of $A$ are $|u_1\rangle$ and $|u_2\rangle$ (eigenvalues $a_1$ and $a_2$), which are different from those of $B$, $|v_1\rangle$ and $|v_2\rangle$ (eigenvalues $b_1$ and $b_2$).]

unit vector. We measure and find, for example, 1 ; the system goes into the state We then measure and find, for example, 2 ; the state of the system becomes 2 : ( 1)

( 2)

=

=

1

1

.

(C-46)

2

If, on the other hand, we perform the measurements in the opposite order, obtaining the same results: ( 2)

( 1)

=

=

2

(C-47)

1

The final state of the system is not the same in both cases. We also see from Figure 3 that: (

1

2)

=

1

(

2

1)

=

2

Although (

2

1 1)

=

2 2

= (

1

2 1 2

2 2

, in general

(C-48) 1

=

2

2)

and: (C-49)

Therefore: two incompatible observables cannot be measured simultaneously. It can be seen from (C-46) and (C-47) that the second measurement causes the information supplied by the first one to be lost. If, for example, after the sequence represented in (C-46), we measure again, we can no longer be sure of the result since 2 is not an eigenvector of . All that was gained by the first measurement of is thus lost. C-6-b.

Preparation of a state

Let us consider a physical system in the state (whose spectrum we assume to be discrete).

and measure the observable


If the measurement yields a non-degenerate eigenvalue , the state of the system immediately after this measurement is the corresponding eigenvector . In this case, it suffices to know the result of this measurement to be able to determine unambiguously the state of the system after this measurement, as it does not depend on the initial ket. As we have already noted, at the end of § B-3-c, this is due to the fact that represents the same physical state as itself. The same does not hold true when the eigenvalue degenerate. In: =

1

found in the measurement is

(C-50) 2

=1

the absolute values of the coefficients and their relative phases are significant (§ B-3b- ). Since the are fixed when the initial state is specified, the state after the measurement therefore depends on . However, we saw in § C-6-a that two compatible observables and can be measured simultaneously. If the result ( ) of this combined measurement corresponds to only one eigenvector common to and , there is no summation over in formula (C-37), which becomes: =

(C-51)

This state is physically equivalent to . Again, specifying the result of the measurement uniquely determines the final state of the system, which is therefore independent of the initial ket . If there are associated with ( ) several eigenvectors of and , we can go back to the first argument, measuring, at the same time as and , a third observable which is compatible with both of them. We then arrive at the following conclusion: for the state of the system after a measurement to be completely defined uniquely by the result obtained, this measurement must be made on a complete set of commuting observables (§ D-3-b of Chapter II). This is the property which justifies physically the introduction of the concept of a C.S.C.O. The methods that can be used to prepare a system in a well-defined quantum state are analogous, in principle, to those used to obtain polarized light. When a polarizer is placed in the path of a light beam, the outgoing light is polarized along a direction which is characteristic of the polarizer and therefore independent of the state of polarization of the incoming light. Similarly, we can construct devices, intended to prepare a quantum system, in such a way that they only allow the passage of one state, corresponding to a particular eigenvalue for each of the observables of the complete set chosen. We shall study a concrete example of the preparation of a quantum system in Chapter IV (§ B-1). Comment:

The measurement of a C.S.C.O. enables us to prepare only one of the basis states associated with this C.S.C.O. However, it is obvious that changing the set of observables allows us to obtain other states of the system. We shall see explicitly in a concrete example, in § B-1 of Chapter IV, that we can prepare in this way any state of the space . 236

D. The physical implications of the Schrödinger equation

The Schrödinger equation plays a fundamental role in quantum mechanics since, according to the sixth postulate stated above, it is the equation that governs the time evolution of the physical system. In this section, we shall study in detail the most important properties of this equation.

D-1. General properties of the Schrödinger equation

D-1-a. Determinism in the evolution of physical systems

The Schrödinger equation:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle = H(t)\,|\psi(t)\rangle \tag{D-1}$$

is of first order in $t$. Consequently, given the initial state $|\psi(t_0)\rangle$, the state $|\psi(t)\rangle$ at any subsequent time $t$ is determined. There is no indeterminacy in the time evolution of a quantum system. Indeterminacy appears only when a physical quantity is measured, the state vector then undergoing an unpredictable modification (cf. fifth postulate). However, between two measurements, the state vector evolves in a perfectly deterministic way, in accordance with equation (D-1).

D-1-b. The superposition principle

Equation (D-1) is linear and homogeneous. It follows that its solutions are linearly superposable. Let $|\psi_1(t)\rangle$ and $|\psi_2(t)\rangle$ be two solutions of (D-1). If the initial state of the system is $|\psi(t_0)\rangle = \lambda_1|\psi_1(t_0)\rangle + \lambda_2|\psi_2(t_0)\rangle$ (where $\lambda_1$ and $\lambda_2$ are two complex constants), to it corresponds, at time $t$, the state $|\psi(t)\rangle = \lambda_1|\psi_1(t)\rangle + \lambda_2|\psi_2(t)\rangle$. The correspondence between $|\psi(t_0)\rangle$ and $|\psi(t)\rangle$ is therefore linear. We shall later study (Complement F_III) the properties of the linear operator $U(t, t_0)$ which transforms $|\psi(t_0)\rangle$ into $|\psi(t)\rangle$:

$$|\psi(t)\rangle = U(t, t_0)\,|\psi(t_0)\rangle \tag{D-2}$$

D-1-c. Conservation of probability

α. The norm of the state vector remains constant

Since the Hamiltonian operator $H(t)$ that appears in (D-1) is Hermitian, the square of the norm of the state vector, $\langle\psi(t)|\psi(t)\rangle$, does not depend on $t$, as we shall now show:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|\psi(t)\rangle = \left[\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|\right]|\psi(t)\rangle + \langle\psi(t)|\left[\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle\right] \tag{D-3}$$

According to (D-1), we can write:

$$\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle = \frac{1}{i\hbar}\,H(t)\,|\psi(t)\rangle \tag{D-4}$$

Taking the Hermitian conjugates of both sides of (D-4), we find:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)| = -\frac{1}{i\hbar}\,\langle\psi(t)|\,H^\dagger(t) = -\frac{1}{i\hbar}\,\langle\psi(t)|\,H(t) \tag{D-5}$$

since $H(t)$ is Hermitian (it is an observable). Substituting (D-4) and (D-5) into (D-3), we obtain:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|\psi(t)\rangle = -\frac{1}{i\hbar}\,\langle\psi(t)|H(t)|\psi(t)\rangle + \frac{1}{i\hbar}\,\langle\psi(t)|H(t)|\psi(t)\rangle = 0 \tag{D-6}$$

The property of norm conservation is very useful in quantum mechanics. For example, it becomes indispensable when we interpret the square of the modulus $|\psi(\mathbf{r}, t)|^2$ of the wave function of a spinless particle as being the position probability density. The fact that the state $|\psi(t_0)\rangle$ of the particle is normalized at time $t_0$ is expressed by the relation:

$$\langle\psi(t_0)|\psi(t_0)\rangle = \int\mathrm{d}^3r\,\left|\psi(\mathbf{r}, t_0)\right|^2 = 1 \tag{D-7}$$

where $\psi(\mathbf{r}, t_0) = \langle\mathbf{r}|\psi(t_0)\rangle$ is the wave function associated with $|\psi(t_0)\rangle$. Equation (D-7) means that the total probability of finding the particle somewhere in all space is equal to 1. The property of conservation of the norm which we have just proved is expressed by the equation:

$$\langle\psi(t)|\psi(t)\rangle = \int\mathrm{d}^3r\,\left|\psi(\mathbf{r}, t)\right|^2 = \langle\psi(t_0)|\psi(t_0)\rangle = 1 \tag{D-8}$$

where $|\psi(t)\rangle$ is the solution of (D-1) corresponding to the initial state $|\psi(t_0)\rangle$. In other words, time evolution does not modify the global probability of finding the particle in all space, which always remains equal to 1. Thus $|\psi(\mathbf{r}, t)|^2$ can indeed be interpreted as a probability density.
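Norm conservation is straightforward to see numerically: for a Hermitian, time-independent Hamiltonian, the evolution operator of (D-2) is a matrix exponential, and it preserves the norm. The small Hamiltonian and the units ($\hbar = 1$) below are illustrative assumptions, not taken from the text.

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (M + M.conj().T) / 2                      # Hermitian Hamiltonian

psi0 = np.array([1.0, 1.0j, 0.5])
psi0 /= np.linalg.norm(psi0)                  # <psi(t0)|psi(t0)> = 1

for t in (0.5, 1.0, 5.0):
    U = expm(-1j * H * t / hbar)              # U(t, t0) for a time-independent H
    psi_t = U @ psi0
    print(np.linalg.norm(psi_t))              # remains equal to 1 at every time, cf. (D-8)
```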

Local conservation of probability. Probability densities and probability currents

In this paragraph, we shall confine ourselves to the case of a physical system composed of only one (spinless) particle. In this case, if $\psi(\mathbf{r}, t)$ is normalized,

$$\rho(\mathbf{r}, t) = \left|\psi(\mathbf{r}, t)\right|^2 \tag{D-9}$$

is a probability density: the probability $\mathrm{d}\mathscr{P}(\mathbf{r}, t)$ of finding, at time $t$, the particle in an infinitesimal volume $\mathrm{d}^3r$ located at point $\mathbf{r}$ is equal to:

$$\mathrm{d}\mathscr{P}(\mathbf{r}, t) = \rho(\mathbf{r}, t)\,\mathrm{d}^3r \tag{D-10}$$

We have just shown that the integral of (r ) over all space remains constant for all time (and equal to 1 if is normalized). This does not mean that (r ) must be independent of at every point r. The situation is analogous to the one encountered in electromagnetism. If, in an isolated physical system, there is a charge distributed in space with the volume density (r ), the total charge [the integral of (r ) over all space] is conserved in time. However, within the system, the spatial distribution of this charge may vary, giving rise to electric currents. 238

D. THE PHYSICAL IMPLICATIONS OF THE SCHRÖDINGER EQUATION

In fact, this analogy can be carried further. Global conservation of electrical charge is based on local conservation. If the charge contained within a fixed volume $V$ varies over time, the closed surface $S$ which limits $V$ must be traversed by an electric current. More precisely, the variation $\mathrm{d}Q$ during a time $\mathrm{d}t$ of the charge contained within $V$ is equal to $-I\,\mathrm{d}t$, where $I$ is the intensity of the current traversing $S$, that is, the flux of the vector current density $\mathbf{J}(\mathbf{r}, t)$ leaving $S$. Using classical vector analysis, we can express local conservation of electrical charge in the form:

$$\frac{\partial}{\partial t}\rho(\mathbf{r}, t) + \mathrm{div}\,\mathbf{J}(\mathbf{r}, t) = 0 \tag{D-11}$$

We are going to show that it is possible to find a vector $\mathbf{J}(\mathbf{r}, t)$, a probability current, which satisfies an equation identical to (D-11): there is then local conservation of probability. It is as if we were dealing with a "probability fluid" whose density and motion are described by $\rho(\mathbf{r}, t)$ and $\mathbf{J}(\mathbf{r}, t)$. If the probability of finding the particle in the (fixed) volume $\mathrm{d}^3r$ about $\mathbf{r}$ varies over time, it means that the probability current has a non-zero flux across the surface which limits this volume element.

First of all, let us assume that the particle under study is subjected to a scalar potential $V(\mathbf{r}, t)$. Its Hamiltonian is then:

$$H = \frac{\mathbf{P}^2}{2m} + V(\mathbf{R}, t) \tag{D-12}$$

and the Schrödinger equation is written, in the $\{|\mathbf{r}\rangle\}$ representation (see Complement D_II):

$$i\hbar\,\frac{\partial}{\partial t}\psi(\mathbf{r}, t) = -\frac{\hbar^2}{2m}\,\Delta\psi(\mathbf{r}, t) + V(\mathbf{r}, t)\,\psi(\mathbf{r}, t) \tag{D-13}$$

$V(\mathbf{r}, t)$ must be real for $H$ to be Hermitian. The complex conjugate equation of (D-13) is therefore:

$$-i\hbar\,\frac{\partial}{\partial t}\psi^*(\mathbf{r}, t) = -\frac{\hbar^2}{2m}\,\Delta\psi^*(\mathbf{r}, t) + V(\mathbf{r}, t)\,\psi^*(\mathbf{r}, t) \tag{D-14}$$

Multiply both sides of (D-13) by $\psi^*(\mathbf{r}, t)$ and both sides of (D-14) by $-\psi(\mathbf{r}, t)$. Add the two equations thus obtained. It follows that:

$$i\hbar\,\frac{\partial}{\partial t}\left[\psi^*(\mathbf{r}, t)\,\psi(\mathbf{r}, t)\right] = -\frac{\hbar^2}{2m}\left[\psi^*\,\Delta\psi - \psi\,\Delta\psi^*\right] \tag{D-15}$$

that is:

$$\frac{\partial}{\partial t}\rho(\mathbf{r}, t) + \frac{\hbar}{2mi}\left[\psi^*(\mathbf{r}, t)\,\Delta\psi(\mathbf{r}, t) - \psi(\mathbf{r}, t)\,\Delta\psi^*(\mathbf{r}, t)\right] = 0 \tag{D-16}$$

If we set:

$$\mathbf{J}(\mathbf{r}, t) = \frac{\hbar}{2mi}\left[\psi^*\,\boldsymbol{\nabla}\psi - \psi\,\boldsymbol{\nabla}\psi^*\right] = \frac{1}{m}\,\mathrm{Re}\left[\psi^*\,\frac{\hbar}{i}\,\boldsymbol{\nabla}\psi\right] \tag{D-17}$$

equation (D-16) can be put into the form of (D-11), since:

$$\mathrm{div}\,\mathbf{J}(\mathbf{r}, t) = \boldsymbol{\nabla}\cdot\mathbf{J} = \frac{\hbar}{2mi}\left[(\boldsymbol{\nabla}\psi^*)\cdot(\boldsymbol{\nabla}\psi) + \psi^*\,\nabla^2\psi - (\boldsymbol{\nabla}\psi)\cdot(\boldsymbol{\nabla}\psi^*) - \psi\,\nabla^2\psi^*\right] = \frac{\hbar}{2mi}\left[\psi^*\,\Delta\psi - \psi\,\Delta\psi^*\right] \tag{D-18}$$

We have therefore proved the equation of local conservation of probability and have found the expression for the probability current in terms of the normalized wave function $\psi(\mathbf{r}, t)$.

Comment:

The form of the probability current (D-17) can be interpreted as follows. $\mathbf{J}(\mathbf{r}, t)$ appears as the mean value, in the state $|\psi(t)\rangle$, of an operator $\mathbf{K}(\mathbf{r})$ given by:

$$\mathbf{K}(\mathbf{r}) = \frac{1}{2}\left[\,|\mathbf{r}\rangle\langle\mathbf{r}|\,\frac{\mathbf{P}}{m} + \frac{\mathbf{P}}{m}\,|\mathbf{r}\rangle\langle\mathbf{r}|\,\right] \tag{D-19}$$

Now the mean value of the operator $|\mathbf{r}\rangle\langle\mathbf{r}|$ is $|\psi(\mathbf{r}, t)|^2$, that is, the probability density $\rho(\mathbf{r}, t)$, and $\dfrac{\mathbf{P}}{m}$ is the velocity operator. Therefore, $\mathbf{K}$ is the quantum operator constructed, with the help of an appropriate symmetrization, from the product of the probability density and the velocity of the particle. This indeed corresponds to the vector current density of a classical fluid (it is well known, for example, that the electrical current density associated with a fluid of charged particles is equal to the product of the charge volume density and the velocity of the particles).

If the particle is placed in an electromagnetic field described by the potentials $U(\mathbf{r}, t)$ and $\mathbf{A}(\mathbf{r}, t)$, we can use the preceding argument, starting with the Hamiltonian (B-46). We then find, in this case:

$$\mathbf{J}(\mathbf{r}, t) = \frac{1}{m}\,\mathrm{Re}\left\{\psi^*\left[\frac{\hbar}{i}\,\boldsymbol{\nabla} - q\mathbf{A}\right]\psi\right\} \tag{D-20}$$

We see that this expression can be obtained from (D-17), using the same rule as was used for the Hamiltonian: $\mathbf{P}$ is simply replaced by $\mathbf{P} - q\mathbf{A}$.

Example of a plane wave. Consider a wave function of the form:

$$\psi(\mathbf{r}, t) = A\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{D-21}$$

with $\hbar\omega = \dfrac{\hbar^2 k^2}{2m}$. The corresponding probability density:

$$\rho(\mathbf{r}, t) = \left|\psi(\mathbf{r}, t)\right|^2 = |A|^2 \tag{D-22}$$

is uniform throughout all space and does not depend on time. The calculation of $\mathbf{J}(\mathbf{r}, t)$ from (D-17) presents no difficulties and leads to:

$$\mathbf{J}(\mathbf{r}, t) = |A|^2\,\frac{\hbar\mathbf{k}}{m} = \rho(\mathbf{r}, t)\,\mathbf{v}_G \tag{D-23}$$

where $\mathbf{v}_G = \dfrac{\hbar\mathbf{k}}{m}$ is the group velocity associated with the momentum $\hbar\mathbf{k}$ (Chap. I, § C-4). We see that the probability current is indeed equal to the product of the probability density and the group velocity of the particle. In this case, $\rho$ and $\mathbf{J}$ are time-independent: the flow of the probability fluid associated with a plane wave is in a steady-state condition (since $\rho$ and $\mathbf{J}$ do not depend on $\mathbf{r}$ either, this state is also homogeneous and uniform).
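The plane-wave result (D-23) can be reproduced directly from (D-17) on a grid. The one-dimensional discretization, the value of $k$ and the units ($\hbar = m = 1$) below are assumptions made for the illustration.

```python
import numpy as np

hbar = m = 1.0
k = 2.0
x = np.linspace(0, 10, 1001)
dx = x[1] - x[0]

psi = np.exp(1j * k * x)                       # plane wave with amplitude A = 1, so rho = 1
dpsi = np.gradient(psi, dx)

# One-dimensional version of (D-17): J = (1/m) Re[ psi* (hbar/i) d(psi)/dx ]
J = np.real(np.conj(psi) * (hbar / 1j) * dpsi) / m

print(J[500], hbar * k / m)                    # both ~ 2.0: J = rho * hbar k / m, cf. (D-23)
```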

D-1-d. Evolution of the mean value of an observable; relationship with classical mechanics

Let $A$ be an observable. If the state $|\psi(t)\rangle$ of the system is normalized (and we have just seen that this normalization is conserved for all $t$), the mean value of the observable $A$ at the instant $t$ is equal to⁵:

$$\langle A\rangle(t) = \langle\psi(t)|A|\psi(t)\rangle \tag{D-24}$$

We see that $\langle A\rangle(t)$ depends on $t$ through $|\psi(t)\rangle$ [and $\langle\psi(t)|$], which evolve over time according to the Schrödinger equation (D-4) [and (D-5)]. Moreover, the observable $A$ may depend explicitly on time, causing an additional variation of the mean value $\langle A\rangle(t)$ with respect to $t$. We intend to study, in this section, the evolution of $\langle A\rangle(t)$ and to show how this enables us to relate classical mechanics to quantum mechanics.

α. General formula

Differentiating (D-24) with respect to $t$, we obtain:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|A|\psi(t)\rangle = \left[\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|\right]A\,|\psi(t)\rangle + \langle\psi(t)|A\left[\frac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle\right] + \langle\psi(t)|\frac{\partial A}{\partial t}|\psi(t)\rangle \tag{D-25}$$

Using (D-4) and (D-5) for $\dfrac{\mathrm{d}}{\mathrm{d}t}|\psi(t)\rangle$ and $\dfrac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|$, we find:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\psi(t)|A|\psi(t)\rangle = \frac{1}{i\hbar}\,\langle\psi(t)|\left[A, H(t)\right]|\psi(t)\rangle + \langle\psi(t)|\frac{\partial A}{\partial t}|\psi(t)\rangle \tag{D-26}$$

that is:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle A\rangle = \frac{1}{i\hbar}\,\left\langle\left[A, H(t)\right]\right\rangle + \left\langle\frac{\partial A}{\partial t}\right\rangle \tag{D-27}$$

Comment:

The mean value is a number which depends only on . It is essential to understand how this dependence arises. For example, consider the case of a spinless 5 The

notation

( ) means that the mean value

of

is a number which depends on .


particle. Let (r p ) be a classical quantity. In classical mechanics, r and p depend on time (they evolve according to the Hamilton equations), so that (r p ) depends on explicitly, and implicitly through r and p. To the classical quantity (r p ) corresponds the Hermitian operator = (R P ), obtained by replacing, in , r and p by the operators R and P, and by symmetrizing the operators if necessary (quantization rules, see § B-5). The eigenstates and eigenvalues of R and P and, consequently, these observables themselves, no longer depend on . The time dependence of r and p, which characterizes the time evolution of the classical state, no longer appears in R and P, but in the quantum state vector ( ) , associated in the r representation with the wave function (r ) = r ( ) . In this representation, the mean value of is written: d3

=

(r )

r

~



(r )

(D-28)

It is clear that integration over r leads to a number that only depends on . With ~ regard to classical mechanics, it is this number [and not the operator (r ∇ )] that must be compared with the value taken on by the classical quantity at time (cf. § D-1-d- below).

.

(r p )

β. Application to the observables R and P (Ehrenfest's theorem)

Now let us apply the general formula (D-27) to the observables $\mathbf{R}$ and $\mathbf{P}$. We shall consider, for simplicity, the case of a spinless particle in a scalar stationary potential $V(\mathbf{r})$. We then have:

$$H = \frac{\mathbf{P}^2}{2m} + V(\mathbf{R}) \tag{D-29}$$

so that we can write:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\mathbf{R}\rangle = \frac{1}{i\hbar}\left\langle\left[\mathbf{R}, H\right]\right\rangle = \frac{1}{i\hbar}\left\langle\left[\mathbf{R}, \frac{\mathbf{P}^2}{2m}\right]\right\rangle \tag{D-30}$$

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\mathbf{P}\rangle = \frac{1}{i\hbar}\left\langle\left[\mathbf{P}, H\right]\right\rangle = \frac{1}{i\hbar}\left\langle\left[\mathbf{P}, V(\mathbf{R})\right]\right\rangle \tag{D-31}$$

The commutator that appears in (D-30) can easily be calculated from the canonical commutation relations; we obtain:

$$\left[\mathbf{R}, \frac{\mathbf{P}^2}{2m}\right] = \frac{i\hbar}{m}\,\mathbf{P} \tag{D-32}$$

For the one in formula (D-31), the following generalization of formula (B-33) must be used [cf. Complement B_II, formula (48)]:

$$\left[\mathbf{P}, V(\mathbf{R})\right] = -i\hbar\,\boldsymbol{\nabla}V(\mathbf{R}) \tag{D-33}$$

where $\boldsymbol{\nabla}V(\mathbf{R})$ denotes the set of three operators obtained by replacing $\mathbf{r}$ by $\mathbf{R}$ in the three components of the gradient of the function $V(\mathbf{r})$. Therefore:

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\mathbf{R}\rangle = \frac{1}{m}\,\langle\mathbf{P}\rangle \tag{D-34}$$

$$\frac{\mathrm{d}}{\mathrm{d}t}\langle\mathbf{P}\rangle = -\left\langle\boldsymbol{\nabla}V(\mathbf{R})\right\rangle \tag{D-35}$$

These two equations express Ehrenfest's theorem. Their form recalls that of the classical Hamilton-Jacobi equations for a particle (Appendix III, § 3):

$$\frac{\mathrm{d}}{\mathrm{d}t}\mathbf{r} = \frac{1}{m}\,\mathbf{p} \tag{D-36a}$$

$$\frac{\mathrm{d}}{\mathrm{d}t}\mathbf{p} = -\boldsymbol{\nabla}V(\mathbf{r}) \tag{D-36b}$$

which reduce, in this simple case, to Newton's well-known equation:

$$\frac{\mathrm{d}\mathbf{p}}{\mathrm{d}t} = m\,\frac{\mathrm{d}^2\mathbf{r}}{\mathrm{d}t^2} = -\boldsymbol{\nabla}V(\mathbf{r}) \tag{D-37}$$
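For a quadratic potential the mean position follows the classical trajectory exactly, which makes Ehrenfest's theorem easy to check numerically. The sketch below is illustrative only: the finite-difference Hamiltonian, the grid, the initial wave packet and the units ($\hbar = m = \omega = 1$) are all assumptions for the example.

```python
import numpy as np
from scipy.linalg import expm

hbar = m = omega = 1.0
N = 400
x = np.linspace(-10, 10, N)
dx = x[1] - x[0]

# Finite-difference H = P^2/2m + V(X) with V(x) = (1/2) m omega^2 x^2 (harmonic oscillator)
lap = (np.diag(np.ones(N - 1), 1) - 2 * np.eye(N) + np.diag(np.ones(N - 1), -1)) / dx**2
H = -hbar**2 / (2 * m) * lap + np.diag(0.5 * m * omega**2 * x**2)

# Displaced Gaussian packet: <X>(0) = 2, <P>(0) = 0
psi = np.exp(-(x - 2.0)**2 / 2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

t = 1.0
psi_t = expm(-1j * H * t / hbar) @ psi
x_mean = np.sum(np.abs(psi_t)**2 * x) * dx
print(x_mean, 2.0 * np.cos(omega * t))   # ~ equal: <X>(t) follows the classical x(t) = x0 cos(omega t)
```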

γ. Discussion of Ehrenfest's theorem; classical limit

Let us analyze the physical meaning of Ehrenfest’s theorem, that is, equations (D34) and (D-35). We shall assume that the wave function (r ) describing the state of the particle is a wave packet like those studied in Chapter I. R then represents a set of three time-dependent numbers . We shall call the point R ( ) the center of the wave packet 6 at the instant . The set of points corresponding to various values of constitutes the trajectory of the center of the wave packet. Recall, however, that one can never rigorously speak of the trajectory of the particle itself, whose state is described by the wave packet as a whole, which inevitably has a certain spatial extension. We see, nevertheless, that if this extension is much smaller than the other distances involved in the problem, we can approximate the wave packet by its center. In this limiting case, there is no appreciable difference between the quantum and classical descriptions of the particle. It is therefore important to know the answer to the following question: does the motion of the center of the wave packet obey the laws of classical mechanics? This answer is supplied by Ehrenfest’s theorem. Equation (D-34) expresses the fact that the velocity of the center of the wave packet is equal to the average momentum of this wave packet d2 divided by . Consequently, the left-hand side of (D-35) can be written R , so d2 that the answer to the preceding question will be affirmative if the right-hand side of (D-35) is equal to the classical force F at the point where the center of the wave packet is situated: F = [ ∇ (r)]r=

R

(D-38)

6 The

center and the maximum of a wave packet are, in general, distinct. They coincide, however, if the wave packet has a symmetrical shape (§ C-5, Fig. 2).


In fact, the right-hand side of (D-35) is equal to the average of the force over the whole wave packet, and, in general: ∇ (R) = [∇ (r)]r=

(D-39)

R

(in other words, the mean value of a function is not equal to its value for the mean value of the variable). Strictly speaking, the answer to the question we asked is therefore negative.

Comment: It is easy to convince ourselves of (D-39) if we consider a concrete example. Let us choose, for simplicity, a one-dimensional model, and assume that: ( )=

(D-40)

where is a real constant and , a positive integer. From this we deduce the operator associated with ( ): ( )=

(D-41)

d The left-hand side of (D-39) can be written (replacing ∇ by ) d right-hand side, it is equal to: d d

=[

1

]

=

1

=

1

. As for the

(D-42)

=

1 1 Now we know that in general = ; for example, for = 3, we have 2 2 = (since the difference between these two quantities enters into the calculation of the root mean square deviation ∆ ). 1 1 Note however that for = 1 or 2, = . The two sides of (D-39) are then equal. The same holds true, moreover, for = 0, in which case both sides are equal to zero. For a free particle ( = 0), or a particle placed in a uniform force field ( = 1) or in a parabolic potential well ( = 2, the case of a harmonic oscillator), the motion of the center of the wave packet therefore rigorously obeys the laws of classical mechanics. We have already established this result for the free particle ( = 0) (cf. Chapter I, § C-4).

Although the two sides of (D-39) are not, in general, equal, there exist situations (called quasi-classical) where the difference between these two quantities is negligible: this is the case when the wave packet is sufficiently localized. To see this, let us write explicitly, in the r representation, the left-hand side of this equation: ∇ (R) =

d3

(r ) [∇ (r)]

=

d3

(r ) 2 ∇ (r)

(r ) (D-43)

Let us assume the wave packet to be highly localized: more precisely, (r ) 2 takes on non-negligible values only within a domain whose dimensions are much smaller than the distances over which (r) varies appreciably. Then, within this domain, centered about 244


R , ∇ (r) is practically constant. Therefore, in (D-43), ∇ (r) can be replaced by its value for r = R and taken outside the integral, which is then equal to 1, since (r ) is normalized. Thus we find that for sufficiently localized wave packets: ∇ (R)

[∇ (r)]r=

(D-44)

R

In the macroscopic limit (where the de Broglie wavelengths are much smaller than the distances over which the potential varies7 ), wave packets can be made sufficiently small to satisfy (D-44) while retaining a good degree of definition for the momentum. The motion of the wave packet is then practically that of a classical particle of mass placed in the potential (r). The result that we have thus established is very important since it enables us to show that the equations of classical mechanics follow from the Schrödinger equation in certain limiting conditions satisfied, in particular, by most macroscopic systems. D-2.

The case of conservative systems

When the Hamiltonian of a physical system does not depend explicitly on time, the system is said to be conservative. In classical mechanics, the most important consequence of such a situation is the conservation of energy over time. It can also be said that the total energy of the system is a constant of the motion. We shall see in this section that, in quantum mechanics as well, conservative systems possess important special properties in addition to the general properties of the preceding section.

D-2-a. Solution of the Schrödinger equation

First, let us consider the eigenvalue equation for $H$:

$$H|\varphi_{n,\tau}\rangle = E_n|\varphi_{n,\tau}\rangle \tag{D-45}$$

For simplicity, we assume the spectrum of $H$ to be discrete; $\tau$ denotes the set of indices other than $n$ that are necessary for characterizing a unique vector (in general, these indices will determine the eigenvalues of operators forming a C.S.C.O. with $H$). Since, by hypothesis, $H$ does not depend explicitly on time, neither the eigenvalue $E_n$ nor the eigenket $|\varphi_{n,\tau}\rangle$ is time-dependent.

First, we are going to show that, given the $E_n$ and the $|\varphi_{n,\tau}\rangle$, it is very simple to solve the Schrödinger equation, that is, to determine the time evolution of any state. Since the $|\varphi_{n,\tau}\rangle$ form a basis ($H$ is an observable), it is always possible, for every value of $t$, to expand any state $|\psi(t)\rangle$ of the system in terms of the $|\varphi_{n,\tau}\rangle$:

$$|\psi(t)\rangle = \sum_n\sum_\tau c_{n,\tau}(t)\,|\varphi_{n,\tau}\rangle \tag{D-46}$$

with:

$$c_{n,\tau}(t) = \langle\varphi_{n,\tau}|\psi(t)\rangle \tag{D-47}$$

Since the $|\varphi_{n,\tau}\rangle$ do not depend on $t$, all the time dependence of $|\psi(t)\rangle$ is contained within the $c_{n,\tau}(t)$. To calculate the $c_{n,\tau}(t)$, let us project the Schrödinger equation onto each of the states $|\varphi_{n,\tau}\rangle$. This yields⁸:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}\langle\varphi_{n,\tau}|\psi(t)\rangle = \langle\varphi_{n,\tau}|H|\psi(t)\rangle \tag{D-48}$$

Since $H$ is Hermitian, it can be deduced from (D-45) that:

$$\langle\varphi_{n,\tau}|H = E_n\,\langle\varphi_{n,\tau}| \tag{D-49}$$

so that (D-48) can be written in the form:

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}c_{n,\tau}(t) = E_n\,c_{n,\tau}(t) \tag{D-50}$$

This equation can be integrated directly to give:

$$c_{n,\tau}(t) = c_{n,\tau}(t_0)\,e^{-iE_n(t - t_0)/\hbar} \tag{D-51}$$

When $H$ does not depend explicitly on time, to find $|\psi(t)\rangle$, given $|\psi(t_0)\rangle$, proceed as follows:

(i) Expand $|\psi(t_0)\rangle$ in terms of a basis of eigenstates of $H$:

$$|\psi(t_0)\rangle = \sum_n\sum_\tau c_{n,\tau}(t_0)\,|\varphi_{n,\tau}\rangle \tag{D-52}$$

$c_{n,\tau}(t_0)$ is given by the usual formula:

$$c_{n,\tau}(t_0) = \langle\varphi_{n,\tau}|\psi(t_0)\rangle \tag{D-53}$$

(ii) Now, to obtain $|\psi(t)\rangle$ for arbitrary $t$, multiply each coefficient $c_{n,\tau}(t_0)$ of the expansion (D-52) by $e^{-iE_n(t - t_0)/\hbar}$, where $E_n$ is the eigenvalue of $H$ associated with the state $|\varphi_{n,\tau}\rangle$:

$$|\psi(t)\rangle = \sum_n\sum_\tau c_{n,\tau}(t_0)\,e^{-iE_n(t - t_0)/\hbar}\,|\varphi_{n,\tau}\rangle \tag{D-54}$$

The preceding argument can easily be generalized to the case where the spectrum of $H$ is continuous; formula (D-54) then becomes, with obvious notation:

$$|\psi(t)\rangle = \sum_\tau\int\mathrm{d}E\;c_\tau(E, t_0)\,e^{-iE(t - t_0)/\hbar}\,|\varphi_{E,\tau}\rangle \tag{D-55}$$

⁷ See the order of magnitude of the de Broglie wavelengths associated with a macroscopic system in Complement A_I.

⁸ In (D-48), $\langle\varphi_{n,\tau}|$ can be placed to the right of $\dfrac{\mathrm{d}}{\mathrm{d}t}$, since $\langle\varphi_{n,\tau}|$ does not depend on $t$.

D-2-b. Stationary states

An important special case is that in which $|\psi(t_0)\rangle$ is itself an eigenstate of $H$. Expansion (D-52) of $|\psi(t_0)\rangle$ then involves only eigenstates of $H$ with the same eigenvalue (for example, $E_n$):

$$|\psi(t_0)\rangle = \sum_\tau c_{n,\tau}(t_0)\,|\varphi_{n,\tau}\rangle \tag{D-56}$$

In formula (D-56), there is no summation over $n$, and the passage from $|\psi(t_0)\rangle$ to $|\psi(t)\rangle$ involves only one factor $e^{-iE_n(t - t_0)/\hbar}$, which can be taken outside the summation over $\tau$:

$$|\psi(t)\rangle = \sum_\tau c_{n,\tau}(t_0)\,e^{-iE_n(t - t_0)/\hbar}\,|\varphi_{n,\tau}\rangle = e^{-iE_n(t - t_0)/\hbar}\sum_\tau c_{n,\tau}(t_0)\,|\varphi_{n,\tau}\rangle = e^{-iE_n(t - t_0)/\hbar}\,|\psi(t_0)\rangle \tag{D-57}$$

$|\psi(t)\rangle$ and $|\psi(t_0)\rangle$ therefore differ only by the global phase factor $e^{-iE_n(t - t_0)/\hbar}$. These two states are physically indistinguishable (cf. discussion in § B-3-b). From this we conclude that all the physical properties of a system which is in an eigenstate of $H$ do not vary over time; the eigenstates of $H$ are called, for this reason, stationary states.
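Result (D-57) can be seen directly in a small numerical example: an eigenstate of $H$ only acquires a global phase under time evolution, so all probabilities remain constant. The two-level Hamiltonian and the units ($\hbar = 1$) below are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[1.0, 0.3], [0.3, 2.0]])        # Hermitian, time-independent Hamiltonian

E, V = np.linalg.eigh(H)
phi_n = V[:, 0]                               # a stationary state, with eigenvalue E[0]

t = 3.7
psi_t = expm(-1j * H * t / hbar) @ phi_n
phase = np.exp(-1j * E[0] * t / hbar)

print(np.allclose(psi_t, phase * phi_n))      # True: the evolution is a global phase, cf. (D-57)
print(np.abs(psi_t)**2, np.abs(phi_n)**2)     # identical probability distributions
```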

It is also interesting to see how conservation of energy in a conservative system appears in quantum mechanics. Let us assume that, at time 0 , we measure the energy of such a system and we find, for example: . Immediately after the measurement, the system is in an eigenstate of , with an eigenvalue of (the postulate of the reduction of the wave packet). We have just seen that the eigenstates of are stationary states. Therefore, the state of the system will no longer evolve after the first measurement and will always remain an eigenstate of with an eigenvalue of . It follows that a second measurement of the energy of the system, at any subsequent time , will always yield the same result as the first one.

Comment:

One passes from (D-52) to (D-54) by multiplying each coefficient ( 0 ) of (D-52) ( ( 0) ~ 0) ~ by e . The fact that e is a phase factor should not lead us to believe that ( ) and ( 0 ) are always physically indistinguishable. Actually, expansion (D-52) involves, in general, several eigenstates of with different eigenvalues. To these different possible values of correspond different phase factors. This modifies the relative phases of the expansion coefficients of the state vector and leads, consequently, to a state ( ) which is physically distinct from ( 0) . Only in the case where a single value of enters into (D-52) [the case where ( 0 ) is an eigenstate of ] is the time evolution described by a single phase factor, which is then a global one, of no physical importance. In other words, there is physical evolution over time only if the energy of the initial state is not known with certainty. We shall come back later to the relation between time evolution and energy uncertainty (cf. § D-2-e). 247

D-2-c. Constants of the motion

By definition, a constant of the motion is an observable explicitly on time and which commutes with :

which does not depend

=0 (D-58) [

]=0

For a conservative system, is therefore itself a constant of the motion. Constants of the motion possess important properties which we are now going to derive. ( ) If we substitute (D-58) into the general formula (D-27), we find: d d

=

d d

()

() =0

(D-59)

Whatever the state ( ) of the physical system, the mean value of in this state does not evolve over time (hence the term “constant of the motion”). ( ) Since and are two observables which commute, we can always find for them a system of common eigenvectors, which we shall denote by : = =

(D-60)

We shall assume for simplicity that the spectra of and are discrete. The index fixes the eigenvalues of observables which form a C.S.C.O. with and . Since the states are eigenstates of , they are stationary states. If the system is in the state at the initial instant, it will therefore remain there indefinitely (to within a global phase factor). But the state is an eigenstate of as well. Consequently, when is a constant of the motion, there exist stationary states of the physical system (the states ) that always remain, for all , eigenstates of with the same eigenvalue . The eigenvalues of are called, for this reason, good quantum numbers. ( ) Finally, let us show that for an arbitrary state ( ) , the probability of finding the eigenvalue , when the constant of the motion is measured, is not time-dependent. ( 0 ) can always be expanded on the basis introduced above: ( 0) =

( 0)

(D-61)

From this we directly deduce: () =

()

(D-62)

with: ()= 248

( 0) e

(

0)

~

(D-63)

D. THE PHYSICAL IMPLICATIONS OF THE SCHRÖDINGER EQUATION

According to the postulate of spectral decomposition, the probability ( 0 ) of finding when is measured at time 0 , on the system in the state ( 0 ) , is equal to: (

0)

( 0) 2

=

(D-64)

Similarly: (

( )2

)=

(D-65)

Now we see from (D-63) that ( ) and ( 0 ) have the same modulus. Therefore, ( )= ( ), which proves the property stated above. 0

Comment:

If all but one of the probabilities ( ( 0 ) are zero [leaving for example 0) non-zero and, moreover, necessarily equal to 1], the physical system at time 0 is in an eigenstate of with an eigenvalue of . Since the ( ) do not depend on , the state of the system at time remains an eigenstate of with an eigenvalue of . D-2-d.

Bohr frequencies of a system. Selection rules

Let be an arbitrary observable of the system under consideration (it does not necessarily commute with ). Formula (D-27) enables us to calculate the derivative d of the mean value of : d d d

=

1 [ ~

] +

(D-66)

For a conservative system, we know the general form (D-54) of

( ) . Therefore, in this d case, we can calculate explicitly ( ) ( ) (and not merely ). d The Hermitian conjugate expression of (D-54) is written (changing the summation indices): () =

( 0) e

(

0)

We can then, in ( ) ( ) , replace respectively. Thus we obtain: ()

() = =

~

(D-67) ( ) and

( ) by expansions (D-54) and (D-67),

() ( 0)

( 0)

e(

)(

0)

~

(D-68) 249


From now on, we shall assume that does not depend explicitly on time: the matrix elements are therefore constant. Formula (D-68) then shows that the evolution of ( ) is described by a series of oscillating terms, whose frequencies 1 = = are characteristic of the system under consideration 2 ~ but independent of and of the initial state of the system. The frequencies are called the Bohr frequencies of the system. Thus, for an atom, the mean values of all the atomic quantities (electric and magnetic dipole moments, etc...) oscillate at the various Bohr frequencies of the atom. It is reasonable to imagine that only these frequencies can be radiated or absorbed by the atom. This remark allows us to understand intuitively the Bohr relation between the spectral frequencies emitted or absorbed and the differences in atomic energies. It can also be seen from (D-68) that, while the frequencies involved in the motion of ( ) are independent of , the same does not hold true for the respective weights of these frequencies in the variation of . The importance of each frequency depends on the matrix elements . In particular, if these matrix elements are zero for certain values of and , the corresponding frequencies are absent from the expansion of ( ), whatever the initial state of the system. This is the origin of the selection rules which indicate what frequencies can be emitted or absorbed under given conditions. To establish these rules, one must study the non-diagonal matrix elements ( = ) of the various atomic operators such as the electric and magnetic dipoles, etc... Finally, the weights of the various Bohr frequencies also depend on the initial state, via ( 0) ( 0 ). In particular, if the initial state is a stationary state of energy , the expansion of ( 0 ) contains only one value of ( = ) and ( 0) ( 0 ) can be non-zero only for = = . In this case, is not time-dependent.

Comment:

It can be directly verified, using (D-68), that the mean value of a constant of the motion is always time-independent. We see that if B commutes with H, the matrix elements of B are zero between two eigenstates of H that correspond to different eigenvalues (cf. Chap. II, § D-3-a). It follows that ⟨φ_{n',τ'}|B|φ_{n,τ}⟩ is zero for E_{n'} ≠ E_n. The only terms of (D-68) that are non-zero are thus constant.

D-2-e. The time-energy uncertainty relation

We shall now see that for a conservative system, the greater the energy uncertainty, the more rapid the time evolution. More precisely, if ∆t is a time interval at the end of which the system has evolved to an appreciable extent, and if ∆E denotes the energy uncertainty, ∆t and ∆E satisfy the relation:

∆E · ∆t ≳ ħ    (D-69)

First, if the system is in an eigenstate of H, its energy is perfectly well-defined: ∆E = 0. But we have seen that such a state is stationary, meaning that it does not evolve in time; in a sense, its evolution time ∆t is infinite [relation (D-69) indicates that when ∆E = 0, ∆t must be infinite].

Now let us assume that |ψ(t₀)⟩ is a linear superposition of two eigenstates of H, |φ_1⟩ and |φ_2⟩, with different eigenvalues E_1 and E_2:

|ψ(t₀)⟩ = c_1 |φ_1⟩ + c_2 |φ_2⟩    (D-70)

Then:

|ψ(t)⟩ = c_1 e^{-iE_1(t - t₀)/ħ} |φ_1⟩ + c_2 e^{-iE_2(t - t₀)/ħ} |φ_2⟩    (D-71)

If we measure the energy, we find either E_1 or E_2. The uncertainty of E is therefore of the order of:

∆E ≅ |E_2 − E_1|    (D-72)

Now consider an arbitrary observable B which does not commute with H. The probability of finding, in a measurement of B at time t, the eigenvalue b_n associated with the eigenvector |u_n⟩ (we assume, for simplicity, b_n to be non-degenerate) is given by:

P(b_n, t) = |⟨u_n|ψ(t)⟩|²
          = |c_1|² |⟨u_n|φ_1⟩|² + |c_2|² |⟨u_n|φ_2⟩|²
            + 2 Re { c_1 c_2* ⟨u_n|φ_1⟩ ⟨u_n|φ_2⟩* e^{-i(E_1 − E_2)(t − t₀)/ħ} }    (D-73)

This equation shows that P(b_n, t) oscillates between two extreme values, with the Bohr frequency ν_{21} = |E_2 − E_1|/2πħ. The characteristic evolution time of the system is therefore:

∆t ≅ ħ/|E_2 − E_1|    (D-74)

and comparison with (D-72) shows that ∆E · ∆t ≅ ħ.

Let us now assume that the spectrum of H is continuous (and non-degenerate). The most general state |ψ(t₀)⟩ can be written:

|ψ(t₀)⟩ = ∫ dE c(E) |φ_E⟩    (D-75)

where |φ_E⟩ is the eigenstate of H with the eigenvalue E. Let us assume that |c(E)|² has non-negligible values only in a domain of width ∆E about E₀ (Fig. 4). ∆E then represents the energy uncertainty of the system. |ψ(t)⟩ is obtained by using (D-55):

|ψ(t)⟩ = ∫ dE c(E) e^{-iE(t − t₀)/ħ} |φ_E⟩    (D-76)

The quantity P(b_n, t) introduced above, which represents the probability of finding the eigenvalue b_n when the observable B is measured on the system in the state |ψ(t)⟩, is here equal to:

P(b_n, t) = |⟨u_n|ψ(t)⟩|² = | ∫ dE c(E) ⟨u_n|φ_E⟩ e^{-iE(t − t₀)/ħ} |²    (D-77)

In general, ⟨u_n|φ_E⟩ does not vary rapidly with E when E varies about E₀. If ∆E is sufficiently small, the variation of ⟨u_n|φ_E⟩, in integral (D-77), can be neglected relative to that of c(E). One can then replace ⟨u_n|φ_E⟩ by ⟨u_n|φ_{E₀}⟩ and take this quantity outside integral (D-77):

P(b_n, t) ≅ |⟨u_n|φ_{E₀}⟩|² | ∫ dE c(E) e^{-iE(t − t₀)/ħ} |²    (D-78)

If this approximation is valid, we thus see that P(b_n, t) is, to within a coefficient, the square of the modulus of the Fourier transform of c(E). According to the properties of the Fourier transform (cf. Appendix I, § 2-b), the width in t of P(b_n, t), that is, ∆t, is therefore related to the width ∆E of |c(E)|² by relation (D-69).

Comment:

(D-69) can be established directly for a free one-dimensional wave packet. One can associate with the momentum uncertainty ∆p of this wave packet an energy uncertainty ∆E = (dE/dp) ∆p. Since E = ħω and p = ħk, we have dE/dp = dω/dk = V_g, where V_g is the group velocity of the wave packet (Chap. I, § C-4). Consequently:

∆E = V_g ∆p    (D-79)

Now the characteristic evolution time ∆t is the time taken by this wave packet, travelling at the velocity V_g, to “pass” a point in space. If ∆x is the spatial extension of the wave packet, we therefore have:

∆t ≅ ∆x / V_g    (D-80)

From this we deduce, combining (D-79) and (D-80):

∆E · ∆t ≅ ∆x · ∆p ≳ ħ    (D-81)

Relation (D-69) is often called the fourth Heisenberg uncertainty relation. It is clearly different, however, from the other three uncertainty relations, which relate to the three components of R and P [formulas (14) of Complement F_I]. In (D-69), only the energy E is a physical quantity like R and P; the time t, on the other hand, is a parameter, with which no quantum mechanical operator is associated.
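The Fourier-transform argument leading to (D-69) can be made concrete numerically. The sketch below is not from the text; it assumes Python with NumPy, takes |c(E)|² to be Gaussian with ħ = 1, and uses the squared overlap |⟨ψ(t₀)|ψ(t)⟩|² (the Fourier transform of |c(E)|²) as a measure of how much the state has evolved; the decay time multiplied by ∆E comes out of order ħ.

```python
import numpy as np

hbar = 1.0
E = np.linspace(-10, 10, 4001)
E0, dE = 0.0, 0.5                              # center and width Delta E of the energy distribution
c = np.exp(-(E - E0) ** 2 / (4 * dE ** 2))     # c(E): |c(E)|^2 is a Gaussian of standard deviation dE
c /= np.sqrt(np.trapz(np.abs(c) ** 2, E))

def survival(t):
    # |<psi(t0)|psi(t)>|^2, the squared Fourier transform of |c(E)|^2 evaluated at time t
    return abs(np.trapz(np.abs(c) ** 2 * np.exp(-1j * E * t / hbar), E)) ** 2

ts = np.linspace(0, 20, 4000)
p = np.array([survival(t) for t in ts])
dt = ts[np.argmax(p < 0.5)]                    # time after which the state has evolved appreciably
print(f"Delta E * Delta t = {dE * dt:.2f}  (to be compared with hbar = {hbar})")
```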

Figure 4: By superposing the stationary states |φ_E⟩ with the coefficients c(E), we obtain a state of the system whose energy is not perfectly well-defined. The corresponding uncertainty ∆E is given by the width of the curve representing |c(E)|² (plotted against E and centered at E₀). According to the fourth uncertainty relation, the evolution of the state |ψ(t)⟩ will be significant after a time ∆t such that ∆E · ∆t ≳ ħ.

E. The superposition principle and physical predictions

The physical meaning of the first postulate remains to be examined. According to this postulate, the states of a physical system belong to a vector space and are, consequently, linearly superposable. One of the important consequences of the first postulate, when it is combined with the others, is the appearance of interference effects such as those which led us to wave-particle duality (Chap. I). Our understanding of these phenomena is based on the concept of probability amplitudes, which we shall examine here with the aid of some simple examples.

E-1. Probability amplitudes and interference effects

E-1-a. The physical meaning of a linear superposition of states

α. The difference between a linear superposition and a statistical mixture

Let |ψ_1⟩ and |ψ_2⟩ be two orthogonal normalized states:

⟨ψ_1|ψ_1⟩ = ⟨ψ_2|ψ_2⟩ = 1,   ⟨ψ_1|ψ_2⟩ = 0    (E-1)

(|ψ_1⟩ and |ψ_2⟩ could be, for example, two eigenstates of the same observable B associated with two different eigenvalues b_1 and b_2). If the system is in the state |ψ_1⟩, we can calculate all the probabilities concerning the measurement results for a given observable A. For example, if |u_n⟩ is the (normalized) eigenvector of A which corresponds to the eigenvalue a_n (assumed to be non-degenerate), the probability P_1(a_n) of finding a_n when A is measured on the system in the state |ψ_1⟩ is:

P_1(a_n) = |⟨u_n|ψ_1⟩|²    (E-2)

An analogous quantity P_2(a_n) can be defined for the state |ψ_2⟩:

P_2(a_n) = |⟨u_n|ψ_2⟩|²    (E-3)

Now consider a normalized state |ψ⟩ which is a linear superposition of |ψ_1⟩ and |ψ_2⟩:

|ψ⟩ = λ_1 |ψ_1⟩ + λ_2 |ψ_2⟩,   with |λ_1|² + |λ_2|² = 1    (E-4)

It is often said that, when the system is in the state |ψ⟩, one has a probability |λ_1|² of finding it in the state |ψ_1⟩ and a probability |λ_2|² of finding it in the state |ψ_2⟩. The exact meaning of this manner of speaking is the following: if |ψ_1⟩ and |ψ_2⟩ are two eigenvectors (here assumed to be normalized) of an observable B corresponding to different eigenvalues b_1 and b_2, the probability of finding b_1 when B is measured is |λ_1|² and that of finding b_2 is |λ_2|².

This could lead us to believe (wrongly, as we shall see) that a state such as (E-4) is a statistical mixture of the states |ψ_1⟩ and |ψ_2⟩ with the weights |λ_1|² and |λ_2|². In other words, if we consider a large number N of identical systems, all in the state (E-4), we might imagine that this set of systems in the state |ψ⟩ was completely equivalent to another set composed of N|λ_1|² systems in the state |ψ_1⟩ and N|λ_2|² systems in the state |ψ_2⟩. Such an interpretation of the state |ψ⟩ is erroneous and leads to inaccurate physical predictions, as we shall see.

Assume that we are actually trying to calculate the probability P(a_n) of finding the eigenvalue a_n when the observable A is measured on the system in the state |ψ⟩ given by (E-4). If we interpret the state |ψ⟩ as being a statistical mixture of the states |ψ_1⟩ and |ψ_2⟩ with the weights |λ_1|² and |λ_2|², then we can obtain P(a_n) by taking the weighted sum of the probabilities P_1(a_n) and P_2(a_n) calculated above [formulas (E-2) and (E-3)]:

P(a_n) ≟ |λ_1|² P_1(a_n) + |λ_2|² P_2(a_n)    (E-5)

Actually, the postulates of quantum mechanics unambiguously indicate how to calculate P(a_n). The correct expression for this probability is:

P(a_n) = |⟨u_n|ψ⟩|²    (E-6)

P(a_n) is therefore the square of the modulus of the probability amplitude ⟨u_n|ψ⟩. We see from (E-4) that this amplitude is the sum of two terms:

⟨u_n|ψ⟩ = λ_1 ⟨u_n|ψ_1⟩ + λ_2 ⟨u_n|ψ_2⟩    (E-7)

Thus we obtain:

P(a_n) = |λ_1|² |⟨u_n|ψ_1⟩|² + |λ_2|² |⟨u_n|ψ_2⟩|² + 2 Re { λ_1 λ_2* ⟨u_n|ψ_1⟩ ⟨u_n|ψ_2⟩* }    (E-8)

Taking (E-2) and (E-3) into account, we find that the correct expression for P(a_n) is therefore written:

P(a_n) = |λ_1|² P_1(a_n) + |λ_2|² P_2(a_n) + 2 Re { λ_1 λ_2* ⟨u_n|ψ_1⟩ ⟨u_n|ψ_2⟩* }    (E-9)

This result is different from that of formula (E-5). It is therefore wrong to consider |ψ⟩ to be a statistical mixture of states. Such an interpretation eliminates all the interference effects contained in the double product of formula (E-9). We see that it is not only the moduli of λ_1 and λ_2 that play a role; the relative phase⁹ of λ_1 and λ_2 is just as important, since it enters explicitly into the physical predictions, through the intermediary of λ_1 λ_2*.
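The difference between (E-5) and (E-9) can be checked directly on a two-dimensional example. The sketch below is not from the text; it assumes Python with NumPy and arbitrary choices of |ψ_1⟩, |ψ_2⟩, |u_n⟩ and of the coefficients λ_1, λ_2. It evaluates the "statistical mixture" guess (E-5), the correct probability (E-6), and the interference term of (E-9).

```python
import numpy as np

psi1 = np.array([1.0, 0.0])                    # |psi_1>
psi2 = np.array([0.0, 1.0])                    # |psi_2>, orthogonal to |psi_1>
u_n = np.array([1.0, 1.0]) / np.sqrt(2)        # eigenvector |u_n> of the observable A

lam1 = np.sqrt(0.3)
lam2 = np.sqrt(0.7) * np.exp(1j * np.pi / 5)   # |lam1|^2 + |lam2|^2 = 1
psi = lam1 * psi1 + lam2 * psi2                # the superposition (E-4)

P1 = abs(np.vdot(u_n, psi1)) ** 2              # (E-2)
P2 = abs(np.vdot(u_n, psi2)) ** 2              # (E-3)
P_mixture = abs(lam1) ** 2 * P1 + abs(lam2) ** 2 * P2                     # wrong guess (E-5)
P_correct = abs(np.vdot(u_n, psi)) ** 2                                   # (E-6)
cross = 2 * np.real(lam1 * np.conj(lam2)
                    * np.vdot(u_n, psi1) * np.conj(np.vdot(u_n, psi2)))   # interference term of (E-9)
print(P_mixture, P_correct, P_mixture + cross)  # the last two coincide; the first does not
```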

β. A concrete example

Consider photons propagating along Oz, whose polarization state is represented by the unit vector (Fig. 5):

e_p = (1/√2)(e_x + e_y)    (E-10)

⁹ Multiplying |ψ⟩ by a global phase factor e^{iθ} is equivalent to changing λ_1 and λ_2 to λ_1 e^{iθ} and λ_2 e^{iθ}. It can be verified from (E-9) that such an operation does not modify the physical predictions, which depend only on |λ_1|², |λ_2|² and λ_1 λ_2*.


This state is a linear superposition of two orthogonal polarization states e_x and e_y. It represents light which is linearly polarized at an angle of 45° with respect to e_x and e_y. It would be absurd to assume that N photons in the state e_p are equivalent to N_1 = N/2 photons in the state e_x and N_2 = N/2 photons in the state e_y. If we place in the beam’s trajectory an analyzer whose axis e'_p is perpendicular to e_p, we know that none of the photons in the state e_p will pass through this analyzer. But, for the statistical mixture {N/2 photons in the state e_x, N/2 photons in the state e_y}, half the photons will pass through the analyzer.

Figure 5: A simple experiment which illustrates the difference between a linear superposition and a statistical mixture of states. If all the incident photons are in the polarization state e_p = (1/√2)(e_x + e_y), none of them will pass through an analyzer whose axis e'_p is perpendicular to e_p. If we had, on the contrary, a statistical mixture of photons polarized either along e_x or along e_y (in equal proportions, i.e. natural light), half of them would pass through the analyzer.

In this concrete example, it is clear that a linear superposition such as (E-10), associated with light polarized at an angle of 45° with respect to e_x and e_y, is physically different from a statistical mixture of equal proportions of the states e_x and e_y, associated with natural light (an unpolarized beam).

We can also understand the importance of the relative phase of the expansion coefficients of the state vector by considering the four states:

e_1 = (1/√2)(e_x + e_y)    (E-11)
e_2 = (1/√2)(e_x − e_y)    (E-12)
e_3 = (1/√2)(e_x + i e_y)    (E-13)
e_4 = (1/√2)(e_x − i e_y)    (E-14)

which differ only by the relative phase of the coefficients (this phase being equal to 0, π, +π/2 and −π/2, respectively). These four states are physically quite different: the first two represent light which is polarized linearly along the bisectors of (e_x, e_y); the second two represent circularly polarized light (right and left, respectively).
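The polarization example can be phrased as a two-line computation with Jones vectors. The sketch below is not from the text; it assumes Python with NumPy. It compares the transmission of the 45° state (E-10) through a crossed analyzer with that of an equal-weight statistical mixture of e_x and e_y.

```python
import numpy as np

ex = np.array([1.0, 0.0])
ey = np.array([0.0, 1.0])
ep = (ex + ey) / np.sqrt(2)            # 45-degree linear polarization, state (E-10)
ep_perp = (ex - ey) / np.sqrt(2)       # analyzer axis perpendicular to e_p

def transmission(analyzer, state):
    """Probability that a photon in `state` passes an analyzer oriented along `analyzer`."""
    return abs(np.vdot(analyzer, state)) ** 2

print(transmission(ep_perp, ep))                                            # 0: no photon passes
print(0.5 * transmission(ep_perp, ex) + 0.5 * transmission(ep_perp, ey))    # 0.5: the mixture
```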

E-1-b. Summation over the intermediate states

α. Prediction of measurement results in two simple experiments

(i) Experiment 1. Imagine that the observable A has been measured, at a given time, on a physical system, and that the non-degenerate eigenvalue a has been found. If |u_a⟩ is the eigenvector associated with a, the physical system, immediately after the measurement, is in the state |u_a⟩. Before the system has had time to evolve, we measure another observable C that does not commute with A. Using the notation introduced in § C-6-a, we denote by P_a(c) the probability that this second measurement will yield the result c. Immediately before the measurement of C, the system is in the state |u_a⟩. Therefore, if |v_c⟩ is the eigenvector of C associated with the eigenvalue c (assumed to be non-degenerate), the postulates of quantum mechanics lead to:

P_a(c) = |⟨v_c|u_a⟩|²    (E-15)

(ii) Experiment 2. We now imagine another experiment, in which we measure successively and very rapidly three observables A, B, C, which do not commute with each other (the time separating two measurements is too short for the system to evolve). Denote by P_a(b, c) the probability, given that the result of the first measurement is a, that the results of the second and third will be b and c respectively. P_a(b, c) is equal to the product of P_a(b) (the probability that, the measurement of A having yielded a, that of B will yield b) and P_b(c) (the probability that, the measurement of B having yielded b, that of C will yield c):

P_a(b, c) = P_a(b) P_b(c)    (E-16)

If all the eigenvalues of B are assumed to be non-degenerate and if |w_b⟩ denotes the corresponding eigenvectors, it follows that [using for P_a(b) and P_b(c) formulas analogous to (E-15)]:

P_a(b, c) = |⟨w_b|u_a⟩|² |⟨v_c|w_b⟩|²    (E-17)

Figure 6: Different possible “paths” for the state vector of the system when the system is allowed to evolve freely (without undergoing any measurement) between the initial state |u_a⟩ and the final state |v_c⟩, passing through the intermediate states |w_{b1}⟩, |w_{b2}⟩, |w_{b3}⟩, ... In this case, we must add together the probability amplitudes associated with these different paths, and not their probabilities.

β. The fundamental difference between these two experiments

In both of these experiments, the state of the system after the measurement of the observable A is |u_a⟩ (the role of this measurement being to fix this initial state). It then becomes |v_c⟩ after the last measurement, that of the observable C (for this reason, |v_c⟩ will be called the “final state”). It is possible in both cases to decompose the state of the system just before the measurement of C in terms of the eigenvectors |w_b⟩ of B, and to say that between the state |u_a⟩ and the state |v_c⟩, the system “can pass” through several different “intermediate states” |w_b⟩. Each of these intermediate states defines a possible “path” between the initial state |u_a⟩ and the final state |v_c⟩ (Fig. 6).

The difference between the two experiments described above is the following. In the first one, the path that the system has taken between the state |u_a⟩ and the state |v_c⟩ is not determined experimentally [we measure only the probability P_a(c) that, starting from |u_a⟩, it ends up in |v_c⟩]. On the other hand, in the second experiment, this path is determined, by measuring the observable B [thus enabling us to obtain the probability P_a(b, c) that the system, starting from |u_a⟩, passes through a given intermediate state |w_b⟩ and ends up finally in |v_c⟩]. We could then be tempted, in order to relate P_a(c) to P_a(b, c), to use the following argument: in experiment 1, the system is “free to pass” through all intermediate states |w_b⟩; it would then seem that the global probability P_a(c) should be equal to the sum of all the probabilities P_a(b, c) associated with each of the possible “paths”. Can we not then write:

P_a(c) ≟ Σ_b P_a(b, c)    (E-18)

As we shall see, this formula is wrong. Let us go back to the exact formula (E-15) for P_a(c); this formula brings in the probability amplitude ⟨v_c|u_a⟩, which we can write, using the closure relation for the states |w_b⟩:

⟨v_c|u_a⟩ = Σ_b ⟨v_c|w_b⟩ ⟨w_b|u_a⟩    (E-19)

Substitute this expression into (E-15):

P_a(c) = | Σ_b ⟨v_c|w_b⟩ ⟨w_b|u_a⟩ |²
       = Σ_b |⟨v_c|w_b⟩|² |⟨w_b|u_a⟩|² + Σ_{b ≠ b'} ⟨v_c|w_b⟩ ⟨w_b|u_a⟩ ⟨v_c|w_{b'}⟩* ⟨w_{b'}|u_a⟩*    (E-20)

Using (E-17), we therefore obtain:

P_a(c) = Σ_b P_a(b, c) + Σ_{b ≠ b'} ⟨v_c|w_b⟩ ⟨w_b|u_a⟩ ⟨v_c|w_{b'}⟩* ⟨w_{b'}|u_a⟩*    (E-21)

This equation enables us to understand why formula (E-18) is wrong: all the “cross terms” that appear in the square of the modulus of sum (E-19) are absent in (E-18). All the interference effects between the different possible paths are thus missing in (E-18). If, therefore, we want to establish a relation between these two experiments, we see that it is necessary to reason in terms of probability amplitudes: when the intermediate states of the system are not determined experimentally, it is the probability amplitudes, and not the probabilities, which must be summed.

The error in the reasoning which led to the wrong relation (E-18) is obvious, moreover, if we remember the fifth postulate (reduction of the wave packet). In the second experiment, the measurement of the observable B must, in fact, involve a perturbation of the system under study: during the measurement, its state vector undergoes an abrupt change (projection onto one of the states |w_b⟩). It is this unavoidable perturbation which is responsible for the disappearance of interference effects. In the first experiment, on the other hand, it is incorrect to say that the physical system “passes through one or another of the states |w_b⟩”; it would be more accurate to say that it passes through all the states |w_b⟩.
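The contrast between (E-18) and (E-21) is easy to reproduce numerically. In the sketch below (not from the text; Python with NumPy is assumed, and A, B, C are represented by three randomly chosen orthonormal bases of a four-dimensional space), the probability of experiment 1 is the squared modulus of a sum of path amplitudes, while the sum of path probabilities of experiment 2 in general differs from it.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4

def random_basis(d):
    """Columns form a random orthonormal basis of C^d (eigenbasis of a fictitious observable)."""
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

U, W, V = random_basis(dim), random_basis(dim), random_basis(dim)   # bases of A, B, C
ua, vc = U[:, 0], V[:, 2]                      # initial state |u_a>, final state |v_c>

# Experiment 1: amplitude <v_c|u_a> written with the closure relation (E-19)
amps = np.array([np.vdot(vc, W[:, b]) * np.vdot(W[:, b], ua) for b in range(dim)])
P_exp1 = abs(amps.sum()) ** 2                  # square of a sum of amplitudes, formula (E-20)

# Experiment 2: the intermediate state is measured, so the probabilities add, (E-17) and (E-18)
P_paths = np.abs(amps) ** 2
print(P_exp1, P_paths.sum())                   # generally different: the cross terms of (E-21)
```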

Comments:

(i) The preceding discussion resembles in every respect that of § A-2-a of Chapter I concerning Young’s double-slit experiment. To determine the probability that a photon emitted by the source will arrive at a given point M of the screen, one must first calculate the total electric field at M. In this problem, the electric field plays the role of a probability amplitude. When one is not trying to determine through which slit the photon passes, it is the electric fields radiated by the two slits, and not their intensities, that must be added together to obtain the total field at M (whose square yields the desired probability). In other words, the field radiated by one of the slits at point M is analogous to the probability amplitude for a photon, emitted by the source, to pass through this slit and to arrive at M.

(ii) It is not necessary to retain the assumption that the measurements of A and C in experiment 1 and of A, B, C in experiment 2 are performed very close together in time. If the system has had time to evolve between two of these measurements, we can use the Schrödinger equation to determine the modification of the state of the system due to this evolution [cf. Complement F_III, comment (ii) of § 2].

E-1-c. Conclusion: the importance of the concept of probability amplitudes

The two examples studied in §§ E-1-a and E-1-b demonstrate the importance of the concept of probability amplitudes. Formulas (E-5) and (E-18), as well as the arguments that lead to them, are incorrect since they represent an attempt to calculate a probability directly without first considering the corresponding probability amplitude. In both cases, the correct expression (E-8) or (E-20) has the form of a square of a sum (more precisely, the square of the modulus of this sum), while the incorrect formula (E-5) or (E-18) contains only a sum of squares (all the cross terms, responsible for interference effects, being omitted). From the preceding discussion, we shall therefore retain the following ideas:

(i) The probabilistic predictions of quantum theory are always obtained by squaring the modulus of a probability amplitude.

(ii) When, in a particular experiment, no measurement is made at an intermediate stage, one must never reason in terms of the probabilities of the various results that could have been obtained in such a measurement, but rather in terms of their probability amplitudes.

(iii) The fact that the states of a physical system are linearly superposable means that a probability amplitude often presents the form of a sum of partial amplitudes. The corresponding probability is then equal to the square of the modulus of a sum of terms, and the various partial amplitudes interfere with each other.

E-2. Case in which several states can be associated with the same measurement result

In the preceding section, we stressed and illustrated the fact that, in certain cases, the probability of an event is given by the postulates of quantum mechanics in the form of a square of a sum of terms (more precisely, the square of the modulus of such a sum). Now the statement of the fourth postulate [formula (B-7)] involves a sum of squares (the sum of the squares of the moduli) when the measurement result whose probability is sought is associated with a degenerate eigenvalue. It is important to understand that these two rules are not contradictory but, on the contrary, complementary: each term of the sum of squares (B-7) can itself be the square of a sum. This is the first point on which we shall focus our attention in this section. Furthermore, this discussion will enable us to complete the statement of the postulates: we shall consider measurement devices whose accuracy is limited (as is always, of course, the case) and see how to predict theoretically the possible results. Finally, we shall extend to the case of continuous spectra the fifth postulate, of reduction of the wave packet.

E-2-a. Degenerate eigenvalues

In the examples treated in § E-1, we always assumed that the results of the various measurements envisaged were simple, i.e. non-degenerate, eigenvalues of the corresponding observables. This hypothesis was intended to simplify these examples so that the origin of the interference effects appeared as clearly as possible. Now let us consider a degenerate eigenvalue a_n of an observable A. The eigenstates associated with a_n form a vector subspace of dimension g_n, in which an orthonormal basis { |u_n^i⟩ ; i = 1, 2, ..., g_n } can be chosen. The discussion of § C-6-b shows that knowing that a measurement of A has yielded a_n is not sufficient to determine the state of the physical system after this measurement. We shall say that several final states can be associated with the same result a_n: if the initial state (the state before the measurement) is given, the final state after the measurement is perfectly well-defined; but if the initial state is changed, the final state is, in general, different (for the same measurement result a_n). All final states associated with a_n are linear combinations of the orthonormal vectors |u_n^i⟩, with i = 1, 2, ..., g_n.

Formula (B-7) indicates unambiguously how to find the probability P(a_n) that a measurement of A on a system in the state |ψ⟩ will yield the result a_n. One chooses an orthonormal basis, for example { |u_n^i⟩ ; i = 1, 2, ..., g_n }, in the eigensubspace which corresponds to a_n; one calculates the probability |⟨u_n^i|ψ⟩|² of finding the system in each of the states of this basis; P(a_n) is then the sum of these probabilities. However, it must not be forgotten that each probability |⟨u_n^i|ψ⟩|² can be the square of the modulus of a sum of terms. Consider, for example, the case envisaged in § E-1-a-α and assume now that the eigenvalue a_n of the observable A, whose probability P(a_n) is to be calculated, is g_n-fold degenerate. Formula (E-6) is then replaced by:

P(a_n) = Σ_{i=1}^{g_n} |⟨u_n^i|ψ⟩|²    (E-22)

with:

⟨u_n^i|ψ⟩ = λ_1 ⟨u_n^i|ψ_1⟩ + λ_2 ⟨u_n^i|ψ_2⟩    (E-23)

The discussion of § E-1-a-α remains valid for each of the terms of formula (E-22): |⟨u_n^i|ψ⟩|², which is obtained from (E-23), is the square of a sum; P(a_n) is then the sum of these squares. § E-1-b can similarly be generalized to the case where the eigenvalues of the observables measured are degenerate.

Before summarizing the preceding discussions, we are going to study another important situation where several final states are associated with the same measurement result.

E-2-b. Insufficiently selective measurement devices

α. Definition

Assume that, in order to measure the observable A in a given physical system, we have at our disposal a device which works in the following way:

(i) This device can give only two different answers¹⁰, which we shall denote, for convenience, by “yes” and “no”.

(ii) If the system is in an eigenstate of A whose eigenvalue is included in a given interval ∆ of the real axis, the answer is always “yes”; this is also the case when the state of the system is any linear combination of eigenstates of A associated with eigenvalues which are all included in ∆.

(iii) If the state of the system is an eigenstate of A whose eigenvalue falls outside ∆, or any linear combination of such eigenstates, the answer is always “no”.

∆ therefore characterizes the resolving power of the measurement device under consideration. If there exists only one eigenvalue a_n of A in the interval ∆, the resolving power is infinite: when the system is in an arbitrary state, the probability P(yes) of obtaining the answer “yes” is equal to the probability of finding a_n in a measurement of A; the probability P(no) of obtaining “no” is obviously equal to 1 − P(yes). If, on the other hand, ∆ contains several eigenvalues of A, the device does not have a sufficient resolution to distinguish between these various eigenvalues: we shall say that it is insufficiently selective. We shall see how to calculate P(yes) and P(no) in this case.

To be able to study the perturbation created by such a measurement on the state of the system, we are going to add the following hypothesis: the device transmits without perturbation the eigenstates of A associated with the eigenvalues of the interval ∆ (as well as any linear combination of these eigenstates), while it “blocks” the eigenstates of A associated with the eigenvalues outside ∆ (as well as all their linear combinations). The device thus behaves like a perfect filter for all states associated with ∆.

¹⁰ The following arguments can easily be generalized to cases where the device can give several different answers having characteristics similar to those described in (ii) and (iii).

β. Example

Most measurement devices used in practice are insufficiently selective. For example, to measure the abscissa x of an electron propagating parallel to the Oz axis, one can (Fig. 7) place in the xOy plane (Oy being perpendicular to the plane of the figure) a plate with a slit whose axis is parallel to Oy, the abscissas of the edges being x_1 and x_2. It can then be seen that any wave packet which is entirely included between the x = x_1 and x = x_2 planes (a superposition of eigenstates of X having eigenvalues contained within the interval [x_1, x_2]) will enter the region to the right of the slit (“yes” answer); in this case, it will not undergo any modification. On the other hand, any wave packet situated below the x = x_1 plane or above the x = x_2 plane will be blocked by the plate and will not pass to the right (“no” answer).

Figure 7: Schematic drawing of a device for measuring the abscissa x of a particle: a plate perpendicular to Oz with a slit between x_1 and x_2. Since the interval [x_1, x_2] is necessarily non-zero, such a device is always imperfectly selective.

γ. Quantum description

For such an insufficiently selective device, several final states are possible after a measurement which has yielded the answer “yes”. They can be, for example, the various eigenstates of A that correspond to the eigenvalues contained in the interval ∆. The physical problem posed by such devices, and which we are now going to consider, consists of predicting the answer which will be obtained when a system in an arbitrary state |ψ⟩ enters the device. For example, for the apparatus of Figure 7, what happens when we are dealing with a wave packet which is neither entirely contained between the x = x_1 and x = x_2 planes (in which case the answer is certainly “yes”) nor entirely situated outside this region (in which case the answer is certainly “no”)? We shall see that this is equivalent to measuring an observable whose spectrum is degenerate.

Consider the subspace ℰ_∆ spanned by all the eigenstates of A whose eigenvalues are contained within the interval ∆. The projector P_∆ onto this subspace is written (cf. § B-3-b of Chap. II):

P_∆ = Σ_{a_n ∈ ∆} Σ_{i=1}^{g_n} |u_n^i⟩⟨u_n^i|    (E-24)

(the eigenvalues of the interval ∆ can be degenerate, hence the additional index i; the vectors |u_n^i⟩ are assumed to be orthonormal). ℰ_∆ is the subspace formed by all the possible states of the system after a measurement which has given the result “yes”. Referring to the definition of the measurement device, we see that the response will certainly be “yes” for any state belonging to ℰ_∆, that is, for any eigenstate of P_∆ with the eigenvalue +1. The answer will certainly be “no” for any state belonging to the supplement of ℰ_∆, that is, for any eigenstate of P_∆ with the eigenvalue 0. The “yes” and “no” answers which can be furnished by the measurement device therefore correspond to the eigenvalues +1 and 0 of the observable P_∆: it could be said that the device is actually measuring the observable P_∆ rather than A.

In the light of this interpretation, the case of an insufficiently selective measurement device can be treated in the framework of the postulates which we have stated. The probability P(yes) of obtaining the answer “yes” is equal to the probability of finding the (degenerate) eigenvalue +1 of P_∆. Now an orthonormal basis of the corresponding eigensubspace is constituted by the set of states |u_n^i⟩ which are eigenstates of A with eigenvalues contained within the interval ∆. Applying formula (B-7) to the eigenvalue +1 of the observable P_∆, we therefore obtain (for a system in the state |ψ⟩):

P(yes) = Σ_{a_n ∈ ∆} Σ_{i=1}^{g_n} |⟨u_n^i|ψ⟩|²    (E-25)

Since there are only two possible answers, we obviously have:

P(no) = 1 − P(yes)    (E-26)

The projector onto the eigensubspace associated with the eigenvalue +1 of the observable P_∆ is P_∆ itself; formula (B-14) therefore gives here:

P(yes) = ⟨ψ|P_∆|ψ⟩    (E-27)

[this formula is equivalent to (E-25)]. Similarly, since the device does not perturb states belonging to ℰ_∆ and blocks those of the supplement of ℰ_∆, we find that the state of the system after a measurement which has given the result “yes” is:

|ψ'⟩ = (1/√⟨ψ|P_∆|ψ⟩) Σ_{a_n ∈ ∆} Σ_{i=1}^{g_n} |u_n^i⟩⟨u_n^i|ψ⟩    (E-28)

that is:

|ψ'⟩ = P_∆|ψ⟩ / √⟨ψ|P_∆|ψ⟩    (E-29)
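Formulas (E-24) to (E-29) amount to a projection followed by a renormalization. A minimal sketch, not from the text, is given below; it assumes Python with NumPy, a four-eigenvalue toy spectrum and an arbitrarily chosen window ∆.

```python
import numpy as np

eigvals = np.array([0.0, 1.0, 1.2, 3.0])             # spectrum of A (arbitrary units)
window = (0.9, 1.5)                                  # interval Delta accepted by the device
basis = np.eye(4, dtype=complex)                     # the |u_n> written in their own basis

# Projector P_Delta onto the subspace of eigenvalues inside Delta, as in (E-24)
inside = (eigvals >= window[0]) & (eigvals <= window[1])
P_Delta = sum(np.outer(basis[:, n], basis[:, n].conj()) for n in np.where(inside)[0])

psi = np.array([0.5, 0.5, 0.5, 0.5], dtype=complex)  # an arbitrary normalized state
p_yes = np.vdot(psi, P_Delta @ psi).real             # (E-27): <psi|P_Delta|psi>
psi_after = P_Delta @ psi / np.sqrt(p_yes)           # (E-29): filtered and renormalized state
print(p_yes, psi_after)
```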

When ∆ contains only one eigenvalue a_n, P_∆ reduces to P_n: formulas (B-14) and (B-31) are then seen to be special cases of formulas (E-27) and (E-29).

E-2-c. Recapitulation: must one sum the amplitudes or the probabilities?

There are therefore cases (§ E-1) where, to calculate a probability, one takes the square of a sum, because several probability amplitudes must be added together. In other cases (§ E-2), one takes a sum of squares, because several probabilities must be added together. It is clearly important not to confuse these different cases and to know, in a given situation, whether it is the probability amplitudes or the probabilities themselves which must be summed.

Young’s double-slit experiment will again furnish us with a very convenient physical example which will enable us to illustrate and summarize the preceding discussions. Imagine that we want to calculate the probability for a particular photon to strike the plate anywhere between two points M_1 and M_2 having abscissas x_1 and x_2 (Fig. 8). This probability is proportional to the total light intensity received by this portion of the plate. It is therefore a “sum of squares”; more precisely, it is the integral of the intensity I(x) between x_1 and x_2. But each term I(x) of this sum is obtained by squaring the electric field E(x) at x, which is equal to the sum of the electric fields E_A(x) and E_B(x) radiated at x by the slits A and B. I(x) is therefore proportional to |E_A(x) + E_B(x)|², that is, to the square of a sum. E_A(x) and E_B(x) are the amplitudes associated with the two possible paths SAM and SBM which end at the same point M; they are added to obtain the amplitude at M, since one is not trying to determine through which slit the photon passes. Then, to calculate the total light intensity received by the interval [x_1, x_2], one adds the intensities which arrive at the various points of this interval.

To sum up, the fundamental idea to be retained from the discussions of this section can be expressed schematically in the following way: add the amplitudes corresponding to the same final state, then the probabilities corresponding to orthogonal final states.

Figure 8: Young’s double-slit experiment: a source S, two slits A and B, and a screen on which the points M_1 and M_2 (abscissas x_1 and x_2) delimit the detection interval. To calculate the probability density for detecting a photon at the point M, it is necessary to add the electric fields radiated by the slits A and B, then to square the modulus of the field thus obtained (“square of the sum”). The probability of finding a photon in the interval [x_1, x_2] is then obtained by summing this probability density between x_1 and x_2 (“sum of squares”).

E-2-d. Application to the treatment of continuous spectra

When the observable we want to measure has a continuous spectrum, only insufficiently selective devices can be used: it is impossible to imagine a physical device that could isolate a single eigenvalue belonging to a continuous set. We now show how the study of § E-2-b enables us to be more precise and complete in our treatment of observables with continuous spectra.

α. Example: measurement of the position of a particle

Let ψ(r) = ⟨r|ψ⟩ be the wave function of a (spinless) particle. What is the probability of finding the abscissa of this particle within the interval [x_1, x_2] of the Ox axis, using, for example, a measurement device like the one in Figure 7? The subspace ℰ_∆ associated with this measurement result is the space spanned by the kets |r⟩ = |x, y, z⟩ which are such that x_1 ≤ x ≤ x_2. Since these kets are orthonormal in the extended sense, application of the rule stated in § E-2-c above yields:

P(x_1 ≤ x ≤ x_2) = ∫_{x_1}^{x_2} dx ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz |⟨x, y, z|ψ⟩|²
                 = ∫_{x_1}^{x_2} dx ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz |ψ(r)|²    (E-30)

Formula (E-27) obviously leads to the same result, since the projector P_∆ is written here:

P_∆ = ∫_{x_1}^{x_2} dx ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz |x, y, z⟩⟨x, y, z|    (E-31)

and we therefore have:

P(x_1 ≤ x ≤ x_2) = ⟨ψ|P_∆|ψ⟩ = ∫_{x_1}^{x_2} dx ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz ⟨ψ|x, y, z⟩⟨x, y, z|ψ⟩    (E-32)

To know the state of the particle after such a measurement, which has yielded the result “yes”, it suffices to apply formula (E-29):

|ψ'⟩ = P_∆|ψ⟩ / √⟨ψ|P_∆|ψ⟩ = (1/N) ∫_{x_1}^{x_2} dx ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz |x, y, z⟩⟨x, y, z|ψ⟩    (E-33)

where the normalization factor N = √⟨ψ|P_∆|ψ⟩ is known [formula (E-32)]. Let us calculate the wave function ψ'(r) = ⟨r|ψ'⟩ associated with the ket |ψ'⟩:

ψ'(r) = ⟨r|ψ'⟩ = (1/N) ∫_{x_1}^{x_2} dx' ∫_{-∞}^{+∞} dy' ∫_{-∞}^{+∞} dz' ⟨r|r'⟩ ψ(r')    (E-34)

Now ⟨r|r'⟩ = δ(r − r') = δ(x − x') δ(y − y') δ(z − z'). The integrations over y' and z' can therefore be performed immediately: they amount to replacing y' and z' by y and z in the function to be integrated. Equation (E-34) thus becomes:

ψ'(x, y, z) = (1/N) ∫_{x_1}^{x_2} dx' δ(x − x') ψ(x', y, z)    (E-35)

If the point x is situated inside the interval of integration [x_1, x_2], the result is the same as if we were integrating from −∞ to +∞:

ψ'(x, y, z) = (1/N) ψ(x, y, z)   for x_1 ≤ x ≤ x_2    (E-36)

On the other hand, if x falls outside the interval of integration, δ(x − x') is zero for all values of x' included in this interval, and:

ψ'(x, y, z) = 0   for x < x_1 or x > x_2    (E-37)

The part of ψ(r) that corresponds to the interval accepted by the measurement device therefore persists, undeformed, immediately after the measurement [the factor 1/N simply ensures that ψ'(r) remains normalized]; the rest is suppressed by the measurement. The wave packet ψ(r) representing the initial state of the particle is, as it were, “truncated” by the edges of the slit.
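The "truncation" described by (E-36) and (E-37) is easily visualized in one dimension. The sketch below is not from the text; it assumes Python with NumPy, an arbitrary Gaussian wave packet and arbitrary slit edges x_1, x_2. It applies the projector, computes the probability (E-30) of the answer "yes", and renormalizes the truncated wave function.

```python
import numpy as np

x = np.linspace(-10, 10, 4001)
psi = (2 / np.pi) ** 0.25 * np.exp(-x ** 2)          # an arbitrary normalized 1D wave packet
x1, x2 = -0.5, 1.0                                   # edges of the slit, interval [x1, x2]

chi = ((x >= x1) & (x <= x2)).astype(float)          # action of the projector P_Delta in the x representation
p_yes = np.trapz(np.abs(chi * psi) ** 2, x)          # probability (E-30) of the answer "yes"
psi_after = chi * psi / np.sqrt(p_yes)               # truncated, renormalized state, (E-36) and (E-37)
print(p_yes, np.trapz(np.abs(psi_after) ** 2, x))    # the new wave function is normalized to 1
```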

Comments:

(i) This example clearly reveals the concrete meaning of the “reduction of the wave packet”.

(ii) If a large number of particles, all in the same state |ψ⟩, enter the device successively, the result will sometimes be “yes” and sometimes “no” [with the probabilities P(yes) and P(no)]. If the result is “yes”, the particle continues on its way, starting from the “truncated” state |ψ'⟩; if the result is “no”, the particle is absorbed by the screen.

In the example we are considering here, the measurement device becomes all the more selective as x_2 − x_1 becomes smaller. We see, however, that it is impossible to make it perfectly selective, because the spectrum of X is continuous: however narrow the slit may be, the interval [x_1, x_2] which it defines always contains an infinity of eigenvalues. Nevertheless, in the limiting case of a slit of infinitely small width ∆x, we find the equivalent of formula (B-17), which was the expression of the fourth postulate in the case of a continuous spectrum. Let us choose x_1 = x_0 − ∆x/2 and x_2 = x_0 + ∆x/2 (a slit of width ∆x centered at x_0), and assume that the wave function ψ(r) varies very little within the interval ∆x. Then, in (E-30), we can replace |ψ(r)|² by |ψ(x_0, y, z)|² and perform the integration over x:

P(x_0 − ∆x/2 ≤ x ≤ x_0 + ∆x/2) ≅ ∆x ∫_{-∞}^{+∞} dy ∫_{-∞}^{+∞} dz |ψ(x_0, y, z)|²    (E-38)

We indeed find a probability equal to the product of ∆x and a positive quantity which plays the role of a probability density at the point x_0. The difference with formula (B-17) lies in the fact that the latter applies to the case of a continuous but non-degenerate spectrum, while here the eigenvalues x of X are infinitely degenerate in |r⟩; this is the origin of the integrals over y and z that appear in (E-38) (summation over the indices associated with the degeneracy).

β. Postulate of reduction of wave packets in the case of a continuous spectrum

In § B-3-c, we limited ourselves, in the statement of the fifth postulate, to the case of a discrete spectrum. Formula (E-33) and its accompanying discussion enable us to understand the form assumed by this postulate when a continuous spectrum is considered: one simply applies the results of § E-2-b concerning insufficiently selective devices. Let A be an observable with a continuous spectrum (assumed, for simplicity, to be non-degenerate). The notation is the same as in § B-3-b-α.

If a measurement of A on a system in the state |ψ⟩ has yielded the result α_0 to within ∆α, the state of the system immediately after this measurement is described by:

|ψ'⟩ = P_{∆α}(α_0)|ψ⟩ / √⟨ψ|P_{∆α}(α_0)|ψ⟩    (E-39)

with:

P_{∆α}(α_0) = ∫_{α_0 − ∆α/2}^{α_0 + ∆α/2} dα |v_α⟩⟨v_α|    (E-40)

Figures 9-a and 9-b illustrate this statement. If the function ⟨v_α|ψ⟩, representing |ψ⟩ in the { |v_α⟩ } basis, has the form indicated in Figure 9-a, the state of the system immediately after the measurement is represented, to within a normalization factor, by the function of Figure 9-b [the calculation is analogous in all respects to the one which derives (E-36) and (E-37) from (E-33)].

Figure 9: Illustration of the postulate of reduction of the wave packet in the case of a continuous spectrum: one measures the observable A, with eigenvectors |v_α⟩ and eigenvalues α. Figure a shows ⟨v_α|ψ⟩ before the measurement; figure b shows ⟨v_α|P_{∆α}(α_0)|ψ⟩ after it. The measurement device has a selectivity ∆α. If the value found is α_0 to within ∆α, the effect of the measurement on the wave function is to “truncate” it about the value α_0 (to normalize the new wave function, it is obviously necessary to multiply it by a factor larger than 1).

We see that, even if ∆α is very small, one can never actually prepare the system in the state |v_{α_0}⟩, which would be represented, in the { |v_α⟩ } basis, by δ(α − α_0). We can only obtain a narrow function centered at α_0, since ∆α is never exactly zero.

References and suggestions for further reading:

Development of quantum mechanical concepts: references of section 4 of the bibliography, particularly Jammer (4.8). Discussion and interpretation of the postulates: references of section 5 of the bibliography; Von Neumann (10.10), Chaps. V and VI; Feynman III (1.2), § 2.6, Chap. 3 and § 8.3. Quantization rules using Poisson brackets: Dirac (1.13), § 21; Schiff (1.18), § 24. Probability and statistics: see the corresponding subsection of section 10 of the bibliography.

COMPLEMENTS OF CHAPTER III, READER’S GUIDE

AIII: PARTICLE IN AN INFINITE ONE-DIMENSIONAL POTENTIAL WELL
BIII: STUDY OF THE PROBABILITY CURRENT IN SOME SPECIAL CASES
Direct applications of Chapter III to simple cases. The accent is placed on the physical discussion of the results (elementary level).

CIII: ROOT MEAN SQUARE DEVIATIONS OF TWO CONJUGATE OBSERVABLES
A little more formal; general proof of the Heisenberg relations; may be skipped in a first reading.

DIII: MEASUREMENTS BEARING ON ONLY ONE PART OF A PHYSICAL SYSTEM
Discussion of measurements bearing on only one part of a system; a rather simple but somewhat formal application of Chapter III; may be skipped on a first reading.

EIII: THE DENSITY OPERATOR
FIII: THE EVOLUTION OPERATOR
GIII: THE SCHRÖDINGER AND HEISENBERG PICTURES
HIII: GAUGE INVARIANCE
JIII: PROPAGATOR FOR THE SCHRÖDINGER EQUATION
Complements that serve as an introduction to a more advanced quantum mechanics course. Aside from FIII, which is simple, they are on a higher level than the rest of this book, but they are comprehensible if Chapter III has been read. May be reserved for subsequent study.
EIII: definition and properties of the density operator, which is used in the quantum mechanical description of systems whose state is imperfectly known (statistical mixture of states). Fundamental tool of quantum statistical mechanics.
FIII: introduction of the evolution operator, which gives the quantum state of a system at an arbitrary instant in terms of its state at the initial instant t_0.
GIII: describes the evolution of a quantum system in a way that is different from, but equivalent to, that of Chapter III. The time dependence now appears in the observables and not in the state of the system.
HIII: discussion of the quantum formalism in the case where the system is subject to an electromagnetic field. Although the description of the system involves the electromagnetic potentials, the physical properties depend only on the values of the electric and magnetic fields; they remain invariant when the potentials describing the same electromagnetic field are changed.
JIII: an introduction to a different way of approaching quantum mechanics, based on a principle analogous to Huygens’ principle in classical wave optics.

KIII: UNSTABLE LEVELS, LIFETIMES
Simple introduction to the important physical concepts of instability and lifetimes; easy, but can be skipped in a first reading.

LIII: EXERCISES

MIII: BOUND STATES OF A PARTICLE IN A “POTENTIAL WELL” OF ARBITRARY SHAPE
NIII: UNBOUND STATES OF A PARTICLE IN THE PRESENCE OF A POTENTIAL WELL OR BARRIER OF ARBITRARY SHAPE
OIII: QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE
Return to one-dimensional problems, considered from a more general point of view than in Chapter I and its complements.
MIII: generalization to an arbitrary potential well of the main results obtained in § 2-c of Complement HI; recommended, since easy and physically important.
NIII: study of unbound stationary states in an arbitrary potential; a little more formal; the definitions and results of this complement are necessary for Complement OIII.
OIII: introduction of the concept (fundamental to solid state physics) of energy bands in a potential having a periodic structure (this concept will be treated differently in Complement FXI); rather difficult, can be reserved for later reading.



Complement AIII
Particle in an infinite one-dimensional potential well

1. Distribution of the momentum values in a stationary state
   1-a. Calculation of the function φ̄_n(p), of ⟨P⟩ and of ∆P
   1-b. Discussion
2. Evolution of the particle’s wave function
   2-a. Wave function at the instant t
   2-b. Evolution of the shape of the wave packet
   2-c. Motion of the center of the wave packet
3. Perturbation created by a position measurement

In Complement HI (§ 2-c), we studied the stationary states of a particle in a one-dimensional infinite potential well. Here we intend to re-examine this subject from a physical point of view. This will allow us to apply some of the postulates of Chapter III to a concrete case. We shall focus in particular on the results obtained when the position or momentum of the particle is measured.

1. Distribution of the momentum values in a stationary state

1-a. Calculation of the function φ̄_n(p), of ⟨P⟩ and of ∆P

We have seen that the stationary states of the particle correspond to the energies¹:

E_n = n²π²ħ² / 2ma²    (1)

and to the wave functions:

φ_n(x) = √(2/a) sin(nπx/a)    (2)

(where a is the width of the well and n is any positive integer). Consider a particle in the state |φ_n⟩, with energy E_n. The probability of a measurement of the momentum of the particle yielding a result between p and p + dp is:

𝒫_n(p) dp = |φ̄_n(p)|² dp    (3)

with:

φ̄_n(p) = (1/√(2πħ)) √(2/a) ∫₀^a sin(nπx/a) e^{-ipx/ħ} dx    (4)

¹ We shall use the notation of Complement HI.

This integral is easy to calculate; it is equal to:

φ̄_n(p) = (1/2i) (1/√(πaħ)) ∫₀^a [ e^{i(nπ/a − p/ħ)x} − e^{−i(nπ/a + p/ħ)x} ] dx
        = (1/2i) (1/√(πaħ)) [ (e^{i(nπ/a − p/ħ)a} − 1)/i(nπ/a − p/ħ) + (e^{−i(nπ/a + p/ħ)a} − 1)/i(nπ/a + p/ħ) ]    (5)

that is:

φ̄_n(p) = (1/2i) √(a/πħ) e^{-ipa/2ħ} e^{inπ/2} [ F(p − nπħ/a) + (−1)^{n+1} F(p + nπħ/a) ]    (6)

with:

F(p) = sin(pa/2ħ) / (pa/2ħ)    (7)

To within a proportionality factor, the function φ̄_n(p) is the sum (or the difference) of two “diffraction functions” F, centered at p = ±nπħ/a. The “width” of these functions (the distance between the first two zeros, symmetric with respect to the central value) does not depend on n and is equal to 4πħ/a. Their “amplitude” does not depend on n either. The function inside the brackets in expression (6) is even if n is odd, and odd if n is even. The probability density 𝒫_n(p) given in (3) is therefore an even function of p in all cases, so that:

⟨P⟩ = ∫_{−∞}^{+∞} p 𝒫_n(p) dp = 0    (8)

The mean value of the momentum of the particle in the energy state |φ_n⟩ is therefore zero.

Let us calculate, in the same way, the mean value of the square of the momentum. Using the fact that in the { |x⟩ } representation P acts like (ħ/i) d/dx, and performing an integration by parts, we obtain²:

⟨P²⟩ = ħ² ∫₀^a |dφ_n/dx|² dx = ħ² (nπ/a)² (2/a) ∫₀^a cos²(nπx/a) dx = n²π²ħ²/a²    (9)

From (8) and (9), we get:

∆P = √(⟨P²⟩ − ⟨P⟩²) = nπħ/a    (10)

The root mean square deviation ∆P therefore increases linearly with n.

² Result (9) could also be derived from (6) by performing the integral ⟨P²⟩ = ∫_{−∞}^{+∞} p² |φ̄_n(p)|² dp. This calculation, which presents no theoretical difficulties, is nevertheless not as direct as the one given here.
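Formulas (4), (8) and (10) can be checked by direct numerical integration. The sketch below is not from the text; it assumes Python with NumPy, units in which ħ = a = 1, and an arbitrarily chosen level n = 3. It computes φ̄_n(p) by quadrature and compares ⟨P⟩ and ∆P with the analytical results.

```python
import numpy as np

hbar, a, n = 1.0, 1.0, 3                             # units with hbar = a = 1; level n chosen arbitrarily
x = np.linspace(0, a, 20001)
phi = np.sqrt(2 / a) * np.sin(n * np.pi * x / a)     # stationary wave function, formula (2)

def phi_bar(p):
    # Fourier transform (4): (2*pi*hbar)^(-1/2) * integral of phi_n(x) exp(-ipx/hbar) dx
    return np.trapz(phi * np.exp(-1j * p * x / hbar), x) / np.sqrt(2 * np.pi * hbar)

p = np.linspace(-40, 40, 2001)
dens = np.abs(np.array([phi_bar(pi) for pi in p])) ** 2     # probability density P_n(p), formula (3)
mean_p = np.trapz(p * dens, p)
mean_p2 = np.trapz(p ** 2 * dens, p)
print(mean_p)                                        # approximately 0, formula (8)
print(np.sqrt(mean_p2), n * np.pi * hbar / a)        # Delta P compared with formula (10)
```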

1-b. Discussion

Let us plot, for different values of n, the curves giving the probability density 𝒫_n(p). To do this, let us begin by studying the function inside the brackets in expression (6). For the ground state (n = 1), it is the sum of two functions F, the centers of these two diffraction curves being separated by half their width (Fig. 1-a). For the first excited level (n = 2), the distance between these centers is twice as large, and in this case, moreover, the difference of the two functions F must be taken (Fig. 2-a). Finally, for an excited level corresponding to a large value of n, the centers of the two diffraction curves are separated by a distance much greater than their width.

Figure 1: The wave function φ̄_1(p), associated in the { |p⟩ } representation with the ground state of a particle in an infinite well, is obtained by adding two diffraction functions F centered at p = ±πħ/a (dashed-line curves in figure a). Since the centers of these two functions are separated by half their width, their sum has the shape represented by the solid-line curve in figure a. Squaring this sum, one obtains the probability density 𝒫_1(p) associated with a measurement of the momentum of the particle (fig. b).

Squaring these functions, one obtains the probability density 𝒫_n(p) (cf. Fig. 1-b and 2-b). Note that for large n the interference term between F(p + nπħ/a) and F(p − nπħ/a) is negligible (because of the separation of the centers of the two curves):

𝒫_n(p) ≅ (a/4πħ) [ F²(p − nπħ/a) + F²(p + nπħ/a) ]    (11)

Figure 2: For the first excited level, the function φ̄_2(p) is obtained by taking the difference between two functions F, which have the same width as in Figure 1-a but are now more widely separated (dashed-line curves in figure a). The curve obtained in this way is the solid line in figure a. The probability density 𝒫_2(p) then has two maxima, located in the neighborhood of p = ±2πħ/a (fig. b).

The function 𝒫_n(p) then has the shape shown in Figure 3. It can be seen that when n is large, the probability density has two symmetrical peaks, of width 4πħ/a, centered at p = ±nπħ/a. It is then possible to predict with almost complete certainty the results of a measurement of the momentum of the particle in the state |φ_n⟩: the value found will be nearly equal to +nπħ/a or −nπħ/a (the two opposite values being equally probable), the relative accuracy³ improving as n increases. This is simple to understand: for large n, the function φ_n(x), which varies sinusoidally, performs numerous oscillations inside the well; it can then be considered to be practically the sum of two progressive waves corresponding to opposite momenta p = ±nπħ/a.

When n decreases, the relative accuracy with which one can predict the possible values of the momentum diminishes. We see, for example, in Figure 2-b, that when n = 2, the function 𝒫_2(p) has two peaks whose widths are comparable to their distance from the origin. In this case, the wave function undergoes only one oscillation inside the well. It is not surprising that, for this sinusoid “truncated” at x = 0 and x = a, the wavelength (and therefore the momentum of the particle) is poorly defined. Finally, for the ground state, the wave function is represented by half a sinusoidal arc: the relative values of the wavelength and momentum of the particle are then very poorly known (Fig. 1-b).

Figure 3: When n is large (a very excited level), the probability density 𝒫_n(p) has two pronounced peaks, centered at the values ±p_cl = ±nπħ/a, which are the momenta associated with the classical motion at the same energy.

³ The absolute accuracy is independent of n, since the width of the curves is always 4πħ/a.

Comments:

(i) Let us calculate the momentum p_cl of a classical particle of energy E_n given in (1); we have:

p_cl²/2m = E_n = n²π²ħ²/2ma²    (12)

that is:

p_cl = ±nπħ/a    (13)

When n is large, the two peaks of 𝒫_n(p) therefore correspond to the classical values of the momentum.

(ii) We see that, for large n, although the absolute value of the momentum is well-defined, its sign is not. This is why ∆P is large: for probability distributions with two maxima like that of Figure 3, the root mean square deviation reflects the distance between the two peaks; it is no longer related to their widths.

2. Evolution of the particle’s wave function

Each of the states |φ_n⟩, with its wave function φ_n(x), describes a stationary state, which leads to time-independent physical predictions. Time evolution appears only when the state vector is a linear combination of several kets |φ_n⟩. We shall consider here a very simple case, for which at time t = 0 the state vector |ψ(0)⟩ is:

|ψ(0)⟩ = (1/√2) [ |φ_1⟩ + |φ_2⟩ ]    (14)

2-a. Wave function at the instant t

Apply formula (D-54) of Chapter III; we immediately obtain:

|ψ(t)⟩ = (1/√2) [ e^{-iE_1 t/ħ} |φ_1⟩ + e^{-iE_2 t/ħ} |φ_2⟩ ]    (15)

or, omitting a global phase factor of e^{-iE_1 t/ħ}:

|ψ(t)⟩ = (1/√2) [ |φ_1⟩ + e^{-iω_21 t} |φ_2⟩ ]    (16)

with:

ω_21 = (E_2 − E_1)/ħ = 3π²ħ/2ma²    (17)

2-b. Evolution of the shape of the wave packet

The shape of the wave packet is given by the probability density:

|ψ(x, t)|² = (1/2) φ_1²(x) + (1/2) φ_2²(x) + φ_1(x) φ_2(x) cos ω_21 t    (18)

We see that the time variation of the probability density is due to the interference term in φ_1 φ_2. Only one Bohr frequency appears, ν_21 = (E_2 − E_1)/h, since the initial state (14) is composed only of the two states |φ_1⟩ and |φ_2⟩. The curves corresponding to the variation of the functions φ_1², φ_2² and φ_1 φ_2 are plotted in Figures 4-a, b and c. Using these figures and relation (18), it is not difficult to represent graphically the variation in time of the shape of the wave packet (cf. Fig. 5): we see that the wave packet oscillates between the two walls of the well.

Figure 4: Graphical representation of the functions φ_1² (the probability density of the particle in the ground state), φ_2² (the probability density of the particle in the first excited state) and φ_1 φ_2 (the cross term responsible for the evolution of the shape of the wave packet).

Figure 5: Periodic motion of a wave packet obtained by superposing the ground state and the first excited state of a particle in an infinite well (the shape of the packet is shown at t = 0, 0 < t < π/2ω_21, π/2ω_21, π/ω_21, 3π/2ω_21 and 2π/ω_21). The frequency of the motion is the Bohr frequency ν_21 = ω_21/2π.

2-c. Motion of the center of the wave packet

Let us calculate the mean value ⟨X⟩(t) of the position of the particle at time t. It is convenient to take:

X̄ = X − a/2    (19)

since, by symmetry, the diagonal matrix elements of X̄ are zero:

⟨φ_1|X̄|φ_1⟩ = (2/a) ∫₀^a (x − a/2) sin²(πx/a) dx = 0
⟨φ_2|X̄|φ_2⟩ = (2/a) ∫₀^a (x − a/2) sin²(2πx/a) dx = 0    (20)

We then have:

⟨X̄⟩(t) = Re { e^{-iω_21 t} ⟨φ_1|X̄|φ_2⟩ }    (21)

with:

⟨φ_1|X̄|φ_2⟩ = ⟨φ_1|X|φ_2⟩ = (2/a) ∫₀^a x sin(πx/a) sin(2πx/a) dx = −16a/9π²    (22)

Therefore:

⟨X⟩(t) = a/2 − (16a/9π²) cos ω_21 t    (23)

Figure 6: Time variation of the mean value ⟨X⟩(t) (over one period 2π/ω_21 of the motion) corresponding to the wave packet’s motion plotted in Figure 5. The dashed line represents the position of a classical particle moving with the same period. Quantum mechanics predicts that the center of the wave packet will turn back before reaching the wall, as explained by the action of the potential on the “edges” of the wave packet.

The variation of ⟨X⟩(t) is represented in Figure 6. In dashed lines, the variation of the position of a classical particle has been traced, for a particle moving to and fro in the well with the angular frequency ω_21 (since it is not subjected to any force except at the walls, its position varies linearly with t between 0 and a during each half-period). We immediately notice a very clear difference between these two types of motion, classical and quantum mechanical. The center of the quantum wave packet, instead of turning back at the walls of the well, executes a movement of smaller amplitude and retraces its steps before reaching the regions where the potential is not zero. We see again here a result of § D-2 of Chapter I: since the potential varies infinitely quickly at x = 0 and x = a, its variation within a domain of the order of the dimension of the wave packet is not negligible, and the motion of the center of the wave packet does not obey the laws of classical mechanics (see also Chapter III, § D-1-d). The physical explanation of this phenomenon is the following: before the center of the wave packet has touched the wall, the action of the potential on the “edges” of this packet is sufficient to make it turn back.
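The oscillation (23) of the center of the wave packet can be reproduced by evolving the superposition (14) numerically. A short sketch follows; it is not from the text and assumes Python with NumPy and units in which ħ = m = a = 1.

```python
import numpy as np

hbar, m, a = 1.0, 1.0, 1.0
E = lambda n: n ** 2 * np.pi ** 2 * hbar ** 2 / (2 * m * a ** 2)   # energies, formula (1)
x = np.linspace(0, a, 2001)
phi1 = np.sqrt(2 / a) * np.sin(np.pi * x / a)
phi2 = np.sqrt(2 / a) * np.sin(2 * np.pi * x / a)
w21 = (E(2) - E(1)) / hbar                                          # Bohr frequency, formula (17)

def mean_x(t):
    # |psi(t)> built as in (15); <X>(t) computed by direct integration
    psi = (phi1 * np.exp(-1j * E(1) * t / hbar)
           + phi2 * np.exp(-1j * E(2) * t / hbar)) / np.sqrt(2)
    return np.trapz(x * np.abs(psi) ** 2, x)

for t in np.linspace(0, 2 * np.pi / w21, 5):
    exact = a / 2 - (16 * a / (9 * np.pi ** 2)) * np.cos(w21 * t)   # formula (23)
    print(f"t = {t:6.3f}   <X>(t) = {mean_x(t):.4f}   formula (23): {exact:.4f}")
```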



PARTICLE IN AN INFINITE ONE-DIMENSIONAL POTENTIAL WELL

Comment:

The mean value of the energy of the particle in the state |ψ(t)⟩ calculated in (15) is easy to obtain:

⟨H⟩ = (1/2) E_1 + (1/2) E_2 = (5/2) E_1    (24)

as is:

⟨H²⟩ = (1/2) E_1² + (1/2) E_2² = (17/2) E_1²    (25)

which gives:

∆H = √(⟨H²⟩ − ⟨H⟩²) = (3/2) E_1    (26)

Note in particular that ⟨H⟩, ⟨H²⟩ and ∆H are not time-dependent; this could have been foreseen, since H is a constant of the motion. In addition, we see from the preceding discussion that the wave packet evolves appreciably over a time of the order of:

∆t ≅ 1/ω_21    (27)

Using (26) and (27), we find:

∆H · ∆t ≅ (3/2) E_1 × ħ/3E_1 = ħ/2    (28)

We again find the time-energy uncertainty relation (§ D-2-e of Chapter III).

3. Perturbation created by a position measurement

Consider a particle in the state |φ_1⟩. Assume that the position of the particle is measured at time t = 0, with the result x = a/2. What are the probabilities of the different results that can be obtained in a measurement of the energy, performed immediately after this first measurement?

One must beware of the following false argument: after the measurement, the particle is in the eigenstate of X corresponding to the result found, and its wave function is therefore proportional to δ(x − a/2); if a measurement of the energy is then performed, the various values E_n can be found, with probabilities proportional to:

| ∫₀^a dx δ(x − a/2) φ_n(x) |² = |φ_n(a/2)|² = 2/a   if n is odd
                                             = 0     if n is even    (29)

Using this incorrect argument, one would find the probabilities of all values of E_n corresponding to odd n to be equal. This is absurd, since the sum of these probabilities would then be infinite.

COMPLEMENT AIII



This error results from the fact that we have not taken the norm of the wave function into account. To apply the fourth postulate of Chapter III correctly, it is necessary for the wave function just after the first measurement to be normalized. However, it is not possible⁴ to normalize the function δ(x − a/2). The problem posed above must therefore be stated more precisely. As we saw in § E-2-b of Chapter III, an experiment in which the measurement of an observable with a continuous spectrum is performed never yields any result with complete accuracy. For the case with which we are concerned, we can only say that:

a/2 − ε/2 ≤ x ≤ a/2 + ε/2    (30)

where ε depends on the measurement device used but is never zero. If we assume ε to be much smaller than the extension of the wave function before the measurement (here, a), the wave function after the measurement will be practically (1/√ε) v^(ε)(x − a/2) [v^(ε)(x − a/2) is the function that is zero everywhere except in the interval defined in (30), where it takes on the value 1; cf. Appendix II, § 1-a]. This wave function is indeed normalized, since:

∫ dx | (1/√ε) v^(ε)(x − a/2) |² = 1    (31)

Figure 7: Variation with n of the probability 𝒫(E_n) of finding the energy E_n after a measurement of the particle’s position has yielded the result a/2 with an accuracy of ε (only odd values of n contribute, with weight of order 2ε/a up to n of order 2a/ε). The smaller ε, the greater the probability of finding high energy values.

What happens now if the energy is measured? Each value E_n can be found with the probability:

𝒫(E_n) = | ∫₀^a dx φ_n(x) (1/√ε) v^(ε)(x − a/2) |²
       = (8/n²π²)(a/ε) sin²(nπε/2a)   if n is odd
       = 0                            if n is even    (32)

The variation with respect to n of 𝒫(E_n), for fixed ε and odd n, is shown in Figure 7. This figure shows that the probability 𝒫(E_n) becomes negligible when n is much larger than 2a/ε. Therefore, however small ε may be, the distribution of the probabilities 𝒫(E_n) depends strongly on ε. This is why, in the first argument, where we set ε = 0 at the beginning, we could not obtain the correct result. We also see from the figure that the smaller ε is, the more the curve extends towards large values of n. The interpretation of this result is the following: according to Heisenberg’s uncertainty relations (cf. Chap. I, § C-3), if one measures the position of the particle with great accuracy, one drastically changes its momentum. Thus kinetic energy is transferred to the particle, the amount increasing as ε decreases.

⁴ We see concretely in this example that a δ-function cannot represent a physically realizable state.

281



STUDY OF THE PROBABILITY CURRENT IN SOME SPECIAL CASES

Complement BIII Study of the probability current in some special cases

1 2

Expression for the current in constant potential regions . . Application to potential step problems . . . . . . . . . . . . 2-a Case where 0 . . . . . . . . . . . . . . . . . . . . . . . . 2-b Case where 0 . . . . . . . . . . . . . . . . . . . . . . . . Probability current of incident and evanescent waves, in the case of reflection from a two-dimensional potential step

3

283 284 284 285 285

The probability current associated with a particle having a wave function was defined in Chapter III by the relation: J(r ) =

~

[

2

(r )∇ (r )

c.c.]

(r )

(1)

(where c.c. is an abbreviation for complex conjugate). In this complement, we shall study this probability current in greater detail in some special cases: one- and two-dimensional “square” potentials. 1.

Expression for the current in constant potential regions

Consider a one-dimensional problem, with a particle of energy placed in a constant potential 0 . In Complement HI , we distinguished between several cases. ( ) When 0 , the wave function is written: ( )=

e

+

e

(2)

with: 0

=

~2 2

2

(3)

Substituting (2) into (1), we obtain: =

~

2

2

(4)

The interpretation of this result is simple: the wave function given in (2) corresponds to 2 2 two plane waves of opposite momenta = ~ with probability densities and . ( ) When , we have: 0 ( )=

e

+

~2 2

2

e

(5)

with: 0

=

(6) 283



COMPLEMENT BIII

V(x)

V0

Figure 1: Potential step of height

II

I

0.

x

0

Substituting (5) into (1), we obtain: =

~

[

+ c.c.]

(7)

In this case, we see that the two exponential waves must both have non-zero coefficients for the probability current to be non-zero. 2.

Application to potential step problems

Let us apply these results to the potential step problems studied in Complements HI , and JI . We shall therefore consider a particle of mass and energy propagating in the direction and arriving at = 0 at a potential step of height 0 (Fig. 1). 2-a.

Case where

0

Apply formula (4) to wave functions (11) and (12) of Complement HI , setting, as in that complement: 2

=0

(8)

In region I, the probability current is: =

I

~

1 1

2

2 1

(9)

and in region II: II

=

~

2 2

2

(10)

I is the difference of two terms, the first one corresponding to the incident current and the second, to the reflected current. The ratio of these two currents gives the reflection coefficient of the barrier: 2

=

1 1

which is precisely formula (15) of Complement HI . 284

(11)



STUDY OF THE PROBABILITY CURRENT IN SOME SPECIAL CASES

Similarly, the transmission coefficient of the barrier is the ratio of the transmitted current II to the incident current; we therefore have: 2

=

2

2

1

1

(12)

and we again find relation (16) of Complement HI . 2-b.

Case where

0

Since the expression for the wave function 1 ( ) is the same as in § 2-a, relation (9) is still valid. However, in region II, the wave function is: II (

)=

2

e

(13)

2

[since, in equation (20) of Complement HI , II

=0

2

= 0]. Using (7), we thus obtain: (14)

The transmitted flux is zero, as is consistent with relation (24) of HI . How should we interpret the fact that, in region II, the probability current is zero while the probability of finding the particle in this region is not? Let us refer to the results obtained in § 1 of Complement JI . We saw that part of the incident wave packet enters the classically forbidden region II, and then turns back, before setting out in the negative direction (this incursion into region II being responsible for the delay upon reflection). In the steady state, we shall therefore have two probability currents in region II: a positive current corresponding to the entrance into this region of part of the incident wave packet; a negative current corresponding to the return towards region I of this part of the wave packet. These two currents are exactly equal, so the overall result is zero. In the case of a one-dimensional problem, the structure of the probability current of the evanescent wave is therefore masked by the fact that the two opposite currents balance. This is why we are going to consider a two-dimensional problem, for the case of oblique reflection, so as to obtain a non-zero current and interpret its structure. 3.

Probability current of incident and evanescent waves, in the case of reflection from a two-dimensional potential step

We shall consider the following two-dimensional problem: a particle of mass , moving in the plane, has a potential energy ( ) which is independent of and given by: (

)=0

(

)=

if 0

0

if

0

(15)

The present case corresponds to the one studied in § 2 of Complement FI : the potential energy ( ) is the sum of a function 1 ( ) (potential energy of a onedimensional step) and a function 2 ( ), which is zero here. We can therefore look for a solution of the eigenvalue equation of the Hamiltonian in the form of a product: (

)=

1(

)

2(

)

(16) 285



COMPLEMENT BIII

The functions 1 ( ) and 2 ( ) satisfy one-dimensional eigenvalue equations which correspond respectively to 1 ( ) and 2 ( ) and to energies 1 and 2 such that: 1

+

2

=

(total energy of the particle)

(17)

We shall assume 1 0 : the equation giving 1 ( ) therefore corresponds to total reflection in a one-dimensional problem, and we can use formulas (11) and (20) of Complement HI . As for the function 2 ( ), it can be obtained immediately since it corresponds to the case of a free particle ( 2 = 0): it is a plane wave. We therefore have, in region I ( 0): 1(

e(

)=

+

)

e(

+

+

)

(18)

with: 2

=

1

and, in region II ( II (

)=

2

=

~2

2

(19)

~2

0):

e

e

0

1)

(20)

with: 2 (

=

(21)

~2

Equations (22) and (23) of HI give us the ratios parameter defined by: tan

=

=

0

1

;

0

and

. Introducing the

(22)

2

1

we obtain: =

=e

+

2

(23)

and: =

2 +

= 2 cos e

(24)

Let us apply relation (1), which defines the probability current. We obtain, in region I: ( I) =

~

( I) =

~

2

2

=0 (25)

JI

286

e

+

e

2

=

~

2

[2 + 2 cos(2

+ 2 )]



STUDY OF THE PROBABILITY CURRENT IN SOME SPECIAL CASES

2ky

kr

Figure 2: The sum of the probability currents associated with the incident and reflected waves yields a probability current parallel to .

ki

and in region II: (

II )

=0 (26)

JII (

II )

=

~

2

e

2

=

~

4

2

2

cos

e

2

In region I, only the ( I ) component of the probability current is non-zero; this component is the sum of two terms:

Jy

I

II

0

x

Figure 3: Because of the interference between the incident and reflected waves, the probability current in region I is an oscillatory function of ; in region II, it decreases exponentially (evanescent wave).

– the term proportional to 2 2 which results from the sum of the currents of the incident and reflected waves (cf. Fig. 2); – the term containing cos(2 + 2 ), which represents an interference effect between the two waves; it is responsible for the oscillation of the probability current with respect to (cf. Fig. 3). In region II, the probability current is again parallel to . Its exponential decay corresponds to the decay of the evanescent wave. This probability current arises from the fact that the wave packets do enter the second region (cf. Fig. 4) and, before turning back, propagate in the direction for a time of the order of the reflection delay [cf. Complement JI , equation (10)]. This penetration is also related to the lateral shift of the wave packet upon reflection (cf. Fig. 4). 287

COMPLEMENT BIII



Figure 4: The penetration of the particle into region II leads to a lateral shift upon reflection.

288



ROOT MEAN SQUARE DEVIATIONS OF TWO CONJUGATE OBSERVABLES

Complement CIII Root mean square deviations of two conjugate observables

1

The Heisenberg relation for

and

. . . . . . . . . . . . . . 289

2

The “minimum” wave packet . . . . . . . . . . . . . . . . . . 290

Two conjugate observables and are two observables whose commutator [ ] is equal to ~. We shall show in this complement that the root mean square deviations (cf. § C-5 of Chapter III) ∆ and ∆ , for any state vector of the system under study, satisfy the relation: ~ 2

∆ ∆

(1)

We shall then show that if the system is in a state where the product ∆ ∆ is exactly equal to ~ 2, the wave function associated with this state in the representation is a Gaussian wave packet (as is the wave function in the representation). 1.

The Heisenberg relation for

and

Consider the ket: =(

+

)

(2)

where is an arbitrary real parameter. For all , the square of the norm This is written: =

(

)( 2

= =

2

=

2

+

+

+

~+

)

(

[

)

] + 2

2

is positive.

2

+

2

2

2

0

(3)

The discriminant of this expression, of second order in , is therefore negative or zero: ~2

2

4

2

0

(4)

and we have: 2

2

~2 4

(5) 289

COMPLEMENT CIII



Assuming defined by:

to be given, let us now introduce the two observables

=

=

=

=

and

and

(6)

are also conjugate observables, since we have:

[

]=[

]= ~

(7)

Result (5), obtained above for 2

and

, is therefore also valid for

and

:

~2 4

2

(8)

In addition, referring to definition (C-23) (Chap. III) of the root mean square deviation and using (6), we see that: ∆

=

2



=

2

(9)

Relation (8) can therefore also be written: ~ 2

∆ ∆

(10)

Thus, if two observables are conjugate (as is the case when they correspond to a classical position and its conjugate momentum ), there exists an exact lower bound for the product ∆ ∆ . We thus generalize the Heisenberg uncertainty relation.

Comment:

This argument can easily be generalized to two arbitrary observables One obtains: ∆ 2.



1 [ 2

]

and

.

(11)

The “minimum” wave packet

When the minimum value of the product ∆ ∆

is attained:

~ 2

(12)

the state vector and .

is said to correspond to a minimum wave packet for the observables

∆ ∆

290

=



ROOT MEAN SQUARE DEVIATIONS OF TWO CONJUGATE OBSERVABLES

According to the preceding argument, relation (12) requires that the square of the norm of the ket: =(

+

)

(13)

be a second-order polynomial in therefore zero: (

+

)

0

=[

+

with a double root 0(

)]

0.

When

=0

=

0,

the ket

is (14)

On the other hand, if ∆ ∆ ~ 2, the polynomial which gives can never be equal to zero (it is positive for all ). Therefore, the necessary and sufficient condition for the product ∆ ∆ to take on its minimum value ~ 2 is that the kets ( ) and ( ) be proportional. The proportionality coefficient can easily be calculated. When ∆ ∆ = ~ 2, the 0 equation: 2

=

(∆ )2

~ + (∆ )2 = 0

(15)

has for its double root: 0

=

~ 2(∆ )2 = 2(∆ )2 ~

(16)

Let us write relation (14) in the representation (for simplicity, we assume that the eigenvalues of to be non-degenerate). Using the fact (cf. Complement EII ) ~ d that in this representation, acts like , we obtain: d +~

0

d d

( )=0

0

(17)

with: ( )=

(18)

To integrate equation (17), it is convenient to introduce the function ( ) defined by: ~

( )=e

(

)

(19)

Substituting (19) into (17), we thus obtain a more simple equation: +

0~

d d

( )=0

(20)

whose solution is: ( )= (where obtain:

e

2

2

0~

(21)

is an arbitrary complex constant). Substituting (16) and (21) into (19), we 2

( )=

e

~

e

2∆

(22) 291

COMPLEMENT CIII



This function can be normalized by setting: = 2 (∆ )2

1 4

(23)

We thus arrive at the following conclusion: when the product ∆ ∆ takes on its minimum value ~ 2, the wave function in the representation is a Gaussian wave packet, obtained from the Gaussian function ( ) by transformation (19) (which is equivalent to two changes of the origin, one on the -axis and one on the -axis).

Comment:

This argument in the representation can be repeated in the tation. One then finds that the wave function ( ) defined by: ( )=

+

1 2 ~

=

represen-

~

d e

( )

(24)

is also a Gaussian function, given by: 2

( ) = 2 (∆ )

2

1 4

e

to within a phase factor exp(

292

~

e

2∆

}).

(25)



MEASUREMENTS BEARING ON ONLY ONE PART OF A PHYSICAL SYSTEM

Complement DIII Measurements bearing on only one part of a physical system

1 2 3

Calculation of the physical predictions . . . . . . . . . . . . . 293 Physical meaning of a tensor product state . . . . . . . . . . 295 Physical meaning of a state that is not a tensor product . . 296

The concept of a tensor product, introduced in § F of Chapter II, enabled us to see how to construct, starting with the state spaces of two subsystems, that of the global system obtained by considering them together. We intend to pursue this study here, using the postulates of Chapter III to see what results can be obtained, when the state of the global system is known, from measurements bearing on only one subsystem. 1.

Calculation of the physical predictions

Consider a system composed of two parts (1) and (2) (for example, a system of two electrons). If (1) and (2) are the state spaces of parts (1) and (2), the state space of the global system (1) + (2) is the tensor product (1) (2). For example, the state of a twoelectron system is described by a wave function of six variables, ( 1 1 1 ; 2 2 2 ), associated with a ket of r (1) r (2) (cf. Chap. II, § F-4-b). It is possible to imagine measurements that bear on only one of the two parts [part (1), for example] of the global system. The observables ˜(1) corresponding to these measurements are defined in (1) (2) by extending1 the observables (1) acting only in (1) (cf. Chap. II, § F-2-b): (1) = ˜(1) = (1) (2) (1) where (2) is the identity operator in (2). The spectrum of ˜(1) in (1) (2) is the same as that of (1) in (1). On the other hand, we have seen that all the eigenvalues of ˜(1) are degenerate in (1) (2), even if none of the eigenvalues of (1) is degenerate in (1) [on the condition, of course, that the dimension of (2) be greater than 1]. When a measurement is made on system (1) alone, the global system may therefore be in several different states after the measurement, whatever the result (the state after the measurement depends not only on the result but also on the state before the measurement). From a physical point of view, this multiplicity of states corresponds to the degrees of freedom of system (2), about which no information is sought in the measurement. Let (1) be the projector, in (1), onto the eigensubspace related to the eigenvalue of (1): (1) =

(1)

(1)

(2)

=1 1 For

the sake of clarity, we shall adopt throughout this complement different notations for its extension ˜(1).

(1) and

293



COMPLEMENT DIII

where the kets (1) are orthonormal eigenvectors associated with . Let ˜ (1) be the projector, in (1) (2), onto the eigensubspace related to the same eigenvalue of ˜(1). ˜ (1) is obtained by extending (1) into (1) (2): ˜ (1) =

(1)

(2)

(3)

To write the identity operator (2) of (2), let us use the closure relation for an arbitrary orthonormal basis (2) of (2): (2) =

(2)

(2)

(4)

Substituting (4) into (3) and using (2), we obtain: ˜ (1) =

(1) (2)

(1) (2)

(5)

=1

Thus, knowing the state of the global system (assumed to be normalized to 1), we can calculate the probability (1) ( ) of finding the result in a measurement of (1) on part (1) of this system. Using general formula (B-14) of Chapter III, which here gives: (1)

(

˜ (1)

)=

(6)

we find: (1)

(

)=

(1) (2)

2

(7)

=1

Similarly, the state of the system after the measurement can be calculated; according to formula (B-31) of Chapter III, it is given by: ˜ (1)

=

(8)

˜ (1) that is, using (5): (1) (2) =

(1) (2)

=1

(9) (1) (2)

2

=1

Comments:

( ) The choice of an orthonormal basis (2) in (2) is arbitrary. We see from (3), (6) and (8) that the predictions concerning subsystem (1) do not depend on this choice. Physically, it is clear that if no measurement is performed on system (2), no state or set of states of this system can play a preferential role. 294

• ( ) If the state =

MEASUREMENTS BEARING ON ONLY ONE PART OF A PHYSICAL SYSTEM

before the measurement is a tensor product:

(1)

(2)

(10)

[where (1) and (2) are two normalized states of (1) and (2) respectively], it is easy to see, using (3) and (8), that the state is also a tensor product: =

(1)

(2)

(11)

with: (1) (1)

(1) =

(1)

(12)

(1) (1)

The state of system (1) has therefore changed, but not that of system (2). (

) If the eigenvalue of (1) is non-degenerate in (1) – or, more generally, if (1) actually represents a complete set of commuting observables of (1) – the index is no longer necessary in formula (2) and those which follow. The state of the system after a measurement yielding the result can always be put in the form of a product of two vectors. This can be seen by writing relation (9) in the form: =

(1)

(2)

(13)

where the normalized vector (2)

(2) of (2) is given by:

(1) (2) (14)

(2) = (1) (2)

2

Therefore, whatever the state of the global system before the measurement, the state of the system after a measurement bearing on part (1) alone is always a tensor product when this measurement is complete with respect to part (1) [although partial as regards the global system (1) + (2)]. 2.

Physical meaning of a tensor product state

To see what a product state represents physically, let us apply the results of the preceding paragraph to the particular case where the initial state of the global system is of the form (10). We immediately obtain, using (6) and (3): (1)

(

)=

(1) (2)

(1)

(2) (1) (2)

The very definition of the tensor product malized then allow us to write: (1)

(

)=

(1)

(1) (1)

=

(1)

(1) (1)

(1)

(15) (2) and the fact that

(2) is nor-

(2) (2) (2) (16) 295

COMPLEMENT DIII



(1)

( ) does not depend on (2) , but only on (1) . When the state of the global system has the form (10), all physical predictions relating to only one of the two systems do not, therefore, depend on the state of the other one and are expressed entirely in terms of (1) [or of (2) ], depending on whether it is system (1) alone [or system (2) alone] that is being observed. A product state (1) (2) can therefore be considered to represent the simple juxtaposition of two systems, one in the state (1) and the other in the state (2) . In such a state, the two systems are also said to be uncorrelated (more precisely, the results of the two types of measurements, bearing either on one system or on the other, correspond to independent random variables). Such a situation is realized when the two systems have been separately prepared in the states (1) and (2) and then united without interacting. 3.

Physical meaning of a state that is not a tensor product

Now consider the case in which the state of the global system is not a product state, that is, where cannot be written in the form (1) (2) . The predictions of measurement results bearing on only one of the two systems can then no longer be expressed in terms of a ket (1) [or (2) ] in which system (1) [or (2)] would be found. In this case, general formulas (6) and (7) must be used to find the probabilities of the various possible results. We assume here without proof that such a situation generally reflects the existence of correlations between systems (1) and (2). The results of measurements bearing on either system (1) or system (2) correspond to random variables which are not independent and can therefore be correlated. It can be shown, for example, that an interaction between the two systems transforms an initial state which is a product into one which is no longer a product: any interaction between two systems therefore introduces, in general, correlations between them. When the state of the global system is not a product (1) (2) , how can each partial system (1) or (2) be characterized, since the ket (1) or (2) can no longer be associated with it? This question is very important since, in general, every physical system has interacted with others in the past (even if it is isolated at the instant when it is being studied). The state of the global system: system (1) + system (2) with which it has interacted in the past is therefore not in general a product state, and it is not possible to associate a state vector (1) with system (1) alone. To resolve these difficulties, one must describe system (1), not by a state vector, but by an operator, called the density operator. The corresponding formalism, fundamental to statistical quantum mechanics, is introduced in Complement EIII (cf. § 5-b). However, system (1) can always be described by a state vector after a complete set of measurements has been performed on it. We have seen that whatever the state of the global system (1) + (2) before the measurement, a complete measurement on system (1) places the global system in a product state [cf. formulas (13) and (14)]. The vector associated with (1) is the unique eigenvector (to within a multiplicative factor) associated with the results of the complete set of measurements done on it. This set of measurements has therefore erased all correlations resulting from previous interactions between the two systems. If, at the moment of measurement, system (2) is already far away and no longer interacting with system (1), it can then be completely forgotten if one is interested only in system (1). 296



MEASUREMENTS BEARING ON ONLY ONE PART OF A PHYSICAL SYSTEM

Comment:

Nervertheless, the quantum state (2) reached by system (2) depends in general2 on the result of the measurements performed on system (1). This is easy to check with (14), when the state before the measurement is not a product state. Such a result may seem extremely surprising: how can the state of system (2) after a series of measurements made on system (1) change as a function of the result of measurements performed on an arbitrarily remote system (1), with which is does not interact any longer? To this “paradox”, studied in detail by many physicists, are attached the names of Einstein, Podolsky and Rosen; it is discussed in § F-1 of Chapter XXI.

References and suggestions for further reading:

The Einstein-Podolsky-Rosen paradox/argument: see also Do we really understand quantum mechanics?, F. Laloë, Cambridge University Press (2018), as well the subsection “Hidden variables and paradoxes” of section 5 of the bibliography; Bohm (5.1), §§ 22.15 to 22.19; d’Espagnat (5.3), Chap. 7. Photons produced in the decay of positronium: Feynman III (1.2), § 18.3; Dicke and Wittke (1.14), Chap. 7.

2 Recall

that this is not the case when

is a product state; cf. comment ( ) of § 1

297



THE DENSITY OPERATOR

Complement EIII The density operator

1 2 3

4

5

1.

Outline of the problem . . . . . . . . . . . . . . . . . . . . . . The concept of a statistical mixture of states . . . . . . . . . The pure case. Introduction of the density operator . . . . 3-a Description by a state vector . . . . . . . . . . . . . . . . . . 3-b Description by a density operator . . . . . . . . . . . . . . . . 3-c Properties of the density operator in a pure case . . . . . . . A statistical mixture of states (non-pure case) . . . . . . . . 4-a Definition of the density operator . . . . . . . . . . . . . . . . 4-b General properties of the density operator . . . . . . . . . . . 4-c Populations; coherences . . . . . . . . . . . . . . . . . . . . . Use of the density operator: some applications . . . . . . . 5-a System in thermodynamic equilibrium . . . . . . . . . . . . . 5-b Separate description of part of a physical system. Concept of a partial trace . . . . . . . . . . . . . . . . . . . . . . . . . .

299 299 301 301 302 303 304 304 305 307 308 308 309

Outline of the problem

Until now, we have considered systems whose state is perfectly well known. We have shown how to study their time evolution and how to predict the results of various measurements performed on them. To determine the state of a system at a given instant, it suffices to perform on the system a set of measurements corresponding to a C.S.C.O. For example, in the experiment studied in § A-3 of Chapter I, the polarization state of the photons is perfectly well known when the light beam has traversed the polarizer. However, in practice, the state of the system is often not perfectly determined. This is true, for example, of the polarization state of photons coming from a source of natural (unpolarized) light, and also for the atoms of a beam emitted by a furnace at temperature , where the atoms’ kinetic energy is known only statistically. The problem posed by the quantum description of such systems is the following: how can we incorporate into the formalism the incomplete information we possess about the state of the system, so that our predictions make maximum use of this partial information? To do this, we shall introduce here a very useful mathematical tool, the density operator, which facilitates the simultaneous application of the postulates of quantum mechanics and the results of probability calculations. 2.

The concept of a statistical mixture of states

When one has incomplete information about a system, one typically appeals to the concept of probability. For example, we know that a photon emitted by a source of 299

COMPLEMENT EIII



natural light can have any polarization state with equal probability. Similarly, a system in thermodynamic equilibrium at a temperature has a probability proportional to e of being in a state of energy . More generally, the incomplete information one has about the system usually presents itself, in quantum mechanics, in the following way: the state of this system may be either the state 1 with a probability 1 or the state 2 with a probability 2 , etc... Obviously: 1

+

2

+

=

=1

(1)

We then say that we are dealing with a statistical mixture of states 1 , 2 , ... with probabilities 1 , 2 ... Now let us see what happens to the predictions concerning the results of measurements performed on this system. If the state of the system were , we could use the postulates stated in Chapter III to determine the probability of obtaining one or another measurement result. Since such a possibility (the state ) has a probability of , it is clear that the results obtained must be weighted by the and then summed over the various values of , that is, over all the states of the statistical mixture.

Comments:

( ) The various states 1 , 2 , ... are not necessarily orthogonal. However, they can always be chosen normalized; in this complement, we shall assume that this is the case. ( ) Note that, in the present case, probabilities intervene at two different levels: – first, in the initial information about the system (until now, we have not introduced probabilities at this stage: we considered the state vector to be perfectly well known, in which case all the probabilities are zero, except one, which is equal to 1); – again, when the postulates concerning the measurement are applied (leading to probabilistic predictions, even if the initial state of the system is perfectly well known). There are thus two totally different reasons necessitating the introduction of probabilities: the incomplete nature of the initial information about the state of the system (such situations are also envisaged in classical statistical mechanics), and the (specifically quantum mechanical) uncertainty related to the measurement process. (

) A system described by a statistical mixture of states (with the probability of the state vector being ) must not be confused with a system whose state is a linear superposition of states1 : =

1 We

assume, in this comment ( ), that the states essential but it simplifies the discussion.

300

(2) are orthonormal. This hypothesis is not



THE DENSITY OPERATOR

It is often said in quantum mechanics, when the state vector is the ket 2 given in (2), that “the system has a probability of being in the state ”. If we want to be precise, this must be understood to mean that if we perform a set of measurements corresponding to a C.S.C.O. which has as an eigenvector, the probability of finding the set of eigenvalues associated with 2 is . But we have stressed, in § E-1 of Chapter III, the fact that a system in the state given in (2) is not simply equivalent to a system having the probability 1 2 of being in the state 1 , 2 2 of being in the state , there exist, in general, 2 , etc... In fact, for a linear combination of interference effects between these states (due to cross terms of the type , obtained when the modulus of the probability amplitudes is squared) which are very important in quantum mechanics. We therefore see that it is impossible, in general, to describe a statistical mixture by an “average state vector” which would be a superposition of the states . As we indicated earlier, when we take a weighted sum of probabilities, we can never obtain interference terms between the various states of a statistical mixture. 3.

The pure case. Introduction of the density operator

To study the behavior of a statistical mixture of states, we have envisaged one method: calculation of the physical predictions corresponding to a possible state ; weighting the results so obtained by the probability associated with this state and summation over . Although correct in principle, this method often leads to clumsy calculations. We have indicated [comment ( )] that it is impossible to associate an “average state vector” with the system. Actually, it is an “average operator” and not an “average vector” which permits a simple description of the statistical mixture of states: the density operator. Before studying this general case, we shall return to the simple case where the state of the system is perfectly known (all the probabilities are zero, except one). The system is then said to be in a pure state. We shall show that characterizing the system by its state vector is completely equivalent to characterizing it by a certain operator acting in the state space, the density operator. The usefulness of this operator will become apparent in § 4, where we shall show that nearly all the formulas involving this operator, and derived for the pure case, remain valid for the description of a statistical mixture of states. 3-a.

Description by a state vector

Consider a system whose state vector at the instant () =

()

is: (3)

where the form an orthonormal basis of the state space, assumed to be discrete (extension to the case of a continuous basis presents no difficulties). The coefficients 301

COMPLEMENT EIII



( ) satisfy the relation: ( )2=1

(4)

which expresses the fact that ( ) is normalized. If is an observable, with matrix elements: =

(5)

the mean value of ()=

at the instant

()

() =

()

Finally, the evolution of

is: ()

(6)

( ) is described by the Schrödinger equation:

d () = () () d where ( ) is the Hamiltonian of the system.

(7)

~

3-b.

Description by a density operator

Relation (6) shows that the coefficients ( ) enter into the mean values through quadratic expressions of the type ( ) ( ). These are simply the matrix elements of the operator ( ) ( ) , the projector onto the ket ( ) (cf. Chap. II, § B-3-b), as can be seen from (3): ()

()

=

()

()

(8)

It is therefore natural to introduce the density operator ( ), defined by: ()=

()

()

(9)

The density operator is represented in the matrix whose elements are: ()=

()

=

()

basis by a matrix called the density

()

(10)

We are going to show that the specification of ( ) suffices to characterize the quantum state of the system; that is, it enables us to obtain all the physical predictions that can be calculated from ( ) . Let us write formulas (4), (6) and (7) in terms of the operator ( ). According to (10), relation (4) indicates that the sum of the diagonal elements of the density matrix is equal to 1: ( )2=

( ) = Tr ( ) = 1

(11)

In addition, using (5) and (10), formula (6) becomes: ()=

()

=

()

= Tr 302

()

(12)



THE DENSITY OPERATOR

Finally, the time evolution of the operator ( ) can be deduced from the Schrödinger equation (7): d ()= d

d d

()

() +

()

1 1 () () () + ~ ( ~) 1 = [ ( ) ( )] ~ =

d d ()

() ()

() (13)

Therefore, in terms of the density operator, conservation of probability is expressed by: Tr ( ) = 1

(14)

The mean value of an observable ( ) = Tr

()

= Tr

is calculated using the formula:

()

(15)

and the time evolution obeys the equation: ~

d ()=[ () d

( )]

(16)

For completeness, we must also indicate how to calculate from ( ) the probabilities ( ) of the various results which can be obtained in the measurement of an observable at time . Actually, formula (15) can be used to do this. We know [see equation (B-14) of Chapter III] that ( ) can be written as a mean value, that of the projector onto the eigensubspace associated with : (

)=

()

()

(17)

Using (15), we therefore obtain: ( 3-c.

) = Tr

()

(18)

Properties of the density operator in a pure case

In a pure case, a system can be described just as well by a density operator as by a state vector. However, the density operator presents a certain number of advantages. First of all, we see from (9) that two state vectors ( ) and e ( ) (where is a real number), which describe the same physical state, correspond to the same density operator. Using this operator therefore eliminates the drawbacks related to the existence of an arbitrary global phase factor for the state vector. Moreover, we see from (14), (15) and (18) that the formulas using the density operator are linear with respect to it, while expressions (6) and (17) are quadratic with respect to ( ) . This is an important property which will be useful subsequently. Finally, let us mention some properties of ( ), which can be deduced directly from its definition (9): ()= ()

(19) 303

COMPLEMENT EIII



(the density operator is Hermitian)

Tr

2

()= ()

(20)

2

()=1

(21)

These last two relations, which follow from the fact that ( ) is a projector, are true only in a pure case. We shall see later that they are not valid for a statistical mixture of states. 4.

A statistical mixture of states (non-pure case)

4-a.

Definition of the density operator

We now return to the general case described in § 1, and consider a system for which (at a given instant) the various probabilities 1 2 ... ... are arbitrary, on the condition that they satisfy the relations: 0

1

1

2

(22) =1 Under these conditions, how does one calculate the probability of the observable will yield the result ? Let: (

)=

(

) that a measurement

(23)

be the probability of finding if the state vector were . To obtain the desired probability ( ), one must, as we have already indicated, weight ( ) by and then sum over : (

)=

(

)

(24)

Now, from (18), we have: (

) = Tr

(25)

where: =

(26)

is the density operator corresponding to the state have: (

)=

. Substituting (25) into (24), we

Tr

= Tr = Tr 304

(27)



THE DENSITY OPERATOR

where we have set: =

(28)

We therefore see that the linearity of the formulas which use the density operator enables us to express all physical predictions in terms of , the average of the density operators with weights ; is, by definition, the density operator of the system. Comment: An ensemble of pure states with probabilities leads to a single density operator , but the reverse is not true in general: the same density operator can be interpreted as several different statistical mixtures of pure states. For instance, in a space of states with dimension , a statistical mixture of the states of a given basis with the same probability 1 leads to a density operator that is the unit operator, divided by ; but the same operator may be obtained with a statistical mixture of the kets of any other basis . These various ensembles lead to the same probabilities ( ), and cannot be distinguished by measuring these probabilities. This situation is sometimes described as the “multiple preparations” of the same density operator.

4-b.

General properties of the density operator

Since the coefficients

are real,

is obviously a Hermitian operator like each of

the Let us calculate the trace of ; it is equal to: Tr =

Tr

(29)

Now, as we saw in § 3-b, the trace of Tr =

=1

is always equal to 1; it follows that: (30)

Relation (14) is therefore valid in the general case. We have already given, in (27), the expression that enables us to calculate the probability ( ) in terms of . Using this expression, we can easily generalize formula (15) to statistical mixtures: =

(

) = Tr

= Tr

(31)

[we have used formula (D-36-b) of Chapter II]. Now let us calculate the time evolution of the density operator. To do this, we shall assume that, unlike the state of the system, its Hamiltonian ( ) is perfectly well known. One can then easily show that if the system at the initial time 0 has the probability of being in the state , then, at a subsequent time , it has the same probability of being in the state ( ) given by: ~

d d

() =

()

() (32)

( 0) = 305

COMPLEMENT EIII



The density operator at the instant ()=

will then be:

()

(33)

with: ()=

()

According to (16), ~

d d

()=[ ()

()

(34)

( ) obeys the evolution equation: ( )]

The linearity of formulas (33) and (35) with respect to ~

d ( ) = [ ( ) ( )] d

(35) ( ) implies that: (36)

We can therefore generalize to a statistical mixture of states all the equations of § 3, with the exception of (20) and (21). We see that, since is no longer a projector, we have, in general2 : 2

=

(37)

and, consequently: Tr

2

1

(38)

Moreover, it is sufficient that one of the equations, (20) or (21), be satisfied for us to be sure that we are dealing with a pure state. Finally, we see from definition (28) that, for any ket , we have: = 2

=

(39)

and consequently: 0

(40)

is therefore a positive operator.

2 Assume, for example, that the states are mutually orthogonal. In an orthonormal basis including the , is diagonal and its elements are the . To obtain 2 , we simply replace by 2 . Relations (37) and (38) then follow from the fact that the are always less than 1 (except in the particular case where only one of them is non-zero: the pure case).

306

• 4-c.

THE DENSITY OPERATOR

Populations; coherences

What is the physical meaning of the matrix elements of in the basis? First, let us consider the diagonal element . According to (28), we have: =

[

]

(41)

that is, using (26) and introducing the components: ( )

=

of

in the =

( )

(42) basis: ( )

2

(43)

2

is a positive real number, whose physical interpretation is the following: if the

state of the system is , this number is the probability of finding, in a measurement, this system in the state . According to (41), if we take into account the indeterminacy of the state before the measurement, represents the average probability of finding the system in the state . For this reason, is called the population of the state : if the same measurement is carried out times under the same initial conditions, where is a large number, systems will be found in the state . It is evident ( ) 2 from (43) that is a positive real number, equal to zero only if all the are zero. A calculation analogous to the preceding one gives the following expression for the non-diagonal element : =

( ) ( )

(44)

( ) ( )

is a cross term, of the same type as those studied in § E-1 of Chapter III. It reflects the interference effects between the states and which can appear when the state is a coherent linear superposition of these states. According to (44), is the average of these cross terms, taken over all the possible states of the statistical mixture. In contrast to the populations, can be zero even if none of the products ( ) ( ) is: while is a sum of real positive (or zero) numbers, is a sum of complex numbers. If is zero, this means that the average (44) has cancelled out any interference effects between and . On the other hand, if is different from zero, a certain coherence subsists between these states. This is why the non-diagonal elements of are often called coherences.

Comments:

( ) The distinction between populations and coherences obviously depends on the basis chosen in the state space. Since is Hermitian, it is always 307



COMPLEMENT EIII

possible to find an orthonormal basis be written:

where

is diagonal.

= Since

can then (45)

is positive [relation (40)] and Tr

0

= 1, we have:

1 (46) =1

can thus be considered to describe a statistical mixture of the states with the probabilities (there are no coherences between the states ). ( ) If the kets are eigenvectors of the Hamiltonian be time-independent: =

, which is assumed to (47)

we obtain directly from (36): d d d ~ d ~

()=0 (48) ()=(

)

that is: ( ) = constant (49) ()=e

~(

)

(0)

The populations are constant, and the coherences oscillate at the Bohr frequencies of the system. (

) Using (40), one can prove the inequality: 2

It follows, for example, that populations are not zero. 5.

(50) can have coherences only between states whose

Use of the density operator: some applications

5-a.

System in thermodynamic equilibrium

The first example we shall consider is borrowed from quantum statistical mechanics. Consider a system in thermodynamic equilibrium with a reservoir at the absolute temperature . It can be shown that its density operator is then: = 308

1

e

(51)



THE DENSITY OPERATOR

where is the Hamiltonian operator of the system, is the Boltzmann constant, and is a normalization coefficient chosen so as to make the trace of equal to 1: = Tr e (

(52)

is called the “partition function”). In the basis of eigenvectors of =

1

=

1

=

1

=

1

, we have (cf. Complement BII , § 4-a):

e e

(53)

and: e e

=0

(54)

At thermodynamic equilibrium, the populations of the stationary states are exponentially decreasing functions of the energy (the lower the temperature , the more rapid the decrease), and the coherences between stationary states are zero. More details on the use of the density operator in statistical mechanics and at thermal equilibrium are given in Appendix VI of Volume III. 5-b.

Separate description of part of a physical system. Concept of a partial trace

We now return to the problem mentioned in § 3 of Complement DIII . Consider two different systems (1) and (2) and the global system (1) + (2), whose state space is the tensor product: = (1)

(2)

(55)

Let (1) be a basis of (1) and (2) a basis of (2); the kets (1) (2) form a basis of . The density operator of the global system is an operator which acts in . We saw in Chapter II (cf. § F-2-b) how to extend into an operator which acts only in (1) [or (2)]. We are going to show here how to perform the inverse operation: we shall construct from an operator (1) [or (2)] acting only in (1) [or (2)]. This will enable us to make all the physical predictions about measurements bearing only on system (1), or on system (2). This operation will be called a partial trace with respect to (2) [or (1)]. Let us introduce the operator (1) whose matrix elements are: (1) (1)

(1) =

(

(1)

By definition, (1) is obtained from (1) = Tr2

(2) ) (

(1)

(2) )

(56)

by performing a partial trace on (2): (57)

Similarly, the operator: (2) = Tr1

(58)

309

COMPLEMENT EIII



has matrix elements: (2) (2)

(2) =

(1)

(2)

(1)

(2)

(59)

It is clear why these operations are called partial traces. We know that the (total) trace of Tr =

(1)

(2)

(1)

(2)

is: (60)

The difference between (60) and (56) [or (59)] is the following: for the partial traces, the indices and (or and ) are not required to be equal and the summation is performed only over (or ). We have, moreover: Tr = Tr1 (Tr2 ) = Tr2 (Tr1 )

(61)

(1) and (2) are therefore, like , operators whose trace is equal to 1. It can be verified from their definitions that they are Hermitian, and that they satisfy all the general properties of a density operator (cf. § 4-b). Now let (1) be an observable acting in (1), and ˜(1) = (1) (2) its extension in . We obtain, using (31), the definition of the trace, and the closure relation on the (1) (2) basis: ˜(1) = Tr

˜(1)

=

(1)

(2)

(1) (1)

=

(1)

(2)

(2)

(2) (1) (1)

(1)

(2)

(1)

(2)

(2) (1)

(1)

(2)

(2)

(62)

Now: (2)

(2) =

(63)

We can therefore write (62) in the form: ˜(1) =

(1) (2)

(1) (2)

(1)

(1)

(1)

Inside the brackets on the right-hand side of (64), we recognize the matrix element of defined in (56). We therefore have: ˜(1) = = = Tr

(1) (1)

(1)

(1) (1) (1) (1) (1)

(1)

(1)

(64) (1)

(1)

(1) (65)

Let us compare this result with (31). We see that the partial trace (1) enables us to calculate all the mean values ˜(1) as if the system (1) were isolated and had (1) for a density operator. Making the same comment as for formula (17), we see that (1) also enables us to obtain the probabilities of all the results of measurements bearing on system (1) alone.

310



THE DENSITY OPERATOR

Comments: ( ) We saw in Complement DIII that it is impossible to assign a state vector to system (1) [or (2)] when the state of the global system (1) + (2) is not a product state. We now see that the density operator is a much more simple tool than the state vector. In all cases (whether the global system is in a product state or not, whether it corresponds to a pure case or to a statistical mixture), one can always, thanks to the partial trace operation, assign a density operator to subsystem (1) [or (2)]. This permits us to calculate all the physical predictions about this subsystem. ( ) Even if describes a pure state (Tr 2 = 1), this is not in general true of the density operators (1) and (2) obtained from by a partial trace. It can be verified from (56) [or (59)] that Tr 2 (1) [or Tr 2 (2) ] is not generally equal to 1. This is another way of saying that it is not in general possible to assign a state vector to (1) [or (2)], except, of course, if the global system is in a product state. (

) If the global system is in a product state: =

(1)

(2)

(66)

we can verify directly that the corresponding density operator is written: = (1)

(2)

(67)

with: (1) =

(1)

(1)

(2) =

(2)

(2)

(68)

More generally, we can envisage states of the global system for which the density operator can be factored as in (67) [ (1) and (2) can correspond to statistical mixtures as well as to pure cases]. The partial trace operation then yields: Tr2

(1)

(2) = (1)

Tr1

(1)

(2) = (2)

(69)

An expression such as (67) therefore represents the simple juxtaposition of a system (1), described by the density operator (1), and a system (2), described by the density operator (2). ( ) Starting with an arbitrary density operator [that cannot be factored as in (67)], let us calculate (1) = Tr2 and (2) = Tr1 . Then let us form the product: = (1)

(2)

(70)

Unlike the case envisaged in comment ( ), is in general different from . When the density operator cannot be factored as in (67), there is therefore a certain “correlation” between systems (1) and (2), which is no longer contained in the operator of formula (70). ( ) If the evolution of the global system is described by equation (36), it is in general impossible to find a Hamiltonian operator relating to system (1) alone that would enable us to write an analogous equation for (1). While the definition, at any time, of (1) in terms of is simple, the evolution of (1) is much more difficult to describe.

311

COMPLEMENT EIII



References and suggestions for further reading:

Articles by Fano (2.31) and Ter Haar (2.32). Using the density operator to study relaxation phenomena: Abragam (14.1), Chap. VIII; Slichter (14.2), Chap. 5; Sargent, Scully and Lamb (15.5), Chap. VII.

312



THE EVOLUTION OPERATOR

Complement FIII The evolution operator

1 2

General properties . . . . . . . . . . . . . . . . . . . . . . . . . 313 Case of conservative systems . . . . . . . . . . . . . . . . . . 315

In § D-1-b of Chapter III, we saw that the transformation of ( 0 ) (the state vector at the initial instant 0 ) into ( ) (the state vector at an arbitrary instant) is linear. There therefore exists a linear operator ( 0 ) such that: () =

(

0)

( 0)

(1)

We intend to study here the principal properties of the evolution operator of the system. 1.

(

0 ),

which is, by definition,

General properties

Since the ket (

0

0)

( 0 ) is arbitrary, it is clear from (1) that:

=

(2)

Also, substituting (1) into the Schrödinger equation, we obtain: (

~

0)

( 0) =

() (

0)

( 0)

(3)

from which, for the same reason as above: (

~

0)

=

() (

0)

(4)

The first-order differential equation (4) completely defines ( 0 ), taking the initial condition (2) into account. Note, moreover, that (2) and (4) can be condensed into a single integral equation: (

0)

( ) (

= ~

0 )d

Now let us consider the parameter just like . We then write (1) in the form: () = But

(

(5)

0

0,

which appears in

(

) ( )

0)

as a variable

, (6)

( ) can itself be obtained from a formula of the same type: ( ) =

(

) ( )

(7) 313



COMPLEMENT FIII

Substitute (7) into (6): () =

(

) (

Since, moreover, (

)=

(

) ( )

() =

(

) (

)

(8)

) ( ) , we deduce ( ( ) being arbitrary): (9)

It is easy to generalize this procedure and to obtain: (

1)

=

(

1)

(

3

2)

(

2

1)

(10)

where 1 2 ..., are arbitrary. If we assume that 1 2 3 is simple to interpret: to go from 1 to , the system progresses from . 2 to 3 , ..., then, finally from 1 to Set = in (9); taking (2) into consideration, we obtain: =

(

) (

)

(

) (

, formula (10) to 2 , then from

(11)

or, interchanging the roles of =

1

and :

)

(12)

)

(13)

We therefore have: (

1

)=

(

Now let us calculate the evolution operator between two instants separated by d . To do this, write the Schrödinger equation in the form: d () =

( +d )

=

()

() ()d

(14)

~ that is: ( +d ) =

( )d

()

(15)

~ We then obtain, using the very definition of ( +d

)=

( )d

( +d

): (16)

~ ( + d ) is called the infinitesimal evolution operator. Since ( ) is Hermitian, ( + d ) is unitary (cf. Complement CII , § 3). It follows that ( ) is also unitary since the interval [ ] can be divided into a very large number of infinitesimal intervals. Formula (10) then shows that ( ) is a product of unitary operators; it is therefore a unitary operator. One can consequently write (13) in the form: (

)=

1

(

)=

(

)

(17)

It is not surprising that the transformation ( ) is unitary, that is, that it conserves the norm of vectors on which it acts. We saw in Chapter III (cf. § D-1-c) that the norm of the state vector does not change over time. 314

• 2.

THE EVOLUTION OPERATOR

Case of conservative systems

When the operator does not depend on time, equation (4) can easily be integrated; taking the initial condition (2) into account, we obtain: (

0)

(

=e

0)

~

(18)

One can verify directly from this formula all the properties of the evolution operator cited in § 1. It is very simple to go from formula (D-52) to (D-54) of Chapter III using (18). It suffices to apply the operator ( 0 ) to both sides of (D-52), noting that, since is an eigenvector of with the eigenvalue : (

0)

(

=e

0)

~

(

=e

0)

~

(19)

Comments:

( ) When is time-dependent, one might be tempted to believe, by analogy with formula (18), that the evolution operator is equal to the operator ( 0 ) defined by: (

0)

=e

( )d

~

(20)

0

Actually, this is not true, since the derivative of an operator of the form e is not in general equal to ( )e ( ) (cf. Complement BII , § 5-c): (

~

0)

=

() (

0)

( )

(21)

( ) Let us again consider the experiments described in § E-1-b of Chapter III. As we have already indicated [comment ( ) of § E-1-b- ], it is not necessary to assume that the measurements of the various observables , and are made very close together in time. When the system has had the time to evolve between two successive measurements, the variations of the state vector can easily be taken into account by using the evolution operator. If 0 , 1 and 2 designate respectively the instants at which the measurements of , and are performed, we then replace (E-15) by: ( )=

(

2

2

0)

(22)

and (E-17) by: (

)=

(

2

2

1)

(

1

2

0)

(23)

We then have, using (9): (

2

0)

= =

(

2

(

1)

(

2

1)

1

0)

(

1

0)

Substituting (24) into (22), we see, as in (E-21), that ( ).

(24) ( ) is not equal to

315

COMPLEMENT FIII



References and suggestions for further reading:

The evolution operator is of fundamental importance in collision theory (see Chapter VIII) and time-dependent perturbation theory (see Chapter XIII), as well as in the study of the interactions between atoms and photons (see Chapter XX).

316



THE SCHRÖDINGER AND HEISENBERG PICTURES

Complement GIII The Schrödinger and Heisenberg pictures In the formalism developed in Chapter III, it is the time-independent operators which correspond to the observables of the system (cf. Chap. III, § D-1-d). For example, the position, momentum and kinetic energy operators of a particle do not depend on time. The evolution of the system is entirely contained in that of the state vector ( ) [here written ( ) , for reasons which will be evident later] and is obtained from the Schrödinger equation. This is why this approach is called the Schrödinger picture. Nevertheless, we know that all the predictions of quantum mechanics (probabilities, mean values) are expressed in terms of scalar products of a bra and a ket or matrix elements of operators. Now, as we saw in Complement CII , these quantities are invariant when the same unitary transformation is performed on the kets and on the operators. This transformation can be chosen so as to make the transform of the ket ( ) a timeindependent ket. Of course, the transforms of the observables cited above then depend on time. We thus obtain the Heisenberg picture. To avoid confusion, in this complement, we shall systematically assign an index to the kets and operators of the Schrödinger picture and an index to those of the Heisenberg picture. The index can be considered to be implicit in all the other complements and chapters where only the Schrödinger picture is used. The state vector ( ) at the instant is expressed in terms of ( 0 ) by the relation: () =

(

0)

( 0)

(1)

where ( 0 ) is the evolution operator (cf. Complement FIII ). Since this operator is unitary, it is sufficient to perform the unitary transformation associated with the operator ( 0 ) to obtain a constant transformed vector : =

(

=

0)

() =

(

0)

(

0)

( 0)

( 0)

(2)

In the Heisenberg picture, the state vector, which is constant, is therefore equal to () at time 0 The transform ( ) of an operator ( ) is given by (Complement CII , § 2): ()=

(

0)

() (

0)

(3)

As we have already seen, ( ) generally depends on time, even if does not. Nevertheless, there exists an interesting special case in which, if is timeindependent, the same is true of : the case in which the system is conservative ( does not depend on time) and commutes with ( is then a constant of the motion; cf. Chap. III, § D-2-c). In this case, we have: (

0)

(

=e

0)

~

(4)

If the operator commutes with ment BII , § 4-c), so that: ()=

(

0)

(

0)

=

, it also commutes with

(

0)

(cf. Comple(5) 317

COMPLEMENT GIII



The operators and are therefore simply equal in this case (in particular, = , and the indices and are, in reality, unnecessary for the Hamiltonian). Since they are time-independent, we see that they indeed correspond to a constant of the motion. When ( ) is arbitrary, let us calculate the evolution of the operator ( ). Using relation (4) of Complement FIII , as well as its adjoint, we obtain: d d

()=

1 ~ 1 + ~

(

0)

()

() (

0)

(

0)

()

() (

0)

+

(

0)

d

() d

(

0)

(6)

In the first and last terms of this expression, let us insert between and the product ( 0 ) ( 0 ), which is equal to the identity operator [formula (17) of Complement FIII ]: d d

1 ~

()= + +

( (

1 ~

0) 0)

(

() (

d

() d

0)

(

() (

0)

(

0)

() (

0)

(

0)

() (

0)

0) 0)

(7)

According to definition (3), we finally obtain: ~

d d

( )=[

()

( )] + ~

d d

()

(8)

Comments:

( ) Historically, the first picture was developed by Schrödinger, leading him to the equation which bears his name, and the second one, by Heisenberg, who calculated the evolution of matrices representing the various operators () (hence the name “matrix mechanics”). It was not until later that the equivalence of the two approaches was proved. ( ) Using (8), one immediately obtains equation (D-27) of Chapter III as we shall now show. In the Heisenberg picture, the evolution of the mean value ()=

()

()

()

can be calculated, since: ()=

()

(9)

On the right-hand side of (9), only ( ) depends on time, so (D-27) can be obtained directly by differentiation. Note, nevertheless, that equation (8) is more general than (D-27) since, instead of expressing the equality of two mean values (that is, two matrix elements of operators), it expresses the equality of two operators. 318

• (

THE SCHRÖDINGER AND HEISENBERG PICTURES

) When the system under consideration is composed of a particle of mass under the influence of a potential, equation (8) becomes very simple. We then have (confining ourselves to one dimension): 2

()=

+

2

(

)

(10)

and therefore [cf. formula (35) of Complement CII ]: 2

()=

+

2

(

)

(11)

Substituting (11) into (8) and using the fact that [ ]=[ ] = ~, we obtain, by an argument analogous to that of § D-1-d of Chapter III: d d d d

()= ()=

1

() (

)

(12)

These equations generalize the Ehrenfest theorem [cf. Chap. III, relations (D-34) and (D-35)]. They are similar to those giving the evolution of the classical quantities and [cf. Chap. III, relations (D-36a) and (D-36b)]. An advantage of the Heisenberg picture is that it leads to equations formally similar to those of classical mechanics. References and suggestions for further reading:

The interaction picture: see exercise 15 of Complement LIII as well as § A-2 of Chapter XX; Messiah (1.17), Chap. VIII, § 14; Schiff (1.18), § 24; Merzbacher (1.16), Chap. 18, § 7.

319



GAUGE INVARIANCE

Complement HIII Gauge invariance

1 2

3

1.

Outline of the problem: scalar and vector potentials associated with an electromagnetic field; concept of a gauge . . 321 Gauge invariance in classical mechanics . . . . . . . . . . . . 322 2-a Newton’s equations . . . . . . . . . . . . . . . . . . . . . . . . 322 2-b The Hamiltonian formalism . . . . . . . . . . . . . . . . . . . 322 Gauge invariance in quantum mechanics . . . . . . . . . . . 327 3-a Quantization rules . . . . . . . . . . . . . . . . . . . . . . . . 327 3-b Unitary transformation of the state vector; form invariance of the Schrödinger equation . . . . . . . . . . . . . . . . . . . . 328 3-c Invariance of physical predictions under a gauge transformation 331

Outline of the problem: scalar and vector potentials associated with an electromagnetic field; concept of a gauge

Consider an electromagnetic field, characterized by the values E(r; ) of the electric field and B(r; ) of the magnetic field at every instant and at all points in space: E(r; ) and B(r; ) are not independent since they satisfy Maxwell’s equations. Instead of specifying these two vector fields, it is possible to introduce a scalar potential (r; ) and a vector potential A(r; ) such that: E(r; ) =

∇ (r; )

A(r; ) (1)

B(r; ) = ∇

A(r; )

It can be shown from Maxwell’s equations (cf. Appendix III, § 4-b- ) that there always exist functions (r; ) and A(r; ) that allow E(r; ) and B(r; ) to be expressed in the form (1). All electromagnetic fields can therefore be described by scalar and vector potentials. However, when E(r; ) and B(r; ) are given, (r; ) and A(r; ) are not uniquely determined. It can easily be verified that if we have a set of possible values for (r; ) and A(r; ), we obtain other potentials (r; ) and A (r; ) describing the same electromagnetic field by the transformation: (r; ) =

(r; )

(r; )

A (r; ) = A(r; ) + ∇ (r; )

(2)

where (r; ) is an arbitrary function of r and . This can be seen by replacing (r; ) by (r; ) and A(r; ) by A (r; ) in (1) and verifying that E(r; ) and B(r; ) remain 321

COMPLEMENT HIII



unchanged. Moreover, it can be shown that relations (2) give all the possible scalar and vector potentials associated with a given electromagnetic field. When a particular set of potentials has been chosen to describe an electromagnetic field, a choice of gauge is said to have been made. As we just mentioned, an infinite number of different gauges can be used for the same field, characterized by E(r; ) and B(r; ). When one changes from one to another, one is said to perform a gauge transformation. It often happens in physics that the equations of motion of a system involve, not the fields E(r; ) and B(r; ), but the potentials (r; ) and A(r; ). We saw an example of this in § B-5-b of Chapter III, when we wrote the Schrödinger equation for a particle of charge in an electromagnetic field [cf. relation (B-48) of that chapter]. The following question can then be posed: do the physical results predicted by the theory depend only on the values of the fields E(r; ) and B(r; ) at all points in space, or do they also depend on the gauge used to write the equations? In the latter case, it would obviously be necessary, in order for the theory to make sense, to specify in which gauge the equations are valid. The aim of this complement is to answer this question. We shall see that in classical mechanics (§ 2), as in quantum mechanics (§ 3), physical results are not modified when a gauge transformation is performed. The scalar and vector potentials can then be seen to be calculation tools; actually, all that counts are the values of the electric and magnetic fields at all points in space. We shall express this result by saying that classical and quantum mechanics possess the property of gauge invariance. 2.

Gauge invariance in classical mechanics

2-a.

Newton’s equations

In classical mechanics, the motion of a particle1 of charge and mass placed in an electromagnetic field can be calculated from the force f exerted on it. This force is given by Lorentz’ law: f=

[E(r; ) + v

B(r; )]

(3)

where v is the velocity of the particle. To obtain the equations of motion which allow one to calculate the position r( ) of the particle at any instant , one substitutes relation (3) into the fundamental dynamical equation (Newton’s law): d2 r( ) = f d2

(4)

In this approach, only the values of the electric and magnetic fields enter into the calculation; therefore, the problem of gauge invariance does not arise. 2-b.

The Hamiltonian formalism

Instead of adopting the point of view of the preceding section, one can use other equations of motion, the Hamilton-Jacobi equations. It is not difficult to show (cf. 1 For simplicity, we shall assume in this complement that the system under study is composed of a single particle. Generalization to a more complex system formed by several particles placed in an electromagnetic field presents no difficulties.

322



GAUGE INVARIANCE

Appendix III) that the latter equations are completely equivalent to Newton’s equations. However, since we used the Hamiltonian formalism in Chapter III to quantize a physical system, it is useful to study how a gauge transformation appears in this formalism. Although the scalar and vector potentials do not enter into Newton’s equations, they are indispensable for writing those of Hamilton. The property of gauge invariance is therefore less obvious for this second point of view. .

The dynamical variables of the system and their evolution

To determine the motion of a particle subjected to the Lorentz force written in (3), one can use the Lagrangian2 : (r v; ) =

1 v2 2

[ (r; )

v A(r; )]

(5)

This expression permits the calculation of the momentum p, which is written: p = ∇v (r v; ) =

v + A(r; )

(6)

It is then possible to introduce the classical Hamiltonian: (r p; ) =

1 [p 2

2

A(r; )] +

(r; )

(7)

In the Hamiltonian formalism, the state of the particle at a given time is defined by its position r and its momentum p, which we shall call the fundamental dynamical variables, and no longer by its position and its velocity, as in § 2-a above (and as in the Lagrange point of view). The momentum p (conjugate momentum of the position r) must not be confused with the mechanical momentum π: π=

(8)

v

They are indeed different since, according to (6): π=p

A(r; )

(9)

This relation allows us to calculate the mechanical momentum (and therefore the velocity) whenever the values of r and p are known. Similarly, all the other quantities associated with the particle (kinetic energy, angular momentum, etc...) are expressed in the Hamiltonian formalism as functions of the fundamental dynamical variables r and p (and, if necessary, of time). The evolution of the system is governed by Hamilton’s equations: d r( ) = ∇p [r( ) p( ); ] d d p( ) = ∇v [r( ) p( ); ] d

(10)

where is the function of r and p written in (7). These equations give the values, for all times, of the fundamental dynamical variables if they are known at the initial instant. 2 We

state without proof a certain number of results of analytical mechanics which are established in Appendix III.

323

COMPLEMENT HIII



To write equations (10), it is necessary to choose a gauge , that is, a pair of potentials (r; ) A(r; ) describing the electromagnetic field. What happens if, instead of this gauge , we choose another one , characterized by different potentials (r; ) and A (r; ), but describing the same fields E(r; ) and B(r; )? We shall label with a prime the values of the dynamical variables associated with the motion of the particle when the gauge chosen is . As we pointed out in § a, Newton’s equations indicate that the position r and the velocity v take on, at every instant, values independent of the gauge. Consequently, we have: r ( ) = r( )

(11a)

π ( ) = π( )

(11b)

Now, from (9): π( ) = p( )

A[r( ); ]

π( )=p( )

A [r( ); ]

(12)

Therefore, the values p( ) and p ( ) of the momentum in the gauges different; they must satisfy: p()

A [r ( ); ] = p( )

A[r( ); ]

and

are

(13)

If (r; ) is the function appearing in formulas (2) which govern the gauge transformation from to , the values of the fundamental dynamical variables are transformed according to the formulas: r ( ) = r( )

(14a)

p ( ) = p( ) + ∇ [r( ); ]

(14b)

In the Hamiltonian formalism, the value at each instant of the dynamical variables describing a given motion depends on the gauge chosen. Moreover, such a result is not surprising since, in (7) and (10), the scalar and vector potentials appear explicitly in the equations of motion for the position and momentum. .

“True physical quantities” and “non-physical quantities”

( ) Definitions We have just seen, in relations (14) for example, that it is possible to distinguish between two types of quantities associated with the particle: those which, like r or π, have identical values at all times in two different gauges, and those which, like p, have values that depend on the arbitrarily chosen gauge. We are thus led to the following general definition: – A true physical quantity associated with the system under consideration is a quantity whose value at any time does not depend (for a given motion of the system) on the gauge used to describe the electromagnetic field. – A non-physical quantity, on the other hand, is a quantity whose value is modified by a gauge transformation; thus, like the scalar and vector potentials, it is seen to be a calculation tool, rather than an actually observable quantity. 324



GAUGE INVARIANCE

The problem then posed is the following: in the Hamiltonian formalism, all the quantities associated with the system appear in the form of functions of the fundamental dynamical variables r and p; how can we know whether such a function corresponds to a true physical quantity or not? ( ) Characteristic relation of true physical quantities Let us first assume that a quantity associated with the particle is described, in the gauge , by a function of r and p (which may depend on time) which we shall denote by (r p; ). If to this quantity corresponds, in another gauge , the same function (r p ; ), the quantity is clearly not truly physical [except in the special case where the function depends only on r and not on p; see equations (14)]. Since the values of the momentum are different in the two gauges and , the same is obviously true for the values of the function . To obtain the true physical quantities associated with the system, we must therefore consider functions (r p; ) whose form depends on the gauge chosen (this is why we label these functions with an index ). We have already seen an example of such a function: the mechanical momentum π is a function of r and p via the vector potential A [cf. (9)]. In this case, the function is different in the two gauges and ; that is, it is of the form π (r p; ). The definition given in ( ) thus implies that the function (r p; ) describes a true physical quantity on the condition that: [r( ) p( ); ] =

[r ( ) p ( ); ]

(15)

where r( ) and p( ) are the values taken on by the position and momentum in the gauge , and r ( ) and p ( ) are their values in the gauge . If we substitute relations (14) into (15), we obtain: [r( ) p( ); ] =

[r( ) p( ) + ∇ (r( ); ); ]

(16)

This relation must be satisfied at every instant and for all possible motions of the system. Since, when is fixed, the values of the position and the momentum can be chosen independently, both sides of (16) must in fact be the same function of r and p, which is written: [r p; ] =

[r p + ∇ (r; ); ]

(17)

This relation is characteristic of the functions [r p; ] associated with true physical quantities. Therefore, if one considers the function [r p; ] for the gauge , and if one replaces p by p + ∇ (r; ) [where (r; ) defines, according to (2), the gauge transformation from to ], one obtains a new function of r and p which must be identical to [r p; ]. If this is not the case, the function considered corresponds to a quantity which is not truly physical. ( ) Examples Let us give some examples of functions [r p; ] which describe true physical quantities. We have already encountered two: those corresponding to the position and to the mechanical momentum; the first is simply equal to r and the second to: π (r p; ) = p

A(r; )

(18) 325



COMPLEMENT HIII

Since relations (11) express the fact that r and π are true physical quantities, we know a priori that relation (17) is satisfied by the corresponding functions. However, let us verify this directly in order to familiarize ourselves with the use of this relation. As regards r, we are dealing with a function that does not depend on p and whose form does not depend on the gauge3 ; this immediately implies (17). As regards π, relation (18) yields: π (r p; ) = p

A (r; )

(19)

Replace in this function p by p + ∇ (r; ); we obtain the function: p + ∇ (r; )

A (r; ) = p

A(r; )

(20)

which is none other than π (r p; ); relation (17) is therefore satisfied.

Other true physical quantities are the kinetic energy: (r p; ) =

1 [p 2

A(r; )]2

(21)

and the moment, with respect to the origin, of the mechanical momentum: λ (r p; ) = r

[p

A(r; )]2

(22)

In general, we see that whenever a function of r and p has the form: (r p; ) =

[r p

A(r; )]

(23)

(where is a function whose form is independent of the gauge chosen), we obtain a true physical quantity 4 . This result makes sense since (23) really expresses the fact that the values taken on by the quantity considered are obtained from those of r and π, which we know to be gauge-invariant. Let us also give some examples of functions describing quantities that are not true physical quantities. In addition to the momentum p, we can cite the function: (p) =

p2 2

(24)

which must not be confused with the kinetic energy written in (21), and, in general, any function of p alone (and, possibly, of the time). Similarly, the angular momentum: (r p) = r

p

(25)

cannot be considered to be a true physical quantity. Finally, let us cite the classical Hamiltonian, which, according to (7), is the sum of the kinetic energy (r p; ), which is a true physical quantity, and the potential energy . Now, the latter [which should rigorously be written in the form of a gauge-dependent function (r; )] is not a true physical quantity since, at every point in space, its value changes when the gauge is changed. 3 It is not difficult to verify that, in general, any function (r ) that depends only on r (and, possibly, on the time), and whose form is the same in any gauge chosen, describes a true physical quantity. 4 One could also construct functions associated with true physical quantities in which the potentials are involved in a more complex way than in (23) (for example, the scalar product of the particle velocity and the electric field at the position of the particle).

326

• 3.

GAUGE INVARIANCE

Gauge invariance in quantum mechanics

In Chapter III, we introduced the postulates of quantum mechanics by starting from the Hamiltonian formulation of classical mechanics. We are thus led to ask if the problem of gauge invariance, easily resolved in classical mechanics because of the existence of Newton’s equations, is more complex in the framework of quantum mechanics. The following question then arises: are the postulates stated in Chapter III valid for any arbitrarily chosen gauge or only for a particular gauge? In answering this question, we shall be guided by the results obtained in the preceding paragraph. Following the same type of reasoning, we shall see that there exists a close analogy between the consequences of a gauge transformation in the classical Hamiltonian formalism and in the quantum mechanical formalism. We shall thus establish the gauge invariance of quantum mechanics. To do this, we shall begin (§ 3-a) by examining the results obtained when the quantization rules are applied in the same way in two different gauges. We shall then see (§ 3-b) that, like in classical mechanics, where the values of the dynamical variables generally change when the gauge is changed, a given physical system must be characterized by a mathematical state vector that depends on the gauge. To pass from a state vector corresponding to one gauge to that of another gauge , we use a unitary transformation. The form of the Schrödinger equation, however, always remains the same (as do Hamilton’s equations in classical mechanics). Finally, we shall examine the behavior, under a gauge transformation, of the observables associated with the system (§ 3-c). We shall see that the simultaneous modification of the state vector and the observables is such that the physical content of quantum mechanics does not depend on the gauge chosen. Moreover, we shall demonstrate this by showing that the density and the probability current values are gauge invariant. 3-a.

Quantization rules

The state space of a (spinless) particle is always r . However, we are clearly led by the results of § 2 above to expect that the operator associated with a given quantity may be different in two different gauges. We shall therefore label these operators with an index . The quantization rules associate, with the position r and the momentum p of the particle, operators R and P acting in r such that: [

]=[

]=[

]= ~

(26)

(where all the other commutators between components of R and P are zero). In the r representation, the operator R acts like a multiplication by r, and P like the differential ~ operator ∇. These rules are the same in all gauges. We can therefore write: R

=R

(27a)

P

=P

(27b)

In fact, these equations enable us to omit the index for the observables R and P, and we shall henceforth do so. The quantization of all other quantities associated with the particle is obtained as follows: in a given gauge, take the function of r and p giving the classical quantity 327



COMPLEMENT HIII

considered and (after having symmetrized, if necessary) replace r by the operator R and p by P. We thus obtain the operator which, in the gauge chosen, describes this quantity. Consider some examples: – The angular momentum operator, obtained from r p, is the same in all gauges: L

=L

(28)

– The operator associated with the mechanical momentum, on the other hand, depends on the gauge chosen. In the gauge , it is given by: Π

=P

A(R; )

(29)

If the gauge is changed, it becomes: Π

=P

A (R; )

whose action in Π

r

(30)

is different from that of Π :

∇ (R; )



(31)

– Similarly, the operator5 : Λ

=R

Π

=R

[P

A(R; )]

(32)

which describes the moment of the mechanical momentum, explicitly involves the vector potential chosen. – Finally, the Hamiltonian operator is obtained from formula (7): =

1 [P 2

2

A(R; )] +

(R; )

(33)

It is obvious that in another gauge, it becomes a different operator, since: = 3-b.

.

1 [P 2

2

A (R; )] +

(R; ) =

(34)

Unitary transformation of the state vector; form invariance of the Schrödinger equation

The unitary operator

()

In classical mechanics, we denoted by r( ) p( ) and r ( ) p ( ) the values of the fundamental dynamical variables characterizing the state of the particle in two different gauges and . In quantum mechanics, we shall therefore denote by ( ) and () the state vectors relative to these two gauges, and the analogue of relations (14) is thus given by relations between mean values:

5 It

()R

() =

( )R

()P

() =

( ) P + ∇ (R; ) ( )

()

(35a) (35b)

can be verified, by using the commutation relations of R and Π , that it is not necessary to symmetrize expression (32).

328



GAUGE INVARIANCE

Using (27), we immediately see that this is possible only if ( ) and ( ) are two different kets. We shall therefore seek a unitary transformation ( ) that enables us to go from ( ) to (): () = ()

()=

() ()

(36a)

()

(36b)

()=

Taking (27) into account, we see that equations (35) are satisfied for any that:

Multiplying (37a) on the left by ( )=

R

( ) on condition

( )R

( )=R

(37a)

( )P

( ) = P + ∇ (R; )

(37b)

( ), we obtain:

( )R

(38)

The desired unitary operator commutes with the three components of R; it can therefore be written in the form: (R; )

( )=e where [P

(39)

(R; ) is a Hermitian operator. Relation (48) of Complement BII then allows us to write: ( )] = ~∇

(R; )

()

(40)

If we multiply this equation on the left by the relation: ~∇

( ) and substitute it into (37b), we easily obtain

(R; ) = ∇ (R; )

(41)

which is satisfied when: (R; ) =

0(

)+

(R; )

(42)

~

Omitting the coefficient 0 ( ), which corresponds, for the state vector factor of no physical consequence, we obtain the operator ( ): ( )=e If, in (36a),

~

( ) , to a global phase

(R; )

(43)

( ) is this operator, relations (35) are automatically satisfied.

Comments:

( ) In the r representation, relations (36a) and (43) imply that the wave functions (r ) = r ( ) and (r ) = r ( ) are related by: (r ) = e

~ (r )

(r )

(44)

For the wave function, the gauge transformation corresponds to a phase change which varies from one point to another, and is not, therefore, a global phase factor. The gauge invariance of physical predictions obtained by using the wave functions or , is therefore not obvious a priori. 329



COMPLEMENT HIII

( ) If the system under study is composed of several particles having positions r1 , r2 , ... and charges 1 , 2 , ..., (43) must be replaced by: (1)

()=

= e~[ .

2

()

1

()

(R1 )+

2

(R2 )+

]

(45)

Time evolution of the state vector

Now let us show that if the evolution of the ket the Schrödinger equation: ~

d d

() =

() ()

the state vector : ~

d d

, to

(46)

( ) given by (36) satisfies an equation of the same form in the gauge

() =

where

( ) obeys, in the gauge

()

()

(47)

( ) is given by (34). To do this, let us calculate the left-hand side of (47); it is written:

~

d d

d d

() = ~

() () d d

= ~

()

() + ~

()

d d

()

(48)

that is, according to (43) and (46) 6 : ~

d d

() =

(R; )

() () +

(R; ) + ˜ ( )

=

where ˜ ( ) designates the transform of ˜ ( )=

()

()

()

() ()

()

(49)

( ) by the unitary operator

( ):

()

(50)

Equation (47) will therefore be satisfied if: ()= ˜ ()

(R; )

(51)

Now ˜ ( ) is given by: ˜ ( )= 1 2

6 The

function

˜ P

˜ ) A(R;

2

+

˜ ) (R;

depends on R and not on P; consequently

(52)

(R ) commutes with

(R; ). This

is why ( ) can be differentiated as if (R ) were an ordinary function of the time and not an operator (cf. Complement BII , comment of § 5-c).

330



GAUGE INVARIANCE

˜ and P ˜ designate the transforms of R and P by the unitary operator where R to (37): ˜ = R ˜ = P

( )R

( )=R

( )P

( )=P

( ). According

(53a) ∇ (R; )

(53b)

These relations, substituted into (52), indicate that: ˜ ( ) = 1 [P 2

A(R; )

∇ (R; )]2 +

(R; )

(54)

Using relations (2) to replace the potentials relative to the gauge by those relative to , we then obtain, using (34), relation (51). Therefore, the Schrödinger equation can be written in the same way in any gauge chosen. 3-c.

.

Invariance of physical predictions under a gauge transformation

Behavior of the observables

Under the effect of the unitary transformation formed into ˜ , with: ˜ =

()

()

( ), any observable

is trans(55)

˜ is simply equal to R, P ˜ is not equal to P. We have already seen, in (53), that while R ˜ is different from Π since: Similarly, Π ˜ Π

˜ =P

˜ ) A(R;

=P

∇ (R; ) ∇ (R; )



A(R; ) (56)

Taking (27a) and (31) into account, we see that relations (53a) and (56) imply that the observables R and Π , associated with true physical quantities (position and mechanical momentum) are such that: ˜ R

=R (57)

˜ Π



On the other hand, the momentum P (which is not a true physical quantity) does not satisfy an analogous relation, since, from (27b) and (53b): ˜ P

=P

(58)

We shall see that this result is a general one: in quantum mechanics, for every true physical quantity, there is an operator ( ) that satisfies: ˜

=

()

(59)

This relation is the quantum mechanical analogue of the classical relation (16). It shows that, except for the special case of R or a function of R alone, the operator corresponding 331

COMPLEMENT HIII



to a true physical quantity depends on the gauge this in (29) and (32).

. We have already seen examples of

To prove (59), one need only apply the quantization rules stated in Chapter III to a function (r p; ) and use relation (17), the characteristic relation for true physical classical quantities. We therefore replace r and p by the operators R and P and obtain (if necessary, after a symmetrization with respect to these operators) the operator ( ). If the form of the function depends on the gauge chosen, the operator ( ) also depends on . When the quantity associated with is a true physical quantity, we have, according to (17): [R P + ∇ (R; ); ]

[R P; ] =

Applying the unitary transformation

(60)

( ) to this relation, we obtain:

˜ [R P; ] = ˜ [R P + ∇ (R; ); ] ˜ P ˜ + ∇ (R; ˜ ); ] = [R

(61)

That is, taking (53) into account: ˜ [R P; ] =

[R P; ]

(62)

After symmetrizing, if necessary, both sides of this relation, we indeed obtain (59).

Let us give some examples of true physical observables. In addition to R and Π , we can cite the moment Λ of the mechanical momentum [cf. (32)], or the kinetic energy: Γ

=

Π2 1 = [P 2 2

A(R; )]2

(63)

On the other hand, P and L are not true physical quantities; neither is the Hamiltonian, since relation (51) implies in general that: ˜ ( )=

()

(64)

Comment:

In classical mechanics, it is well known that the total energy of a particle moving in a time-independent electromagnetic field is a constant of the motion. It is indeed possible in this case to limit oneself to potentials which are also time-independent. We see from (51) that one then has: ˜

=

(65)

In this particular case, is indeed a true physical observable which can therefore be interpreted to be the total energy of the particle. .

Probability of the various possible results of a measurement bearing on a true physical quantity

Assume that at time we want to measure a true physical quantity. In the gauge , the state of the system is described at this instant by the ket7 , and the physical 7 We

do not indicate the time dependence because all the quantities must be evaluated at the time when we want to perform the measurement.

332

• quantity, by the observable . Let be an eigenvector of (assumed, for simplicity, to be non-degenerate):

GAUGE INVARIANCE

, with the eigenvalue

=

(66)

As calculated in the gauge from the postulates of quantum mechanics, the probability of obtaining in the measurement envisaged is equal to: 2

=

(67)

What happens to this prediction when the gauge is changed? According to (59), the operator associated with the quantity under consideration in the new gauge can have the ket: =

(68)

as an eigenvector, with the same eigenvalue

as in (66). That is:

= =

=

(69)

In the gauge , therefore still appears as a possible measurement result. Moreover, calculation of the corresponding probability yields the same value as in the gauge , since, according to (36a) and (68): =

=

(70)

We have thus verified that the postulates of quantum mechanics lead to gaugeinvariant physical predictions: the possible results of any measurement and the associated probabilities are invariant under a gauge transformation. .

Probability density and current

Let us calculate, from formulas (D-9) and (D-20) of Chapter III, the probability density (r ) and probability current J(r ) in two different gauges and . For the first gauge, we have: (r ) 2

(r ) =

(71)

and: J(r ) =

1

Re

(r )

~



A(r; )

(r )

(72)

Relation (44) immediately shows that: (r ) 2 = (r )

(r ) =

(73)

Moreover, it also implies that: J (r ) = =

1 1

Re e Re

~ (r; )

(r )

~

(r ) ∇

~



A (r; ) e

A (r; ) + ∇ (r; )

~ (r; )

(r )

(r ) (74)

333

COMPLEMENT HIII



that is, taking (2) into account: J (r ) = J(r )

(75)

The probability density and current are therefore invariant under a gauge transformation. This result could have been foreseen, moreover, from the conclusions of § 3-c- above, since [cf. relation (D-19) of Chapter III] (r ) and J(r ) can be considered to be mean values of the operators r r and: K (r) =

1 2

r rΠ +Π

r r

(76)

It is not difficult to show that these two operators satisfy relation (59). They therefore describe true physical quantities whose mean values are gauge-invariant. References and suggestions for further reading:

Messiah (1.17), Chap. XXI, §§ 20 to 22; Sakurai (2.7), § 8-1. Gauge invariance, extended to other domains, plays an important role in particle physics; see, for example, the article by Abers and Lee (16.35).

334



PROPAGATOR FOR THE SCHRÖDINGER EQUATION

Complement JIII Propagator for the Schrödinger equation

1 2

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . Existence and properties of a propagator (2 1) . . . . . 2-a Existence of a propagator . . . . . . . . . . . . . . . . . . 2-b Physical interpretation of (2 1) . . . . . . . . . . . . . . 2-c Expression for (2 1) in terms of the eigenstates of . . 2-d Equation satisfied by (2 1) . . . . . . . . . . . . . . . . Lagrangian formulation of quantum mechanics . . . . . . 3-a Concept of a space-time path . . . . . . . . . . . . . . . . 3-b Decomposition of (2 1) into a sum of partial amplitudes 3-c Feynman’s postulates . . . . . . . . . . . . . . . . . . . . 3-d The classical limit and Hamilton’s principle . . . . . . . .

3

1.

. . . . . . . . . . .

. . . . . . . . . . .

335 336 336 337 338 339 339 339 340 341 341

Introduction

Consider a particle described by the wave function enables us to calculate

(r ). The Schrödinger equation

(r ), that is, the rate of variation of

(r ) with respect to

. It therefore gives the time evolution of the wave function (r ), using a differential point of view. One might wonder if it is possible to adopt a more global (but equivalent) point of view that would allow us to determine directly the value (r0 ) taken on by the wave function at a given point r0 and a given time from knowledge of the whole wave function (r ) at a previous time (which is not necessarily infinitesimally close to ). To consider this possibility, we can take our inspiration from another domain of physics, electromagnetism, where both points of view are possible. Maxwell’s equations (the differential point of view) give the rates of variation of the various components of the electric and magnetic fields. Huygens’ principle (the global point of view) permits the direct calculation, when a monochromatic field is known on a surface Σ, of the field at any point : one sums the fields radiated at the point by fictional secondary sources 1, 2, 3 , ... situated on the surface Σ and whose amplitude and phase are determined by the value of the field at 1 , 2 , 3 , ... (Fig. 1). We intend to show in this complement that there exists an analogue of Huygens’ principle in quantum mechanics. More precisely, we can write, for 2 1: (r2

2)

=

d3

1

(r2

2 ; r1

1)

(r1

1)

(1)

a formula whose physical interpretation is the following: the probability amplitude of finding the particle at r2 at the instant 2 is obtained by summing all the amplitudes 335

COMPLEMENT JIII

• M

N1

Figure 1: In a diffraction experiment, Huygens’ principle permits the calculation of the electric field at the point , as a sum of fields radiated by secondary sources 1 , 2 , 3 , ... situated on a surface Σ.

N2 N3 Σ

“radiated” by the “secondary sources” (r1 1 ), (r1 1 )... situated in space-time on the surface = 1 , each of these sources contributing to a degree proportional to (r1 1 ), (r1 1 ), ... (Fig. 2). We shall prove the preceding formula, calculate , called the propagator for the Schrödinger equation, and study its properties. We shall then indicate very qualitatively how it is possible to present all of quantum mechanics in terms of (the Lagrangian formulation of quantum mechanics; Feynman’s point of view).

r1 r2 r1

t = t2

t = t1

2.

Figure 2: The probability amplitude (r2 2 ) can be obtained by summing the contributions of the various amplitudes (r1 1 ), (r1 1 ), etc... corresponding to a given previous instant 1 . With each of the arrows of the figure is associated a “propagator” (r2 2 ; r1 1 ), (r2 2 ; r1 1 ), etc...

Existence and properties of a propagator

2-a.

(2 1)

Existence of a propagator

The problem is to link directly the states of the system at two different times. This is possible if we use the evolution operator introduced in Complement FIII , since we can write: ( 2) = Given (r2 336

(

2

1)

( 1)

( 2 ) , it is easy to find the wave function 2)

= r2

( 2)

(2) (r2

2 ):

(3)



PROPAGATOR FOR THE SCHRÖDINGER EQUATION

Substituting (2) into (3) and inserting the closure relation: d3

1

r1 r1 =

between (r2

( 2)

1)

2

(4)

and

( 1 ) , we obtain:

=

d3

1

r2

(

2

1)

r1 r1

=

d3

1

r2

(

2

1)

r1

(r1

( 1) 1)

(5)

The result is thus a formula identical to (1), on the condition that we set: (

r2

2

1)

r1 =

(r2

2 ; r1

1)

Moreover, since we want to use formulas of the type of (1) only for = 0 for 2 then becomes: 1 . The exact definition of (r2

2 ; r1

where ( ( (

2

1) 1)

= r2

(

2

1)

r1

(

2

2

1)

1,

we can set

(6)

is the “step function”:

2

1)

= 1 if

2

1

2

1 ) = 0 if

2

1

(7)

The introduction of ( 2 1 ) is of both physical and mathematical interest. From the physical point of view, it is a simple way of compelling the secondary sources situated on the surface = 1 of Figure 2 to “radiate” only towards the future. For this reason, (r2 2 ; r1 1 ) as defined by (6) is called the retarded propagator. From the mathematical point of view, we shall see later that (r2 2 ; r1 1 ), because of the factor ( 2 1 ), obeys a partial differential equation whose right-hand side is a delta function, which is the equation that defines a Green’s function.

Comments:

( ) Note, however, that equation (5) remains valid even if 2 1 . It is possible, moreover, to introduce mathematically an “advanced” propagator which would be different from zero only for 2 1 and which would also obey the equation defining a Green’s function. Since the physical meaning of such an advanced propagator is not obvious at this stage, we shall not study it here. ( ) When no ambiguity is possible, we shall simply write (r2 2 ; r1 1 ). 2-b.

Physical interpretation of

(2 1) for

(2 1)

This interpretation follows very simply from definition (6): (2 1) represents the probability amplitude that the particle, starting from point r1 , at time 1 , will arrive at 337

COMPLEMENT JIII



point r2 at a later time point r1 :

2.

If we take as the initial state at time

1

a state localized at

( 1 ) = r1 at time

2,

(8)

the state vector has become:

( 2) =

(

1)

2

( 1) =

(

2

1)

(9)

r1

The probability amplitude of finding the particle at point r2 at this time is then: ( 2 ) = r2

r2 2-c.

(

1)

2

Expression for

(10)

r1

(2 1) in terms of the eigenstates of

Assume that the Hamiltonian does not depend explicitly on time, and call its eigenstates and eigenvalues:

and

=

(11)

According to formula (18) of FIII , we have: (

2

1)

(

=e

2

1)

~

(12)

The closure relation: =

(13)

enables us to write (12) in the form: (

2

1)

(

=e

2

1)

~

(14)

that is, taking (11) into account: (

2

1)

=

e

(

2

1)

~

(15)

To calculate (2 1), it then suffices to take the matrix element of both sides of (15) between r2 and r1 and to multiply it by ( 2 1 ). Since: =

(r2 )

(16)

r1 =

(r1 )

(17)

r2

this leads to: (r2 338

2 ; r1

1)

= (

2

1)

(r1 )

(r2 ) e

(

2

1)

~

(18)

• 2-d.

Equation satisfied by

(r2 ) e that, in the r

PROPAGATOR FOR THE SCHRÖDINGER EQUATION

(2 1)

~

is a solution of the Schrödinger equation. We deduce from this representation: 2

~

r2

~

∇2

(r2 )e

2

~

=0

(19)

2

where ∇2 is a condensed notation which designates the three operators

,

,

2

2

. 2

Let us then apply, to both sides of equation (18), the operator: ~

r2

~

∇2

2

which acts only on the variables r2 and that: (

2

1)

= (

2

2.

We know [cf. Appendix II, relation (44)]

1)

(20)

2

Consequently, using (19), we obtain: r2

~

~

∇2

(r2

2 ; r1

1)

=

2

~ (

2

1)

(r1 )

(r2 ) e

(

2

1)

~

(21)

Because of the presence of ( 2 1 ), we can replace 2 1 by zero in the sum over appearing on the right-hand side of (21). This makes the exponential equal to 1. We are thus left with the quantity (r2 ) (r1 ), which, according to (13), (16) and (17), is equal to (r2 r1 ) [taking the matrix element of (13) between r2 and r1 ]. Finally, satisfies the equation: r2

~

~

∇2

(r2

2 ; r1

1)

= ~ (

2

1)

(r2

r1 )

(22)

2

The solutions of equation (22), whose right-hand side is proportional to a four-dimensional “delta function”, are called Green’s functions. It can be shown that, to determine (2 1) completely, it suffices to associate with (22) the boundary condition: (r2

2 ; r1

1)

=0

if

2

(23)

1

Equations (22) and (23) have interesting implications, in particular with regard to perturbation theory, which we shall study in Chapter XI. 3.

Lagrangian formulation of quantum mechanics

3-a.

Concept of a space-time path

Let us consider, in space-time, the two points (r1 1 ) and (r2 2 ) (cf. Fig. 3; is plotted as the abscissa, and the ordinate axis represents the set of the three spatial axes). Choose intermediate times ( =1 2 ), evenly spaced between 1 and 2 : 1

1

2

1

2

(24) 339

COMPLEMENT JIII

• rα 2

r

rα N

r1 rα N – 1

rα 1

O

t1

r2

tα1

tα2

tαN – 1

tαN

t

t2

Figure 3: Diagram associated with a “spacetime path”: one picks intermediate times ( =1 2 ) evenly spaced between 1 and 2 , and chooses for each of them a value of r.

and, for each of them, a position r in space. We can thus construct, when approaches infinity, a function r( ) (which we shall assume to be continuous) such that: r( 1 ) = r1

(25a)

r( 2 ) = r2

(25b)

r( ) is said to define a space-time path between (r1 1 ) and (r2 2 ): such a path might be thought of as the trajectory of a physical point leaving point r1 at time 1 and arriving at r2 at time 2 . 3-b.

Decomposition of

(2 1) into a sum of partial amplitudes

We first return to the case where the number Formula (10) of Complement FIII enables us to write: (

2

1)

=

(

2

) (

1

)

(

2

1

) (

of intermediate times is finite.

1

1)

(26)

We now take the matrix elements of both sides of (26) between r2 and r1 and insert the closure relation relative to the r representation for each intermediate time . According to (6) and (24), we thus obtain: d3

(2 1) =

d3

d3

1

2

d3

1

(2

)

( (

1) 2

1)

(

1

1)

(27)

Now consider the product: (2

)

(

1)

(

2

1)

(

1

1)

(28)

Generalizing the argument of § 2-b, we can interpret this term as being the probability amplitude for the particle, having left point 1 (r1 1 ), to arrive at point 2 (r2 2 ), having 340



PROPAGATOR FOR THE SCHRÖDINGER EQUATION

passed successively through all points (r ) of Figure 3. Note that, in formula (27), one is summing over all possible positions r at each time . We now let approach infinity1 . A series of points then defines a space-time path between 1 and 2, and the product (28) associated with it becomes the probability amplitude for the particle to follow this path. Of course, the number of integrations in formula (27) becomes infinite. It is understandable, however, that the summation over the set of possible positions at each time should reduce to a summation over the set of possible paths. (2 1) is thus seen to be a sum (in fact, an integral) that corresponds to the coherent superposition of the amplitudes associated with all possible space-time paths starting from 1 and ending at 2. 3-c.

Feynman’s postulates

The concepts of a propagator and a space-time path permit a new formulation of the postulate concerning the time evolution of physical systems. We shall outline here such a formulation for the case of a spinless particle. We define (2 1) directly as being the probability amplitude for the particle, starting from r1 at time 1 , to arrive at r2 at time 2 . We then postulate that: ()

(2 1) is the sum of an infinity of partial amplitudes, one for each of the space-time paths connecting (r1 1 ) and (r2 2 ).

( ) The partial amplitude Γ (2 1) associated with one of these paths Γ is determined in the following manner: let Γ be the classical action calculated along Γ, that is: Γ

=

(r p ) d

(29)

(Γ)

where (r p ) is the Lagrangian of the particle (cf. ppendix III). equal to: Γ (2

where

1) =

e~

Γ (2

1) is then

Γ

(30)

is a normalization constant (which can be determined explicitly).

It can be shown that the Schrödinger equation follows as a consequence of these two postulates, which also lead to the canonical commutation relation between the components of the observables R and P. The two preceding postulates therefore permit a formulation of quantum mechanics which is different from that of Chapter III, but equivalent. 3-d.

The classical limit and Hamilton’s principle

The formulation we have just evoked is particularly useful for discussing the relation between quantum and classical mechanics. Consider a situation in which the actions Γ are much larger than ~. In this case, the variation ∆ Γ of the action between two different paths, even if its relative 1 In

this treatment, we make no attempt to be mathematically rigorous.

341

COMPLEMENT JIII

value is small

• ∆

Γ

1 , is usually much larger than ~. Consequently, the phase of

Γ

(2 1) of most of Γ (2 1) varies rapidly, and the contributions to the global amplitude the paths Γ cancel out by interference. Let us assume, however, that there exists a path Γ0 for which the action is stationary (meaning that it does not vary, to first order, when one goes from Γ0 to another infinitesimally close path). The amplitude Γ0 (2 1) then interferes constructively with those of the paths next to Γ0 , since, this time, their phases remain practically equal. Consequently, when the actions Γ are much larger than ~, one is in a “quasi-classical” situation: to obtain (2 1), one can ignore all the paths except Γ0 and paths infinitely close to it; it can then be said that, between points 1 and 2, the particle follows the trajectory Γ0 . Now this is indeed the classical trajectory, defined by Hamilton’s principle as being the path along which the action is minimal. Feynman’s postulates therefore include, at the classical limit, Hamilton’s principle of least action. They enable us, moreover, to associate with it the following picture: it is the wave associated with a particle which, “exploring” the various possible paths, picks the one for which the action will be the smallest. The Lagrangian formulation of quantum mechanics presents numerous other advantages, which we shall not examine in detail. Let us point out, for example, that it lends itself easily to relativistic generalization since one is already reasoning in spacetime. Moreover, it can be applied to any classical system (not necessarily mechanical) governed by a variational principle (for example, a field). However, it has a certain number of drawbacks on the mathematical level (summation over an infinite number of paths, the limit , ...). References and suggestions for further reading:

Feynman’s original article (2.38); Feynman and Hibbs (2.25); Bjorken and Drell (2.6), Chaps. 6 and 7.

342



UNSTABLE STATES. LIFETIME

Complement KIII Unstable states. Lifetime

1 2 3

1.

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343 Definition of the lifetime . . . . . . . . . . . . . . . . . . . . . 344 Phenomenological description of the instability of a state . 345

Introduction

Consider a conservative system (a system whose Hamiltonian is time-independent). Assume that at time = 0 the state of the system is one of the eigenstates of the Hamiltonian, of energy : (0) =

(1)

with: =

(2)

In this case, the system remains indefinitely in the same state (a stationary state, § D-2-b of Chapter III). We shall study the hydrogen atom in Chapter VII by solving the eigenvalue equation of its Hamiltonian, which is a time-independent operator. The states of the hydrogen atom (that is, the possible values of its energy) which we shall find are in very good agreement with the experimentally measured energies. However, it is known that most of these states are actually unstable: if, at the instant = 0, the atom is in an excited state (an eigenstate corresponding to an energy greater than that of the ground state, which is the lowest energy state), it generally “falls back” into this ground state by emitting one or several photons. The state is not really, therefore, a stationary state in this case. This problem arises from the fact that, in calculations of the type used in Chapter VII, the system under study (the hydrogen atom) is treated as if it were totally isolated, while it is actually in constant interaction with the electromagnetic field. Although the evolution of the global system “atom + electromagnetic field” can be perfectly well described by a Hamiltonian, it is not rigorously possible to define a Hamiltonian for the hydrogen atom alone [cf. comment ( ) of § 5-b of Complement EIII ]. However, since the coupling between the atom and the field happens to be weak (it can be shown that its 1 , which we shall introduce “force” is characterized by the fine structure constant 137 in Chapter VII), the approximation consisting of completely neglecting the existence of the electromagnetic field is very good, except, of course, if we are interested precisely in the instability of the states. 343

COMPLEMENT KIII



Comments:

( ) If, at the initial instant, a strictly conservative and isolated system is in a state formed by a linear combination of several stationary states, it evolves over time, and does not always remain in the same state. But its Hamiltonian is a constant of the motion, and consequently (cf. Chapter III, § D-2-c), the probability of finding one energy value or another is independent of time, as is the mean value of the energy. On the other hand, in the case of an unstable state, one state is transformed irreversibly into another, with a loss of energy for the system: this energy is taken away by the photons emitted1 . ( ) The instability of the excited states of an atom is caused by the spontaneous emission of photons; the ground state is stable, since there exists no lower energy state. Recall, nevertheless, that atoms can also absorb light energy and so ascend to higher energy levels. We intend to indicate here how to take the instability of a state into account phenomenologically. The description will not be rigorous, as we shall continue to consider the system as if it were isolated. We shall try to incorporate this instability as simply as possible into the quantum description of the system. Complement DXIII presents a more rigorous treatment of this problem, justifying the phenomenological approach used here. 2.

Definition of the lifetime

Experiments show that the instability of a state can often be characterized by just one parameter , having the dimensions of a time, which is called the lifetime of the state. More precisely, if one prepares the system at time = 0 in the unstable state , one observes that the probability ( ) of its still being excited at a later time is equal to: ()=e

(3)

This result can also be expressed in the following way. Consider a large number of identical independent systems, all prepared at time = 0 in the state . At time , there will remain ( ) = e in this state. Between times and + d , a certain number d ( ) of systems leave the unstable state: d ()=

()

( +d )=

d () d = d

()

d

For each of the ( ) systems that are still in the state therefore be defined: d ()=

(4) at time , a probability can

d () d = ()

(5)

of their leaving this state during the time interval d following the instant . We see that 1 d ( ) is independent of : the system is said to have a probability per unit time of leaving the unstable state. 1 which,

344

moreover, may also take away linear and angular momentum.



UNSTABLE STATES. LIFETIME

Comments:

( ) Let us calculate the mean value of the time during which the system remains in the unstable state. It is equal to: e

d

=

(6)

0

is therefore the mean time the system spends in the state it is called the lifetime of this state. For a stable state, ( ) is always equal to 1, and the lifetime

; this is why is infinite.

( ) A remarkable property of the lifetime is that it does not depend on the procedure used to prepare the system in the unstable state, that is, on its previous “history”: the lifetime is a characteristic of the unstable state itself. (

) According to the time-energy uncertainty relation (§ D-2-e of Chapter III), the time characteristic of the evolution of an unstable state is associated with an uncertainty in the energy ∆ given by: ~



(7)

One indeed finds that the energy of an unstable state cannot be determined with arbitrary accuracy, but only to within an uncertainty of the order of ∆ . ∆ is called the natural width of this state. For the case of the hydrogen atom, the width of the various states is negligible compared to their separation. This explains why we can treat them, in a first approximation, as if they were stable. 3.

Phenomenological description of the instability of a state

First let us consider a conservative system, prepared, at the initial time, in the eigenstate of the Hamiltonian . According to the rule (D-54) of Chapter III, the state vector, at time , becomes: ~

() =e The probability is:

(8) ( ) of finding, in a measurement at time , the system in the state 2

~

()= e

(9)

Since the energy is real ( being an observable), this probability is constant and equal to 1: we again find that is a stationary state. Let us examine what would happen if, in expression (9), we replaced the energy by the complex number: =

~

2

(10) 345

COMPLEMENT KIII



The probability ()= e

(

( ) then becomes: ~ 2 )

2 ~

=e

(11)

In this case, the probability of finding the system in the state decreases exponentially with time, as in formula (3). Therefore, to take into account phenomenologically the instability of a state whose lifetime is , it suffices to add, as in (10), an imaginary part to its energy, and set: =

1

(12)

Comment:

When is replaced by , the norm of the state vector written in (8) becomes 2 e and therefore varies with time. This result is not surprising. We saw in § D-1-c of Chapter III that the conservation of the norm of the state vector arose from the Hermitian nature of the Hamiltonian operator; now, an operator whose eigenvalues are complex, as are the , cannot be Hermitian. Of course, as we pointed out in § 1, this is due to the fact that the system under study is part of a larger system (it is interacting with the electromagnetic field) and its evolution cannot be described rigorously by means of a Hamiltonian. It is already rather remarkable that its evolution can be simply explained by introducing a “Hamiltonian” with complex eigenvalues.

346



EXERCISES

Complement LIII Exercises 1. In a one-dimensional problem, consider a particle whose wave function is: ( )= where

and

e

2 0

~

0

+

2

are real constants and

Determine

so that

is a normalization coefficient.

( ) is normalized.

The position of the particle is measured. What is the probability of finding a result between and + ? 3 3 Calculate the mean value of the momentum of a particle which has wave function. 2. Consider, in a one-dimensional problem, a particle of mass time is ( ).

( ) for its

whose wave function at

At time , the distance of this particle from the origin is measured. Write, as a function of ( ), the probability ( 0 ) of finding a result greater than a given length 0 . What are the limits of ( 0 ) when 0 0 and when 0 ? Instead of performing the measurement of question , one measures the velocity of the particle at time . Express, as a function of ( ), the probability of finding a result greater than a given value 0 3. The wave function of a free particle, in a one-dimensional problem, is given at time = 0 by: +

(

0) =

where

0

and

d e

0

e

are constants.

What is the probability ( 1 0) that a measurement of the momentum, performed at time = 0, will yield a result included between 1 and + 1 ? Sketch the function ( 1 0). What happens to this probability ? Interpret.

(

1

) if the measurement is performed at time

What is the form of the wave packet at time = 0? Calculate for this time the product ∆ ∆ ; what is your conclusion? Describe qualitatively the subsequent evolution of the wave packet.

347

COMPLEMENT LIII



4. Spreading of a free wave packet
Consider a free particle.

a. Show, applying Ehrenfest’s theorem, that ⟨X⟩ is a linear function of time, the mean value ⟨P⟩ remaining constant.

b. Write the equations of motion for the mean values ⟨X²⟩ and ⟨XP + PX⟩. Integrate these equations.

c. Show that, with a suitable choice of the time origin, the root mean square deviation ΔX is given by:

(ΔX)² = (1/m²) (ΔP)₀² t² + (ΔX)₀²

where (ΔX)₀ and (ΔP)₀ are the root mean square deviations at the initial time. How does the width of the wave packet vary as a function of time (see § 3-c of Complement GI)? Give a physical interpretation.

5. Particle subjected to a constant force
In a one-dimensional problem, consider a particle having a potential energy V(x) = −fx, where f is a positive constant [V(x) arises, for example, from a gravity field or a uniform electric field].

a. Write Ehrenfest’s theorem for the mean values of the position X and the momentum P of the particle. Integrate these equations; compare with the classical motion. Show that the root mean square deviation ΔP does not vary over time.

b. Write the Schrödinger equation in the {|p⟩} representation. Deduce from it a relation between ∂|ψ̄(p, t)|²/∂t and ∂|ψ̄(p, t)|²/∂p. Integrate the equation thus obtained; give a physical interpretation.

6. Consider the three-dimensional wave function:

ψ(x, y, z) = N e^{−[x²/(2a²) + y²/(2b²) + z²/(2c²)]}

where a, b and c are three positive lengths.

a. Calculate the constant N which normalizes ψ.

b. Calculate the probability that a measurement of X will yield a result included between 0 and a.

c. Calculate the probability that simultaneous measurements of Y and Z will yield results included respectively between −b and +b, and −c and +c.

d. Calculate the probability that a measurement of the momentum will yield a result included in the element dpₓ dp_y dp_z centered at the point pₓ = p_y = 0; p_z = ℏ/c.
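Looking back at Exercise 4, the following short sketch (not part of the original text) propagates a free Gaussian packet exactly in momentum space with an FFT and compares the measured width with √[(ΔX)₀² + (ΔP)₀² t²/m²]. All units, widths and times are arbitrary illustrative choices.

```python
import numpy as np

hbar = m = 1.0                       # illustrative units
x = np.linspace(-60, 60, 4096)
dx = x[1] - x[0]
p = 2 * np.pi * hbar * np.fft.fftfreq(x.size, d=dx)

sigma, p_mean = 1.0, 2.0             # initial width and mean momentum (arbitrary)
psi = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4 * sigma**2) + 1j * p_mean * x / hbar)

def delta_x(psi):
    """Root mean square deviation of the position for a wave function on the grid."""
    rho = np.abs(psi)**2
    rho /= rho.sum() * dx
    mean = np.sum(x * rho) * dx
    return np.sqrt(np.sum((x - mean)**2 * rho) * dx)

dx0 = delta_x(psi)
dp0 = hbar / (2 * sigma)             # minimum-uncertainty value for this Gaussian

for t in (0.0, 5.0, 10.0):
    # Free evolution: multiply the momentum amplitudes by exp(-i p^2 t / 2 m hbar)
    psi_t = np.fft.ifft(np.exp(-1j * p**2 * t / (2 * m * hbar)) * np.fft.fft(psi))
    predicted = np.sqrt(dx0**2 + (dp0 * t / m)**2)
    print(f"t = {t:5.1f}   numerical dX = {delta_x(psi_t):.4f}   formula = {predicted:.4f}")
```

The two columns agree, which is just the spreading law of question c: the packet width grows linearly at long times while ΔP stays fixed.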

7. Let ψ(x, y, z) = ψ(r) be the normalized wave function of a particle. Express in terms of ψ(r) the probability for:

a. a measurement of the abscissa X, to yield a result included between x₁ and x₂;

b. a measurement of the component Pₓ of the momentum, to yield a result included between p₁ and p₂;

c. simultaneous measurements of X and P_z, to yield:
x₁ ≤ x ≤ x₂
p_z ≥ 0

d. simultaneous measurements of Pₓ, P_y, P_z, to yield:
p₁ ≤ pₓ ≤ p₂
p₃ ≤ p_y ≤ p₄
p₅ ≤ p_z ≤ p₆
Show that this probability is equal to the result of b when p₃, p₅ → −∞ and p₄, p₆ → +∞;

e. a measurement of the component U = (1/√3)(X + Y + Z) of the position, to yield a result included between u₁ and u₂.

8. Let J(r) be the probability current associated with a wave function ψ(r) describing the state of a particle of mass m [Chap. III, relations (D-17) and (D-19)].

a. Show that:

m ∫ d³r J(r) = ⟨P⟩

where ⟨P⟩ is the mean value of the momentum.

b. Consider the operator L (orbital angular momentum) defined by L = R × P. Are the three components of L Hermitian operators? Establish the relation:

m ∫ d³r [r × J(r)] = ⟨L⟩

9. One wants to show that the physical state of a (spinless) particle is completely defined by specifying the probability density ρ(r) = |ψ(r)|² and the probability current J(r).

a. Assume the function ψ(r) known, and let ξ(r) be its argument:

ψ(r) = √ρ(r) e^{iξ(r)}

Show that:

J(r) = (ℏ/m) ρ(r) ∇ξ(r)

Deduce that two wave functions leading to the same density ρ(r) and current J(r) can differ only by a global phase factor.

b. Given arbitrary functions ρ(r) and J(r), show that a quantum state ψ(r) can be associated with them only if ∇ × v(r) = 0, where v(r) = J(r)/ρ(r) is the velocity associated with the probability fluid.

c. Now assume that the particle is submitted to a magnetic field B(r) = ∇ × A(r) [see Chap. III, definition (D-20) of the probability current in this case]. Show that:

J = (ρ(r)/m) [ℏ ∇ξ(r) − q A(r)]

and:

∇ × v(r) = −(q/m) B(r)

10. Virial theorem

a. In a one-dimensional problem, consider a particle with the Hamiltonian:

H = P²/2m + V(X)

where:

V(X) = λ Xⁿ

Calculate the commutator [H, XP]. If there exist one or several stationary states in the potential V, show that the mean values ⟨T⟩ and ⟨V⟩ of the kinetic and potential energies in these states satisfy the relation: 2⟨T⟩ = n⟨V⟩.

b. In a three-dimensional problem, H is written:

H = P²/2m + V(R)

Calculate the commutator [H, R·P]. Assume that V(R) is a homogeneous function of nth order in the variables X, Y, Z. What relation necessarily exists between the mean kinetic energy and the mean potential energy of the particle in a stationary state?
Apply this to a particle moving in the potential V(r) = −e²/r (hydrogen atom).
Recall that a homogeneous function V of nth degree in the variables x, y and z by definition satisfies the relation:

V(αx, αy, αz) = αⁿ V(x, y, z)

and satisfies Euler’s identity:

x ∂V/∂x + y ∂V/∂y + z ∂V/∂z = n V(x, y, z)

c. Consider a system of N particles of positions Rᵢ and momenta Pᵢ (i = 1, 2, ..., N). When their potential energy is a homogeneous (nth degree) function of the set of components Xᵢ, Yᵢ, Zᵢ, can the results obtained above be generalized?
Apply these results to an arbitrary molecule formed of nuclei of charges −Zᵢq and electrons of charge q. All these particles interact by pairs through Coulomb forces. In a stationary state of the molecule, what relation exists between the kinetic energy of the system of particles and their energy of mutual interaction?

11. Two-particle wave function
In a one-dimensional problem, consider a system of two particles (1) and (2) with which is associated the wave function ψ(x₁, x₂).

a. What is the probability of finding, in a measurement of the positions X₁ and X₂ of the two particles, a result such that:
x ≤ x₁ ≤ x + dx
α ≤ x₂ ≤ β ?

b. What is the probability of finding particle (1) between x and x + dx [when no observations are made on particle (2)]?

c. Give the probability of finding at least one of the particles between α and β.

d. Give the probability of finding one and only one particle between α and β.

e. What is the probability of finding the momentum of particle (1) included between p₁ and p₂ and the position of particle (2) between α and β?

f. The momenta P₁ and P₂ of the two particles are measured; what is the probability of finding p′₁ ≤ p₁ ≤ p″₁ and p′₂ ≤ p₂ ≤ p″₂?

g. The only quantity measured is the momentum p₁ of the first particle. Calculate, first from the results of e and then from those of f, the probability of finding this momentum included between p′₁ and p″₁. Compare the two results obtained.

h. The algebraic distance X₁ − X₂ between the two particles is measured; what is the probability of finding a result included between −d and +d? What is the mean value of this distance?

12. Infinite one-dimensional well
Consider a particle of mass m placed in the potential:

V(x) = 0 if 0 ≤ x ≤ a
V(x) = +∞ if x < 0 or x > a

The |φₙ⟩ are the eigenstates of the Hamiltonian H of the system, and their eigenvalues are Eₙ = n²π²ℏ²/(2ma²) (cf. Complement HI). The state of the particle at the instant t = 0 is:

|ψ(0)⟩ = a₁|φ₁⟩ + a₂|φ₂⟩ + a₃|φ₃⟩ + a₄|φ₄⟩



a. What is the probability, when the energy of the particle in the state |ψ(0)⟩ is measured, of finding a value smaller than 3π²ℏ²/(ma²)?

b. What is the mean value and what is the root mean square deviation of the energy of the particle in the state |ψ(0)⟩?

c. Calculate the state vector |ψ(t)⟩ at the instant t. Do the results found in a and b at the instant t = 0 remain valid at an arbitrary time t?

d. When the energy is measured, the result 8π²ℏ²/(ma²) is found. After the measurement, what is the state of the system? What is the result if the energy is measured again?

13. Infinite two-dimensional well (cf. Complement GII)
In a two-dimensional problem, consider a particle of mass m; its Hamiltonian H is written:

H = Hₓ + H_y

with:

Hₓ = Pₓ²/2m + V(X)        H_y = P_y²/2m + V(Y)

The potential energy V(x) [or V(y)] is zero when x (or y) is included in the interval [0, a] and is infinite everywhere else.

a. Of the following sets of operators, which form a C.S.C.O.?

b. Consider a particle whose wave function is:

ψ(x, y) = N cos(πx/a) cos(πy/a) sin(2πx/a) sin(2πy/a)

when 0 ≤ x ≤ a and 0 ≤ y ≤ a, and is zero everywhere else (N is a constant).

α. What is the mean value ⟨H⟩ of the energy of the particle? If the energy H is measured, what results can be found, and with what probabilities?

β. The observable Hₓ is measured; what results can be found, and with what probabilities? If this measurement yields the result π²ℏ²/(2ma²), what will be the results of a subsequent measurement of H_y, and with what probabilities?

γ. Instead of performing the preceding measurements, one now performs a simultaneous measurement of Hₓ and P_y. What are the probabilities of finding:

Eₓ = 9π²ℏ²/(2ma²)

and:

p₀ ≤ p_y ≤ p₀ + dp ?
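A small numerical companion to Exercise 13-b (not part of the original text): expanding the cos·sin factor of the wave function on the infinite-well eigenfunctions shows directly which values of Hₓ can be found and with what probabilities, and gives ⟨H⟩ since the y factor has the same form. The grid and units below are arbitrary illustrative choices.

```python
import numpy as np

hbar = m = a = 1.0
x = np.linspace(0.0, a, 2001)
dx = x[1] - x[0]
phi = lambda n: np.sqrt(2 / a) * np.sin(n * np.pi * x / a)   # infinite-well eigenfunctions

# Unnormalized x factor of the wave function of Exercise 13-b (the y factor is identical)
fx = np.cos(np.pi * x / a) * np.sin(2 * np.pi * x / a)

cx = np.array([np.sum(phi(n) * fx) * dx for n in range(1, 6)])
cx /= np.linalg.norm(cx)                                      # normalized expansion coefficients
E = np.array([n**2 * np.pi**2 * hbar**2 / (2 * m * a**2) for n in range(1, 6)])

print("P(E_x = E_n), n = 1..5 :", np.round(cx**2, 4))   # only n = 1 and n = 3 survive
print("<H> =", 2 * np.sum(cx**2 * E))                   # x and y contribute equally
```

The product formula cos(πx/a) sin(2πx/a) = ½[sin(πx/a) + sin(3πx/a)] is what makes only the n = 1 and n = 3 levels appear, each with probability 1/2.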

14. Consider a physical system whose state space, which is three-dimensional, is spanned by the orthonormal basis formed by the three kets |u₁⟩, |u₂⟩, |u₃⟩. In this basis, the Hamiltonian operator H of the system and the two observables A and B are written (rows listed in order):

H = ℏω₀ [[1, 0, 0], [0, 2, 0], [0, 0, 2]] ;   A = a [[1, 0, 0], [0, 0, 1], [0, 1, 0]] ;   B = b [[0, 1, 0], [1, 0, 0], [0, 0, 1]]

where ω₀, a and b are positive real constants.
The physical system at time t = 0 is in the state:

|ψ(0)⟩ = (1/√2)|u₁⟩ + (1/2)|u₂⟩ + (1/2)|u₃⟩

a. At time t = 0, the energy of the system is measured. What values can be found, and with what probabilities? Calculate, for the system in the state |ψ(0)⟩, the mean value ⟨H⟩ and the root mean square deviation ΔH.

b. Instead of measuring H at time t = 0, one measures A; what results can be found, and with what probabilities? What is the state vector immediately after the measurement?

c. Calculate the state vector |ψ(t)⟩ of the system at time t.

d. Calculate the mean values ⟨A⟩(t) and ⟨B⟩(t) of A and B at time t. What comments can be made?

e. What results are obtained if the observable A is measured at time t? Same question for the observable B. Interpret.

15. Interaction picture
(It is recommended that Complement FIII and perhaps Complement GIII be read before this exercise is undertaken.)
Consider an arbitrary physical system. Denote its Hamiltonian by H₀(t) and the corresponding evolution operator by U₀(t, t₀):

iℏ (d/dt) U₀(t, t₀) = H₀(t) U₀(t, t₀)
U₀(t₀, t₀) = 1

Now assume that the system is perturbed in such a way that its Hamiltonian becomes:

H(t) = H₀(t) + W(t)

The state vector of the system in the “interaction picture”, |ψ_I(t)⟩, is defined from the state vector |ψ_S(t)⟩ in the Schrödinger picture by:

|ψ_I(t)⟩ = U₀†(t, t₀) |ψ_S(t)⟩

a. Show that the evolution of |ψ_I(t)⟩ is given by:

iℏ (d/dt) |ψ_I(t)⟩ = W_I(t) |ψ_I(t)⟩

where W_I(t) is the transform of the operator W(t) under the unitary transformation associated with U₀(t, t₀):

W_I(t) = U₀†(t, t₀) W(t) U₀(t, t₀)

Explain qualitatively why, when the perturbation W(t) is much smaller than H₀(t), the motion of the vector |ψ_I(t)⟩ is much slower than that of |ψ_S(t)⟩.

b. Show that the preceding differential equation is equivalent to the integral equation:

|ψ_I(t)⟩ = |ψ_I(t₀)⟩ + (1/iℏ) ∫_{t₀}^{t} dt′ W_I(t′) |ψ_I(t′)⟩

where |ψ_I(t₀)⟩ = |ψ_S(t₀)⟩.

c. Solving this integral equation by iteration, show that the ket |ψ_I(t)⟩ can be expanded in a power series in W of the form:

|ψ_I(t)⟩ = [ 1 + (1/iℏ) ∫_{t₀}^{t} dt′ W_I(t′) + (1/(iℏ)²) ∫_{t₀}^{t} dt′ W_I(t′) ∫_{t₀}^{t′} dt″ W_I(t″) + ... ] |ψ_I(t₀)⟩

16. Correlations between two particles
(It is recommended that Complement EIII be read in order to answer question e of this exercise.)
Consider a physical system formed by two particles (1) and (2), of the same mass m, which do not interact with each other and which are both placed in an infinite potential well of width a (cf. Complement HI, § 2-c). Denote by H(1) and H(2) the Hamiltonians of each of the two particles and by |φₙ(1)⟩ and |φ_q(2)⟩ the corresponding eigenstates of the first and second particle, of energies n²π²ℏ²/(2ma²) and q²π²ℏ²/(2ma²). In the state space of the global system, the basis chosen is composed of the states |φₙ φ_q⟩ defined by:

|φₙ φ_q⟩ = |φₙ(1)⟩ ⊗ |φ_q(2)⟩

a. What are the eigenstates and the eigenvalues of the operator H = H(1) + H(2), the total Hamiltonian of the system? Give the degree of degeneracy of the two lowest energy levels.

b. Assume that the system, at time t = 0, is in the state:

|ψ(0)⟩ = (1/√6)|φ₁φ₁⟩ + (1/√3)|φ₁φ₂⟩ + (1/√6)|φ₂φ₁⟩ + (1/√3)|φ₂φ₂⟩

α. What is the state of the system at time t?

β. The total energy H is measured. What results can be found, and with what probabilities?

γ. Same questions if, instead of measuring H, one measures H(1).

c. α. Show that |ψ(0)⟩ is a tensor product state. When the system is in this state, calculate the following mean values: ⟨H(1)⟩, ⟨H(2)⟩ and ⟨H(1)H(2)⟩. Compare ⟨H(1)⟩⟨H(2)⟩ with ⟨H(1)H(2)⟩; how can this result be explained?

β. Show that the preceding results remain valid when the state of the system is the state |ψ(t)⟩ calculated in b.

d. Now assume that the state |ψ(0)⟩ is given by:

|ψ(0)⟩ = (1/√5)|φ₁φ₁⟩ + √(3/5)|φ₁φ₂⟩ + (1/√5)|φ₂φ₁⟩

Show that |ψ(0)⟩ cannot be put in the form of a tensor product. Answer for this case all the questions asked in c.

e. α. Write the matrix, in the basis of the vectors |φₙφ_q⟩, that represents the density operator ρ(0) corresponding to the ket |ψ(0)⟩ given in b. What is the density matrix ρ(t) at time t? Calculate, at the instant t = 0, the partial traces:

ρ(1) = Tr₂ ρ   and   ρ(2) = Tr₁ ρ

Do the density operators ρ, ρ(1) and ρ(2) describe pure states? Compare ρ(1) ⊗ ρ(2) with ρ; what is your interpretation?

β. Answer the same questions as in α, but choosing for |ψ(0)⟩ the ket given in d.

The subject of the following exercises is the density operator: they therefore assume the concepts and results of Complement EIII to be known.

17. Let ρ be the density operator of an arbitrary system, and let |χₗ⟩ and πₗ be the eigenvectors and eigenvalues of ρ. Write ρ and ρ² in terms of the |χₗ⟩ and πₗ. What do the matrices representing these two operators in the basis {|χₗ⟩} look like – first, in the case where ρ describes a pure state, and then, in the case of a statistical mixture of states? (Begin by showing that, in a pure case, ρ has only one non-zero diagonal element, equal to 1, while for a statistical mixture, ρ has several diagonal elements included between 0 and 1.) Show that ρ corresponds to a pure case if and only if the trace of ρ² is equal to 1.

18. Consider a system whose density operator is ρ(t), evolving under the influence of a Hamiltonian H(t). Show that the trace of ρ² does not vary over time. Conclusion: can the system evolve so as to be successively in a pure state and a statistical mixture of states?

19. Let (1) + (2) be a global system, composed of two subsystems (1) and (2). Let A and B denote two operators acting in the state space E(1) ⊗ E(2). Show that the two partial traces Tr₁{AB} and Tr₁{BA} are equal when A (or B) actually acts only in the space E(1), that is, when A (or B) can be written:

A = A(1) ⊗ 1(2)   [or B = B(1) ⊗ 1(2)]

Application: if the operator H, the Hamiltonian of the global system, is the sum of two operators that act, respectively, only in E(1) and only in E(2):

H = H(1) + H(2)

calculate the variation dρ(1)/dt of the reduced density operator ρ(1). Give the physical interpretation of the result obtained.
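A compact numerical companion to Exercises 16-e, 17 and 18 (not part of the original text): it builds the global density operator for a two-particle pure state of the form used in Exercise 16-d (the coefficients below follow the reconstruction above and are otherwise illustrative), takes the two partial traces, and compares the traces of the squared operators: equal to 1 for the global pure state, smaller than 1 for the reduced operators of an entangled state.

```python
import numpy as np

# Two particles; keep only the two lowest well states |phi_1>, |phi_2> of each.
# Global state c[n,q] |phi_n phi_q>, with the coefficients of Exercise 16-d.
c = np.zeros((2, 2), dtype=complex)
c[0, 0] = 1 / np.sqrt(5)
c[0, 1] = np.sqrt(3 / 5)
c[1, 0] = 1 / np.sqrt(5)

psi = c.reshape(4)                          # |psi> in the product basis
rho = np.outer(psi, psi.conj())             # global density operator (pure state)

# Partial traces: rho1_{nm} = sum_q rho[(n,q),(m,q)],  rho2_{qp} = sum_n rho[(n,q),(n,p)]
rho4 = rho.reshape(2, 2, 2, 2)              # indices (n, q, m, p)
rho1 = np.einsum('nqmq->nm', rho4)          # reduced operator of particle (1)
rho2 = np.einsum('nqnp->qp', rho4)          # reduced operator of particle (2)

print("Tr rho^2  =", np.trace(rho @ rho).real)     # 1: the global state is pure
print("Tr rho1^2 =", np.trace(rho1 @ rho1).real)   # < 1: a statistical mixture
print("Tr rho2^2 =", np.trace(rho2 @ rho2).real)   # same value as for rho1
```

Repeating the run with the (factorizable) coefficients of question b instead gives Tr ρ(1)² = Tr ρ(2)² = 1, which is the purity criterion of Exercise 17 at work.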

References
Exercise 5: Flügge (1.24), §§ 40 and 41; Landau and Lifshitz (1.19), § 22.
Exercise 10: Levine (12.3), Chap. 14; Eyring et al. (12.5), § 18b.
Exercise 15: see references of Complement GIII.


• REVISITING ONE-DIMENSIONAL PROBLEMS

Now that we are more familiar with the mathematical formalism and the physical content of quantum mechanics, we can go into more detail on some of the results obtained in Chapter I. In the three complements that follow, we shall study in a general way the quantum properties of a particle subjected to a scalar potential¹ of arbitrary form, confining ourselves for simplicity to one-dimensional problems. We shall first focus on the bound stationary states of a particle, whose energies form a discrete spectrum (Complement MIII), and then treat the unbound states corresponding to an energy continuum (Complement NIII). In addition, we shall examine a special case that is very important because of its applications, particularly in solid state physics: that of a periodic potential (Complement OIII).

¹ The effects of a vector potential A will be studied later, in particular in Complement EVI.




Complement MIII
Bound states in a “potential well” of arbitrary shape

1. Quantization of the bound state energies
2. Minimum value of the ground state energy

In Complement HI, we studied, for a special case (finite or infinite “square” well), the bound states of a particle in a potential well. We derived certain properties of these bound states: a discrete energy spectrum and a ground state energy greater than the classical minimum energy. These properties are, in fact, general, and have numerous physical consequences, as we shall show in this complement.

When the potential energy of a particle possesses a minimum (see Figure 1a), the particle is said to be placed in a “potential well”¹. Before studying qualitatively the stationary states of a quantum particle in such a well, let us recall the corresponding motion of a classical particle. When its energy takes on the minimum possible value E = −V₀ (where V₀ is the depth of the well), the particle is motionless at the point M₀ whose abscissa is x₀. In the case where −V₀ < E < 0, the particle oscillates in the well, with an amplitude that increases with E. Finally, when E > 0, the particle does not remain in the well, but moves off towards infinity. The “bound states” of the classical particle therefore correspond to all negative energy values between −V₀ and 0.

For a quantum particle, the situation is very different. Well-defined energy states are stationary states whose wave functions φ(x) are solutions of the eigenvalue equation of the Hamiltonian H:

−(ℏ²/2m) d²φ(x)/dx² + V(x) φ(x) = E φ(x)    (1)

Such a second-order differential equation has an infinite number of solutions, whatever the value chosen for E: if we pick arbitrary values of φ(x) and its derivative at any given point, we can obtain φ for any other value of x. Equation (1) alone cannot, therefore, restrict the possible energy values. However, we shall show that if, in addition, we impose certain boundary conditions on φ(x), only a certain number of values of E remain possible (quantization of energy levels).

1. Quantization of the bound state energies

We shall call “bound states of the particle” states whose wave functions φ(x) satisfy the eigenvalue equation (1) and are square-integrable [indispensable if φ(x) is actually to describe the physical state of a particle]. These are therefore stationary states, for which the position probability density |φ(x)|² takes on non-negligible values only in a limited region of space [for ∫_{−∞}^{+∞} dx |φ(x)|² to converge, |φ(x)|² must approach zero sufficiently rapidly when |x| → ∞].

¹ The potential energy, of course, is only defined to within a constant. Following the usual convention, we set the potential equal to zero at infinity.




Bound states remind us of classical motion where the particle oscillates inside the well without ever being able to emerge (energy negative, but greater than −V₀). We shall see that in quantum mechanics, the fact that φ(x) is required to be square-integrable implies that the possible energies form a discrete set of values which are also included between −V₀ and 0.

To understand this, let us return to the potential shown in Figure 1a. For simplicity, we shall assume that V(x) is identically equal to zero outside an interval [x₁, x₂]. If x ≤ x₁ (region I), V(x) = 0, and the solution to equation (1) can immediately be written:

– if E > 0:

φ_I(x) = A e^{ikx} + A′ e^{−ikx}    (2)

with:

k² = 2mE/ℏ²    (3)

– if E < 0:

φ_I(x) = B e^{ρx} + B′ e^{−ρx}    (4)

with:

ρ² = −2mE/ℏ²    (5)

We are looking for a square-integrable solution; we must therefore eliminate the form (2), in which φ_I(x) is a superposition of plane waves of constant modulus which cause the integral:

∫_{−∞}^{x₁} dx |φ_I(x)|²    (6)

to diverge. Only possibility (4) remains, and we obtain our first result: the bound states of the particle all have a negative energy. In (4), we cannot retain the term in e^{−ρx}, which diverges when x → −∞. We are therefore left with:

φ_I(x) = e^{ρx}    if x ≤ x₁    (7)

V(x) a

x1 0

x2

x

φ(x)

b E < E3 0

c E = E3 0

d E > E3 0

x

x

x

Figure 1: Potential well (fig. a) situated between the points = 1 and = 2 . We choose a solution ( ) of the eigenvalue equation of which, for 1 , approaches zero exponentially when . We then extend this solution to the entire -axis. For an arbitrary energy value , ( ) diverges like ˜ ( )e when + : figure b represents the case where ˜ ( ) 0; figure d, that where ˜ ( ) 0. However, if the energy is chosen so as to make ˜ ( ) = 0, ( ) approaches zero exponentially when + (fig. c), and ( ) is square-integrable.

361

COMPLEMENT MIII



All that now remains to be done is to obtain the solution when this solution can be written: III (

) = ˜e

2

+ ˜e

(region III);

(8)

where ˜ and ˜ are real constants determined by the two continuity conditions for ( ) and d d at the point = 2 . ˜ and ˜ depend on , as well as on the function ( ) We have therefore constructed a solution of equation (1), such as the one shown in Figure 1b. Is this solution square-integrable? We see from (8) that, in general, it is not, except when ˜ is zero (this special case is shown in Figure 1c). Now, for a given function ( ), ˜ is a function of through the intermediary of . The only values of for which a bound state exists are therefore solutions of the equation ˜ ( ) = 0. These solutions 1 , 2 , ... (cf. Fig. 2) form a discrete spectrum which, of course, depends on the potential ( ) chosen (we shall see in the following section that all the energies are greater than 0 ).

B(E)

E1 – V0

E2

E3 0

E

Figure 2: Graphical representation of the function ˜ ( ). The zeros of ˜ ( ) give the values of for which ( ) is square-integrable (the situation in Figure 1c), that is, the energies 1 , 2 , 3 ... of the bound states; all these energies are included between 0 and 0.

We thus arrive at the following result: the bound state energy values possible for a particle placed in a potential well of arbitrary shape form a discrete set (it is often said that the bound state energies are quantized). This result can be compared to the quantization of electromagnetic modes in a cavity. There is no analogue in classical mechanics, where, as we have seen, all energy values included between 0 and 0 are acceptable. In quantum mechanics, the lowest energy level 1 is called the ground state, the energy level 2 immediately above, the first excited state, the next energy level 3 , the second excited state, etc. The following schematic diagram is often associated with each of these states: inside the potential well representing ( ), a horizontal line is drawn whose vertical position corresponds to the energy of the state and whose length gives an idea of the spatial extension of the wave function (this line actually covers the points of the axis which would be reached by a classical particle of the same energy). For the set of energy levels, we obtain a schematic diagram of the type shown of Figure 3. As we saw in Chapter I, the phenomenon of energy quantization was one of the factors which led to the introduction of quantum mechanics. Discrete energy levels appear in a very large number of physical systems: atoms (cf. Chap. VII, hydrogen atom), the harmonic oscillator (cf. Chap. V), atomic nuclei, etc. 362



BOUND STATES IN A “POTENTIAL WELL” OF ARBITRARY SHAPE

V(x)

0

x

E3 E2 E1 – V0

Figure 3: Schematic representation of the bound states of a particle in a potential well. For each of these stationary states, one draws a horizontal line whose ordinate is equal to the energy of the corresponding level. The ends of this line are the points of intersection with the curve representing the potential ( ). The line is confined to the region of classical motion for the same energy, and gives an idea of the extension of the wave function.

2.

Minimum value of the ground state energy

We now show that the energies 1 , 2 , etc... are all greater than the minimum value ( ). We shall see how this result can be easily understood 0 of the potential energy using Heisenberg’s uncertainty relation. If ( ) is a solution of (1), we obtain, multiplying this equation by ( ) and integrating the relation thus obtained: ~2 2

+

d

( )

d2 ( )+ d 2

+

d

( )

( )2

+

= For a bound state, the function written simply: =

( )2

d

(9)

( ) can be normalized, and equation (9) can be

+

(10)

with: =

+

~2 2

d

( )

d2 ~2 ( ) = d 2 2

+

d

d d

2

( )

[where we have performed an integration by parts and used the fact that zero when ] and:

(11) ( ) goes to

+

=

d

( ) ( )2

(12) 363

COMPLEMENT MIII



Relation (10) shows simply that

is the sum of the mean value of the kinetic energy:

2

=

(13)

2

and that of the potential energy: =

( )

(14)

From relations (11) and (12), it follows immediately that: 0

(15) +

d (

0)

( )2=

0

(16)

Consequently: =

+

0

(17)

Since is negative, as we showed in § 1, we see that, as in classical mechanics, the bound state energies are always between 0 and 0. There exists, nevertheless, an important difference between the classical and quantum situations: while, in classical mechanics, the particle can have an energy equal to 0 (case of a particle at rest at 0 ) or slightly greater than 0 (case of small oscillations), the same is not true in quantum mechanics, where the lowest possible energy is the energy 1 of the ground state, which is necessarily greater than 0 (cf. Fig. 3). The Heisenberg uncertainty relations enable us to understand the physical origin of this result, as we now show. If we try to construct a state of the particle for which the mean potential energy is as small as possible, we see from (12) that we must choose a wave function which is practically localized at the point 0 . The root mean square deviation ∆ is then very small, so ∆ is necessarily very large. Since: 2

= (∆ )2 +

2

(∆ )2

(18)

2 the kinetic energy = 2 is thus also very large. Therefore, if the potential energy of the particle approaches its minimum, the kinetic energy increases without bound. The wave function of the ground state corresponds to a compromise, for which the sum of these two energies is a minimum. The ground state of the quantum particle is thus characterized by a wave function that has a certain spatial extension (cf. Fig. 3), and its energy is necessarily greater than 0 . Unlike the situation in classical mechanics, there exists no well-defined energy state in quantum mechanics where the particle is “at rest” at the bottom of the potential well.

Comment:

Since the energy of the bound states is included between 0 and 0, such states can exist only if the potential ( ) takes on negative values in one or several regions of the -axis. This is why we have chosen for this complement a potential 364



BOUND STATES IN A “POTENTIAL WELL” OF ARBITRARY SHAPE

V(x)

V2 V1 x0 0

x

– V0

Figure 4: Potential well of depth 0 situated between two potential barriers of height 1 and 2 (assuming, for example, 1 2 ). Classically, there exist particle states whose energy is between 0 and 1 that remain confined between the two barriers. In quantum mechanics, a particle whose energy is between 0 and 1 can penetrate the barrier by the tunnel effect (§ 2-b- of Complement HI ); consequently, the bound states always have energies between 0 and 0.

“well” like the one shown in Figure 1a (while in the following complement, we shall not confine ourselves to the case of a potential well). However, there is nothing to prevent ( ) from being positive for certain values of ; for example, the “well” can be surrounded by potential “barriers” as is shown in Figure 4 (we shall always assume the potential to be zero at infinity). In this case, certain classical motions of positive energy remain bounded, while in quantum mechanics, the same reasoning as above shows that the bound states always have an energy between 0 and 0. Physically, this difference arises from the fact that a potential barrier of finite height is never able to make a quantum particle turn back completely: the particle always has a non-zero probability of passing through by the tunnel effect.

References and suggestions for further reading:

Feynman III (1.2), § 16-6; Messiah (1.17), Chap. III, § II; Ayant and Belorizky (1.10), Chap. IV, §§ 1, 2, 3; Schiff (1.18), § 8.

365



UNBOUND STATES OF A PARTICLE IN THE PRESENCE OF A POTENTIAL WELL OR BARRIER

Complement NIII Unbound states of a particle in the presence of a potential well or barrier of arbitrary shape

1

2 3

Transmission matrix ( ) . . . . . . . . . . . 1-a Definition of ( ) . . . . . . . . . . . . . . . 1-b Properties of ( ) . . . . . . . . . . . . . . . Transmission and reflection coefficients . . . Example . . . . . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

368 368 370 372 373

In complement MIII , we showed that bound states of a particle placed in a potential ( ) have negative energies1 and that they exist only if ( ) is an attractive potential (a potential well which allows classical bounded motion). We had to reject positive energy values since they led to eigenfunctions ( ) of the Hamiltonian which, at infinity, behaved like superpositions of non square-integrable exponentials e . Nevertheless, we saw as early as Chapter I, that, by superposing such functions linearly, one can construct square-integrable wave functions ( ) (wave packets) which can therefore represent the physical state of a particle. It is clear that, since the states thus obtained involve several values of (that is, of the energy), they are no longer stationary states; the wave function ( ) therefore evolves over time, propagating and becoming deformed. However, the fact that ( ) is already expanded in terms of the eigenfunctions ( ) enables us to calculate this evolution very simply [as we did, for example, in complement JI , where we used the properties of the ( ) to calculate the transmission and reflection coefficients of a potential barrier, the delay upon reflection, etc.]. This is why, despite the fact that each of the ( ) cannot alone represent a physical state, it is useful to study the positive energy eigenfunctions2 of , as we have already done, in complement HI , for certain square potentials. In this complement, we are going to study in a general way (confining ourselves, nevertheless, to one-dimensional problems) the effect of a potential ( ) on the positive energy eigenfunctions ( ). We shall assume nothing about the shape of ( ), which may present one or several barriers, wells, etc., except that ( ) goes to zero outside a finite interval [ 1 , 2 ] of the -axis. We shall show that, in all cases, the effect of ( ) on the functions ( ) can be described by a 2 2 matrix, ( ), which possesses a certain number of general properties. We shall thus obtain various results that are independent of the shape of the potential ( ) chosen. For example, we shall see that the transmission and reflection coefficients of a barrier (whether symmetrical or not) are the same for a particle coming from the left and for a particle of the same energy coming from the right. An additional aim of this complement NIII is to serve as the point of 1 Recall

that we chose the energy origin so as to make ( ) zero at infinity. might also consider studying the non square-integrable negative energy eigenfunctions of (those whose energies do not belong to the discrete spectrum obtained in complement MIII ). However, these functions diverge very rapidly (exponentially) at infinity, and one could not obtain square-integrable wave functions by linearly superposing them. 2 One

367

COMPLEMENT NIII

• V(x)



0

l 2

+

x

l 2

Figure 1: The potential ( ) under consideration varies in an arbitrary way within the interval 2 2 and goes to zero outside this interval.

departure for the calculations of complement OIII , in which we study the properties of a particle in a periodic potential ( ). 1.

Transmission matrix

1-a.

Definition of

( )

( )

In a one-dimensional problem, consider a potential ( ) which is zero outside an interval [ 1 2 ] of length , but which varies in an arbitrary way inside this interval (Fig. 1). We choose the origin to be in the middle of the interval [ 1 2 ], so as to have ( ) vary only for 2. The equation satisfied by every wave function ( ) associated with a stationary state of energy is: d2 2 + 2 [ 2 d ~

( )]

( )=0

(1)

In the rest of this complement, we shall choose, to characterize the energy, the parameter given by: =

2

(2)

~2

In the region

2, the function e

the solution of this equation that is identical to e

satisfies equation (1); let us call for

2 necessarily a linear combination of two independent solutions e gives us: if if

2 +

2

:

( )=e

:

( )=

. When and e

( )

+ , ( ) is 2 of (1). This

(3a) ( )e

+

( )e

(3b)

where ( ) and ( ) are coefficients which depend on , as well as on the shape of the potential under study. Similarly, we can introduce the solution ( ), which, for 2, is equal to e : if if 368

2 +

2

:

( )=e

:

( )=

(4a) ( )e

+

( )e

(4b)



UNBOUND STATES OF A PARTICLE IN THE PRESENCE OF A POTENTIAL WELL OR BARRIER

The most general solution ( ) of equation (1) (of second order in ), for a given value of (that is, of ), is a linear combination of and : ( )=

( )+

( )

(5)

Relations (3a) and (4a) imply that: if

2

:

( )=

e

+

e

(6a)

+ ˜ e

(6b)

while relations (3b) and (4b) yield: if

+

2

( ) = ˜e

:

with: ˜=

( )

+

( )

˜ =

( )

+

( )

By definition, the matrix ( )

( )

( )

( )

(7) ( ) is the 2

2 matrix:

( )=

(8)

which allows us to write relations (7) in the matrix form: ˜ ˜

=

( )

(9)

( ) therefore enables us to determine, given the behavior (6a) of the wave function to the left of the potential, its behavior (6b) to the right. We call ( ) the “transmission matrix” of the potential.

Comment:

The current associated with a wave function ( ) is: ( )=

~

( )

2

d d

( )

d d

(10)

Differentiating, we find: d d

( )=

~ 2

( )

d2 d 2

( )

d2 d

2

(11)

Taking (1) into account, we obtain: d d

( )=0

(12) 369



COMPLEMENT NIII

Therefore, the current ( ) associated with a stationary state is the same at all points of the -axis. Note, moreover, that (12) is simply the one-dimensional analogue of the relation: div J(r) = 0

(13)

which is valid, according to relation (D-11) of Chapter III, for any stationary state of a particle moving in three-dimensional space. According to (12), the current ( ) associated with ( ) can therefore be calculated for any , choosing either the form (6a) or the form (6b) of ( ): ( )= 1-b.

~

2

Properties of

2

=

~

˜2

˜

2

(14)

( )

It is easy to show, using the fact that the function ( ) is real, that if ( ) is a solution of equation (1), so is ( ). Now consider the function ( ), which is a solution of (1); comparison of (3a) and (4a) shows that it is identical to therefore have, for all : ( )=

( ) when

( )

2

. We

(15)

Substituting relations (3b) and (4b) into this relation, we obtain: ( )=

( )

(16)

( )=

( )

(17)

It follows that the matrix ( )

( ) can be written in the simplified form:

( )

( )=

(18) ( )

( )

We saw above [cf. (12)] that the probability current for a stationary state. We must therefore have [cf. (14)]: 2

for any ˜2

2

= ˜2

and ˜

2

˜

2

( ) does not depend on

(19)

. Now relations (9) and (18) yield: =[ ( ) + [ ( ) + =

( )

2

( )

][

( ) ( )

2

( ) ][

( ) 2

+

( ) +

2

( )

] ] (20)

Condition (19) is therefore equivalent to: ( )2 370

( ) 2 = Det

( )=1

(21)



UNBOUND STATES OF A PARTICLE IN THE PRESENCE OF A POTENTIAL WELL OR BARRIER

Comments:

( ) We have made no particular assumptions about the shape of the potential. If it is even, that is, if ( ) = ( ), the matrix ( ) possesses an additional property: it can be shown that ( ) is a pure imaginary. ( ) Relations (6) show that and ˜ are the coefficients of “incoming” plane waves, i.e. waves associated with particles arriving respectively from = and = + and moving towards the zone of influence of the potential (incident particles). On the other hand, ˜ and are the coefficients corresponding to “outgoing” waves, associated with particles moving away from the potential (transmitted or reflected particles). It is useful to introduce the matrix , which allows us to calculate the amplitude of the outgoing waves in terms of that of the incoming waves: ˜ = ( ) ˜

(22)

( ) can easily be expressed in terms of the elements of the matrix we now show. The relations:

( ), as

˜=

( )

+

( )

(23a)

˜ =

( )

+

( )

(23b)

imply that: =

1 ˜ ( )

( )

(24)

Substituting this relation into (23a), we obtain: ˜=

1 ( )

( )

( )

( )

( )

+

( )˜

Taking (21) into account, we can then write the matrix

( )=

1 ( )

1

(25) ( ):

( ) (26)

( )

1

It is easy to verify, using (21) again, that: ( )

( )=

( ) ( )=1

(27)

( ) is therefore unitary. This matrix plays an important role in collision theory; we could have proved its unitary property from that of the evolution operator (cf. Complement FIII ), which simply expresses the conservation over time of the total probability of finding the particle somewhere on the Ox axis (norm of the wave function). 371

COMPLEMENT NIII

2.



Transmission and reflection coefficients

To calculate the reflection and transmission coefficients for a particle encountering the potential ( ), one should (as in complement JI ) construct a wave packet with the eigenfunctions of which we have just studied. Consider, for example, an incident particle of energy coming from the left. The corresponding wave packet is obtained by superposing functions ( ), for which we set ˜ = 0, with coefficients given by a function ( ) which has a marked peak in the neighborhood of = = 2 ~2 . We shall not go into these calculations in detail here; they are analogous in every way to those of complement JI . They show that the reflection and transmission coefficients are equal, respectively, to ( ) ( ) 2 and ˜( ) ( ) 2 . ˜ Since = 0, relations (22) and (26) yield: 1 ( ) ( ) ( ) ( ) ( )

˜( ) = ( )=

(28)

The reflection and transmission coefficients are therefore equal to: 1(

( ) ( )

)=

˜( ) 1( ) = ( )

2

= 2

=

2

( ) ( )

(29a)

1 ( )

(29b)

2

[it is easy to verify that condition (21) insures that 1 ( ) + 1 ( ) = 1]. If we now consider a particle coming from the right, we must take gives: ( )˜ ( ) ( ) 1 ˜( ) ( )

˜( ) = ( )=

= 0, which

(30)

The transmission and reflection coefficients are now equal to: 2(

( ) ˜( )

2

)=

˜( ) ˜( )

2

)=

=

1 ( )

(31a)

2

and: 2(

=

( ) ( )

2

(31b)

Comparison of (29) and (31) shows that 1 ( ) = 2 ( ) and that 1 ( ) = 2 ( ): for a given energy, the transparency of a barrier (whether symmetrical or not) is therefore always the same for particles coming from the right and from the left. In addition, from (21) we have: ( ) 372

1

(32)



UNBOUND STATES OF A PARTICLE IN THE PRESENCE OF A POTENTIAL WELL OR BARRIER

V(x)

Figure 2: Square potential barrier.

0

l 2



+

x

l 2

If (32) becomes an equality, the reflection coefficient is zero and the transmission coefficient is equal to 1 (resonance). On the other hand, the inverse situation is not possible: since (21) imposes that ( ) ( ) , one can never have = 0 and = 1 [except in the case where and tend simultaneously towards infinity]. Actually, such a situation can only occur for = 0. To see this, divide the function ( ) defined in (3) by ( ). If ( ) goes to infinity, the wave function will be identically zero on the left hand side, and hence necessarily, by extension, zero on the right hand side. However, this is impossible unless = 0 and = . 3.

Example

Let us return to the square potentials studied in § 2-b of complement HI : in the region , ( ) is equal3 to a constant 0 (see Figure 2, where 2 2 to be positive). First, let us assume that is smaller than 0 , and set: =

2 ( ~2

0

0

has been chosen

)

(33)

An elementary calculation analogous to the one in complement HI yields: 2

cosh

+

2

sinh

2

2 0

e

2

sinh

( )=

(34) 2 0

2

2

sinh

cosh

2

2

sinh

e

with: 0

(

0

=

2

0

(35)

~2

is necessarily positive here, since we have assumed

0 ).

3 In

fact, we are considering here a barrier that is displaced relative to that of complement HI , since we are assuming it to be situated between = 2 and = + 2, instead of between = 0 and = .

373



COMPLEMENT NIII

If now we assume that =

2 ( ~2

0,

we set:

0)

(36)

and: 0

2

=

(where

0

(37)

~2 = +1 if

0

0 and 2

cos

+ 2

+

1 if

0

0). We thus obtain:

2

sin

2 0

e

2

sin

( )=

(38) 2 0

2

2

sin

It is easy to verify that the matrices (17) and (21).

cos

2

2

sin

e

( ) written in (34) and (38) satisfy relations (16),

References and suggestions for further reading:

Merzbacher (1.16), Chap. 6, §§ 5, 6 and 8; see also the references of complement MIII .

374



QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE

Complement OIII Quantum properties of a particle in a one-dimensional periodic structure

1

2

3

Passage through several successive identical potential barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-a Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-b Matching conditions . . . . . . . . . . . . . . . . . . . . . . . 1-c Iteration matrix ( ) . . . . . . . . . . . . . . . . . . . . . . 1-d Eigenvalues of ( ) . . . . . . . . . . . . . . . . . . . . . . . Discussion: the concept of an allowed or forbidden energy band . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-a Behavior of the wave function ( ) . . . . . . . . . . . . . . 2-b Bragg reflection; possible energies for a particle in a periodic potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Quantization of energy levels in a periodic potential; effect of boundary conditions . . . . . . . . . . . . . . . . . . . . . . 3-a Conditions imposed on the wave function . . . . . . . . . . . 3-b Allowed energy bands: stationary states of the particle inside the lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-c Forbidden bands: stationary states localized on the edges . .

376 377 377 379 380 381 381 382 383 384 386 390

In this complement, we are going to study the quantum properties of a particle placed in a potential ( ) having a periodic structure. The functions ( ) which we shall consider will not necessarily be periodic in the strict sense of the term; it suffices for them to have the shape of a periodic function in a finite region of the -axis (Fig. 1), that is, to be the result of juxtaposing times the same motif at regular intervals [ ( ) is truly periodic only in the limit ]. Such periodic structures are encountered, for example, in the study of a linear molecule formed by atoms (or groups of atoms) which are identical and equally spaced. They are also encountered in solid state physics, when one chooses a one-dimensional model in order to understand the disposition of the energy levels of an electron in a crystal. If is very large (as in the case of a linear macromolecule or a macroscopic crystal), the potential ( ) is given in a wide region of space by a periodic function, and the properties of the particle can be expected to be practically the same as they would be if ( ) were really periodic. However, from a physical point of view, the limit of infinite is never attained, and we shall be concerned here with the case where is arbitrary. To study the effect of the potential ( ) on an eigenfunction ( ) of the Hamiltonian , of eigenvalue , we shall introduce a 2 2 matrix, the iteration matrix , which depends on . We shall show that the behavior of ( ) is totally different depending on whether the eigenvalues of the iteration matrix are real or imaginary. Since these eigenvalues depend on the energy chosen, we shall find it useful to distinguish between 375

COMPLEMENT OIII



domains of energy corresponding to real eigenvalues and those which lead to imaginary eigenvalues. The concept of an allowed or forbidden energy band will thus be introduced. Comments:

( ) For the sake of convenience, we shall speak of a “potential barrier” to designate the motif which, repeated times, gives the potential ( ) (Fig. 1). However, this motif can also be a “potential well” or have an arbitrary shape. ( ) Common usage in solid state physics reserves the letter to designate a parameter that appears in the expression of the stationary wave functions, and which is not simply proportional to the square root of the energy. To conform to this usage, we shall henceforth use a notation slightly different from that of Complement NIII ; we shall replace by , setting: =

2

(1)

~2

and we shall not introduce the letter until later (we shall see that is directly related to the eigenvalues of the matrix when they are complex). 1.

Passage through several successive identical potential barriers

Consider a potential ( ) which is obtained by juxtaposing barriers as in Figure 1: the first barrier is centered at = 0, the second, at = , the third, at = 2 , ..., the last at = ( 1) . We intend to study the behavior, during passage through this set of barriers, of an eigenfunction ( ) which is a solution of the eigenvalue equation of : d2 2 + 2[ 2 d ~ where

and

( )]

( )=0

(2)

are related by (1).

V(x)

l – 2

0

l 2

3l 2

5l 2

7l 2

x

Figure 1: Potential ( ) having a periodic structure obtained by juxtaposing same motif ( = 4 in the figure).

376

times the

• 1-a.

QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE

Notation

To the left of the

barriers, that is, for

2

solution of equation (2) is: if

2

:

( )=

e

0

+

0

,

( ) is zero, and the general

e

(3a)

Consider, as in § 1-a of Complement NIII , the two functions ( ) and ( ) which here become ( ) and ( ). In the region of the first barrier, centered at = 0, the general solution of (2) is written: if

2

2

:

( )=

( )+

1

( )

1

(3b)

Similarly, in the region of the second barrier, centered at if

3 : 2

2

( )=

(

2

)+

(

2

= , we obtain: )

(3c)

and, more generally, in the region of the th barrier, centered at if

(

1)

(

2

1) +

( )=

[

Finally, to the right of the zero, and we have: if

(

1) +

:

2

1) :

:

2 (

1) ] +

[

(

barriers, that is, for

( )=

=(

0

e

[

(

1) ]

+

1) ]

(

0

e

1) + , 2 [

(

(3d) ( ) is again

1) ]

We must now match these various expressions for the wave function =

2

+

1-b.

2

(

(3e) ( ) at

1) + . This is what we shall do in the following section. 2

Matching conditions

The functions and depend on the form of the potential chosen. We shall show, however, that it is simple to calculate them, and their derivatives as well, at the two edges of each barrier, by using the results of Complement NIII . To do so, let us imagine that all but one of the barriers are removed, leaving, for example, the th one, centered at = ( 1) . Solution (3d), always valid inside this barrier, must then be extended to the left and to the right by superposing plane waves. These waves are obtained by replacing, in formulas (6a) and (6b) of NIII , by ( 1) and by , and adding an index to , , ˜, ˜ . Thus we have, if the th barrier is isolated: for ( 1) : 2 e

[

(

1) ]

+

e

[

(

1) ]

(4) 377

COMPLEMENT OIII

for

1) + : 2

( ˜ e

with: ˜ ˜



[

(

1) ]

=

+ ˜ e

[

(

1) ]

(5)

( )

(6)

where, with the change in notation taken into account, ( ) is the matrix ( ) introduced in Complement NIII . Consequently, at the left edge of the th barrier, the function ( ) defined in (3d) has the same value and the same derivative as the superposition of plane waves (4). Similarly, at the right edge of this barrier, it has the same value and the same derivative as (5). These results enable us to write simply the matching conditions in the periodic structure. At the left edge of the first barrier (that is, at = 2), it is sufficient to note that (3a) has the same value and the same derivative as 1 e + 1 e , which yields directly: 0

=

0

=

1

(7)

1

(a result which was obvious from NIII ). At the right edge of the first barrier, which is the same as the left edge of the second one, we must write that ˜1 e + ˜1 e and 2 e ( ) + 2 e ( ) have the same value and the same derivative, which yields: 2

= ˜1 e

2

= ˜1 e

Similarly, at the junction of the

(8)

, we obtain, 2 setting equal the value and derivative of (5) and those of the expression obtained by replacing by + 1 in (4): +1

= ˜ e

+1

= ˜ e

th and ( + 1)th barriers

=

(9)

Finally, at the right edge of the last barrier

, we must write 2 that (3e) has the same value and the same derivative as the expression obtained by replacing by in (5), which yields:

378

0

= ˜

0

= ˜

=(

1) +

(10)

• 1-c.

QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE

Iteration matrix

( )

Let us introduce the matrix e

( ) defined by:

0

( )=

(11) 0

e

It enables us to write the matching condition (9) in the form: +1

˜ ( ) ˜

=

+1

(12)

that is, taking (6) into account: +1

=

( )

( )

(13)

+1

Iterating this equation and using (7), we obtain: +1

=[ ( )

( )]

+1

1 1

=[ ( )

( )]

0

(14)

0

Finally, the matching condition (10) can be transformed by using (6) and (14): 0

=

( )

=

( )[ ( )

0

1

( )]

0

(15)

0

that is:

0

=

( ) ( )

( ) ( )

( )

( )

0

0

(16)

0

matrices

( )

In this formula, which enables us to go from

0

to

0

0

, a matrix

( ) is associated

0

with each barrier, and another matrix ( ) with each interval between two successive barriers. Relations (13) and (14) demonstrate the importance of the role played by the matrix: ( )=

( )

( )

which enters to the

(17) th power when one goes from

1 1

one performs a translation through a distance

to

+1

, that is, when

+1

along the periodic structure. For this 379

COMPLEMENT OIII



reason, we shall call ( ) the “iteration matrix”. Using formula (18) of Complement NIII and expression (11) for ( ), we obtain: e

( )

e

( )

( )=

(18) e

( )e

( )

The calculation of [ ( )] is facilitated if we change bases so as to make diagonal; for this reason we shall study the eigenvalues of ( ). 1-d.

Eigenvalues of

Let written: e

( )

be an eigenvalue of

( )

e

( )

( ). The characteristic equation of the matrix (18) is ( )2=0

( )

(19)

that is, taking into account relation (21) of Complement NIII : 2

2

where

( )+1=0

(20)

( ) is the real part of the complex number e

( ) = Re e

( ) =

( ):

1 Tr ( ) 2

(21)

Recall [cf. Complement NIII , relation (21)] that the modulus of the same is therefore true of e ( ). The discriminant of the second-degree equation (20) is: ∆ = [ ( )]2

( ) is greater than 1;

1

(22)

Two cases may then arise: ( ) If the energy ( )

is such that:

1

(for example, if, in Figure 2,

(23) is between

0

and

1 ),

one can set:

( ) = cos[ ( ) ]

(24)

with: 0

( ) A simple calculation then shows that the eigenvalues of =e

( )

(25) ( ) are given by: (26)

There are therefore two eigenvalues, which are complex conjugates and whose modulus is equal to 1. 380



QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE

Y(α)

α2 α3 –1

+1

X(α)

O α4 α1

α0

Figure 2: Variation with respect to of the complex number e ( ) = ( ) + ( ). Since ( ) 1, the curve obtained in the complex plane falls outside the circle of unit radius centered at . The following discussion shows that if ( ) is less than 1, that is, if the value of chosen gives a point of the curve which is between the two vertical dashed lines of the figure, the corresponding energy falls in an “allowed band”; in the opposite case, it falls in a “forbidden band”.

( ) If, on the other hand, the energy ( )

gives a value of

such that:

1

(27)

(for example, if, in Figure 2, ( )=

is between

1

and

2 ),

one sets:

cosh[ ( ) ]

(28)

with: ( ) and

0

(29)

= +1 if = e

( ) is positive,

2-a.

1 if

( ) is negative. We then find:

( )

In this case, both eigenvalues of 2.

=

(30) ( ) are real, and they are each other’s inverse.

Discussion: the concept of an allowed or forbidden energy band Behavior of the wave function

( )

To apply (14), we begin by calculating the two column matrices Λ1 ( ) and Λ2 ( ) associated with the eigenvectors of ( ) and corresponding respectively to the eigenval381



COMPLEMENT OIII

ues

1

and

1

=

2.

1(

We then decompose the column matrix

1

into the form:

1

)Λ1 ( ) +

2(

)Λ2 ( )

(31)

1

which enables us to obtain directly: 1

=

1

1(

)Λ1 ( ) +

1 2

2(

)Λ2 ( )

(32)

It is clear from this expression that the behavior of the wave function is very different depending on whether ( ) is smaller or greater than 1 in the energy domain of the wave function. In the first case, formula (26) shows that the effect of traversing successive barriers is expressed in (32) by a phase shift in the components of the column matrix onto Λ1 ( ) and Λ2 ( ). The behavior of ( ) here recalls that of a superposition of imaginary exponentials. On the other hand, if the energy is such that ( ) 1, formula (30) indicates that only one of the two eigenvalues (for example, 1 ) has a modulus greater than 1. For sufficiently large, we have, as a result: 1 (

e

1) ( )

1(

)Λ1 ( )

(33)

and therefore increase exponentially with [except in the special case where ) = 0]; the wave function ( ) then increases in modulus as it traverses the successive potential barriers, and its behavior recalls that of a superposition of real exponentials. 1(

2-b.

Bragg reflection; possible energies for a particle in a periodic potential

Depending on whether ( ) behaves like a superposition of real or imaginary exponentials, the resulting phenomena can reasonably be expected to be different. Let us evaluate, for example, the transmission coefficient ( ) of the set of identical barriers. For these barriers, relation (15) shows that the matrix ( )[ ( )] plays a role analogous to the one played by ( ) for a single barrier. Now, according to relation (29b) of Complement NIII , the transmission coefficient ( ) is expressed in terms of the element of this matrix which is placed in the first row and the first column [the inverse of ( ) is equal to the square of the modulus of this element]. What happens if the energy of the particle is chosen so as to make the eigenvalues of ( ) real, that is, given by (30)? When becomes sufficiently large, the eigenvalue 1 = e ( ) becomes dominant, and the matrix [ ( )] 1 increases exponentially with [as can also be seen from relation (33)]. Consequently, the transmission coefficient decreases exponentially: ( )

e

2

( )

(34)

In this case, for large values of , the set of potential barriers reflects the particle practically without fail. This is explained by the fact that the waves scattered by the different potential barriers interfere totally destructively for the transmitted wave, and constructively for the reflected wave. This phenomenon can therefore be likened to Bragg reflection. Note, moreover, that this destructive interference for the transmitted wave can be produced even if the energy is greater than the height of the barrier (a case where, in classical mechanics, the particle is transmitted). 382

1



QUANTUM PROPERTIES OF A PARTICLE IN A ONE-DIMENSIONAL PERIODIC STRUCTURE

Nevertheless, if the transmission coefficient of an isolated barrier is very close to 1, we have ( ) 1 [for example, in Figure 2, ( ) 1 if , that is, the energy, approaches infinity]. The point representing the complex number e ( ) is then very close to the circle of unit radius centered at . Figure 2 shows that the regions of the energy axis where ( ) 1, that is, where total reflection occurs, are very narrow and can practically be seen as isolated energy values. Physically, this is explained by the fact that, if the energy of the incident particle is much larger than the amplitude of variation of the potential ( ), its momentum is well-defined, as is the associated wavelength. The Bragg condition = (where is an integer) then gives well-defined 2 energy values. If, on the other hand, the energy of the particle falls in a domain where the eigenvalues are of modulus 1 as in (26), the elements of the matrix [ ( )] 1 no longer approach infinity when does. Under these conditions, the transmission coefficient ( ) does not approach zero when the number of barriers is increased. We are again dealing with a purely quantum mechanical phenomenon, related to the wave-like nature of the wave function, which enables it to propagate in the regular periodic potential structure without being exponentially attenuated. Note especially that the transmission coefficient ( ) is very different from the product of the individual transmission coefficients of the barriers taken separately (this product approaches zero when since all the factors are smaller than 1). The quantization of energy levels for a particle placed in a series of identical and evenly spaced potential wells (i.e. a periodic potential ( )) is another interesting problem, particularly in solid state physics. It will be studied in detail in § 3, but we can already guess the form of the spectrum of possible energies. If we assume that the energy of the particle is such that ( ) 1, equation (33) shows that the coefficients and become infinite when . It is clear that this possibility must be rejected, since it means that the wave function does not remain bounded. The corresponding energies are therefore forbidden; hence, the name of forbidden bands given to the energy domains for which ( ) 1. On the other hand, if the energy of the particle is such that ( ) 1, and remain bounded when ; the corresponding regions of the energy axis are called allowed bands. To sum up, the energy spectrum is composed of finite intervals inside which all the energies are acceptable, separated by regions all of whose energies are forbidden. 3.

Quantization of energy levels in a periodic potential; effect of boundary conditions

Consider a particle of mass m placed in the potential V(x) shown in Figure 3. In the region −l/2 ≤ x ≤ Nl + l/2, V(x) has the form of a periodic function, composed of a series of N + 1 successive barriers of height V₀, centered at x = 0, l, 2l, ..., Nl. Outside this region, V(x) undergoes arbitrary variations over distances comparable to l, then becomes equal to a positive constant value V_e. In what follows, the region [0, Nl] will be called “inside the lattice” and the limiting regions x ≲ −l/2 and x ≳ Nl + l/2, “ends (or edges) of the lattice”. Physically, such a function V(x) can represent the potential seen by an electron in a linear molecule or in a crystal (in a one-dimensional model). The potential wells at x = l/2, 3l/2, ... then correspond to the attraction of the electron by the various ions.


Figure 3: Variation with respect to x of the potential V(x) seen by an electron in a “one-dimensional crystal” and on its edges. Inside the crystal, the potential has a periodic structure; V(x) is maximum between the ions (barriers at x = 0, l, 2l, ...) and minimum at the positions of the ions (wells at x = l/2, 3l/2, ...). On the edges of the crystal, V(x) varies in a more or less complicated way over a distance comparable to l, then rapidly approaches a constant value V_e.

Far from the crystal (or the molecule), the electron is not subjected to any attractive force, which is why V(x) rapidly becomes constant outside the region −l/2 ≤ x ≤ Nl + l/2. The potential V(x) that we have chosen fits perfectly into the framework of Complement MIII (apart from a change in the energy origin). We already know, therefore, that the bound states of the particle form a discrete spectrum of energies, all less than V_e. However, the potential V(x) picked here also presents the remarkable peculiarity of having a periodic structure of the type considered in § 1 above; relying on the results of that section, we shall show that the conclusions of Complement MIII take on a special form in this case. For example, in Complement MIII we stressed the fact that it is the boundary conditions [φ(x) → 0 when x → ±∞] that introduce the quantization of the energy levels. The boundary conditions of the problem we are studying here, that is, the variation of the potential at the edges of the lattice, might thus be expected to play a critical role in determining the possible energies. Actually, this is not the case: we shall see that these energies depend practically only on the values of V(x) in the region where it is periodic, and not on the edge effects (on condition, of course, that the number N of potential wells is sufficiently large). In addition, we shall verify the result obtained intuitively in § 2-b, showing that most of the possible energies are grouped in allowed energy bands. Only a few stationary states, localized near the edges, depend in a critical manner on the variation of V(x) in this region and can have an energy which falls in a forbidden band. We shall therefore proceed essentially as in Complement MIII, first examining precisely the conditions imposed on the wave function φ(x) of a stationary state.

3-a. Conditions imposed on the wave function

In the region where V(x) is periodic, relation (3d) gives the form of the wave function φ(x); the coefficients a_q and b_q are determined from (32). To write (32) more explicitly, let us set:

c₁(α) Λ₁(α) = (F₁(α), G₁(α))ᵀ ,   c₂(α) Λ₂(α) = (F₂(α), G₂(α))ᵀ   (35)

where c₁(α) and c₂(α) are the components of the column vector (a₁, b₁)ᵀ on the two eigenvectors Λ₁(α) and Λ₂(α) of Q(α). We then obtain:

a_q = F₁(α) [λ₁(α)]^(q−1) + F₂(α) [λ₂(α)]^(q−1)
b_q = G₁(α) [λ₁(α)]^(q−1) + G₂(α) [λ₂(α)]^(q−1)   (36)

Now let us examine the boundary conditions on the wave function φ(x). First of all, to the left, far from the lattice, V(x) is equal to V_e and φ(x) is written in the form:

φ(x) = C(α) e^{ρ(α) x}   (37a)

with:

ρ(α) = √(2m(V_e − E)) / ħ   (37b)

(we eliminate the solution in e^{−ρ(α)x}, which diverges when x → −∞). The probability current associated with the function (37) is zero (cf. Complement BIII). Now, for a stationary state, this current is independent of x [cf. Complement NIII, relation (12)]; it therefore remains zero at all x, even inside the lattice. According to relation (14) of Complement NIII, the coefficients a_q and b_q therefore necessarily have the same modulus. Thus, if we choose to express the boundary conditions on the left as relations between the coefficients a₁ and b₁ [that is, by writing that the expression for φ(x) for x ≲ −l/2 is the extension of the wave function (37)], we find a relation of the form:

b₁ = e^{iθ(α)} a₁   (38a)

θ(α) is a real function of α (and therefore of the energy E) which depends on the precise behavior of V(x) at the left-hand edge of the lattice [in what follows, we shall not need the exact expression for this function θ(α); the essential point is that the boundary conditions on the left have the form (38a)]. The same type of reasoning can obviously be applied on the right (x ≳ Nl + l/2). The boundary conditions are written:

b_{N+1} = e^{iχ(α)} a_{N+1}   (38b)

where the real function χ(α) depends on the behavior of V(x) on the right-hand edge of the lattice. To sum up, we can say that the quantization of the energy levels can be obtained in the following manner:
– we start with two coefficients a₁ and b₁ that satisfy (38a); this ensures that the function φ(x) will remain bounded when x → −∞. Since φ(x) is defined to within a constant factor, we can choose, for example:

a₁ = e^{−iθ(α)/2} ,   b₁ = e^{+iθ(α)/2}   (39)

– we then calculate, using (36), the coefficients a_q and b_q so as to extend the wave function chosen throughout the whole crystal. Note that the condition (39) implies that φ(x) is real (cf. Complement NIII, § 1-b); the calculation of a_q and b_q must therefore yield:

b_q = a_q*   (40)

– finally, we write that the coefficients a_{N+1} and b_{N+1} satisfy (38b), a relation ensuring that φ(x) will remain bounded when x → +∞. In fact, relation (40) shows that the ratio b_{N+1}/a_{N+1} is automatically a complex number of unit modulus; condition (38b) therefore amounts to an equality between the phases of two complex numbers. We thus obtain a real equation in α, which has a certain number of real solutions giving the allowed energies. We are going to apply this method, distinguishing between two cases: eigenvalues of Q(α) of modulus 1 [the case where |X(α)| ≤ 1] and real eigenvalues [the case where |X(α)| > 1].

3-b. Allowed energy bands: stationary states of the particle inside the lattice

First assume that the energy E is in a domain where |X(α)| ≤ 1.

α. Form of the quantization equation

Taking (26) into account, relations (36) become:

a_q = F₁(α) e^{i(q−1)k(α)l} + F₂(α) e^{−i(q−1)k(α)l}
b_q = G₁(α) e^{i(q−1)k(α)l} + G₂(α) e^{−i(q−1)k(α)l}   (41)

We also know that the choice (39) of a₁ and b₁ implies that b_q = a_q* for all q. Now, it is easy to show that relations (41) yield two complex conjugate numbers only if:

)=

2(

)

2(

)=

1(

)

(42)

Condition (38b) can then be written:

F₁*(α) e^{−iNk(α)l} + F₂*(α) e^{iNk(α)l} = e^{iχ(α)} [ F₁(α) e^{iNk(α)l} + F₂(α) e^{−iNk(α)l} ]   (43)

This equation in α

gives the quantization of the energy levels. To solve it, let us set:

Θ(α) = Arg { [ F₂(α) e^{iχ(α)/2} − F₁*(α) e^{−iχ(α)/2} ] / [ F₂*(α) e^{−iχ(α)/2} − F₁(α) e^{iχ(α)/2} ] }   (44)

[Θ(α) can, in principle, be calculated from θ(α), χ(α) and the matrix Q(α)]. Equation (43) can then be written simply:

e^{2iNk(α)l} = e^{iΘ(α)}   (45)

The energy levels are therefore given by:

k(α) = Θ(α)/(2Nl) + nπ/(Nl)   (46)

with:

n = 0, 1, 2, ..., (N − 1)   (47)

[the other values of n must be excluded, since condition (25) forces k(α) to vary within an interval of width π/l]. We can already see that if N is very large, we can write equation (46) in the simplified form:

k(α) ≃ nπ/(Nl)   (48)

β. Graphical solution; locating the energy levels

If we substitute definition (24) of k(α) into (46), we obtain an equation in α that gives the allowed energies. To solve it graphically, let us begin by tracing the curve that represents the function X(α) = Re[e^{iαl}/t(α)]. Because of the imaginary exponential e^{iαl}, we expect this curve to have an oscillatory behavior, of the type shown in Figure 4a. Since 1/|t(α)| is greater than 1 [cf. Complement NIII, relation (32)], the amplitude of the oscillation is greater than 1, and the curve intersects the two straight lines X(α) = ±1 at certain values α₀, α₁, α₂, ... of the variable α. We then eliminate all regions of the α-axis, bounded by these values, where the condition |X(α)| ≤ 1 is not satisfied. Using the set of arcs of curves thus obtained for X(α), we must represent the function:

k(α) = (1/l) Arccos X(α)   (49)

Taking into account the form of the Arc cosine function (cf. Fig. 5), we are led to the curve whose shape is shown in Figure 4b. Equation (46) indicates that the energy levels correspond to the intersections of this curve with the curves representing the functions Θ(α)/(2Nl) + nπ/(Nl), that is, if N ≫ 1, with the horizontal lines whose equations are k = nπ/(Nl) (with n = 0, 1, 2, ..., N − 1).

Figure 4: Variation with respect to α of X(α) = Re[e^{iαl}/t(α)] (see Fig. 2) and of k(α) = (1/l) Arccos X(α). The values of α (that is, of the energy E) associated with stationary states are obtained (if N ≫ 1) by cutting the curve which represents k(α) with the horizontal lines whose equations are k = nπ/Nl (n = 0, 1, 2, ..., N − 1). The allowed bands are thus revealed (intervals α₀ ≤ α ≤ α₁, etc.); each includes N very closely spaced levels. The forbidden bands are represented by the shaded areas (α₁ ≤ α ≤ α₂, etc.). The dashed-line curves correspond to the special case where V(x) = 0 (a free particle).

Figure 5: The Arc cosine function.

We thus obtain groups of N levels, associated with equidistant values of k(α) and situated in the allowed bands defined by α₀ ≤ α ≤ α₁, α₂ ≤ α ≤ α₃, etc. Between these allowed bands are the forbidden bands (we shall examine their properties in § 3-c). If we consider a particular allowed band, we can locate each level according to the value of k(α) which corresponds to it. This leads to choosing k as the variable and considering α and, consequently, the energy E as functions α(k) and E(k) of k. The variation of α with respect to k is given directly by the curve of Figure 4b, so it suffices to evaluate the function ħ²α²/2m to obtain the energy E(k). The corresponding curve has the shape shown in Figure 6.

Comment:
It is clear from Figure 4-b that, to a given value of k, correspond several values of α and therefore of the energy; this is why several arcs appear in Figure 6.

Figure 6: Variation of the energy E with respect to the parameter k. The solid lines correspond to the energies of the first two allowed bands (the values of k which give the energy levels being equidistant inside the interval [0, π/l]). The dashed lines correspond to the special case where the potential V(x) is zero (a free particle); the allowed bands are then contiguous, and there are no forbidden bands.


Nevertheless, if, within a given allowed band, X(α) increases steadily from −1 to +1 (or decreases steadily from +1 to −1), only one energy level corresponds to each value of k for this band, and this band includes N energy levels.

γ. Discussion

The preceding calculations show how, when we go from N = 1 to very high values of N, we move gradually from a set of discrete levels to allowed energy bands. Rigorously, these bands are formed by discrete levels, but their separation is so small for a macroscopic lattice that they practically constitute a continuum. When k is taken as a parameter, the density of states (the number of possible energies per unit interval of k) is constant and equal to Nl/π. This property, which is very useful, explains why k is generally chosen as the variable. An important point appears in going from (46) to (48): when N is large, the edge effects of the lattice, which enter only through the intermediary of the functions θ(α), χ(α) and, in (46), Θ(α), no longer play any role; only the form of the periodic potential inside the lattice is important in determining the possible energies. It is interesting to consider the two following limiting cases:

(i) If V(x) = 0 (free particle), we have:

t(α) = 1 ,   X(α) = cos αl   (50)

and we obtain:

k(α) = α          if 0 ≤ α ≤ π/l
k(α) = 2π/l − α   if π/l ≤ α ≤ 2π/l
etc.   (51)

(the corresponding broken line is shown in Figure 4b as a dashed line). Relation (50) shows that the condition |X(α)| ≤ 1 is always satisfied: as we know, forbidden bands do not exist for a free particle. Figure 6 thus enables us to see the effect of the potential V(x) on the curve E(k). When forbidden bands appear, the curves representing the energy become deformed so as to have horizontal tangents for k = 0 and k = π/l (edges of the band). Unlike what happens for a free particle, there exists a point of inflection for each band where the energy varies linearly with k.

(ii) If the transmission coefficient T(α) is practically zero, we have [cf. Complement NIII, equations (29) and (21)]:

|t(α)| ≪ 1 ,   |X(α)| ≫ 1   (52)

In Figure 2, the point representing the complex number e^{iαl}/t(α) is very far from the origin. We thus see in this figure that the regions of the α-axis where |X(α)| ≤ 1 are extremely narrow. The allowed bands therefore shrink if the transmission coefficient of the elementary barrier decreases; in the limit of zero transmission, they reduce to the individual levels of an isolated well. Inversely, as soon as the tunnel effect allows the particle to pass from one well to the next one, each of the discrete levels of the well gives rise to an energy band, whose width increases as the transmission coefficient grows. We shall return to this property in Complement FXI.
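As a hedged numerical illustration (not from the text), the graphical construction of this subsection can be reproduced in the delta-barrier (Kronig–Penney) limit, where the half-trace of the one-period transfer matrix plays the role of X(α); the period l and strength λ below are arbitrary assumptions, with ħ = m = 1.

```python
import numpy as np

# Minimal sketch (not from the text): allowed bands and the dispersion E(k)
# for a lattice of delta-function barriers. In an allowed band |X(alpha)| <= 1
# and k(alpha) = arccos(X)/l, cf. relation (49).

l, lam = 1.0, 2.0                                       # assumed values
alphas = np.linspace(1e-3, 12.0, 20000)
X = np.cos(alphas*l) + (lam/alphas)*np.sin(alphas*l)    # plays the role of X(alpha)

allowed = np.abs(X) <= 1.0
k = np.full_like(alphas, np.nan)
k[allowed] = np.arccos(X[allowed]) / l                  # folded into [0, pi/l]
E = 0.5 * alphas**2                                     # E = (hbar*alpha)^2 / 2m

# band edges = boundaries of the allowed regions of the alpha axis
edges = np.where(np.diff(allowed.astype(int)) != 0)[0]
print("approximate band edges (in energy):")
print(np.round(E[edges], 3))

# For a finite lattice of N wells, each band contains N levels, at the
# equidistant values k_n = n*pi/(N*l), n = 0, ..., N-1 [relation (48)].
```

Plotting E against k on the allowed regions reproduces curves of the type shown in Figure 6, with horizontal tangents at the band edges.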

3-c. Forbidden bands: stationary states localized on the edges

α. Form of the equations; energy levels

Let us now assume that E belongs to a domain where |X(α)| > 1. According to (30), relations (36) can then be written:

a_q = F₁(α) [λ₁(α)]^(q−1) + F₂(α) [λ₁(α)]^{−(q−1)}
b_q = G₁(α) [λ₁(α)]^(q−1) + G₂(α) [λ₁(α)]^{−(q−1)}   (53)

where the eigenvalue λ₁(α) is now real, with |λ₁(α)| > 1.

The fact that b_q = a_q* for all q means that we must have:

G₁(α) = F₁*(α) ,   G₂(α) = F₂*(α)   (54)

The quantization condition (38b) then takes on the form:

F₁*(α) + F₂*(α) [λ₁(α)]^{−2N} = e^{iχ(α)} { F₁(α) + F₂(α) [λ₁(α)]^{−2N} }   (55)

that is:

[λ₁(α)]^{−2N} = L(α)   (56)

where the real function L(α) is defined by:

L(α) = [ F₁*(α) e^{−iχ(α)/2} − F₁(α) e^{iχ(α)/2} ] / [ F₂(α) e^{iχ(α)/2} − F₂*(α) e^{−iχ(α)/2} ]   (57)

Consider the case where [λ₁(α)]^{2N} ≫ 1; we then have [λ₁(α)]^{−2N} ≈ 0, and equation (56) reduces to:

L(α) = 0   (58)

The energy levels situated in the forbidden bands are therefore given by the zeros of the function L(α) (cf. Fig. 7). N enters neither into (57) nor into (58), so the number of these levels does not depend on N (unlike the number of levels situated in an allowed band). Consequently, when N ≫ 1, it can be said that practically all the levels are grouped in the allowed bands.

β. Discussion

The situation here is radically different from the one encountered in § 3-b: the number N, that is, the length of the lattice, plays no role (provided, nevertheless, that it is sufficiently large); on the other hand, definition (57) of L(α) shows that the functions θ(α) and χ(α) play an essential role in the problem. Since we already know that these functions depend on the behavior of V(x) on the edges of the lattice, we expect to obtain states localized in these regions. This is indeed the case. Equations (57) and (58) offer two possibilities:

(i) if F₁(α) ≠ 0, the fact that L(α) = 0 requires that:

G₁(α) = F₁*(α) = e^{iχ(α)} F₁(α)   (59)



Figure 7: Variation of L(α) with respect to α in a forbidden band. The zeros of L(α) give the stationary states which are localized on the edges of the lattice.

Let us return to definition (35) of F₁(α) and G₁(α); we see that relation (59) shows that the wave function constructed from the first eigenvector of Q(α) satisfies the boundary conditions on the right. This is easy to understand: if we start at x ≲ 0 with an arbitrary wave function which satisfies the boundary conditions on the left, the column vector (a₁, b₁)ᵀ has components on the two eigenvectors of Q(α); the coefficients a_{N+1} and b_{N+1} are then (if N ≫ 1) essentially given by (33), which expresses the fact that the column matrix (a_{N+1}, b_{N+1})ᵀ is proportional to the column matrix of the first eigenvector of Q(α). Note that since the eigenvalue λ₁(α) is greater than 1, the wave function grows exponentially when x increases. The stationary state given by the first eigenvector of Q(α) is therefore localized at the right end of the lattice.

(ii) if F₁(α) = 0, (54) gives G₁(α) = 0, and definitions (35) imply that c₁(α) = 0: the corresponding stationary state is associated with the second eigenvector of Q(α). Aside from the fact that this state is localized at the left end of the lattice, the conclusions obtained in (i) remain valid.
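The grouping of the spectrum into allowed bands of about N levels can also be checked by directly diagonalizing a finite lattice. The sketch below is not part of the original text; the grid, the barrier height and the barrier width are arbitrary assumptions (ħ = m = 1), and depending on how the lattice is terminated a few additional levels may appear in the gaps, localized near the edges as discussed above.

```python
import numpy as np

# Minimal illustration (not from the text): diagonalize a finite chain of
# square barriers on a grid and check that the low-lying spectrum groups
# into allowed bands of about N levels separated by forbidden gaps.

l, v0, b = 1.0, 40.0, 0.2          # period, barrier height, barrier width (assumed)
N, pts = 30, 40                    # number of cells, grid points per cell (assumed)
dx = l / pts
x = (np.arange(N * pts) + 0.5) * dx

# periodic array of barriers centered at 0, l, 2l, ... (hard walls outside)
V = np.where((x % l < b / 2) | (x % l > l - b / 2), v0, 0.0)

# three-point finite-difference Hamiltonian  H = -(1/2) d^2/dx^2 + V
n = len(x)
main = V + 1.0 / dx**2
off = -0.5 / dx**2 * np.ones(n - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

E = np.linalg.eigvalsh(H)[:3 * N]

# a level spacing much larger than its neighbours signals a band gap
spacings = np.diff(E)
gap_positions = np.where(spacings > 5 * np.median(spacings))[0]
print("number of levels below each gap:", gap_positions + 1)   # expect about N, 2N, ...
print("gap widths:", np.round(spacings[gap_positions], 2))
```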

References and suggestions for further reading: Merzbacher (1.16), Chap. 6, § 7; Flügge (1.24), §§ 28 and 29; Landau and Lifshitz (1.19), § 104; see also solid state physics texts (section 13 of the bibliography).


Chapter IV

Application of the postulates to simple cases: spin 1/2 and two-level systems

A. Spin 1/2 particle: quantization of the angular momentum
   A-1 Experimental demonstration
   A-2 Theoretical description
B. Illustration of the postulates in the case of a spin 1/2
   B-1 Actual preparation of the various spin states
   B-2 Spin measurements
   B-3 Evolution of a spin 1/2 particle in a uniform magnetic field
C. General study of two-level systems
   C-1 Outline of the problem
   C-2 Static aspect: effect of coupling on the stationary states of the system
   C-3 Dynamical aspect: oscillation of the system between the two unperturbed states

In this chapter, we intend to illustrate the postulates of quantum mechanics, which we stated and discussed in Chapter III. We shall apply them to simple concrete cases, in which the dimension of the state space is finite (equal to two). The interest of these examples is not confined to their mathematical simplicity, which will allow a better understanding of the postulates and their consequences. It is also based on their physical importance: they exhibit typically quantum mechanical behavior which can be verified experimentally. In §§ A and B, we study the spin 1/2 case (which we shall take up again in more detail in Chapter IX). First, we describe (§ A-1) a fundamental experiment that revealed


the quantization of a simple physical quantity, the angular momentum. We shall see that the component along Oz of the angular momentum (or magnetic moment) of a neutral paramagnetic atom can take on only certain values, which belong to a discrete set. Thus, for a silver atom in its ground state, there are only two possible values (+ħ/2 and −ħ/2) for the component S_z of its angular momentum: a silver atom in the ground state is said to be a spin 1/2 particle. In § A-2, we indicate how quantum mechanics describes the “spin variables” of such a particle. In situations where one can dispense with a quantum treatment of the “external variables” r and p, the state space of the particle (“spin state space”) has only two dimensions. We shall then (§ B) be able to illustrate and discuss the quantum mechanical postulates in this particularly simple case: we shall first see how to prepare silver atoms in any desired arbitrary spin state, in a real experiment. We shall then show how the measurement of the spin components of such silver atoms enables us to verify the quantum mechanical postulates experimentally. By integrating the corresponding Schrödinger equation, we shall study the evolution of a spin 1/2 particle in a uniform magnetic field (Larmor precession). Finally, in § C, we shall begin the study of two-level systems. Although these systems are not generally spin 1/2 particles, their study leads to calculations very similar to those developed in §§ A and B. We shall treat in detail the effect of an external perturbation on the stationary states of a two-level system and use this very simple model to point out important physical effects.

A. Spin 1/2 particle: quantization of the angular momentum

A-1. Experimental demonstration

First of all, we are going to describe and analyze the Stern-Gerlach experiment, which demonstrated the quantization of the components of an angular momentum (sometimes called “space quantization”).

A-1-a. The Stern-Gerlach apparatus

The experiment consists of studying the deflection of a beam of neutral paramagnetic atoms (in this case, silver atoms) in a highly inhomogeneous magnetic field. The apparatus used is shown schematically¹ in Figure 1. Silver atoms contained in a furnace E, heated to a high temperature, leave through a small opening and propagate in a straight line in the high vacuum existing inside the whole apparatus. A collimating slit F selects those atoms whose velocity is parallel to a particular direction that we shall choose for the Oy axis. The atomic beam thus constructed traverses the gap of an electromagnet A before condensing on a plate P. Let us describe the characteristics of the magnetic field B produced by the electromagnet A. This magnetic field has a plane of symmetry (which we shall designate by yOz) that contains the initial direction of the atomic beam. In the air-gap, it is the same at all points situated on any given line parallel to Oy (the edges of the electromagnet are parallel to Oy, and we neglect edge effects). B has no component along Oy. Its largest component is along Oz; it varies strongly with z: in Figure 1-b, the field lines are much closer together close to the north pole than close to the south pole of the magnet.

¹ We only indicate the most important characteristics of this equipment. A more detailed description of the experimental technique can be found in a textbook on atomic physics.

Figure 1: Schematic diagram of the Stern-Gerlach experiment. Figure a shows the trajectory of a silver atom emitted from the high-temperature furnace E. This atom is deflected by the gradient of the magnetic field created by the electromagnet A and then condenses at N on the plate P. Figure b shows a cross section of the electromagnet A; the lines of force of the magnetic field are shown as dashed lines. B_z has been assumed to be positive and ∂B_z/∂z, negative. Consequently, the trajectory of figure a corresponds to a negative component M_z of the magnetic moment, that is, to a positive component of S_z (γ is negative for a silver atom).

Of course, since the magnetic field has a conserved flux (div B = 0), it must also have a component along Ox which varies with the distance from the plane of symmetry yOz.

A-1-b. Classical calculation of the deflection²

Note, first, that the silver atoms, being neutral, are not subjected to the Lorentz force. On the other hand, they possess a permanent magnetic moment M (they are paramagnetic atoms); the resulting forces are derived from the potential energy:

W = − M · B   (A-1)

The existence, for an atom, of an electronic magnetic moment M and an angular momentum S is due to two causes: the motion of the electrons about the nucleus (the corresponding rotation of the charges being responsible for the appearance of an orbital magnetic moment) and the intrinsic angular momentum, or spin (cf. Chapter IX), of the electrons, with which is associated a spin magnetic moment. It can be shown (as we shall

² We only give here an outline of the calculation.

Figure 2: The silver atom possesses a magnetic moment M and an angular momentum S which are proportional. Consequently, the effect of a uniform magnetic field B is to cause M to turn about B with a constant angular velocity (Larmor precession).

assume here without proof³) that, for a given atomic level, M and S are proportional:

M = γ S   (A-2)

The proportionality constant γ is called the gyromagnetic ratio of the level under consideration. Before the atoms traverse the electromagnet, the magnetic moments of the silver atoms that form the atomic beam are oriented randomly (isotropically). Let us study the action of the magnetic field on one of these atoms, whose magnetic moment M has a given direction at the entrance of the air-gap. From expression (A-1) for the potential energy, it is easy to deduce that the resultant of the forces exerted on the atom is:

F = ∇(M · B)   (A-3)

(this resultant would be equal to zero if the field B were uniform), and that their total moment relative to the position of the atom is:

Γ = M × B   (A-4)

The angular momentum theorem can be written:

dS/dt = Γ   (A-5)

³ In the case of silver atoms in the ground state (like those of the beam), the angular momentum S is simply equal to the spin of the outer electron, which is therefore solely responsible for the existence of the magnetic moment M. This is because the outer electron has a zero orbital angular momentum, and the resultant orbital and spin angular momenta of the inner electrons are also zero. Moreover, the experimental conditions realized in practice are such that effects linked to the spin of the nucleus are negligible. This is why the silver atom in the ground state, like the electron, has a spin 1/2.

that is:

dS/dt = γ S × B   (A-6)

The atom thus behaves like a gyroscope (Fig. 2): dS/dt is perpendicular to S, and the angular momentum turns about the magnetic field, the angle θ between S and B remaining constant. The rotational angular velocity is equal to the product of the gyromagnetic ratio γ and the modulus of the magnetic field. The components of M which are perpendicular to the magnetic field therefore oscillate around zero, the component parallel to B remaining constant. To calculate the force F [formula (A-3)], we can, to a very good approximation, neglect in the potential energy the terms proportional to M_x and M_y and take M_z to be constant. This is because the frequency of oscillation due to the rotation of M is so great that only the time-averaged values of M_x and M_y can play a role in W, and these are both zero. Consequently, it is as if the atom were submitted to the sole force:

F = ∇(M_z B_z) = M_z ∇B_z   (A-7)

In addition, the components of ∇B_z along Ox and Oy are zero: ∂B_z/∂x = 0 because the magnetic field is independent of x (§ A-1-a above), and ∂B_z/∂y = 0 at all points of the plane of symmetry yOz. The force on the atom is therefore parallel to Oz and proportional to M_z. Since it is this force that produces the deflection HN of the atom (Fig. 1), HN is proportional to M_z (and hence, to S_z). Consequently, measuring HN is equivalent to measuring M_z or S_z. Since, at the entrance to the air-gap, the moments of the various atoms of the beam are distributed isotropically (all values of M_z included between −|M| and +|M| are found), we expect the beam to form a single pattern, symmetrical with respect to H, on the plate P. The upper bound N₁ and the lower bound N₂ of this pattern correspond in principle to the maximum and minimum values, +|M| and −|M|, of M_z. In fact, the dispersion of the velocities and the finite width of the slit F cause the atoms having a given value of M_z to condense, not at the same point, but in a spot centered about the deflection corresponding to the average velocity.

A-1-c. Results and conclusions

The results of the experiment (performed for the first time in 1922 by Stern and Gerlach) are in complete contradiction with the preceding predictions. We do not observe a single pattern centered at H, but two spots (Fig. 3) centered at the points N₁ and N₂, symmetrical with respect to H (the width of these two spots is due to the dispersion of the velocities and to the width of the slit F). The predictions of classical mechanics are therefore shown to be invalidated by the experiment. Now let us see how these experimental results can be interpreted. Of the physical quantities associated with a silver atom, some correspond to its external degrees of freedom (that is, are functions of its position r and its linear momentum p), and others to its internal degrees of freedom (also called spin degrees of freedom), M or S. Let us first show that, under these experimental conditions, it is not necessary to treat the external degrees of freedom quantum mechanically. To do this, we shall verify that it is possible, in order to describe the motion of the silver atoms, to construct wave

Figure 3: Spots observed on the plate in the Stern-Gerlach experiment. The magnetic moments M of the atoms emitted from the furnace are distributed randomly in all directions of space, so classical mechanics predicts that a measurement of M_z can yield with equal probability all values included between +|M| and −|M|. One should therefore observe only one large spot centered at H (dashed lines in the figure). In reality, the result of the experiment is completely different: two spots, centered at N₁ and N₂, are observed. This means that a measurement of M_z can yield only two possible results (quantization of the measurement result).

packets whose width ∆z and momentum dispersion ∆p_z satisfy the Heisenberg relation:

∆z · ∆p_z ≳ ħ   (A-8)

while remaining negligible on the scale of the experiment. Numerically, the mass m of a silver atom is equal to 1.8 × 10⁻²⁵ kg. The uncertainties ∆z and ∆v = ∆p_z/m must therefore be such that:

∆z · ∆v ≳ ħ/m ≈ 10⁻⁹ MKSA   (A-9)

A. SPIN 1/2 PARTICLE: QUANTIZATION OF THE ANGULAR MOMENTUM

A-2.

Theoretical description

We are now going to show how quantum mechanics describes the degrees of freedom of a silver atom, that is, of a spin 1/2 particle. We do not yet possess all the necessary elements for the presentation of a deductive and rigorous theory of the spin 1/2 particle. Such a study will be developed in Chapter IX, in the framework of the general theory of angular momentum. We shall therefore be forced here to assume without proof a small number of results which will be proved later, in Chapter IX. Such a point of view is justified by the fact that the essential goal of the present chapter is to show the reader how to handle the quantum mechanical formalism in a simple and concrete case, and not to focus on the angular momentum aspect of the spin 1/2. The idea is to give precise examples of kets and observables, to show how physical predictions can be extracted from them and how to distinguish clearly between the various stages of an experiment (preparation, evolution, measurement). We saw in Chapter III that with every measurable physical quantity must be associated, in quantum mechanics, an observable, that is, a Hermitian operator whose eigenvectors can form a basis in the state space. We must therefore define the state space and the observables corresponding to the components of S (S , S , S and, more generally, Su = S u, where u is an arbitrary unit vector), which we know from § A-1 to be measurable. A-2-a.

The observable

and the spin state space

With S we must associate an observable which has, according to the results of the experiment described in § A-1 above, two eigenvalues, +~ 2 and ~ 2. We shall assume (see Chap. IX) that these two eigenvalues are not degenerate, and we shall denote by + and the corresponding orthonormal eigenvectors: + =+ =

~ + 2 ~ 2

(A-10)

with: ++ = +

=1

=0

(A-11)

alone therefore forms a C.S.C.O., and the spin state space is the two-dimensional space spanned by its eigenvectors + and . The fact that these eigenvectors form a basis of is expressed by the closure relation: + + +

=

The most general (normalized) vector of

(A-12) is a linear superposition of + and

: =

+ +

(A-13)

with: 2

+

2

=1

(A-14) 399

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

z

u θ

Figure 4: Definition of the polar angles characterizing a unit vector u.

and

y

O

φ

x

In the +

(

)=

A-2-b.

~ 2

basis, the matrix representing

1

0

0

1

is diagonal and is written:

(A-15)

The other spin observables

With the S and S components of S will be associated the observables and . The operators and must be represented in the + basis by 2 2 Hermitian matrices. We shall see in Chapter VI that in quantum mechanics, the three components of an angular momentum do not commute with each other but satisfy well-defined commutation relations. This will enable us to show that, in the case of a spin 1/2, with which we are concerned here, the matrices representing and in the basis of the eigenvectors + and of are the following:

(

(

)=

)=

~ 2

~ 2

0

1

1

0

(A-16)

0 (A-17) 0

For the moment, we shall assume this result. As for the Su component of S along the unit vector u, characterized by the polar angles and (Fig. 4), it is written: Su = S u = S sin cos 400

+ S sin sin

+ S cos

(A-18)

B. ILLUSTRATION OF THE POSTULATES IN THE CASE OF A SPIN 1/2

Using (A-15), (A-16) and (A-17), we easily find the matrix that represents the corresponding observable u = S. u in the { + } basis: (

)=(

~ = 2

) sin cos cos

+( sin

) sin sin e

+(

) cos

i

(A-19) sin

ei

cos

In what follows, we shall need to know the eigenvalues and eigenvectors of the observables and . The calculations that enable us to obtain them from the matrices (A-16), (A-17) and (A-19) are not difficult. We shall only present the results here. The and operators have the same eigenvalues, +~ 2 and ~ 2, as . This result could have been expected, since it is always possible to rotate the SternGerlach apparatus as a whole so as to make the axis defined by the magnetic field parallel either to , to , or to u. Since all directions of space have the same properties, the phenomena observed on the plate of the apparatus must be unchanged under such rotations: the measurement of S , S or S can therefore yield only one of two results: +~ 2 or ~ 2. As for the eigenvectors of , and , we shall denote them respectively by , and (the sign in the ket is that of the corresponding eigenvalue). Their expansions on the basis of eigenvectors of is written: =

1 [+ 2

=

1 [+ 2

+

= cos =

B.

2

sin

]

i

]

2

e 2

(A-20)

e

(A-21)

+ + sin 2

2

+ + cos

2

e 2

e

(A-22a) 2

(A-22b)

Illustration of the postulates in the case of a spin 1/2

Using the formalism that we have just described, we are now going to apply the postulates of quantum mechanics to a certain number of experiments on silver atoms which can actually be performed with the Stern-Gerlach apparatus. We shall thus be able to discuss the consequences of these postulates in a concrete case. B-1.

Actual preparation of the various spin states

In order to make predictions about the result of a measurement, we must know the state of the system (here, the spin of a silver atom) immediately before the measurement. We are going to see how to prepare a beam of silver atoms so that they are all in a given spin state. 401

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

z B

O

N1

y F E

A

x

N2 P

Figure 5: When we pierce a hole in the plate at the position of the spot 1 , the atoms that pass through this hole are all in the spin state + . The Stern-Gerlach apparatus is then acting like a polarizer.

B-1-a.

Preparation of the states + and

Let us assume that we pierce a hole in the plate of the apparatus represented in Figure 1-a, at the position of the spot centered at 1 (Fig. 3). The atoms which are deflected downward continue to condense about 2 , while some of those which are deflected upward pass through the plate (Fig. 5). Each of the atoms of the beam which propagates to the right of the plate is a physical system on which we have just performed a measurement of the observable , the result being +~ 2. According to the fifth postulate of Chapter III, this atom is in the eigenstate corresponding to this result, that is, in the state + (since alone constitutes a C.S.C.O., the measurement result suffices to determine the state of the system after this measurement). The device in Figure 5 thus produces a beam of atoms which are all in the spin state + . This device acts like an “atomic polarizer”, since it acts the same way on atoms as an ordinary polarizer does on photons. Of course, if we pierced the plate around 2 and not around 1 , we would obtain a beam all of whose atoms would be in the spin state . B-1-b.

Preparation of the states

,

,

The observable also constitutes a C.S.C.O. since none of its eigenvalues is degenerate. To prepare one of its eigenstates, we must simply select, after a measurement of , the atoms for which this measurement has yielded the corresponding eigenvalue. In practice, if we rotate the apparatus of Figure 5 through an angle of + 2 about , we obtain a beam of atoms whose spin state is + (Fig. 6). This method can be generalized: by placing the Stern-Gerlach apparatus so that the axis of the magnetic field is parallel to an arbitrary unit vector u, and piercing the 4 plate either at 1 or at 2 , we can prepare silver atoms in the spin state + or . B-1-c.

Preparation of the most general state

We indicated above that the most general (normalized) ket of the spin state space is of the form: = 4 The

+ +

direction of the atomic beam is no longer necessarily concerns us here.

402

(B-1) , but this is not important in what

B. ILLUSTRATION OF THE POSTULATES IN THE CASE OF A SPIN 1/2

z A N2

y

O

F E B

x

N1

Figure 6: When the apparatus of Figure 5 is rotated through 90 about polarizer which prepares atoms in the spin state + .

, we obtain a

with: 2

2

+

=1

(B-2)

Is it possible to prepare atoms whose spin state is described by the corresponding ket ? We are going to show that there exists, for all , a unit vector u such that is collinear with the ket + u . We therefore choose two complex numbers and which satisfy relation (B-2) but which are arbitrary in every other respect. Taking (B-2) into account, we find that there necessarily exists an angle such that: cos sin

2 2

= (B-3) =

If, in addition, we impose: 0

(B-4)

the equation tan 2 = of the phases of and

determines uniquely. We already know that only the difference enters into the physical predictions. Let us therefore set: = Arg

Arg

(B-5)

= Arg

+ Arg

(B-6)

1 1 + 2 2 1 1 2 2

(B-7)

We thus have:

With this notation, the ket =e

2

cos

2

e

2

Arg

=

Arg

=

can be written:

+ + sin

2

e

2

(B-8) 403

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

If we compare this expression with formula (A-22a), we see that differs from the ket + (which corresponds to the unit vector u characterized by and ) only by the phase factor e 2 , which has no physical significance. Consequently, to prepare silver atoms in the state , it suffices to place the SternGerlach apparatus (with its plate pierced at 1 ) so that its axis is directed along the vector u whose polar angles are determined from and by (B-3) and (B-5). B-2.

Spin measurements

We saw in § A that a Stern-Gerlach apparatus enables us to measure the component of the angular momentum S of silver atoms along a given axis. We have just pointed out, in § B-1, that an apparatus of the same type can be used to prepare an atomic beam in a given spin state. Consequently, if we place two Stern-Gerlach magnets one after the other, we can verify experimentally the predictions of the postulates. The first apparatus acts like a “polarizer”: the beam coming out of it is composed of a large number of silver atoms all in the same spin state. This beam then enters the second apparatus, which is used to measure a specified component of the angular momentum S : this is, as it were, the “analyzer” (note the analogy with the optical experiment described in § A-3 of Chapter I). We shall assume in this section that the spin state of the atoms of the beam does not evolve between the time they leave the “polarizer” and the time they enter the “analyzer”, that is, between the preparation and the measurement. It would be easy to forgo this hypothesis, by using the Schrödinger equation to determine the spin evolution between the moment of preparation and the moment of measurement. B-2-a.

First experiment

Let us choose the axes of the two apparatuses parallel to (Fig. 7). The first one prepares the atoms in the state + and the second one measures S . What is observed on the plate of the second apparatus?

B2

B1

E1

F1

A1

A2 P1

P2

Figure 7: The first apparatus (a “source” composed of the furnace 1 and the slit 1 , plus a “polarizer” formed by the magnet 1 and the pierced plate 1 ) prepares the atoms in the state + . The second one (an “analyzer” composed of the magnet 2 and the plate 2 ) measures the component S . The result obtained is certain (+} 2).

Since the state of the system under study is an eigenstate of the observable which we want to measure, the postulates indicate that the measurement result is certain: we find, without fail, the corresponding eigenvalue (+~ 2). Consequently, all the atoms of 404

B. ILLUSTRATION OF THE POSTULATES IN THE CASE OF A SPIN 1/2

the beam must condense into a single spot on the plate of the second apparatus, at the position corresponding to +~ 2. This is indeed what is observed experimentally: all the atoms strike the second plate in the vicinity of 1 , none hitting near 2 . B-2-b.

Second experiment

Now let us place the axis of the first apparatus along the unit vector u, with polar angles , = (u is therefore contained in the plane). The axis of the second apparatus remains parallel to (Fig. 8). According to (A-22a), the spin state of the atoms when they leave the “polarizer” is (we ignore an irrelevant factor multiplying whole ket): =

cos

+ + sin

2

(B-9)

2

The “analyzer” measures S on these atoms. What are the results?

z θ

θ

O

y E1

x

B2

B1

F1 A1

P1

A2 P2

Figure 8: The first apparatus prepares the spins in the state + (u is the unit vector of the plane that makes an angle with ). The second one measures the S component. The possible results are +~ 2 (probability cos2 2) and ~ 2 (probability sin2 2).

This time, we find that certain atoms condense at 1 , and others at 2 , although they have all been prepared in the same way: there is an indeterminacy in the behavior of each of the atoms taken individually. The postulate of spectral decomposition merely enables us to predict the probability of each atom’s appearance at 1 or 2 . Since (B-9) gives the expansion of the spin state of an atom in terms of the eigenstates of the observable being measured, we can calculate directly that these probabilities are, respectively, cos2 2 and sin2 2. Thus, when enough atoms have condensed on the plate, we observe that the intensity of the spots at 1 and 2 corresponds to numbers of atoms which are proportional, respectively, to cos2 2 and sin2 2.

Comment:

For any value of the angle (except exactly 0 or ), it is therefore always possible to find the two results +~ 2 and ~ 2 in a measurement of . This prediction 405

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

may seem to a certain extent paradoxical. For example, if is very small, the spin at the exit of the first apparatus points in a direction which is practically , and yet one can find ~ 2 as well as +~ 2 in a measurement of (while, in classical mechanics, the result would be (~ cos ) 2 ~ 2). Nevertheless, the smaller , the smaller the probability of finding ~ 2. Moreover, we shall see later [formula (B-11)] that the mean value of the results which would be obtained in a large number of identical experiments is = ~2 cos , which corresponds to the classical result. B-2-c.

Third experiment

Let us take a “polarizer” positioned as in § B-2-b, so as to prepare atoms in the state (B-9), and let us rotate the “analyzer” until its axis is directed along , so that it measures the S component of the angular momentum. To calculate the predictions of the postulates in this case, we must expand the state (B-9) in terms of the eigenstates of the observable [formula (A-20)]. We find: +

1 (cos + sin ) = 2 2 2 1 (cos sin ) = 2 2 2

= =

cos( sin(

4

2

4

2

) (B-10) )

The probability of finding the eigenvalue +~ 2 of is therefore cos2 4 2 and 2 that of finding (~ 2), sin 4 . 2 It is possible to verify these predictions by measuring the intensity of the two spots on the plate situated at the exit of the second Stern-Gerlach apparatus.
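The predictions of §§ B-2-b and B-2-c are easy to reproduce numerically. The following sketch is not part of the original text; the polarizer angle θ is an arbitrary assumption and ħ is set to 1.

```python
import numpy as np

# Minimal sketch (not from the text): probabilities predicted by the postulate
# of spectral decomposition for the experiments of §§ B-2-b and B-2-c.
# The "polarizer" prepares |psi> = cos(theta/2)|+> + sin(theta/2)|-> (phi = 0);
# the "analyzer" then measures S_z or S_x.

theta = 0.8                                   # assumed orientation of the polarizer
psi = np.array([np.cos(theta/2), np.sin(theta/2)], dtype=complex)

# eigenvectors of S_z and S_x in the {|+>, |->} basis
plus_z, minus_z = np.array([1, 0]), np.array([0, 1])
plus_x, minus_x = np.array([1, 1])/np.sqrt(2), np.array([1, -1])/np.sqrt(2)

def prob(eigvec, state):
    """Probability of the corresponding measurement result: |<eigvec|state>|^2."""
    return abs(np.vdot(eigvec, state))**2

print(prob(plus_z, psi), np.cos(theta/2)**2)              # P(+hbar/2 for S_z) = cos^2(theta/2)
print(prob(minus_z, psi), np.sin(theta/2)**2)             # P(-hbar/2 for S_z) = sin^2(theta/2)
print(prob(plus_x, psi), np.cos(np.pi/4 - theta/2)**2)    # P(+hbar/2 for S_x), cf. (B-10)
print(prob(minus_x, psi), np.sin(np.pi/4 - theta/2)**2)   # P(-hbar/2 for S_x), cf. (B-10)
```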

Comment:

The fact that it is 4 2 that enters in here is not at all surprising: in § B-2-b, the angle between the axes of the two apparatus was ; it became 2 after rotation of the second apparatus. B-2-d.

Mean values

In the situation of § B-2-b, we find experimentally that, of a great number of atoms, cos2 2 arrive at 1 and sin2 2, at 2 . The measurement of S therefore yields +~ 2 for each of the first group and ~ 2 for each of the second. If we calculate the mean value of these results, we obtain: =

1

~ 2

~ = cos 2

cos2

2

~ 2

sin2

2 (B-11)

It is easy to verify from formulas (B-9) and (A-10) that this is indeed the value of the matrix element . 406

B. ILLUSTRATION OF THE POSTULATES IN THE CASE OF A SPIN 1/2

Similarly, the average of the measurement results obtained in the experiment of § B-2-c is equal to: = =

1

~ 2

cos2

4

2

~ 2

sin2

4

2

~ sin 2

(B-12)

To calculate the matrix element , we can use the matrix (A-16) which represents in the + basis. In this same basis, the ket is represented by the column cos

2

vector

, and the bra sin

by the corresponding row vector. We therefore have:

2

=

=

~ 2

cos

2

~ sin 2

sin

0

1

cos

2

1

0

sin

2

2

(B-13)

The mean value of S is indeed equal to the matrix element, in the state , of the associated observable . It is interesting to note that if we were dealing with a classical angular momentum of modulus ~ 2 directed along the axis of the “polarizer”, its components along and would be precisely (~ 2) sin and (~ 2) cos . More generally, if we calculate [using the same technique as in (B-13)] the mean values of , and in the state + [formula (A-22a)], we find: +

+

+

+

+

+

~ sin cos 2 ~ = sin sin 2 ~ = cos 2 =

(B-14)

These mean values are equal to the components of a classical angular momentum of modulus ~ 2 oriented along the vector u whose polar angles are and . Therefore, we can also establish here a relation between classical mechanics and quantum mechanics through the mean values. However, we must not lose sight of the fact that a measurement of S , for example, on a given atom will never yield ~2 sin cos : the only results which can be found are +~ 2 and ~ 2. Only in taking the average of values obtained in a large number of identical measurements (same state of the system, here + , and same observable measured, here ) do we obtain ~2 sin cos . Comment: It is useful to consider again at this stage the problem of external degrees of freedom (position, momentum).

407

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

a

b N1

N2 c N1

N2

Figure 9: When the spin is in the state + (fig. a) or (fig. b), the center of the wave packet follows a well-defined trajectory which can be calculated classically. When the spin state is a linear superposition of + and , the wave packet splits into two parts and it is no longer possible to say that the atom follows a classical trajectory (despite the fact that the spread of each of the packets is much smaller than the characteristic dimensions of the problem).

When a silver atom enters the second Stern-Gerlach apparatus in the spin state given by (B-9), we have just seen that it is impossible to predict with certainty whether it will condense at 1 or 2 . It seems difficult to reconcile this indeterminacy with the idea of a perfectly well-determined classical trajectory, given the initial state of the system. In fact, this is not a real paradox. To say that the external degrees of freedom can be treated classically means only that it is possible to form wave packets which are much smaller than all the dimensions of the problem. It does not necessarily mean, as we shall see, that the particle itself follows a classical trajectory. Let us first consider a silver atom which enters the apparatus in the initial spin state + . The wave function which describes the external degrees of freedom of this particle is a wave packet whose spread is very small and whose center follows the classical trajectory of Figure 9-a. Similarly, if the silver atom enters with the spin state , the center of the wave packet associated with it follows the classical trajectory of Figure 9-b. If we now consider an atom which enters with the spin state of formula (B-9), the corresponding initial state is a well-defined linear superposition of the two preceding initial states. Since the Schrödinger equation is linear, the wave function of the particle at a subsequent instant (Fig. 9-c) is a linear superposition of the two wave packets of Figures 9-a and 9-b. The particle therefore has a certain probability amplitude of being in one or the other of these two wave packets. We see that it does not follow a classical trajectory at all, unlike what happens to the centers of the two wave packets. Upon arrival on the screen, the wave function has non-zero values in two different regions, each

408

B. ILLUSTRATION OF THE POSTULATES IN THE CASE OF A SPIN 1/2

very localized, around the points 1 and 2 . The particle can therefore appear either near 1 or near 2 , and we cannot predict with certainty at which of these two points the appearance will occur. Note that the two wave packets of Figure 9-c do not represent two different particles; they represent only one particle, whose wave function has two parts, each of which is very localized about a different point. The two wave packets, moreover, have a welldefined phase relation because they arise from the same initial wave packet, split into two under the influence of the gradient of B. We could recombine them to form one wave packet again by removing the screen (that is, by not performing the measurement) and by submitting them to a new field gradient, whose sign would be the opposite of the first one. B-3.

Evolution of a spin 1/2 particle in a uniform magnetic field

B-3-a.

The interaction Hamiltonian and the Schrödinger equation

Consider a silver atom in a uniform magnetic field B0 , and choose the axis along B0 . The classical potential energy of the magnetic moment M = S of this atom is then: M B0 =

= where 0

0

M

0

=

0S

(B-15)

is the modulus of the magnetic field. Let us set:

=

(B-16)

0

It is easy to see that 0 has the dimensions of the inverse of a time, that is, of an angular velocity. Since we are quantizing only the internal degrees of freedom of the particle, S must be replaced by the operator , and the classical energy (B-15) becomes an operator: it is the Hamiltonian which describes the evolution of the spin of the atom in the field B0 : H=

(B-17)

0

Since this operator is time-independent, solving the corresponding Schrödinger equation amounts to solving the eigenvalue equation of . We see that the eigenvectors of are those of : + =+

~

=

~

0

2

+ (B-18)

0

2

There are therefore two energy levels, + = +~ 0 2 and = ~ 0 2 (Fig. 10). Their separation ~ 0 is proportional to the magnetic field; they define a single “Bohr frequency”: +

=

1

(

+

)=

0

2

(B-19)

409

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

E

E+

|+⟩

Figure 10: Energy levels of a spin 1/2, of gyromagnetic ratio , placed in a magnetic field 0 parallel to ; 0 is defined by 0 = 0.

ħω0 |–⟩

E–

Comment:

( ) If the field B0 is parallel to the unit vector u whose polar angles are , relation (B-17) must be replaced by: = where

and

(B-20)

0

= S u is the component of S along u.

( ) For the silver atom, is negative; 0 is therefore positive, according to (B-16). This explains the arrangement of the levels in Figure 10. B-3-b.

Larmor precession

Let us assume that, at time = 0, the spin is in the state: (0) = cos

2

2

e

+ + sin

2

2

e

(B-21)

(we showed in § B-1-c that any state could be put in this form). To calculate the state ( ) at an arbitrary instant t 0, we apply the rule (D-54) given in Chapter III. In expression (B-21), (0) is already expanded in terms of the eigenstates of the Hamiltonian, and we therefore obtain: ( ) = cos

2

e

2

e

or, using the values of ( ) = cos

2

e

( +

+

0

+~

+ + sin

and

:

) 2

+ + sin

2

2

e

e(

2

+

e

0

~

) 2

(B-22)

(B-23)

The presence of the magnetic field B0 therefore introduces a phase shift, proportional to the time, between the coefficients of the kets + and . 410

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

Comparing expression (B-23) for ( ) with that for the eigenket + of the observable S u [formula (A-22a)], we see that the direction u( ) along which the spin component is +~ 2 with certainty is defined by the polar angles: ()= ()=

+

(B-24) 0

The angle between u( ) and (the direction of the magnetic field B0 ) therefore remains constant, but u( ) revolves about at an angular velocity of 0 (proportional to the magnetic field). Thus, we find in quantum mechanics the phenomenon which we described for a classical magnetic moment in § A-1-b, and which bears the name of Larmor precession. From expression (B-17) for the Hamiltonian, it is obvious that the observable is a constant of the motion. It can be verified from (B-23) that the probabilities of obtaining +~ 2 or ~ 2 in a measurement of this observable are time-independent. Since the modulus of e ( + 0 ) 2 is equal to 1, these probabilities are equal, respectively, to cos2 2 and sin2 2. The mean value of is also time-independent: ~ cos (B-25) 2 On the other hand, and do not commute with [it is easy to show this by using the matrices representing and , which are given in (A-15), (A-16) and (A-17)]. Thus, formulas (B-14) here become: ()

() =

~ sin cos( + 0 ) 2 (B-26) ~ () () = sin sin( + 0 ) 2 In these expressions, we again find the single Bohr frequency 0 2 of the system. Moreover, the mean values of and behave like the components of a classical angular momentum of modulus ~ 2 undergoing Larmor precession. ()
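These results can be checked by evolving the spinor numerically. The sketch below is not part of the original text; ħ = 1 and the values of θ, φ and ω₀ are arbitrary assumptions.

```python
import numpy as np

# Minimal sketch (not from the text): Larmor precession of the mean values
# <Sx>(t), <Sy>(t), <Sz>(t) for a spin 1/2 in a field B0 parallel to Oz,
# obtained by evolving the spinor (B-21) under H = omega0 * Sz.

hbar = 1.0
Sz = hbar/2 * np.array([[1, 0], [0, -1]], dtype=complex)
Sx = hbar/2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar/2 * np.array([[0, -1j], [1j, 0]], dtype=complex)

theta, phi, omega0 = 0.7, 0.3, 2.0            # assumed values
psi0 = np.array([np.cos(theta/2)*np.exp(-1j*phi/2),
                 np.sin(theta/2)*np.exp(+1j*phi/2)])

for t in np.linspace(0.0, 3.0, 4):
    # each energy eigenstate picks up the phase e^{-i E t / hbar}, cf. (B-22)
    psi_t = np.array([np.exp(-1j*omega0*t/2), np.exp(+1j*omega0*t/2)]) * psi0
    sx = np.vdot(psi_t, Sx @ psi_t).real
    sy = np.vdot(psi_t, Sy @ psi_t).real
    sz = np.vdot(psi_t, Sz @ psi_t).real
    # compare with (B-25) and (B-26)
    print(f"t={t:4.2f}  <Sx>={sx:+.3f} ({0.5*np.sin(theta)*np.cos(phi+omega0*t):+.3f})"
          f"  <Sy>={sy:+.3f} ({0.5*np.sin(theta)*np.sin(phi+omega0*t):+.3f})"
          f"  <Sz>={sz:+.3f} ({0.5*np.cos(theta):+.3f})")
```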

C.

() =

General study of two-level systems

The simplicity of the calculations presented in § B derives from the fact that the state space has only two dimensions. There exist numerous other cases in physics which, to a first approximation, can be treated just as simply. Consider, for example, a physical system having two states whose energies are close together and very different from those of all other states of the system. Assume that we want to evaluate the effect of an external perturbation (or of internal interactions previously neglected) on these two states. When the intensity of the perturbation is sufficiently weak, it can be shown (cf. Chap. XI) that its effect on the two states can be calculated, to a first approximation, by ignoring all the other energy levels of the system. All the calculations can then be perfomed in a two-dimensional subspace of the state space. In this section, we shall study certain general properties of two-level systems (which are not necessarily spin 1/2 particles). Such a study is interesting because it enables us, using a mathematically simple model, to bring out some general and important physical ideas (quantum resonance, oscillation between two levels, etc...). 411

CHAPTER IV

C-1.

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

Outline of the problem

C-1-a.

Notation

Consider a physical system whose state space is two-dimensional (as we have already pointed out, this is usually an approximation: under certain conditions, we can confine ourselves to a two-dimensional subspace of the state space). For a basis, we choose the system of the two eigenstates 1 and 2 of the Hamiltonian H0 whose eigenvalues are, respectively, 1 and 2 : 0

1

=

1

1

0

2

=

2

2

(C-1)

This basis is orthonormal: =

;

=1 2

(C-2)

Assume that we want to take into account an external perturbation, or interactions internal to the system, initially neglected in H0 . The Hamiltonian becomes: =

+

0

(C-3)

The eigenstates and eigenvalues of +

=

+

will be denoted by

and

+

: (C-4)

=

H0 is often called the unperturbed Hamiltonian and , the perturbation or coupling. We shall assume here that is time-independent. In the basis of eigenstates 1 2 of H0 (called unperturbed states), is represented by a Hermitian matrix:

(

11

)=

and 12

22

=(

11

12

21

22

(C-5)

are real. Moreover:

21 )

(C-6)

In the absence of coupling, 1 and 2 are the possible energies of the system, and the states 1 and 2 are stationary states (if the system is placed in one of these two states, it remains there indefinitely). The problem consists of evaluating the modifications that appear when the coupling is introduced. C-1-b.

.

Consequences of the coupling 1

and

2

are no longer the possible energies of the system

A measurement of the energy of the system can yield only one of the two eigenvalues and of , which generally differ from 1 and 2 . The first problem is to calculate + and in terms of 1 2 and the matrix elements of . This amounts to studying the effect of the coupling on the energy levels. +

412

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

.

1

and

2

are no longer stationary states

Since 1 and 2 are not generally eigenstates of the total Hamiltonian , they are no longer stationary states. If, for example, the system at time = 0 is in the state 1 , there is a certain probability P12 ( ) of finding it in the state 2 at time : therefore induces transitions between the two unperturbed states. Hence the name “coupling” (between 1 and 2 ) given to . This dynamic aspect of the effect of constitutes the second problem with which we shall be concerned.

Comment: In Complement CIV , the two problems we have just cited are considered by introducing the concept of a fictitious spin. It can indeed be shown that the Hamiltonian to be diagonalized has the same form as that of a spin 1/2 placed in a static magnetic field B, whose components , and are simply expressed in terms of 1 , 2 and the matrix elements . In other words, with every two-level system (not necessarily a spin 1/2), can be associated a spin 1/2 (called a fictitious spin) placed in a static field B and described by a Hamiltonian whose form is identical to that of the initial two level system. All the results related to two-level systems which we are going to establish in this section can be interpreted in a simple geometric way in terms of magnetic moment, Larmor precession, and the various concepts introduced in §§ A and B of this chapter in connection with spin 1/2 particles. This geometrical interpretation is developed in Complement CIV . C-2.

Static aspect: effect of coupling on the stationary states of the system

C-2-a.

Expressions for the eigenstates and eigenvalues of

In the

( )=

1

1

+

basis, the matrix representing

2

11

is written:

12

21

2

+

(C-7) 22

The diagonalization of matrix (C-7) presents no problems (it is performed in detail in Complement BIV ). We find the eigenvalues: +

1 ( 2 1 = ( 2 =

1

+

11

+

2

+

22 )

1

+

11

+

2

+

22 )

(it can be verified that if

5 If

hand, if

2,

1 1

2,

+ +

approaches approaches

+

= 0,

1 2

and and

1 2 1 2 +

(

1

+

11

2

2 22 )

+4

12

(

1

+

11

2

2 22 )

+4

12

and

approaches approaches

are identical5 to

2

when

1

2

2

and

(C-8) 2 ).

The

approaches zero. On the other

1.

413

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

eigenvectors associated with +

= cos =

2

e

2

sin

where the angles

2

1

and tan

.

2

2

2

e

(C-9a)

2

(C-9b)

2

are defined by: 2 1

C-2-b.

are written:

e

2

+ cos

= 21

and

+ sin

1

e

2

+

=

+

21

12

11

with : 2

0

(C-10)

22

e

(C-11)

Discussion

Graphical representation of the effect of coupling

All the interesting effects which we shall discuss later arise from the fact that the perturbation possesses non-diagonal matrix elements 12 = 21 (if 12 = 0, the eigenstates of are the same as those of 0 , the new eigenvalues being simply 1 + 11 and 2 + 22 ). To simplify the discussion, we shall therefore assume from now on that the matrix ( ) is purely non-diagonal6 , that is, that 11 = 22 = 0. Formulas (C-8) and (C-10) then become: +

1 ( 2 1 = ( 2 =

tan

=

+

2)

1+

2)

1

2 1

+

12

+

1 2 1 2

(

2

1

2)

+4

1

2 2) + 4

12

2

(C-12) (

12

2

0

(C-13)

2

We now study the effect of the coupling on the energies + and in terms of the values of 1 and 2 . Assume that 12 is fixed and introduce the two parameters: = ∆=

1 ( 2 1 ( 2

1

+

2)

(C-14) 1

2)

We see from (C-12) that the variation of + and with respect to is extremely simple: changing reduces to shifting the origin along the energy axis. Moreover, it can be verified from (C-9), (C-10) and (C-11) that the vectors + and do not depend on . We are therefore concerned only with the influence of the parameter ∆. Let us show on the same graph, in terms of ∆, the four energies 1 2 + and . We obtain for 1 and 2 two straight lines of slope +1 and 1 (shown in dashed lines in Figure 11). Substituting (C-14) into (C-12), we find: +

=

+

∆2 +

12

2

(C-15)

6 If

11 and 22 are non-zero, we simply set: 1 = 1 + obtained in this section then remain valid if we replace 1 and

414

2

11 ,

2

by

1

= 2+ and 2 .

22 .

All the results

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

Energies

E+ W12 E1 Em

Δ E2

– W12 E–

Figure 11: Variation of the energies + and as a function of the energy difference ∆ = ( 1 2 ) 2. In the absence of coupling, the levels cross at the origin (dashed straight lines). Under the effect of the non-diagonal coupling , the two perturbed levels “repel each other” and we obtain an “anti-crossing”: the curves giving + and in terms of ∆ are branches of a hyperbola (solid lines in the figure) whose asymptotes are the unperturbed levels.

∆2 +

=

12

2

(C-16)

When ∆ varies, + and describe the two branches of a hyperbola which is symmetrical with respect to the coordinate axes and whose asymptotes are the two straight lines associated with the unperturbed levels; the minimum separation between the two branches is 2 12 (solid lines in Figure 11)7 . .

Effect of the coupling on the energy levels

In the absence of coupling, the energies 1 and 2 of the two levels “cross” at ∆ = 0. It is clear from Figure 11 that under the effect of coupling, the two levels “repel each other”– that is, the energy values move further away from each other. The diagram in solid lines in Figure 11 is often called, for this reason, an anti-crossing diagram. Also, we see that, for any ∆, we always have: +

1

2

This is a result that appears rather often in other domains of physics (for example, in electrical circuit theory): the coupling separates the normal frequencies. 7 It

is clear from Figure 11 why, when

+

1

2

if

1

2

+

2

1

if

1

2

0:

415

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

Near the asymptotes, that is, for ∆ 12 , formulas (C-15) and (C-16) can be written in the form of a limited power series expansion in ∆12 : =

+

+∆ 1+

2

1 2

12

(C-17)

2

1 ∆ 1+ 2

=

+

∆ 12



+

On the other hand, at the center of the hyperbola, for and (C-16) yield: =

+

+

2

=

1

(∆ = 0), formulas (C-15)

12

=

(C-18)

12

Therefore, the effect of the coupling is much more important when the two unperturbed levels have the same energy. The effect is then of first order, as can be seen from (C-18), while it is of second order when ∆ 12 [formulas (C-17)]. .

Effect of the coupling on the eigenstates When (C-14) is used, formula (C-13) becomes: tan

12

=

(C-19)



It follows that, when ∆ 2; on the other hand, when 12 (strong coupling), ∆ (weak coupling), 0 (assuming ∆ 0). 12 At the center of the hyperbola, when 2 = 1 (∆ = 0), we have: +

1 e 2

=

1 2

=

2

1

+e

2

1

+e

2

2

(C-20) 2

e

2

while near the asymptotes (that is, for ∆ +

=

e

=

e

2

2

1

2

+e e

12

2∆

2

+

1

+

), we have, to first order in

12

∆:

(C-21)

12

2∆

12

In other words, for a weak coupling ( 1 2 12 ), the perturbed states differ very slightly from the unperturbed states. We see from (C-21) that to within a global 2 phase factor e , + is equal to the state 1 slightly “contaminated” by a small contribution from the state 2 . On the other hand, for a strong coupling ( 1 2 are very different from 12 ), formulas (C-20) indicate that the states + and the states 1 and 2 , since they are linear superpositions of them with coefficients of the same modulus. Thus, like the energies, the eigenstates undergo significant modifications in the neighborhood of the point where the two unperturbed states cross. 416

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

C-2-c.

Important application: the phenomenon of quantum resonance

When 1 = 2 = , the corresponding energy of 0 is two-fold degenerate. As we have just seen, the coupling 12 lifts this degeneracy and, in particular, gives rise to a level whose energy is lowered by 12 . In other words, if the ground state of a physical system is two-fold degenerate (and sufficiently far from all the other levels), any (purely non-diagonal) coupling between the two corresponding states lowers the energy of the ground state of the system, which thus becomes more stable. As a first example of this phenomenon, we shall cite the resonance stabilization of the benzene molecule C6 H6 . Experiments show that the six carbon atoms are situated at the vertices of a regular hexagon, and we would expect the ground state to include three double bonds between neighboring carbon atoms. Figures 12-a and 12-b represent two possible dispositions of these bonds. The nuclei are assumed here to be fixed because of their high masses. Thus, the electronic states 1 and 2 , associated with Figures 12a and 12-b respectively, are different. If the structure of Figure 12-a were the only one possible, the ground state of the electronic system would have an energy of = is the Hamiltonian of the electrons in the potential created by 1 1 , where the nuclei. But the bonds can also be placed as shown in Figure 12-b. By symmetry, we have 2 2 = 1 1 , and we could conclude that the ground state of the molecule is doubly degenerate. However, the non-diagonal matrix element 2 1 of the Hamiltonian is not zero. This coupling between the states 1 and 2 gives rise to two distinct levels, one of which has an energy lower than . The benzene molecule is therefore more stable than we would have expected. Moreover, in its true ground state, the configuration of the molecule cannot be represented either by Figure 12-a or by Figure 12-b: this state is a linear superposition of 1 and 2 [the coefficients of this superposition having, as in (C-20), the same modulus]. This is what is symbolized by the double arrow of Figure 12, commonly used by chemists.

Figure 12: Two possible configurations of the double bonds in a benzene molecule.

Another example is that of the (ionized) molecule 2+ , composed of two protons p1 and p2 and one electron. The two protons, because of their large masses, can be considered to be fixed. Let us call the distance between them and 1 and 2 , the states where the electron is localized around p1 or around p2 , its wave function being that of the hydrogen atom it would form with p1 or p2 (Fig. 13). As above, the diagonal elements 1 1 and 2 2 of the Hamiltonian are equal because of symmetry; we shall denote them by ( ). The two states 1 and 2 are not, however, stationary states, since the matrix element 1 2 is not zero. Here again, we obtain an energy level lower than ( ) and, in the ground state, the wave function of the electron is a linear combination of those of Figures 13-a and 13-b. The electron is thus no longer localized about one of the two protons alone, and it is this delocalization which, by 417

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

a

b e–

e– p1

p2

p2

p1

Figure 13: In the 2+ ion, the electron could logically be localized either around the proton 1 (fig. a) or around the proton 2 (fig. b). In the ground state of the ion, the wave function of the electron is a linear superposition of the wave functions associated with figures a and b. Its probability of presence is symmetrical with respect to the plane bisecting 1 2 .

lowering its potential energy, is responsible for the chemical bond8 . C-3.

Dynamical aspect: oscillation of the system between the two unperturbed states

C-3-a.

Evolution of the state vector

Let: () =

1(

)

1

+

2(

)

(C-22)

2

be the state vector of the system at the instant . The evolution of of the coupling is given by the Schrödinger equation: ~

d d

() =(

0

+

) ()

(C-23)

Let us project this equation onto the basis vectors (C-5) (where we have set 11 = 22 = 0) and (C-22): d d d ~ d ~

1(

)=

1

1(

2(

)=

21

( ) in the presence

)+

12

2(

)

2

2(

)

1

and

2

. We obtain, using

(C-24) 1(

)+

If 12 = 0, these equations form a linear system of homogeneous coupled differential equations. The classical method of solving such a system reduces, in fact, to the application of rule (D-54) of Chapter III: look for the eigenvectors + (eigenvalue + ) and (eigenvalue ) of the operator = 0+ [whose matrix elements are the coefficients of equations (C-24)], and decompose (0) in terms of + and : (0) = 8A

418

+

+

more elaborate study of the ionized molecule

(C-25) + 2

will be presented in Complement GXI .

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

(where

and

() =

are fixed by the initial conditions). We then have: e

~

+

+

+

~

e

(C-26)

[which enables us to obtain 1 ( ) and 2 ( ) by projecting ( ) onto 1 and 2 ]. It can be shown that a system whose state vector is the vector ( ) given in (C-26) oscillates between the two unperturbed states 1 and 2 . To see this, we shall assume that the system at time = 0 is in the state 1 : (0) =

(C-27)

1

and calculate the probability P12 ( ) of finding it in the state C-3-b.

2

at time .

Calculation of P12 ( ): Rabi’s formula

As in (C-25), let us expand the ket basis. Inverting formulas (C-9), we obtain: (0) =

2

=e

1

cos

sin

+

2

(0) given in (C-27) on the

+

(C-28)

2

from which we deduce, using (C-26): 2

() =e

cos

2

e

+

~

sin

+

2

~

e

(C-29)

The probability amplitude of finding the system at time in the state written: 2

() =e

2

=e

2

cos sin

2

e

+

cos

[e

2

2

~

2

+

~

which enables us to calculate P12 ( ) = P12 ( ) =

1 sin2 2

= sin

2

1

cos

sin

+

2

2

()

2

]

. We thus find:

+

~

sin

2

is then

(C-30)

~

e

~

e

2

2

(C-31)

+

2~

or, using expressions (C-12) and (C-13): P12 ( ) =

4

12

4 12 2 2+( 1

2 2)

sin2

4

12

2

+(

1

2)

2

2~

(C-32)

Formula (C-32) is sometimes called Rabi’s formula. 419

CHAPTER IV

SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS

12(t)

sin2 θ

t

0 πħ/(E+ – E–)

Figure 14: Variation with respect to time of the probability P12 ( ) of finding the system in the state 2 when it was initially in the state 1 . When the states 1 and 2 have the same unperturbed energy, the probability P12 ( ) can attain the value 1.

C-3-c.

Discussion

Relation (C-31) shows that the probability P12 ( ) oscillates over time with a frequency of ( + ) , which is simply the unique Bohr frequency of the system. P12 ( ) varies between zero and a maximum value which, according to (C-31), is equal to sin2 . This maximum value is attained for all values of such that = (2k + 1) 2( + ), with k = 0 1 2 , ... (Fig. 14). The oscillation frequency ( + ) , as well as the maximum value sin2 of P12 ( ), are functions of 12 and 1 2 , whose main features we are now going to describe. When 1 = 2 , ( + ) is equal to 2 12 , and sin2 takes on its greatest possible value, that is, 1: at certain times, = (2 + 1) ~ 2 12 , the system (which started from the state 1 ) is in the state 2 . Therefore, any coupling between two states of equal energy causes the system to oscillate completely from one state to the other with a frequency proportional to the coupling9 . When 1 ) , while sin2 decreases. For a weak 2 increases, so does ( + 2 coupling ( 1 differs very little from 1 becomes 2 12 ), + 2 , and sin very small. This last result is not surprising since, in the case of a weak coupling, the state 1 is very close to the stationary state + [cf. formulas (C-21)]: the system, having started in the state 1 , evolves very little over time. C-3-d.

Example of oscillation between two states

Let us return to the example of the 2+ molecule. We shall assume that, at a certain time, the electron is localized about proton p1 : it is, for example, in the state shown in Figure 13-a. According to the results of the preceding section, we know that it will oscillate between the two protons with a frequency equal to the Bohr frequency associated 9 The same phenomenon is found in other domains of physics. Consider, for example, two identical pendulums (1) and (2), suspended from the same support and having the same frequency. Let us assume that at time = 0, pendulum (1) is set in motion. The coupling is ensured by their common support. We then know (cf. Complement HV ) that, after a certain time (which decreases if the coupling is increased), we arrive at a situation where only pendulum (2) oscillates, with the initial amplitude of pendulum (1). Then the motion is transferred back to pendulum (1), and so on.

420

C. GENERAL STUDY OF TWO-LEVEL SYSTEMS

with the two stationary states + and of the molecule. To this oscillation of the electron between the two states, represented in 13-a and 13-b, corresponds an oscillation of the mean value of the electric dipole moment of the molecule (the dipole moment is non-zero when the electron is localized about one of the two protons, and changes sign depending on whether the proton involved is p1 or p2 ). Thus we see concretely how, when the molecule is not in a stationary state, an oscillating electric dipole moment can appear. Such an oscillating dipole can exchange energy with an electromagnetic wave of the same frequency. Consequently, this frequency must appear in the absorption and emission spectrum of the 2+ ion. Other examples of oscillations between two states are discussed in Complements FIV , GIV and HIV . References and suggestions for further reading: The Stern-Gerlach experiment: original article (3.8); Cagnac and Pebay-Peyroula (11.2), Chap. X; Eisberg and Resnick (1.3), § 8-3; Bohm (5.1), §§ 22.5 and 22.6; Frisch (3.13). Two-level systems: Feynman III (1.2), Chaps. 6, 10 and 11; Valentin (16.1), Annexe XII; Allen and Eberly (15.8), particularly Chap. 3.

421

COMPLEMENTS OF CHAPTER IV, READER’S GUIDE

AIV : THE PAULI MATRICES BIV : DIAGONALISATION OF A 2 2 HERMITIAN MATRIX

Technical study of 2 2 matrices; simple, and important for solving numerous quantum mechanical problems.

CIV : FICTITIOUS SPIN 1/2 ASSOCIATED WITH A TWO-LEVEL SYSTEM

Establishes the close relation that exists between §§ B and C of Chapter IV; supplies a simple geometrical interpretation of the properties of two-level systems (easy, but not indispensable for what follows).

DIV : SYSTEM OF TWO SPIN 1/2 PARTICLES

Simple illustration of the tensor product and of the postulates of quantum mechanics (can be considered to be a set of worked exercises).

EIV : SPIN 1/2 DENSITY MATRIX

Illustration, in the case of spin 1 2 particles, of the concepts introduced in Complement EIII .

FIV : SPIN 1/2 PARTICLE IN A STATIC MAGNETIC FIELD AND A ROTATING FIELD: MAGNETIC RESONANCE

Study of a very important physical phenomenon with many applications: magnetic resonance. Can be studied later.

GIV : A SIMPLE MODEL OF THE AMMONIA MOLECULE

Example of a physical system whose study can be reduced, in a first approximation, to that of a two-level system; moderately difficult.

HIV : COUPLING BETWEEN A STABLE STATE AND AN UNSTABLE STATE

Study of the influence of coupling between two levels with different lifetimes; easy, but requires the concepts introduced in Complement KIII .

JIV : EXERCISES

423



THE PAULI MATRICES

Complement AIV The Pauli matrices

1 2 3

Definition; eigenvalues and eigenvectors . . . . . . . . . . . . 425 Simple properties . . . . . . . . . . . . . . . . . . . . . . . . . 426 A convenient basis of the 2 2 matrix space . . . . . . . . . 427

In § A-2 of Chapter IV, we introduced the matrices representing the three components , and of a spin in the + basis (eigenvectors of ). In quantum mechanics, it is often convenient to introduce the dimensionless operator σ, proportional to S, and given by: S=

~ σ 2

(1)

The matrices representing the three components of σ in the the “Pauli matrices”. 1.

+

basis are called

Definition; eigenvalues and eigenvectors

Let us go back to equations (A-15), (A-16) and (A-17) of Chapter IV. Using (1), we see that the definition of the Pauli matrices is: 0

1

0

=

1

= 1

0

0

= 0

(2) 0

1

These are Hermitian matrices, all three of which have the same characteristic equation: 2

1=0

The eigenvalues of =

1

(3) ,

and

are therefore: (4)

which is consistent with the fact that those of , and are ~ 2. It is easy to obtain, from definition (2), the eigenvectors of , and . They are the same, respectively, as those of , and , already introduced in § A-2 of Chapter IV: = =

(5)

= 425



COMPLEMENT AIV

with: 1 [+ 2 1 = [+ 2 =

2.

] ]

(6)

Simple properties

It is easy to see from definition (2) that the Pauli matrices verify the relations: Det( ) =

1

=

or

(7)

Tr( ) = 0 (

2

)=( =

2

(8)

)=(

2

)=

(where

is the 2

2 unit matrix)

=

as well as the equations that can be deduced from (10) by cyclic permutation of , . Equations (9) and (10) are sometimes condensed into the form: =

(9) (10)

+

and

(11)

where is antisymmetric with respect to the interchange of any two of its indices. It is equal to:

=

0

if

the indices

are not all different

1

if

is an even permutation of

1

if

is an odd permutation of

(12)

From (10), we immediately conclude: [

]=2

(13)

(and the relations obtained by cyclic permutation). This yields: [

]= ~

[

]= ~

[

]= ~

(14)

We shall see later (cf. Chap. VI) that equations (14) are characteristic of an angular momentum. We also see from (10) that: +

=0

(15)

(the matrices are said to anticommute with each other) and that, taking (9) into account: = 426

(16)



THE PAULI MATRICES

Finally, let us mention an identity which is sometimes useful in quantum mechanics. If A and B denote two vectors whose components are numbers (or operators which commute with all operators acting in the two-dimensional spin state space): (σ A)(σ B) = A B

+ σ (A

(17)

B)

This identity can be demonstrated as follows. Using formula (11) and the fact that A and B commute with σ, we can write: (σ A)(σ B) =

=

+

=

+

(18)

In the second term, we recognize the scalar product A B. In addition, it is easy to see from (12) that is the th component of the vector product A B. This proves (17). Note that if A and B do not commute, they must appear in the same order on both sides of the identity. 3.

A convenient basis of the 2

Consider an arbitrary 2

=

11

12

21

22

2 matrix space

2 matrix:

(19)

It can always be written as a linear combination of the four matrices: (20) since, using (2), we can immediately verify that: 11 + 22 11 22 12 + 21 = + + 2 2 2 Therefore, any 2 2 matrix can be put in the form: =

0

+a σ

+

12

21

2

(21)

(22)

where the coefficients 0 and are complex numbers. Comparing (21) and (22), we see that is Hermitian if and only if the coefficients 0 and a are real. These coefficients can be expressed formally in terms of the matrix in the following manner: 1 (23a) 0 = Tr 2 1 a = Tr σ (23b) 2 These formulas can easily be proven from (8), (9) and (10). 427



DIAGONALIZATION OF A 2

2 HERMITIAN MATRIX

Complement BIV Diagonalization of a 2

1 2 3

1.

2 Hermitian matrix

Introduction . . . . . . . . . . . . . . . . . . . . . Changing the eigenvalue origin . . . . . . . . . . Calculation of the eigenvalues and eigenvectors 3-a Angles and . . . . . . . . . . . . . . . . . . . 3-b Eigenvalues of . . . . . . . . . . . . . . . . . . 3-c Eigenvalues of . . . . . . . . . . . . . . . . . . 3-d Normalized eigenvectors of . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

429 429 430 430 431 431 432

Introduction

In quantum mechanics, one must often diagonalize 2 2 matrices. When we need only the eigenvalues, it is very easy to solve the characteristic equation since it is of second degree. In principle, the calculation of the normalized eigenvectors is also extremely simple; however, if it is performed clumsily, it can lead to expressions that are unnecessarily complicated and difficult to handle. The goal of this complement is to present a simple method of calculation which is applicable in all cases. After having changed the origin of the eigenvalues, we introduce the angles and , defined in terms of the matrix elements, which enable us to write the normalized eigenvectors in a simple easy-to-use form. The angles and also have an interesting physical interpretation in the study of two-level systems, as we shall see in Complement CIV . 2.

Changing the eigenvalue origin

Consider the Hermitian matrix:

( )=

11

and 12

=

22

11

12

21

22

(1)

are real. Moreover: (2)

21

The matrix ( ) therefore represents, in an orthonormal basis Hermitian operator1 .

1

2

, a certain

1 We use the letter because the Hermitian operator which we are trying to diagonalize is often a Hamiltonian. Nevertheless, the calculations presented in this complement can obviously be applied to any 2 2 Hermitian matrix.

429



COMPLEMENT BIV

Using the half-sum and half-difference of the diagonal elements can write ( ) in the following way: 1 2(

( )=

11

+

22 )

11

+

1 2(

22 )

11

12 1 2(

21

It follows that the operator 1 ( 2

where

11

+

22 )

1 + ( 2

2

21

11

(4) is the Hermitian operator represented in the

12

11

2

11

(3) 22 )

22 )

11

1 ( )=

we

itself can be decomposed into:

is the identity operator and basis by the matrix: 2

1

22 ,

22 )

+

=

and

0 1 2(

0

11

22

(5)

1 22

It is clear from (4) that and have the same eigenvectors. Let eigenvectors, and and , the corresponding eigenvalues for and :

be these

=

(6)

=

(7)

From (4), we immediately conclude that: 1 ( 2

=

11

+

22 )

1 + ( 2

22 )

11

(8)

Actually, the first matrix appearing on the right-hand side of (3) plays a minor role: we could make it disappear by choosing ( 11 + 22 ) 2 for the eigenvalue origin2 . 3.

Calculation of the eigenvalues and eigenvectors

3-a.

Angles

Let tan

and

=

2 11

2 Furthermore, 11

430

+

22

and

be the angles defined in terms of the matrix elements 21

with

0

by: (9)

22

this new origin is the same, whatever the basis = Tr( ) is invariant under a change of orthonormal bases.

1

2

initially chosen, since



21

=

21

e

with

0

=

12

2 HERMITIAN MATRIX

2

(10)

is the argument of the complex number and: 12

DIAGONALIZATION OF A 2

21 .

According to (2), we have

e

12

=

21

(11)

If we use (9), (10) and (11), the matrix ( ) becomes: 1

tan

e

( )=

(12) tan

3-b.

e

1

Eigenvalues of

The characteristic equation of the matrix (12): Det[( )

]=

2

1

tan2 = 0

directly yields the eigenvalues +

=+ =

+

(13)

and

of ( ):

1 cos 1 cos

(14a) (14b)

We see that they are indeed real (property of a Hermitian matrix, cf. § D-2-a of Chapter II). If we want to express 1 cos in terms of , all we need to do is use (9) and notice that cos and tan have the same sign since 0 : 1 cos

=

(

2 22 )

11 11

3-c.

+4

12

2

(15)

22

Eigenvalues of

Using (8), (14) and (15), we immediately obtain: +

1 ( 2 1 = ( 2

=

11

+

22 )

11

+

22 )

+

1 2 1 2

(

11

22 )

2

+4

12

(

11

2 22 )

+4

12

2

(16a)

2

(16b)

Comments:

( ) As we have already said, the eigenvalues (16) can easily be obtained from the characteristic equation of the matrix ( ). If we need only the eigenvalues of ( ), it is therefore not necessary to introduce the angles and as we have done here. On the other hand, we shall see in the following section that this method is very useful when we need to use the normalized eigenvectors of . 431



COMPLEMENT BIV

( ) It can be verified immediately from formulas (16) that: +

+

=

11

=

+

+

11

22

= Tr( )

(17)

2

(18)

22

2

= Det( )

2 2 ) To have + = , we must have ( 11 = 0; that is, 22 ) + 4 12 = and = = 0. A 2 2 Hermitian matrix with a degenerate 11 22 12 21 spectrum is therefore necessarily proportional to the unit matrix.

(

3-d.

Normalized eigenvectors of

Let and be the components of and (14a), they must satisfy: 1

tan

e

e

on

1

and

1 cos

= tan

+

1

2

. According to (7), (12)

(19)

which yields: 1 cos

1

+ tan

e

=0

(20)

that is: sin

2

2

e

+ cos

The normalized eigenvector +

= cos

2

2

e

1

2

2

e +

+ sin

=0

(21)

can therefore be written: 2

2

e

(22)

2

An analogous calculation would yield: =

sin

2

e

It can be verified that

2

1

+

+ cos and

2

e

2

2

(23)

are orthogonal.

Comment:

While the trigonometric functions of the angle can be expressed rather simply in terms of the matrix elements [see, for example, formulas (9) and (15)], the same is not true of those of the angle 2. Consequently, formulas (22) and (23) for the normalized eigenvectors + and become complicated when cos 2 and sin 2 are replaced by their expressions in terms of ; they are no longer very convenient. It is better to use expressions (22) and (23) directly, keeping the functions cos 2 and sin 2 during the entire calculation involving the normalized eigenvectors of . Furthermore, the final result of the calculation often 432



DIAGONALIZATION OF A 2

2 HERMITIAN MATRIX

involves only functions of the angle (see, for example, the calculation of § C-3-b of Chapter IV) and, consequently, can be expressed simply in terms of the . Expressions (22) and (23) thus enable us to carry out the intermediate calculations elegantly, avoiding unnecessarily complicated expressions. This is the advantage of the method presented in this complement. Another advantage concerns the physical interpretation and will be discussed in the next complement.

433



FICTITIOUS SPIN 1/2 ASSOCIATED WITH A TWO-LEVEL SYSTEM

Complement CIV Fictitious spin 1/2 associated with a two-level system

1 2 3

1.

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435 Interpretation of the Hamiltonian in terms of fictitious spin 435 Geometrical interpretation . . . . . . . . . . . . . . . . . . . . 437 3-a Fictitious magnetic fields associated with 0 , and . . . 437 3-b Effect of coupling on the eigenvalues and eigenvectors of the Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438 3-c Geometrical interpretation of P12 ( ) . . . . . . . . . . . . . . 439

Introduction

Consider a two-level system whose Hamiltonian is represented, in an orthonormal 1 basis 1 2 , by the Hermitian matrix ( ) [formula (1) of Complement BIV ] . If we choose ( 11 + 22 ) 2 as the new energy origin, the matrix ( ) becomes:

( )=

1 2(

22 )

11

12 1 2(

21

(1) 22 )

11

Although the two-level system under consideration is not necessarily a spin 1/2, we can always associate with it a spin 1/2 whose Hamiltonian is represented by the same matrix ( ) in the + basis of eigenstates of the component of this spin. We shall see that ( ) can then be interpreted as describing the interaction of this “fictitious spin” with a static magnetic field B, whose direction and modulus are very simply related to the parameters introduced in the preceding complement in the discussion of the diagonalization of ( ). Thus it is possible to give a simple physical meaning to these parameters. Moreover, if the Hamiltonian is the sum = 0 + of two operators, we shall see that we can associate with , 0 and three magnetic fields, B, B0 and b, such that B = B0 + b. Introducing the coupling is equivalent, in terms of fictitious spin, to adding the field b to B0 . We shall show that this point of view enables us to interpret very simply the different effects studied in § C of Chapter IV. 2.

Interpretation of the Hamiltonian in terms of fictitious spin

We saw in Chapter IV that the Hamiltonian a magnetic field B, of components , , = 1 We

B.S =

(

+

+

of the coupling between a spin 1/2 and can be written:

)

(2)

are using the same notation as in Complement BIV and Chapter IV.

435



COMPLEMENT CIV

To calculate the matrix associated with this operator, we substitute into this relation the matrices associated with [Chap. IV, relations (A-15), (A-16), (A-17)]. This immediately yields:

~ 2

( )=

(3) +

Therefore, to make matrix (1) identical to ( ), we must simply choose a “fictitious field” B defined by: 2 Re( ~

=

2 Im( ~ 1 = ( 22 ~ =

12 ) 12 )

Note that the modulus to: =

2 ~

(4) 11 )

of the projection B

of B onto the

12

(5)

According to formulas (9) and (10) of Complement BIV , the angles with the matrix ( ) = ( ) written in (3) are given by: tg = (

plane is then equal

+

)=

0 e

0

and

associated

(6) 2

The gyromagnetic ratio is a simple calculation tool and can have an arbitrary value. If we agree to choose negative, relations (6) show that the angles and associated with the matrix ( ) are simply the polar angles of the direction of the field B (if we had chosen positive, they would be those of the opposite direction). Finally, we see that we can forget the two-level system with which we started and consider the matrix ( ) as representing, in the basis of the eigenstates + and of , the Hamiltonian of a spin 1/2 placed in a field B whose components are given by (4). can also be written: = where is the operator S u which describes the spin component along the direction u, whose polar angles are and , and is the Larmor angular velocity: =

B

(7)

The following table summarizes the various correspondences between the two-level system and the associated fictitious spin 1 2. 436



FICTITIOUS SPIN 1/2 ASSOCIATED WITH A TWO-LEVEL SYSTEM

Two-level system

Fictitious spin 1/2 +

1

2

+

+

~

+

Angles

and

introduced in BIV 11

Polar angles of the fictitious field B ~

22

~

21

3.

2

Geometrical interpretation of the various effects discussed in § C of Chapter IV

3-a.

Fictitious magnetic fields associated with

Assume, as in § C of Chapter IV, that =

0

0,

and

appears as the sum of two terms:

+

(8)

In the basis, the unperturbed Hamiltonian 0 is represented by a diagonal 1 , 2 matrix which, with a suitable choice of the energy origin, is written: 1 0

2

0

2

=

(9) 1

0

2

As far as the coupling purely non-diagonal: 0 (

is concerned, we assume, as in § C of Chapter IV, that it is

12

)=

The discussion of the preceding section then enables us to associate with ( ) two fields B0 and b such that [cf. formulas (4) and (5)]: 0 0

=

(10)

0

12

(

2

2

0)

and

1

~

(11)

=0 437

COMPLEMENT CIV



z

u b

Figure 1: Relative disposition of the fictitious fields. B0 is associated with 0 , b with , and B = B0 + b with the total Hamiltonian = 0+ .

B0 B θ

O

=0 =

2 ~

12

(12)

B0 is therefore parallel to and proportional to ( 1 2 ) 2; b is perpendicular to and proportional to ), the field B associated with the 12 . Since ( ) = ( 0 ) + ( total Hamiltonian is the vector sum of B0 and b: B = B0 + b

(13)

The three fields B0 , b and B are shown in Figure 1; the angle introduced in § C-2-a of Chapter IV is the angle between B0 and B, since B0 is parallel to . The strong coupling condition introduced in § C-2 of Chap. IV ( 12 1 2 ) is equivalent to b B0 (Fig. 2-a). The weak coupling condition ( 12 1 2 ) is equivalent to b B0 (Fig. 2-b). 3-b.

Effect of coupling on the eigenvalues and eigenvectors of the Hamiltonian

correspond respectively to the Larmor angular velocities 1 2 and + = B0 and = B in the fields B0 and B. We see in Figure 1 that B0 , b and B form a right triangle whose hypotenuse is B; we therefore always have B B0 , which again shows that + is always greater than 1 2 . For a weak coupling (Fig. 2-b), the difference between B and B0 is very small in relative value, being of second order in b B0 . From this we deduce immediately that and 1 + 2 differ in relative value by terms of second order in 12 ( 1 2 ). On the other hand, for a strong coupling (Fig. 2-a), B is much larger than B0 and practically equal to b ; + is then much larger than 1 2 and practically proportional to 12 . We thus find again all the results of § C-2 of Chapter IV. As far as the effect of the coupling on the eigenvectors is concerned, it can also be understood very simply from Figures 1 and 2. The eigenvectors of and 0 are 0

438



FICTITIOUS SPIN 1/2 ASSOCIATED WITH A TWO-LEVEL SYSTEM

u

z

z

b

B0 θ B B0

u

b θ B O

O a

b

Figure 2: Relative disposition of the fictitious fields B0 , b and B in the case of strong coupling (fig. a) and weak coupling (fig. b).

associated respectively with the eigenvectors of the components of S on the and axes. These two axes are practically parallel in the case of weak coupling (Fig. 2-b) and perpendicular in the case of strong coupling (Fig. 2-a). The eigenvectors of and , and, consequently, those of and 0 , are very close in the first case and very different in the second one. 3-c.

Geometrical interpretation of P12 ( )

In terms of fictitious spin, the problem considered in § C-3 of Chapter IV can be put in the following way: at time = 0, the fictitious spin associated with the two-level system is in the eigenstate + of ; b is added to B0 ; what is the probability P+ ( ) of finding the spin in the state at time ? With the correspondences summarized in the table, P12 ( ) must be identical to P+ ( ). The calculation of P+ ( ) is then very simple since the time evolution of the spin reduces to a Larmor precession about B (Fig. 3). During this precession, the angle between the spin and the direction of B remains constant. At time , the spin points in the direction On, making an angle with ; the angle formed by the (Oz, Ou) and (Ou, On) planes is equal to . A classical formula of spherical trigonometry enables us to write:

cos

= cos2 + sin2 cos

(14)

Now, when the spin points in a direction that makes an angle of with , the probability of finding it in the state of is equal (cf. § B-2-b of Chapter IV) to sin2 2 = (1 cos ) 2. From this we deduce, using (14), that: P+ ( ) = sin2

2

=

1 sin2 2

(1

cos

)

(15) 439



COMPLEMENT CIV

z

u

b

B0

B n

ωt

α

θ

Figure 3: Geometrical interpretation of Rabi’s formula in terms of fictitious spin. Under the effect of the coupling (represented by b), the spin, initially oriented along , precesses about B; consequently, the probability of finding ~ 2 in a measurement of its component on is an oscillating function of time.

θ

O

This result is identical, when we replace by ( + ) ~, to formula (C-31) of Chapter IV (Rabi’s formula). We have thus given this formula a purely geometrical interpretation. References and suggestions for further reading: Abragam (14.1), Chap. II, § F; Sargent et al. (15.5), § 7-5; Allen (15.7), Chap. 2; see also the article by Feynman et al. (1.33).

440



SYSTEM OF TWO SPIN 1/2 PARTICLES

Complement DIV System of two spin 1/2 particles

1

Quantum mechanical description . . . . . . . . . . . . . . . . 441 1-a

State space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

1-b

Complete sets of commuting observables . . . . . . . . . . . . 442

1-c

The most general state . . . . . . . . . . . . . . . . . . . . . . 443

2

Prediction of the measurement results . . . . . . . . . . . . . 444 2-a

Measurements bearing simultaneously on the two spins

2-b

Measurements bearing on one spin alone . . . . . . . . . . . . 445

. . . 444

In this complement, we intend to use the formalism introduced in § A-2 of Chapter IV to describe a system of two spin 1/2 particles. This case is hardly more complicated than that of a single spin 1/2 particle. Its interest, as far as the postulates are concerned, lies in the fact that none of the various spin observables alone constitutes a C.S.C.O. (while this is the case for one spin alone). Thus, we shall be able to consider measurements bearing either on one observable with a degenerate spectrum or simultaneously on two observables. In addition, this study provides a very simple illustration of the concept of a tensor product, introduced in § F of Chapter II. We shall be concerned, as in Chapter IV, only with the internal degrees of freedom (spin states), and we shall moreover assume that the two particles which constitute the system are not identical (systems of identical particles will be studied in a general way in Chapters XIV and XV). 1.

Quantum mechanical description

We saw in Chapter IV how to describe quantum mechanically the spin state of a spin 1/2 particle. Thus, all we need to do is apply the results of § F of Chapter II in order to know how to describe systems of two spin 1/2 particles. 1-a.

State space

We shall use the indices 1 and 2 to distinguish between the two particles. When particle (1) is alone, its spin state is defined by a ket which belongs to a two-dimensional state space (1). Similarly, the spin states of particle (2) alone form a two-dimensional space (2). We shall designate by S1 and S2 the spin observables of particles (1) and (2) respectively. In (1) [or (2)], we choose as a basis the eigenkets of 1 (or 2 ), which we shall denote by 1 : + and 1 : (or 2 : + and 2 : ). The most general ket of (1) can be written: (1) =

1

1:+ +

1

1:

(1) 441

COMPLEMENT DIV

and that of

(2):

(2) = (

1,



2:+ +

2

2

2:

(2)

1,

2 , 2 are arbitrary complex numbers). When we join the two particles to make a single system, the state space a system is the tensor product of the two preceding spaces:

=

(1)

(2)

of such

(3)

In the first place, this means that a basis of the two bases defined above for (1) and

can be obtained by multiplying tensorially (2). We shall use the following notation:

++ = 1:+ 2:+ +

= 1:+ 2: + = 1:

2:+

= 1:

2:

(4)

In the state + , for example, the component along of the spin of particle (1) is +~ 2, with absolute certainty; that of the spin of particle (2) is ~ 2, with absolute certainty. We shall agree here to denote by + the conjugate bra of the ket + ; the order of the symbols is therefore the same in the ket and in the bra: the first symbol is always associated with particle (1) and the second, with particle (2). The space is therefore four-dimensional. Since the 1 : and 2 : bases are orthonormal in (1) and (2) respectively, the basis (4) is orthonormal in : 1 2

1 2

=

1 1

(5)

2 2

( 1 2 1 2 are to be replaced by + or depending on the case; is equal to 1 if and are identical and 0 if they are different). The system of vectors (4) also satisfies a closure relation in : 1 2

1 2

= ++ ++

+

+

+

+

+

+

1 2

1-b.

+

=

(6)

Complete sets of commuting observables

We extend into the observables S1 and S2 which were originally defined in (1) and (2) (as in Chapter II, we shall continue to denote these extensions by S1 and S2 ). Their action on the kets of the basis (4) is simple: the components of S1 , for example, act only on the part of the ket related to particle (1). In particular, the vectors of the basis (4) are simultaneous eigenvectors of 1 and 2 :

442

1

1 2

2

1 2

~ 2 ~ = 2 =

1

1 2

2

1 2

(7)



SYSTEM OF TWO SPIN 1/2 PARTICLES

For the other components of S1 and S2 , we apply the formulas given in § A-2 of Chapter IV. For example, we know from relation (A-16) of Chapter IV how 1 acts on the kets 1 : : ~ 1: 2 ~ = 1:+ 2

1

1:+ =

1

1:

From this we deduce the action of

(8)

1

on the kets (4):

~ + 2 ~ + = 2 ~ + = ++ 2 ~ = + 2 ++ =

1

1

1

1

(9)

It is then easy to verify that, although the three components of S1 (or of S2 ) do not commute with each other, any component of S1 commutes with any component of S2 . In (1), the observable 1 alone constituted a C.S.C.O., and the same was true of 2 in (2). In , the eigenvalues of 1 and 2 remain ~ 2, but each of them is two-fold degenerate. To the eigenvalue +~ 2 of 1 , for example, correspond two orthogonal vectors, + + and + [formulas (7)] and all their linear combinations. Therefore, in , neither 1 nor 2 (taken separately) constitutes a C.S.C.O. On the other hand, the set 1 is a C.S.C.O. in , as can be seen from formulas (7). 2 This is obviously not the only C.S.C.O. that can be constructed. For example, another one is 1 2 . These two observables commute, as we noted above, and each of them constitutes a C.S.C.O. in the space in which it was initially defined. The eigenvectors which are common to 1 and 2 are obtained by taking the tensor product of their respective eigenvectors in (1) and (2). Using relation (A-20) of Chapter IV, we find: 1:+ 2:+ 1:+ 2: 1:

2:+

1:

2:

1-c.

of

1 [ ++ + + 2 1 = [ ++ + 2 1 = [ + + 2 1 = [ + 2

=

] ] (10) ] ]

The most general state

The vectors (4) were obtained by multiplying tensorially a ket of (1) and a ket (2). More generally, using an arbitrary ket of (1) [such as (1)] and an arbitrary 443



COMPLEMENT DIV

ket of

(2) [such as (2)], we can construct a ket of

(1)

(2) =

1 2

++ +

1 2

+

+

2 1

: + +

(11)

1 2

The components of such a ket in the basis (4) are the products of the components of (1) and (2) in the bases of (1) and (2) which were used to construct (4). But all the kets of are not tensor products. The most general ket of is an arbitrary linear combination of the basis vectors: =

++ +

+

+

If we want to normalize 2

2

+

+

2

+

+ +

(12)

, we must choose: 2

=1

(13)

Given , it is not in general possible to find two kets (1) and (2) of which it is the tensor product. For (12) to be of the form (11), we must have, in particular: =

(14)

and this condition is not necessarily fulfilled. 2.

Prediction of the measurement results

We are now going to envisage a certain number of measurements that can be performed on a system of two spin 1/2 particles and we shall calculate the predictions provided by the postulates for each of them. We shall assume each time that the state of the system immediately before the measurement is described by the normalized ket (12). 2-a.

Measurements bearing simultaneously on the two spins

Since any component of S1 commutes with any component of S2 , we can envisage measuring them simultaneously (Chap. III, § C-6-a). To calculate the predictions related to such measurements, all we need to do is use the eigenvectors common to the two observables. .

First example

Imagine we are simultaneously measuring 1 and 2 . What are the probabilities of the various results that can be obtained? Since the set 1 2 is a C.S.C.O., there exists only one state associated with each measurement result. If the system is in the state (12) before the measurement, we can therefore find: ~ for 2 ~ + 2 ~ 2 ~ 2

+

444

1

~ for 2 ~ 2 ~ + 2 ~ 2

and +

2

with the probability

++

2

+

2

=

2

2

=

2

2

=

2

+

2

=

(15)

• .

SYSTEM OF TWO SPIN 1/2 PARTICLES

Second example

We now measure 1 and 2 . What is the probability of obtaining +~ 2 for each of the two observables? Here again, 1 constitutes a C.S.C.O. The eigenvector common to 1 and 2 that corresponds to the eigenvalues +~ 2 and +~ 2 is the tensor product of the 2 vector 1 : + and the vector 2 : + : 1:+

1 [ ++ + 2

2:+ =

+]

(16)

Applying the fourth postulate of Chapter III, we find that the probability we are looking for is:

=

2

1 [ ++ 2

P= 1 2

+ ]

2

(17)

The result therefore appears in the form of a “square of a sum”1 . After the measurement, if we have actually found +~ 2 for the system is in the state (16). 2-b.

1

and +~ 2 for

2

,

Measurements bearing on one spin alone

It is obviously possible to measure only one component of one of the two spins. In this case, since none of these components constitutes by itself a C.S.C.O., there exist several eigenvectors corresponding to the same measurement result, and the corresponding probability will be a “sum of squares”. .

First example

We measure only 1 . What results can be found, and with what probabilities? The possible results are the eigenvalues ~ 2 of 1 . Each of them is doubly degenerate. In the associated eigensubspace, we choose an orthonormal basis: we can, for example, take + + + for +~ 2 and + for ~ 2. We then obtain: P +

~ 2

= P

~ 2

2

= ++ 2

= =

+

+

2

+

2

2

(18) 2

+ 2

+ +

2

1 It must be remembered that the sign of changes when we go from (16) to the conjugate bra. If 2 since this were to be forgotten, the result obtained would be incorrect ( + 2 = is not in general real).

445



COMPLEMENT DIV

Comment:

Since we are not performing any measurement on the spin (2), the choice of the basis in (2) is arbitrary. We can, for example, choose as a basis of the eigensubspace of 1 associated with the eigenvalue +~ 2 the vectors: 1:+ 2:

=

1 [ ++ 2

+

]

(19)

which again yields: P +

~ 2

1 2

=

2

=

2

+

+

1 2

2

2

+

(20)

The general proof of the fact that the probability obtained is independent (in the case of a degenerate eigenvalue) of the choice of the basis in the corresponding eigensubspace was given in § B-3-b- of Chapter III. .

Second example

We now choose to measure 2 . What is the probability of obtaining ~ 2? The eigensubspace associated with the eigenvalue ~ 2 of 2 is two-dimensional. We can choose as a basis in that subspace: 1 [ ++ 2 1 = [ + 2

1:+ 2: 1:

=

2:

+

] (21) ]

We then find:

=

2

1 [ ++ 2

P= 1 2

2

+

+

]

1 2

+

1 [ 2

2

+

] (22)

2

In this result, each of the terms of the “sum of squares” is itself the “square of a sum”. If the measurement actually yields ~ 2, the state of the system immediately after this measurement is the (normalized) projection of onto the corresponding eigensubspace. We have just calculated the components of on the basis vectors (21) of this subspace: they are equal, respectively, to 12 ( ) and 12 ( ). Consequently: 1

= 1 2

2

+

1 2

2

1 ( 2 446

)( + +

+

1 ) + ( 2

)(

+

)

(23)



SYSTEM OF TWO SPIN 1/2 PARTICLES

Comment:

We have considered, in this complement, only the components of S1 and S2 on the coordinate axes. It is obviously possible to measure their components S1 u and S2 v on arbitrary unit vectors u and v. The reasoning is the same as above.

447



SPIN 1 2 DENSITY MATRIX

Complement EIV Spin 1 2 density matrix

1 2 3 4 5

1.

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . Density matrix of a perfectly polarized spin (pure case) . Example of a statistical mixture: unpolarized spin . . . . Spin 1/2 at thermodynamic equilibrium in a static field . Expansion of the density matrix in terms of the Pauli matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . .

449 449 450 452

. 453

Introduction

The aim of this complement is to illustrate the general considerations developed in Complement EIII , using a very simple physical system, that of a spin 1/2. We are going to study the density matrices which describe a spin 1/2 in a certain number of cases: perfectly polarized spin (pure case), unpolarized or partially polarized spin (statistical mixture). This will lead us to verify and interpret the general properties stated in Complement EIII . In addition, we shall see that the expansion of the density matrix in terms of the Pauli matrices can be expressed very simply as a function of the mean values of the various spin components. 2.

Density matrix of a perfectly polarized spin (pure case)

Consider a spin 1/2, coming out of an “atomic polarizer” of the type described in § B of Chapter IV. We assume that it is in the eigenstate + u (eigenvalue +~ 2) of the S u component of the spin (recall that the polar angles of the unit vector u are and ). The spin state is then perfectly well-known and is written [cf. formula (A-22a) of Chapter IV]: = cos

2

2

e

+ + sin

2

e

2

(1)

We saw in Complement EIII that, by definition, such a situation corresponds to a pure case. We shall say that the beam which leaves the “polarizer” is perfectly polarized. Recall also that, for each spin, the mean value S is equal to ~2 u [Chap. IV, relations (B14)]. It is simple to write, in the + , basis, the density matrix ( ) corresponding to the state (1). We write the matrix of the projector onto this state: cos2 (

)= sin

2

cos

sin

2 2

e

2

cos sin

2

e (2)

2

2 449



COMPLEMENT EIV

This matrix is generally non-diagonal. The “populations” ++ and have a very simple physical significance. Their difference is equal to cos = 2 ~ [cf. equations (B-14) of Chapter IV], and their sum is, of course, equal to 1. The populations are therefore related to the longitudinal polarization . Similarly, the modulus of the 1 “coherences” + and is = = sin = ~1 S (where S is the + + + 2 projection of S onto the plane). The argument of is , which is the angle + between S and : the coherences are therefore related to the transverse polarization S . It can also be verified that: )]2 = (

[ (

)

(3)

a relation characteristic of a pure state. 3.

Example of a statistical mixture: unpolarized spin

Now let us consider the spin of a silver atom leaving a furnace, such as the one in Figure 1 of Chapter IV, and which has not passed through an “atomic polarizer” (the spin has not been prepared in a particular state). The only information we then possess about this spin is the following: it can point in any direction of space, and all directions are equally probable. With the notation of Complement EIII , such a situation corresponds to a statistical mixture of the states + with equal weights. Formula (28) of Complement EIII defines the density matrix that corresponds to this case. Nevertheless, the discrete sum Σ must here be replaced by an integral over all possible directions: =

1 4

dΩ (

)=

1 4

2

d 0

sin

d

(

)

(4)

0

(the factor 1 4 insures the normalization of the probabilities associated with the various directions). The integrals which give the matrix elements of are simple to calculate and lead to the following result: 1 2

0

=

(5) 0

1 2

It is easy to deduce from (5) that 2 = 2, which shows that, in the case of a statistical mixture of states, 2 is different from . In addition, if we calculate from (5) the mean values of , , , we obtain: = Tr

=

1 Tr 2

=0

with:

=

(6)

We again find the fact that the spin is unpolarized: since all the directions are equivalent, the mean value of the spin is zero.

450



SPIN 1 2 DENSITY MATRIX

Comments:

( ) It is clear from this example how the non-diagonal elements (coherences) of can disappear from the summation over the various states of the statistical mixture. As we saw in § 2, the coherences + and + are related to the transverse polarization S of the spin. Upon summing the vectors S corresponding to all (equiprobable) directions of the plane, we obviously find a null result. ( ) The case of an unpolarized spin is also very instructive, since it helps us understand the impossibility of describing a statistical mixture by an “average state vector”. Assume that we are trying to choose and so that the vector: =

+ +

(7)

with: 2

+

2

=1

(8)

represents an unpolarized spin, for which simple calculation gives: ~ ( 2 ~ = ( 2 ~ = ( 2 =

+

,

and

are zero. A

) )

(9)

)

If we want to make zero, we must choose and so as to make a pure imaginary; similarly, must be real for to be zero. We must therefore have = 0, that is: either = 0, which implies = 1 and = ~ 2 or = 0, which implies = 1 and =~ 2 Therefore, , and cannot all be zero at the same time; consequently, the state (7) cannot represent an unpolarized spin. Furthermore, the discussion of § B-1-c of Chapter IV shows that for any and that satisfy (8), one can always associate with them two angles and fixing a direction u such that is an eigenvector of S u with the eigenvalue +~ 2. Thus we see directly that a state such as (7) always describes a spin which is perfectly polarized in a certain direction of space. (

) The density matrix (5) represents a statistical mixture of the various states + u , all the directions u being equiprobable (this is how we obtained it). We could, however, imagine other statistical mixtures which would lead to the same density matrix: for example, a statistical mixture of equal proportions of the states + and , or a statistical mixture of equal proportions of three states + u , such that the tips of the three corresponding vectors u are the vertices of an equilateral triangle centered at . Thus we see that the same 451



COMPLEMENT EIV

density matrix can be obtained in several different ways. In fact, since all the physical predictions depend only on the density matrix, it is impossible to distinguish physically between the various types of statistical mixtures that lead to the same density matrix. They must be considered to be different expressions of the same incomplete information we possess about the system. 4.

Spin 1/2 at thermodynamic equilibrium in a static field

Consider a spin 1/2 placed in a static field B0 parallel to . We saw in § B-3-a of Chapter IV that the stationary states of this spin are the states + and , of energies +~ 0 2 and ~ 0 2 (with 0 = , where is the gyromagnetic ratio of the spin). 0 If we know only that the system is in thermodynamic equilibrium at the temperature , 1 1 we can assert that it has a probability e ~ 0 2 of being in the state + and +~ 0 2 ~ 0 2 +~ 0 2 e of being in the state , where = e +e is a normalization factor ( is called the “partition function”). We have here another example of a statistical mixture, described by the density matrix:

=

~

e

1

2

0

0 (10)

0

e

+~

2

0

Once more, it is easy to verify that 2 = . The non-diagonal elements are zero since all directions perpendicular to B0 (that is, to ) and fixed by the angle are equivalent. From (10), it is easy to calculate: = Tr(

)=0

= Tr(

)=0

= Tr(

)=

(11) ~ tanh 2

~ 2

0

We see that the spin acquires a polarization parallel to the field in which it is placed. The larger 0 (that is, 0 ) and the lower the temperature , the greater the polarization. Since tanh 1, this polarization is less than the value ~ 2 corresponding to a spin that is perfectly polarized along . The density matrix (10) can therefore be said to describe a spin which is “partially polarized” along .

Comment:

The magnetization is equal to . It is possible to calculate from (11) the paramagnetic susceptibility of the spin, defined by: =

=

(12)

0

We find (Brillouin’s formula): = 452

~ 2

tanh 0

~ 2

0

(13)

• 5.

SPIN 1 2 DENSITY MATRIX

Expansion of the density matrix in terms of the Pauli matrices

We saw in Complement AIV that the unit matrix and the Pauli matrices , and form a convenient basis for expanding a 2 2 matrix. We therefore set, for the density matrix of a spin 1 2: =

+a σ

0

where the coefficients

(14) are given by [cf. Complement AIV , relations (23)]:

1 Tr 2 1 1 = Tr = Tr 2 ~ 1 1 = Tr = Tr 2 ~ 1 1 = Tr = Tr 2 ~ Thus we have: 1 0 = 2 1 a= S ~ and can be written: 1 1 = + S σ 2 ~ Therefore, the density matrix of the mean value S of the spin. 0

=

(15)

(16)

(17) of a spin 1/2 can be expressed very simply in terms

Comment:

Let us square expression (17). We obtain, using identity (17) of Complement AIV : 1 1 1 + 2 S2 + S σ (18) 4 ~ ~ The condition 2 = , characteristic of the pure case, is therefore equivalent, for a spin 1 2, to the condition: 2

=

~2 (19) 4 This condition is obviously not satisfied for an unpolarized spin ( S is then zero) or for a spin in thermodynamic equilibrium (we saw in § 4 that in this case S ~ 2). On the other hand, it can be verified, using formulas (B-14) of Chapter IV, that, for a spin in the state given in (1), S 2 is indeed equal to 2 ~ 4. S

2

=

References and suggestions for further reading: Abragam (14.1), Chap. II, § C. 453



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

Complement FIV Spin 1/2 particle in a static and a rotating magnetic fields: magnetic resonance

1

2

3 4

Classical treatment; rotating reference frame . . 1-a Motion in a static field; Larmor precession . . . . 1-b Influence of a rotating field; resonance . . . . . . Quantum mechanical treatment . . . . . . . . . . 2-a The Schrödinger equation . . . . . . . . . . . . . 2-b Changing to the rotating frame . . . . . . . . . . 2-c Transition probability; Rabi’s formula . . . . . . 2-d Case where the two levels are unstable . . . . . . Relation between the classical treatment and the mechanical treatment: evolution of M . . . . . Bloch equations . . . . . . . . . . . . . . . . . . . 4-a A concrete example . . . . . . . . . . . . . . . . 4-b Solution in the case of a rotating field . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . quantum . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

455 455 456 458 458 459 460 461

. . . .

463 463 464 466

In Chapter IV, we used quantum mechanics to study the evolution of a spin 1 2 in a static magnetic field. In this complement, we shall consider the case of a spin 1 2 simultaneously subjected to several magnetic fields, some of which can be time dependent, as is the case in magnetic resonance experiments. Before attacking this problem quantum mechanically, we shall briefly review several results obtained using classical mechanics. 1.

Classical treatment; rotating reference frame

1-a.

Motion in a static field; Larmor precession

Consider a system with angular momentum j that possesses a magnetic moment m = j collinear with j (the constant is the gyromagnetic ratio). The system is placed in a static magnetic field B0 , which exerts a torque m B0 on the system. The classical equation of motion of j is: dj =m d

(1)

B0

or: d m( ) = d

m( )

B0

(2)

Performing a scalar multiplication of both sides of this equation by either m( ) or B0 , we obtain: d [m( )]2 = 0 d

(3) 455

COMPLEMENT FIV

• z Z B0

eZ

ez eY

O

ey

Y y

eX

ex

B1 ωt x X

Figure 1: is a fixed coordinate system. The static magnetic field B0 is directed along the axis. The system [the axis is in the direction of the field B1 ( )] rotates about with the angular velocity .

d [m( ) B0 ] = 0 (4) d m( ) therefore evolves with a constant modulus, maintaining a constant angle with B0 . If we project equation (2) onto the plane perpendicular to B0 , we see that m( ) rotates about B0 (Larmor precession) with an angular velocity of 0 = 0 (the rotation is clockwise if is positive). 1-b.

Influence of a rotating field; resonance

Now assume that we add to the static field B0 a field B1 ( ), perpendicular to B0 , and which is of constant modulus and rotates about B0 with an angular velocity (cf. Fig. 1). We set: 0

=

0

1

=

1

(5)

We shall designate by (unit vectors e , e , e ) a fixed coordinate system, whose axis is the direction of the field B0 , and by (unit vectors e , e , e ), the axes obtained from Oxyz by rotation through an angle about [ is the direction of the rotating field B1 ( )]. The equation of motion of m( ) in the presence of the total field B( ) = B0 + B1 ( ) then becomes: d m( ) = d 456

m( )

[B0 + B1 ( )]

(6)



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

To solve this equation, it is convenient to place ourselves, not in the laboratory reference frame , but in the rotating reference frame , with respect to which the relative velocity of the vector m( ) is: dm d

= rel

dm d

m( )

e

(7)

Let us set: ∆ =

(8)

0

Substituting (6) into (7), we obtain: dm d

= m( )

[∆ e

1

e ]

(9)

rel

This equation is much simpler to solve than equation (6), since the coefficients of the right-hand side are now time-independent. Moreover, its form is analogous to that of (2): the relative motion of the vector m( ) is therefore a rotation about the “effective field” Beff (which is static with respect to the rotating reference frame), given by (cf. Fig. 2): Beff =

1

[∆ e

1e

]

(10)

To obtain the absolute motion of m( ), we must combine this precession about Beff with a rotation about of angular velocity . Z ω – ω0 γ Beff

m O

Y

Figure 2: In the rotating reference frame , the effective field Beff has a fixed direction, about which the magnetic moment m( ) rotates with a constant angular velocity (precession in the rotating reference frame).

ω1 B1 = – γ

X

These first results already enable us to understand the essence of the magnetic resonance phenomenon. Let us consider a magnetic moment which, at time = 0, is 457



COMPLEMENT FIV

parallel to the field B0 (the case, for example, of a magnetic moment in thermodynamic equilibrium at very low temperatures: it is in the lowest energy state possible in the presence of the field B0 ). What happens when we apply a weak rotating field B1 ( )? If the rotation frequency 2 of this field is very different from the natural frequency 2 (more precisely, if ∆ = 0 0 is much larger than 1 ), the effective field is directed practically along . The precession of m( ) about Beff then has a very small amplitude and hardly modifies the direction of the magnetic moment. On the other hand, if the resonance condition 0 is satisfied (∆ 1 ), the angle between the field Beff and is large. The precession of the magnetic moment then has a large amplitude and, at resonance (∆ = 0), the magnetic moment can even be completely flipped. 2.

Quantum mechanical treatment

2-a.

The Schrödinger equation

Let + and be two eigenvectors of the projection of the spin onto , with respective eigenvalues +~ 2 and ~ 2. The state vector of the system can be written: () =

+(

)+ +

() ( ) of the system is1 :

The Hamiltonian operator ()=

(11)

M B( ) =

S [B0 + B1 ( )]

(12)

that is, expanding the scalar product: ()=

+

0

1 [cos

+ sin

]

(13)

Using formulas (A-16) and (A-17) of Chapter IV, we obtain the matrix that represents in the + basis:

=

1 2

0

1

1

e

e (14) 0

Using (11) and (14), we can write the Schrödinger equation in the form: d d d d

1 In

+(

)=

()=

0

+(

2 1

2

e

)+ +( )

1

2

e

() (15) 0

2

()

expression (12), M B( ) symbolizes the scalar product ( )+ ( )+ ( ), where and are operators (observables of the system under study), while () ( ) and () are numbers (since we consider the magnetic field to be a classical quantity whose value is imposed by an external device, independent of the system under study).

458



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

2-b.

Changing to the rotating frame

Equations (15) constitute a linear homogeneous system with time-dependent coefficients. It is convenient to define new functions by setting: +(

2

)=e

+( 2

()=e

) ()

(16)

Substituting (16) into (15), we obtain a system which has constant coefficients: d ∆ 1 () +( ) = +( ) + d 2 2 d ∆ 1 ()= () +( ) + d 2 2 This system can also be written: ~

d d

() =

()

if we introduce the ket () =

=

~ 2

+(

(17)

)+ +



(18) ( ) and the operator

defined by:

()

1

(19)

(20)



1

Transformation (16) has led to equation (18), which is analogous to a Schrödinger equation in which the operator , given in (20), plays the role of a time-independent Hamiltonian. describes the interaction of the spin with a fixed field, whose components are none other than those of the effective field introduced above in the frame [formula (10)]. We can therefore consider that the transformation (16) is the quantum mechanical equivalent of the change from the fixed frame to the rotating frame. This result can be proved rigorously. According to (16), we can write: () = where

() ()

(21)

( ) is the unitary operator defined by:

( )=e

~

(22)

We shall see later (cf. Complement BVI ) that ( ) describes a rotation of the coordinate system through an angle about . (18) is therefore indeed the transformed Schrödinger equation in the rotating frame.

Equation (18) is very simple to solve. To determine ( ) , given (0) , all we need to do is expand (0) on the eigenvectors of (which can be calculated exactly) and then apply rule (D-54) of Chapter III (which is possible since is not explicitly time-dependent). We then go from ( ) to ( ) by using formulas (16). 459



COMPLEMENT FIV

2-c.

Transition probability; Rabi’s formula

Consider a spin which, at time = 0, is in the state + : (0) = +

(23)

According to (16), this corresponds to: (0) = +

(24)

What is the probability P+ ( ) of finding this spin in the state ( ) and ( ) have the same modulus, we can write: P+ ( ) =

()

2

=

()2=

()2=

2

at time ? Since

(25)

We must therefore calculate ( ) 2 , where ( ) is the solution of (18) that corresponds to the initial condition (24). The problem we have just posed has already been solved, in § C-3-b of Chapter IV. To use the calculations of that section, all we need to do is apply the following correspondences: +

1 2

~ ∆ 2

1

2

12

~ ∆ 2 ~ 2

(26)

1

Rabi’s formula [equation (C-32) of Chapter IV] then becomes: P+ ( ) =

2 1 2 1

+ (∆ )2

sin2

2 1

+ (∆ )2

2

(27)

The probability P+ ( ) is zero at time = 0, and then varies sinusoidally with time 2 1 between the values 0 and 2 +(∆ . Again, we have a resonance phenomenon. For )2 1 ∆ 1 , P+ ( ) remains almost zero (cf. Fig. 3-a); near resonance, the oscillation amplitude of P+ ( ) becomes large and, when the condition ∆ = 0 is exactly satisfied, we have P+ ( ) = 1 at times = (2 +1) (cf. Fig. 3-b). 1 Thus we again find the result which we have already obtained classically: at resonance, a very weak rotating field is able to reverse the direction of the spin. Note, more2 2 over, that the angular frequency of the oscillation of P+ ( ) is Beff . 1 + (∆ ) = This oscillation corresponds, in the rotating frame, to the projection onto of the precession of the magnetic moment about the effective field, sometimes called “Rabi precession” [see also the calculation of P+ ( ) in Complement CIV , § 3-c]. 460



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

+–

(t)

+–

(t)

l

l ω – ω0 = 3ω1

l l0

ω = ω0

t

0

t

0 2π/ω1

a

b

Figure 3: Variation with respect to time of the transition probability between the states + and , under the effect of a rotating magnetic field 1 ( ). Outside resonance (fig. a), this probability remains small; at resonance (fig. b), however small the field 1 , there exist times when the transition probability is equal to 1.

2-d.

Case where the two levels are unstable

We are now going to assume that the two states correspond to two sublevels of an excited atomic level (whose angular momentum is assumed equal to 1 2). atoms are excited2 per unit time, all being raised to the state + . Each atom decays, by spontaneous emission of radiation, with a probability per unit time of 1 , which is the same for the two sublevels . We know that, under these conditions, an atom which was excited at time has a probability e of still being excited at time = 0 (cf. Complement KIII ). We assume that the experiment is performed in the steady state: in the presence of the fields B0 and B1 ( ), the atoms are excited at a constant rate into the state + . After a time much longer than the lifetime , what is the number of atoms which decay per unit time from the state ? If an atom is excited at time , the probability of finding it in the state at = 0 is e P+ ( ), where P+ ( ) is given by relation (27). The total number of atoms in the state is obtained by taking the sum of atoms excited at all previous times , that is, by calculating the integral: e

P+ ( )

d

(28)

0

This calculation presents no difficulties. Multiplying the number of atoms thus obtained by their probability 1 of decay per unit time, we obtain: =

2 (∆ )2 +

2 1 2 1

+ (1

)2

(29)

2 In practice, this excitation can be produced, for example, by placing the atoms in a light beam. When the incident photons are polarized, conservation of angular momentum, in certain cases, requires that the atoms which absorb them can attain only the state + (and not the state ). Similarly, by detecting the polarization of the photons re-emitted by the atoms, one can know whether the atoms fall back into the ground state from the state + or the state .

461

COMPLEMENT FIV



N

2L γ 0

B0

– ω/γ

The variation of =

2 1

+ (1

with respect to ∆

Figure 4: Resonance curve. To observe a resonance phenomenon, we perform an experiment in which atoms are excited per unit time into the state + . Under the effect of a field B1 ( ), rotating at the frequency 2 , the atoms undergo transitions towards the state . In the steady state, if we measure the number of atoms which decay per unit time from the state , we obtain a resonant variation when we scan the static field . 0 about the value

corresponds to a Lorentz curve whose half-width is:

)2

(30)

In the experiment described above, let us measure, for various values of the magnetic field 0 (that is, with assumed to be fixed, for various values of ∆ ), the number of atoms which decay from the level . According to (29), we obtain a resonance curve which has the shape shown in Figure 4. It is very interesting to obtain such a curve experimentally, since one can use it to determine several parameters: – if we know and measure the value 0 of the field 0 corresponding to the peak of the curve, we can deduce the value of the gyromagnetic ratio through the relation = 0 . – if we know , we can, by measuring the frequency 2 corresponding to resonance, measure the static magnetic field 0 . Various magnetometers, often of very great precision, operate on this principle. In certain cases, one can derive interesting information from such a measurement of the field. If, for example, the spin being considered is that of a nucleus which belongs to a molecule or to a crystal lattice, one can find the local field seen by the nucleus, its variation with the site occupied, etc. – if we trace the square which, extrapolated to

2

of the half-width as a function of 12 , we obtain a straight line of the excited level (cf. Fig. 5). 1 = 0, gives the lifetime

L2

1 τ

Figure 5: The extrapolation to 1 = 0 of the squared half-width of the resonance curve of Figure 4 gives the lifetime of the level being studied.

2

0

462

ωt2

• 3.

SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

Relation between the classical treatment and the quantum mechanical treatment: evolution of M

The results obtained in §§ 1 and 2 are very similar, although we used classical mechanics in one case and quantum mechanics in the other. We are now going to show that this similarity is not accidental. It arises from the fact that the quantum mechanical evolution equations of the mean value of a magnetic moment placed in an arbitrary magnetic field are identical to the corresponding classical equations. The mean value of the magnetic moment associated with a spin 1 2 is: M ()=

S()

(31)

To calculate the evolution of M ( ), we use theorem (D-27) of Chapter III: ~

d M ( ) = [M d

where

( )]

(32)

( ) is the operator:

()=

M B( )

(33)

Let us calculate for example the commutator [ ( )]. Using the fact that the field components ( ) and ( ) are numbers (cf. note 1), we find: [

, H(t)] =

2

=

2

[

( )+ ( )[

( )+ 2

]

( )[

( )] ]

(34)

Using relations (14) of Complement AIV , we obtain: [

( )] = ~

2

[

()

()

]

(35)

Substituting (35) into (32): d d

()= [

()

()

()

( )]

(36)

By cyclic permutation, we can calculate analogous expressions for the components on and ; the three equations obtained can be condensed into: d M ()= d

M()

B( )

(37)

Let us compare (37) with (6): the evolution of the mean value M ( ) obeys the classical equations exactly, whatever the time-dependence of the magnetic field B( ). 4.

Bloch equations

In practice, in a magnetic resonance experiment, it is not the magnetic moment of a single spin that is observed, but rather that of a great number of identical spins (as in the experiment described in § 2-d above, where the number of atoms which decay from the state is detected). Moreover, one is not concerned solely with the quantity P+ ( ), calculated above. One can also measure the global magnetization M of the sample under study: the sum of the mean values

463

COMPLEMENT FIV



C F E A

Figure 6: Schematic drawing of an experimental device which supplies cell in the state + .

with atoms

of the observable M corresponding to each spin of the sample3 . It is interesting, therefore, to obtain the equations of motion of M , called the Bloch equations. In order to understand the physical significance of the various terms appearing in these equations, we are going to derive them for a simple concrete case. The results obtained can be generalized to other more complicated situations. 4-a.

A concrete example

Consider a beam of atoms coming from an atomic polarizer of the type studied in § B1-a of Chapter IV. All the atoms of the beam4 are in the spin state + and therefore have their magnetic moments parallel to . They enter a cell through a small opening (Fig. 6), rebound a certain number of times from the inside walls of the cell and, after a certain time, escape through the same opening. We shall denote by the number of polarized atoms entering the cell per unit time; is generally small and the atomic density inside the cell is low enough to allow atomic interactions to be neglected. Moreover, if the inside walls of the cell are suitably coated, collisions with the walls have little effect on the spin state of the atoms5 . We shall assume that there is a probability per unit time 1 for the elementary magnetization introduced into the cell by a polarized atom to disappear, either because of a depolarizing collision with the walls or simply because the atom has left the cell. is called the “relaxation time”. The cell is placed in a magnetic field B( ) which may have a static component and a rotating component. The problem consists of finding the equation of motion of the global magnetization M ( ) of the atoms which are inside the cell at time . First, let us write the exact expression for M ( ): N

M( )=

N ( )

=1

( )M

( )

M ( )( )

() =

(38)

=1

In (38), the sum is taken over the N spins which are already in the cell and which, at time , ( ) have neither left nor undergone a depolarizing collision. ( ) is the state vector of such a spin ( ) at time [we are not counting, in (38), the spins which have undergone a depolarizing 3 It is possible to detect, for example, the electromotive force emf induced in a coil by the variation of M with respect to time. 4 For example, silver or hydrogen atoms in the ground state. For the sake of simplicity, all effects related to nuclear spin are neglected. 5 For example, for hydrogen atoms bouncing off teflon walls, tens of thousands of collisions are required for the magnetic moment of the hydrogen atom to become disoriented.

464



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

collision and have not yet left the cell, since their global contribution is zero: their spins point randomly in all directions]. Between times and + d , M ( ) varies for three different reasons: ( ) A certain proportion, d , of the N spins undergo a depolarizing collision or leave the compartment; these spins disappear from the sum (38) and M ( ) therefore decreases by: M( ) = dM

d

M( )

(39)

( ) The other spins evolve freely in the field B( ). We saw in § 3 above that, for each of them, the evolution of the mean value of M:

M ( )( ) =

( )

( )

( )M

()

obeys the classical law: M ( )( ) = dM

M ( )( )

B( )d

(40)

Since the right-hand side of (40) is linear with respect to M ( ) ( ), the contribution of these spins to the variation of M ( ) is given by: M( ) = dM (

M( )

B( ) d

(41)

) Finally, a certain number, d , of new spins have entered the cell. Each of them adds to the global magnetization a contribution µ0 , equal to the mean value of M in the state + (µ0 is parallel to and µ0 = ~ 2). M therefore increases by: M ( ) = µ0 d dM

(42)

The global variation of M is obtained by adding (39), (41) and (42). Dividing by d , we obtain the equation of motion of M ( ) (Bloch equation): d M ( ) = µ0 d

1

M( )+ M( )

B( )

(43)

We have derived (43) in a specific case, making certain hypotheses. However, the main features of this equation remain valid for a great number of other experiments where the rate of variation of M ( ) appears in the form of a sum of three terms: – a source term (here µ0 ) which describes the preparation of the system. It would, in fact, be impossible to observe magnetic resonance without a preliminary polarization of the spins, which can be achieved through selection using a magnetic field gradient (as in the example studied here), a polarized optical excitation (as in the example studied in § 2-d above), cooling of the sample in a strong static field, etc. 1 – a damping term here M ( ) which describes the disappearance or “relaxation” of the global magnetization under the effect of various processes: collisions, disappearance of atoms, change in atomic levels through spontaneous emission (as in the example studied in § 2-d), etc.

– a term which describes the precession of M ( ) in the field B( ) [last term of (43)].

465

COMPLEMENT FIV

4-b.



Solution in the case of a rotating field

When the field B( ) is the sum of a static field B0 and a rotating field B1 ( ), such as those considered above, equations (43) can be solved exactly. As in §§ 1 and 2, one changes to the rotating frame , with respect to which the relative variation of M ( ) is: d M d

= µ0 rel

1

M + M

(44)

Beff

where Beff is defined by equation (10).

(𝓜X)s

0

Δω ω1

2

(𝓜Y)s Δω

0

(𝓜Z)s

0

Δω

Figure 7: Variation with respect to ∆ = 0 of the stationary values of the components of M in the rotating frame. One obtains a dispersion curve for (M ) and absorption 2 curves for (M ) and (M ) . The three curves have the same width, 2 )2 , 1 + (1 which increases with 1 . They have been drawn assuming that 1 = 1 (“halfsaturation”).

Projecting this equation onto , and , we obtain a system of three linear differential equations with constant coefficients, whose stationary solution (valid after a time

466



SPIN 1/2 PARTICLE IN A STATIC AND A ROTATING MAGNETIC FIELDS: MAGNETIC RESONANCE

much greater than (M ) =

µ0

(M ) =

0

(M ) =

0

) is: 1∆ 2 1 + (1 1 (∆ )2 + 12 + (1 2 1 1 2 2 (∆ ) + 1 +

(∆ )2 +

)2 )2 (1

)2

(45)

M ) , when the field 0 varies, have The three components of the stationary magnetization (M resonant variations about the value 0 = (cf. Fig. 7). (M ) and (M ) follow absorp2 M ) follows a dispersion curve tion curves (Lorentz curves of width 2 )2 ). (M 1 + (1 (of the same width).

References and suggestions for further reading: Feynman II (7.2), Chap. 35; Cagnac and Pebay-Peyroula (11.2), Chaps. IX § 5, X § 5, XI §§ 2 to 5, XIX § 3; Kuhn (11.1), § VI, D. See the references of section 14 of the bibliography, particularly Abragam (14.1) and Slichter (14.2).

467



A SIMPLE MODEL OF THE AMMONIA MOLECULE

Complement GIV A simple model of the ammonia molecule

1 2

3

1.

Description of the model . . . . . . . . . . . . . . . . . . . . . Eigenfunctions and eigenvalues of the Hamiltonian . . . . . 2-a Infinite potential barrier . . . . . . . . . . . . . . . . . . . . . 2-b Finite potential barrier . . . . . . . . . . . . . . . . . . . . . . 2-c Evolution of the molecule. Inversion frequency . . . . . . . . The ammonia molecule considered as a two-level system . . 3-a The state space . . . . . . . . . . . . . . . . . . . . . . . . . . 3-b Energy levels. Removal of the degeneracy due to the transparency of the potential barrier . . . . . . . . . . . . . . . . . 3-c Influence of a static electric field . . . . . . . . . . . . . . . .

469 471 471 473 476 477 478 479 481

Description of the model

In the ammonia molecule NH3 , the three hydrogen atoms form the base of a pyramid whose apex is the nitrogen atom (cf. Fig. 1). We shall study this molecule by using a simplified model with the following features: the nitrogen atom (of mass ), much heavier than its partners (of mass ), is motionless; the hydrogen atoms form a rigid equilateral triangle whose axis always passes through the nitrogen atom. The potential energy of the system is thus a function of only one parameter, the (algebraic) distance between the nitrogen atom and the plane defined by the three hydrogen atoms1 . The x

Figure 1: Schematic drawing of the ammonia molecule; is the algebraic distance between the plane of the hydrogen atoms and the nitrogen atom, which is assumed to be motionless.

N

0

H

H

H

shape of this potential energy

( ) is given by the solid-line curve in Figure 2. The

1 In

this one-dimensional model, effects linked to the rotation of the molecule are obviously not taken into account.

469

COMPLEMENT GIV



symmetry of the problem with respect to the = 0 plane requires ( ) to be an even function of . The two minima of ( ) correspond to two symmetrical configurations of the molecule in which, classically, it is stable; we shall choose the energy origin such that its energy is then zero. The potential barrier at = 0, of height 1 , expresses the fact that, if the nitrogen atom is in the plane of the hydrogen atoms, they repel it. Finally, the increase in ( ) when is greater than corresponds to the chemical bonding force which insures the cohesion of the molecule. V (x)

V1

V0

–b a

0

+b

x

a

Figure 2: Variation with respect to of the potential energy ( ) of the molecule. ( ) has two minima (classical equilibrium positions), separated by a potential barrier due to the repulsion for small between the nitrogen atom and the three hydrogen atoms. The “square potential” used to approximate ( ) is shown in dashed lines. This model therefore reduces the problem to a one-dimensional one in which a fictitious particle of mass (it can be shown that the “reduced mass” of the system is equal to 33 + ) is under the influence of the potential ( ). Under these conditions, what are the energy levels predicted by quantum mechanics? With respect to classical predictions, two major differences appear: (i) The Heisenberg uncertainty relation forbids the molecule to have an energy equal to the minimum of ( ) ( = 0 in our case). We have already seen, in Complements CI and MIII why this energy must be greater than min . ( ) Classically, the potential barrier at = 0 cannot be cleared by a particle whose energy is less than 1 : the nitrogen atom thus always remains on the same side of the plane of the hydrogen atoms, and the molecule cannot invert itself. Quantum mechanically, such a particle can cross this barrier by the tunnel effect (cf. Chap. I, § D-2-c): the inversion of the molecule is therefore always possible. We are going to discuss the consequences of this effect. We are concerned here only with a qualitative discussion of the physical phenomena and not with an exact quantitative calculation which would not have much significance in this approximate model. For example, we shall try to demonstrate the existence of an inversion frequency of the ammonia molecule, without giving an exact or even an approximate value of this frequency. We shall therefore simplify the problem further by replacing the function ( ) by the square potential drawn in dashed lines in Figure 2 [two infinite potential steps at = ( + 2) and a potential barrier of height 0 centered at = 0 and of width (2 )]. 470

• 2.

A SIMPLE MODEL OF THE AMMONIA MOLECULE

Eigenfunctions and eigenvalues of the Hamiltonian

2-a.

Infinite potential barrier

Before calculating the eigenfunctions and eigenvalues of the Hamiltonian corresponding to the “square” potential of Figure 2, we are going to assume, in this first stage, that the potential barrier 0 is infinite (in which case, no tunnel effect is possible). This will lead us to a better understanding of the consequences of the tunnel effect across the finite potential barrier of Figure 2. We shall therefore consider, to begin with, a particle in a potential ( ) composed of two infinite wells of width centered at = (Fig. 3). If the particle is in one of these two wells, it obviously cannot go into the other one. ~ V(x)

Figure 3: When the height 0 of the potential barrier of Figure 2 is large, we have two practically infinite potential wells of width whose centers are separated by a distance of 2 . +b

–b 0 a

a

Each of the two wells of Figure 3 is similar to the one studied in Complement HI , in § 2-c- . We can therefore use the results obtained in that complement. The possible energies of the particle are: =

~2 2

2

(1)

with: =

(2)

(where is a positive integer). Each of the energy values is twofold degenerate, since two wave functions correspond to it: 2 1(

)=

sin

+

sin

+

if

2

2 everywhere else

0 2 2( ) =

0

2

+

if

2 everywhere else

+

2

+

2

(3)

471

COMPLEMENT GIV



In the state 1 , the particle is in the infinite well on the right; in the state 2 , it is in the one on the left. Figure 4 shows the first two energy levels of the molecule, which are two-fold degenerate. The Bohr frequency ( 2 associated with these two levels corresponds, 1) as we saw in Complement AIII (§ 2-b), to the to-and-fro motion of the particle between the two sides of the well on the right (or on the left) when its state is a linear superposition of 1 2 1 2 1 and 1 (or of 2 and 2 ). Physically, such an oscillation represents a molecular vibration of the plane of the three hydrogen atoms about its stable equilibrium position, which corresponds to = + (or = ). The frequency of this oscillation falls in the infrared part of the spectrum. E

E2 =

4π2ħ2 2ma2

ν=

E2 – E1 h

E1 =

ħ2π2 2ma2 0

Figure 4: First energy levels obtained in the potential wells of Figure 3. The oscillation of the system in one of the two wells at the Bohr frequency = ( 2 represents 1) the vibration of the molecule about one of its two classical equilibrium positions. In the rest of the calculation, it is convenient to change bases, in each of the eigensubspaces of the Hamiltonian of the particle. Since the function ( ) is even, this Hamiltonian commutes with the parity operator Π (cf. Complement FII , § 4). In this case, we can find a basis of eigenvectors of that are even or odd; the wave functions of these vectors are the symmetrical and antisymmetrical linear combinations: 1 [ 2 1 ( )= [ 2 ( )=

1(

)+

2(

)] (4)

1( )

2 ( )]

In the states and , the particle can be found in one or the other of the two potential wells. In what follows, we shall confine ourselves to the study of the ground state, for which the wave functions 11 ( ), 12 ( ), 1 ( ) and 1 ( ) are shown in Figure 5. 472



A SIMPLE MODEL OF THE AMMONIA MOLECULE

φ11 (x)

–b

b

x

0 a

φ21 (x)

b

–b

x

0 φ1s (x)

x 0 b

φ1a (x)

x 0

Figure 5: The states 11 ( ) and 12 ( ), shown in figure a, are stationary states with the same energy, respectively localized in the right-hand well and the left-hand well of Figure 3. To use the symmetry of the problem, it is more convenient to choose as stationary states the symmetrical state 1 ( ) and the antisymmetrical state 1 ( ), linear combinations of 11 ( ) and 12 ( ) (figure b) .

2-b.

Finite potential barrier

Let us try to find the shape of the eigenfunctions of the first energy levels when 0 has a finite value (assumed, nevertheless, to be greater than the energy of these levels). Inside the two “square” potential wells (dashed lines in Figure 2), ( ) = 0. The wave function is therefore of the form: ( )=

sin

+

( )=

sin

+

if

2 2

+

if

2 2

+

2 +

(5) 2 473



COMPLEMENT GIV

where

is related to the energy

=

~2 2

of the level by the relation:

2

(6)

As in the preceding paragraph, ( ) always goes to zero at = ( + 2), since ( ) becomes infinite at these two points. On the other hand, since 0 is finite, ( ) no longer goes to zero at = ( 2); consequently, no longer satisfies relation (2). Once again, since ( ) is even, we can look for eigenfunctions of the Hamiltonian, ( ) and ( ), which are respectively even and odd. Let us denote by and , and the values of the coefficients and , introduced in (5), which correspond to ( ) and ( ). We have, obviously: =

(7)

=

The eigenvalues associated with and will be denoted by and , which enables us, using (6), to define the corresponding values and of the parameter . In the interval ( 2) +( 2), the wave function is no longer zero, as it was before, since 0 is finite. It must be a linear combination, even or odd depending on whether we are considering or , of exponentials e and e ; and are defined in terms of and 0 by: 2 ( ~2

=

)=

0

2

2

(8)

with: 0

=

~2 2

2

Therefore, for

(9) (

2)

( )=

cosh(

)

( )=

sinh(

)

(

2), the functions

and

are written: (10)

Finally, we must match the eigenfunctions and their derivatives at The even solution ( ) must therefore satisfy the conditions: sin(

)=

cos(

)=

Since and to obtain: tan(

)=

474

)=

(

2).

2

sinh

2

(11)

cannot be zero simultaneously, we can take the ratio of equations (11)

coth

For the odd solution tan(

cosh

=

2

(12)

( ), we obtain in the same way:

tanh

2

(13)



A SIMPLE MODEL OF THE AMMONIA MOLECULE

If and are replaced by their values in terms of can be written: tan(

)=

2

2

2

2

coth

2

tanh

2

2

2

and

, relations (12) and (13)

(14)

and: tan(

)=

2

2

(15)

In theory, therefore, the problem is solved. Relations (14) and (15) express the energy quantization since they give the possible values of and and therefore, thanks to relation (6), the energies and (with the condition that they be less than 0 ). The transcendental equations (14) and (15) can be solved graphically. A certain number of roots are found: 1 , 2 ,..., 1 , 2 ... The root is different from , since equations (14) and (15) are not the same: the energies and are therefore different. Of course, when 0 becomes very large, and both approach the value found in the preceding section; this can be seen by letting approach infinity in equations (14) and (15), which yields tan( ) = 0, an equation equivalent to (2). The energies and therefore approach the value = ~2 2 2 2 2 calculated in the preceding section for 0 approaching infinity. Finally, it is easy to see that, the more 0 exceeds , the closer together the two energies and will be. The exact values of and are of little importance to us here. We shall content ourselves with sketching the shape of the energy spectrum in Figure 6, which shows what happens to the energies of levels 1 and 2 of Figure 4 when the finite height 0 of the potential barrier is taken into account. We see that the tunnel effect across this barrier removes the degeneracy of 1 and 2 , giving rise to doublets, ( 1 , 1 ) and ( 2 , 2 ) (assuming, of course, that all these energies are less than 0 ). Since the ( 1 , 1 ) doublet 1 2 2 is the deeper one, it is clear that 1 . Finally, the distance between the doublets is much greater than the spacing within each doublet (experimentally, their ratio is of the order of a thousand). These spacings enable us, moreover, to define new Bohr (angular) frequencies:

1

Ω1 =

1

~

2

Ω2 =

2

~

whose physical significance we shall study in the next paragraph (the corresponding transitions are represented by arrows in Figure 6). Figure 7 shows the shape of the eigenfunctions 1 ( ) and 1 ( ), which are given by equations (5), (7) and (10) once 1 and 1 have been determined from (14) and (15). We see that they greatly resemble the functions 1 ( ) and 1 ( ) of Figure 5, the essential difference being that the wave function is no longer zero in the interval ( 2) ( 2). The reason for introducing the 1 and 1 basis in the preceding paragraph can now be understood: the eigenfunctions 1 and 1 , in the presence of the tunnel effect, resemble 1 and 1 much more than 11 and 12 . 475

COMPLEMENT GIV

2-c.



Evolution of the molecule. Inversion frequency

Assume that at time = 0, the molecule is in the state: ( = 0) =

1 2

1

1

+

(16)

The state vector ( ) at time can be obtained by using the general formula (D-54) of Chapter III; we obtain: () =

1 e 2

1+

1

e+ Ω1

2~

2

1

+e

Ω1

2

1

(17)

From this we deduce the probability density: )2=

(

1 2

1

( )

2

+

1 2

1

( )

2

+ cos(Ω1 )

1

( )

1

( )

(18)

The variation with respect to time of this probability density is simple to obtain graphically from the curves of Figure 7. They are shown in Figure 8. For = 0 (Fig. 8-a), we see that the initial state chosen in (16) corresponds to a probability density which is concentrated in the right-hand well (in the left-hand well, the functions 1 and 1 are of opposite sign and very close in absolute value, so their sum is practically zero). It can therefore be said that the particle, initially, is practically in the right-hand well. At time = 2Ω1 (Fig. 8-b), it has moved appreciably, through the tunnel effect, into the left-hand well, is practically there at time = Ω1 (Fig. 8-c), and then performs the process in reverse (Figures 8-d and 8-e).

E Ea2 Es2

Ea1 Es1 0

476

ħΩ2

ħΩ1

Figure 6: When one takes the finite height 0 of the barrier into account, one finds that the energy spectrum of Figure 4 is modified: each level splits into two distinct ones. The Bohr frequencies Ω1 2 and Ω2 2 corresponding to tunnelling from one well to the other are the inversion frequencies of the ammonia molecule for the first two vibration levels. The tunnel effect is more important in the higher vibration level, so Ω2 Ω1 .



A SIMPLE MODEL OF THE AMMONIA MOLECULE

χs1 (x)

x –b

0

+b χa1 (x)

–b

Figure 7: Wave functions associated with the levels 1 and 1 in Figure 6. Note the analogy with the functions in Figure 5-b; however, these new functions do not vanish on the interval + 2 2.

x 0

+b

The fictitious particle therefore moves from one side of the potential barrier to the other with the frequency Ω1 2 , which means that the plane of the hydrogen atoms continually passes from one side of the nitrogen atom to the other. This is why the frequency Ω1 2 is called the inversion frequency of the molecule. Note that this inversion frequency has no classical analogue, since its existence is related to the tunnel effect of the fictitious particle across the potential barrier. Since the nitrogen atom tends to attract the electrons of the three hydrogen atoms, the ammonia molecule possesses an electric dipole moment which is proportional to the mean value of the position of the fictitious particle we have studied; we see in Figure 8 that this dipole moment is an oscillating function of time. Under these conditions, the ammonia molecule is capable of emitting or absorbing electromagnetic radiation of frequency Ω1 2 . Experimentally, this is indeed observed; the value of Ω1 falls in the domain of centimeter waves. In radioastronomy, ammonia molecules in interstellar space have been shown to emit and absorb electromagnetic waves of this frequency. Let us also point out that the principle of the ammonia maser is based on the stimulated emission of these waves by the NH3 molecule. 3.

The ammonia molecule considered as a two-level system

We see in Figure 6 that we have a situation which is analogous to the one mentioned in the introduction of § C of Chapter IV. The system under study possesses two levels, 1 and 1 , which are very close to each other and very far from all other levels 2 , 2 , ... If we are interested only in the two levels 1 and 1 , we can “forget” all the others (the exact justification for such an approximation will be given in the framework of perturbation theory in Chapter XI). We are going to return to the preceding discussion with a slightly different point of view and show that the general considerations of Chapter IV concerning two-level 477

COMPLEMENT GIV

• ψ (x, t) 2

d

a

x

x 0

0

e

b x 0

x 0

c x 0

Figure 8: Evolution of a wave packet obtained by superposing the two stationary wave functions of Figure 7. The particle, initially, is in the right-hand well (fig. a), tunnels into the left-hand well (fig. b) and, after a certain time, becomes localized there (fig. c); then it returns to the right-hand well (fig. d) and the initial state (fig. e), and so on.

systems can be applied to the ammonia molecule. This point of view will also enable us to study very simply the effect of a static external electric field on this molecule.

3-a.

The state space

The state space we are going to consider is spanned by the two orthogonal vectors and 12 , whose wave functions are given by (3). As we explained above, we shall ignore the other states 1 and 2 for which 1. In the states 11 and 12 , the nitrogen atom is either above or below the plane of the hydrogen atoms. We introduced in (4) a second orthonormal basis of the state space, composed of the even and odd vectors: 1 1

1

1

1 2 1 = 2 =

1 1

+

1 2

(19) 1 1

1 2

There is the same probability in these two states of finding the nitrogen atom above or below the plane of the hydrogen atoms. 478

• 3-b.

A SIMPLE MODEL OF THE AMMONIA MOLECULE

Energy levels. Removal of the degeneracy due to the transparency of the potential barrier

When the height 0 of the potential barrier is infinite, the states 11 and 12 have the same energy (as do the states 1 and 1 ), so that 0 , the Hamiltonian of the system, is written: =

0

(20)

1

(where is the identity operator in the two-dimensional state space). To take into account phenomenologically the fact that the barrier is not infinite, 1 we add to 0 a perturbation which is non-diagonal in the 11 2 basis and is represented by the matrix: 0

1

=

(21) 1

0

is a real positive coefficient2 . If we want to find the stationary states of the molecule, we must diagonalize the total Hamiltonian operator = 0 + , whose matrix is written: where

1

=

(22) 1

A simple calculation gives the eigenvalues and eigenvectors of 1

+

:

1

corresponding to the eigenket

(23)

1

1

We see that, under the effect of the perturbation , the two levels, which were degenerate when was zero, now split; an energy difference, equal to 2 , appears, and the new eigenstates are the states 1 and 1 . We again find the results of § 2. If, at time = 0, the molecule is in the state 11 : 1 1

( = 0) =

=

1 2

the state vector at time () = =e

1 e 2 1

1

~

~

1

1

(24)

will be: ~

e

cos ~

2 We

+

1

+e 1 1

1

~

+ sin ~

1 2

are forced to assume 0 in order to obtain the relative disposition of the of Figure 6 [see eigenvalues (23)].

(25) 1

and

1

levels

479

COMPLEMENT GIV



In a measurement performed at time , we therefore have a probability cos2 ( ~) of finding the molecule in the state 11 (the nitrogen atom above the plane of the hydrogen atoms) and sin2 ( ~) of finding it in the state 12 (the nitrogen atom below). Thus we again find that, under the effect of the coupling , the ammonia molecule inverts periodically.

480



A SIMPLE MODEL OF THE AMMONIA MOLECULE

Comment:

The perturbation [given in (21)] describes (phenomenologically) the fact that the potential barrier is finite. This approach is less precise than the discussion above, since we obtain here eigenfunctions 1 ( ) and 1 ( ) which, unlike 1 and 1 , go to zero in the region ( + 2) ( 2). This much more simple description nevertheless explains two fundamental physical effects: the removal of the degeneracy of 1 and the periodic oscillation of the molecule between the states 11 and 12 (inversion). 3-c.

Influence of a static electric field

We saw above that, in the states 11 and 12 , the electric dipole moment of the molecule takes on two opposite values, which we shall denote by + and . If we call the observable associated with this physical quantity, we can therefore assume that 1 is represented in the { 11 2 } basis by a diagonal matrix whose eigenvalues are + and : 0 =

(26) 0

When the molecule is placed in a static electric field3 E , the interaction energy with this field is: E

(E ) =

(27)

This term of the Hamiltonian4 is represented in the { 1

E

(E ) =

+

1

(E ) =

1

1 1

1

=

1

+

2

+

2E 2

2

+

2E 2

} basis, the total Hamil-

(29) + E

This matrix can easily be diagonalized; its eigenvalues vectors + and are given by: =

1 2

E 1

+

} basis by the matrix:

0

Let us then write the matrix which represents, in the { tonian operator of the molecule, 0 + + (E ):

+

1 2

(28) 0

0

1 1

+

and

and its eigen-

(30)

3 For the sake of simplicity, we assume here that this field is parallel to the axis of Figure 1 (one-dimensional model). 4 In (E ), is an observable, while E is a classical quantity which is externally imposed (cf. note, page 458).

481

COMPLEMENT GIV



and: = cos

+

= sin

2 2

1 1 1 1

sin + cos

2 2

1 2

(31) 1 2

where we have set: tan

=

(0

E

)

(32)

[cf. Complement BIV , relations (9), (10), (22) and (23); since is real and negative, the angle introduced in that complement is here equal to ]. When E is zero, = 2, and we again obtain the results of § 3-b, since: + (E

= 0) =

1

(E = 0) =

1

+

(33)

with: = 0) =

1

(E = 0) =

1

+ (E

(34)

When, for an arbitrary E , obtain: +(

= 0) =

1

(

= 0) =

1

is zero (a perfectly opaque potential barrier), we

E

+

(35)

E

with, if E is positive5 : (

= 0) =

+(

= 0) =

1 1

(36)

1 2

In this case, the energies therefore vary linearly with E (dashed straight lines in Figure 9). Physically, results (35) and (36) are easy to understand: when the electric field alone acts on the molecule, it “pulls” the positively charged hydrogen atoms above or below the nitrogen atom; this is why the stationary states are 11 and 12 . When the electric field E and the coupling constant are both arbitrary, the states are linear superpositions of the states 11 and 12 (and of the states + and 1 1 and as well), and result from a compromise between the action of the electric field, which tends to pull the hydrogen atoms to one side of the nitrogen atom, and that of the coupling , which tends to draw the nitrogen atom accross the potential barrier. The variation of the energies + and is shown graphically in Figure 9, in which we see the phenomenon of anti-crossing (cf. Chap. IV, § C-2-b) due to the coupling . + and correspond to the two branches of a hyperbola whose asymptotes are the dashed lines associated with the energies in the absence of coupling. Finally, we can calculate 5 If

482

E is negative, the roles of

1 1

and

1 2

are inverted in (36).



A SIMPLE MODEL OF THE AMMONIA MOLECULE

E E+ E1 + A

0

E1 – A E–

Figure 9: Influence of an electric field E on the first two levels of the ammonia molecule (their spacing 2 in a zero field is due to the tunnel effect coupling). For weak E , the molecule acquires a dipole moment proportional to E , and the corresponding energy varies with E 2 . For large E , the dipole moment approaches a limit (corresponding to the nitrogen atom either above or below the plane of the hydrogen atoms), and the energy becomes a linear function of E . the mean values of the electric dipole moment . Using (26) and (31), we find: + and +

+

=

= cos

in each of the two stationary states

(37)

which, according to (32), yields: 2 +

+

=

=

2

E

+

2E 2

(38)

For E = 0, these two mean values are zero. This corresponds to the fact that, in the two states 1 , the particle has an equal probability of being in one or the other of the two wells. On the other hand, when E , we again find the dipole moment + (or ) corresponding to the state 11 (or 12 ). When the electric field is weak ( E ), formulas (38) can be written in the form: 2 +

+

=

=

E

(39)

We see that the molecule in the stationary state + (or ) acquires an electric dipole moment proportional to the external field E . If we define an electric susceptibility of the molecule in the state by the relation: =

E

(40) 483

COMPLEMENT GIV



we find, according to (39), that: 2

=

(41)

(the same calculations are valid for

+

and yield

+

=

).

Comment:

In a weak field, formulas (30) can be expanded as a power series of E

+

=

1

=

1

+

1 2 1 + 2

2

2

E2 E2

:

+

(42a)

+

(42b)

Let us now consider ammonia molecules moving in a region where E is weak but where E 2 has a strong gradient in the direction (i.e. along the axis of the molecules): d (E 2 ) = d

(43)

According to (42a), the molecules in the state to which is equal to: =

d d

=

1 2

are subjected to a force parallel

2

Relation (42b) indicates that the molecules in the state opposite force: +

=

d d

+

=

(44) +

are subjected to an

(45)

This result is the basis of the method used in the ammonia maser to sort the molecules and select those in the higher energy state. The device is analogous to the Stern-Gerlach apparatus: a beam of ammonia molecules crosses a region where there is a strong electric field gradient, the molecules follow different trajectories depending on whether they are in one state or the other; one can, using a suitable diaphragm, isolate either one of the two states.

References and suggestions for further reading: Feynman III (1.2), § 8-6 and Chap. 9; Alonso and Finn III (1.4), § 2-8; article by Vuylsteke (1.34); Townes and Schawlow (12.10), Chap. 12; see (15.11) for references to original articles on masers; articles by Lyons (15.14), Gordon (15.15), and Turner (12.14). See also Encrenaz (12.11), Chap. VI.

484



EFFECTS OF A COUPLING BETWEEN A STABLE STATE AND AN UNSTABLE STATE

Complement HIV Effects of a coupling between a stable state and an unstable state

1 2 3

1.

Introduction. Notation . . . . . . . . . Influence of a weak coupling on states Influence of an arbitrary coupling on energy . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . 485 of different energies 486 states of the same . . . . . . . . . . . . . 487

Introduction. Notation

The effects of a coupling between two states 1 and 2 of energies 1 and 2 were discussed in detail in § C of Chapter IV. What modifications appear when one of the two states ( 1 , for example) is unstable? The concepts of an unstable state and a lifetime were introduced in Complement KIII . We shall assume, for example, that 1 is an excited atomic state. When the atom is in this state, it can fall back to a lower energy level through spontaneous emission of one or several photons, with a probability 1 1 per unit time: 1 is the lifetime of the unstable state 1 . On the other hand, we assume that in the absence of the coupling , the state 2 is stable ( 2 is infinite). We saw in Complement KIII that a simple way of taking the instability of a state into account consists of adding an imaginary term to the corresponding energy. We shall therefore replace the energy 1 of the state 1 by: 1

=

~ 2

1

(1)

1

with: 1

=

1

(2)

1

(since 2 is infinite, 2 is zero and 2 = 2 ). In the absence of coupling, the matrix 1 representing the “Hamiltonian” 0 of the system is now written in the { 1 2 } basis :

0

=

1

0

0 = 2

1

~ 2 1

0

0 (3) 2

1 The

operator 0 is not Hermitian and is therefore not really a Hamiltonian (see the comment at the end of Complement KIII ).

485



COMPLEMENT HIV

2.

Influence of a weak coupling on states of different energies

Let us assume, as in § C of Chapter IV, that we add to matrix in the basis is: 1 2 0

0

a perturbation

12

=

, whose

(4)

0

21

What now happens to the energies and lifetimes of the levels? Let us calculate the eigenvalues 1 and 2 of the matrix:

=

1

and

0+

2

1

=

~ 2 1

12

21

2

(5)

are the solutions of the equation in :

2

1

+

~ 2

2

+

1

1

2

~ 2

1

2

12

2

=0

(6)

To simplify the calculation, we shall confine ourselves to the case where the coupling is weak, i.e.

~ 2

1

1

2

2

(

12

1

12

+

1

2 2

1

+

~ 2 1

12

+ 2

2 2)

+

1

+

~2 2 4 1;

we then find:

2 ~ 2 1

(7)

The energies of the eigenstates in the presence of the coupling are the real parts of 1 and 2 ; the lifetimes are inversely proportional to their imaginary parts. We see from (7) that the coupling changes, to second order in 12 , both the energies and the lifetimes. In particular, we observe that 1 and 2 are both complex when 12 is not zero: in the presence of the coupling, there is no longer any stable state. We can write 2 in the form: 2

~ Γ2 2

= ∆2

(8)

with: ∆2 = Γ2 =

2

+

( (

2 2 12

1

(

2

1)

2 1) 12 2 ~ 2 2 1) + 4 1 2

+

~2 2 4 1

(9a) (9b)

The state 2 therefore acquires, under the effect of the coupling, a finite lifetime whose inverse is given in (9b) (Bethe’s formula). This result is easy to understand physically: 486



EFFECTS OF A COUPLING BETWEEN A STABLE STATE AND AN UNSTABLE STATE

if the system at = 0 is in the stable state 2 , there is a non-zero probability at a subsequent time of finding it in the state 1 , in which the system has a finite lifetime. It is sometimes said figuratively that “the coupling brings into the stable state part of the instability of the other state”. Moreover, it can be seen from expressions (7) that, as in the case studied in § C of Chapter IV, the smaller the difference between the unperturbed energies 1 and 2 , the more effectively the perturbation acts on the energies and lifetimes. We shall therefore study in the next section the case where this difference is zero. 3.

Influence of an arbitrary coupling on states of the same energy

When the energies $E_1$ and $E_2$ are equal ($E_1 = E_2 = E_m$), the operator $H$ is written, if we make its trace appear explicitly, as in § 2 of Complement BIV:

$$H = \left(E_m - i\,\frac{\hbar\gamma_1}{4}\right)\mathbb{1} + K \tag{10}$$

where $\mathbb{1}$ is the identity operator and $K$ is the operator which, in the $\{|\varphi_1\rangle, |\varphi_2\rangle\}$ basis, has for its matrix:

$$(K) = \begin{pmatrix} -\,i\hbar\gamma_1/4 & W_{12} \\ W_{21} & i\hbar\gamma_1/4 \end{pmatrix} \tag{11}$$

The eigenvalues $\kappa_1$ and $\kappa_2$ of $K$ are the two solutions of the characteristic equation:

$$\kappa^2 = |W_{12}|^2 - \frac{\hbar^2\gamma_1^2}{16} \tag{12}$$

They therefore have opposite values:

$$\kappa_1 = -\kappa_2 \tag{13}$$

which yields for the eigenvalues $E_1'$ and $E_2'$ of $H$:

$$E_1' = E_m - i\,\frac{\hbar\gamma_1}{4} + \kappa_1\,; \qquad E_2' = E_m - i\,\frac{\hbar\gamma_1}{4} - \kappa_1 \tag{14}$$

The eigenvectors of $K$ and $H$ are the same; a simple calculation$^2$ enables us to obtain these vectors $|\psi_1\rangle$ and $|\psi_2\rangle$:

$$|\psi_1\rangle = W_{12}\,|\varphi_1\rangle + \left(\kappa_1 + i\,\frac{\hbar\gamma_1}{4}\right)|\varphi_2\rangle$$

$$|\psi_2\rangle = W_{12}\,|\varphi_1\rangle + \left(-\,\kappa_1 + i\,\frac{\hbar\gamma_1}{4}\right)|\varphi_2\rangle \tag{15}$$

$^2$ For the calculation performed here, it is not indispensable to normalize $|\psi_1\rangle$ and $|\psi_2\rangle$. Note also that, since $H$ is not Hermitian, $|\psi_1\rangle$ and $|\psi_2\rangle$ are not orthogonal.





Assume that the system at time $t = 0$ is in the state $|\varphi_2\rangle$ (which would be stable in the absence of the coupling):

$$|\psi(t=0)\rangle = |\varphi_2\rangle = \frac{1}{2\kappa_1}\left[\,|\psi_1\rangle - |\psi_2\rangle\,\right] \tag{16}$$

Using (14), we see that, at time $t$, the state vector is:

$$|\psi(t)\rangle = \frac{1}{2\kappa_1}\;\mathrm{e}^{-iE_m t/\hbar}\,\mathrm{e}^{-\gamma_1 t/4}\left[\mathrm{e}^{-i\kappa_1 t/\hbar}\,|\psi_1\rangle - \mathrm{e}^{+i\kappa_1 t/\hbar}\,|\psi_2\rangle\right] \tag{17}$$

The probability $\mathcal{P}_{21}(t)$ of finding the system at time $t$ in the state $|\varphi_1\rangle$ is:

$$\mathcal{P}_{21}(t) = \left|\langle\varphi_1|\psi(t)\rangle\right|^2 = \frac{|W_{12}|^2}{4\,|\kappa_1|^2}\;\mathrm{e}^{-\gamma_1 t/2}\left|\mathrm{e}^{-i\kappa_1 t/\hbar} - \mathrm{e}^{+i\kappa_1 t/\hbar}\right|^2 \tag{18}$$

We shall distinguish between several cases:

(i) When the condition:

$$|W_{12}| \gg \frac{\hbar\gamma_1}{4} \tag{19}$$

is satisfied, we obtain directly, using (12):

$$\kappa_1 = -\kappa_2 = \sqrt{|W_{12}|^2 - \frac{\hbar^2\gamma_1^2}{16}} \tag{20}$$

and the eigenvalues $E_1'$ and $E_2'$ are given by:

$$E_1' = E_m + \sqrt{|W_{12}|^2 - \frac{\hbar^2\gamma_1^2}{16}} - i\,\frac{\hbar\gamma_1}{4}\,; \qquad E_2' = E_m - \sqrt{|W_{12}|^2 - \frac{\hbar^2\gamma_1^2}{16}} - i\,\frac{\hbar\gamma_1}{4} \tag{21}$$

$E_1'$ and $E_2'$ have the same imaginary part, but different real parts. The states $|\psi_1\rangle$ and $|\psi_2\rangle$ therefore have the same lifetime, $2/\gamma_1$, but different energies. Substituting (20) into (18), we obtain:

$$\mathcal{P}_{21}(t) = \frac{|W_{12}|^2}{|W_{12}|^2 - \hbar^2\gamma_1^2/16}\;\mathrm{e}^{-\gamma_1 t/2}\,\sin^2\!\left(\sqrt{|W_{12}|^2 - \frac{\hbar^2\gamma_1^2}{16}}\;\frac{t}{\hbar}\right) \tag{22}$$

The form of this result recalls Rabi's formula [cf. Chap. IV, equation (C-32)]. The function $\mathcal{P}_{21}(t)$ is represented by a damped sinusoid with time constant $2/\gamma_1$ (Fig. 1). Condition (19) thus expresses the fact that the coupling is sufficiently strong to make the system oscillate between the states $|\varphi_1\rangle$ and $|\varphi_2\rangle$ before the instability of the state $|\varphi_1\rangle$ can have a real effect.




Figure 1: Effect of a strong coupling between a stable state $|\varphi_2\rangle$ and an unstable state $|\varphi_1\rangle$. If the system is initially in the state $|\varphi_2\rangle$, the probability $\mathcal{P}_{21}(t)$ of finding it in the state $|\varphi_1\rangle$ at time $t$ presents damped oscillations.

(ii) If, on the other hand, the condition:

$$|W_{12}| \ll \frac{\hbar\gamma_1}{4} \tag{23}$$

is satisfied, we then have:

$$\kappa_1 = -\kappa_2 = i\,\sqrt{\frac{\hbar^2\gamma_1^2}{16} - |W_{12}|^2} \tag{24}$$

and:

$$E_1' = E_m - i\left(\frac{\hbar\gamma_1}{4} - \sqrt{\frac{\hbar^2\gamma_1^2}{16} - |W_{12}|^2}\right)\,; \qquad E_2' = E_m - i\left(\frac{\hbar\gamma_1}{4} + \sqrt{\frac{\hbar^2\gamma_1^2}{16} - |W_{12}|^2}\right) \tag{25}$$

The states $|\psi_1\rangle$ and $|\psi_2\rangle$ then have the same energy and different lifetimes. Formula (18) becomes:

$$\mathcal{P}_{21}(t) = \frac{|W_{12}|^2}{\hbar^2\gamma_1^2/16 - |W_{12}|^2}\;\mathrm{e}^{-\gamma_1 t/2}\,\sinh^2\!\left(\sqrt{\frac{\hbar^2\gamma_1^2}{16} - |W_{12}|^2}\;\frac{t}{\hbar}\right) \tag{26}$$

This time, $\mathcal{P}_{21}(t)$ is a sum of damped exponentials (Fig. 2). This result has a simple physical interpretation: condition (23) expresses the fact that the lifetime $\tau_1$ is so short that the system is completely damped before the coupling has had the time to make it oscillate between the states $|\varphi_1\rangle$ and $|\varphi_2\rangle$.

(iii) Finally, let us examine the case where we have exactly:

$$|W_{12}| = \frac{\hbar\gamma_1}{4} \tag{27}$$

We see then from (14) that the states $|\psi_1\rangle$ and $|\psi_2\rangle$ both have the same energy $E_m$ and the same lifetime $2/\gamma_1$.




Figure 2: When the coupling is weak, oscillations between the states $|\varphi_1\rangle$ and $|\varphi_2\rangle$ do not have time to occur.

Equations (22) and (26), in this case, take on indeterminate forms, which can be resolved and both yield:

$$\mathcal{P}_{21}(t) = \frac{|W_{12}|^2}{\hbar^2}\;t^2\,\mathrm{e}^{-\gamma_1 t/2} \tag{28}$$

Comment:

The preceding discussion is very similar to that of the classical motion of a damped harmonic oscillator. Conditions (19), (23) and (27) correspond respectively to weak, strong and critical damping.
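The three regimes (19), (23) and (27) are easy to visualize by evaluating formula (18) directly. The sketch below (not part of the text) works in units where $\hbar = 1$, with an arbitrary value of $\gamma_1$; the critical case is handled through the limit (28).

```python
import numpy as np

hbar = 1.0
gamma1 = 1.0                      # decay rate of |phi_1> (illustrative value)
t = np.linspace(0.0, 10.0, 2001)

def P21(W12, t):
    """Probability (18) of finding |phi_1> at time t, starting from |phi_2>."""
    kappa1 = np.sqrt(complex(W12 ** 2 - (hbar * gamma1) ** 2 / 16))
    if abs(kappa1) < 1e-12:       # critical case (27): use the limit (28)
        osc = (t / hbar) ** 2
    else:
        osc = np.abs(np.exp(-1j * kappa1 * t / hbar)
                     - np.exp(+1j * kappa1 * t / hbar)) ** 2 / (4 * abs(kappa1) ** 2)
    return W12 ** 2 * np.exp(-gamma1 * t / 2) * osc

for W12, label in [(2.0 * hbar * gamma1,  "strong coupling (19): damped oscillations"),
                   (0.05 * hbar * gamma1, "weak coupling (23): damped exponentials"),
                   (hbar * gamma1 / 4,    "critical case (27)")]:
    p = P21(W12, t)
    print(f"{label:45s} max P21 = {p.max():.4f} at t = {t[np.argmax(p)]:.2f}")
```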

References and suggestions for further reading: An important application of the phenomenon discussed in this complement is the shortening of the lifetime of a metastable state due to an electric field. See: Lamb and Retherford (3.11), App. II; Sobel’man (11.12), Chap. 8, § 28-5.





Complement JIV

Exercises

1. Consider a spin 1/2 particle of magnetic moment $\mathbf{M} = \gamma\,\mathbf{S}$. The spin state space is spanned by the basis of the $|+\rangle$ and $|-\rangle$ vectors, eigenvectors of $S_z$ with eigenvalues $+\hbar/2$ and $-\hbar/2$. At time $t = 0$, the state of the system is $|\psi(t=0)\rangle = |+\rangle$.

a. If the observable $S_x$ is measured at time $t = 0$, what results can be found, and with what probabilities?

b. Instead of performing the preceding measurement, we let the system evolve under the influence of a magnetic field parallel to $Oy$, of modulus $B_0$. Calculate, in the $\{|+\rangle, |-\rangle\}$ basis, the state of the system at time $t$.

c. At this time $t$, we measure the observables $S_x$, $S_y$, $S_z$. What values can we find, and with what probabilities? What relation must exist between $B_0$ and $t$ for the result of one of the measurements to be certain? Give a physical interpretation of this condition.

2. Consider a spin 1/2 particle, as in the previous exercise (using the same notation). . At time = 0, we measure and find +~ 2. What is the state vector immediately after the measurement?

(0)

. Immediately after this measurement, we apply a uniform time-dependent field parallel to . The Hamiltonian operator of the spin ( ) is then written: ()=

0(

)

Assume that 0 ( ) is zero for 0 and and increases linearly from 0 to 0 when 0 ( is a given parameter having the dimension of a time). Show that at time the state vector can be written: () =

1 [e 2

( )

+ + e

where ( ) is a real function of

( )

]

(to be calculated by the student).

. At a time = , we measure . What results can we find, and with what probabilities? Determine the relation that must exist between 0 and in order for us to be sure of the result. Give the physical interpretation.




3. Consider a spin 1/2 particle placed in a magnetic field B0 with components: =

1 2

0

1 2

0

=0 =

The notation is the same as that of exercise (1). . Calculate the matrix representing, in the Hamiltonian of the system.

+

basis, the operator

. Calculate the eigenvalues and the eigenvectors of

, the

.

. The system at time = 0 is in the state . What values can be found if the energy is measured, and with what probabilities? . Calculate the state vector ( ) at time . At this instant, is measured; what is the mean value of the results that can be obtained? Give a geometrical interpretation.

4. Consider the experimental device described in § B-2-b of Chapter IV (cf. Fig. 8): a beam of atoms of spin 1/2 passes through one apparatus, which serves as a “polarizer” in a direction which makes an angle with in the plane, and then through another apparatus, the “analyzer”, which measures the component of the spin. We assume in this exercise that between the polarizer and the analyzer, over a length of the atomic beam, a magnetic field B0 is applied which is uniform and parallel to . We call the speed of the atoms and = the time during which they are submitted to the field B0 . We set 0 = 0. . What is the state vector

of a spin at the moment it enters the analyzer?

1

. Show that when the measurement is performed in the analyzer, there is a probability equal to 12 (1 + cos cos 0 ) of finding +~ 2 and 12 (1 cos cos 0 ) of finding ~ 2. Give a physical interpretation. . (This question and the following one involve the concept of a density operator, defined in Complement EIII . The reader is also advised to refer to Complement EIV ). Show that the density matrix 1 of a particle which enters the analyzer is written, in the + basis:

1

=

1 2

1 + cos sin

Calculate Tr{ 1 density operator 492

cos

cos

sin + cos

0

sin

0

1

cos

}, Tr{ 1 } and Tr{ 1 describe a pure state?

1

sin cos

0

0

}. Give an interpretation. Does the




. Now assume that the speed of an atom is a random variable, and hence the time is known only to within a certain uncertainty ∆ . In addition, the field 0 is assumed to be sufficiently strong that 0 ∆ 1. The possible values of the product 0 are then (modulus 2 ) all values included between 0 and 2 , all of which are equally probable. In this case, what is the density operator 2 of an atom at the moment it enters the analyzer? Does 2 correspond to a pure case? Calculate the quantities Tr{ 2 }, Tr{ 2 } and Tr{ 2 }. What is your interpretation? In which case does the density operator describe a completely polarized spin? A completely unpolarized spin? Describe qualitatively the phenomena observed at the analyzer exit when varies from zero to a value where the condition 0 ∆ 1 is satisfied.

0

5. Evolution operator of a spin 1/2 (cf. Complement FIII ) Consider a spin 1/2, of magnetic moment M = S, placed in a magnetic field B0 of components = , = , = . We set: 0

=

B0

. Show that the evolution operator of this spin is: ( 0) = e where

is the operator:

=

1 [ ~

+

+

]=

1 [ 2

+

+

]

expressed as a function of the three Pauli matrices ment AIV ).

,

Calculate the matrix representing Show that:

basis of eigenvectors of

2

=

1 [ 4

2

+

2

+

2

0

]=

in the

+

and

(cf. Comple.

2

2

. Put the evolution operator into the form: ( 0) = cos

0

2

2

sin 0

0

2

. Consider a spin which at time = 0 is in the state (0) = + . Show that the probability P++ ( ) of finding it in the state + at time is: P++ ( ) = +

( 0) +

2




and derive the relation: 2

P++ ( ) = 1

+ 2 0

2

sin2

0

2

Give a geometrical interpretation.
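For readers who want to check the closed-form probability of this exercise numerically, here is a minimal sketch (not part of the original text). It assumes $\hbar = 1$, uses SciPy's matrix exponential for the evolution operator, and picks arbitrary illustrative values for the three Larmor frequencies.

```python
import numpy as np
from scipy.linalg import expm

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Illustrative Larmor frequencies (arbitrary units, hbar = 1)
wx, wy, wz = 0.3, 0.4, 1.2
w0 = np.sqrt(wx**2 + wy**2 + wz**2)
M = 0.5 * (wx * sx + wy * sy + wz * sz)      # M = (omega_x Sx + omega_y Sy + omega_z Sz)/hbar

for t in (0.7, 1.9, 3.4):
    U = expm(-1j * M * t)                    # evolution operator U(t, 0)
    P_num = abs(U[0, 0]) ** 2                # P++(t) = |<+|U(t,0)|+>|^2
    P_formula = 1 - (wx**2 + wy**2) / w0**2 * np.sin(w0 * t / 2) ** 2
    print(f"t = {t:4.1f}:  numerical {P_num:.6f}   closed form {P_formula:.6f}")
```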

6. Consider the system composed of two spin 1/2’s, S1 and S2 , and the basis of four vectors defined in Complement DIV . The system at time = 0 is in the state: (0) =

1 1 ++ + + 2 2

+

1 2

. At time = 0, S1z is measured; what is the probability of finding ~ 2? What is the state vector after this measurement? If we then measure 1 , what results can be found, and with what probabilities? Answer the same questions for the case where the measurement of 1 yielded +~ 2. . When the system is in the state (0) written above, 1 and 2 are measured simultaneously. What is the probability of finding opposite results? Identical results? . Instead of performing the preceding measurements, we let the system evolve under the influence of the Hamiltonian:

=

1 1

+

2 2

What is the state vector ( ) at time ? Calculate at time the mean values S1 and S2 . Give a physical interpretation. . Show that the lengths of the vectors S1 and S2 are less than ~ 2. What must be the form of (0) for each of these lengths to be equal to +~ 2?

7. Consider the same system of two spin 1/2’s as in the preceding exercise; the state space is spanned by the basis of four states . . Write the 4 4 matrix representing, in this basis, the eigenvalues and eigenvectors of this matrix?

1

operator. What are the

. The normalized state of the system is: =

++ +

+

+

+ +

where $\alpha$, $\beta$, $\gamma$ and $\delta$ are given complex coefficients. $S_{1x}$ and $S_{2x}$ are measured simultaneously; what results can be found, and with what probabilities? What happens to these probabilities if $|\psi\rangle$ is a tensor product of a vector of the state space of the first spin and a vector of the state space of the second spin?

• . Same questions for a measurement of

1

and

2


.

. Instead of performing the preceding measurements, we measure only 2 . Calculate, first from the results of and then from those of , the probability of finding ~ 2.

8. Consider an electron of a linear triatomic molecule formed by three equidistant atoms $A$, $B$, $C$.

We use $|\varphi_A\rangle$, $|\varphi_B\rangle$, $|\varphi_C\rangle$ to denote three orthonormal states of this electron, corresponding respectively to three wave functions localized about the nuclei of atoms $A$, $B$, $C$. We shall confine ourselves to the subspace of the state space spanned by $|\varphi_A\rangle$, $|\varphi_B\rangle$ and $|\varphi_C\rangle$. When we neglect the possibility of the electron jumping from one nucleus to another, its energy is described by the Hamiltonian $H_0$ whose eigenstates are the three states $|\varphi_A\rangle$, $|\varphi_B\rangle$, $|\varphi_C\rangle$, with the same eigenvalue $E_0$. The coupling between the states $|\varphi_A\rangle$, $|\varphi_B\rangle$, $|\varphi_C\rangle$ is described by an additional Hamiltonian $W$ defined by:

$$W\,|\varphi_A\rangle = -a\,|\varphi_B\rangle\,;\qquad W\,|\varphi_B\rangle = -a\,|\varphi_A\rangle - a\,|\varphi_C\rangle\,;\qquad W\,|\varphi_C\rangle = -a\,|\varphi_B\rangle$$

where $a$ is a real positive constant.

a. Calculate the energies and stationary states of the Hamiltonian $H = H_0 + W$.

b. The electron at time $t = 0$ is in the state ... . Discuss qualitatively the localization of the electron at subsequent times $t$. Are there any values of $t$ for which it is perfectly localized about atom $A$, $B$ or $C$?

9. A molecule is composed of six identical atoms $A_1$, $A_2$, ..., $A_6$ which form a regular hexagon. Consider an electron which can be localized on each of the atoms. Call $|\varphi_n\rangle$ the state in which it is localized on the $n$-th atom ($n = 1, 2, \ldots, 6$). The electron states will be confined to the space spanned by the $|\varphi_n\rangle$, assumed to be orthonormal.




(Figure: the six atoms $A_1, A_2, \ldots, A_6$ at the vertices of a regular hexagon.)

a. Define an operator $R$ by the following relations:

$$R\,|\varphi_1\rangle = |\varphi_2\rangle\,;\qquad R\,|\varphi_2\rangle = |\varphi_3\rangle\,;\qquad \ldots\,;\qquad R\,|\varphi_6\rangle = |\varphi_1\rangle$$

Find the eigenvalues and eigenstates of $R$. Show that the eigenvectors of $R$ form a basis of the state space.

b. When the possibility of the electron passing from one site to another is neglected, its energy is described by a Hamiltonian $H_0$ whose eigenstates are the six states $|\varphi_n\rangle$, with the same eigenvalue $E_0$. As in the previous exercise, we describe the possibility of the electron jumping from one atom to another by adding a perturbation $W$ to the Hamiltonian $H_0$; $W$ is defined by:

$$W\,|\varphi_1\rangle = -a\,|\varphi_6\rangle - a\,|\varphi_2\rangle\,;\qquad W\,|\varphi_2\rangle = -a\,|\varphi_1\rangle - a\,|\varphi_3\rangle\,;\qquad \ldots\,;\qquad W\,|\varphi_6\rangle = -a\,|\varphi_5\rangle - a\,|\varphi_1\rangle$$

Show that $R$ commutes with the total Hamiltonian $H = H_0 + W$. From this deduce the eigenstates and eigenvalues of $H$. In these eigenstates, is the electron localized? Apply these considerations to the benzene molecule.
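A quick numerical illustration of this exercise (not part of the original text): assuming the nearest-neighbour form of $W$ written above and an arbitrary value for the constant $a$, one can check that $R$ commutes with $H$ and recover the tight-binding spectrum $E_0 - 2a\cos(2\pi k/6)$ relevant to benzene.

```python
import numpy as np

N = 6                 # six sites on the hexagon
E0 = 0.0              # common on-site energy (illustrative)
a = 1.0               # positive coupling constant of the exercise (illustrative value)

# H = H0 + W with W|phi_n> = -a|phi_{n-1}> - a|phi_{n+1}> (indices mod 6)
H = E0 * np.eye(N)
for n in range(N):
    H[n, (n + 1) % N] = -a
    H[n, (n - 1) % N] = -a

# R sends each localized state to the next one: R|phi_n> = |phi_{n+1}>
R = np.zeros((N, N))
for n in range(N):
    R[(n + 1) % N, n] = 1.0

print("[H, R] = 0 ?", np.allclose(H @ R - R @ H, 0))
print("eigenvalues of H:", np.round(np.sort(np.linalg.eigvalsh(H)), 6))
# Expected: E0 - 2a*cos(2*pi*k/6), k = 0..5, i.e. E0 - 2a, E0 - a (twice), E0 + a (twice), E0 + 2a
```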

Exercise 9. Reference: Feynman III (1.2), § 15-4.


Chapter V

The one-dimensional harmonic oscillator

A. Introduction
 A-1. Importance of the harmonic oscillator in physics
 A-2. The harmonic oscillator in classical mechanics
 A-3. General properties of the quantum mechanical Hamiltonian
B. Eigenvalues of the Hamiltonian
 B-1. Notation
 B-2. Determination of the spectrum
 B-3. Degeneracy of the eigenvalues
C. Eigenstates of the Hamiltonian
 C-1. The $\{|\varphi_n\rangle\}$ representation
 C-2. Wave functions associated with the stationary states
D. Discussion
 D-1. Mean values and root mean square deviations of $X$ and $P$ in a state $|\varphi_n\rangle$
 D-2. Properties of the ground state
 D-3. Time evolution of the mean values

A. Introduction

A-1. Importance of the harmonic oscillator in physics

This chapter is devoted to the study of a particularly important physical system: the one-dimensional harmonic oscillator.


The simplest example of such a system is that of a particle of mass $m$ moving in a potential $V(x)$ which depends only on $x$ and has the form:

$$V(x) = \frac{1}{2}\,k x^2 \tag{A-1}$$

($k$ is a real positive constant). The particle is attracted towards the $x = 0$ plane [the minimum of $V(x)$, corresponding to positions of stable equilibrium] by a restoring force:

$$F_x = -\frac{\mathrm{d}V}{\mathrm{d}x} = -kx \tag{A-2}$$

which is proportional to the distance $x$ between the particle and the $x = 0$ plane ($x$ is an algebraic variable: $x \gtrless 0$). We know that in classical mechanics, the projection onto $Ox$ of the particle's motion is a sinusoidal oscillation about $x = 0$, of angular frequency:

$$\omega = \sqrt{\frac{k}{m}} \tag{A-3}$$

Actually, a large number of systems are governed (at least approximately) by the harmonic oscillator equations. Whenever one studies the behavior of a physical system in the neighborhood of a stable equilibrium position, one arrives at equations which, in the limit of small oscillations, are those of a harmonic oscillator (see § A-2). The results we shall derive in this chapter are applicable, therefore, to a whole series of important physical phenomena – for example, the vibrations of the atoms of a molecule about their equilibrium position, or the oscillations of atoms or ions of a crystalline lattice (phonons)$^1$. The harmonic oscillator is also involved in the study of the electromagnetic field. We know that in a cavity there exist an infinite number of possible stationary waves (normal modes of the cavity). The electromagnetic field can be expanded in terms of these modes, and it can be shown, using Maxwell's equations, that each of the coefficients of this expansion (which describe the state of the field at each instant) obeys a differential equation which is identical to that of a harmonic oscillator whose angular frequency is that of the associated normal mode. In other words, the electromagnetic field is formally equivalent to a set of independent harmonic oscillators (cf. Complement KV). The quantization of the field is obtained by quantizing these oscillators associated with the various normal modes of the cavity (cf. Chapter XIX). Recall, moreover, that it was the study of the behavior of these oscillators at thermal equilibrium (blackbody radiation) which, historically, led Planck to introduce, for the first time in physics, the constant which bears his name. We shall see (cf. Complement LV) that the mean energy of a harmonic oscillator in thermodynamic equilibrium at the temperature $T$ is different for classical and quantum mechanical oscillators.

The harmonic oscillator also plays an important role in the description of a set of identical particles which are all in the same quantum mechanical state (they must obviously be bosons, cf. Chap. XIV). As we shall see later, this is because the energy levels of a harmonic oscillator are equidistant, the spacing between two adjacent levels being equal to $\hbar\omega$. With the energy level labelled by the integer $n$ (situated at a distance $n\hbar\omega$ above the ground state) can then be associated a set of $n$ identical particles (or quanta), each possessing an energy $\hbar\omega$ (cf. Chapter XV). The transition of the oscillator from level $n$ to level $n+1$ or $n-1$ corresponds to the creation or annihilation of a quantum of energy $\hbar\omega$. In this chapter, we shall introduce the operators $a^\dagger$ and $a$, which enable us to describe this transition from level $n$ to level $n+1$ or $n-1$. These operators, respectively called "creation" and "annihilation" operators$^2$, are used throughout quantum statistical mechanics and quantum field theory$^3$.

The detailed study of the harmonic oscillator in quantum mechanics is therefore extremely important from a physical point of view. Moreover, we are dealing with a quantum mechanical system for which the Schrödinger equation can be solved rigorously. Having studied spin 1/2 and two-level systems in Chapter IV, we shall therefore now consider another simple example which illustrates the general formalism of quantum mechanics. We shall show in particular how to solve an eigenvalue equation by dealing only with the operators and the commutation relations (this technique will also be applied to angular momentum). We shall also study in a detailed way the motion of wave packets, particularly at the classical limit (cf. Complement GV on quasi-classical states).

In § A-2, we shall review some results related to the classical oscillator before stating (§ A-3) certain general properties of the eigenvalues of the Hamiltonian $H$. Then, in §§ B and C, we shall determine these eigenvalues and eigenvectors by introducing creation and annihilation operators and using only the consequences of the canonical commutation relation $[X, P] = i\hbar$, as well as the particular form of $H$. § D is devoted to a physical study of the stationary states of the oscillator and of the wave packets formed by linear superpositions of these stationary states.

$^1$ Complement AV is devoted to a qualitative study of some physical examples of harmonic oscillators.

A-2. The harmonic oscillator in classical mechanics

The potential energy $V(x)$ [formula (A-1)] is shown in Figure 1. The motion of the particle is governed by the dynamical equation:

$$m\,\frac{\mathrm{d}^2x}{\mathrm{d}t^2} = -\frac{\mathrm{d}V}{\mathrm{d}x} = -kx \tag{A-4}$$

The general solution of this equation is of the form:

$$x = x_M\,\cos(\omega t - \varphi) \tag{A-5}$$

where $\omega$ is defined by (A-3), and the constants of integration $x_M$ and $\varphi$ are determined by the initial conditions of the motion. The particle therefore oscillates sinusoidally about the point $O$, with an amplitude $x_M$ and an angular frequency $\omega$. The kinetic energy of the particle is:

$$T = \frac{1}{2}\,m\left(\frac{\mathrm{d}x}{\mathrm{d}t}\right)^{\!2} = \frac{p^2}{2m} \tag{A-6}$$

$^2$ Annihilation operators are also often called "destruction operators".

$^3$ The aim of quantum field theory is to describe interactions between particles in the relativistic domain, especially the interactions between electrons, positrons and photons. It is clear that creation and annihilation operators should play an important role, since such processes are indeed observed experimentally (absorption or emission of photons, pair creation...). The quantum theory of electromagnetism is introduced in Chapter XIX.

where $p = m\,\dfrac{\mathrm{d}x}{\mathrm{d}t}$ is the momentum of the particle. The total energy is:

$$E = T + V = \frac{p^2}{2m} + \frac{1}{2}\,m\omega^2 x^2 \tag{A-7}$$

Substituting solution (A-5) into this equation, we find:

$$E = \frac{1}{2}\,m\omega^2 x_M^2 \tag{A-8}$$

The energy of the particle is therefore time-independent (this is a general property of conservative systems) and can take on any positive (or zero) value, since $x_M$ is a priori arbitrary. If we fix the total energy $E$, the limits $x = \pm x_M$ of the classical motion can be determined from Figure 1 by taking the intersection of the parabola with the line parallel to $Ox$ of ordinate $E$. At these points $x = \pm x_M$, the potential energy is at a maximum and equal to $E$, and the kinetic energy is zero. On the other hand, at $x = 0$, the potential energy is zero and the kinetic energy is maximum.

Comment:

Consider an arbitrary potential $V(x)$ which has a minimum at $x = x_0$ (Fig. 2). Expanding the function $V(x)$ in a Taylor series in the neighborhood of $x_0$, we obtain:

$$V(x) = a + b\,(x - x_0)^2 + c\,(x - x_0)^3 + \ldots \tag{A-9}$$

Figure 1: The potential energy $V(x)$ of a one-dimensional harmonic oscillator. The amplitude of the classical motion of energy $E$ is $x_M$.


Figure 2: In the neighborhood of a minimum $x_0$, any potential $V(x)$ can be approximated by a parabolic potential (dashed line). In the potential $V(x)$, a classical particle of energy $E$ oscillates between $x_1$ and $x_2$.

The coefficients of this expansion are given by:

$$a = V(x_0)\,; \qquad b = \frac{1}{2}\left(\frac{\mathrm{d}^2V}{\mathrm{d}x^2}\right)_{\!x=x_0}; \qquad c = \frac{1}{3!}\left(\frac{\mathrm{d}^3V}{\mathrm{d}x^3}\right)_{\!x=x_0} \tag{A-10}$$

and the linear term in $(x - x_0)$ is zero since $x_0$ corresponds to a minimum of $V(x)$. The force derived from the potential $V(x)$ is, in the neighborhood of $x_0$:

$$F_x = -\frac{\mathrm{d}V}{\mathrm{d}x} = -2b\,(x - x_0) - 3c\,(x - x_0)^2 + \ldots \tag{A-11}$$

Since $x_0$ represents a minimum, the coefficient $b$ is positive.

The point $x = x_0$ corresponds to a stable equilibrium position for the particle: $F_x$ is zero for $x = x_0$; moreover, for $(x - x_0)$ sufficiently small, $F_x$ and $(x - x_0)$ have opposite signs since $b$ is positive. If the amplitude of the motion of the particle about $x_0$ is sufficiently small for the term in $(x - x_0)^3$ of (A-9) [and therefore the corresponding term in $(x - x_0)^2$ of (A-11)] to be negligible compared to the preceding ones, we have a harmonic oscillator, since the dynamical equation can then be approximated by:

$$m\,\frac{\mathrm{d}^2x}{\mathrm{d}t^2} \simeq -2b\,(x - x_0) \tag{A-12}$$

The corresponding angular frequency $\omega$ is related to the second derivative of $V(x)$ at $x = x_0$ by the formula:

$$\omega^2 = \frac{2b}{m} = \frac{1}{m}\left(\frac{\mathrm{d}^2V}{\mathrm{d}x^2}\right)_{\!x=x_0} \tag{A-13}$$

Since the amplitude of the motion must remain small, the energy of the harmonic oscillator will be low. For higher energies $E$, the particle will be in periodic but not sinusoidal motion between the limits $x_1$ and $x_2$ (Fig. 2). If we expand the function $x(t)$ giving the position of the particle in a Fourier series, we find not one but several sinusoidal terms; their frequencies are integral multiples of the lowest frequency. We then say that we are dealing with an anharmonic oscillator. Note also that, in this case, the period of the motion is not generally $2\pi/\omega$, where $\omega$ is given by formula (A-13).
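The harmonic approximation (A-12)–(A-13) is easy to try out numerically on any anharmonic potential. The sketch below (not from the text) uses an illustrative Morse curve, whose parameters are arbitrary, and extracts $\omega$ from the second derivative at the minimum by finite differences, as in formula (A-13).

```python
import numpy as np

# Illustrative anharmonic potential (a Morse curve); all parameters are arbitrary choices.
D, alpha, x0, m = 4.0, 1.2, 0.5, 1.0
V = lambda x: D * (1.0 - np.exp(-alpha * (x - x0))) ** 2

# Second derivative at the minimum by central finite differences, then formula (A-13)
h = 1e-4
V2 = (V(x0 + h) - 2 * V(x0) + V(x0 - h)) / h ** 2
omega = np.sqrt(V2 / m)

print("numerical  omega =", omega)
print("analytical omega =", alpha * np.sqrt(2 * D / m))   # exact result for the Morse curve
```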

A-3. General properties of the quantum mechanical Hamiltonian

In quantum mechanics, the classical quantities $x$ and $p$ are replaced respectively by the observables $X$ and $P$, which satisfy:

$$[X, P] = i\hbar \tag{A-14}$$

It is then easy to obtain the Hamiltonian operator $H$ of the system from (A-7):

$$H = \frac{P^2}{2m} + \frac{1}{2}\,m\omega^2 X^2 \tag{A-15}$$

Since $H$ is time-independent (conservative system), the quantum mechanical study of the harmonic oscillator reduces to the solution of the eigenvalue equation:

$$H\,|\varphi\rangle = E\,|\varphi\rangle \tag{A-16}$$

which is written, in the $\{|x\rangle\}$ representation:

$$\left[-\,\frac{\hbar^2}{2m}\,\frac{\mathrm{d}^2}{\mathrm{d}x^2} + \frac{1}{2}\,m\omega^2 x^2\right]\varphi(x) = E\,\varphi(x) \tag{A-17}$$

Before undertaking the detailed study of equation (A-16), let us indicate some important properties that can be deduced from the form (A-1) of the potential function:

($\alpha$) The eigenvalues of the Hamiltonian are positive. It can be shown in general (Complement MIII) that, if the potential function $V(x)$ has a lower bound, the eigenvalues $E$ of the Hamiltonian:

$$H = \frac{P^2}{2m} + V(X) \tag{A-18}$$

are greater than the minimum of $V(x)$. For the harmonic oscillator we are studying here, we have chosen the energy origin such that $V_{\min}$ is zero.

B. EIGENVALUES OF THE HAMILTONIAN

( ) The eigenfunctions of have a definite parity. This is due to the fact that the potential ( ) is an even function: (

)=

( )

(A-19)

We can then (cf. Complements FII and CV ) look for eigenfunctions of , in the representation, amongst the functions which have a definite parity (in fact, we shall see that the eigenvalues of are not degenerate; consequently, the wave functions associated with the stationary states are necessarily either even or odd). (

) The energy spectrum is discrete. Whatever the value of the total energy, the classical motion is limited to a bounded region of the axis (Fig. 1), and it can be shown (Complement MIII ) that in this case, the eigenvalues of the Hamiltonian form a discrete set. We shall derive these properties (in a more precise form) in the following sections. However, it is interesting to note that they can be obtained simply by applying to the harmonic oscillator some general theorems concerning one-dimensional problems.

B.

Eigenvalues of the Hamiltonian

We are now going to study the eigenvalue equation (A-16). First of all, using only the canonical commutation relation (A-14), we shall find the spectrum of the Hamiltonian written in (A-15). B-1.

Notation

We shall begin by introducing some useful notations. The ˆ and ˆ operators

B-1-a.

The observables and obviously have dimensions (those of a length and a momentum, respectively). Since has the dimension of the inverse of a time and ~, of an action (product of an energy and a time), it is easy to see that the observables ˆ and ˆ defined by: ˆ= ˆ=

~ 1 ~

(B-1)

are dimensionless. If we use these new operators, the canonical commutation relation will be written: [ ˆ ˆ] =

(B-2)

and the Hamiltonian can be put in the form: =~

ˆ

(B-3) 503

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

with: ˆ =1 2

ˆ 2 + ˆ2

(B-4)

We shall therefore seek the solutions of the eigenvalue equation: ˆ

=

(B-5)

where the operator ˆ and the eigenvalues are dimensionless. The index can belong to either a discrete or a continuous set, and the additional index enables us to distinguish between the various possible orthogonal eigenvectors associated with the same eigenvalue . B-1-b.

The ,

and

operators

If ˆ and ˆ were numbers and not operators, we could write the sum ˆ 2 + ˆ 2 appearing in expression (B-4) for ˆ in the form of a product of linear terms, and obtain ˆ )( ˆ + ˆ ). In fact, since ˆ and ˆ are non-commuting operators, ˆ 2 + ˆ 2 is not (ˆ ˆ )( ˆ + ˆ ). We shall show, however, that the introduction of operators equal to ( ˆ ˆ enables us to simplify considerably our search for proportional to ˆ + ˆ and ˆ eigenvalues and eigenvectors of ˆ . We therefore set4 : 1 ˆ ( + ˆ) 2 1 ˆ ˆ) = ( 2 =

(B-6a) (B-6b)

These formulas can immediately be inverted to yield: ˆ= 1 ( 2 ˆ=

2

(

+ )

(B-7a)

)

(B-7b)

Since ˆ and ˆ are Hermitian, and are not (because of the factor ), but are adjoints of each other. The commutator of and is easy to calculate from (B-6) and (B-2): [

]= =

1 2

ˆ+ ˆ ˆ ˆ ˆ

2

ˆ ˆ ˆ

2

(B-8)

that is: [

]=1

(B-9)

This relation is completely equivalent to the canonical commutation relation (A-14). 4 Until

now, we have designated operators by capital letters. However, to conform to standard usage, we shall use the small letters and for the operators (B-6).

504

B. EIGENVALUES OF THE HAMILTONIAN

Finally, we derive some simple formulas which will be useful in the rest of this chapter. We first calculate : 1 ˆ ( 2 1 = ( ˆ2 + 2 1 = ( ˆ2 + 2 =

ˆ )( ˆ + ˆ ) ˆ2 + ˆ ˆ ˆ2

ˆ ˆ)

1)

(B-10)

Comparing this with expression (B-4), we see that: ˆ =

+

1 1 = (ˆ 2 2

ˆ )( ˆ + ˆ ) + 1 2

(B-11)

Unlike the situation in the classical case, ˆ cannot be put in the form of a product of linear terms. The non-commutativity of ˆ and ˆ is at the origin of the additional term 1/2 that appears on the right-hand side of (B-11). Similarly, it can be shown that: 1 2

ˆ =

(B-12)

Let us now introduce the operator

defined by:

=

(B-13)

This operator is Hermitian since: =

( ) =

=

(B-14)

Moreover, according to (B-11): ˆ =

+

1 2

(B-15)

so that the eigenvectors of ˆ are eigenvectors of , and vice versa. Finally, let us calculate the commutators of with and : [ [

]=[ ]=[

]= ]=

[

]+[ [

]+[

] = ] =

(B-16)

that is: [ [

]=

(B-17a)

]=

(B-17b)

Our study of the harmonic oscillator will be based on the use of the , and operators. We have replaced the eigenvalue equation of , which we first wrote in the form (B-5), by that of : =

(B-18) 505

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

When this equation is solved, we shall know that the eigenvector of is also an eigenvector of with the eigenvalue = ( + 1 2)~ [formulas (B-3) and (B-15)]: = ( + 1 2)~

(B-19)

The solution of equation (B-18) will be based on the commutation relation (B-9), which is equivalent to the initial relation (A-14), and on formulas (B-17), which are consequences of it. B-2.

Determination of the spectrum

B-2-a.

.

Lemmas

Lemma I (property of the eigenvalues of The eigenvalues of the operator Consider an arbitrary eigenvector is positive or zero: 2

=

are positive or zero. of . The square of the norm of the vector

0

(B-20)

Let us then use definition (B-13) of = Since

)

:

=

(B-21)

is positive, comparison of (B-20) and (B-21) shows that: 0

.

(B-22)

Lemma II (properties of the vector Let be a (non-zero) eigenvector of We shall prove the following:

( ) If

= 0, the ket

( ) If

0, the ket

=0

) with the eigenvalue .

is zero.

is a non-zero eigenvector of

with the eigenvalue

1.

( ) According to (B-21), the square of the norm of is zero if = 0; now, the norm of a vector is zero if and only if this vector is zero. Consequently, if = 0 is an eigenvalue of , all eigenvectors 0 associated with this eigenvalue satisfy the relation:

0

=0

(B-23)

Let us now show that relation (B-23) is characteristic of these eigenvectors. Consider a vector which satisfies: =0

(B-24)

Multiply both sides of this equation from the left by = 506

=0

: (B-25)

B. EIGENVALUES OF THE HAMILTONIAN

Any vector which satisfies (B-24) is therefore an eigenvector of with the eigenvalue = 0. ( ) Now let us assume that is strictly positive. According to (B-21), the vector is then non-zero, since the square of its norm is not equal to zero. Let us show that is an eigenvector of . To do this, let us apply the operator relation (B-17a) to the vector : [

]

= =

(B-26)

= Therefore: =(

1)[

which shows that .

]

(B-27)

is an eigenvector of

with the eigenvalue

Lemma III (properties of the vector

)

Let be a (non-zero) eigenvector of We shall prove the following:

of eigenvalue .

()

is always non-zero.

( )

is an eigenvector of

with the eigenvalue

+ 1.

( ) It is easy to calculate the norm of the vector (B-13): 2

1.

, using formulas (B-9) and

= =

(

+ 1)

= ( + 1)

(B-28)

Since, according to lemma I, is positive or zero, the ket always has a non-zero norm and, consequently, is never zero. ( ) The proof of the fact that is an eigenvector of is analogous to that of lemma II ; starting from relation (B-17b) between operators, we obtain: [

]

= =

B-2-b.

The spectrum of

+

= ( + 1)

(B-29)

is composed of non-negative integers

Consider an arbitrary eigenvalue of and a non-zero eigenvector associated with this eigenvalue. According to lemma I, is necessarily positive or zero. First, let us assume to be non-integral. We are now going to show that such a hypothesis contradicts lemma I and must consequently be excluded. If is non-integral, we can always find an integer 0 such that: +1

(B-30) 507

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

Now let us consider the series of vectors: (B-31) According to lemma II, each of the vectors of this series (with 0 ) is non-zero and an eigenvector of with the eigenvalue (cf. Fig. 3). The proof is by iteration: is non-zero by hypothesis; is non-zero (since 0) and corresponds 1 , an eigenvector to the eigenvalue 1 of ...; is obtained when acts on of with the strictly positive eigenvalue + 1, since and [cf. (B-30)]. Now let act on the ket . Since 0 according to (B-30), the action of on (an eigenvector of with the eigenvalue 0) yields a non-zero vector (lemma II ). Moreover, again according to lemma II, +1 is an eigenvector of with the eigenvalue 1, which is strictly negative according to (B-30). If is non-integral, we can therefore construct a non-zero eigenvector of with a strictly negative eigenvalue. Since this is impossible, according to lemma I, the hypothesis of non-integral must be rejected. What now happens if: =

(B-32)

with a positive integer or zero? In the series of vectors (B-31), is non-zero and an eigenvector of with the eigenvalue 0. According to lemma II (§ ( )), we therefore have: +1

=0

(B-33)

The series of vectors obtained by repeated action of the operator on is therefore limited when is integral. It is then never possible to obtain a non-zero eigenvector of which corresponds to a negative eigenvalue. In conclusion, can only be a non-negative integer. Lemma III can then be used to show that the spectrum of indeed includes all positive or zero integers. We have already constructed an eigenvector of with an eigenvalue of zero ( ). All we must do is let ( ) act on such a vector in order to obtain an eigenvector of of eigenvalue , where is an arbitrary positive integer.

v–n

0

1 αn φ iv

Figure 3: Letting with eigenvalues

508

v–n+1

2 αn – 1

φ iv

act several times on the ket 1, 2 etc...

v–1

n–1

v

n α φ iv

n+1 φ iv

, we can construct eigenvectors of

B. EIGENVALUES OF THE HAMILTONIAN

If we then refer to formula (B-19), we conclude that the eigenvalues $E_n$ of the Hamiltonian $H$ are of the form:

$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{B-34}$$

with = 0, 1, 2, ... Therefore, in quantum mechanics, the energy of the harmonic oscillator is quantized and cannot take on any arbitrary value. Note also that the smallest value (the ground state) is not zero, but ~ 2 (see § D-2 below). B-2-c.
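Formula (B-34) can be illustrated numerically by representing $a$ and $a^\dagger$ as finite matrices in the $\{|\varphi_n\rangle\}$ basis. The truncation to a finite basis is an assumption of this sketch (not something done in the text); the diagonal of $H$ then directly displays the spectrum $(n + 1/2)\hbar\omega$.

```python
import numpy as np

nmax = 8                       # truncation of the {|phi_n>} basis (illustrative)
hbar = omega = 1.0             # work in units where hbar = omega = 1

# a|n> = sqrt(n)|n-1>  and  a†|n> = sqrt(n+1)|n+1>  in the number basis
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)
adag = a.T

N = adag @ a                                   # number operator
H = hbar * omega * (N + 0.5 * np.eye(nmax))    # H = hbar*omega*(N + 1/2), formula (B-15)

comm = a @ adag - adag @ a
print("[a, a†] = 1 (away from the truncation edge)?",
      np.allclose(comm[:-1, :-1], np.eye(nmax - 1)))
print("diagonal of H / (hbar*omega):", np.round(np.diag(H) / (hbar * omega), 3))
# -> 0.5, 1.5, 2.5, ...  i.e. E_n = (n + 1/2) hbar*omega, formula (B-34)
```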

Interpretation of the

and

operators

If we start with an eigenstate of corresponding to the eigenvalue = ( + 1 2)~ , application of the operator yields an eigenvector associated with the eigenvalue ~ , and application of yields, in the same way, the 1 = ( + 1 2)~ energy +1 = ( + 1 2)~ + ~ . For this reason, is said to be a creation operator and an annihilation operator (or destruction operator); their action on an eigenvector of makes an energy quantum ~ appear or disappear. B-3.

Degeneracy of the eigenvalues

We now show that the energy levels of the one-dimensional harmonic oscillator, given by equation (B-34), are not degenerate. B-3-a.

The ground state is non-degenerate

The eigenstates of associated with the eigenvalue 0 = ~ 2, that is, the eigenstates of associated with the eigenvalue = 0, according to lemma II of § B-2-a- , must all satisfy the equation: 0

=0

(B-35)

To find the degeneracy of the 0 level, all we must do is see how many linearly independent kets satisfy (B-35). Using definition (B-6a) of and relations (B-1), we can write (B-35) in the form: 1 2

+ ~

In the

~

0

=0

(B-36)

representation, this relation becomes: +

~

d d

0(

)=0

(B-37)

where: 0(

)=

(B-38)

0

Therefore we must solve a first-order differential equation. Its general solution is: 0(

)= e

1 2 ~

2

(B-39)

where is the constant of integration. The various solutions of (B-37) are all proportional to each other. Consequently, to within a multiplicative factor, there exists only one ket 2 is not degenerate. 0 that satisfies (B-35): the ground state 0 =~ 509

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

B-3-b.

All the states are non-degenerate

We have just seen that the ground state is not degenerate. Let us show by recurrence that this is also the case for all the other states. All we need prove is that, if the level = ( + 1 2)~ is not degenerate, the level = ( + 1 + 1 2)~ is not either. Let us therefore assume that there exists, to +1 within a constant factor, only one vector such that: =

(B-40)

Then consider an eigenvector = ( + 1)

+1

+1

corresponding to the eigenvalue

+ 1: (B-41)

+1

We know that the ket with the +1 is not zero and that it is an eigenvector of eigenvalue (cf. lemma II ). Since this ket is not degenerate by hypothesis, there exists a number such that: +1

=

(B-42)

It is simple to invert this equation by applying +1

=

to both sides: (B-43)

that is, taking (B-13) and (B-41) into account: +1

=

(B-44)

+1

We already knew that was an eigenvector of with the eigenvalue ( + 1); we see here that all kets associated with the eigenvalue ( + 1) are proportional to +1 . They are therefore proportional to each other: the eigenvalue ( + 1) is not degenerate. Thus, since the eigenvalue = 0 is not degenerate (see § B-3-a), the eigenvalue = 1 is not either, nor is = 2, etc...: all the eigenvalues of and, consequently, all those of , are non-degenerate. This enables us to write simply for the eigenvector of associated with the eigenvalue = ( + 1 2)~ . C.

Eigenstates of the Hamiltonian

In this section, we are going to study the principal properties of the eigenstates of the operator and of the Hamiltonian . C-1.

The

representation

We shall assume that and are observables, meaning their eigenvectors constitute a basis in the space , the state space of a particle in a one-dimensional problem (this could be proved by considering the wave functions associated with the eigenstates of , which we shall calculate in § C-2 below). Since none of the eigenvalues of (or of ) is degenerate (see § B-3), (or ) alone constitutes a C.S.C.O. in . 510

C. EIGENSTATES OF THE HAMILTONIAN

C-1-a.

The basis vectors in terms of

The vector 0

0

associated with

0

= 0 is the vector of

that satisfies:

=0

(C-1)

It is defined to within a constant factor; we shall assume 0 to be normalized, so the indeterminacy is reduced to a global phase factor of the form e , with real. According to lemma III of § B-2-a, the vector 1 which corresponds to = 1 is proportional to 0 : 1

=

1

(C-2)

0

We shall determine 1 by requiring 1 to be normalized and choosing the phase of 1 (relative to 0 ) such that 1 is real and positive. The square of the norm of 1 , according to (C-2), is equal to: 1

1

=

1

=

1

2

0

2

0

0

(

+ 1)

(C-3)

0

where (B-9) has been used. Since eigenvalue zero, we find: 1

1

=

1

2

0

is a normalized eigenstate of

=1

=

=

2

We require 2

2

1

= 1 and, consequently: (C-5)

0

Similarly, we can construct 2

2

from

1

: (C-6)

1

to be normalized and choose its phase such that

2

= = =2

2 2

with the

(C-4)

With the preceding phase convention, we have 1

=

2

1

2 2

1

2

is real and positive:

1

(

2

=1

1

=

+ 1)

1

(C-7)

Therefore: 2

=

1 2

1 ( )2 2

if we take (C-5) into account. This procedure can easily be generalized. If we know then the normalized vector is written: =

1

(C-8)

0

1

(which is normalized),

(C-9) 511

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

Since: 2

=

1 2

=

1

=1

(C-10)

we choose, with the same phase conventions as above: =

1

(C-11)

With these successive phase choices, we can obtain all the 1

=

1

=

1

=

1

1

1 1

1 ( ) 2

1

( )2

2

from

0

:

= (C-12)

0

that is: 1

=

C-1-b.

!

( )

(C-13)

0

Orthonormalization and closure relations

Since is Hermitian, the kets corresponding to different values of are orthogonal. Since each of them is also normalized, they satisfy the orthonormalization relation: =

of the

(C-14)

In addition, is an observable (we shall assume this here without proof); the set therefore constitutes a basis in . This is expressed by the closure relation: =1

(C-15)

Comment: It can be verified directly from expression (C-13) that the kets 1 ! !

=

0

are orthonormal: (C-16)

0

But: 0

=

1

=

1

=

512

1

(

)

(

+ 1)

1

0 1

0

1

0

(C-17)

C. EIGENSTATES OF THE HAMILTONIAN

1

(using the fact that = with the eigenvalue 0 is an eigenstate of Thus can we reduce the exponents of and by iteration. We obtain, finally: if

:

if

:

0

0

0

=

0

if

=

:

=

(

(

0

1)

1) ( 0

=

2

1

+ 1) (

0

1)

1).

0

0

(C-18a)

( )

0

(C-18b)

0

(C-18c)

2

1

0

The expression (C-18a) is zero because 0 = 0. Similarly, (C-18b) is equal to zero because 0 ( ) 0 can be considered to be the scalar product of 0 and the bra associated with . Finally, if we substitute (C-18c) into 0 , which is zero if (C-16), we see that is equal to 1.

C-1-c.

Action of the various operators

The observables and are linear combinations of the operators and [formulas (B-1) and (B-7)]. Consequently, all physical quantities can be expressed in terms of and . Now, the action of and on the vectors is especially simple [see equations (C-19) below]. In most cases, it is therefore desirable to use the representation to calculate the matrix elements and mean values of the various observables. With the phase conventions introduced in § C-1-a above, the action of the and operators on the vectors of the basis is given by:

=

+1

+1

(C-19a) =

1

(C-19b) We have already proved (C-19a): it suffices to replace by + 1 in equations (C-9) and (C-11). To obtain (C-19b), multiply both sides of (C-9) on the left by the operator and use (C-11): =

1 1

=

1

(

+ 1)

1

=

1

(C-20)

Comment:

The adjoint equations of (C-19a) and (C-19b) are: = =

+1

+1 1

(C-21a) (C-21b)

Note that decreases or increases by one unit depending on whether it acts on the ket or on the bra . Similarly, increases or decreases by one unit, depending on whether it acts on the ket or on the bra . 513

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

Starting with (C-19) and using (B-1) and (B-7), we immediately find the expressions for the kets and :

=

~

=

~

1 ( 2

+ )

=

(

)

=

2

The matrix elements of the , are therefore:

=

~ 2 ~ 2

,

and

+1

+1

+1

+1

=

~

1

(C-22b)

representation

(C-23b)

+1

~ 2

=

(C-22a)

1

(C-23a) +1

2

+1

operators in the

1

=

+

+1

+1

+

+1

1

1

(C-23c) (C-23d)

The matrices representing and are indeed Hermitian conjugates of each other, as can be seen from their explicit expressions:

0

( )=

514

1

0 2

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

3

0 0

0

0

(C-24a)

C. EIGENSTATES OF THE HAMILTONIAN

and: 0 1

0

0

0

0

0

0

0

0

0

0

0

0

0

0 ( )=

2

0

0

3

(C-24b)

0 0

0

0

0

+1 0

As for the matrices representing and , they are both Hermitian: the matrix associated with is, to within a constant factor, the sum of the two preceding ones; the matrix associated with is proportional to their difference, but the presence of the factor in (C-22b) re-establishes its Hermiticity. C-2.

Wave functions associated with the stationary states

We shall now use the representation and write the functions ( )= which then represent the eigenstates of the Hamiltonian. We have already determined the function 0 ( ) which represents the ground state 0 (cf. § B-3-a): 1 4 0(

)=

0

=

e

1 2 ~

2

(C-25)

~

The constant that appears before the exponential insures the normalization of 0 ( ). To obtain the functions ( ) associated with the other stationary states of the harmonic oscillator, all we need to do is use expression (C-13) for the ket and the fact that, in the representation, is represented by: 1 2

~ ~

since is represented by multiplication by , and obtain: ( )= =

= 1 !

1 2

1

( )

!

by

~ d [formula (B-6b)]. Thus we d

0

~ ~

d d

d d

0(

)

(C-26)

515

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

ħ

φ0

1/4



1

ħ

φ1



ħ

φ2



1

0

1/4



1



1 2

0

1/4



1 2 3

0

1 2 3

Figure 4: Wave functions associated with the first three levels of a harmonic oscillator.

Figure 5: Probability densities associated with the first three levels of a harmonic oscillator.
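The curves of Figures 4 and 5 can be generated with a few lines of code. The sketch below (not from the text) uses the physicists' Hermite polynomials available in NumPy, works in units where $\hbar = m = \omega = 1$ (an illustrative choice), and checks the normalization and the number of zeros of the first few $\varphi_n$.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval

hbar = m = omega = 1.0                 # convenient units (illustrative)
x = np.linspace(-5.0, 5.0, 1001)
xi = np.sqrt(m * omega / hbar) * x     # dimensionless coordinate

def phi(n, xi):
    """Normalized stationary wave function phi_n, built from the Hermite polynomial H_n."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                    # select H_n in the Hermite series
    norm = (m * omega / (np.pi * hbar)) ** 0.25 / np.sqrt(2.0 ** n * math.factorial(n))
    return norm * hermval(xi, coeffs) * np.exp(-xi ** 2 / 2)

for n in range(4):
    p = phi(n, xi)
    zeros = int(np.sum(np.diff(np.sign(p)) != 0))       # number of nodes should equal n
    print(f"n = {n}:  norm = {np.trapz(p**2, x):.6f}   number of zeros = {zeros}")
```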

that is: ( )=

1 2

1 2

~ !

1 4

~

~

d d

e

1 2 ~

2

1

(C-27) 2

It is easy to see from this expression that ( ) is the product of e 2 ~ and a polynomial of degree and parity ( 1) , called a Hermite polynomial (cf. Complements BV and CV ). A simple calculation gives the first several functions ( ): 1(

)=

2(

)=

3 1 4

4

e

1 2 ~

2

~ 1 4

4 ~

2

2

1 e

1 2 ~

2

(C-28)

~

These functions are shown in Figure 4, and the corresponding probability densities in Figure 5. Figure 6 gives the shape of the wave function ( ) and that of the probability density ( ) 2 for = 10. We see from these figures that when increases, the region of the axis in which ( ) takes on non-negligible values becomes larger. This corresponds to the fact, 516

C. EIGENSTATES OF THE HAMILTONIAN

in classical mechanics, that the amplitude of the particle’s motion increases with the energy [cf. Fig. 1 and relation (A-8)]. It follows that the mean value of the potential energy grows with [cf. comment ( ) of § D-1], since ( ), when is large, takes on non-negligible values in regions of the -axis where ( ) is large. Moreover, we see in these figures that the number of zeros of ( ) is (cf. Complement BV , where this property is derived). This implies that the mean kinetic energy of the particle increases with [cf. comment ( ) of § D-1], since this energy is given by:

1 2

2

=

~2 2

+

( )

d2 d 2

( )d

(C-29)

When the number of zeros of

( ) increases, the curvature of the wave function ind2 creases, and, in (C-29), the second derivative ( ) takes on larger and larger values. d 2 Finally, when is large, we observe (see, for example, Figure 6) that the probability density ( ) 2 is large for [where is the amplitude of the classical motion of energy ; cf. (A-8)]. This result is related to a feature of the motion predicted by classical mechanics: the classical particle has a zero velocity at = ; therefore, on the average, it spends more time in the neighborhood of these two points than in the center of the interval .

ħ φ10

1/4



0.6 0.4

xˆ a

ħ

2

φ10

1/2



0.3 0.2

xˆ b

– xˆM

–1

0

1

2

3

+ xˆM

Figure 6: Shape of the wave function (fig. a) and of the probability density (fig. b) for the = 10 level of a harmonic oscillator.

517

CHAPTER V

D.

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

Discussion

D-1.

Mean values and root mean square deviations of

and

in a state

Neither nor commutes with , and the eigenstates of are not eigenstates of or . Consequently, if the harmonic oscillator is in a stationary state , a measurement of the observable or the observable can, a priori, yield any result (since the spectra of and include all real numbers). We shall now calculate the mean values of and in such a stationary state and then their root mean square deviations ∆ and ∆ , which will enable us to verify the uncertainty relation. As we indicated in § C-1-c, we shall perform these calculations with the help of the operators and . As far as the mean values of and are concerned, the result follows directly from formulas (C-22), which show that neither nor has diagonal matrix elements: =0 =0

(D-1)

To obtain the root mean square deviations ∆ values of 2 and 2 : (∆ )2 =

2

(∆ )2 =

2

and ∆ , we must calculate the mean

(

)2 =

2

(

)2 =

2

(D-2)

Now, according to (B-1) and (B-7): 2

= =

2

~

(

~

(

2 2

+ )( 2

~ ( 2 ~ ( 2

= =

+

+

+ )(

+

)

2

)

)

2

The terms in 2 and is proportional to other hand: (

+ )

+

2

)

(D-3)

2

do not contribute to the diagonal matrix elements, since 2 2 to . On the 2 , and +2 ; both are orthogonal to

=

(2

+ 1)

=2 +1

(D-4)

Consequently: (∆ )2 =

2

=

(∆ )2 =

2

=

518

1 2 1 + 2 +

~

(D-5a)

~

(D-5b)

D. DISCUSSION

The product ∆ ∆



=

∆ +

1 2

is therefore equal to: (D-6)

~

We again find (cf. Complement CIII ) that it is greater than or equal to ~ 2. In fact, this lower bound is attained for = 0, that is, for the ground state (§ D-2 below).
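The result $\Delta X\,\Delta P = (n + 1/2)\hbar$ is simple to verify with a finite-matrix representation of $a$ and $a^\dagger$ (the truncation is chosen for illustration and is not used in the text); $X$ and $P$ are rebuilt from formulas (B-1) and (B-7).

```python
import numpy as np

nmax, hbar, m, omega = 30, 1.0, 1.0, 1.0            # truncated basis, convenient units

a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)
adag = a.T
X = np.sqrt(hbar / (2 * m * omega)) * (adag + a)     # from (B-1) and (B-7a)
P = 1j * np.sqrt(m * hbar * omega / 2) * (adag - a)  # from (B-1) and (B-7b)

for n in range(4):                                   # stay well below the truncation
    e = np.zeros(nmax)
    e[n] = 1.0                                       # the state |phi_n>
    dX = np.sqrt(e @ (X @ X) @ e - (e @ X @ e) ** 2)
    dP = np.sqrt(np.real(e @ (P @ P) @ e) - np.real(e @ P @ e) ** 2)
    print(f"n = {n}:  dX*dP = {dX * dP:.4f}   (n + 1/2)*hbar = {(n + 0.5) * hbar:.4f}")
```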

Comments:

( ) If

denotes the amplitude of the classical motion whose energy is given by = ( + 1 2)~ , it is easy to see, using (A-8) and (D-5a), that: ∆

=

1 2

(D-7)

Similarly, if momentum:

denotes the oscillation amplitude of the corresponding classical

=

(D-8)

we obtain: ∆

=

1 2

(D-9)

It is not surprising that ∆ is of the order of the interval [ + ] over which the classical motion occurs (cf. Fig. 1): we saw at the end of § C, that it is approximately inside this interval that ( ) takes on non-negligible values. Furthermore, it is easy to understand why, when increases, so does ∆ . For large , the probability density ( ) 2 has two symmetric peaks situated approximately at = . The root mean square deviation cannot be much smaller than the distance between these peaks, even if each of them is very sharp (cf. Chap. III, § C-5 and the discussion of § 1-b of Complement AIII ). An analogous argument can be set forth for ∆ (cf. Complement DV ). ( ) The mean potential energy of a particle in the state ( ) =

1 2

that is, since ( ) =

2

is:

2

(D-10)

is zero [cf. (D-1)]: 1 2

2

(∆ )2

(D-11)

Similarly, we could find the mean kinetic energy of this particle: 2

2

=

1 (∆ )2 2

(D-12) 519

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

Substituting relations (D-5) into (D-11) and (D-12), we obtain: 1 2 1 = 2

1 2 1 + 2

( ) = 2

2

+

~ = ~ =

2 (D-13)

2

The mean potential and kinetic energies are therefore equal. This is an illustration of the virial theorem (cf. exercise 10 of Complement LIII ). (

) A stationary state has no equivalent in classical mechanics: its energy is not zero although the mean values and are. Nevertheless, there is a certain analogy between the state and that of a classical particle whose position is given by (A-5) [where is related to the energy by relation (A-8)], but for which the initial phase of the motion is chosen at random (all values included between 0 and 2 have the same probability). The mean values of and are then zero, since: 2

1 2

=

cos(

1 2

=

)d = 0

0

(D-14)

2

sin(

)d = 0

0

Moreover, we find, for the root mean square deviations of the position and the momentum, values identical to those of the state [formulas (D-7) and (D-9)]: 2

2

= =

2

2

1 2 1 2

2

2

cos2 (

)d =

0 2

2 2

sin2 (

)d =

0

2

(D-15)

that is:

D-2.

=

2

=

2

(

)2 =

(

)2 =

2 2

(D-16)

Properties of the ground state

In classical mechanics, the lowest energy of the harmonic oscillator is obtained when the particle is at rest (zero momentum and kinetic energy) at the -origin ( = 0 and therefore zero potential energy). The situation is completely different in quantum mechanics: the minimum energy state is 0 , whose energy is not zero, and the associated wave function has a certain spatial extension, characterized by the root mean square deviation ∆ = ~ 2 . This essential difference between the quantum and classical results can be seen to have its source in the uncertainty relations, which forbid the simultaneous minimization 520

D. DISCUSSION

of the kinetic energy and the potential energy. As we pointed out in Complements CI and MIII , the ground state corresponds to a compromise in which the sum of these two energies is as small as possible. In the special case of a harmonic oscillator, it is possible to state these qualitative considerations semi-quantitatively, and thus find the order of magnitude of the energy and the spatial extension of the ground state. If the distance characterizes this spatial extension, the mean potential energy will be of the order of: 1 2 But ∆

2 2

is then equal to about ~ , so the mean kinetic energy is approximately: ~2

2

=

(D-17)

2

(D-18)

2

2

The order of magnitude of the total energy is therefore: =

~2

+

2

2

+

1 2

2 2

(D-19)

The variation of , and with respect to is shown in Figure 7. For small values of , prevails over ; the opposite occurs for large values of . The ground state therefore corresponds approximately to the minimum of the function (D-19); it is easy to see that this minimum occurs at: ~

(D-20)

and is equal to: (D-21)

~ We again find the correct orders of magnitude of

T

V

0

ξm

ξ

0

and ∆

in the state

0

.

Figure 7: Variation of the potential energy and of the kinetic energy with respect to a parameter characterizing the spatial extension of the wave function about = 0. Since the potential energy is at a minimum at = 0, is a function that increases with 2 ( ). On the other hand, according to Heisenberg’s uncertainty relation, the kinetic energy is a decreasing function of . The lowest possible total energy, obtained for = , results from a compromise in which the sum + (solid line) is at a minimum.

521

CHAPTER V

THE ONE-DIMENSIONAL HARMONIC OSCILLATOR

The harmonic oscillator possesses the pecularity that, because of the form of the potential ( ), the product ∆ ∆ actually attains its lower bound, ~ 2, in the ground state 0 [formula (D-6)]. This is related to the fact (cf. Complement CIII ) that the wave function of the ground state is Gaussian. D-3.

Time evolution of the mean values

Consider a harmonic oscillator whose state at = 0 is: (0) =

(0)

(D-22)

=0

( (0) is assumed to be normalized). Its state (D-54) of Chapter III: () =

( ) at

can be obtained by using rule

~

(0) e =0

=

(0) e

1 +2

(D-23)

=0

The mean value of any physical quantity

is therefore given as a function of time

by: ()

() =

(0)

(0)

e(

)

(D-24)

=0 =0

with: =

(D-25)

Since and are integers, the time evolution of the mean values involves only the frequency 2 and its various harmonics, which constitute the Bohr frequencies of the harmonic oscillator. Let us consider, in particular, the mean values of the observables and . According to formulas (C-22), the only non-zero matrix elements and are those for which = 1. Consequently, the mean values of and include only terms in e ; they are sinusoidal functions of time with angular frequency . This obviously relates to the classical solution of the harmonic oscillator problem. Moreover, as we pointed out in the discussion of Ehrenfest’s theorem (Chap. III, § D-1-d- ), the form of the harmonic oscillator potential implies that for all the mean values of and rigorously satisfy the classical equations of motion. Thus, according to general formulas (D-34) and (D-35) of Chapter III: d d d d 522

1 [ ~ 1 = [ ~ =

] = ] =

(D-26a) 2

(D-26b)

D. DISCUSSION

If we integrate these equations, we obtain: ()=

(0) cos

+

()=

(0) cos

+

1

(0) sin (0) sin

(D-27)

We again find the sinusoidal form indicated by formula (D-24).
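The purely sinusoidal evolution of the mean values can be checked directly: the sketch below (not from the text) propagates an arbitrary two-level superposition with formula (D-23) and compares $\langle X\rangle(t)$ with the classical expression (D-27); the chosen state and units are illustrative assumptions.

```python
import numpy as np

nmax, hbar, m, omega = 40, 1.0, 1.0, 1.0
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)
adag = a.T
X = np.sqrt(hbar / (2 * m * omega)) * (adag + a)
P = 1j * np.sqrt(m * hbar * omega / 2) * (adag - a)
En = hbar * omega * (np.arange(nmax) + 0.5)          # spectrum (B-34)

# An arbitrary superposition of |phi_0> and |phi_1> (illustrative choice)
c = np.zeros(nmax, complex)
c[0] = c[1] = 1 / np.sqrt(2)

x0 = np.real(np.conj(c) @ X @ c)                     # <X>(0)
p0 = np.real(np.conj(c) @ P @ c)                     # <P>(0)

for t in (0.0, 1.0, 2.5, 4.0):
    ct = c * np.exp(-1j * En * t / hbar)             # evolution of the coefficients, (D-23)
    x_qm = np.real(np.conj(ct) @ X @ ct)
    x_cl = x0 * np.cos(omega * t) + (p0 / (m * omega)) * np.sin(omega * t)   # (D-27)
    print(f"t = {t:3.1f}:  <X>(t) = {x_qm:+.6f}   classical formula = {x_cl:+.6f}")
```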

Comment:

It is important to note that this analogy with the classical situation appears only when (0) is a superposition of states of the type of (D-22), where several coefficients (0) are non-zero. If all these coefficients except one are equal to zero, the oscillator is in a stationary state and the mean values of all the observables are constant over time. It follows that, in a stationary state , the behavior of a harmonic oscillator is totally different from that predicted by classical mechanics, even if is very large (the limit of large quantum numbers). If we want to construct a wave packet whose average position oscillates over time, we must superpose different states (see Complement GV ). References and suggestions for further reading:

Dirac (1.13), § 34; Messiah (1.17), Chap. XII.


COMPLEMENTS OF CHAPTER V, READER'S GUIDE

AV : SOME EXAMPLES OF HARMONIC OSCILLATORS

Demonstrates, with some examples chosen from various fields, the importance of the quantum mechanical harmonic oscillator in physics. Semiquantitative and rather simple; recommended for a first reading.

BV : STUDY OF THE STATIONARY STATES IN THE {|x⟩} REPRESENTATION. HERMITE POLYNOMIALS

Technical study of the stationary wave functions of the harmonic oscillator. Intended to serve as a reference.

CV : SOLVING THE EIGENVALUE EQUATION OF THE HARMONIC OSCILLATOR BY THE POLYNOMIAL METHOD

Another method that yields the results of Chapter V. Reveals the relation between energy quantization and the behavior of the wave functions at infinity. Moderately difficult.

DV : STUDY OF THE STATIONARY STATES IN THE {|p⟩} REPRESENTATION

Shows that, in a stationary state of the harmonic oscillator, the momentum probability distribution has the same form as that of the position. Fairly simple.

EV : THE ISOTROPIC THREE-DIMENSIONAL HARMONIC OSCILLATOR

Generalization of the results of Chapter V to three dimensions. Recommended for a first reading, since it is simple and important.

FV : A CHARGED HARMONIC OSCILLATOR PLACED IN A UNIFORM ELECTRIC FIELD

A direct and simple application of the results of Chapter V (except for § 3, which uses the translation operator introduced in Complement EII ). Recommended for a first reading.

GV : COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

Detailed study of the “quasi-classical” states of the harmonic oscillator, which illustrates the relation between quantum and classical mechanics. Important because of its applications to the quantum theory of radiation. Moderately difficult; can be omitted in a first reading.

HV : NORMAL VIBRATIONAL MODES OF TWO COUPLED HARMONIC OSCILLATORS

Study, in the very simple case of two coupled harmonic oscillators, of the normal vibrational modes of a system. Recommended, since it is simple and physically important.

JV : VIBRATIONAL MODES OF AN INFINITE LINEAR CHAIN OF COUPLED HARMONIC OSCILLATORS; PHONONS
KV : VIBRATIONAL MODES OF A CONTINUOUS PHYSICAL SYSTEM. APPLICATION TO RADIATION; PHOTONS

JV , KV : Introduction, using simplified models, of concepts that are particularly important in physics. Rather difficult (graduate level); can be reserved for later study. JV : determination of the normal vibrational modes of a linear chain of coupled oscillators, leading to the notion of phonon, fundamental in solid state physics. KV : normal vibrational modes of a continuous system. A simple way to introduce photons in the quantum mechanical study of the electromagnetic field.

LV : THE ONE-DIMENSIONAL HARMONIC OSCILLATOR IN THERMODYNAMIC EQUILIBRIUM AT A TEMPERATURE T

Application of the density operator (introduced in Complement EIII) to a harmonic oscillator in thermal equilibrium. Important from a physical point of view, but requires knowledge of EIII.

MV : EXERCISES




Complement AV Some examples of harmonic oscillators

1. Vibration of the nuclei of a diatomic molecule
   1-a. Interaction energy of two atoms
   1-b. Motion of the nuclei
   1-c. Experimental observations of nuclear vibration
2. Vibration of the nuclei in a crystal
   2-a. The Einstein model
   2-b. The quantum mechanical nature of crystalline vibrations
3. Torsional oscillations of a molecule: ethylene
   3-a. Structure of the ethylene molecule C2H4
   3-b. Classical equations of motion
   3-c. Quantum mechanical treatment
4. Heavy muonic atoms
   4-a. Comparison with the hydrogen atom
   4-b. The heavy muonic atom treated as a harmonic oscillator
   4-c. Order of magnitude of the energies and spread of the wave functions

We mentioned in the introduction to Chapter V that the results obtained in the study of the harmonic oscillator are applicable to numerous cases in physics, especially those concerning small oscillations of a system about a position of stable equilibrium (where the potential energy is at a minimum). The aim of this complement is to describe some examples of such oscillations and to point out their physical importance: vibration of the nuclei in a diatomic molecule or a crystalline lattice, torsional oscillations in a molecule, motion of a muon inside a heavy nucleus. We do not intend to discuss these phenomena in great detail here. We shall confine ourselves to a simple, qualitative discussion.

1. Vibration of the nuclei of a diatomic molecule

1-a. Interaction energy of two atoms

The formation of a molecule from two neutral atoms occurs because the interaction energy V(r) of these two atoms has a minimum (r is the distance between them). The form of V(r) is shown in Figure 1. When r is very large, the two atoms do not interact and V(r) approaches a constant, which we shall choose as the energy origin. Then, as r decreases, V(r) varies approximately like −1/r⁶: the corresponding attractive forces are the Van der Waals forces (which we shall study in Complement CXI). When r becomes so small that the electronic wave functions overlap, V(r) decreases faster and passes through a minimum at r = r_e; it then increases and becomes very large as r approaches zero.


Figure 1: Form of the interaction potential V(r) between two atoms that can form a stable molecule. Classically, V₀ is the dissociation energy of the molecule and r_e the distance between the two nuclei in the equilibrium position. In quantum mechanics, one obtains vibrational states (the horizontal lines inside the well) whose energies are all greater than −V₀.

The minimum of V(r) is responsible for the phenomenon of the chemical bond that can form between the two atoms. We have already pointed out, in § C-2-c of Chapter IV (taking the H₂⁺ ion as an example), that the cause of this lowering of the energy is a delocalization phenomenon of the electronic states (quantum resonance) which allows the electrons to profit from the attraction of the two nuclei. The rapid rise of V(r) at small distances is due to the repulsion of the nuclei. If the nuclei were classical particles, they would have stable equilibrium positions separated by r = r_e. The depth V₀ of the potential well at r = r_e is called, classically, the dissociation energy of the molecule: it is, in fact, the energy that must be furnished to the two atoms in order to separate them. The larger V₀, the more stable the molecule. The theoretical and experimental determination of the curve of Figure 1 is a very important problem in atomic and molecular physics. We shall see that, by studying the vibrations of the nuclei, we get a certain amount of information about this curve.

Comment: (the Born-Oppenheimer approximation)

The quantum mechanical description of a diatomic molecule is actually a very complex problem; it involves finding the stationary states of a system of many interacting particles (the two nuclei and all the electrons). In general, it is impossible to solve the Schrödinger equation for such a system exactly. A significant simplification arises from the fact that the mass of the electrons is much smaller than that of the nuclei. It follows that one can, in a first approximation, study the two motions separately. One begins by determining the motion of the electrons for a fixed value of the distance r between the two nuclei; one thus obtains a series of stationary states for the electronic system, of energies E₁(r), E₂(r), ... Then one considers the ground state, of energy E₁(r), of the electronic system; when r varies because of the motion of the nuclei, the electronic system is assumed to remain in this ground state for all r. This means that the electronic wave function adapts itself instantaneously to any change in r: the electrons, which are very mobile, are said to follow "adiabatically" the motion of the nuclei. In the study of this motion, the electronic energy E₁(r) then plays the role of a potential energy of interaction between the two nuclei. This interaction depends on the internuclear distance r, and adds to the electrostatic repulsion Z₁Z₂e²/r of the nuclei (where Z₁ and Z₂ are the atomic numbers of the two nuclei; we have set e² = q²/4πε₀, where q is the charge of the electron). The total potential energy V(r) of the system of the two nuclei, which


enables us to determine their motion, is then:

V(r) = E₁(r) + Z₁Z₂ e²/r   (1)

It is this function that is shown in Figure 1.

1-b. Motion of the nuclei

α. Separation of the rotational and vibrational motions

We are thus faced with a problem that involves the motion of two particles, of masses m₁ and m₂, whose interaction is described by the potential V(r) of Figure 1, which depends only on the distance r between them. The problem is complicated by the existence of several degrees of freedom: vibrational (variation of r) and rotational (variation of the polar angles θ and φ which give the direction of the axis of the molecule). In addition, these degrees of freedom are coupled: when the molecule vibrates, its moment of inertia changes because of the variation of r, and the rotational energy is modified. If we confine ourselves to small amplitude vibrations, it can be shown that the coupling between vibrational and rotational degrees of freedom is negligible, since the relative variation of the moment of inertia is very small during the vibration. The problem is then reduced (as we shall see in detail in Complement FVII) to two independent problems: in the first place, the study of the rotation of a "dumbbell"¹ composed of two masses m₁ and m₂ separated by a fixed distance r_e; plus a one-dimensional problem (in which r is the only variable) involving a fictitious particle whose mass μ is equal to the reduced mass of m₁ and m₂ (cf. Chap. VII, § B):

1/μ = 1/m₁ + 1/m₂   (2)

moving in the potential V(r) of Figure 1. We must then solve the eigenvalue equation:

[ −(ℏ²/2μ) d²/dr² + V(r) ] φ(r) = E φ(r)   (3)

We shall concentrate on the latter problem here.

β. Vibrational states

If we confine ourselves to small amplitude oscillations, we can make a limited expansion of V(r) in the neighborhood of its minimum, at r = r_e:

V(r) = −V₀ + (1/2) V″(r_e)(r − r_e)² + (1/6) V‴(r_e)(r − r_e)³ + ...   (4)

The discussion in § A-2 of Chapter V shows that if we neglect terms of order higher than two in expression (4), we are left with the equation of a one-dimensional harmonic oscillator centered at r = r_e, of angular frequency:

ω = √( V″(r_e) / μ )   (5)

¹ We shall study this system (also called a "rigid rotator") quantum mechanically in Complement CVI, once we have introduced angular momentum.


The vibrational states of the molecule, shown by the horizontal lines in Figure 1, therefore have energies given by:

E_v = (v + 1/2) ℏω − V₀   (6)

where v = 0, 1, 2, ... (v is used instead of n in the notation of molecular vibrations). According to the discussion in § D-3 of Chapter V, the mean value of the distance between the two nuclei oscillates about r_e with a frequency of ω/2π, which can thus be seen to be the vibrational frequency of the molecule.

Comments:

(i) Even in the ground state, the wave function of a harmonic oscillator has a finite spread, of the order of √(ℏ/2μω) (cf. § D-2 of Chapter V). The distance between the two nuclei of the molecule in the vibrational ground state is therefore defined only to within √(ℏ/2μω). An important condition for the decoupling of the vibrational and rotational degrees of freedom is therefore that:

√(ℏ/2μω) ≪ r_e   (7)

(ii) When the reduced mass μ is known, the measurement of ω yields, according to (5), the second derivative V″(r_e). When the quantum number v increases, it is no longer possible to neglect the terms in (r − r_e)³ in expression (4) (which indicate the deviation of the potential well from a parabolic form). The oscillator then becomes anharmonic. Studying the effects of the term in (r − r_e)³ of (4) by perturbation theory (as we shall do in Complement AXI), one finds that the separation E_{v+1} − E_v of two neighboring states is not the same for large and small values of v. Studying the variation of E_{v+1} − E_v with respect to v enables us to obtain the coefficient V‴(r_e) of the term in (r − r_e)³. Thus we see how the study of the frequencies of molecular vibration enables us to define more precisely the form of the curve V(r) in the neighborhood of its minimum.

γ. Order of magnitude of the vibrational frequencies

Molecular vibrational frequencies are commonly expressed in cm⁻¹, by giving the inverse of the wavelength (expressed in cm) of an electromagnetic wave of the same frequency. Note that 1 cm⁻¹ corresponds to a frequency of 3 × 10¹⁰ Hertz and to an energy of 1.24 × 10⁻⁴ eV. The vibrational frequencies of diatomic molecules fall between several tens and several thousands of cm⁻¹. The corresponding wavelengths therefore go from a few microns to a few hundred microns, consequently falling in the infrared. Formula (5) shows that ω increases as μ decreases. This frequency also increases with V″(r_e), that is, with a greater curvature of the potential well at r = r_e. Since r_e is always of the same order of magnitude (a few Å), V″(r_e) increases with the depth V₀ of the well: ω therefore increases with the chemical stability. We shall consider some concrete illustrations of the preceding observations. The vibrational frequencies of the hydrogen and deuterium molecules (H₂ and D₂) are, respectively (not taking into account anharmonicity corrections):


ν̄(H₂) = 4401 cm⁻¹
ν̄(D₂) = 3112 cm⁻¹   (8)




The curve V(r) is the same in these two cases: the chemical bond between the two atoms depends only on the electronic atmosphere. However, the reduced mass of H₂ is half as large as that of D₂. According to (5), we must therefore have ν(H₂)/ν(D₂) = √2. This is in agreement with the experimental values (8). Now let us consider an example of two molecules that have about the same reduced mass but very different chemical stabilities. The molecule ⁷⁹Br⁸⁵Rb is chemically stable (halogen–alkali bond); its vibrational frequency is 181 cm⁻¹. Molecules of ⁸⁴Kr⁸⁵Rb have been observed recently in optical pumping experiments. Their chemical stability is much lower, because krypton, which is a rare gas, is practically inert from a chemical point of view (in fact, the cohesion of the molecule is due only to Van der Waals forces). These molecules have been found to have a vibrational frequency of the order of 13 cm⁻¹. The considerable difference between this figure and the preceding one is due solely to the difference in chemical stability of the two types of molecules, since the reduced masses are, to within a few per cent, practically the same.
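These conversions and the isotope argument can be checked in a few lines of Python; the sketch below is illustrative only and uses rounded values of the constants.

```python
import math

c = 2.998e10          # speed of light in cm/s
h_eV = 4.136e-15      # Planck constant in eV*s

nu_per_cm = c                     # 1 cm^-1  ->  frequency in Hz
E_per_cm = h_eV * c               # 1 cm^-1  ->  energy in eV
print(f"1 cm^-1 = {nu_per_cm:.2e} Hz = {E_per_cm:.2e} eV")    # ~3.0e10 Hz, ~1.24e-4 eV

# Isotope effect: same curve V(r), reduced mass of H2 half that of D2, so nu scales as 1/sqrt(mu)
nu_H2, nu_D2 = 4401.0, 3112.0     # vibrational wavenumbers quoted in (8), in cm^-1
print(f"nu(H2)/nu(D2) = {nu_H2/nu_D2:.3f}   sqrt(2) = {math.sqrt(2):.3f}")
```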

1-c. Experimental observations of nuclear vibration

We shall now explain how nuclear vibration can be detected experimentally. In particular, we shall consider the interaction of the molecule with an electromagnetic wave.

α. Infrared absorption and emission

First, let us assume the molecule to be heteropolar (composed of two different atoms). Since the electrons are attracted towards the more electronegative atom, the molecule generally has a permanent electric dipole moment D(r) which depends on the distance r between the two nuclei. Expanding D(r) in the neighborhood of the equilibrium position r = r_e, we obtain:

D(r) = d₀ + d₁(r − r_e) + ...   (9)

where d₀ and d₁ are real constants. When the molecule is in a linear superposition |ψ(t)⟩ of several stationary vibrational states |φ_v⟩, the mean value ⟨ψ(t)|D(r)|ψ(t)⟩ of its electric dipole moment oscillates about the value d₀ with a frequency of ω/2π. The oscillatory term arises from the mean value of the term d₁(r − r_e) of (9) (r plays the same role in our problem as the observable X of the harmonic oscillator studied in § D-3 of Chapter V). Now (r − r_e) has a non-zero matrix element between two states |φ_v⟩ and |φ_{v′}⟩ only when v − v′ = ±1. This selection rule enables us to understand why only one Bohr frequency, ω/2π, appears in the motion of the mean dipole moment [the harmonic frequencies evidently appear when one takes into account the anharmonicity of the potential and terms of higher order in expansion (9); their intensity is, however, much weaker]. This vibration of the electric dipole moment results in a coupling between the molecule and the electromagnetic field; the molecule can consequently absorb or emit radiation of frequency ω/2π. In terms of photons, the molecule can absorb a photon of energy ℏω and move from the state |φ_v⟩ to the state |φ_{v+1}⟩ (Fig. 2-a), or emit a photon of energy ℏω by going from |φ_v⟩ to |φ_{v−1}⟩ (Fig. 2-b).


Figure 2: Absorption (fig. a) or emission (fig. b) of a photon by a heteropolar molecule going from the vibrational state |φ_v⟩ to the state |φ_{v+1}⟩ or |φ_{v−1}⟩.

β. The Raman effect

Now let us consider a homopolar molecule (consisting of two identical atoms). Because of symmetry, the permanent electric dipole moment is then zero for all r, and the molecule is "inactive" in the infrared. Imagine that an optical wave of frequency Ω/2π strikes this molecule. This frequency, much higher than those considered previously, is able to excite the electrons of the molecule; under the effect of the optical wave, the electrons undergo forced oscillation and re-emit radiation of the same frequency in all directions. This is the well-known phenomenon of the molecular scattering of light (Rayleigh scattering)². What new phenomena are produced by the vibration of the molecule? What happens can be explained qualitatively in the following way. The electronic susceptibility³ of the molecule is generally a function of the distance r between the two nuclei. When r varies (recall that this variation is slow compared to the motion of the electrons), the amplitude of the induced electric dipole moment, which vibrates at a frequency of Ω/2π, varies. The time dependence of the dipole moment is therefore that of a sinusoid of frequency Ω/2π whose amplitude is modulated at the frequency ω/2π of the molecular vibration, which is much smaller (Fig. 3). The frequency distribution of the light emitted by the molecule is given by the Fourier transform of the motion of the electric dipole shown in Figure 3. It is easy to see (Fig. 4) that there exists a central line of frequency Ω/2π (Rayleigh scattering) and two shifted lines, of frequency (Ω − ω)/2π (Stokes Raman scattering) and frequency (Ω + ω)/2π (anti-Stokes Raman scattering). It is very simple to interpret these lines in terms of photons. Consider an optical photon of energy ℏΩ which strikes the molecule when it is in the state |φ_v⟩ (Fig. 5-a). If the molecule does not change vibrational states during the scattering process, the scattering is elastic. Because of conservation of energy, the scattered photon has the

² In Complement AXIII, we shall use quantum mechanics to study the forced motion of the electrons of an atom under the effect of incident light waves.

³ Under the effect of the field E₀ e^{−iΩt} of the incident optical wave, the electronic cloud of the molecule acquires an induced dipole moment D given by:

D = χ(Ω) E₀ e^{−iΩt}

χ(Ω) is, by definition, the electronic susceptibility of the molecule. The important point here is that χ depends on r.


same energy as the incident photon (Fig. 5-b: Rayleigh line). However, during the scattering process, the molecule can make a transition from the state |φ_v⟩ to the state |φ_{v+1}⟩. The molecule acquires an energy ℏω at the expense of the scattered photon, whose energy is therefore ℏ(Ω − ω) (Fig. 5-c): the scattering is inelastic (Stokes Raman line). Finally, the molecule may move from the state |φ_v⟩ to the state |φ_{v−1}⟩, in which case the scattered photon will have an energy of ℏ(Ω + ω) (Fig. 5-d: anti-Stokes Raman line).

Comments:

(i) The Raman effect can also be observed with heteropolar molecules.

(ii) The Raman effect has enjoyed a revival of interest because of the development of lasers. If, in the cavity of a laser oscillating at a frequency of Ω/2π, one places a cell filled with a substance that exhibits the Raman effect, one can, in certain cases, obtain an amplification (stimulated Raman effect) and hence a laser oscillation at the frequency (Ω − ω)/2π, where ω/2π is the vibrational frequency of the molecules in the cell (Raman laser). Thus, by varying this substance, one can vary the oscillation frequency of the laser.

(iii) The study of Raman and infrared spectra of molecules is useful in chemistry because it permits the identification of the various bonds which exist in a complex molecule. For example, the vibration frequency of a group of two carbon atoms depends on whether the bond between them is single, double or triple.

Figure 3: The vibration of a molecule modulates the amplitude of the oscillating electric dipole induced by an incident light wave.

Figure 4: Spectrum of the oscillations shown in Figure 3. In addition to the central line, whose frequency Ω/2π is the same as that of the incident light wave (Rayleigh line), two shifted lines appear at (Ω − ω)/2π and (Ω + ω)/2π (the Stokes and anti-Stokes Raman lines). The frequency shift is equal to the vibrational frequency of the molecule.
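The spectrum of Figure 4 can be illustrated by Fourier-transforming an amplitude-modulated signal like the one in Figure 3. The Python sketch below uses arbitrary illustrative frequencies (not molecular values), chosen so that every component fits a whole number of cycles in the window.

```python
import numpy as np

f_opt, f_vib, eps = 12.0, 1.0, 0.3     # illustrative optical and vibrational frequencies (Hz)
T, N = 10.0, 20_000                    # 10 s window -> every component completes whole cycles
t = np.linspace(0.0, T, N, endpoint=False)

# Amplitude-modulated dipole of Figure 3: the slow vibration modulates the fast induced oscillation
dipole = (1.0 + eps*np.cos(2*np.pi*f_vib*t)) * np.cos(2*np.pi*f_opt*t)

spectrum = np.abs(np.fft.rfft(dipole))
freqs = np.fft.rfftfreq(N, d=T/N)

peaks = np.sort(freqs[np.argsort(spectrum)[-3:]])
print(peaks)    # [11. 12. 13.]  ->  Rayleigh line at f_opt, Raman sidebands at f_opt -/+ f_vib
```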


Figure 5: Schematic representation of the scattering of a photon of energy ℏΩ by a molecule which is initially in the vibrational state |φ_v⟩ (fig. a): Rayleigh scattering without a change in the vibrational state (fig. b); Stokes or anti-Stokes Raman scattering with a change in the molecule's state from |φ_v⟩ to |φ_{v+1}⟩ (fig. c) or to |φ_{v−1}⟩ (fig. d).

2. Vibration of the nuclei in a crystal

2-a. The Einstein model

A crystal consists of a system of atoms (or ions) which are regularly distributed in space, forming a periodic lattice. For simplicity, let us choose a one-dimensional model in which we consider a linear chain of atoms. The average position of the nucleus of the qth atom is:

x_q⁰ = q d   (10)

where d is the distance between adjacent atoms (on the order of a few Å). Let U(x₁, x₂, ...) be the total potential energy of the crystal nuclei, which depends on their positions x₁, x₂, ... If |x_q − x_q⁰| is not too large, that is, if each nucleus is not too far from its equilibrium position, U(x₁, x₂, ...) has, in certain cases, the following simple form:

U(x₁, x₂, ...) ≃ U₀ + Σ_q (1/2) k₀ (x_q − x_q⁰)²   (11)

where U₀ and k₀ are real constants (with k₀ > 0). The absence of terms linear in (x_q − x_q⁰) shows that x_q⁰ is a stable equilibrium position for the nucleus (q) (a minimum of U). We add to (11) the total kinetic energy:

T = Σ_q p_q²/2m   (12)




where p_q is the momentum of the nucleus (q), of mass m. The total Hamiltonian of the system is, to within the constant U₀, a sum of Hamiltonians of one-dimensional harmonic oscillators centered at each nucleus (q):

H = U₀ + Σ_q [ p_q²/2m + (1/2) k₀ (x_q − x_q⁰)² ]   (13)

Consequently, in this simplified model, each nucleus vibrates about its equilibrium position independently of its neighbors, with an angular frequency:

ω = √(k₀/m)   (14)

As in the case of the diatomic molecule, ω increases when m decreases and when the curvature k₀ of the potential attracting the nucleus towards its equilibrium position increases.

Comment:

In the simple model we have just presented, each nucleus vibrates independently of the others. This is because the proposed potential does not contain any terms which depend simultaneously on several of the variables x_q, as it would if it described internuclear interactions. This model is not realistic, since such interactions do, in fact, exist. In Complement JV, we shall present a more elaborate model which takes into account the coupling between each nucleus and its two nearest neighbors. We shall see that it is still possible, in this model, to put the total Hamiltonian of the system in the form of a sum of Hamiltonians of independent harmonic oscillators.

2-b. The quantum mechanical nature of crystalline vibrations

Despite its very schematic character, the Einstein model enables us to understand a certain number of phenomena related to the quantum mechanical nature of crystalline vibrations. The low temperature behavior of the constant volume specific heat, which cannot be explained using classical mechanics, will be described in Complement LV in connection with the study of the properties of a harmonic oscillator in thermodynamic equilibrium. In the present complement, we shall discuss a spectacular effect related to the finite spread of the wave functions associated with the position of each atom in the ground state.

At absolute zero, under a pressure of one atmosphere, all substances except helium are solids. To solidify helium, it is necessary to apply a pressure of at least 25 atmospheres. Can this peculiarity be explained qualitatively? First let us try to understand the phenomenon of the melting of an ordinary substance. At absolute zero, the atoms are practically localized at their equilibrium positions; the spread Δx of their wave functions about the x_q⁰ is given by [cf. formula (D-5a) of Chapter V]:

Δx ≃ √(ℏ/2mω) = (ℏ²/4mk₀)^{1/4}   (15)

Figure 6: Plane structure of the ethylene molecule.

[where we have used expression (14) for ω]. Δx is, in general, very small. When the crystal is heated, the nuclei move into higher and higher vibrational states: in classical language, they vibrate with a larger and larger amplitude; in quantum mechanical language, the spread of their wave functions increases [with the square root of the vibrational quantum number – see formula (D-5a) of Chapter V]. When this spread is no longer negligible with respect to the interatomic distance d, the crystal melts (see § 4-c of Complement LV, in which this phenomenon is studied more quantitatively).

It is impossible to solidify helium at ordinary pressures. This corresponds to the fact that, even at absolute zero, the spread of the wave function given by (15) is not negligible compared to d. This results from the fact that the mass of helium is very small and its chemical affinity very weak (the curvature k₀ of the potential in the neighborhood of each minimum is very small, since the potential wells are very shallow). Both factors produce the same effect in formula (15): a large spread Δx. Now, an increase in the pressure results in an increase in k₀ and therefore in ω; consequently, Δx decreases. This is due to the fact that, at high pressures, each helium atom is "wedged" between its neighbors: the smaller the average distance between these neighbors (the higher the pressure), the sharper the potential minimum (the greater k₀). Thus we see how an increase in the pressure makes the solidification of helium possible.
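Formula (15) can be made concrete with a rough estimate. The Python sketch below is purely illustrative: the force constants k₀ and the distance d are invented order-of-magnitude stand-ins rather than measured values; the point is only that a small mass and a shallow potential (the helium case) give a zero-point spread Δx that is no longer small compared with d.

```python
import numpy as np

hbar = 1.055e-34   # J s
u = 1.661e-27      # atomic mass unit, kg

def zero_point_spread(mass_kg, k0):
    """Delta_x = (hbar^2 / (4 m k0))^(1/4), formula (15)."""
    return (hbar**2 / (4.0 * mass_kg * k0)) ** 0.25

# Hypothetical order-of-magnitude inputs (illustration only; k0 in N/m)
cases = {
    "helium-like (light, weakly bound)": (4*u, 0.05),
    "heavy-atom-like (stiff lattice)":   (100*u, 10.0),
}
d = 3.0e-10    # assumed interatomic distance, m

for label, (m, k0) in cases.items():
    dx = zero_point_spread(m, k0)
    print(f"{label:35s}  Delta_x = {dx*1e10:.2f} A   Delta_x/d = {dx/d:.2f}")
# The light, weakly bound case gives Delta_x/d of order 0.3, no longer negligible.
```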

3. Torsional oscillations of a molecule: ethylene

3-a. Structure of the ethylene molecule C₂H₄

The structure of the molecule C₂H₄ is well known: the six atoms of the molecule lie in the same plane (Fig. 6) and the angles between the various C–H and C–C bonds are close to 120°. Now imagine that, without changing the relative positions of the bonds of each carbon atom, we rotate one of the CH₂ groups, about the C–C axis, through an angle α with respect to the other one. Figure 7 represents the molecule as seen along the C–C axis: the C–H bonds of one CH₂ group are shown in solid lines and those of the other one in dashed lines. How does the potential energy V(α) of the molecule vary with respect to α? Since the stable structure of the molecule is planar, the angle α = 0 must correspond to a minimum of V(α). It is also clear that α = π corresponds to another minimum



Figure 7: Torsion of the ethylene molecule (seen along the C–C axis): one of the CH₂ groups has rotated with respect to the other one through an angle α about the C–C axis.

of V(α), since the two structures associated with α = 0 and α = π are indistinguishable. V(α) therefore has the form shown in Figure 8 [α varies from −π/2 to 3π/2; V(0) is chosen as the energy origin]. The two stable positions α = 0 and α = π are separated by a potential barrier of height V₀. The potential of Figure 8 is often approximated by the simple formula:

V(α) = (V₀/2)(1 − cos 2α)   (16)

Comment:

Quantum mechanics enables us to interpret all the features of the C₂H₄ molecule which we have just described. In this molecule, each carbon atom has four valence electrons. Three of these electrons (the σ electrons) are found to have wave functions that are symmetrical about three coplanar lines making angles of 120° with each other and defining the directions of the chemical bonds (Fig. 6). These wave functions overlap those of the electrons of the neighboring atoms to a considerable extent, and it is this overlap that ensures the stability of the C–H bonds and of part of the C–C bond (this phenomenon is called "sp² hybridization" and will be studied in greater detail in Complement EVII). The last valence electron of each carbon atom (the π electron) has a wave function which is symmetrical about a line passing through C and perpendicular to the plane defined by C and its three neighbors. The overlap of the wave functions of the two π electrons is maximum and, consequently, the chemical stability of the double bond is greatest when the two lines associated with the π electrons are parallel, that is, when the six atoms of the molecule are in the same plane. The structure of Figure 6 is thus entirely explained.

Figure 8: The potential energy V(α) of the molecule depends on the torsion angle α; V(α) is minimal for α = 0 and α = π (planar structures).


Figure 9: To write the classical equations of motion, we denote by α₁ and α₂ the angles formed by the planes of the two CH₂ groups with a fixed plane.

Since V(α) can be approximated by a parabola in the neighborhood of each of its two minima, the molecule performs torsional oscillations about its two stable equilibrium positions. We now examine them. First, we shall review rapidly the corresponding classical equations.

3-b. Classical equations of motion

We denote by α₁ and α₂ the angles formed by the planes of the two CH₂ groups with a fixed plane passing through the C–C axis (Fig. 9). The angle α in Figure 7 is obviously:

α = α₁ − α₂   (17)

Let I be the moment of inertia of one of the CH₂ groups with respect to the C–C axis. Since the potential energy depends only on α = α₁ − α₂, the dynamical equations describing the rotation of each group are written:

I d²α₁/dt² = −(∂/∂α₁) V(α₁ − α₂) = −dV/dα
I d²α₂/dt² = −(∂/∂α₂) V(α₁ − α₂) = +dV/dα   (18)

Adding and subtracting these two equations, we obtain:

d²(α₁ + α₂)/dt² = 0   (19a)

I d²α/dt² = −2 dV/dα   (19b)

Equation (19a) indicates that the entire molecule can rotate freely about the C–C axis independently of the torsional motion: the angle (α₁ + α₂)/2 of the plane bisecting the planes of the two CH₂ groups is a linear function of time. Equation (19b) describes the torsional motion (rotation of one group with respect to the other). Let us consider this motion in the immediate neighborhood of one of the stable equilibrium positions, α = 0. We expand expression (16) in the neighborhood of α = 0:

V(α) ≃ V₀ α²   (20)

Substituting (20) into (19b), we obtain:

d²α/dt² + (4V₀/I) α = 0   (21)

We recognize (21) as the equation of a one-dimensional harmonic oscillator (α is the only variable) of angular frequency:

ω = 2 √(V₀/I)   (22)

For the C₂H₄ molecule, the torsional frequency is of the order of 825 cm⁻¹.
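The passage from (16) to (20), (21) and (22) can be checked symbolically. The short sketch below, using sympy, is an illustration only:

```python
import sympy as sp

alpha, V0, I = sp.symbols('alpha V0 I', positive=True)

V = V0/2 * (1 - sp.cos(2*alpha))                 # potential (16)
V_small = sp.series(V, alpha, 0, 3).removeO()
print(V_small)                                    # V0*alpha**2  -> expansion (20)

# Equation (19b): I * alpha'' = -2 dV/dalpha; with (20) this is alpha'' + (4 V0/I) alpha = 0
restoring = sp.simplify(-2*sp.diff(V_small, alpha) / I)
omega_sq = sp.simplify(-restoring/alpha)          # coefficient of alpha in -alpha''
print(sp.sqrt(omega_sq))                          # 2*sqrt(V0/I)  -> formula (22)
```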

3-c. Quantum mechanical treatment

In the neighborhood of its two equilibrium positions α = 0 and α = π, the molecule possesses "torsional states" of quantized energy E_n = (n + 1/2)ℏω, with n = 0, 1, 2, ... In a first approximation, each energy level E_n is therefore doubly degenerate, since to each one there correspond two states |φ_n⟩ and |φ′_n⟩ whose wave functions differ only in that one is centered at α = 0 and the other at α = π (Fig. 10-a and 10-b). In fact, we must also take into account a typically quantum mechanical effect: the tunnel effect across the potential barrier separating the two minima (Fig. 8). We have already encountered a situation of this type, in Complement GIV, in connection with the inversion of the NH₃ molecule. Calculations analogous to those of that complement would show that the degeneracy between the two states |φ_n⟩ and |φ′_n⟩ is removed by the tunnel effect. Thus, for each value of n, two stationary states, |ψ_n⁺⟩ and |ψ_n⁻⟩, appear (to a first approximation, they are the symmetric and antisymmetric linear combinations of |φ_n⟩ and |φ′_n⟩). The larger n (that is, the closer the energy E_n is to V₀, and hence the more important the tunnel effect), the greater their energy difference ℏδ_n. However, ℏδ_n is always much smaller than the distance ℏω between adjacent levels E_n and E_{n+1} (Fig. 11). For the mean value of the angle α, quantum mechanics therefore predicts the following motion: rapid oscillations of frequency ω/2π about one of the two values α = 0 and α = π, upon which are superposed much slower oscillations between α = 0 and α = π, at the Bohr frequencies δ₀/2π, δ₁/2π, δ₂/2π, ...
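This doublet structure can be illustrated numerically by diagonalizing a Hamiltonian of the form −κ d²/dα² + (V₀/2)(1 − cos 2α) on a periodic grid. In the Python sketch below, κ stands in for ℏ²/2I_eff, and both κ and V₀ are arbitrary illustrative values (not molecular ones), chosen only so that a few doublets fit below the barrier.

```python
import numpy as np

kappa, V0, N = 1.0, 60.0, 400                    # illustrative parameters only
alpha = np.linspace(0.0, 2*np.pi, N, endpoint=False)
h = alpha[1] - alpha[0]

V = 0.5*V0*(1.0 - np.cos(2*alpha))               # potential (16): wells at alpha = 0 and pi

# Kinetic term: periodic second-difference matrix
T = np.diag(np.full(N, -2.0)) + np.diag(np.ones(N-1), 1) + np.diag(np.ones(N-1), -1)
T[0, -1] = T[-1, 0] = 1.0                        # periodic boundary conditions in alpha
H = -kappa*T/h**2 + np.diag(V)

E = np.linalg.eigvalsh(H)[:8]
for k in range(0, 8, 2):
    print(f"doublet n={k//2}:  E = {E[k]:8.3f}, {E[k+1]:8.3f}   splitting = {E[k+1]-E[k]:.3e}")
# The levels come in close pairs; the splitting (tunnel effect) grows rapidly towards the barrier top.
```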

Comment:

States of course exist for which the energy is greater than the height V₀ of the potential barrier of Figure 8. These states correspond to a rotational kinetic energy




Figure 10: When one neglects the tunnel effect across the potential barriers at α = π/2 and α = 3π/2, one can find torsional states of the molecule localized in the wells centered at α = 0 (fig. a) and α = π (fig. b).

that is large enough for one of the CH2 groups to be considered as rotating almost freely with respect to the other one (while being, nevertheless, periodically slowed down and accelerated by the potential of Figure 8).

The ethane molecule C₂H₆ behaves in this way. The absence of π electrons in this molecule permits a much freer rotation of one of the CH₃ groups with respect to the other (the potential barrier V₀ is much lower). In this case the potential V(α), which tends to oppose the free rotation of one of the CH₃ groups with respect to the other, has a period of 2π/3 because of symmetry.


Figure 11: The tunnel effect removes the degeneracy of the energy levels shown in Figure 10. As one approaches the top of the barrier, this phenomenon becomes more important (δ₁ > δ₀). The states |ψ_n⁺⟩ and |ψ_n⁻⟩ are the new stationary states.

4. Heavy muonic atoms

The muon (sometimes called, for historical reasons, the "μ meson") is a particle which has the same properties as the electron except that its mass is about 207 times greater⁴. In particular, it is not sensitive to strong interactions, and its coupling with nuclei is essentially electromagnetic. A muon which has been slowed down in matter can be attracted by the Coulomb field of an atomic nucleus and can form a bound state with that nucleus. The system thus constituted is called a muonic atom.

4-a. Comparison with the hydrogen atom

In Chapter VII (§ C), we shall study the bound states of two particles of opposite charge and, in particular, those of the hydrogen atom. We shall see that the results of quantum mechanics concerning the energies of the bound states are the same as those of the Bohr model (Chap. VII, § C-2). Similarly, the spread of the wave functions which describe these bound states is of the order of the Bohr orbital radius. Let us therefore begin by using this simple model to calculate the energies and spreads of the first bound states of a muon in the Coulomb field of a heavy atom such as lead (Z = 82, A = 207). If we consider the nucleus to be infinitely heavy, the nth Bohr orbit has an energy of:

E_n = −(Z² m e⁴ / 2ℏ²) (1/n²)   (23)

where Z is the atomic number of the nucleus, e² = q²/4πε₀ (where q is the electron charge), and m represents the mass of the electron or of the muon, depending on the case. When one goes from hydrogen to the muonic atom under study here, |E_n| is multiplied by a factor of Z² (m_μ/m_e) = (82)² × 207 ≃ 1.4 × 10⁶. From this we deduce that, for the

⁴ The muon is unstable: it decays into an electron and two neutrinos.

muonic atom:

E₁ ≃ −19 MeV
E₂ ≃ −4.7 MeV   (24)

The radius of the nth Bohr orbit is given by:

r_n = n² ℏ² / (Z m e²)   (25)

For hydrogen, r₁ ≃ 0.5 Å. Here, this number must be divided by Z (m_μ/m_e) = 82 × 207 ≃ 1.7 × 10⁴, which gives:

r₁ ≃ 3 × 10⁻¹³ cm
r₂ ≃ 12 × 10⁻¹³ cm   (26)

In the preceding calculations, we have implicitly assumed the nucleus to be point-like (in the Bohr model and in the theory presented in Chapter VII, § C, the potential energy is taken equal to −Ze²/r). The small values found for r₁ and r₂ [formulas (26)] show us that this viewpoint is not at all valid for a heavy muonic atom. The lead nucleus has a non-negligible radius ρ₀, on the order of 8.5 × 10⁻¹³ cm (recall that the radius of a nucleus increases with A^{1/3}). The preceding qualitative calculation therefore leaves us with the impression that the spread of the wave functions of the muon may be smaller than the nucleus⁵. Consequently, we must reconsider the problem completely and first calculate the potential "seen" by the muon on the inside as well as on the outside of the nuclear charge distribution.
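The scaling argument of this subsection is easy to reproduce; the Python lines below are an illustrative check of (23)–(26) using rounded constants.

```python
# Bohr-model scaling for a muon bound to a lead nucleus
E1_H = 13.6          # hydrogen ground-state binding energy, eV
a0 = 0.53e-8         # Bohr radius, cm
Z, mass_ratio = 82, 207

scale_E = Z**2 * mass_ratio          # binding energies scale as Z^2 (m_mu/m_e)
scale_r = Z * mass_ratio             # radii shrink by Z (m_mu/m_e)

for n in (1, 2):
    E_n = E1_H * scale_E / n**2 * 1e-6     # MeV
    r_n = a0 * n**2 / scale_r              # cm
    print(f"n={n}:  |E_n| ~ {E_n:5.1f} MeV,   r_n ~ {r_n:.1e} cm")
# n=1: ~19 MeV and ~3e-13 cm; n=2: ~4.7 MeV and ~1.2e-12 cm, as quoted in (24) and (26);
# r_1 is indeed smaller than the nuclear radius rho_0 ~ 8.5e-13 cm.
```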

4-b. The heavy muonic atom treated as a harmonic oscillator

We shall use a rough model of the lead nucleus: we shall assume its charge to be evenly distributed throughout a sphere of radius ρ₀ ≃ 8.5 × 10⁻¹³ cm. When the distance r of the muon from the center of this sphere is greater than ρ₀, its potential energy is given by:

V(r) = −Ze²/r   for r > ρ₀   (27)

For r < ρ₀, one can calculate the electrostatic force acting on the muon, using Gauss's theorem; it is directed towards the center of the sphere and its absolute value is:

F(r) = Ze² r / ρ₀³   (28)

⁵ For hydrogen, the spread of the wave functions, on the order of an Ångström, is about 10⁵ times larger than the dimensions of the proton, which can therefore be treated like a point. The new situation encountered here results from several factors which reinforce each other: increased Z and increased A, which result in a greater electrostatic force and a larger nuclear radius.




Figure 12: Form of the potential V(r) seen by a muon attracted by a nucleus of radius ρ₀ situated at r = 0. When r < ρ₀, the variation of the potential is parabolic (if the charge density of the nucleus is uniform); when r > ρ₀, V(r) varies like −1/r (Coulomb's law).

This force is derived from the potential energy:

V(r) = (1/2) Ze² r²/ρ₀³ + C   for r < ρ₀   (29)

The constant C is determined by the condition that expressions (27) and (29) be identical for r = ρ₀:

C = −(3/2) Ze²/ρ₀   (30)

Figure 12 represents the potential energy of the muon, plotted with respect to r. Inside the nucleus, the potential is parabolic. The orders of magnitude we calculated in § 4-a indicate that it would not be realistic to choose a pure Coulomb potential for the ground state of the muonic lead atom, since the wave function is actually concentrated in the region where the potential is parabolic. It is therefore certainly preferable to consider the muon to be "elastically bound" to the nucleus in this case. We then have a three-dimensional harmonic oscillator (Complement EV) whose angular frequency is:

ω = √( Ze² / m_μ ρ₀³ )   (31)

where m_μ is the muon mass.

In fact, we shall see that the wave function of the ground state of this harmonic oscillator is not zero outside the nucleus, so the harmonic approximation is not perfect either.

Comment: It is interesting that the physical system studied here presents many analogies with the first atomic model, proposed by J. J. Thomson. This physicist assumed the positive charge of the atom to be distributed in a sphere whose radius was of the order of a few Angströms, with the electrons moving in the parabolic potential existing inside this charge distribution (model of the elastically bound electron). We know from Rutherford’s experiments that the nucleus is much smaller and that such a model does not correspond to reality for atoms.


4-c. Order of magnitude of the energies and spread of the wave functions

If we substitute into expression (31) the numerical values:

Z = 82
e²/ℏc ≃ 1/137,  c ≃ 3 × 10⁸ m s⁻¹
ℏ ≃ 1.05 × 10⁻³⁴ joule·sec
m_μ ≃ 207 m_e ≃ 1.86 × 10⁻²⁸ kg
ρ₀ ≃ 8.5 × 10⁻¹⁵ m

we find:

ω ≃ 1.3 × 10²² rad s⁻¹   (32)

which corresponds to an energy ℏω on the order of:

ℏω ≃ 8.4 MeV   (33)

We can compare ℏω to the total depth of the well, 3Ze²/2ρ₀, which is equal to:

3Ze²/2ρ₀ ≃ 21 MeV   (34)

We see that ℏω is smaller than this depth, but not small enough for us to be able to neglect completely the non-parabolic part of V(r). Similarly, the spread of the ground state, if the well were perfectly parabolic, would be on the order of:

√(ℏ/2m_μω) ≃ 4.7 × 10⁻¹³ cm   (35)

The qualitative predictions of § 4-a are therefore confirmed: a large part of the wave function of the muon is inside the nucleus. Nevertheless, what happens outside cannot be completely neglected. The exact calculation of the energies and the wave functions is therefore more complicated than it would be for a simple harmonic oscillator. The Schrödinger equation corresponding to the potential of Figure 12 must be solved (taking into account, in addition, spin, relativistic corrections, etc...). Such a calculation is important: the study of the energy of photons emitted by a heavy muonic atom contributes information about the structure of the nucleus, for example concerning the real charge distribution inside the nuclear volume.
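The estimates (32)–(35) can be reproduced from the constants listed above; the Python sketch below is an order-of-magnitude check only (it writes Ze² = Z·(e²/ℏc)·ℏc to avoid unit questions).

```python
import numpy as np

hbar = 1.05e-34                  # J s
c = 3.0e8                        # m/s
m_mu = 207 * 9.11e-31            # kg
rho0 = 8.5e-15                   # m
Ze2 = 82 * (1.0/137.0) * hbar * c      # Z e^2 in J m

omega = np.sqrt(Ze2 / (m_mu * rho0**3))        # formula (31)
E_quantum = hbar * omega / 1.602e-13           # hbar*omega in MeV
depth = 1.5 * Ze2 / rho0 / 1.602e-13           # well depth 3Ze^2/2rho0 in MeV
spread = np.sqrt(hbar / (2 * m_mu * omega))    # ground-state spread, m

print(f"omega   ~ {omega:.2e} rad/s")          # ~1.3e22, cf. (32)
print(f"hbar*w  ~ {E_quantum:.1f} MeV")        # ~8.4 MeV, cf. (33)
print(f"depth   ~ {depth:.0f} MeV")            # ~21 MeV, cf. (34)
print(f"spread  ~ {spread*100:.1e} cm")        # ~4.7e-13 cm, cf. (35)
```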

Comment:

In the case of ordinary atoms (with an electron instead of a muon), it is valid to neglect the effects of the deviation of the potential from the −Ze²/r law. However, one can take this deviation into account by using perturbation theory (cf. Chap. XI). In Complement DXI, we shall study this "volume effect" of the nucleus on the atomic energy levels.


References and suggestions for further reading:

Molecular vibrations: Karplus and Porter (12.1), Chap. 7; Pauling and Wilson (1.9), Chap. X; Herzberg (12.4), Vol. I, Chap. III, § 1; Landau and Lifshitz (1.19), Chaps. XI and XIII. Stimulated Raman effect: Baldwin (15.19), § 5.2; see also Schawlow’s article (15.17). Torsion oscillations: Herzberg (12.4), Vol. II, Chap. II, § 5d; Kondratiev (11.6), § 37 The Einstein model: Kittel (13.2), Chap. 6; Seitz (13.4), Chap. III; Ziman (13.3), Chap. 2; see also Bertman and Guyer’s article (13.20). Muonic atoms: Cagnac and Pebay-Peyroula (11.2), § XIX-7; Weissenberg (16.19), § 4-2; see also De Benedetti’s article (11.21).




Complement BV
Study of the stationary states in the {|x⟩} representation. Hermite polynomials

1. Hermite polynomials
   1-a. Definition and simple properties
   1-b. Generating function
   1-c. Recurrence relations; differential equation
   1-d. Examples
2. The eigenfunctions of the harmonic oscillator Hamiltonian
   2-a. Generating function
   2-b. φ_n(x) in terms of the Hermite polynomials
   2-c. Recurrence relations

We now intend to study, in a little more detail than in § C-2 of Chapter V, the wave functions φ_n(x) = ⟨x|φ_n⟩ associated with the stationary states of the harmonic oscillator. Before undertaking this study, we shall define the Hermite polynomials and mention their principal properties.

1. Hermite polynomials

1-a. Definition and simple properties

Consider the Gaussian function:

F(z) = e^{−z²}   (1)

represented by the bell-shaped curve in Figure 1. The successive derivatives of F(z) are given by:

F′(z) = −2z e^{−z²}   (2)

F″(z) = (4z² − 2) e^{−z²}   (3)

The nth-order derivative, F^{(n)}(z), can be written:

F^{(n)}(z) = (−1)ⁿ H_n(z) e^{−z²}   (4)

where H_n(z) is an nth-degree polynomial in z. The proof is by recurrence. This relation is valid for n = 1, 2 [cf. equations (2) and (3)]. Assume it is true for n − 1:

F^{(n−1)}(z) = (−1)^{n−1} H_{n−1}(z) e^{−z²}   (5)

Figure 1: Shape of the Gaussian function F(z) and of its first and second derivatives F′(z) and F″(z).

where H_{n−1}(z) is a polynomial of degree n − 1. We then obtain relation (4) directly by differentiation, if we set:

H_n(z) = 2z H_{n−1}(z) − (d/dz) H_{n−1}(z)   (6)

Since H_{n−1}(z) is a polynomial of degree n − 1 in z, we see from this last relation that H_n(z) is indeed an nth-degree polynomial. The polynomial H_n(z) is called the nth-degree Hermite polynomial. Its definition is therefore:

H_n(z) = (−1)ⁿ e^{z²} (dⁿ/dzⁿ) e^{−z²}   (7)

We see from (2) and (3) that H₁(z) is odd and H₂(z) is even. Moreover, relation (6) shows that if H_{n−1}(z) has a definite parity, H_n(z) has the opposite parity. From this, we deduce that the parity of H_n(z) is (−1)ⁿ. The zeros of H_n(z) correspond to those of the nth-order derivative of the function F(z). We are going to show that H_n(z) has n real zeros, between which one finds those of H_{n−1}(z). It can be seen from Figure 1 and from relations (1), (2) and (3) that this is true for n = 0, 1, 2. Arguing by recurrence, we can generalize this result: assume that H_{n−1}(z) has n − 1 real zeros; if z₁ and z₂ are two consecutive zeros of H_{n−1}(z), and therefore of F^{(n−1)}(z), Rolle's theorem shows that the derivative F^{(n)}(z) of F^{(n−1)}(z) goes to zero at a point z₃ between z₁ and z₂; therefore, H_n(z₃) = 0. Since, in addition, F^{(n−1)}(z) goes to zero when z → −∞ and when z → +∞, F^{(n)}(z) and H_n(z) have at least n real zeros [and not more, because H_n(z) is nth-degree], between which are interposed those of H_{n−1}(z).
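The definition (7), the recurrence (6) and the explicit polynomials listed later in § 1-d can be checked with a few lines of sympy; the sketch below is an illustration only, using the computer-algebra derivative in place of the analytic argument above.

```python
import sympy as sp

z = sp.symbols('z')

def hermite_rodrigues(n):
    """H_n(z) built from definition (7): (-1)^n e^{z^2} (d/dz)^n e^{-z^2}."""
    expr = (-1)**n * sp.exp(z**2) * sp.diff(sp.exp(-z**2), z, n)
    return sp.expand(sp.simplify(expr))

for k in range(4):
    print(f"H_{k}(z) = {hermite_rodrigues(k)}")    # 1, 2*z, 4*z**2 - 2, 8*z**3 - 12*z

# Recurrence (6), H_n = (2z - d/dz) H_{n-1}, reproduces the same polynomials:
n = 5
rec = sp.expand(2*z*hermite_rodrigues(n-1) - sp.diff(hermite_rodrigues(n-1), z))
print(sp.simplify(rec - hermite_rodrigues(n)) == 0)   # True
```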

Generating function

Consider the function of ( + )=e 548

( + )2

and : (8)



STUDY OF THE STATIONARY STATES IN THE X REPRESENTATION. HERMITE POLYNOMIALS

Taylor’s formula enables us to write: ( )

( + )=

!

=0

=

!

=0

( )

( 1)

( )e 2

Multiplying this relation by e

2

(

)=

that is, if we replace 2

e

+2

=

(

and replacing

by

we obtain: (10)

) by its value: ( )

!

=0

(9)

( )

!

=0

2

(11)

The Hermite polynomials can therefore be obtained from the series expansion in of 2 the function e +2 , which for this reason, is called the “generating function” of the Hermite polynomials. Relation (11) gives us another definition of the Hermite polynomials ( ): ( )=

2

e

+2

(12) =0

1-c.

Recurrence relations; differential equation

We have already obtained, in (6), one recurrence relation. It is easy to obtain others by differentiating relation (11). A differentiation with respect to yields: 2 e

2

+2

d !d

= =0 2

that is, replacing e power in :

+2

( )

(13)

by the expansion (11) and setting equal terms of the same

d ( )=2 (14) 1( ) d Similarly, if we differentiate (11) with respect to , an analogous argument yields: ( )=2 mials

1(

)

2(

1)

2(

)

(15)

Finally, it is not difficult to obtain a differential equation satisfied by the polyno( ). Differentiating (14) and using (6), we get:

d2 d 2

d d = 2 [2

( )=2

that is, replacing d2 d 2

2

d +2 d

1(

1(

)

1(

)

( )]

(16)

) by its value as given in (14): ( )=0

(17) 549

COMPLEMENT BV

1-d.



Examples

Definition (7) or the recurrence relation (6) (which amounts to the same thing) enables us to calculate the first Hermite polynomials easily: 0(

)=1

1(

)=2

2(

)=4

2

2

)=8

3

12

3(

(18)

In general: ( )= 2.

d d

2

(1)

(19)

The eigenfunctions of the harmonic oscillator Hamiltonian

2-a.

Generating function

Consider the function: (

)= =0

(20)

!

Using the relation [cf. Chap. V, formula (C-13)]: =

1 !

( )

(21)

0

we obtain: (

(

)=

) !

=0

=

e

0

0

(22)

We now introduce, as in Chapter V, the dimensionless operators ˆ and ˆ : ˆ= (23) ˆ= ~ where the parameter , which has the dimensions of an inverse length, is defined by: =

(24) ~

The operator: e 550

=e

2



ˆ)

(25)



STUDY OF THE STATIONARY STATES IN THE X REPRESENTATION. HERMITE POLYNOMIALS

can be calculated by using formula (63) of Complement BII , where we set: ˆ

=

2

(26) ˆ

=

2

We obtain: ˆ

e

=e

2

=e

2

ˆ

e

2

e

2

ˆ

2

e4 ˆ

[ ˆ ˆ]

2

e

4

(27)

Substituting this result into (22), we find: (

)=e =e

2

4

2

4

2) ˆ (

e(

2) ˆ

e

2

e

(

0

2)

e

~

(28)

0

Now, we have [cf. Complement EII , formula (15)]: e

~ 2

=

2

(29)

and (28) can be written: (

)=e =e

2

4

2

4

e

2

e

2

2 0(

0

2)

(30)

Using formula (C-25) of Chapter V, we finally obtain: 2

(

1 4

)=

2 2

exp

2

2

+

2

(31)

2

According to definition (20), all we must do to find the wave functions is expand this expression in powers of : (

)= =0

(

( )

!

(32)

) is called the generating function of the

2-b.

( ).

( ) in terms of the Hermite polynomials

Replacing, in formula (11),

by

2

exp

( )=

2

+

2

= =0

2

2 and 1 !

(

by )

, we obtain: (33) 551



COMPLEMENT BV

Substituting this expression into (31): 2

(

1 4

)=

1 e !

2

=0

2

2

2

(

)

Setting equal the coefficients of the various powers of 1 4

2

1 2

( )=

!

e

2

2

2

(

(34) in (32) and (34), we obtain:

)

(35)

The shape of the function ( ) is therefore analogous to that of the th-order derivative of the Gaussian function ( ) considered in § 1 above; ( ) is of parity ( 1) and possesses zeros interposed between those of ( ). We mentioned in § C-2 of +1 Chapter V that this property is related to the increase in the average kinetic energy of the states when increases. 2-c.
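Expression (35) also lends itself to a direct numerical check. The Python sketch below is illustrative only: it sets β = 1, builds φ_n(x) from (35) with numpy's physicists' Hermite polynomials, and verifies that the first few φ_n are orthonormal.

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, pi, sqrt

beta = 1.0                                        # beta = sqrt(m*omega/hbar), set to 1 here
x = np.linspace(-10, 10, 20_001)
dx = x[1] - x[0]

def phi(n, x):
    """phi_n(x) of (35): (beta^2/pi)^(1/4) (2^n n!)^(-1/2) e^{-beta^2 x^2 / 2} H_n(beta x)."""
    coeffs = np.zeros(n + 1); coeffs[n] = 1.0     # select the physicists' Hermite polynomial H_n
    norm = (beta**2/pi)**0.25 / sqrt(2.0**n * factorial(n))
    return norm * np.exp(-(beta*x)**2/2) * hermval(beta*x, coeffs)

overlaps = np.array([[np.sum(phi(m, x)*phi(n, x))*dx for n in range(4)] for m in range(4)])
print(np.round(overlaps, 6))      # ~ identity matrix: the phi_n are orthonormal
```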

Recurrence relations

Let us write the equations: =

1

(36) =

+1

+1

in the representation. Using the definitions of and [cf. Chap. V, relations (B-6)], we see that in the representation, the action of these operators is given by: =

+

2

1 d 2 d

=

1 d 2 d

2

(37)

Equations (36) therefore become:

2 2

+

1 d 2 d

( )=

1 d 2 d

( )=

1(

) (38)

+1

+1 (

)

Let us take the sum and difference of these equations: 2 2 d d

552

( )= ( )=

1(

1(

)+ )

+1 +1

+1 (

+1 (

)

(39) )

(40)



STUDY OF THE STATIONARY STATES IN THE X REPRESENTATION. HERMITE POLYNOMIALS

Comment:

If we replace the functions ( ) in (39) and (40) by their expressions given in (35), we obtain, after simplification (setting ˆ = ): 2ˆ 2

ˆ

(ˆ) +

d dˆ

(ˆ) = 2

1 (ˆ)

(ˆ) = 2

1 (ˆ)

+

+1 (ˆ)

(41)

+1 (ˆ)

(42)

By taking the sum and the difference of these equations, we obtain relations (6) and (14) of § 1-a. References

Messiah (1.17), App. B, § III; Arfken (10.4), Chap. 13, § 1; Angot (10.2), § 7.8.

553

• THE EIGENVALUE EQUATION OF THE HARMONIC OSCILLATOR BY THE POLYNOMIAL METHOD

Complement CV Solving the eigenvalue equation of the harmonic oscillator by the polynomial method

1 2

Changing the function and the variable . . . . . . . . . The polynomial method . . . . . . . . . . . . . . . . . . 2-a The asymptotic form of ˆ(ˆ) . . . . . . . . . . . . . . . 2-b The calculation of (ˆ) in the form of a series expansion 2-c Quantization of the energy . . . . . . . . . . . . . . . . 2-d Stationary wave functions . . . . . . . . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

555 557 557 558 558 560

In § B of Chapter V, the method used to calculate the energies of the harmonic oscillator stationary states is based on the use of the operators a, a† and N, as well as their commutation relations. It is possible to obtain the same results in a completely different way, by solving the eigenvalue equation of the Hamiltonian H in the {|x⟩} representation. This is what we are going to do in this complement.

1. Changing the function and the variable

In the

representation, the eigenvalue equation of 2

is written:

2

~ d 1 + 2 d 2 2

2 2

( )=

( )

(1)

As in Chapter V, let us introduce the dimensionless operators ˆ and ˆ : ˆ= (2) ˆ= ~ where the parameter , which has the dimensions of an inverse length, is defined by: =

(3) ~

Let us denote by ˆ



ˆ

ˆ

the eigenvector of ˆ with eigenvalue is ˆ: (4)

ˆ

The orthonormalization and closure relations of the kets ˆ

ˆ

= (ˆ

ˆ)

ˆ

are written: (5)

+



ˆ

ˆ

=1

(6)

555

COMPLEMENT CV

The ket

• is obviously an eigenvector of

ˆ

, with the eigenvalue ˆ

. Therefore,

when: ˆ=

(7)

the two kets and ˆ are proportional. However, they are not equal. Writing the closure relation for the kets : +

d

=1

(8)

and making the change of variables given in (7), we obtain: +



=

ˆ

=

ˆ

=1

(9)

Comparison with (6) shows that we can, for example, set: =

ˆ

=

(10)

ˆ

to orthonormalize the kets ˆ with respect to ˆ, since the kets are orthonormal with respect to . Let be an arbitrary ket, ( ) = its wave function in the representation, and ˆ(ˆ) = ˆ its wave function in the representation. According to ˆ (10): ˆ(ˆ) =

ˆ

=

1

=

ˆ

(11)

that is: ˆ(ˆ) = If

1

( =

ˆ

)

(12)

is normalized, relation (8) yields: +

=

+

d

=

( ) ( )d =1

(13)

ˆ (ˆ) ˆ(ˆ) dˆ = 1

(14)

and relation (6) gives: +

=

+



ˆ

ˆ

=

The wave function ( ) is therefore normalized with respect to the variable , as is ˆ(ˆ) with respect to the variable ˆ. [This can be shown by using the integral in (13), in which we make the change of variables indicated in (7)]. Now, substituting (7) and (12) into (1), we obtain: 1 2 556

d2 + ˆ2 ˆ(ˆ) = dˆ2

ˆ(ˆ)

(15)

• THE EIGENVALUE EQUATION OF THE HARMONIC OSCILLATOR BY THE POLYNOMIAL METHOD setting: (16)

= ~

Equation (15) is more convenient than equation (1), since all the quantities appearing in it are dimensionless. 2.

The polynomial method

2-a.

The asymptotic form of ˆ(ˆ)

Equation (15) can be written: d2 dˆ2

(ˆ2

2 ) ˆ(ˆ) = 0

(17)

Let us try to predict intuitively the behavior of ˆ(ˆ) for very large ˆ. To do this, consider the functions: ˆ2 2

(ˆ) = e

(18)

They are solutions of the differential equations: d2 dˆ2

(ˆ2

1)

(ˆ) = 0

(19)

When ˆ approaches infinity: ˆ2

1

ˆ2

ˆ2

2

(20)

and equations (17) and (19) take on the same form asymptotically. We should therefore 2 expect the solutions of equation (17) to behave1 , for large ˆ, either like e ˆ 2 or like 2 e ˆ 2 . From a physical point of view, the only functions ˆ(ˆ) of interest to us are those that are bounded everywhere. This restricts us to solutions of (17) that behave like 2 e ˆ 2 (if they exist). This is why we shall set: ˆ(ˆ) = e

ˆ2 2

(ˆ)

(21)

Substituting (21) into (17), we obtain: d2 (ˆ) dˆ2



d (ˆ) + (2 dˆ

1) (ˆ) = 0

(22)

We are going to show how this equation can be solved by expanding (ˆ) in a power series. Then we shall impose the condition that its solutions be physically acceptable. 1 The

2

2

solutions of equation (17) are not necessarily equivalent to e ˆ 2 or e ˆ 2 when ˆ : the intuitive arguments which we have given do not exclude, for example, the possibility that ˆ(ˆ) may 2 2 behave like the product of e ˆ 2 or e ˆ 2 by a power of ˆ.

557



COMPLEMENT CV

2-b.

The calculation of (ˆ) in the form of a series expansion

As we pointed out in § A-3 of Chapter V, the solutions of equation (1) [or, which amounts to the same thing, of (17)] can be sought amongst either even or odd functions. 2 Since the function e ˆ 2 is even, we can therefore set: (ˆ) = ˆ

0

+



2

+



4

+

+

ˆ2 +

2

(23)

with 0 = 0 (where 0 ˆ is, by definition, the first non-zero term of the expansion). Writing (23) in the form: (ˆ) =

ˆ2

2

+

(24)

=0

we easily obtain: d (ˆ) = dˆ

(2

+ )

ˆ(2

2

+

1)

(25)

=0

and: d2 (ˆ) = dˆ2

(2

+ )(2

+

1)

2

ˆ(2

+

2)

(26)

=0

Let us substitute (24), (25) and (26) into (22). For this equation to be satisfied, each term of the series expansion of the left-hand side must be equal to zero. For the general term in ˆ2 + , this condition is written: (2

+ + 2)(2

+ + 1)

2

+2

The term of lowest degree is in ˆ (

1)

0

= (4 2

+2

+2

=

2

(27)

; its coefficient will be zero if:

=0

(28)

Since 0 is not zero, we therefore have either = 1 [ ˆ(ˆ) is then odd]. Relation (27) can be written: 2

2 + 1)

4 +2 +1 2 (2 + + 2)(2 + + 1)

2

= 0 [the function ˆ(ˆ) is then even] or

(29)

which is a recurrence relation between the coefficients 2 . Since 0 is not zero, relation (29) enables us to calculate 2 in terms of 0 , then 4 in terms of 2 , and so on. For arbitrary , we therefore have the series expansion of two linearly independent solutions of equation (22), corresponding respectively to = 0 and = 1. 2-c.
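The recurrence relation (29) is easy to explore numerically. The Python sketch below is an illustration using exact fractions; it shows the two behaviors discussed in § 2-c: for a generic value of ε the coefficients never vanish, whereas for ε = n + 1/2 (here n = 4, even, so p = 0) the series terminates and the solution reduces to a polynomial of degree n.

```python
from fractions import Fraction

def coefficients(eps, p, qmax=8):
    """Coefficients a_{2q} of h = x^p * sum_q a_{2q} x^{2q}, from recurrence (29), with a_0 = 1."""
    a = [Fraction(1)]
    for q in range(qmax):
        num = 4*q + 2*p + 1 - 2*eps
        den = (2*q + p + 2) * (2*q + p + 1)
        a.append(a[-1] * Fraction(num, den))
    return a

# Generic eps: the series never terminates (and behaves like exp(x^2) at large x -> rejected)
print(coefficients(Fraction(2), p=0)[:5])

# eps = n + 1/2 with n = 4: the series terminates -> polynomial of degree 4
print(coefficients(Fraction(9, 2), p=0))     # zero from a_6 on; proportional to H_4 up to a factor
```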

Quantization of the energy

We must now choose, from amongst all the solutions found in the preceding section, those which satisfy the physical conditions that ˆ(ˆ) be bounded everywhere. 558

• THE EIGENVALUE EQUATION OF THE HARMONIC OSCILLATOR BY THE POLYNOMIAL METHOD For most values of , the numerator of (29) does not go to zero for any positive or zero integer . Since none of the coefficients 2 is then zero, the series has an infinite number of terms. It can be shown that the asymptotic behavior of such a series makes it physically unacceptable. We see from (29) that: 2

1

+2

(30)

2

Now consider the power series expansion of the function e ter): ˆ2

e

=

ˆ2

(where

is a real parame-

ˆ2

2

(31)

=0

with: =

2

(32)

!

For this second series, we therefore have: 2

+2

=

2

+1

! + 1)!

(

=

If we choose the value of the parameter 0

(33)

+1 such that:

1

(34)

we see from (30) and (33) that there exists an integer implies: 2

+2

2

2

+2

such that the condition

0

(35)

2

We can deduce from this that, when condition (34) is satisfied, we have: ˆ

(ˆ)

2

(ˆ)

e

ˆ2

(ˆ)

(36)

2

where (ˆ) and (ˆ) are polynomials of degree 2 given by the first series (23) and (31). When ˆ approaches infinity, (36) gives: (ˆ)

2 ˆ

ˆ e

ˆ2

+ 1 terms of

(37)

2

and therefore: ˆ(ˆ)

2 ˆ

ˆ e(

1 2)ˆ2

(38)

2

Since we can choose 1 2

1

such that: (39) 559



COMPLEMENT CV

ˆ(ˆ) is not bounded when ˆ . We must therefore reject this solution, which makes no sense physically. There is only one possibility left: that the numerator of (29) goes to zero for a value 0 of . We then have: = 0 if

2

0

(40) = 0 if

2

0

and the power series expansion of (ˆ) reduces to a polynomial of degree 2 0 + . The 2 behavior at infinity of ˆ(ˆ) is then determined by that of the exponential e ˆ 2 , and ˆ(ˆ) is physically acceptable (since it is square-integrable). The fact that the numerator of (29) goes to zero at = 0 imposes the condition:

2 = 2(2

0

+ )+1

(41)

If we set: 2

0

+

=

(42)

equation (41) can be written: =

=

+1 2

(43)

where is an arbitrary positive integer or zero ( is an arbitrary positive integer or zero, and is equal to 0 or 1). Condition (43) introduces the quantization of the harmonic oscillator energy, since it implies [cf. (16)]: =

+

1 2

(44)

~

We have thus obtained relation (B-34) of Chapter V. 2-d.

Stationary wave functions

The polynomial method also yields the eigenfunctions associated with the various energies , in the form: ˆ (ˆ) = e

ˆ2 2

(ˆ)

(45)

where (ˆ) is an th degree polynomial. According to (23) and (42), (ˆ) is an even function if is even and an odd function if is odd. The ground state is obtained for = 0, that is, for 0 = = 0; 0 (ˆ) is then a constant, and: ˆ0 (ˆ) = 560

0

e

ˆ2 2

(46)

• THE EIGENVALUE EQUATION OF THE HARMONIC OSCILLATOR BY THE POLYNOMIAL METHOD A simple calculation shows that, to normalize ˆ0 (ˆ) with respect to the variable ˆ, it suffices to choose: 0

1 4

=

(47)

Then, using (12), we find: 1 4

2 0(

)=

e

2

2

2

(48)

which is indeed the expression given in Chapter V [formula (C-25)]. To the first excited state 1 = 3~ 2 corresponds = 1, that is, 0 = 0 and = 1; 1 (ˆ) then has only one term, obtained by a calculation analogous to the preceding one: 1 4

4

ˆ1 (ˆ) =

ˆe

ˆ2 2

(49a)

that is:

For 2

=

6

4

1( ) =

1 4

e

2

2

0

= 1 and

= 2, we have 2

2

(49b) = 0. Relation (29) then yields: (50)

0

which finally leads to: 1 4

ˆ2 (ˆ) =

1 4

(2ˆ2

1) e

ˆ2 2

(51a)

that is: 2 2(

)=

1 4

4

(2

2 2

1) e

2

2

2

(51b)

For arbitrary , (ˆ) is the polynomial solution of equation (22), which can be written, taking the quantization condition (43) into account: d2 dˆ2



d +2 dˆ

(ˆ) = 0

(52)

We recognize (52) to be the differential equation satisfied by the Hermite polynomial (ˆ) [see equation (17) of Complement BV ]. The polynomial (ˆ) is therefore proportional to (ˆ), where the proportionality factor is determined by normalization of ˆ(ˆ). This is in agreement with formula (35) of complement BV . References

Mathematical treatment of differential equations: Morse and Feshbach (10.13), Chaps. 5 and 6; Courant and Hilbert (10.11), § V-11. 561



STUDY OF THE STATIONARY STATES IN THE MOMENTUM REPRESENTATION

Complement DV Study of the stationary states in the

1

representation

Wave functions in momentum space . . . . . 1-a Changing the variable and the function . . . 1-b Determination of ˆ (ˆ) . . . . . . . . . . . . 1-c Calculation of the phase factor . . . . . . . . Discussion . . . . . . . . . . . . . . . . . . . . .

2

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

563 563 564 564 565

The distribution of the possible momenta of a particle in the state is given by the wave function ( ) in the p representation, the Fourier transform of the wave function ( ) in the representation. We shall show in this complement that in the case of the harmonic oscillator, the functions and are the same (to within multiplicative factors). Thus, in a stationary state, the probability distributions of the momentum and the position have similar forms. 1.

Wave functions in momentum space

1-a.

Changing the variable and the function

In Complement CV , we introduced, for simplicity, the operator: ˆ=

(1)

where: =

(2) ~

as well as the eigenkets ˆ of ˆ and the wave function ˆ(ˆ) in the We shall follow a similar procedure for the operator: ˆ=

ˆ

representation.

(3) ~

We shall therefore call ˆ

ˆ



ˆ

the eigenkets of ˆ : (4)

ˆ

and denote by ˆ (ˆ) the wave function in the ˆ (ˆ) =

ˆ

representation: (5)

ˆ

Just as the ket ˆ is proportional to the ket = ˆ , the ket ˆ is proportional to the ket = ~ˆ . If we change to 1 ~ [cf. (1) and (3)], equation (10) of Complement CV shows that: ˆ

=

~

= ~ˆ

(6) 563



COMPLEMENT DV

The wave function ˆ (ˆ) in the representation is therefore related to the wave ˆ function ( ) in the representation by: ˆ (ˆ) =

~ ( = ~ˆ)

(7)

Furthermore, we can use (6) and relation (10) of Complement CV to obtain: ˆ

ˆ

e

=

ˆˆ

(8)

2

We therefore have, using definition (5) and the closure relation for the

ˆ

basis:

+

ˆ (ˆ) =

ˆ

ˆ



ˆˆ

ˆ(ˆ) dˆ

+

1 2

=

ˆ

e

(9)

The function ˆ is therefore the Fourier transform of ˆ. Determination of ˆ (ˆ)

1-b.

We have seen [cf. equation (15) of Complement CV ] that the stationary wave functions ˆ(ˆ) of the harmonic oscillator satisfy the equation: 1 2

d2 + ˆ2 ˆ(ˆ) = dˆ2

ˆ(ˆ)

d2 ˆ(ˆ) is ˆ2 ˆ (ˆ) and that of ˆ2 ˆ(ˆ) is dˆ2 The Fourier transform of equation (10) is therefore: Now, the Fourier transform of

d2 ˆ (ˆ) = ˆ (ˆ) dˆ2

1 2 ˆ 2

(10) d2 ˆ (ˆ). dˆ2

(11)

If we compare (10) and (11), we see that the functions ˆ and ˆ satisfy the same differential equation. Moreover, we know that this equation, when = + 1 2 (where is a positive integer or zero), has only one square-integrable solution (the eigenvalues are non-degenerate; cf. Chapter V, § B-3). We can conclude that ˆ and ˆ are proportional. Since these two functions are normalized, the proportionality factor is a complex number of modulus 1, so that: ˆ (ˆ) = e where e 1-c.

ˆ (ˆ = ˆ)

(12)

is a phase factor which we shall now determine. Calculation of the phase factor

The wave function of the ground state is given by [cf. Complement CV , formulas (46) and (47)]: ˆ0 (ˆ) = 564

1 4

e

ˆ2 2

(13)



STUDY OF THE STATIONARY STATES IN THE MOMENTUM REPRESENTATION

This is a Gaussian function; its Fourier transform is therefore [cf. Appendix I, relation (50)]: 1 4

ˆ 0 (ˆ) =

ˆ2 2

e

(14)

This implies that 0 is zero. To find , let us write, in the =

+1

ˆ

and

ˆ

representations, the relation: (15)

+1

1 d representation, ˆ and ˆ act like ˆ and ; therefore acts like dˆ 1 d d ˆ . In the ˆ representation, ˆ acts like and ˆ like ˆ; therefore dˆ dˆ 2 d acts like ˆ . dˆ 2 In the representation, relation (15) therefore becomes: ˆ In the

ˆ

+1 (ˆ)

ˆ

1

=

2( + 1)

while in the ˆ

+1 (ˆ)

ˆ (ˆ)

(16)

representation, it becomes:

ˆ

=

d dˆ

ˆ

2( + 1)

d dˆ

ˆ ˆ (ˆ)

(17)

We therefore have: e

+1

=

e

(18)

that is, knowing that e

=(

0

= 0:

)

(19)

Thus we obtain: ˆ (ˆ) = (

) ˆ (ˆ = ˆ)

(20)

or, returning to the functions ( )=(

)

1

= ~

2.

and

2~

: (21)

Discussion

Consider a particle in the state . When the position of the particle is measured, one has a probability ( ) d of finding a result between and + d , where ( ) is given by: ( )=

( )2

(22) 565

COMPLEMENT DV



Similarly, in a measurement of the momentum of the particle, one has a probability ( ) d of finding a result between and + d , with: ( )2

( )=

(23)

Relation (21) then yields: ( )=

1

=

(24)

which shows that the momentum distribution in a stationary state has the same form as the position distribution. We see, for example (cf. Fig. 6 of Chapter V), that if is large, ( ) has a peak at each of the two values: =

=

(25)

where is the maximum momentum of the classical particle moving in the potential well with an energy . An argument analogous to the one set forth at the end of § C-2 of Chapter V enables us to understand this result. When the momentum of the classical particle is equal to , its acceleration is zero (its velocity is stationary), and the values of the momentum are, averaging over time, the most probable ones. Comment (i) of § D-1 of Chapter V concerning the probability density ( ) can easily be transposed to this context; for example, when is large, the root mean square deviation ∆ can be interpreted as being of the order of magnitude of the distance between the peaks of ( ) situated at = . It is also possible to understand directly from Figure 6-a of Chapter V why these values of the momentum are highly probable when is large. The wave function then performs a large number of oscillations between the two peaks, analogous to those of a sinusoid. This happens because the differential equation for the wave function [cf. 2 2 formula (A-17) of Chapter V] when 2 becomes: d2 2 ( )+ 2 d 2 ~

( )

0

(26)

which yields, according to the definition of ( )

e

~

+

~

e

: (27)

The wave function (when is large) therefore looks like a sinusoid of wavelength over a relatively large region of the axis. This sinusoid can be considered to be the sum of two progressive waves [cf. (27)] associated with the two opposite momenta (corresponding to the to-and-fro motion of the particle in the well). It is not surprising, therefore, that the probability density ( ) should be large in the neighborhood of the values = . An analogous argument also enables us to understand the order of magnitude of the product ∆ ∆ . This product is equal to [cf. Chap. V, relations (D-6), (D-7) and (D-9)]: ∆ 566



=

+

1 2

~=

2

(28)



STUDY OF THE STATIONARY STATES IN THE MOMENTUM REPRESENTATION

When increases, the amplitudes and of the oscillations increase, and the product ∆ ∆ takes on values much greater than its minimum value ~ 2. We might wonder why this is the case, since we have seen in several examples that when the width ∆ of a function increases, the width ∆ of its Fourier transform decreases. This is indeed what would happen for the functions ( ) if, in the interval + where they take on non-negligible values, they varied slowly, reaching, for example, a single maximum or minimum. This is in fact the case for small values of , for which the value of the product ∆ ∆ is indeed near its minimum. However, when is large, the functions ( ) perform numerous oscillations in the interval + , where they have zeros. One can therefore associate with them wavelengths of the order of ∆ , corresponding to momenta of the particle situated in a domain of dimension ∆ given by: ∆



(29)

We thus find again that: ∆



(30)

This situation is somewhat analogous to the one studied in § 1 of Complement AIII , in connection with the infinite one-dimensional well.

567



THE ISOTROPIC THREE-DIMENSIONAL HARMONIC OSCILLATOR

Complement EV The isotropic three-dimensional harmonic oscillator

1 2 3

The Hamiltonian operator . . . . . . . . . . . . . . . . . . . . 569 Separation of the variables in Cartesian coordinates . . . . 570 Degeneracy of the energy levels . . . . . . . . . . . . . . . . . 572

In Chapter V, we studied the one-dimensional harmonic oscillator. We now show how to use the results of this study to treat the three-dimensional harmonic oscillator. 1.

The Hamiltonian operator

Consider a spinless particle of mass which can move in three-dimensional space. The particle is subjected to a central force (i.e. a force that is constantly directed towards the coordinate origin ) whose absolute value is proportional to the distance of the particle from the point : F=

(1)

r

( is a positive constant). This force field is derived from the potential energy: (r) =

1 2 1 r = 2 2

2 2

(2)

r

where the angular frequency

is defined as for the one-dimensional harmonic oscillator:

=

(3)

The classical Hamiltonian is therefore: (r p) =

p2 1 + 2 2

2 2

(4)

r

Using the quantization rules (Chap. III, § B-5), we immediately deduce the Hamiltonian operator: =

P2 1 + 2 2

2

R2

Since the Hamiltonian

(5) is time-independent, we shall solve its eigenvalue equation:

= where

(6) belongs to the state space

r

of a particle in three-dimensional space. 569



COMPLEMENT EV

Comment:

Since (r) depends only on the distance = r of the particle from the origin [ (r) is consequently invariant under an arbitrary rotation], this harmonic oscillator is said to be isotropic. Nevertheless, the calculations which follow can easily be generalized to the case of an anisotropic harmonic oscillator, for which: (r) =

2 2

+

2

2 2

where the three constants 2.

2 2

+ ,

(7)

and

are different.

Separation of the variables in Cartesian coordinates

Recall that the state space product: r

r

can be considered (cf. Chap. II, § F) to be the tensor

=

(8)

where is the state space of a particle moving along , that is, the space associated with the wave functions ( ). and are defined analogously. Now, expression (5) for the Hamiltonian can be written in the form: =

1 2

=

2

2

+

+

+

2

+

1 2

2

+

2

+

2

+

2

(9)

with: 2

=

2

+

1 2

2

2

(10)

and similar definitions for and . is a function only of and : is therefore the extension into r of an operator that actually acts in . Similarly, and act only in and respectively. In , is a one-dimensional harmonic oscillator Hamiltonian. The same is true for and in and . , and commute. Each of them therefore commutes with their sum . Consequently, the eigenvalue equation (6) can be solved by seeking the eigenvectors of that are also eigenvectors of , and . Now, we already know the eigenvectors and eigenvalues of in , as well as those of in and of in : 1 ~ 2 1 + ~ 2 1 + ~ 2

=

+

= =

;

(11a)

;

(11b)

;

(11c)

( , and are positive integers or zero). It follows (cf. Chap. II, § F) that the eigenstates common to , , and are of the form: = 570

(12)



THE ISOTROPIC THREE-DIMENSIONAL HARMONIC OSCILLATOR

According to equations (9) and (11): =

+

+

+

3 2

(13)

~

The eigenvectors of are seen to be tensor products of eigenvectors of , and respectively, and the eigenvalues of , to be sums of eigenvalues of these three operators. According to equation (13), the energy levels of the isotropic three-dimensional harmonic oscillator are of the form: =

+

3 2

(14)

~

with: a positive integer or zero

(15)

since is the sum + + of three numbers, each of which can take on any nonnegative integral value. Furthermore, formula (12) enables us to deduce the properties of the vectors , common eigenstates of , , and , from those derived in § C-1 of Chapter V for (which are also valid for and ). Let us introduce three pairs of creation and annihilation operators: = = =

2~ 2~ 2~

+

=

2 ~

+

=

2 ~

+

=

2 ~

2~

2 ~

2~

2 ~

2~

2 ~

(16a) (16b) (16c)

These operators are the extensions into r of operators acting in , and . The canonical commutation relations between the components of R and P imply that the only non-zero commutators of the six operators defined in (16) are: [

]=[

]=[

]=1

(17)

Note that two operators with different indices always commute, as is logical because they act in different spaces. The action of the operators and on the states is given by the formulas: =(

)

=

1

=

1

=(

(18a)

)

=

+1

+1

=

+1

+1

(18b) 571



COMPLEMENT EV

For

, and , , we have analogous relations. We also know [cf. equation (C-13) of Chapter V] that: 1

= where

!

)

(19)

0

is the vector of

0 0

(

that satisfies the condition:

=0

(20)

In and , there are analogous expressions for ing to (12): 1 ! !

=

!

(

) (

) ( )

and

000

. Consequently, accord-

(21)

where 0 0 0 is the tensor product of the ground states of the three one-dimensional oscillators, so that: 000

=

000

=

=0

000

Finally, recall that, since tion is of the form: =

r

( )

(22) is a tensor product, the associated wave func-

( )

( )

(23)

where , and are stationary wave functions of the one-dimensional harmonic oscillator (Chap. V, § C-2). For example: 3 4

r 3.

000

=

e

2~ (

2

+

2

+

2

)

(24)

~

Degeneracy of the energy levels

We showed in § B-3 of Chapter V that constitutes a C.S.C.O. in ; the same is true for in and for in . According to § F of Chapter II, is thus a C.S.C.O. in r . Therefore, there exists (to within a constant factor) a unique ket of r corresponding to a given set of eigenvalues of , and , that is, to fixed non-negative integers , and . However, alone does not form a C.S.C.O. since the energy levels are degenerate. If we choose an eigenvalue of , = ( + 3 2)~ ) (which amounts to fixing a non-negative integer ), all the kets of the basis that satisfy: +

+

=

(25)

are eigenvectors of with the eigenvalue . The degree of degeneracy of is therefore equal to the number of different sets satisfying condition (25). To find , we can proceed as follows. With fixed, choose first, giving it one of the values: =0 1 2 572

(26)

• With

THE ISOTROPIC THREE-DIMENSIONAL HARMONIC OSCILLATOR

thus fixed, we must have: +

=

(27)

There are then (

+ 1) possibilities for the pair

= 0

1

The degree of degeneracy =

(

1 of

: 0

(28)

is therefore equal to:

+ 1)

(29)

=0

This sum is easy to calculate: = ( + 1)

1 =0

= =0

( + 1)( + 2) 2

Consequently, only the ground state

0

=

(30)

3 ~ is non-degenerate. 2

Comment:

The kets constitute an orthonormal system of eigenvectors of , which forms a basis in r . Since the eigenvalues of are degenerate, this system is not unique. We shall see in particular in Complement BVII that, in order to solve equation (6), it is possible to use a set of constants of the motion other than : thus we obtain a basis of r which is different from the preceding one, although still consisting of eigenvectors of . The kets of this new basis are orthonormal linear combinations of the belonging to each of the eigensubspaces of , that is, corresponding to a fixed value of the sum + + .

573



A CHARGED HARMONIC OSCILLATOR IN A UNIFORM ELECTRIC FIELD

Complement FV A charged harmonic oscillator in a uniform electric field

Eigenvalue equation of (E ) in the representation Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-a Electrical susceptibility of an elastically bound electron . 2-b Interpretation of the energy shift . . . . . . . . . . . . . . Use of the translation operator . . . . . . . . . . . . . . .

1 2

3

. . . . .

. . . . .

575 577 577 578 579

The one-dimensional harmonic oscillator studied in Chapter V consists of a particle of mass having a potential energy: ( )=

1 2

2

2

(1)

Assume, in addition, that this particle has a charge and that it is placed in a uniform electric field E parallel to . What are its new stationary states and the corresponding energies? The classical potential energy of a particle placed in a uniform field E is equal to1 : E

(E ) =

(2)

To obtain the quantum mechanical Hamiltonian operator (E ) in the presence of the field E , we must therefore add to the potential energy (1) of the harmonic oscillator the term: E

(E ) =

(3)

which gives: 2

(E ) =

2

+

1 2

2

2

E

(4)

We must now find the eigenvalues and eigenvectors of this operator. To this end, we shall use two different methods. First, we shall consider directly the eigenvalue equation of (E ) in the representation, as it is very simple to interpret the results obtained. Then we shall show how the problem can be solved in a purely operator formalism. 1.

Eigenvalue equation of

Let

be an eigenvector of (E )

1 We

(E ) in the

representation

(E ):

=

use the convention of zero potential energy for the particle at

(5) = 0.

575



COMPLEMENT FV

Using (4), we can write this equation in the ~2 d2 1 + 2 d 2 2

E

2 2

( )=

Completing the square with respect to ~2 d2 1 + 2 2 d 2

2

Let us now replace the variable =

( )

(6)

on the left-hand side of (6), we get:

2

E

2

representation:

2

E2

2

2

( )=

( )

(7)

by a new variable , setting:

E

(8)

2

Through the intermediary of , 1 ~2 d2 + 2 d 2 2

2 2

is then a function of , and equation (7) becomes:

( )=

( )

(9)

with: 2

=

+

E2

(10)

2

2

Thus we see that equation (9) is the same as the one used to obtain the stationary states of the harmonic oscillator in the absence of an electric field in the representation [cf. Chap. V, relation (A-17)]. Therefore, we have already solved this equation, and we know that the acceptable values of are given by: =

+

1 2

(11)

~

(where is a positive integer or zero). Relations (10) and (11) show that, in the presence of the electric field, the energies of the stationary states of the harmonic oscillator are modified: (E ) =

+

1 2

2

~

2

E2 2

(12)

The entire spectrum of the harmonic oscillator is therefore shifted by the quantity 2 2 2 E 2 . Now, let us show that the eigenfunctions ( ) associated with the energies (12) can all be obtained from the ( ) by a translation along . The solution of (9) corresponding to a given value of is ( ) [where the function is given, for example, by formula (35) of Complement BV ]. According to (8), we have: ( )=

E 2

(13)

This translation comes from the fact that the electric field exerts a force on the particle2 . 2 It can be seen from (13) that the function 2; ( ) is obtained from ( ) by a translation of E if the product E is positive, the translation is performed in the positive -direction, which is indeed the direction of the force exerted by E .

576



A CHARGED HARMONIC OSCILLATOR IN A UNIFORM ELECTRIC FIELD

V

W

V+W

q mω2 O –

q2

x

2

2mω2

O

Figure 1: The presence of a uniform electric field has the effect of adding a linear term to the potential energy of the harmonic oscillator; the total potential + is then represented by a displaced parabola.

Comment:

The change of variable given in (8) allows us to reduce the case of an arbitrary electric field to an already solved problem, the one in which E was zero. The only effect of the electric field is to change the -origin [cf. (13)] and the energy origin [cf. (12)]. This result can easily be understood graphically (cf. Fig. 1). When E is zero, the potential energy ( ) is represented by a parabola centered at . When E is not zero, it is necessary to add to this potential energy the quantity E , which corresponds to the dashed line in this figure; the curve representing + is again a parabola. Thus, in the presence of the field E we still have a harmonic oscillator. Since the two parabolas are superposable, they correspond to the same value of and therefore to the same energy difference between the levels. However, their minima and are different, as is consistent with formulas (12) and (13). 2.

Discussion

2-a.

Electrical susceptibility of an elastically bound electron

In certain situations, the electrons of an atom or a molecule behave, to a good approximation, as if they were “elastically bound”, that is, as if each of them were a harmonic oscillator. We shall prove this for atoms in Complement AXIII , using timedependent perturbation theory. The contribution of each electron to the electric dipole moment of the atom is described by the operator: =

(14) 577

COMPLEMENT FV



where is the charge of the electron ( 0) and the corresponding position observable. We are going to examine the mean value of in the model of the elastically bound electron. In the absence of an electric field, the mean value of the electric dipole moment in a stationary state of the oscillator is zero: =

=0

(15)

[see formulas (D-1) of Chapter V]. Now, let us assume that the field E is turned on so slowly that the state of the electron changes gradually from to ( remaining the same). The mean dipole moment is now different from zero, since: +

=

=

( )2

d

(16)

Using (8) and (13), we obtain: +

2

( )2 d +

=

E

+

2

( )2 d =

2

E

(17)

2

since the first integral is zero by symmetry. is therefore proportional to E . In this model, the electrical susceptibility of the atomic electron under consideration is equal to: 2

=

E

=

(18)

2

It is positive, whatever the sign of . It is simple to interpret result (18) physically. The effect of the electric field is to shift the classical equilibrium position of the electron, that is, the mean value of its position in quantum mechanics [see formula (13)]. This results in the appearance of an induced dipole moment. decreases when increases because the oscillator is less easily deformable when the restoring force (which is proportional to 2 ) is larger. 2-b.

Interpretation of the energy shift

Using the model just described, we can interpret formula (12) by calculating the variation in the mean kinetic and potential energies of the electron when it passes from the state to the state . The variation in the kinetic energy is, in fact, zero (as can be understood intuitively from Figure 1, for example): 2

2

2

2

=

~2 2

+

( )

d2 d 2

( )d +

( )

d2 d 2

( )d

according to formula (13). The variation in the potential energy can be treated in two terms: 578

= 0 (19)



A CHARGED HARMONIC OSCILLATOR IN A UNIFORM ELECTRIC FIELD

the first term, (E ) , corresponds to the electrical potential energy of the dipole in the field E . Since the dipole is parallel to the field, we have, according to (17): (E ) =

2

E

=

E2

(20)

2

the second term, ( ) ( ) , arises from the electric field modification of the wave function of the level labeled by the quantum number . The “elastic” potential energy of the particle therefore changes by a quantity: ( )

( ) =

+

1 2

2

+ 2

( )2d

2

( )2d

(21)

The first integral can be calculated by using (13) and the change of variable (8): +

+ 2

( )2 d =

2

( )2 d +

2 E

+ Since ( ) is normalized, and since the integral of obtain, finally: 2

( )

( ) =

2

E2

+

( )2 d

2 2

E 2

+

( )2 d

(22)

( ) 2 is zero by symmetry, we

(23)

2

We see why this result should be positive, since the electric field moves the particle away from the point and attracts it into a region where the “elastic” potential energy ( ) is larger. Adding (20) to (23), we again find that the energy of the state is less than 2 that of the state by 2 E 2 2 . 3.

Use of the translation operator

We shall see in this section that, instead of using the representation, as we have done until now, we can argue directly in terms of the operator (E ) given in (4). More precisely, we are going to show that a unitary transformation (which corresponds to a translation of the wave function along the -axis) transforms the operator = (E = 0) into the operator (E ) (to within an additive constant which does not change the eigenvectors). Since the eigenvectors and eigenvalues of were determined in Chapter V, this approach enables us to solve our problem. Therefore, consider the operator: ( )=e where

(

)

is a real constant. Its adjoint

( )=e

(

)

(24) ( ) is: (25) 579



COMPLEMENT FV

It is clear that: ( )

( )=

( ) ( )=1

(26)

( ) is therefore a unitary operator. Under the corresponding unitary transformation, becomes: ˜ =

( )

( )

1 + 2

=~

( )

( )

(27)

We must now calculate the operator: ( )

( )=˜ ˜

(28)

with: ˜=

( )

˜ =

( )

( ) ( )

(29)

To obtain ˜ and ˜ , we use formula (63) of Complement BII , (which can be applied here since the commutator of and is equal to 1), which yields: +

( )=e

=e

+

( )=e

=e

e

e

e e

2

2 2

2

(30)

Also, formula (51) of Complement BII enables us to write: e

=

e (31)

e

=

e

that is: e e

e e

= =

(32)

Thus it follows that: ˜=e

e

e

e

=e

(

)e

=

(33)

and, similarly: ˜ = 580

(34)



A CHARGED HARMONIC OSCILLATOR IN A UNIFORM ELECTRIC FIELD

˜ is therefore given by: 1 +( 2 1 + 2

˜ =~ =~ =

~ ( +

)(

)

( +

)+ 2

)+

2

(35)

~

Since ( + ) is proportional to the operator it suffices to set: E

=

[formulas (B-1) and (B-7) of Chapter V],

1 2 ~

(36)

to obtain: 2

E

˜ =

+

E2 2

2

2

=

(E ) +

2

E2 2

(37)

The two operators ˜ and (E ) therefore have the same eigenvectors, and their eigen2 values differ by 2 E 2 2 . Now, we know (cf. Complement CII , § 2) that if the eigenvectors of are the kets , those of ˜ are the kets: ˜

=

( )

(38)

and the corresponding eigenvalues of and ˜ are the same. The stationary states of the harmonic oscillator in the presence of the field E are therefore the states ˜ given by (38). The associated eigenvalue of (E ) is, according to (37): (E ) =

+

1 2

2

~

2

E2 2

(39)

which is the same as formula (12) of the preceding section. Expression (38) for the eigenvectors can be put into the form: = ˜

=e

E ~

2

(40)

using (24) and (36), as well as formulas (B-1) and (B-7) of Chapter V. We interpreted, in ~ Complement EII , the operator e as being the translation operator over an algebraic distance along . is therefore the state obtained from by a translation 2 E , just as is indicated by formula (13). References:

The elastically bound electron: see references of Complement AXIII .

581



COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

Complement GV Coherent “quasi-classical” states of the harmonic oscillator

1

2

3

4

Quasi-classical states . . . . . . . . . . . . . . . . . . . . . . . 584 1-a Introducing the parameter 0 to characterize a classical motion 584 1-b Conditions defining quasi-classical states . . . . . . . . . . . . 586 1-c Quasi-classical states are eigenvectors of the operator . . . 587 Properties of the states . . . . . . . . . . . . . . . . . . . . 588 2-a Expansion of on the basis of the stationary states . . 588 2-b Possible values of the energy in an state . . . . . . . . . . 589 2-c Calculation of , , ∆ and ∆ in an state . . . . . 591 2-d The operator ( ): the wave functions Ψ ( ) . . . . . . . . 591 2-e The scalar product of two states. Closure relation . . . . 593 Time evolution of a quasi-classical state . . . . . . . . . . . . 594 3-a A quasi-classical state always remains an eigenvector of . . 595 3-b Evolution of physical properties . . . . . . . . . . . . . . . . . 595 3-c Motion of the wave packet . . . . . . . . . . . . . . . . . . . . 596 Example: quantum mechanical treatment of a macroscopic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596

The properties of the stationary states of the harmonic oscillator were studied in Chapter V; for example, in § D, we saw that the mean values and of the position and the momentum of the oscillator are zero in such a state. Now, in classical mechanics, it is well-known that the position and the momentum are oscillating functions of time, which always remain zero only if the energy of the motion is also zero [cf. Chap. V, relations (A-5) and (A-8)]. Furthermore, we know that quantum mechanics must yield the same results as classical mechanics in the limiting case where the harmonic oscillator has an energy much greater than the quantum ~ (limit of large quantum numbers). Thus, we may ask the following question: is it possible to construct quantum mechanical states leading to physical predictions which are almost identical to the classical ones, at least for a macroscopic oscillator? We shall see in this complement that such quantum states exist: they are coherent linear superpositions of all the states . We shall call them “quasi-classical states” or “coherent states of the harmonic oscillator”. The problem we are considering here is of great general interest in quantum mechanics. As we saw in the introduction to Chapter V and in Complement AV , many physical systems can be likened to a harmonic oscillator, at least to a first approximation. For all these systems, it is important to understand, in the framework of quantum mechanics, how to move gradually from the case in which the results given by the classical approximation are sufficient to the case in which quantum effects are preponderant. Electromagnetic radiation is a very important example of such a system. Depending on the experiment, it either reveals its quantum mechanical nature (as is the case in the 583

COMPLEMENT GV



experiment discussed in § A-2-a of Chapter I, in which the light intensity is very low) or else can be treated classically. “Coherent states” of electromagnetic radiation were introduced by Glauber and are in current use in the domain of quantum optics. The position, the momentum, and the energy of a harmonic oscillator are described in quantum mechanics by operators which do not commute; they are incompatible physical quantities. It is not possible, therefore, to construct a state in which they are all perfectly well-defined. We shall thus only look for a state vector such that, for all , the mean values , and are as close as possible to the corresponding classical values. This will lead us to a compromise in which none of these three observables is perfectly known. We shall see, nevertheless, that the root mean square deviations ∆ , ∆ and ∆ are, in the macroscopic limit, completely negligible. 1.

Quasi-classical states

1-a.

Introducing the parameter

0

to characterize a classical motion

The classical equations of motion of a one-dimensional harmonic oscillator, of mass and angular frequency , are written: d 1 ()= () d

(1a)

d 2 ()= () (1b) d The quantum mechanical calculations we shall perform later will be simplified by the introduction of the dimensionless quantities: ˆ( ) =

() (2)

1 ˆ( ) = ~

()

where: =

(3) ~

Equations (1) can then be written: d ˆ( ) = d

ˆ( )

(4a)

d ˆ( ) = ˆ( ) (4b) d The classical state of the harmonic oscillator is determined at time when we know its position ( ) and its momentum ( ), that is, ˆ( ) and ˆ( ). We shall therefore combine these two real numbers into a single dimensionless complex number ( ) defined by: ()= 584

1 [ˆ( ) + ˆ( )] 2

(5)



COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

ˆ x(t) √2

Figure 1: The point ( ), which corresponds to the complex number ( ), characterizes the state of the harmonic oscillator at each instant. moves in a circle with an angular velocity . The abscissa and ordinate of give the position and momentum of the oscillator.

Im α(t)

M0

–𝜔t

Re α(t)

O ˆ p(t) √2

M (t)

The set of two equations (4) is equivalent to the single equation: d ()= d

()

(6)

whose solution is: ()=

0

e

(7)

where we have set: 0

= (0) =

1 [ˆ(0) + ˆ(0)] 2

(8)

Now consider the points 0 and in the complex plane that correspond to the complex numbers 0 and ( ) [Fig. 1]. is at 0 at = 0 and describes, with an angular velocity , a circle centered at of radius 0. Since, according to (5), the coordinates of are equal to ˆ( ) 2 and ˆ( ) 2, we thus obtain a very simple geometrical representation of the time evolution of the state of the system. Every possible motion corresponding to given initial conditions is entirely characterized by the point 0 , that is, by the complex number 0 (the modulus of 0 gives the amplitude of the oscillation and the argument of 0 , its phase). According to (5) and (7), we have: ˆ( ) =

ˆ( ) = As for the classical energy

1 2

0

2

e

0

+

e

0

e

0

(9a)

e

(9b)

of the system, it is constant in time and equal to:

1 1 2 [ (0)]2 + [ (0)]2 2 2 ~ = [ˆ(0)]2 + [ˆ(0)]2 2 =

(10) 585



COMPLEMENT GV

which yields, taking (8) into account: =~

0

2

(11)

For a macroscopic oscillator, the energy

is much greater than the quantum ~ , so:

1

0

1-b.

(12)

Conditions defining quasi-classical states

We are looking for a quantum mechanical state for which at every instant the mean values , and are practically equal to the values , and which correspond to a given classical motion. To calculate , and , we use the expressions: ˆ=

1 ( + 2

=

ˆ= 1 ~

=

2

)

(

)

(13)

and: =~

+

1 2

(14)

For an arbitrary state ( ) , the time evolution of the matrix element is given by (cf. § D-1-d of Chapter III): ~

d d

()= [

()=

()

]()

()

(15)

Now: [

]=~ [

]=~

(16)

which implies: d d

()=

()

(17)

that is: ()=

(0) e

The evolution of ()=

(0) e

=

(0) e

(18) ()=

()

( ) obeys the complex conjugate equation:

(18) and (19) are analogous to the classical equation (7). 586

(19)



COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

Substituting (18) and (19) into (13), we obtain: ˆ ()=

1 2

(0) e

+

(0) e (20)

ˆ ()=

(0) e

2

(0) e

Comparing these results with (9), we see that, in order to have at all times : ˆ ( ) = ˆ( ) (21) ˆ ( ) = ˆ( ) it is necessary and sufficient to set, at the instant = 0, the condition: (0) =

(22)

0

where 0 is the complex parameter characterizing the classical motion which we are trying to reproduce quantum mechanically. The normalized state vector ( ) of the oscillator must therefore satisfy the condition: (0)

(0) =

(23)

0

We must now require the mean value: =~

(0) +

~ 2

(24)

to be equal to the classical energy given by (11). Since, for a classical oscillator, 0 is much greater than 1 [cf. (12)], we shall neglect the term ~ 2 (of purely quantum 2 mechanical origin; see § D-2 of Chapter V) with respect to ~ 0 . The second condition on the state vector can now be written: (0) =

0

2

(25)

that is: (0)

(0) =

0

2

(26)

We shall see that conditions (23) and (26) are sufficient to determine the normalized state vector (0) (to within a constant phase factor). 1-c.

Quasi-classical states are eigenvectors of the operator

We introduce the operator ( (

0)

=

0

0)

defined by: (27) 587



COMPLEMENT GV

We then have: (

0)

(

0)

=

0

+

0

0

and the square of the norm of the ket ( (0)

(

0)

(

0)

(28)

0 0)

(0) is:

(0) =

(0)

(0)

0

(0)

(0)

(0)

0

(0) +

0

0

(29)

Substituting into this relation conditions (23) and (26), we obtain: (0)

(

The ket ( (

0)

0) 0)

(

0)

(0) =

0

0

0 0

0

0

+

0

0

=0

(30)

(0) , whose norm is zero, is therefore zero:

(0) = 0

(31)

that is: (0) =

0

(0)

(32)

Conversely, if the normalized vector (0) satisfies this relation, it is obvious that conditions (23) and (26) are satisfied. We therefore arrive at the following result: the quasi-classical state, associated with a classical motion characterized by the parameter 0 , is such that (0) is an eigenvector of the operator with the eigenvalue 0 . In what follows, we shall denote the eigenvector of with eigenvalue by : =

(33)

[we shall show later that the solution of (33) is unique to within a constant factor]. 2.

Properties of the

2-a.

Expansion of

states on the basis of the stationary states

Let us determine the ket the states : =

which is a solution of (33) by using an expansion on

( )

(34)

We then have: =

( )

1

(35)

and, substituting this relation into (33), we obtain: +1 (

588

)=

+1

( )

(36)



COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

This relation enables us to determine by recurrence all the coefficients 0 ( ): ( )=

0(

!

( ) in terms of

)

(37)

It follows that, when 0 ( ) is fixed, all the ( ) are also fixed. The vector is therefore unique to within a multiplicative factor. We shall choose 0 ( ) real and positive and normalize the ket , which determines it completely. In this case, the coefficients ( ) satisfy: ( )2=1

(38)

that is: 2 0(

)2

!

=

2

)2e

0(

=1

(39)

With the convention we have chosen: 0(

2

)=e

2

(40)

and, finally: 2

=e

2-b.

2

(41)

! Possible values of the energy in an

state

Let us consider an oscillator in the state . We see from (41) that a measurement of the energy can yield the result = ( + 1 2)~ with the probability: 2

( )=

( )2=

2

e

!

The probability distribution obtained,

(42) ( ), is therefore a Poisson distribution. Since:

2

( )=

1(

)

it is easy to verify that

(43) ( ) reaches its maximum value when:

= the integral part of

2

To calculate the mean value

=

( )

+

(44) of the energy, we can use (42) and the expression:

1 ~ 2

(45)

Nevertheless, it is quicker to notice that, since the adjoint relation of (33) is: =

(46) 589



COMPLEMENT GV

we have: =

(47)

and therefore: =~

+

1 2

=~

2

+

1 2

(48)

Comparing this result to (44), we see that, when 1, is not very different, in relative value, from the energy which corresponds to the maximum value of ( ). 2 Let us calculate the mean value : 2

= ~2

2

+

2

1 2

(49)

Using (33) and the fact that [ 2

= ~2

2

4

+2

2

] = 1, we easily obtain: +

1 4

(50)

from which we get: ∆

=~

(51)

If we compare (48) and (51), we see that, if ∆

1

is very large, we have:

1

(52)

The energy of the state small.

is very well-defined, since its relative uncertainty is very

Comment:

Since: =

+

1 2

~

(53)

we immediately obtain from (48) and (51): =

2

(54) ∆

=

Thus we see that, to obtain a quasi-classical state, we must linearly superpose a very large number of states since ∆ 1. However, the relative value of the dispersion over is very small: ∆

590

=

1

1

(55)

• 2-c.

COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

Calculation of

and

,

,∆

and ∆

in an

state

The mean values and can be obtained by expressing [formula (13)] and using (33) and (46). We obtain: =

2~

=

=

=

Re( )

and

in terms of

(56)

2 ~ Im( )

An analogous calculation yields: 2

=

~

~ 2 and therefore: 2

=



=



=

)2

(

(57)

~ ~ 2

(58)

nor ∆



2-d.

1

2

Neither ∆ value: ∆

)2 + 1

( +

2

, depends on

. Note also that ∆



=~ 2

takes on its minimum (59)

The operator

( ): the wave functions Ψ ( )

Consider the operator

( ) defined by:

( )=e

(60)

This operator is unitary, since: ( )=e

(61)

immediately implies: ( )

( )=

( ) ( )=1

(62)

Since the commutator of the operators and is equal to we can use relation (63) of Complement BII to write: 2

( )=e

2

e

e

, which is a number, (63)

Now let us calculate the ket

( )

0

; since:

2

e

0

= 1 =

0

+

2

2!

+

0

(64) 591



COMPLEMENT GV

then: ( )

2

=e

0

2

2

=e

e

0

(

2

) 0

! 2

=e

2

(65)

!

Comparing (41) and (65), we see that: =

( )

(66)

0

( ) is therefore the unitary transformation which transforms the ground state the quasi-classical state . Formula (66) will enable us to obtain the wave function: ( )=

0

into

(67)

which characterizes the quasi-classical state ( )=

( )

in the

representation. To calculate: (68)

0

we shall write the operator

in terms of

and

:

+

=

2

~

(69)

2

~

Using formula (63) of Complement BII again, we obtain: +

( )=e

=e

e

2

~

2

2

~

e

2

(70)

4

Substituting this result into (68), we find: 2

( )=e

e

2

=e

+

2

4

e

2

~

2

~

+

2

e

4

0

~

Now, the operator e ment EII ):

e

2

~

2

~

(71)

0

is the translation operator of

along

(cf. Comple-

+

e

~

=

2

~

2

( +

)

(72)

Relation (71) therefore yields: 2

( )=e 592

4

2

e

~

2

0

~ 2

( +

)

(73)

• If we write

and

( )=e

COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

in terms of ~

e

0

and

(

2

=e

( ) becomes:

)

where the global phase factor e e

[formulas (56)],

(74)

is defined by:

2

(75)

4

Relation (74) shows that ( ) can easily be obtained from the wave function 0 ( ) of the ground state of the oscillator: it is sufficient to translate this function along by ~ the quantity and to multiply it by the oscillating exponential e (since the factor e plays no physical role, it can be omitted)1 . If we replace 0 in (74) by its explicit expression, we obtain, finally: 2

1 4

( )=e

exp

+

2∆

~

The form of the wave packet associated with the ( )2= ~

state is therefore given by:

2

1 2

exp

(76) ~

(77)



For any state, we obtain a Gaussian wave packet. This result is consitent with the fact that the product ∆ ∆ is always minimal (cf. Complement CIII ). 2-e.

The scalar product of two

states. Closure relation

The states are eigenvectors of the non-Hermitian operator . There is therefore no obvious reason for these states to satisfy orthogonality and closure relations. In this section, we shall investigate this question. First, we shall consider two eigenkets and of the operator . Relation (41) gives their scalar product, since: =

( ) ( )

(78)

We therefore have: =e

=e

2

2

2

2

2

e

(

2

) !

2

e

2

e

(79)

from which we conclude: 2

=e

2

(80)

1 The exponential e ~ is obviously not a global phase factor since its value depends on . The presence of this exponential in (74) insures that the mean value of in the state described by ( ) be equal to .

593



COMPLEMENT GV

This scalar product is therefore never zero. However, we shall show that the states do satisfy a closure relation, which is written: 1 d Re d Im =1 (81) To do so, we replace 1

, on the left-hand side of (81), by its expression (41). This yields:

2

e

!

d Re

!

d Im

that is, going into polar coordinates in the complex 2

1

d 0

e(

= e ):

)

(83)

! !

0

The integral over

plane (setting

+

2

d e

(82)

is easily calculated:

2

e(

)

d =2

(84)

0

which yields for (83): 1 !

(85)

with: =2

d e

2

2

=

0

d e

Integrating by parts, we find a recurrence relation for the =

(86)

0

:

1

(87)

whose solution is: = !

0

= !

(88)

Substituting this result into (85), we see that the left-hand side of formula (81) can finally be written: (89) which proves that formula. 3.

Time evolution of a quasi-classical state

Consider a harmonic oscillator which, at the instant = 0, is in a particular (0) =

0

state: (90)

How do its physical properties evolve over time? We already know (cf. § 1-b) that the mean values ( ) and ( ) always remain equal to the corresponding classical values. We shall now study other interesting properties of the state vector ( ) . 594

• 3-a.

COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

A quasi-classical state always remains an eigenvector of

Starting with (41), we can use the general rule to obtain tonian is not time-dependent (cf. Chap. III, § D-2-a): 2

() =e

0

2

2

=e

e

0

2

~

e

!

0

2

( ) when the Hamil-

e

(91)

!

If we compare this result with (41), we see that, to go from (0) = 0 to ( ) , all 2 we must do is change 0 to 0 e and multiply the ket obtained by e (which is a global phase factor with no physical consequences): 2

() =e

=

0

e

(92)

In other words, we see that a quasi-classical state remains an eigenvector of for all time, with an eigenvalue 0 e which is nothing more than the parameter ( ) of Figure 1 (corresponding to the point ), which characterizes at all times the classical oscillator whose motion is reproduced by the state ( ) . 3-b.

Evolution of physical properties

Using (92) and changing 2~

()=

Re[

0

e

to

0

e

in (56), we immediately obtain:

] (93)

()=

2 ~ Im[

0

e

]

As predicted, these equations are similar to the classical equations (9). The average energy of the oscillator is time-independent: =~

0

2

+

1 2

According to (51) and (58), the root mean square deviations ∆ , ∆ to: ∆

=~

(94) and ∆

are equal (95)

0

and: ∆ ∆

= =

~ 2

(96) ~ 2

∆ and ∆ are not time-dependent, the wave packet remains a minimum wave packet for all time. 595

COMPLEMENT GV

3-c.



Motion of the wave packet

Let us calculate the wave function at time : (

)=

where

()

(97)

( ) is given by (92). From (76), we obtain: 1 4

(

)=e

2

e

( )

( )

e

~

e

2

2∆

(98)

~ At , the wave packet is still Gaussian. Its form does not vary with time, since: )2=

(

0

[

( )]

2

(99)

Thus, it remains “minimum” for all time [cf. (96)]. Figure 2 shows the motion of the wave packet, which performs a periodic oscillation ( =2 ) along the axis, without becoming distorted. We saw in Complement GI that a Gaussian wave packet, when it is free, becomes distorted as it propagates, since its width varies (“spreading” of the wave packet). We see here that nothing of the sort occurs when a wave packet is subject to the influence of a parabolic potential. Physically, this result arises from the fact that the tendency of the wave packet to spread is compensated by the potential, whose effect is to push the wave packet towards the origin from regions where ( ) is large. What happens to these results when is very large? The root mean square deviations ∆ and ∆ do not change, as is shown by (96). On the other hand, the oscillation amplitudes of ( ) and ( ) become much larger than ∆ and ∆ . By choosing a sufficiently large value of , one can obtain a quantum mechanical motion for which the position and momentum of the oscillator are, in relative value, as well-defined as might be desired. Therefore when 1, an state describes very well the motion of a macroscopic harmonic oscillator, for which the position, the momentum and the energy can be considered to be classical quantities. 4.

Example: quantum mechanical treatment of a macroscopic oscillator

Let us consider a concrete example: a macroscopic body of mass = 1 kg suspended from a rope of length = 0 1 m and placed in the gravitational field ( 10 m/s2 ). We know that, for small oscillations, the period of the motion is given by: (100)

=2 In our case, we obtain: 0 63 s

(101) = 10 rd s 596



COHERENT “QUASI-CLASSICAL” STATES OF THE HARMONIC OSCILLATOR

ψ (x,t) 2

0

x

0

x

0

x

0

x

Figure 2: Motion of the Gaussian wave packet associated with an state: under the effect of the parabolic potential ( ), the wave packet oscillates without becoming distorted.

Let us now assume that this oscillator performs a periodic motion of amplitude = 1 cm. What is the quantum mechanical state that best represents its oscillation? We have seen that this state is an the relation: =

state in which, according to (93),

satisfies

(102)

2~

that is, in our case: 5

1015

(the argument of

22

1015

1

(103)

is determined by the initial phase of the motion). 597



COMPLEMENT GV

The root mean square deviations ∆ ∆ ∆

=

~

22

2

18

10

17

m

22

kg m s

The root mean square deviation ∆ ∆

22

are then:

(104) ~ 2

=

10

and ∆

10

17

of the velocity is equal to:

m s

(105)

Since the maximum velocity of the oscillator is 0.1 m/s, we see that the uncertainties of its position and its velocity are completely negligible compared to the quantities involved in the problem. For example, ∆ is less than a Fermi (10 15 m), that is, the approximate size of a nucleus. Measuring a macroscopic length with this accuracy is obviously out of the question. Finally, the energy of the oscillator is known with an excellent relative accuracy, since, according to (52): ∆

1

04

10

15

1

(106)

The laws of classical mechanics are therefore quite adequate for the study of the evolution of the macroscopic oscillator. References and suggestions for further reading:

Glauber’s lectures in (15.2).

598



NORMAL VIBRATIONAL MODES OF TWO COUPLED HARMONIC OSCILLATORS

Complement HV Normal vibrational modes of two coupled harmonic oscillators

1

Vibration of the two coupled in classical mechanics . . . . . 599

2

1-a

Equations of motion . . . . . . . . . . . . . . . . . . . . . . . 599

1-b

Solving the equations of motion . . . . . . . . . . . . . . . . . 600

1-c

The physical meaning of each of the modes . . . . . . . . . . 601

1-d

Motion of the system in the general case . . . . . . . . . . . . 602

Vibrational states of the system in quantum mechanics . . 605 2-a

Commutation relations . . . . . . . . . . . . . . . . . . . . . . 605

2-b

Transformation of the Hamiltonian operator . . . . . . . . . . 606

2-c

Stationary states of the system . . . . . . . . . . . . . . . . . 607

2-d

Evolution of the mean values . . . . . . . . . . . . . . . . . . 609

This complement is devoted to the study of the motion of two coupled (onedimensional) harmonic oscillators. Such a study is of interest because it permits the introduction, in a very simple case, of a physically important concept: that of normal vibrational modes. This concept, encountered in quantum mechanics as well as in classical mechanics, appears in numerous problems: for example, in the study of atomic vibrations in a crystal (cf. Complement JV ) and of the vibrations of electromagnetic radiation (cf. Complement KV ). 1.

Vibration of the two coupled in classical mechanics

1-a.

Equations of motion

Let us therefore consider two particles (1) and (2), of the same mass , moving along the axis, with abscissas 1 and 2 . To begin, we assume their potential energy to be: 0( 1

2)

=

1 2

2

(

)2 +

1

1 2

2

(

2

+ )2

(1)

When 1 = and 2 = , the potential energy 0 ( 1 2 ) is minimal, and the two particles are in stable equilibrium. If the particles move from these equilibrium positions, they are subjected to the forces 1 and 2 , respectively:

1

=

0( 1

2)

=

2

=

2

(

1

)

1 2

=

(2) 0( 1

2)

(

2

+ )

2

599



COMPLEMENT HV

and their motion is given by: d2 d2

1(

d2 d2

2(

2

)=

(

)

1

(3) 2

)=

(

2

+ )

Each particle therefore follows an independent sinusoidal motion, centered at its equilibrium position. The amplitude of the motion of each particle is arbitrary1 and can be fixed by a suitable choice of the initial conditions. Now let us assume the potential energy ( 1 2 ) of the two particles to be: (

1

2)

=

0( 1

2)

1

2)

=

2

+

(

1

2)

(4)

with: (

(

2)

1

2

(5)

(where is a dimensionless positive constant which we shall call the “coupling constant”). To the forces 1 and 2 written in (2), we must add, respectively, the forces 1 and 2 given by: 1

2

=

(

1

2)

=2

2

(

1)

2

1

=

(6) (

1

2)

=2

2

(

2)

1

2

We see that the introduction of ( 1 2 ) takes into account an attractive force between the particles, which is proportional to the distance between them. The two particles (1) and (2) are therefore no longer independent; what is their motion now? Before attacking this problem from a quantum mechanical point of view, we shall recall the results given by classical mechanics. 1-b.

Solving the equations of motion

In the presence of the coupling coupled differential equations: d2 d2

1(

d2 d2

2(

)=

2

(

)+2

1

(

2 ),

1

2

(

2

we must replace (3) by the system of

1)

(7) )=

2

(

2

+ )+2

2

(

1

2)

We know how to solve such a system (see for example Chapter IV, § C-3-a). We diagonalize the matrix of the coefficients appearing on the right-hand side of (7): = 1 Of

2

1+2 2

2 1+2

(8)

course, the choice of the potential (1) implies that we are not taking into account the collisions that could occur if sufficiently large amplitudes were chosen.

600



NORMAL VIBRATIONAL MODES OF TWO COUPLED HARMONIC OSCILLATORS

We then replace 1 ( ) and 2 ( ) by linear combinations of these two functions (given by the eigenvectors of ) whose time dependence obeys uncoupled linear differential equations (with coefficients which are the eigenvalues of ). In this case, these linear combinations are: ()=

1 [ 2

1(

)+

2(

)]

(9)

(the position of the center of mass of the two particles) and: ()=

1(

)

2(

)

(10)

(the abscissa of the “relative particle”). Substituting (9) and (10) into (7), we obtain (taking the sum and the difference): d2 d2

()=

2

d2 d2

()=

2

()

[

()

2 ]

4

2

()

(11)

These equations can be integrated immediately: ()=

0

cos(

+

) (12)

2 ()= + 1+4

0

cos(

+

)

with: = (13) =

1+4

0 , 0, and are integration constants fixed by the initial conditions. To obtain the motion of particles (1) and (2), all we must do is invert formulas (9) and (10):

1(

2(

)= )=

( )+

1 2

()

()

1 2

()

(14)

and substitute (12) into these equations. 1-c.

The physical meaning of each of the modes

Through the change of functions performed in (9) and (10), we have been able to find the motion of the two interacting particles by associating with them two fictitious particles ( ) and ( ), of abscissas ( ) and ( ). These fictitious particles do not 601

COMPLEMENT HV



interact; their motions are independent, so their amplitudes and phases can be fixed arbitrarily by a suitable choice of the initial conditions. For example, it is possible to require one of the two fictitious particles to be motionless without this being the case for the other one: we then say that a vibrational mode of the system is excited. It must be understood that, in a vibrational mode, the real particles (1) and (2) are both in motion with the same angular frequency ($\omega_G$ or $\omega_R$, depending on the mode). No solution of the equations of motion exists for which one of the two real particles (1) or (2) remains motionless while the other one vibrates. If, at the instant $t = 0$, one were to give an initial velocity to only one of the two particles, (1) or (2), the coupling force would set the other one in motion (cf. discussion of § 1-d below). The simplest case is of course the one in which neither of the two modes is excited. In formulas (12) such a situation corresponds to $x_G^0 = x_R^0 = 0$; $x_G(t)$ and $x_R(t)$ then always remain equal to zero and $2a/(1 + 4\lambda)$ respectively, which, according to (14), yields:
$$x_1 = -x_2 = \frac{a}{1 + 4\lambda} \tag{15}$$

The system does not oscillate and the two particles (1) and (2) remain motionless in their new equilibrium positions given by (15) (it can be verified that, for these values of $x_1$ and $x_2$, the forces exerted on the particles are zero; the fact that these equilibrium positions are closer in the presence of the coupling than when $\lambda = 0$ is due to their mutual attraction). To excite only the mode corresponding to $x_G(t)$, one places the two particles (1) and (2) at the initial instant at the same distance $2a/(1 + 4\lambda)$ as in the preceding case, and one gives them equal velocities. One then finds that $x_R(t)$ remains equal to $2a/(1 + 4\lambda)$ (the initial conditions require $x_R^0$ to be zero). The two particles move "in unison", performing the same motion without the distance between them varying. For this mode, the two-particle system can be treated like a single undeformable particle of mass $2m$ on which is exerted the force $F_1 + F_2 = -2m\omega^2 x_G(t)$. We then see why the angular frequency of this mode is $\omega_G = \omega$ [cf. formula (A-3) of Chapter V]. To excite only the mode corresponding to $x_R(t)$, one chooses an initial state in which the positions and initial velocities of the two particles are opposite. One then finds that, at every subsequent instant, $x_G(t) = 0$, and the two particles move symmetrically with respect to the origin $O$. For this mode, the distance $(x_2 - x_1)$ varies and the attractive force between the two particles enters into the equations of motion; this is the reason why the angular frequency of this mode is not $\omega$ but $\omega_R = \omega\sqrt{1 + 4\lambda}$. The dynamical variables $x_G(t)$ and $x_R(t)$ associated with the independent modes, that is, with the fictitious particles $(G)$ and $(R)$, are called normal variables.

1-d. Motion of the system in the general case

In the general case, both modes are excited and the positions $x_1(t)$ and $x_2(t)$ are both given by the superposition of two oscillations of different frequencies $\omega_G$ and $\omega_R$ [cf. formulas (14)]. The motion of the system is not periodic, except in the case in which the ratio $\omega_G/\omega_R$ is rational².

² If $\omega_G/\omega_R = 1/\sqrt{1 + 4\lambda}$ is equal to an irreducible rational fraction $p_1/p_2$, the period of the motion is $T = 2\pi p_1/\omega_G = 2\pi p_2/\omega_R$.

Let us investigate, for example, what happens if, at the initial time $t_0$, particle (1) is motionless at its equilibrium position $x_1 = a/(1 + 4\lambda)$, while particle (2) has a non-zero velocity (this is, in classical mechanics, the analogue of the problem studied in § C-3-b of Chapter IV). In the absence of coupling, particle (2) would oscillate alone and particle (1) would remain motionless. We shall show that the coupling sets particle (1) in motion. Two different angular frequencies $\omega_G$ and $\omega_R$ appear in the time evolution of $x_1(t)$ and $x_2(t)$. The two corresponding oscillations give rise to a beat phenomenon (Fig. 1), whose frequency is:
$$\Delta\nu = \frac{\omega_R - \omega_G}{2\pi} = \frac{\omega}{2\pi}\left[\sqrt{1 + 4\lambda} - 1\right] \tag{16}$$
If the coupling is weak ($\lambda \ll 1$), this frequency is negligible with respect to $\omega_G/2\pi$ and $\omega_R/2\pi$. In this case, as long as $(t - t_0)$ remains small compared with $1/\Delta\nu$, particle (2) is practically the only one to oscillate; the vibrational energy is then slowly transferred to particle (1), whose amplitude of oscillation increases, while that of (2) decreases. After a certain time, the original situation is inverted: particle (1) oscillates strongly while particle (2) is practically motionless. Then the amplitude of (1) slowly decreases and that of (2) increases until the energy is again almost entirely localized in oscillator (2). The same process is repeated indefinitely. The effect of a weak coupling is to cause the energy of the oscillator associated with particle (1) to be constantly transferred to the one associated with particle (2) and vice versa, with a frequency proportional to the intensity of the coupling.

Figure 1: Oscillations of the position $x_1$ of particle (1), assumed to be motionless at its equilibrium position $a/(1 + 4\lambda)$ at $t = 0$, particle (2) having an initial velocity. [Axes: $x_1$ versus $t$; the dashed line marks the equilibrium value $a/(1 + 4\lambda)$.] A beat phenomenon is produced between the two modes, and the amplitude of the oscillations of particle (1) varies over time.
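As a quick numerical illustration of this beat phenomenon (a minimal sketch added here, not part of the original text; the parameter values $m = \omega = a = 1$, $\lambda = 0.05$ are arbitrary choices), one can integrate the coupled equations (7) directly and watch the envelope of $x_1(t)$ grow and shrink at the frequency (16):

```python
import numpy as np
from scipy.integrate import solve_ivp

m, w, a, lam = 1.0, 1.0, 1.0, 0.05          # arbitrary illustrative values

def rhs(t, y):
    x1, v1, x2, v2 = y
    a1 = -w**2 * (x1 - a) + 2 * lam * w**2 * (x2 - x1)   # equations (7)
    a2 = -w**2 * (x2 + a) + 2 * lam * w**2 * (x1 - x2)
    return [v1, a1, v2, a2]

# particle (1) at rest at its equilibrium position a/(1+4*lam), particle (2) kicked
y0 = [a / (1 + 4 * lam), 0.0, -a / (1 + 4 * lam), 1.0]
t = np.linspace(0, 400, 20000)
sol = solve_ivp(rhs, (t[0], t[-1]), y0, t_eval=t, rtol=1e-9)

x1 = sol.y[0]
print("beat frequency, formula (16):", w / (2 * np.pi) * (np.sqrt(1 + 4 * lam) - 1))
# the envelope of x1(t) - a/(1+4*lam) oscillates at exactly this frequency
```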

Comments:

(i) If $p_1$ and $p_2$ are the respective momenta of particles (1) and (2), the classical Hamiltonian of the system under study can be written:
$$H(x_1, x_2; p_1, p_2) = \frac{p_1^2}{2m} + \frac{p_2^2}{2m} + V_0(x_1, x_2) + W(x_1 - x_2) \tag{17}$$

If we set:
$$p_G(t) = p_1(t) + p_2(t) \qquad ; \qquad p_R(t) = \frac{1}{2}\,[p_1(t) - p_2(t)] \tag{18}$$
and:
$$m_G = 2m \qquad ; \qquad m_R = \frac{m}{2} \tag{19}$$
it can be verified that $H$ becomes:
$$H = \frac{p_G^2}{2m_G} + \frac{1}{2}\,m_G\,\omega_G^2\,x_G^2 + \frac{p_R^2}{2m_R} + \frac{1}{2}\,m_R\,\omega_R^2\left[x_R - \frac{2a}{1 + 4\lambda}\right]^2 + \frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2 \tag{20}$$
By a suitable change in the energy origin, one can eliminate the last term of this expression, which is constant. $H$ can then be seen to be the sum of two energies, each of which corresponds to one of the modes. Unlike the situation in (17), in which the terms in $x_1 x_2$ of $W(x_1 - x_2)$ are responsible for a coupling between the particles, there is no coupling term in (20) between the modes: they are indeed independent.

(ii) We have assumed, for simplicity, the masses $m_1$ and $m_2$ of particles (1) and (2) to be equal. It is easy to eliminate this hypothesis by replacing (9), (10), (18) and (19) by:
$$x_G(t) = \frac{m_1 x_1(t) + m_2 x_2(t)}{m_1 + m_2} \qquad ; \qquad p_G(t) = p_1(t) + p_2(t) \qquad ; \qquad m_G = m_1 + m_2 \tag{21}$$
(the position, momentum and mass associated with the center of mass) and:
$$x_R(t) = x_1(t) - x_2(t) \qquad ; \qquad p_R(t) = \frac{m_2 p_1(t) - m_1 p_2(t)}{m_1 + m_2} \qquad ; \qquad m_R = \frac{m_1 m_2}{m_1 + m_2} \tag{22}$$
(the position, momentum and mass of the "relative particle"). One then finds a result analogous to (20).

(iii) In the absence of coupling, the two modes have the same angular frequency $\omega$; in the presence of coupling, two different angular frequencies $\omega_G$ and $\omega_R$ appear. This is an example of a result which is often found in physics: the effect of a coupling between two oscillations is, in most cases, to separate their normal frequencies (the same phenomenon would occur here if the two oscillators originally had different angular frequencies). If, instead of two, we have an infinite number of oscillators (which, if isolated, would have the same frequency), we shall see in Complement JV that the effect of the coupling is to create an infinite number of different frequencies for the modes.

2. Vibrational states of the system in quantum mechanics

Let us now reconsider the problem from a quantum mechanical point of view. We must now replace the positions $x_1(t)$, $x_2(t)$ and the momenta $p_1(t)$, $p_2(t)$ of the particles by operators, which we shall denote, respectively, by $X_1$, $X_2$, $P_1$, $P_2$. We then introduce, by analogy with (9), (10) and (18), the observables:
$$X_G = \frac{1}{2}\,(X_1 + X_2) \qquad ; \qquad P_G = P_1 + P_2 \tag{23}$$
$$X_R = X_1 - X_2 \qquad ; \qquad P_R = \frac{1}{2}\,(P_1 - P_2) \tag{24}$$

To see if the operator $H$, the Hamiltonian of the system, can be put into a form analogous to (20), we shall begin by examining the commutation relations of $X_G$, $P_G$, $X_R$ and $P_R$.

2-a. Commutation relations

Since all the observables concerning only particle (1) commute with those concerning particle (2), the only non-zero commutators involving $X_1$, $X_2$, $P_1$ and $P_2$ are:
$$[X_1, P_1] = i\hbar \qquad ; \qquad [X_2, P_2] = i\hbar \tag{25}$$
In particular, $X_1$ commutes with $X_2$, and we see immediately that:
$$[X_G, X_R] = 0 \tag{26}$$
Similarly:
$$[P_G, P_R] = 0 \tag{27}$$

Calculating the commutator $[X_G, P_G]$, we obtain:
$$[X_G, P_G] = \frac{1}{2}\left\{[X_1, P_1] + [X_1, P_2] + [X_2, P_1] + [X_2, P_2]\right\} = \frac{1}{2}\,(i\hbar + i\hbar) = i\hbar \tag{28}$$
Similarly, one finds:
$$[X_R, P_R] = i\hbar \tag{29}$$
The two commutators $[X_G, P_R]$ and $[X_R, P_G]$ remain to be examined; they are equal to:
$$[X_G, P_R] = \frac{1}{4}\left\{[X_1, P_1] - [X_1, P_2] + [X_2, P_1] - [X_2, P_2]\right\} = \frac{1}{4}\,(i\hbar - i\hbar) = 0 \tag{30}$$
and, similarly:
$$[X_R, P_G] = 0 \tag{31}$$
We can thus consider $X_G$, $P_G$ and $X_R$, $P_R$ to be the position and momentum operators of two distinct particles. Formulas (28) and (29) are the canonical commutation relations for each of these particles. Moreover, relations (26), (27), (30) and (31) express the fact that all the observables concerning one of them commute with all those which concern the other one.
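These relations have an exact classical analogue, which can be checked mechanically (a small symbolic sketch, not from the original text: Poisson brackets computed with sympy; the brackets that equal 1 become the commutators $i\hbar$ of (28) and (29) after quantization, and those that vanish become (26), (27), (30), (31)):

```python
import sympy as sp

x1, x2, p1, p2 = sp.symbols('x1 x2 p1 p2')
xG, pG = (x1 + x2) / 2, p1 + p2          # definitions (23)
xR, pR = x1 - x2, (p1 - p2) / 2          # definitions (24)

def poisson(f, g):
    # Poisson bracket with respect to the canonical pairs (x1, p1) and (x2, p2)
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in [(x1, p1), (x2, p2)])

print(poisson(xG, pG), poisson(xR, pR))                       # 1, 1
print(poisson(xG, xR), poisson(pG, pR),
      poisson(xG, pR), poisson(xR, pG))                       # all 0
```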

2-b. Transformation of the Hamiltonian operator

In the presence of the coupling $W = \lambda m\omega^2\,(X_1 - X_2)^2$, we have:
$$H = T + V \tag{32}$$
with:
$$T = \frac{P_1^2}{2m} + \frac{P_2^2}{2m} \tag{33}$$
(the kinetic energy operator) and:
$$V = \frac{1}{2}\,m\omega^2\left[(X_1 - a)^2 + (X_2 + a)^2\right] + \lambda m\omega^2\,(X_1 - X_2)^2 \tag{34}$$
(the potential energy operator). Since $P_1$ and $P_2$ commute, (33) can be transformed as if these operators were numbers; we find:
$$T = \frac{P_G^2}{2m_G} + \frac{P_R^2}{2m_R} \tag{35}$$

where $m_G$ and $m_R$ are defined in (19). Similarly, since $X_G$ and $X_R$ commute, we have, as above [formula (20)]:
$$V = \frac{1}{2}\,m_G\,\omega_G^2\,X_G^2 + \frac{1}{2}\,m_R\,\omega_R^2\left[X_R - \frac{2a}{1 + 4\lambda}\right]^2 + \frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2 \tag{36}$$
where $\omega_G$ and $\omega_R$ are given by (13). Thus we see that $H$ can be put into a form which is analogous to (20), in which there is no coupling term:
$$H = H_G + H_R + \frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2 \tag{37}$$
with:
$$H_G = \frac{P_G^2}{2m_G} + \frac{1}{2}\,m_G\,\omega_G^2\,X_G^2 \qquad ; \qquad H_R = \frac{P_R^2}{2m_R} + \frac{1}{2}\,m_R\,\omega_R^2\left[X_R - \frac{2a}{1 + 4\lambda}\right]^2 \tag{38}$$

2-c. Stationary states of the system

The state space of the system is the tensor product $\mathscr{E}(1) \otimes \mathscr{E}(2)$ of the state spaces of particles (1) and (2); it is also the tensor product $\mathscr{E}(G) \otimes \mathscr{E}(R)$ of the state spaces of the fictitious particles, the "center of mass" and the "relative particle" associated with each of the two modes. Since $H$ is the sum of two operators $H_G$ and $H_R$ which act only in $\mathscr{E}(G)$ and $\mathscr{E}(R)$ respectively (the constant $\frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2$ merely introduces a shift in the energy origin), we know (Chap. II, § F) that we can find a basis of eigenvectors of $H$ in the form:
$$|\varphi\rangle = |\varphi_G\rangle \otimes |\varphi_R\rangle \tag{39}$$
where $|\varphi_G\rangle$ and $|\varphi_R\rangle$ are, respectively, eigenvectors of $H_G$ and $H_R$ in $\mathscr{E}(G)$ and $\mathscr{E}(R)$. Now, $H_G$ and $H_R$ are Hamiltonians of one-dimensional harmonic oscillators, whose eigenvectors and eigenvalues we know. If the operators $a_G$ and $a_R$ are defined by:
$$a_G = \frac{1}{\sqrt{2}}\left[\sqrt{\frac{m_G\,\omega_G}{\hbar}}\,X_G + \frac{i}{\sqrt{m_G\,\omega_G\,\hbar}}\,P_G\right] \qquad ; \qquad a_R = \frac{1}{\sqrt{2}}\left[\sqrt{\frac{m_R\,\omega_R}{\hbar}}\,\bar{X}_R + \frac{i}{\sqrt{m_R\,\omega_R\,\hbar}}\,P_R\right] \tag{40a}$$
with:
$$\bar{X}_R = X_R - \frac{2a}{1 + 4\lambda} \tag{40b}$$
and if $|0_G\rangle$ and $|0_R\rangle$ denote respectively the ground states of $H_G$ and $H_R$, the eigenvectors of $H_G$ are the vectors:
$$|\varphi_G^{n_G}\rangle = \frac{1}{\sqrt{n_G!}}\,(a_G^\dagger)^{n_G}\,|0_G\rangle \tag{41}$$
whose eigenvalues are:
$$E_G^{n_G} = \left(n_G + \frac{1}{2}\right)\hbar\omega_G \tag{42}$$
those of $H_R$ being:
$$|\varphi_R^{n_R}\rangle = \frac{1}{\sqrt{n_R!}}\,(a_R^\dagger)^{n_R}\,|0_R\rangle \tag{43}$$
with the eigenvalues:
$$E_R^{n_R} = \left(n_R + \frac{1}{2}\right)\hbar\omega_R \tag{44}$$

Thus we have here a situation which is analogous to the one encountered in the study of a two-dimensional anisotropic (since $\omega_G \neq \omega_R$) harmonic oscillator. The stationary states of the system are given by:
$$|\varphi_{n_G, n_R}\rangle = |\varphi_G^{n_G}\rangle \otimes |\varphi_R^{n_R}\rangle = \frac{(a_G^\dagger)^{n_G}\,(a_R^\dagger)^{n_R}}{\sqrt{n_G!\;n_R!}}\;|0_G\rangle \otimes |0_R\rangle \tag{45}$$
and their energies are:
$$E_{n_G, n_R} = E_G^{n_G} + E_R^{n_R} + \frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2 = \left(n_G + \frac{1}{2}\right)\hbar\omega_G + \left(n_R + \frac{1}{2}\right)\hbar\omega_R + \frac{4\lambda}{1 + 4\lambda}\,m\omega^2 a^2 \tag{46}$$
The operators $a_G$ and $a_R$ (or $a_G^\dagger$ and $a_R^\dagger$) can thus be seen to be annihilation (or creation) operators of an energy quantum in the mode corresponding to $x_G(t)$ [or $x_R(t)$]. We see from (45) that, through the repeated action of $a_G^\dagger$ and $a_R^\dagger$, we can obtain stationary states of the system in which the number of quanta in each mode is arbitrary. The action of $a_G$, $a_G^\dagger$, $a_R$ or $a_R^\dagger$ on the stationary states is very simple:
$$a_G^\dagger\,|\varphi_{n_G, n_R}\rangle = \sqrt{n_G + 1}\;|\varphi_{n_G+1, n_R}\rangle \qquad ; \qquad a_G\,|\varphi_{n_G, n_R}\rangle = \sqrt{n_G}\;|\varphi_{n_G-1, n_R}\rangle$$
$$a_R^\dagger\,|\varphi_{n_G, n_R}\rangle = \sqrt{n_R + 1}\;|\varphi_{n_G, n_R+1}\rangle \qquad ; \qquad a_R\,|\varphi_{n_G, n_R}\rangle = \sqrt{n_R}\;|\varphi_{n_G, n_R-1}\rangle \tag{47}$$
In general, there are no degenerate levels since there do not exist two different pairs of integers $\{n_G, n_R\}$ and $\{n'_G, n'_R\}$ such that:
$$n_G\,\omega_G + n_R\,\omega_R = n'_G\,\omega_G + n'_R\,\omega_R \tag{48}$$
(except when the ratio $\omega_G/\omega_R = 1/\sqrt{1 + 4\lambda}$ is rational).
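As a small numerical illustration (a sketch, not part of the original text; the parameter values are arbitrary and $\hbar$ is set to 1), one can tabulate the lowest energies (46) and observe that no two of them coincide when $\omega_G/\omega_R$ is irrational:

```python
import numpy as np

hbar, m, w, a, lam = 1.0, 1.0, 1.0, 1.0, 0.3       # arbitrary illustrative values
wG, wR = w, w * np.sqrt(1 + 4 * lam)               # formula (13)
shift = 4 * lam / (1 + 4 * lam) * m * w**2 * a**2  # constant term of (46)

levels = sorted(
    ((nG + 0.5) * hbar * wG + (nR + 0.5) * hbar * wR + shift, nG, nR)
    for nG in range(4) for nR in range(4)
)
for E, nG, nR in levels:
    print(f"nG={nG}  nR={nR}  E={E:.4f}")
# since wG/wR = 1/sqrt(1+4*lam) is irrational here, no two lines share the same energy
```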

2-d. Evolution of the mean values

The most general state of the system is a linear superposition of stationary states $|\varphi_{n_G, n_R}\rangle$:
$$|\psi(t)\rangle = \sum_{n_G, n_R} c_{n_G, n_R}(t)\;|\varphi_{n_G, n_R}\rangle \tag{49}$$
with:
$$c_{n_G, n_R}(t) = c_{n_G, n_R}(0)\;\mathrm{e}^{-iE_{n_G, n_R}\,t/\hbar} \tag{50}$$
According to relations (40) and their adjoints, $X_G$ (respectively $X_R$) is a linear combination of $a_G$ and $a_G^\dagger$ (of $a_R$ and $a_R^\dagger$). We then see, by using (47), that $X_G$ has non-zero matrix elements between two states $|\varphi_{n_G, n_R}\rangle$ and $|\varphi_{n'_G, n'_R}\rangle$ only when $n'_G = n_G \pm 1$, $n'_R = n_R$ (for $X_R$, we would have $n'_G = n_G$, $n'_R = n_R \pm 1$). From this we deduce that the only Bohr frequencies which can appear in the time evolution of $\langle X_G\rangle(t)$ and $\langle X_R\rangle(t)$ are, respectively³:
$$\frac{1}{\hbar}\left[E_{n_G+1, n_R} - E_{n_G, n_R}\right] = \omega_G \qquad ; \qquad \frac{1}{\hbar}\left[E_{n_G, n_R+1} - E_{n_G, n_R}\right] = \omega_R \tag{51}$$
Thus we again find that $\langle X_G\rangle(t)$ and $\langle X_R\rangle(t)$ oscillate at angular frequencies of $\omega_G$ and $\omega_R$, which is consistent with the classical result obtained in § 1-a.

References and suggestions for further reading:

Coupling between two classical oscillators: Berkeley 3 (7.1), §§ 1.4 and 3.3; Alonso and Finn (6.1), Vol. I, § 12.10.

³ For these frequencies actually to appear, at least one of the products $c^*_{n_G\pm 1,\, n_R}\,c_{n_G, n_R}$ or $c^*_{n_G,\, n_R\pm 1}\,c_{n_G, n_R}$ must be different from zero.


Complement JV Vibrational modes of an infinite linear chain of coupled harmonic oscillators; phonons

1. Classical treatment
   1-a. Equations of motion
   1-b. Simple solutions of the equations of motion
   1-c. Normal variables
   1-d. Total energy and energy of each of the modes
2. Quantum mechanical treatment
   2-a. Stationary states in the absence of coupling
   2-b. Effects of the coupling
   2-c. Normal operators. Commutation relations
   2-d. Stationary states in the presence of coupling
3. Application to the study of crystal vibrations: phonons
   3-a. Outline of the problem
   3-b. Normal modes. Speed of sound in the crystal

In Complement HV , we studied the motion of a system of two coupled harmonic oscillators. We concluded, in essence, that, while the individual dynamical variables of each oscillator do not evolve independently, it is possible to introduce linear combinations of them (normal variables) which possess the important property of being uncoupled. Such variables describe vibrational normal modes of well-defined frequencies. Expressed in terms of these normal variables, the Hamiltonian of the system appears in the form of a sum of Hamiltonians of independent harmonic oscillators, thus making quantization simple. In this complement we shall show that these ideas are also applicable to a system formed by an infinite series of identical harmonic oscillators, regularly spaced along an axis, each one coupled to its neighbors. To do this, we shall determine the various vibrational normal modes of the system and show that each one corresponds to a collective vibration of the system of particles, characterized by an angular frequency Ω and a wave vector . The process of finding the eigenstates and eigenvalues of the quantum mechanical Hamiltonian is then greatly simplified by the fact that the total energy of the system is the sum of the energies associated with each vibrational normal mode. The results obtained will enable us to indicate how vibrations propagate in a crystal and to introduce the concept of a phonon, a central idea in solid state physics. Of course, in this complement, we shall emphasize the introduction and quantization of the normal modes, and not the detailed properties of phonons, which would be treated in a solid state physics course. 611




1. Classical treatment

1-a. Equations of motion

Let us consider an infinite chain of identical one-dimensional harmonic oscillators, each one labeled by an integer $q$ (positive, negative or zero). The particle of mass $m$ which constitutes oscillator $(q)$ has its equilibrium position at the point $M_q$ whose abscissa is $ql$ (Fig. 1), where $l$ is the unit distance of the oscillator chain. We denote by $x_q$ the (algebraic) displacement of oscillator $(q)$ with respect to its equilibrium position. The state of the system at the instant $t$ is defined by specifying the dynamical variables $x_q(t)$ and their time derivatives $\dot{x}_q(t)$ at this instant.

Figure 1: Infinite chain of oscillators; the equilibrium positions $M_{q-1}$, $M_q$, $M_{q+1}$ have abscissas $(q-1)l$, $ql$, $(q+1)l$, and the displacement of the $q$-th particle with respect to its equilibrium position is denoted by $x_q$.

In the absence of interactions between the various particles, the potential energy of the system is:
$$V_0(\ldots, x_{-1}, x_0, x_1, \ldots) = \sum_{q=-\infty}^{+\infty}\frac{1}{2}\,m\omega^2 x_q^2 \tag{1}$$
where $\omega$ is the angular frequency of each oscillator. The evolution of the system is then given by the equations:
$$\frac{\mathrm{d}^2}{\mathrm{d}t^2}\,x_q(t) = -\omega^2\,x_q(t) \tag{2}$$
whose solutions are:
$$x_q(t) = x_q^0\cos(\omega t - \varphi_q) \tag{3}$$
where the integration constants $x_q^0$ and $\varphi_q$ are fixed by the initial conditions of the motion. The oscillators therefore vibrate independently. Now imagine that these particles are interacting. For simplicity, we shall assume that one need take into account only the forces exerted on a particle by its two nearest neighbors and that these forces are attractive and proportional to the distance. Thus, particle $(q)$ is subjected to two new attractive forces exerted by particles $(q+1)$ and $(q-1)$. These forces are proportional to $(x_q - x_{q+1})$ and $(x_q - x_{q-1})$ (the coefficient of proportionality being the same in both cases). The total force to which particle $(q)$

is subjected can therefore be written:
$$F_q = -m\omega^2 x_q - m\omega_1^2\,(x_q - x_{q+1}) - m\omega_1^2\,(x_q - x_{q-1}) = -m\omega^2 x_q - m\omega_1^2\,(2x_q - x_{q+1} - x_{q-1}) \tag{4}$$
where $\omega_1$ is a constant [having the dimensions of an inverse time] which characterizes the intensity of the coupling. Equations (2) must now be replaced by:
$$\frac{\mathrm{d}^2}{\mathrm{d}t^2}\,x_q(t) = -\omega^2\,x_q(t) - \omega_1^2\left[2x_q(t) - x_{q+1}(t) - x_{q-1}(t)\right] \tag{5}$$
It can easily be verified that the interaction forces [terms in $\omega_1^2$ of (4)] are derived from the potential energy of the coupling given by:
$$U(\ldots, x_{-1}, x_0, x_1, \ldots) = \frac{1}{2}\,m\omega_1^2\sum_{q=-\infty}^{+\infty}(x_q - x_{q+1})^2 \tag{6}$$

According to (5), the evolution of $x_q$ depends on $x_{q+1}$ and $x_{q-1}$. Therefore, we must solve an infinite system of coupled differential equations. Before we introduce new variables that allow these equations to be uncoupled, it is interesting to try to find simple solutions of equations (5) and investigate their physical significance.

1-b. Simple solutions of the equations of motion

α. Existence of simple solutions

The infinite chain of coupled oscillators we are studying is analogous to an infinite macroscopic spring. Now, we know that progressive longitudinal waves (corresponding to expansions and compressions) can propagate along this spring. Under the influence of a sinusoidal wave of this type, of wave vector $k$ and angular frequency $\Omega$, the point of the spring whose abscissa is $x$ at equilibrium is found at time $t$ at $x + u(x, t)$, with:
$$u(x, t) = A\,\mathrm{e}^{i(kx - \Omega t)} + A^*\,\mathrm{e}^{-i(kx - \Omega t)} \tag{7}$$

Such solutions of the equations of motion (5) do indeed exist. However, since the oscillator chain is not a continuous medium, the effects of the wave are observed only at a series of points, corresponding to the abscissas $x = ql$; $u(ql, t)$ thus represents the displacement of oscillator $(q)$ at time $t$:
$$x_q(t) = u(ql, t) = A\,\mathrm{e}^{i(kql - \Omega t)} + A^*\,\mathrm{e}^{-i(kql - \Omega t)} \tag{8}$$

It is easy to verify that this expression is a solution of equations (5) if $\Omega$ and $k$ satisfy:
$$\Omega^2 = \omega^2 + \omega_1^2\left(2 - \mathrm{e}^{ikl} - \mathrm{e}^{-ikl}\right) \tag{9}$$
$\Omega$ is therefore related to $k$ by the "dispersion relation":
$$\Omega(k) = \sqrt{\omega^2 + 4\,\omega_1^2\sin^2\frac{kl}{2}} \tag{10}$$
which we shall discuss in detail later (§ 1-b-δ).

.



Physical interpretation

In the solution (8) of the equations of motion, all the oscillators are vibrating at the same frequency $\Omega/2\pi$, with the same amplitude $2|A|$, but with a phase that depends periodically on their rest positions. It is as if the displacements of the various oscillators were determined by a progressive sinusoidal wave of wave vector $k$ and phase velocity:
$$V_\varphi(k) = \frac{\Omega(k)}{k} \tag{11}$$
This is easy to show. Using (8), we see that:
$$x_{q_1+q_2}(t) = x_{q_1}\!\left(t - \frac{q_2\,l}{V_\varphi}\right) \tag{12}$$
Thus, oscillator $(q_1 + q_2)$ performs the same motion as oscillator $(q_1)$, shifted by the time taken by the wave to travel, at the velocity $V_\varphi$, the distance $q_2 l$ separating the two oscillators. Since all the oscillators are then in motion, solutions (8) are called "collective modes" of vibration of the system.

γ. Possible values of the wave vector

Consider two values of the wave vector, $k$ and $k'$, which differ by an integral number of $2\pi/l$:
$$k' = k + n\,\frac{2\pi}{l} \qquad \text{with } n \text{ an integer (positive or negative)} \tag{13}$$
We have, obviously:
$$\mathrm{e}^{ik'ql} = \mathrm{e}^{ikql} \qquad ; \qquad \Omega(k') = \Omega(k) \tag{14}$$
where the second relation follows directly from (10). We see from (8) that the two progressive waves $k$ and $k'$ lead to the same motion for the oscillators and are, consequently, physically indistinguishable. Therefore, in the problem we are studying here, it suffices to let $k$ vary over an interval of $2\pi/l$. For reasons of symmetry, we choose:
$$-\frac{\pi}{l} \leq k \leq +\frac{\pi}{l} \tag{15}$$
The corresponding interval is often called the "first Brillouin zone".

δ. Dispersion relation

The dispersion relation (10) which gives the angular frequency $\Omega(k)$ associated with each value of $k$ enables us to study the propagation of vibrations in the system. If, for example, we form a "wave packet" by superposing waves with different wave vectors, we know that it has a group velocity given by:
$$V_G = \frac{\mathrm{d}\Omega(k)}{\mathrm{d}k} \tag{16}$$
which is different from $V_\varphi$. Figure 2 shows the form of the variation of $\Omega(k)$ with respect to $k$, where $k$ varies within the first Brillouin zone.

Figure 2: Dispersion relation giving the variation of the angular frequency $\Omega$ of the vibrational normal modes with respect to the wave number $k$ in the first Brillouin zone $[-\pi/l, +\pi/l]$; $\Omega$ increases from $\omega$ at $k = 0$ to $\sqrt{\omega^2 + 4\omega_1^2}$ at the zone edges. The dashed line corresponds to the case $\omega_1 = 0$.
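As a quick numerical check of this dispersion relation (a sketch added here, not part of the original text; the values of $\omega$, $\omega_1$ and $l$ are arbitrary), one can tabulate $\Omega(k)$ and the group velocity (16) over the first Brillouin zone:

```python
import numpy as np

w, w1, l = 1.0, 0.4, 1.0                      # arbitrary illustrative values
k = np.linspace(-np.pi / l, np.pi / l, 2001)  # first Brillouin zone

Omega = np.sqrt(w**2 + 4 * w1**2 * np.sin(k * l / 2)**2)   # dispersion relation (10)
v_group = np.gradient(Omega, k)                            # group velocity (16)

print("allowed band:", Omega.min(), "<= Omega <=", Omega.max())      # compare with (17)
print("group velocity at the zone edges:", v_group[0], v_group[-1])  # tends to zero
```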

It is clear from this figure that $\Omega(k)$ cannot take on arbitrary values: a vibration of frequency $\Omega/2\pi$ can propagate freely in the medium only if $\Omega$ falls within the "allowed band":
$$\omega \leq \Omega \leq \sqrt{\omega^2 + 4\,\omega_1^2} \tag{17}$$
The other values of $\Omega$ correspond to "forbidden bands". The two limiting frequencies of interval (17) are often called "cut-off frequencies". The mode of lowest angular frequency, $\Omega(0) = \omega$, has a zero wave vector $k$; it corresponds to an in-phase vibration of all the oscillators, whose particles are moving "together" without changing their relative distances (Fig. 3). This explains why the angular frequency of this mode is the same as in the absence of coupling (cf. Complement HV, § 1-b). As for the mode of highest angular frequency, $\Omega(\pm\pi/l) = \sqrt{\omega^2 + 4\,\omega_1^2}$, in the

Figure 3: The lowest frequency mode ($k = 0$; $\Omega = \omega$) corresponds to a displacement of the system of oscillators "as a whole" (all the displacements $x_q$ equal). This is why its frequency does not depend on the coupling $\omega_1$.

Figure 4: The modes $k = \pm\pi/l$ are those in which two neighboring oscillators are completely out of phase; the coupling strongly modifies their frequency.

corresponding vibration of the system, two adjacent oscillators are completely out of phase (Fig. 4); the effect of the attractive forces due to the coupling is then maximal.

1-c. Normal variables

α. Obtaining uncoupled equations

Returning to the equations of motion (5), let us introduce new dynamical variables (linear combinations of the $x_q$) that evolve independently. To do so, we multiply both sides of equation (5) by the quantity $\mathrm{e}^{-ikql}$ and sum over $q$. If we notice that:
$$\sum_{q=-\infty}^{+\infty} x_{q\pm 1}(t)\,\mathrm{e}^{-ikql} = \mathrm{e}^{\pm ikl}\sum_{q=-\infty}^{+\infty} x_q(t)\,\mathrm{e}^{-ikql} \tag{18}$$
and if we set:
$$\xi(k, t) = \sum_{q=-\infty}^{+\infty} x_q(t)\,\mathrm{e}^{-ikql} \tag{19}$$
we see that (5) becomes:
$$\frac{\partial^2}{\partial t^2}\,\xi(k, t) = -\left[\omega^2 + \omega_1^2\left(2 - \mathrm{e}^{ikl} - \mathrm{e}^{-ikl}\right)\right]\xi(k, t) \tag{20}$$
that is, taking (9) into account:
$$\frac{\partial^2}{\partial t^2}\,\xi(k, t) = -\Omega^2(k)\,\xi(k, t) \tag{21}$$
This relation shows that the time evolution of $\xi(k, t)$ is independent of that of $\xi(k', t)$ for $k'$ different from $k$. The quantities $\xi(k, t)$ introduced in (19) are therefore completely uncoupled and have a remarkably simple equation of motion.
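To see this decoupling at work numerically (an illustrative sketch, not from the original text; it assumes a finite ring of N oscillators with periodic boundary conditions, so the allowed values of k are discrete), one can integrate equations (5) and verify that each Fourier amplitude $\xi(k, t)$ oscillates harmonically at its own frequency $\Omega(k)$:

```python
import numpy as np
from scipy.integrate import solve_ivp

N, w, w1, l = 32, 1.0, 0.4, 1.0                   # finite ring, arbitrary values
rng = np.random.default_rng(0)
x0, v0 = rng.normal(size=N), np.zeros(N)          # random displacements, zero velocities

def rhs(t, y):
    x, v = y[:N], y[N:]
    acc = -w**2 * x - w1**2 * (2 * x - np.roll(x, -1) - np.roll(x, 1))   # equation (5)
    return np.concatenate([v, acc])

T = np.linspace(0, 60, 4000)
sol = solve_ivp(rhs, (0, T[-1]), np.concatenate([x0, v0]), t_eval=T, rtol=1e-8)

q = np.arange(N)
k = 2 * np.pi * 3 / (N * l)                       # one allowed wave vector of the ring
xi = sol.y[:N].T @ np.exp(-1j * k * q * l)        # xi(k, t), definition (19)
Omega = np.sqrt(w**2 + 4 * w1**2 * np.sin(k * l / 2)**2)
xi_exact = xi[0] * np.cos(Omega * T)              # solution of (21), initial velocities vanish
print("max deviation:", np.max(np.abs(xi - xi_exact)))   # small, limited by integrator accuracy
```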


Comments:

(i) Equations (5) are easy to uncouple because the problem is invariant under a translation of the system of oscillators by a quantity $l$ (replacement of $q$ by $q - 1$). This invariance is itself due to the fact that the chain is regular and infinite.

(ii) In reality, every chain is of course finite, even if it contains a very large number of oscillators. To find its vibrational normal modes, one must thus take into account the boundary conditions at the two ends of the chain, and the problem becomes much more complicated (edge effects). Instead of obtaining, as we do here, a continuous infinity of vibrational normal modes corresponding to the various values of $k$ in the first Brillouin zone, one finds a finite number of eigenmodes, equal to the number of oscillators. When one is concerned only with the behavior of the chain far from the ends, one often introduces artificial boundary conditions which are different from the real boundary conditions but which have the advantage of simplifying the calculations while conserving the essential physical properties. Thus, one requires the two end oscillators to have the same motion ("periodic" boundary conditions, also called "Born-Von Karman conditions"). We shall have the opportunity to return to this question in connection with the study of other periodic structures (cf. Complement FXI; see also § 1-c of Complement CXIV). Therefore we shall not dwell any further on periodic boundary conditions, but shall continue our discussion, confining ourselves to the simple case of an infinite chain.

The function $\xi(k, t)$ introduced in (19) is, by definition, the sum of a Fourier series whose coefficients are the displacements $x_q(t)$. It is a periodic function of $k$, of period $2\pi/l$, which is therefore perfectly well-defined when its values in the interval $[-\pi/l, +\pi/l]$ are specified [this is the first Brillouin zone, defined in (15)]. $\xi(k, t)$ depends on the positions of all the oscillators at time $t$. Conversely, these positions are unambiguously defined when the values of $\xi(k, t)$ in the interval (15) are given for time $t$. This is true because it is possible to invert relation (19) since, using:
$$\sum_{q=-\infty}^{+\infty} \mathrm{e}^{i(k - k')ql} = \frac{2\pi}{l}\,\delta(k - k') \tag{22}$$
we obtain:
$$x_q(t) = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathrm{d}k\;\xi(k, t)\,\mathrm{e}^{ikql} \tag{23}$$
Note also that, since the displacements $x_q(t)$ are real, the function $\xi(k, t)$ satisfies:
$$\xi(-k, t) = \xi^*(k, t) \tag{24}$$
Similarly, one can, using the momenta $p_q(t) = m\,\dot{x}_q(t)$, define the function:
$$\pi(k, t) = \sum_{q=-\infty}^{+\infty} p_q(t)\,\mathrm{e}^{-ikql} \tag{25}$$

and the $p_q(t)$ can be expressed as:
$$p_q(t) = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathrm{d}k\;\pi(k, t)\,\mathrm{e}^{ikql} \tag{26}$$
The fact that the $p_q(t)$ are real implies that:
$$\pi(-k, t) = \pi^*(k, t) \tag{27}$$

Differentiating both sides of (19) term by term, and using (25) and then (21), we finally obtain:
$$\frac{\mathrm{d}}{\mathrm{d}t}\,\xi(k, t) = \frac{1}{m}\,\pi(k, t) \tag{28a}$$
$$\frac{\mathrm{d}}{\mathrm{d}t}\,\pi(k, t) = -m\,\Omega^2(k)\,\xi(k, t) \tag{28b}$$

At time $t$, the dynamical state of the system can be characterized just as well by the specification of the $x_q(t)$ and $p_q(t)$ for all integers $q$ (positive, negative or zero), as by that of the "normal variables" $\xi(k, t)$ and $\pi(k, t)$ (where $k$ can take on any value in the first Brillouin zone). The equations of motion (28) of the normal variables corresponding to each value of $k$ describe the evolution of the position and momentum of a harmonic oscillator of mass $m$ and angular frequency $\Omega(k)$; however, $\xi$ and $\pi$ are complex. Thus we have reduced the study of an infinite but discrete chain of coupled harmonic oscillators to that of a continuous system of fictitious independent oscillators (labeled by the index $k$).

Comments:

Rigorously, these fictitious oscillators are not completely independent since, according to conditions (24) and (27), the initial values $\xi(k, 0)$ and $\pi(k, 0)$ must satisfy:
$$\xi(-k, 0) = \xi^*(k, 0) \qquad ; \qquad \pi(-k, 0) = \pi^*(k, 0) \tag{29}$$

β. Normal variables $\alpha(k, t)$ associated with the progressive waves

It is convenient (see also § 1-a of Complement GV) to condense the two normal variables $\xi(k, t)$ and $\pi(k, t)$ into one, $\alpha(k, t)$, defined by:
$$\alpha(k, t) = \frac{1}{\sqrt{2}}\left[\hat{\xi}(k, t) + i\,\hat{\pi}(k, t)\right] \tag{30}$$
where $\hat{\xi}(k, t)$ and $\hat{\pi}(k, t)$ are dimensionless quantities proportional to $\xi(k, t)$ and $\pi(k, t)$:
$$\hat{\xi}(k, t) = \beta(k)\,\xi(k, t) \qquad ; \qquad \hat{\pi}(k, t) = \frac{1}{\hbar\,\beta(k)}\,\pi(k, t) \tag{31}$$
To simplify the quantum mechanical calculations presented later, we shall set:
$$\beta(k) = \sqrt{\frac{m\,\Omega(k)}{\hbar}} \tag{32}$$

It is easy to show, using (30), that the two equations (28) are equivalent to the single equation: (

) = Ω( ) (

)

(33)

which is of first order in [ ( ) is completely defined by the specification of while ( ) depends on ( 0) and ( 0)]. The general solution of (33) is: (

Ω( )

) = ( 0) e

(34)

Using (19) and (25), we easily obtain the expression for ( ) and ( ): (

)=

1 ( ) 2

e

() Ω( )

( )+

Let us show, conversely, that the ( ) and ( ). According to (24) and (27): (

)= =

1 ˆ ( 2 1 ˆ ( 2

( 0),

) )

ˆ ( ˆ(

(

) in terms of the

(35)

( ) can be simply expressed in terms of the

)

)

(36)

From this we deduce: ˆ(

)=

ˆ(

)=

1 [ ( 2 2

[ (

)+

(

)

)] (

(37a) )]

(37b)

d

(38)

which allows us to write formula (23) in the form: +

()=

2

Changing ]:

2 to

2

) ( )

+

e

d +

(

) ( )

e

in the second integral, we finally obtain [ ( ) is an even function of

+

()=

(

2

(

) ( )

+

e

d +

( ) e ( )

d

(39) 619

COMPLEMENT JV



An analogous calculation, starting with (26), yields: +

~

()=

2

+

( ) (

2

)e

d

( )

(

)e

d

(40)

The state of the system is therefore described just as well by the ( ) as by the set of ( ) and ( ). If we replace, in (39), ( ) by its general expression (34), ( ) takes on the form: +

()=

2

( 0) [ e ( )

d

2

Ω( ) ]

+

(41)

The most general solution of the problem of the chain of coupled oscillators is therefore a linear superposition of the progressive waves introduced in § 1-b (where the coefficients ( 0) of this linear combination are ). These progressive waves constitute the 2 2 ( ) vibrational normal modes of the system1 .

Comment:

For each value of , the two terms appearing on the right-hand sides of (39) and (40) are complex conjugates of each other. This insures the reality of the ( ) and ( ) without the necessity of imposing an arbitrary condition on the ( ). The ( ), consequently, are truly independent variables. 1-d.

1-d. Total energy and energy of each of the modes

The total energy $H$ of the system under consideration is the sum of the kinetic energies of each particle $(q)$ and the potential energies (1) and (6):
$$H(\ldots, x_{-1}, x_0, x_1, \ldots;\ \ldots, p_{-1}, p_0, p_1, \ldots) = \sum_{q=-\infty}^{+\infty}\frac{p_q^2}{2m} + \sum_{q=-\infty}^{+\infty}\frac{1}{2}\,m\omega^2 x_q^2 + \frac{1}{2}\,m\omega_1^2\sum_{q=-\infty}^{+\infty}(x_q - x_{q+1})^2 \tag{42}$$

We shall see in this section that this energy can be expressed very simply in terms of the energies that can be associated with each of the modes. Let us therefore calculate the various sums involved in (42). Since the displacements $x_q$ are the coefficients of the Fourier series defining the function $\xi(k, t)$, Parseval's

¹ We could also have introduced the modes corresponding to stationary waves in the system (sum of two progressive waves of the same frequency and opposite velocities). We would then have obtained equivalent results but the motion of the system would have been expanded on a different "basis". An expansion of this type is used in Complement KV.

relation [Appendix I, relation (18)] immediately yields:
$$\sum_{q=-\infty}^{+\infty}\left[x_q(t)\right]^2 = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\left|\xi(k, t)\right|^2\mathrm{d}k \tag{43}$$
$$\sum_{q=-\infty}^{+\infty}\left[p_q(t)\right]^2 = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\left|\pi(k, t)\right|^2\mathrm{d}k \tag{44}$$

The sum which, in (42), corresponds to the coupling now remains to be calculated. To do so, notice as in (18) that, if the displacements $x_q$ are the coefficients of the Fourier series of $\xi(k, t)$, the $x_{q+1}$ are those of $\mathrm{e}^{ikl}\,\xi(k, t)$. The quantities $(x_q - x_{q+1})$ are therefore the coefficients of the Fourier series of $[1 - \mathrm{e}^{ikl}]\,\xi(k, t)$, and Parseval's relation yields:
$$\sum_{q=-\infty}^{+\infty}(x_q - x_{q+1})^2 = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\left|1 - \mathrm{e}^{ikl}\right|^2\left|\xi(k, t)\right|^2\mathrm{d}k = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}4\sin^2\frac{kl}{2}\,\left|\xi(k, t)\right|^2\mathrm{d}k \tag{45}$$

Substituting (43), (44) and (45) into (42), we finally obtain:
$$H = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\left\{\frac{m}{2}\left[\omega^2 + 4\,\omega_1^2\sin^2\frac{kl}{2}\right]\left|\xi(k, t)\right|^2 + \frac{1}{2m}\left|\pi(k, t)\right|^2\right\}\mathrm{d}k \tag{46}$$
We shall write this result in the form:
$$H = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathscr{E}(k)\,\mathrm{d}k \tag{47}$$
with:
$$\mathscr{E}(k) = \frac{1}{2}\,m\,\Omega^2(k)\left|\xi(k, t)\right|^2 + \frac{1}{2m}\left|\pi(k, t)\right|^2 \tag{48}$$

where $\Omega(k)$ is given by (10). $H$ is thus the sum (in fact, the integral) of the energies associated with the fictitious uncoupled harmonic oscillators for which $\xi(k, t)$ gives the position and $\pi(k, t)$, the momentum. We can also express $\mathscr{E}(k)$ in terms of the variables $\alpha(k, t)$ associated with each normal mode. Using (37), we transform expression (48) into:
$$\mathscr{E}(k) = \frac{1}{2}\,\hbar\,\Omega(k)\left[\alpha^*(k, t)\,\alpha(k, t) + \alpha(-k, t)\,\alpha^*(-k, t)\right] \tag{49}$$
that is, taking (34) into account:
$$\mathscr{E}(k) = \frac{1}{2}\,\hbar\,\Omega(k)\left[\left|\alpha(k, 0)\right|^2 + \left|\alpha(-k, 0)\right|^2\right] \tag{50}$$
$\mathscr{E}(k)$ is therefore time-independent, which is not surprising since $\mathscr{E}(k)$ is the energy of a harmonic oscillator. Furthermore, we again find in (47) that the fictitious oscillators are independent, since the total energy is simply the sum of the energies associated with each of them. Substituting expression (49) into (47), we obtain:
$$H = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathrm{d}k\;\frac{1}{2}\,\hbar\,\Omega(k)\left[\alpha^*(k, t)\,\alpha(k, t) + \alpha(-k, t)\,\alpha^*(-k, t)\right] \tag{51}$$
We can then change $k$ to $-k$ in the integral of the second term and consider $H$ to be the sum of the energies $\varepsilon(k)$ associated with the normal modes characterized by the $\alpha(k, t)$:
$$H = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathrm{d}k\;\varepsilon(k) \tag{52}$$
with:
$$\varepsilon(k) = \hbar\,\Omega(k)\,\alpha^*(k, t)\,\alpha(k, t) = \hbar\,\Omega(k)\left|\alpha(k, 0)\right|^2 \tag{53}$$

2. Quantum mechanical treatment

The quantum mechanical treatment of the problem of the infinite chain of coupled oscillators is based, in accordance with the general quantization rules, on the replacement of the classical quantities $x_q(t)$ and $p_q(t)$ by the observables $X_q$ and $P_q$ which satisfy the canonical commutation relations:
$$[X_{q_1}, P_{q_2}] = i\hbar\,\delta_{q_1 q_2} \tag{54}$$

2-a. Stationary states in the absence of coupling

In the absence of coupling ($\omega_1 = 0$), the Hamiltonian $H$ of the system can be written:
$$H(\omega_1 = 0) = \sum_{q=-\infty}^{+\infty} H_q \qquad \text{with} \qquad H_q = \frac{P_q^2}{2m} + \frac{1}{2}\,m\omega^2 X_q^2 \tag{55}$$
where $H_q$ is the Hamiltonian of a one-dimensional harmonic oscillator acting in the state space of particle $(q)$. We introduce the operator $a_q$ defined by:
$$a_q = \frac{1}{\sqrt{2}}\left[\sqrt{\frac{m\omega}{\hbar}}\,X_q + \frac{i}{\sqrt{m\omega\hbar}}\,P_q\right] \tag{56}$$
$H_q$ can then be written:
$$H_q = \left(a_q^\dagger a_q + \frac{1}{2}\right)\hbar\omega \tag{57}$$

$a_q^\dagger$ and $a_q$ are the creation and annihilation operators of an energy quantum for oscillator $(q)$. We know (Chap. V, § C-1-a) that the eigenstates of $H_q$ are given by:
$$|\varphi_q^{n_q}\rangle = \frac{1}{\sqrt{n_q!}}\,(a_q^\dagger)^{n_q}\,|\varphi_q^0\rangle \tag{58}$$
where $|\varphi_q^0\rangle$ is the ground state of oscillator $(q)$ and $n_q$ is a positive integer or zero. If we choose as the energy origin the energy of the ground state [which amounts to omitting, in (57), the term 1/2], we obtain for the energy of the state $|\varphi_q^{n_q}\rangle$:
$$E_{n_q} = n_q\,\hbar\omega \tag{59}$$
In the absence of coupling, the stationary states of the global system are tensor products of the form:
$$|\varphi\rangle = \cdots\,|\varphi_{-1}^{n_{-1}}\rangle\,|\varphi_0^{n_0}\rangle\,|\varphi_1^{n_1}\rangle\,\cdots \tag{60}$$
and their energies are²:
$$E = \sum_q E_{n_q} = \left[\cdots + n_{-1} + n_0 + n_1 + \cdots\right]\hbar\omega \tag{61}$$
The ground state, whose energy was chosen for the origin, is not degenerate since $E = 0$ is obtained, in (61), only for:
$$n_q = 0 \quad \text{for all } q \tag{62}$$

The corresponding state (60) is therefore unique. On the other hand, all the other levels are infinitely degenerate. For example, to the first level, of energy $\hbar\omega$, correspond all states (60) for which all the numbers $n_q$ are zero except for one, which is itself equal to one. All but one of the oscillators are then in their ground states. It is because the excitation can be localized in any one of the oscillators that the level $E = \hbar\omega$ is infinitely degenerate.

2-b. Effects of the coupling

When the coupling is not zero, the Hamiltonian operator becomes:
$$H = H(\omega_1 = 0) + W \tag{63}$$
with:
$$W = \frac{1}{2}\,m\omega_1^2\sum_{q=-\infty}^{+\infty}(X_q - X_{q+1})^2 \tag{64}$$
The states (60) are no longer, in this case, the stationary states of the system: they are eigenstates of $H(\omega_1 = 0)$, but not of $H$. To see this, we write $W$ in terms of the operators $a_q$ and $a_q^\dagger$:
$$W = \frac{1}{4}\,\hbar\,\frac{\omega_1^2}{\omega}\sum_{q=-\infty}^{+\infty}\left(a_q + a_q^\dagger - a_{q+1} - a_{q+1}^\dagger\right)^2 \tag{65}$$

² If we had not changed the energy origin of each oscillator by omitting the term 1/2 in (57), we would have found an infinite energy for the system, whatever the quantum numbers $n_q$. This difficulty does not arise if, instead of an infinite chain, one considers a chain formed by a very large but finite number of oscillators. However, problems related to "edge effects" then appear.

Now, it is clear that the action of $W$ on a state of type (60) changes the state: the numbers $n_q$ are no longer "good quantum numbers" since, for example, $W$ can transfer an excitation from site $(q)$ to site $(q + 1)$ (term in $a_q\,a_{q+1}^\dagger$). To find the stationary states of the system in the presence of the coupling, it is useful, as in classical mechanics, to introduce "normal variables", that is, operators associated with the normal modes of the system.
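This excitation transfer can be made concrete with a small numerical sketch (not part of the original text; it assumes only two neighboring sites, a truncated oscillator basis, and arbitrary values of $\omega$ and $\omega_1$): the pair term of (65) has a non-zero matrix element between the states "one quantum on site q" and "one quantum on site q+1", so neither is an eigenstate of W.

```python
import numpy as np

nmax = 4                                            # truncated basis |0>, ..., |3> per site
a = np.diag(np.sqrt(np.arange(1, nmax)), 1)         # annihilation operator for one site
ad = a.conj().T
I = np.eye(nmax)

hbar, w, w1 = 1.0, 1.0, 0.3                          # arbitrary illustrative values
# coupling term of (65) restricted to the pair of sites (q, q+1)
B = np.kron(a + ad, I) - np.kron(I, a + ad)
W = (hbar * w1**2 / (4 * w)) * B @ B

ket = lambda n1, n2: np.kron(np.eye(nmax)[n1], np.eye(nmax)[n2])
print("<0,1|W|1,0> =", ket(0, 1) @ W @ ket(1, 0))    # non-zero: W moves the excitation
exp_val = ket(1, 0) @ W @ ket(1, 0)
print("is |1,0> an eigenvector of W?",
      np.allclose(W @ ket(1, 0), exp_val * ket(1, 0)))   # False
```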

2-c. Normal operators. Commutation relations

To the normal variables $\xi(k, t)$ and $\pi(k, t)$ correspond the operators $\Xi(k)$ and $\Pi(k)$ defined by:
$$\Xi(k) = \sum_{q=-\infty}^{+\infty} X_q\,\mathrm{e}^{-ikql} \tag{66a}$$
$$\Pi(k) = \sum_{q=-\infty}^{+\infty} P_q\,\mathrm{e}^{-ikql} \tag{66b}$$
The domain of variation of the continuous parameter $k$ is again limited to the first Brillouin zone (15). Note that, since the normal variables $\xi(k, t)$ and $\pi(k, t)$ are complex, the associated operators $\Xi(k)$ and $\Pi(k)$ are not Hermitian, unlike $X_q$ and $P_q$. The relations corresponding to (24) and (27) are here:
$$\Xi(-k) = \Xi^\dagger(k) \tag{67a}$$
$$\Pi(-k) = \Pi^\dagger(k) \tag{67b}$$
The canonical commutation relations (54) enable us to calculate the commutators of $\Xi(k)$ and $\Pi(k)$. We immediately see that $\Xi(k)$ and $\Xi(k')$ commute, as do $\Pi(k)$ and $\Pi(k')$. As for the commutator $[\Xi(k), \Pi^\dagger(k')]$, it can be written:
$$[\Xi(k), \Pi^\dagger(k')] = \sum_{q, q'}[X_q, P_{q'}]\,\mathrm{e}^{-ikql}\,\mathrm{e}^{+ik'q'l} = i\hbar\sum_q \mathrm{e}^{-i(k - k')ql} \tag{68}$$
Using formula (31) of Appendix II and the fact that $k$ and $k'$ both belong to interval (15), we obtain:
$$[\Xi(k), \Pi^\dagger(k')] = i\hbar\,\frac{2\pi}{l}\,\delta(k - k') \tag{69}$$

We saw in § 1-c-β that it is convenient to condense the two normal variables $\xi(k, t)$ and $\pi(k, t)$ into one, $\alpha(k, t)$ [formula (30)]. The operator associated with $\alpha(k, t)$ will be:
$$a(k) = \frac{1}{\sqrt{2}}\left[\beta(k)\,\Xi(k) + \frac{i}{\hbar\,\beta(k)}\,\Pi(k)\right] \tag{70}$$
where $\beta(k)$ is defined in (32). Note that the adjoint of $a(k)$ can be written:
$$a^\dagger(k) = \frac{1}{\sqrt{2}}\left[\beta(k)\,\Xi^\dagger(k) - \frac{i}{\hbar\,\beta(k)}\,\Pi^\dagger(k)\right] \tag{71}$$
Using (69) and (67), we find without difficulty that:
$$[a(k), a^\dagger(k')] = \frac{2\pi}{l}\,\delta(k - k') \tag{72a}$$
$$[a(k), a(k')] = 0 \tag{72b}$$

To the classical quantity $\mathscr{E}(k)$ defined in (48) corresponds the operator:
$$\mathscr{H}(k) = \frac{1}{2m}\,\Pi(k)\,\Pi^\dagger(k) + \frac{1}{2}\,m\,\Omega^2(k)\,\Xi(k)\,\Xi^\dagger(k) \tag{73}$$
since $\Xi(k)$ and $\Xi^\dagger(k)$ commute, as do $\Pi(k)$ and $\Pi^\dagger(k)$. To obtain the equivalent of the classical formula (49), one must take into consideration the fact that $a(k)$ and $a^\dagger(k)$ do not commute; the order in which these operators appear must therefore be retained throughout the calculation. Relations (37), taking (67) into account, can be written here:
$$\Xi(k) = \frac{1}{\sqrt{2}\,\beta(k)}\left[a(k) + a^\dagger(-k)\right] \tag{74a}$$
$$\Pi(k) = \frac{\hbar\,\beta(k)}{i\sqrt{2}}\left[a(k) - a^\dagger(-k)\right] \tag{74b}$$
Substituting these expressions into (73), we find:
$$\mathscr{H}(k) = \frac{1}{2}\,\hbar\,\Omega(k)\left[a^\dagger(k)\,a(k) + a(-k)\,a^\dagger(-k)\right] \tag{75}$$
As in (52), we can put the total Hamiltonian $H$ of the system in the form:
$$H = \frac{l}{2\pi}\int_{-\pi/l}^{+\pi/l}\mathrm{d}k\;\varepsilon(k) \tag{76}$$
with:
$$\varepsilon(k) = \frac{1}{2}\,\hbar\,\Omega(k)\left[a^\dagger(k)\,a(k) + a(k)\,a^\dagger(k)\right] \tag{77}$$
$a(k)$ and $a^\dagger(k)$ thus can be seen to be annihilation and creation operators analogous to those of a harmonic oscillator. However, since $k$ is a continuous index, the commutation relations (72) involve $\delta(k - k')$ instead of a Kronecker delta, so $\varepsilon(k)$ must remain in the symmetrical form (77). It can easily be shown that the various operators $\varepsilon(k)$ commute:
$$[\varepsilon(k), \varepsilon(k')] = 0 \tag{78}$$

2-d. Stationary states in the presence of coupling

According to formulas (76) and (77), the ground state $|\varphi_0\rangle$ of the system of coupled oscillators is defined by the condition:
$$a(k)\,|\varphi_0\rangle = 0 \tag{79}$$
for all values of $k$. The other stationary states can be obtained from the state $|\varphi_0\rangle$ by the action of the operators $a^\dagger(k)$; their corresponding energy is the integral of the energies associated with each of the modes. A certain number of difficulties arise because of the continuous infinity of normal modes; in particular, the energy of the ground state that can be deduced from (76) and (77) is infinite. We shall not discuss these difficulties here; in any case, they do not arise for a real chain, that is, a finite one (cf. footnote 2). Formula (10) gives the value of the energy quantum $\hbar\Omega(k)$ associated with each of the modes. It therefore indicates what energy quanta the system can absorb or emit: they must correspond to frequencies situated within the allowed band (17).

3. Application to the study of crystal vibrations: phonons

3-a. Outline of the problem

Consider a solid body, composed of a large number of atoms (or ions) whose equilibrium positions are regularly arranged in space, forming a crystalline lattice. For simplicity, we shall assume that this lattice is one-dimensional and can be treated like an infinite chain of atoms. We intend to use here the results of the preceding section to study the motion of the nuclei of these atoms about their equilibrium positions. With this object in view, we shall make use of the same approximation as in the study of molecular vibrations (Born-Oppenheimer approximation; cf. Complement AV, comment of § 1-a). We shall assume that the motion of the electrons can be calculated as if the positions of the nuclei were fixed parameters. Thus, we shall solve the corresponding Schrödinger equation (actually, this equation is too complex to be solved exactly; in practice, one must again settle for approximations). We shall then denote the energy of the electronic system in its ground state by $E(\ldots, x_{-1}, x_0, x_1, \ldots)$, where $x_q$ is the displacement of nucleus $(q)$ from its equilibrium position. It can be shown that it is then possible to calculate the motion of the nuclei, to a good approximation, by assuming that they possess a total potential energy $U(\ldots, x_{-1}, x_0, x_1, \ldots)$ equal to the sum of their electrostatic interaction energy and $E(\ldots, x_{-1}, x_0, x_1, \ldots)$.

In fact, we shall further simplify the problem by making some reasonable hypotheses about $U$ (this is indispensable, since we do not know $E$). We shall assume that $U$ describes essentially the interactions of each of the nuclei with its nearest neighbors (in an infinite linear chain, each nucleus has two such neighbors), that is, that the forces exerted between non-adjacent nuclei can be neglected. In addition, we shall grant that, in the range of values that the displacements $x_q$ can attain, $U$ is well represented by an expression of the form:
$$U = \frac{1}{2}\,m\omega_1^2\sum_{q=-\infty}^{+\infty}(x_q - x_{q+1})^2 \tag{80}$$
where $m$ is the mass of a nucleus and $\omega_1$ characterizes the intensity of its interaction

• VIBRATIONAL MODES OF A LINEAR CHAIN OF COUPLED HARMONIC OSCILLATORS; PHONONS with its neighbors. We shall not, therefore, take into account terms of higher order in ( +1 ), that is, the anharmonicity of the potential. Since expression (80) is identical to (6), we can apply the results of the preceding sections to the simple model of a solid body that we have just defined. Note, nevertheless, that we must choose = 0 because is the total potential energy of the system of the nuclei, which interact with their neighbors, but are not elastically bound to their equilibrium positions3 . 3-b.

3-b. Normal modes. Speed of sound in the crystal

Each of the vibrational normal modes of the crystal is characterized by a wave vector $k$ and an angular frequency $\Omega(k)$. In solid state physics, the energy quantum associated with a mode is called a "phonon". The phonons can be considered to be particles of energy $\hbar\Omega(k)$ and momentum $\hbar k$. Actually, a phonon is not a true particle, since its existence involves a state of collective vibration of the real particles which constitute the crystal. The phonons are sometimes said to be "quasi-particles": they are entirely analogous to the fictitious particles, of positions $x_G(t)$ and $x_R(t)$, introduced in Complement HV. In addition, a phonon can be created or destroyed by giving to or taking from the crystal the corresponding vibrational energy, while (at least in the nonrelativistic domain to which we are restricting ourselves) a particle such as an electron cannot be created or annihilated. In this connection, note that, since the number of phonons in a given mode is arbitrary, phonons are bosons (Chap. XIV).

The dispersion relation giving the function $\Omega(k)$ differs, for phonons, from the one discussed in § 1-b-δ, since the angular frequency $\omega$ is zero here. In this case, choosing $\omega = 0$ in (10), we obtain:
$$\Omega(k) = 2\,\omega_1\left|\sin\frac{kl}{2}\right| \tag{81}$$
The curve representing $\Omega(k)$ is given in Figure 5; it is composed of two half-arcs of a sinusoid. Unlike what happens for $\omega$ not equal to zero, $\Omega(k)$ now goes to zero for $k = 0$ and varies linearly when $|k|$ is very small, since, as long as:
$$|k| \ll \frac{1}{l} \tag{82}$$
we have:
$$\Omega(k) \simeq \omega_1\,l\,|k| = c_s\,|k| \tag{83}$$
where:
$$c_s = \omega_1\,l \tag{84}$$

³ The Einstein model, which we described in Complement AV, is based on a different hypothesis: each nucleus is assumed to "see" an average potential, due to its interactions with the other nuclei, but practically independent of the exact positions of these other nuclei. To a first approximation, this average potential is assumed to be parabolic, and one has a system of independent harmonic oscillators. Here, on the other hand, we are studying a somewhat more elaborate model, in which we explicitly (although approximately) take into account interactions between the nuclei.

Figure 5: Dispersion relation for phonons (curve of Figure 2 for $\omega = 0$): $\Omega(k) = 2\omega_1|\sin(kl/2)|$ over the first Brillouin zone $[-\pi/l, +\pi/l]$, with maximum $2\omega_1$ at the zone edges; the slope of the curve at the origin gives the speed of sound in the crystal.

Condition (82) means that the wavelength $2\pi/|k|$ associated with the mode being considered must be much greater than the separation $l$ between nuclei. For such wavelengths, the discontinuous structure of the chain is negligible, and the medium is not dispersive: the phase velocity $\Omega(k)/k$ is independent of $k$, which implies that a wave packet involving only small values of $|k|$ (of the same sign) propagates without being deformed at the velocity $c_s$. Since acoustical wavelengths satisfy (82), $c_s$ is the speed of sound in the crystal. When $|k|$ is of the order of $1/l$, the discontinuous structure of the chain becomes important, and the angular frequency $\Omega(k)$ increases less rapidly with $|k|$ than formula (83) would indicate (in Figure 5, the curve deviates from the straight dashed lines which are its tangents at the origin). The medium is then dispersive, and a wave packet moves with a group velocity:
$$V_G = \frac{\mathrm{d}\Omega(k)}{\mathrm{d}k} \neq \frac{\Omega(k)}{k} \tag{85}$$
Finally, when $k$ approaches the edges of the first Brillouin zone ($k = \pm\pi/l$), we see from Figure 5 that the group velocity approaches zero. As in an electromagnetic waveguide, the propagation velocity goes to zero when the cut-off frequency (here $2\omega_1/2\pi$) is attained. Figure 5 can also be seen as giving the spectrum of possible phonon energies $\hbar\Omega(k)$ in terms of their momentum $\hbar k$. Knowledge of such a spectrum, for a real crystal, is very important. It gives the precise energies and momenta that the crystal can supply or absorb when it interacts with another system. For example, the inelastic scattering of light by a crystal (the Brillouin effect) can be interpreted as the result of the annihilation or creation of a phonon, with a change in the energy and momentum of the incident photon (total energy and momentum being conserved throughout the process).
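As a rough numerical illustration of the long-wavelength limit (a sketch with made-up order-of-magnitude values, not data from the text), one can check on the dispersion relation (81) that the ratio $\Omega/k$ approaches the speed of sound (84) for small $kl$ and falls below it near the zone edge:

```python
import numpy as np

w1, l = 3.0e13, 3.0e-10        # illustrative values only: rad/s and m (typical crystal scales)
c_s = w1 * l                   # speed of sound, formula (84); here about 9 km/s

k = np.logspace(6, 10, 5)      # wave vectors from acoustic to zone-edge scales (1/m)
Omega = 2 * w1 * np.abs(np.sin(k * l / 2))      # phonon dispersion relation (81)
for ki, Oi in zip(k, Omega):
    print(f"k = {ki:.1e} 1/m   Omega/k = {Oi/ki:.1f} m/s   (c_s = {c_s:.1f} m/s)")
# for k*l << 1 the ratio Omega/k approaches c_s; near the zone edge it falls below it
```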


Comment:

The simple one-dimensional model described here has allowed us to present some important physical concepts, which remain valid for a real crystal: energy quanta associated with the normal modes, dispersion of the medium, allowed and forbidden frequency (and, therefore, energy) bands. In reality, the crystalline lattice is three-dimensional, and a normal mode is characterized by a true wave vector k; Ω then depends, in general, not only on the absolute value of k, but also on its direction. Also, the situation may arise (as is the case for an ionic crystal) in which the vertices of the lattice are not all occupied by identical particles, but rather, for example, by two different types of particles in regular alternation4 . Then, for each wave vector k, several angular frequencies Ω(k) appear. Some of them, which go to zero when k 0, constitute “acoustic branches” like the one encountered above; the others belong to what are called “optical branches”5 , in which a phonon of zero momentum has a non-zero energy. It would be out of the question to study all these problems here, although they are of primordial importance in solid state physics. References and suggestions for further reading:

Chains of coupled classical oscillators: Berkeley 3 (7.1), §§ 2.4 and 3.5. See section 13 of the bibliography, particularly Kittel (13.2), Chap. 5. Other examples of collective oscillations: Feynman III (1.2), Chap. 15.

4 A real crystal also contains impurities and imperfections distributed at random. Here we are speaking only about perfect crystals. 5 This name arises from the fact that, in an ionic crystal, the “optical” phonons are coupled to electromagnetic waves like those in the visible domain, whose wavelengths are much larger than the atomic separation.

629




Complement KV Vibrational modes of a continuous physical system. Application to radiation; photons

1. Outline of the problem
2. Vibrational modes of a continuous mechanical system: example of a vibrating string
   2-a. Notation. Dynamical variables of the system
   2-b. Classical equations of motion
   2-c. Introduction of the normal variables
   2-d. Classical Hamiltonian
   2-e. Quantization
3. Vibrational modes of radiation: photons
   3-a. Notation. Equations of motion
   3-b. Introduction of the normal variables
   3-c. Classical Hamiltonian
   3-d. Quantization

1. Outline of the problem

In Complements HV and JV , we introduced the idea of normal variables for a system of two or a countable infinity of coupled harmonic oscillators. The aim of this complement is to show that the same ideas can also be applied to the electromagnetic field, which is a continuous physical system (there is no natural lower bound for the wavelength of radiation). The quantization of the electromagnetic field will be studied in much more detail in Chapter XIX. This study raises a certain number of delicate problems. Therefore, before beginning, and in order to make a smooth transition between this complement and the preceding ones, HV and JV , we shall start by studying in § 2 the vibrational modes of a continuous mechanical system: a vibrating string. It is obvious that, on the atomic scale, such a system is not continuous: the string is composed of a very large number of atoms. However, we shall ignore this atomic structure and treat the string as if it were really continuous, since the fundamental aim of the calculation is to show how normal variables can be introduced for a continuous system. Since, moreover, we are dealing with a mechanical system, we can, without difficulty, define the conjugate momenta of the normal variables, calculate the Hamiltonian of the system, and show that it indeed appears in the form of a sum of Hamiltonians of independent one-dimensional harmonic oscillators. We shall also discuss in detail the quantization of such a system. The results obtained in § 2 will enable us to attack, in § 3, the problem of the vibrational modes of radiation. We shall show that the study of radiation confined to a parallelepiped cavity leads to equations very similar to those of the vibrating string. The 631




same transformations allow the introduction of completely uncoupled normal variables for the radiation (associated with the standing waves which can exist inside the cavity). Then we shall generalize the results obtained in § 2 to derive simply the concept of a photon (as it is not feasible here to show rigorously how one can introduce, for a nonmechanical system like the electromagnetic field, conjugate momenta, a Lagrangian and a Hamiltonian).

2. Vibrational modes of a continuous mechanical system: example of a vibrating string

2-a. Notation. Dynamical variables of the system

Figure 1: Vibrating string passing through two fixed points $O$ and $P$ and subjected to a tension $F$; $u(x, t)$ denotes the deviation with respect to the equilibrium position of the point of the string situated at a distance $x$ from $O$.

The string is fixed at a point $O$ (Fig. 1). It passes through a very small hole pierced in a plate at $P$, and a weight exerts a tension $F$ on it. For simplicity, we assume that the string is inextensible (it has a constant length) and that it always remains in the same plane, passing through $O$ and $P$. Its state is defined at time $t$ when we know at this time the displacement $u(x, t)$ of the various points (labeled by their abscissas $x$ on $OP$), as well as the corresponding velocities $\partial u(x, t)/\partial t$. The constraints imposed at $O$ and $P$ are expressed by the boundary conditions:
$$u(0, t) = u(L, t) = 0 \tag{1}$$
where $0$ and $L$ are the abscissas of the points $O$ and $P$. It is important to remember that, in this problem, the dynamical variables are the displacements $u(x, t)$ at each point of abscissa $x$: there is a continuous infinity of dynamical variables. Consequently, $x$ is not a dynamical variable, but rather a continuous index which labels the dynamical variable with which we are concerned ($x$ plays the same role as indices 1 and 2 of Complement HV or as the index $q$ of Complement JV).

2-b. Classical equations of motion

Let $\mu$ be the mass per unit length of the string. If we assume the string to be perfectly flexible and if we confine ourselves to small displacements, a classical calculation enables us to obtain the partial differential equation satisfied by $u(x, t)$. We find:
$$\left[\frac{1}{v^2}\,\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2}\right]u(x, t) = 0 \tag{2}$$
where:
$$v = \sqrt{\frac{F}{\mu}} \tag{3}$$

is the propagation velocity of a perturbation along the string. Such an equation expresses the fact that the evolution of the variable $u(x, t)$ corresponding to the point $x$ depends on the variables at infinitely near points (via $\partial^2 u/\partial x^2$). Thus the variables $u(x, t)$ are all coupled to each other. We can then ask the following question: is it possible, as in Complements HV and JV, to introduce new, uncoupled variables that are linear combinations of the variables $u(x, t)$ associated with the various points $x$?

2-c. Introduction of the normal variables

Consider the set of functions of $x$:
$$f_n(x) = \sqrt{\frac{2}{L}}\,\sin\frac{n\pi x}{L} \tag{4}$$
where $n$ is a positive integer: $n = 1, 2, 3, \ldots$ The $f_n(x)$ satisfy the same boundary conditions as $u(x, t)$:
$$f_n(0) = f_n(L) = 0 \tag{5}$$
In addition, it is easy to verify the relations:
$$\int_0^L f_n(x)\,f_{n'}(x)\,\mathrm{d}x = \delta_{nn'} \tag{6}$$
(orthonormalization relation) and:
$$\left[\frac{\mathrm{d}^2}{\mathrm{d}x^2} + \frac{n^2\pi^2}{L^2}\right]f_n(x) = 0 \tag{7}$$

It can be shown that any function which goes to zero at $x = 0$ and $x = L$ [as is the case for $u(x, t)$] can be expanded in one and only one way in terms of the $f_n(x)$. We can therefore write:
$$u(x, t) = \sum_{n=1}^{\infty} q_n(t)\,f_n(x) \tag{8}$$
$q_n(t)$ can be obtained easily, using (6):
$$q_n(t) = \int_0^L u(x, t)\,f_n(x)\,\mathrm{d}x \tag{9}$$





The state of the string at the instant $t$ can be defined either by the set of values $\left\{u(x, t),\ \partial u(x, t)/\partial t\right\}$ corresponding to the various points $x$, or by the set of numbers $\left\{q_n(t),\ \dot{q}_n(t)\right\}$. The new variables $q_n(t)$ just introduced are linear combinations of the old $u(x, t)$, as can be seen from (9). The converse is obviously also true [cf. formula (8)]. To obtain the equation satisfied by the $q_n(t)$, we substitute expansion (8) into the equation of motion (2). Using (7), we obtain, after a simple calculation:
$$\sum_{n=1}^{\infty}\left[\frac{1}{v^2}\,\frac{\mathrm{d}^2 q_n}{\mathrm{d}t^2} + \frac{n^2\pi^2}{L^2}\,q_n(t)\right]f_n(x) = 0 \tag{10}$$
that is, since the $f_n(x)$ are linearly independent:
$$\frac{\mathrm{d}^2 q_n}{\mathrm{d}t^2} + \omega_n^2\,q_n(t) = 0 \tag{11}$$
with:
$$\omega_n = \frac{n\pi v}{L} \tag{12}$$
Thus we see that the new variables $q_n(t)$, also called "normal variables", evolve independently: they are uncoupled. Moreover, equation (11) is identical to that of a one-dimensional harmonic oscillator of angular frequency $\omega_n$, so that:
$$q_n(t) = q_n^0\cos(\omega_n t - \varphi_n) \tag{13}$$

Each of the terms $q_n(t)\,f_n(x)$ appearing on the right-hand side of (8) consequently represents a standing wave of frequency $\omega_n/2\pi$ and half-wavelength $L/n$. Each normal variable is therefore associated with a vibrational normal mode of the string, the most general motion of the string being a linear superposition of these normal modes.
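As an illustration of this mode expansion (a sketch, not from the original text; the triangular "plucked string" initial shape and the parameter values are arbitrary choices), one can compute the coefficients $q_n(0)$ from (9) and reconstruct $u(x, t)$ from (8) and (13):

```python
import numpy as np

L_str, v, n_modes = 1.0, 1.0, 50          # string length, wave velocity (3), modes kept
x = np.linspace(0.0, L_str, 501)
u0 = np.where(x < 0.3, x / 0.3, (L_str - x) / (L_str - 0.3)) * 0.01   # plucked shape, at rest

n = np.arange(1, n_modes + 1)
f = np.sqrt(2 / L_str) * np.sin(np.outer(n, np.pi * x / L_str))       # modes f_n(x), formula (4)
q0 = np.trapz(f * u0, x, axis=1)                                      # coefficients q_n(0), formula (9)
omega = n * np.pi * v / L_str                                         # frequencies (12)

def u(t):
    # the initial velocity is zero, so each q_n(t) = q_n(0) cos(omega_n t), cf. (13)
    return (q0 * np.cos(omega * t)) @ f

print("reconstruction error at t=0:", np.max(np.abs(u(0.0) - u0)))    # small, limited by n_modes
```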

Comment:

In Complement JV, we started with an infinite discrete set of harmonic oscillators and introduced a continuous infinity of normal variables. Here we find ourselves in the opposite situation: the $u(x, t)$ form a continuous set with respect to the index $x$, while, because of the boundary conditions, the normal variables $q_n(t)$ are labeled by a discrete index $n$.

2-d.

.

Classical Hamiltonian

Kinetic energy

The kinetic energy of the segment of string included between and + d is 2 1 ( ) d . It follows that the total kinetic energy kin of the string is equal to: 2 kin

634

=

( 2

0

)

2

d

(14)

• kin

VIBRATIONAL MODES OF A CONTINUOUS PHYSICAL SYSTEM. PHOTONS

can be expressed simply in terms of the

kin

=

d ( )d ( ) d d

2

( )

, using (8): ( )d

(15)

0

which can also be written, taking (6) into account:

kin

.

=

2

d d

2

(16)

Potential energy

Consider the segment of string included between the abscissas makes an angle with the axis such that: (

tan =

)

and

+ d . It

(17)

Its length is therefore equal to: d cos

1 2

1 + tan2

=d

(18)

Since the displacements are small, d cos

=d

1+

1 2

(

is very small, and we can write:

2

)

(19)

From this we deduce that the total increase in the length of the string with respect to its equilibrium position (which corresponds to 0 for all ) is equal to: ∆ =

1 2

(

)

2

d

(20)

0

Now, ∆ represents the distance over which the end of the inextensible rope have risen, which means that the weight has the same motion. The potential energy pot of the string, with respect to the value corresponding to the equilibrium position, is therefore equal to:

pot

=

∆ =

1 2

(

)

2

d

(21)

0

pot can also be expressed in terms of the normal variables tion, using (8) and (4), yields:

. A simple calcula-

2 2 pot

=

2

2

2

(22) 635

COMPLEMENT KV

.



γ. Conjugate momenta of the q_k; classical Hamiltonian

The Lagrangian ℒ = H_kin − H_pot of the system (cf. Appendix III) can be written:

\mathcal{L} = \frac{\mu}{2} \sum_k \left[ \dot q_k^2 - \omega_k^2 q_k^2 \right]    (23)

From this, we deduce the expression for the conjugate momentum p_k of q_k:

p_k = \frac{\partial \mathcal{L}}{\partial \dot q_k} = \mu\, \dot q_k    (24)

so that we finally obtain, for the Hamiltonian H = H_kin + H_pot of the system, the expression:

H = \sum_k \left[ \frac{p_k^2}{2\mu} + \frac{1}{2} \mu \omega_k^2 q_k^2 \right]    (25)

that is:

H = \sum_k h_k    (26)

with:

h_k = \frac{p_k^2}{2\mu} + \frac{1}{2} \mu \omega_k^2 q_k^2    (27)

Since q_k and p_k are conjugate variables, we recognize h_k to be the Hamiltonian of a one-dimensional harmonic oscillator of angular frequency ω_k. H is therefore a sum of Hamiltonians of independent one-dimensional harmonic oscillators (independent, since the normal variables are uncoupled). It is useful to introduce, as in Complements B_V and C_V, the dimensionless variables:

\hat q_k = \beta_k\, q_k    (28a)
\hat p_k = \frac{1}{\hbar \beta_k}\, p_k    (28b)

where:

\beta_k = \sqrt{\frac{\mu \omega_k}{\hbar}}    (29)

is a (dimensional) constant. H can then be written:

H = \sum_k \frac{\hbar \omega_k}{2} \left[ \hat q_k^2 + \hat p_k^2 \right]    (30)

2-e. Quantization

α. Preliminary comment

The calculations performed in this section are, of course, not intended to reveal quantum effects in the motion of a macroscopic vibrating string. The vibrational frequencies ω_k/2π which can be excited on such a string are so low (of the order of a kilohertz at most), and the elementary energies ħω_k so much smaller than the macroscopic energy of the string, that a classical description is quite sufficient. It might be thought that ω_k could be as large as we like because k has no upper bound in formula (12). In fact, for sufficiently small wavelengths 2L/k, the rigidity of the string can no longer be neglected, and equation (2) is no longer valid. Furthermore, as we pointed out in the introduction, the string is not really a continuous system, and it would make no sense to consider wavelengths smaller than the interatomic distance. The calculations which will be presented here must be considered as a simple first approach to problems posed by the quantum mechanical description of radiation. Now, radiation constitutes a truly continuous system (no natural lower bound exists for the wavelength) and satisfies an equation which is analogous to (2), whatever frequencies and wavelengths may be involved¹.
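To put rough numbers on this remark, here is a small order-of-magnitude check in Python. The 1 kHz frequency and millijoule vibration energy are illustrative assumptions of mine, not values from the text.

```python
import numpy as np

hbar = 1.054571817e-34   # J s
kB   = 1.380649e-23      # J / K

# Illustrative numbers (assumptions): a string mode at 1 kHz carrying about 1 mJ.
nu = 1.0e3               # mode frequency in Hz
E_macro = 1.0e-3         # typical macroscopic vibration energy in J

quantum = 2 * np.pi * nu * hbar          # elementary energy hbar*omega of one quantum
print(quantum)                           # ~ 6.6e-31 J
print(E_macro / quantum)                 # ~ 1.5e27 quanta: a classical description suffices
print(quantum / (kB * 300.0))            # ~ 1.6e-10: hbar*omega << kT at room temperature
```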

β. Eigenstates and eigenvalues of the quantum mechanical Hamiltonian

We quantize each oscillator by associating with \hat q_k and \hat p_k [see formula (28)] observables \hat Q_k and \hat P_k such that:

[\hat Q_k, \hat P_k] = i    (31)

Since the normal variables are uncoupled, we also assume that the operators relating to two different oscillators commute. We therefore have:

[\hat Q_k, \hat P_{k'}] = i\, \delta_{kk'}    (32)

Let:

H_k = \frac{\hbar \omega_k}{2} \left[ \hat Q_k^2 + \hat P_k^2 \right]    (33)

be the quantum mechanical Hamiltonian of oscillator k. From the results of Chapter V, we know its eigenstates and eigenvalues:

E_{n_k} = \left( n_k + \frac{1}{2} \right) \hbar \omega_k    (34)

where n_k is a non-negative integer (to simplify the notation, we shall write n_k for this integer). Since the H_k commute, we can choose the eigenstates of H = \sum_k H_k to be in the form of tensor products of the eigenstates of the H_k:

| n_1, n_2, \ldots, n_k, \ldots \rangle = | \varphi_{n_1} \rangle \otimes | \varphi_{n_2} \rangle \otimes \ldots    (35)

¹ If we were really interested in a microscopic “vibrating string” (for example, a linear macromolecule), it would be more realistic to consider, as in Complement J_V, a chain of atoms and to study not only their longitudinal displacements but also their transverse ones (transverse phonons).

The ground or “vacuum” state corresponds to all the n_k equal to zero:

| 0 \rangle = | n_1 = 0, n_2 = 0, \ldots, n_k = 0, \ldots \rangle    (36)

When we choose the energy origin to be the energy of the state |0⟩, the energy of state (35) is equal to:

E = \sum_k n_k\, \hbar \omega_k    (37)

A state such as (35) can be considered to represent a set of n_1 energy quanta ħω_1, n_2 energy quanta ħω_2, ... These vibrational quanta are analogous to the phonons studied in Complement J_V. Finally, using \hat Q_k and \hat P_k, we can introduce, as in § B of Chapter V, creation and annihilation operators for an energy quantum ħω_k:

a_k = \frac{1}{\sqrt 2} \left( \hat Q_k + i \hat P_k \right)    (38)

where a_k^\dagger is the adjoint of a_k. We then have, for the number operator N_k:

N_k = a_k^\dagger a_k    (39)

and:

a_k\, | n_1, n_2, \ldots, n_k, \ldots \rangle = \sqrt{n_k}\; | n_1, n_2, \ldots, n_k - 1, \ldots \rangle
a_k^\dagger\, | n_1, n_2, \ldots, n_k, \ldots \rangle = \sqrt{n_k + 1}\; | n_1, n_2, \ldots, n_k + 1, \ldots \rangle    (40)

All the states (35) can be expressed in terms of the vacuum state |0⟩:

| n_1, n_2, \ldots, n_k, \ldots \rangle = \frac{(a_1^\dagger)^{n_1}}{\sqrt{n_1!}}\, \frac{(a_2^\dagger)^{n_2}}{\sqrt{n_2!}} \cdots \frac{(a_k^\dagger)^{n_k}}{\sqrt{n_k!}} \cdots | 0 \rangle    (41)
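As an illustrative aside (not part of the original derivation), the algebra (38)–(41) for a single mode is easy to check numerically with matrices truncated to the lowest few Fock states; the truncation level below is an assumption made only for the example.

```python
import math
import numpy as np

nmax = 12
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)    # annihilation operator: a|n> = sqrt(n)|n-1>
adag = a.conj().T                                 # creation operator a+, the adjoint of a
N = adag @ a                                      # number operator, cf. (39)

# [a, a+] = 1 holds on states well below the truncation level:
comm = a @ adag - adag @ a
print(np.allclose(np.diag(comm)[:-1], 1.0))       # True (only the last entry is spoiled by truncation)

# Building |n> from the vacuum as in (41): (a+)^n / sqrt(n!) applied to |0>
vac = np.zeros(nmax); vac[0] = 1.0
n = 4
state = np.linalg.matrix_power(adag, n) @ vac / math.sqrt(math.factorial(n))
print(np.allclose(N @ state, n * state))          # |n> is an eigenvector of N with eigenvalue n
```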

γ. Quantum mechanical state of the system

The most general quantum mechanical state of the system is a linear superposition of the states |n_1, n_2, ..., n_k, ...⟩:

| \psi(t) \rangle = \sum_{n_1, n_2, \ldots} c_{n_1 n_2 \ldots}(t)\; | n_1, n_2, \ldots \rangle    (42)

The equation of motion of |ψ(t)⟩ is the Schrödinger equation:

i\hbar\, \frac{d}{dt}\, | \psi(t) \rangle = H\, | \psi(t) \rangle    (43)

Using (37) and (43), we easily obtain:

c_{n_1 n_2 \ldots}(t) = c_{n_1 n_2 \ldots}(0)\; e^{-i E_{n_1 n_2 \ldots} t / \hbar}    (44)

δ. Observables associated with the dynamical variables u(x, t)

When the system is quantized, u(x, t) becomes an observable U(x) which does not depend² on t, and which is obtained by replacing in (8) q_k(t) by the observable Q_k = \hat Q_k / \beta_k:

U(x) = \sum_k \frac{1}{\beta_k}\, \varphi_k(x)\, \hat Q_k = \sum_k \frac{1}{\beta_k \sqrt 2}\, \varphi_k(x) \left( a_k + a_k^\dagger \right)    (45)

We see that a displacement observable U(x) can be defined for each value of x, and that it depends linearly on the creation and annihilation operators a_k^\dagger and a_k. It is interesting to compare the mean value of U(x), ⟨U(x)⟩(t) = ⟨ψ(t)|U(x)|ψ(t)⟩, with the classical quantity u(x, t). Since, according to (40), a_k and a_k^\dagger can link only states whose energy difference is ħω_k, we deduce from (45) that the only Bohr frequencies that can appear in the evolution of ⟨U(x)⟩(t) are the frequencies ω_1/2π, ω_2/2π, ..., ω_k/2π, ... associated, respectively, with the spatial functions φ_1(x), φ_2(x), ... φ_k(x)... Thus we find for ⟨U(x)⟩(t) a linear superposition of the standing waves that can exist on the string. This analogy can, moreover, be pursued further. Let us calculate the second derivative ∂²⟨U(x)⟩(t)/∂t²; using the fact that [cf. Complement G_V, equation (17)]:

\frac{d^2}{dt^2} \langle \hat Q_k \rangle(t) = -\omega_k^2\, \langle \hat Q_k \rangle(t)    (46)

and relations (7) and (12), we easily find that the mean value of U(x) given by (45) satisfies the differential equation:

\frac{1}{v^2} \frac{\partial^2}{\partial t^2} \langle U(x) \rangle(t) - \frac{\partial^2}{\partial x^2} \langle U(x) \rangle(t) = 0    (47)

which is identical to (2). Finally, note that since Q̂_k does not commute with H_k, U(x) does not commute with H. The displacement and the total energy are therefore, in quantum mechanics, incompatible physical quantities.

² Recall that, in quantum mechanics, the time dependence is usually contained in the state vector and not in the observables (cf. discussion of § D-1-d of Chapter III).

3. Vibrational modes of radiation: photons

3-a.

Notation. Equations of motion

The classical state of the electromagnetic field at a given time t is defined when we know, for this time, the value of the components of the electric field E and the magnetic field B at each point r of space. As in § 2 above, we therefore have a continuous infinity of dynamical variables: the six components E_x, E_y, E_z, B_x, B_y, B_z at each point r. In order to stress the importance of the idea of normal variables (or normal modes) of a field, we shall introduce a simplification which consists of forgetting the vector nature of the fields E and B: we shall base our arguments on a scalar field u(r, t) (like each of the components of E and B) which obeys the equation:

\frac{1}{c^2} \frac{\partial^2}{\partial t^2} u(\mathbf r, t) - \Delta u(\mathbf r, t) = 0    (48)

where c is the speed of light. We shall assume the field to be confined to a parallelepiped cavity whose inside walls are perfectly conducting and whose edges, parallel to Ox, Oy, Oz, have, respectively, the lengths L_1, L_2, L_3. As boundary conditions, we require u(r, t) to be zero on the walls of the cavity (in the real problem, it is, for example, the tangential components of the electric field E that must go to zero on these walls). We can therefore write:

u(x = 0) = u(x = L_1) = u(y = 0) = u(y = L_2) = u(z = 0) = u(z = L_3) = 0    (49)

3-b. Introduction of the normal variables

Consider the set of functions of x, y, z:

\varphi_{lmn}(x, y, z) = \sqrt{\frac{8}{L_1 L_2 L_3}}\; \sin\frac{l\pi x}{L_1}\, \sin\frac{m\pi y}{L_2}\, \sin\frac{n\pi z}{L_3}    (50)

where l, m, n are positive integers (l, m, n = 1, 2, 3, ...). The φ_{lmn}(r) go to zero on the walls of the cavity and therefore satisfy the same boundary conditions as u(r, t):

\varphi_{lmn}(x = 0) = \varphi_{lmn}(x = L_1) = \ldots = \varphi_{lmn}(z = L_3) = 0    (51)

In addition, the following relations are simple to verify:

\int_0^{L_1}\!\! dx \int_0^{L_2}\!\! dy \int_0^{L_3}\!\! dz\; \varphi_{lmn}(\mathbf r)\, \varphi_{l'm'n'}(\mathbf r) = \delta_{ll'}\, \delta_{mm'}\, \delta_{nn'}    (52)

and:

\left[ \Delta + \pi^2 \left( \frac{l^2}{L_1^2} + \frac{m^2}{L_2^2} + \frac{n^2}{L_3^2} \right) \right] \varphi_{lmn}(\mathbf r) = 0    (53)

Any function that goes to zero on the walls of the cavity, u(r, t) in particular, can be expanded in one and only one way in terms of the φ_{lmn}(r). We therefore have:

u(\mathbf r, t) = \sum_{l, m, n} q_{lmn}(t)\, \varphi_{lmn}(\mathbf r)    (54)

Formula (54) can easily be inverted, with the help of (52):

q_{lmn}(t) = \int_0^{L_1}\!\! dx \int_0^{L_2}\!\! dy \int_0^{L_3}\!\! dz\; u(\mathbf r, t)\, \varphi_{lmn}(\mathbf r)    (55)



Thus we see that the field at time t is described either by the set of variables u(r, t), or by the set of variables q_{lmn}(t). Formulas (54) and (55) enable us to go from one set to the other. Substituting (54) into (48) and using (53), we obtain, after a simple calculation:

\frac{d^2}{dt^2} q_{lmn}(t) + \omega_{lmn}^2\, q_{lmn}(t) = 0    (56)

where:

\omega_{lmn}^2 = \pi^2 c^2 \left( \frac{l^2}{L_1^2} + \frac{m^2}{L_2^2} + \frac{n^2}{L_3^2} \right)    (57)

The normal variables q_{lmn}(t) are therefore uncoupled. According to (56), q_{lmn}(t) varies like \cos(\omega_{lmn} t - \theta_{lmn}). Each of the terms q_{lmn}(t) φ_{lmn}(r) of the sum (54) therefore represents a standing wave (a vibrational normal mode of the field in the cavity) characterized by its frequency ω_{lmn}/2π and its spatial dependence in the three directions Ox, Oy, Oz (half-wavelengths L_1/l, L_2/m and L_3/n respectively). Thus we have been able to generalize the results of § 2-c without difficulty. Note, nevertheless, that when the vectorial nature of the electromagnetic field is taken into account, the structure of the modes is more complex. However, the general idea is the same, and one reaches similar conclusions.
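For orientation (and only as an illustration of mine, not part of the text), the mode frequencies ω_{lmn}/2π following from (57) are easy to tabulate; the cubic 1 cm cavity below is an arbitrary choice.

```python
import numpy as np
from itertools import product

c = 2.998e8                      # speed of light, m/s
L1 = L2 = L3 = 1e-2              # illustrative cavity: a 1 cm cube (an assumption)

# Frequencies nu_lmn = omega_lmn / 2pi from (57), for the lowest values of (l, m, n).
modes = []
for l, m, n in product(range(1, 5), repeat=3):
    nu = (c / 2.0) * np.sqrt((l / L1) ** 2 + (m / L2) ** 2 + (n / L3) ** 2)
    modes.append(((l, m, n), nu))

for (l, m, n), nu in sorted(modes, key=lambda t: t[1])[:5]:
    print(f"mode ({l},{m},{n}): nu = {nu:.3e} Hz")   # lowest mode ~ 2.6e10 Hz for a 1 cm cube
```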

3-c. Classical Hamiltonian

Basing our discussion on the very close analogy between the results of § 2-c and those of § 3-b, we shall assume without proof that one can associate with the field u(r, t) a Lagrangian ℒ, from which can be deduced the equation of motion (48), the conjugate momenta p_{lmn}(t) of the normal variables, and finally, the expression for the Hamiltonian H of the system. The only point which is important here is that this Hamiltonian is analogous to (30):

H = \sum_{l, m, n} \frac{\hbar \omega_{lmn}}{2} \left[ (\hat q_{lmn})^2 + (\hat p_{lmn})^2 \right]    (58)

where \hat q_{lmn} and \hat p_{lmn} are dimensionless variables proportional to the q_{lmn} and p_{lmn}:

\hat q_{lmn} = \beta_{lmn}\, q_{lmn} \qquad\qquad \hat p_{lmn} = \frac{1}{\hbar \beta_{lmn}}\, p_{lmn}    (59)

β_{lmn} is a (dimensional) constant which is analogous to the one introduced in (29).

Comments:

(i) The equation of motion of each normal variable [established in (56)] is analogous to that of a one-dimensional harmonic oscillator of angular frequency ω_{lmn}. Thus we see why we obtain for H a sum of Hamiltonians of independent one-dimensional harmonic oscillators. It is possible, moreover, to obtain (56) from (58). The Hamilton-Jacobi equations (cf. Appendix III) can, in fact, be written, taking (59) into account:

\frac{d\hat q_{lmn}}{dt} = \frac{1}{\hbar} \frac{\partial H}{\partial \hat p_{lmn}} \qquad\qquad \frac{d\hat p_{lmn}}{dt} = -\frac{1}{\hbar} \frac{\partial H}{\partial \hat q_{lmn}}    (60)

that is, with the form (58) of H:

\frac{d\hat q_{lmn}}{dt} = \omega_{lmn}\, \hat p_{lmn}    (61a)

\frac{d\hat p_{lmn}}{dt} = -\omega_{lmn}\, \hat q_{lmn}    (61b)

Eliminating \hat p_{lmn} between these two equations, we indeed find (56).

(ii) For the real electromagnetic field, composed of the two fields E and B, one can also obtain expression (58) for H directly without using the Lagrangian. One simply takes the total energy of the field as the sum of the electrical and magnetic energies contained in the cavity:

H = \int_0^{L_1}\!\! dx \int_0^{L_2}\!\! dy \int_0^{L_3}\!\! dz \left[ \frac{\varepsilon_0}{2}\, \mathbf E^2 + \frac{1}{2\mu_0}\, \mathbf B^2 \right]    (62)

and uses for E and B expansions analogous to (54). Thus one finds that the terms in \hat q_{lmn}^2 and \hat p_{lmn}^2 of (58) correspond respectively to the electrical and magnetic energies.

3-d. Quantization

Now, starting with equation (58), we can carry out the same operations as in § 2-e.

α. Eigenstates and eigenvalues of H

We associate with \hat q_{lmn} and \hat p_{lmn} two observables \hat Q_{lmn} and \hat P_{lmn} whose commutator is equal to i. Since observables relating to two different modes commute, we can write:

[\hat Q_{lmn}, \hat P_{l'm'n'}] = i\, \delta_{ll'}\, \delta_{mm'}\, \delta_{nn'}    (63)

Let:

H_{lmn} = \frac{\hbar \omega_{lmn}}{2} \left[ \hat Q_{lmn}^2 + \hat P_{lmn}^2 \right]    (64)

be the Hamiltonian associated with the mode (l, m, n). We know its eigenstates and eigenvalues:

E_{n_{lmn}} = \left( n_{lmn} + \frac{1}{2} \right) \hbar \omega_{lmn}    (65)

where n_{lmn} is a non-negative integer.



Since the H_{lmn} commute with each other, we can choose the eigenstates of H = \sum_{l,m,n} H_{lmn} to be in the form of tensor products:

| n_{111}, n_{211}, n_{121}, n_{112}, \ldots \rangle    (66)

The ground state, called the “vacuum”, corresponds to all the n_{lmn} equal to zero:

| 0 \rangle = | n_{111} = 0, n_{211} = 0, \ldots \rangle    (67)

With respect to the vacuum state energy, the energy of state (66) is equal to:

E = \sum_{l, m, n} n_{lmn}\, \hbar \omega_{lmn}    (68)

A state such as (66) can be considered to represent a set of n_{111} energy quanta ħω_{111}, n_{211} energy quanta ħω_{211}, ... These quanta are none other than the photons. Thus we see that a certain type of photon is associated with each normal mode of the cavity. We can, as in (38), introduce annihilation and creation operators for a photon of type (l, m, n):

a_{lmn} = \frac{1}{\sqrt 2} \left( \hat Q_{lmn} + i \hat P_{lmn} \right) \qquad\qquad a_{lmn}^\dagger = \frac{1}{\sqrt 2} \left( \hat Q_{lmn} - i \hat P_{lmn} \right)    (69)

and establish formulas identical to (39), (40) and (41):

N_{lmn} = a_{lmn}^\dagger\, a_{lmn}    (70)

a_{lmn}\, | n_{111}, \ldots, n_{lmn}, \ldots \rangle = \sqrt{n_{lmn}}\; | n_{111}, \ldots, n_{lmn} - 1, \ldots \rangle
a_{lmn}^\dagger\, | n_{111}, \ldots, n_{lmn}, \ldots \rangle = \sqrt{n_{lmn} + 1}\; | n_{111}, \ldots, n_{lmn} + 1, \ldots \rangle    (71)

| n_{111}, n_{211}, \ldots \rangle = \frac{(a_{111}^\dagger)^{n_{111}}}{\sqrt{n_{111}!}}\, \frac{(a_{211}^\dagger)^{n_{211}}}{\sqrt{n_{211}!}} \cdots | 0 \rangle    (72)

β. Quantum state of the field

The most general state of the field is a linear superposition of the states (66):

| \psi(t) \rangle = \sum c_{n_{111}, n_{211}, \ldots}(t)\; | n_{111}, n_{211}, \ldots \rangle    (73)

The Schrödinger equation:

i\hbar\, \frac{d}{dt}\, | \psi(t) \rangle = H\, | \psi(t) \rangle    (74)

enables us to obtain the coefficients c_{n_{111}, n_{211}, \ldots}(t) in the form:

c_{n_{111}, n_{211}, \ldots}(t) = c_{n_{111}, n_{211}, \ldots}(0)\; e^{-iEt/\hbar}    (75)

γ. Field operator

Upon quantization, u(r, t) becomes an observable U(r) which no longer depends on t and is obtained by replacing in (54) q_{lmn}(t) by Q_{lmn} = \hat Q_{lmn}/\beta_{lmn}:

U(\mathbf r) = \sum_{l, m, n} \frac{1}{\beta_{lmn}}\, \varphi_{lmn}(\mathbf r)\, \hat Q_{lmn}    (76)

One can also, with the help of (69), express U(r) in terms of the creation and annihilation operators:

U(\mathbf r) = \sum_{l, m, n} \frac{1}{\beta_{lmn} \sqrt 2}\, \varphi_{lmn}(\mathbf r) \left( a_{lmn} + a_{lmn}^\dagger \right)    (77)

The same arguments as in § 2-e-δ enable us to show, using formulas (71) and (75), that the only Bohr frequencies that can appear in the time evolution of the mean value of the field:

\langle U(\mathbf r) \rangle(t) = \langle \psi(t) | U(\mathbf r) | \psi(t) \rangle

are the frequencies ω_{111}/2π, ω_{211}/2π, ..., ω_{lmn}/2π, ... associated, respectively, with the spatial functions φ_{111}(r), φ_{211}(r), ... φ_{lmn}(r)... Thus we find for ⟨U(r)⟩(t) a linear superposition of the classical standing waves that can exist in the cavity. A calculation identical to the one in § 2-e-δ would enable us to show that ⟨U(r)⟩(t) satisfies equation (48). Finally, we find that U(r) and H do not commute. It is therefore impossible, in quantum theory, to know simultaneously and with certainty both the number of photons and the value of the electromagnetic field at a point in space.

Comment:

For the electromagnetic field, coherent states can be constructed which are analogous to the ones introduced in Complement G_V and which represent the best possible compromise between the incompatible quantities, field and energy.

δ. Vacuum fluctuations

We saw in § D-1 of Chapter V that, in the ground state of a harmonic oscillator, the mean value of X is zero while that of X² is not, and we discussed the physical meaning of this typically quantum mechanical effect. In the problem we are studying here, U(r) presents many analogies with the operator X of Chapter V; we see from (77) that U(r) is a linear combination of creation and annihilation operators. Consider the mean value of U(r) in the ground state |0⟩ of the field, that is, the “vacuum” state of photons. Since the diagonal elements of a_{lmn} and a_{lmn}^† are zero according to (71), we see that:

\langle 0 |\, U(\mathbf r)\, | 0 \rangle = 0    (78)

On the other hand, the corresponding matrix element of [U(r)]² is not zero. According to (71):

\langle 0 |\, a_{lmn}\, a_{l'm'n'}\, | 0 \rangle = \langle 0 |\, a_{lmn}^\dagger\, a_{l'm'n'}^\dagger\, | 0 \rangle = \langle 0 |\, a_{lmn}^\dagger\, a_{l'm'n'}\, | 0 \rangle = 0
\langle 0 |\, a_{lmn}\, a_{l'm'n'}^\dagger\, | 0 \rangle = \delta_{ll'}\, \delta_{mm'}\, \delta_{nn'}    (79)

Therefore, a simple calculation enables us to establish, using (77), that:

\langle 0 |\, [U(\mathbf r)]^2\, | 0 \rangle = \sum_{l, m, n} \frac{1}{2 \beta_{lmn}^2} \left[ \varphi_{lmn}(\mathbf r) \right]^2    (80)

From this we see that in the vacuum, that is, in the absence of photons, the electromagnetic field (r) at a point of space has a zero mean value but a non-zero root mean square deviation. This means, for example, that if we perform one measurement of (r), we can find a non-zero result (varying, of course, from one measurement to another), even if there is no photon present in space. This effect has no equivalent in classical theory, in which, when the energy is zero, the field is rigorously zero. The preceding result is often expressed by saying that the “vacuum state” of photons is subjected to fluctuations of the field, characterized by (78) and (80) and called vacuum fluctuations. The existence of these fluctuations has interesting physical consequences for the interaction of an atomic system with the electromagnetic field. Consider, for example, an atom in a state of energy , interacting with a classically represented electromagnetic wave. We shall see in Complement AXIII , using time-dependent perturbation theory (cf. Chap. XIII), that under the effect of such an excitation, the atom can move to a higher energy state (absorption) or to a lower energy state (induced emission). But in this semi-classical treatment, if no field is present in space, the atom must remain indefinitely in the state . In fact, we have just established that, even in the absence of incident photons, the atom “sees” the “vacuum fluctuations” related to the quantum mechanical nature of the electromagnetic field. Under the effect of these fluctuations, it can emit a photon and fall back into a lower energy state (the energy of the global system being conserved during this process). This is the phenomenon of spontaneous emission, which can thus be considered to be, as it were, an “emission induced by the vacuum fluctuations”. (No spontaneous absorption is possible, since this would cause the atom to move to a higher energy state, and no electromagnetic energy can be extracted from the field, which is in its ground state.) It can also be shown that another effect of “vacuum fluctuations” is to impart to the atomic electrons an erratic motion which slightly modifies the energies of the levels. The observation of this effect in the hydrogen atom spectrum (the “Lamb shift”) constituted the point of departure for the development of modern quantum electrodynamics.

Comment:

In the preceding discussions, we have always chosen the energy of the vacuum state as the energy origin. In fact, harmonic oscillator theory gives us the absolute value of the energy of the vacuum state:

E_0 = \sum_{l, m, n} \frac{1}{2}\, \hbar \omega_{lmn}    (81)

There is obviously a close relationship between E_0 and the electrical and magnetic energy associated with “vacuum fluctuations”. One of the difficulties of quantum electrodynamics, of which we have just given a brief overview, is that the sum (81) is in fact infinite, as is, moreover, (80)! Nevertheless, it is possible to surmount this difficulty: using the procedure called “renormalization”, one manages to bypass infinite quantities and calculate the physical effects that are actually observable, such as the “Lamb shift”, with remarkable accuracy. It is obviously out of the question to consider these vast problems here.
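A rough numerical illustration of this divergence (my own, with an arbitrarily chosen cubic cavity) follows: the partial sums of the zero-point energy (81) keep growing as the cutoff on the mode indices is raised, instead of converging.

```python
import numpy as np
from itertools import product

# Illustration of why (80) and (81) diverge: the zero-point energy grows without
# bound as the cutoff on (l, m, n) is raised. Cavity size is an assumption.
hbar, c = 1.054571817e-34, 2.998e8
L = 1e-2                                        # cubic cavity of side 1 cm

def zero_point_energy(cutoff):
    E0 = 0.0
    for l, m, n in product(range(1, cutoff + 1), repeat=3):
        omega = np.pi * c * np.sqrt(l**2 + m**2 + n**2) / L   # cf. (57) with L1 = L2 = L3 = L
        E0 += 0.5 * hbar * omega
    return E0

for cutoff in (10, 20, 40):
    print(cutoff, zero_point_energy(cutoff))    # grows roughly like cutoff**4: no finite limit
```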


References and suggestions for further reading:

Vibration modes of a continuous string in classical mechanics: Berkeley 3 (1.1), §§ 2.1, 2.2 and 2.3.
Quantization of the electromagnetic field: Mandl (2.9); Schiff (1.18), Chap. 14; Messiah (1.17), Chap. XXI; Bjorken and Drell (2.10), Chap. 11; Power (2.11); Heitler (2.13).
The Lamb shift: Lamb and Retherford (3.11); Frisch (3.13); Kuhn (11.1), Chap. III, § A 5 e; Series (11.7), Chaps. VIII, IX and X.



Complement L_V
One-dimensional harmonic oscillator in thermodynamic equilibrium at a temperature T

1. Mean value of the energy
 1-a. Partition function
 1-b. Calculation of ⟨H⟩
2. Discussion
 2-a. Comparison with the classical oscillator
 2-b. Comparison with a two-level system
3. Applications
 3-a. Blackbody radiation
 3-b. Bose-Einstein distribution law
 3-c. Specific heats of solids at constant volume
4. Probability distribution of the observable X
 4-a. Definition of the probability density ρ(x)
 4-b. Calculation of ρ(x)
 4-c. Discussion
 4-d. Bloch’s theorem

This complement is devoted to the study of the physical properties of a one-dimensional harmonic oscillator in thermodynamic equilibrium with a reservoir at temperature T. We know (cf. Complement E_III) that such an oscillator is not in a pure state (it is impossible to describe its state by a ket). The partial information which we possess about it and the results of statistical mechanics enable us to characterize it by a statistical mixture of stationary states |φ_n⟩ with weights proportional to e^{-E_n/kT} (k: Boltzmann constant; E_n: energy of the state |φ_n⟩). We saw in Complement E_III (§ 5-a) that the corresponding density operator is then written:

\rho = \frac{1}{Z}\, e^{-H/kT}    (1)

where H is the Hamiltonian operator, and:

Z = {\rm Tr}\; e^{-H/kT}    (2)

is a normalization factor which insures that:

{\rm Tr}\, \rho = 1    (3)

(Z is the partition function, cf. Appendix VI). We shall calculate the mean value ⟨H⟩ of the oscillator’s energy, interpret the result obtained, and show how it enters into numerous problems in physics (blackbody radiation, specific heat of solids, ...). Finally, we shall establish and discuss the expression for the probability density of the particle’s position (the observable X).



1. Mean value of the energy

1-a. Partition function

The energies of the states |φ_n⟩ are, according to the results of § B of Chapter V, equal to (n + 1/2)ħω. Since the energy levels are not degenerate, we have, according to (2):

Z = \sum_{n=0}^{\infty} e^{-(n + 1/2)\hbar\omega/kT} = e^{-\hbar\omega/2kT} \left[ 1 + e^{-\hbar\omega/kT} + e^{-2\hbar\omega/kT} + \ldots \right]    (4)

Inside the brackets, we recognize a geometric progression of common ratio e^{-ħω/kT}. Therefore:

Z = \frac{e^{-\hbar\omega/2kT}}{1 - e^{-\hbar\omega/kT}}    (5)

1-b. Calculation of ⟨H⟩

According to formula (31) of Complement E_III and expression (1) for ρ:

\langle H \rangle = {\rm Tr}(\rho H) = \frac{1}{Z}\, {\rm Tr}\!\left( H\, e^{-H/kT} \right)    (6)

Writing the trace explicitly in the { |φ_n⟩ } basis, we obtain:

\langle H \rangle = \frac{1}{Z} \sum_{n=0}^{\infty} \left( n + \frac{1}{2} \right) \hbar\omega\; e^{-(n + 1/2)\hbar\omega/kT}    (7)

To calculate this quantity, we differentiate both sides of (4) with respect to T:

\frac{dZ}{dT} = \frac{1}{kT^2} \sum_{n=0}^{\infty} \left( n + \frac{1}{2} \right) \hbar\omega\; e^{-(n + 1/2)\hbar\omega/kT}    (8)

We see that:

\langle H \rangle = kT^2\, \frac{1}{Z}\, \frac{dZ}{dT}    (9)

Using (5), we then find, after a simple calculation:

\langle H \rangle = \frac{\hbar\omega}{2} + \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1}    (10)
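As a quick illustrative check (mine, not part of the text), the closed form (10) can be compared with a direct Boltzmann-weighted sum over the levels; the frequency and temperatures below are arbitrary choices.

```python
import numpy as np

hbar = 1.054571817e-34
k    = 1.380649e-23

def mean_energy_sum(omega, T, nmax=2000):
    """<H> from definition (7): Boltzmann-weighted sum over the levels (n + 1/2) hbar omega."""
    n = np.arange(nmax)
    E = (n + 0.5) * hbar * omega
    w = np.exp(-E / (k * T))
    return np.sum(E * w) / np.sum(w)

def mean_energy_closed(omega, T):
    """Closed form (10): hbar omega / 2 + hbar omega / (exp(hbar omega / kT) - 1)."""
    x = hbar * omega / (k * T)
    return hbar * omega * (0.5 + 1.0 / np.expm1(x))

omega = 2 * np.pi * 5e12          # illustrative angular frequency (an assumption)
for T in (30.0, 300.0):
    print(T, mean_energy_sum(omega, T), mean_energy_closed(omega, T), k * T)
```

The two computations agree, and the last column shows how ⟨H⟩ approaches kT only when kT ≫ ħω.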



Comments:

(i) Isotropic three-dimensional oscillator

Using the results and notation of Complement E_V, we can write:

\langle H \rangle = \langle H_x \rangle + \langle H_y \rangle + \langle H_z \rangle    (11)

where ⟨H_x⟩ is given by:

\langle H_x \rangle = \frac{1}{Z}\, {\rm Tr}\!\left( H_x\, e^{-H/kT} \right) = \frac{ \sum_{n_x, n_y, n_z = 0}^{\infty} \left( n_x + \frac{1}{2} \right) \hbar\omega\; e^{-[(n_x + 1/2) + (n_y + 1/2) + (n_z + 1/2)]\hbar\omega/kT} }{ \sum_{n_x, n_y, n_z = 0}^{\infty} e^{-[(n_x + 1/2) + (n_y + 1/2) + (n_z + 1/2)]\hbar\omega/kT} }    (12)

The sums over n_y and n_z can be separated out and are identical in the numerator and denominator, so that:

\langle H_x \rangle = \frac{ \sum_{n_x = 0}^{\infty} \left( n_x + \frac{1}{2} \right) \hbar\omega\; e^{-(n_x + 1/2)\hbar\omega/kT} }{ \sum_{n_x = 0}^{\infty} e^{-(n_x + 1/2)\hbar\omega/kT} }    (13)

Aside from the replacement of H by H_x, the final expression is identical to the one calculated in the preceding section; ⟨H_x⟩ is therefore equal to the value given in (10). It is easy to show that the same is true for ⟨H_y⟩ and ⟨H_z⟩. Therefore, we have established the following result: at thermodynamic equilibrium, the mean energy of an isotropic three-dimensional oscillator is equal to three times that of a one-dimensional oscillator of the same angular frequency.

(ii) Classical oscillator

The energy ℋ(x, p) of a classical one-dimensional oscillator is equal to:

\mathscr{H}(x, p) = \frac{p^2}{2m} + \frac{1}{2}\, m\omega^2 x^2    (14)

In expression (14), x and p can take on any values between -∞ and +∞. According to the results of classical statistical mechanics, the mean energy of this classical oscillator is given by:

\langle \mathscr{H} \rangle = \frac{ \int_{-\infty}^{+\infty}\!\! \int_{-\infty}^{+\infty} \mathscr{H}(x, p)\; e^{-\mathscr{H}(x, p)/kT}\, dx\, dp }{ \int_{-\infty}^{+\infty}\!\! \int_{-\infty}^{+\infty} e^{-\mathscr{H}(x, p)/kT}\, dx\, dp }    (15)

Substituting (14) into (15), we find, after a simple calculation:

\langle \mathscr{H} \rangle = kT    (16)

An argument analogous to that of comment (i) shows that result (16) must be multiplied by 3 when we go from one to three dimensions.

Figure 1: As a function of the temperature T, variation of the mean energy ⟨H⟩ of a quantum mechanical oscillator (solid line, starting from ħω/2) compared with that of a classical oscillator (straight dashed line ⟨ℋ⟩ = kT).

2.

Discussion

2-a.

Comparison with the classical oscillator

In Figure 1, the solid line gives the mean energy ⟨H⟩ of the one-dimensional quantum mechanical oscillator as a function of T. The dashed line corresponds to the mean energy of the classical oscillator. For T = 0, ⟨H⟩ = ħω/2. This result corresponds to the fact that at absolute zero, one is sure that the oscillator is in the ground state |φ_0⟩, with energy ħω/2 (ħω/2 is, for this reason, sometimes called the “zero point energy”). As for the classical oscillator, it is motionless (p = 0) at its stable equilibrium position (x = 0), and its energy is zero: ⟨ℋ⟩ = 0. As long as T remains small – more precisely, as long as kT ≪ ħω – only the population of the ground state is appreciable, and ⟨H⟩ remains practically equal to ħω/2: in this region, the solid-line curve of Figure 1 has a horizontal tangent. We can see this directly from expression (10), which can be written, for small T:

\langle H \rangle \simeq \frac{\hbar\omega}{2} + \hbar\omega\, e^{-\hbar\omega/kT} + \ldots    (17)

On the other hand, for large T, that is, for kT ≫ ħω, the same formula yields:

\langle H \rangle \simeq \frac{\hbar\omega}{2} + kT \left( 1 - \frac{1}{2} \frac{\hbar\omega}{kT} + \ldots \right)    (18)

or:

\langle H \rangle \simeq kT    (19)

to within an infinitesimal of the order of (ħω/kT)². The asymptote of the curve giving ⟨H⟩ as a function of T is therefore the straight line ⟨H⟩ = kT. In conclusion, the quantum mechanical and classical oscillators have the same mean energy, kT, at high temperatures (kT ≫ ħω). Striking differences appear at low temperatures (kT ≲ ħω): it is no longer possible to ignore the quantization of the oscillator’s energy once the energy kT that characterizes the reservoir becomes of the order of (or smaller than) the energy difference ħω separating two adjacent levels of the oscillator.



Figure 2: Mean energy ⟨H⟩ of a quantum mechanical system with two energy levels E_1 and E_2, in thermodynamic equilibrium at a temperature T; the curve rises from E_1 and saturates at (E_1 + E_2)/2.

2-b. Comparison with a two-level system

It is interesting to compare the preceding results with those obtained for a two-level system. Let |φ_1⟩ and |φ_2⟩ be the corresponding states, with energies E_1 and E_2 (with E_1 < E_2). For such a system, the general equation (6) yields:

\langle H \rangle = \frac{ E_1\, e^{-E_1/kT} + E_2\, e^{-E_2/kT} }{ e^{-E_1/kT} + e^{-E_2/kT} }    (20)

The mean energy of a two-level system, given by (20), is shown in Figure 2. For small T (kT ≪ E_2 − E_1), the terms in e^{-E_1/kT} are preponderant in both the numerator and the denominator of (20) (since E_1 < E_2) and we obtain:

\langle H \rangle \underset{T \to 0}{\simeq} E_1    (21)

It can be verified that the curve starts with a horizontal tangent. For large T (kT ≫ E_2 − E_1), the asymptote of the curve is the straight line parallel to the T-axis of ordinate (E_1 + E_2)/2. The preceding results are easy to understand: for T = 0, the system is in its ground state |φ_1⟩, of energy E_1; at high temperatures, the populations of the two levels are practically equal, and ⟨H⟩ approaches half the sum of the two energies E_1 and E_2. Although the solid-line curves of Figures 1 and 2 have the same shape at low temperatures, we see that this is not at all true at high temperatures. For the harmonic oscillator, ⟨H⟩ is not bounded and increases linearly with T, while, for a two-level system, ⟨H⟩ cannot exceed a certain value. This difference is due to the fact that the energy spectrum of the harmonic oscillator extends upward indefinitely: when T increases, levels of higher and higher energy are occupied, and this causes ⟨H⟩ to increase. On the other hand, for a two-level system, once the populations of the two levels are equalized, an additional increase in the temperature does not change the mean energy.
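The saturation of the two-level mean energy is easy to see numerically; the following short sketch (an illustration of mine, with arbitrarily chosen energies) evaluates (20) at a few temperatures.

```python
import numpy as np

k = 1.380649e-23

def mean_energy_two_level(E1, E2, T):
    """Mean energy (20) of a two-level system at temperature T."""
    w1, w2 = np.exp(-E1 / (k * T)), np.exp(-E2 / (k * T))
    return (E1 * w1 + E2 * w2) / (w1 + w2)

# Illustrative energies (assumptions): E1 = 0, E2 = 1e-21 J, i.e. (E2 - E1)/k ~ 72 K.
E1, E2 = 0.0, 1.0e-21
for T in (1.0, 10.0, 100.0, 1e4):
    print(T, mean_energy_two_level(E1, E2, T))
# As T grows the result saturates at (E1 + E2)/2 = 5e-22 J, unlike the oscillator,
# whose mean energy keeps growing like kT.
```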

3. Applications

3-a. Blackbody radiation

We have already pointed out, in the introduction to Chapter V (and in Complement KV , where we justified this result more precisely), that the electromagnetic field in a cavity is equivalent to a set of independent one-dimensional harmonic oscillators. Each of these oscillators is associated with one of the standing waves that can exist inside the cavity (normal modes) and has the same angular frequency as this wave. Let us show that this result, combined with those

obtained above for ⟨H⟩ and ⟨ℋ⟩, leads very simply to the Rayleigh-Jeans law and the Planck law for blackbody radiation. Let V be the volume of the cavity, whose walls are assumed to be perfectly reflecting. The first modes of the cavity (those of lowest frequency) depend strongly on the form of the cavity. On the other hand, for the high-frequency modes (those whose wavelength λ = c/ν is much smaller than the dimensions of the cavity), a classical electromagnetic calculation shows that, if n(ν) dν denotes the number of modes whose frequency is between ν and ν + dν, n(ν) is practically independent of the form of the cavity and equal to:

n(\nu) = \frac{8\pi V \nu^2}{c^3}    (22)

Let u(ν) dν be the electromagnetic energy per unit volume of the cavity contained in the frequency band (ν, ν + dν) when the cavity is in thermodynamic equilibrium at a temperature T. To obtain the energy V u(ν) dν, one must multiply the number of modes whose frequency is between ν and ν + dν by the mean energy of the corresponding harmonic oscillators. We calculated this energy above; it is equal¹ to kT or ⟨H⟩ − ħω/2, depending on whether the problem is treated classically or quantum mechanically. We then obtain, using (10), (16) and (22):

u(\nu) = \frac{8\pi \nu^2}{c^3}\, kT    (23)

in a classical treatment, and:

u(\nu) = \frac{8\pi \nu^2}{c^3}\, \frac{h\nu}{e^{h\nu/kT} - 1}    (24)
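Before discussing these two laws, here is a small numerical comparison of (23) and (24); it is an illustration of mine, with an arbitrarily chosen temperature.

```python
import numpy as np

h, c, k = 6.62607015e-34, 2.998e8, 1.380649e-23

def u_rayleigh_jeans(nu, T):
    return 8 * np.pi * nu**2 / c**3 * k * T                                # cf. (23)

def u_planck(nu, T):
    return 8 * np.pi * nu**2 / c**3 * h * nu / np.expm1(h * nu / (k * T))  # cf. (24)

T = 300.0   # illustrative temperature (an assumption)
for nu in (1e11, 1e13, 1e15):
    print(f"{nu:.0e} Hz : RJ = {u_rayleigh_jeans(nu, T):.3e}, Planck = {u_planck(nu, T):.3e}")
# The two laws agree at low frequency (h nu << kT) and differ drastically at high
# frequency, where (23) keeps growing while (24) is exponentially suppressed.
```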

in a quantum mechanical treatment. We recognize (23) to be Rayleigh-Jeans’ law and (24) to be Planck’s law, which reduces to the preceding one in the limit of low frequencies or high temperatures (hν/kT ≪ 1). The differences between these two laws reflect those which exist between the two curves of Figure 1. At high frequencies, difficulties arise in Rayleigh-Jeans’ law: the quantity u(ν) given in (23) approaches infinity when ν → ∞, which is physically absurd. In order to remedy this defect, Planck was led to postulate that the energy of each oscillator varied discontinuously, by jumps proportional to ν (energy quantization); thus he obtained formula (24), which accounts perfectly for the experimental results.

3-b. Bose-Einstein distribution law

Instead of calculating the mean value of the energy, as we did in § 1, let us calculate the mean value of the operator N = a†a. Since, according to formula (B-15) of Chapter V:

H = \left( N + \frac{1}{2} \right) \hbar\omega    (25)

we deduce from result (10) that:

\langle N \rangle = \frac{1}{e^{\hbar\omega/kT} - 1}    (26)

¹ We use ⟨H⟩ − ħω/2 and not ⟨H⟩ for the following reason: u(ν) represents an electromagnetic energy which can be extracted from the cavity. At absolute zero, all the oscillators are in their ground states and no energy can be radiated outward because the system is in its lowest energy state; u(ν) must therefore be zero at absolute zero, as experimental observations indeed show it to be. This requires us to define the mean energy of the field in the cavity with respect to the value corresponding to T = 0.

ONE-DIMENSIONAL HARMONIC OSCILLATOR IN THERMODYNAMIC EQUILIBRIUM AT A TEMPERATURE

The fact that the levels of a one-dimensional harmonic oscillator are equidistant enables us to associate with the oscillator in the state a set of identical particles (quanta) of the same energy . In this interpretation, the operators and , which take into +1 or is thus the operator associated with the number of 1 , create or destroy a particle. particles ( is the eigenstate of with the eigenvalue ). In the special case of the electromagnetic field, the quanta associated with each harmonic oscillator are none other than photons. To each mode of the cavity considered in the preceding paragraph correspond photons of a certain type, characterized by the frequency, polarization, and spatial distribution of the mode. Expression (26) gives the mean number of photons associated with a mode of frequency at thermodynamic equilibrium. We recognize (26) to be the Bose-Einstein distribution law, which can be derived in a more general way; here, we have established it very simply by studying the harmonic oscillator and interpreting the states .

Comment: To be rigorous, we should write the Bose-Einstein distribution law for bosons of energy : =

1 e(

)

1

(27)

where is the chemical potential. In the case of photons, = 0. This is due to the fact that the total number of photons in the global system radiation-reservoir is not fixed, because of the possibility of absorption or emission of photons by the walls. 3-c.

Specific heats of solids at constant volume

We shall confine ourselves here to the Einstein model (cf. Complement AV ), in which a solid is considered to be composed of atoms vibrating independently about their equilibrium positions with the same angular frequency . The internal energy of the solid at the temperature is therefore equal to the sum of the mean energies of the isotropic threedimensional oscillators in thermodynamic equilibrium at this temperature. Using comment ( ) of § 1, we see that: =3

(28)

where is the mean energy of a one-dimensional harmonic oscillator of angular frequency . We know that the constant volume specific heat is the derivative of the internal energy with respect to the temperature: =

d d

d d

=3

(29)

which, taking (10) into account, yields: ~

=3

[e~

2

e~ 1]

2

(30)

The variation of with is shown in Figure 3. According to (29), is proportional to the derivative of the solid-line curve of Figure 1. It is therefore very simple to describe the behavior of the specific heat as a function of the temperature. In Figure 1, we see that has a horizontal tangent at the origin and increases very slowly; is therefore zero for = 0 and also increases very slowly. On the other hand, for large ( ~ ), approaches ; we deduce that approaches a constant, 3 , independent of . The transition region corresponds to ~ 1.

653

COMPLEMENT LV



cV

Figure 3: (Constant volume) specific heat of a solid in Einstein’s model. The hightemperature limit corresponds to the classical Dulong-Petit law.

3𝒩k

0

T

The asymptote of Figure 3 corresponds to the Dulong-Petit law: if one takes an gram atom of any solid, is equal to Avogadro’s number and the limiting value of is equal to 3 (where is the ideal gas constant), that is, to about 6 cal. degree 1 mole 1 . As we pointed out above, the quantum mechanical nature of crystalline vibrations manifests itself at low temperatures when becomes of the order of ~ or less. Insofar as is concerned, this means that the specific heat approaches zero when approaches zero. It is as if the degrees of freedom corresponding to crystalline vibrations were “frozen” beneath a certain temperature and no longer entered into the specific heat. This can be understood physically: at absolute zero, each oscillator is in its ground state 0 ; as long as ~ , it cannot absorb any thermal energy, since its first excited state has an energy far greater than .

Comments: ( ) Comparison with the specific heat of a two-level system We can apply an analogous argument to a sample composed of a set of two-level systems (for example, a paramagnetic sample composed of spin 1/2 particles): its specific heat is given, to within a coefficient, by the derivative of the curve of Figure 2. For such a system, the variation of with is shown in Figure 4. The behavior for 0 is the same as in the case of Figure 3. However, we see that approaches zero when 2 1 , since the mean energy then becomes independent of and is equal to ( 1 + 2 ) 2 (cf. Fig. 2). For a two-level system, therefore has a maximum (Schottky anomaly) whose physical interpretation is the following: like the harmonic oscillator, the two-level system cannot absorb any thermal energy at very low temperatures, as long as 2 ; is therefore 1 zero at the origin. Then, as increases, 2 becomes populated, and increases. When the temperature is high enough for the two populations to be practically equal, the system cannot absorb any more thermal energy, since the populations can no longer change: therefore approaches zero when .

cV

Figure 4: Specific heat for a set of twolevel systems. At high temperatures, approaches zero because the energy spectrum has an upper bound. 0

654

T



ONE-DIMENSIONAL HARMONIC OSCILLATOR IN THERMODYNAMIC EQUILIBRIUM AT A TEMPERATURE

( ) Einstein’s model enables us to understand simply why the specific heat approaches zero when the temperature approaches zero (a classically inexplicable result). However, it is too schematic to describe the exact dependence of at low temperatures. In a real crystal, the various oscillators are coupled. This gives rise to a set of vibrational normal modes (phonons) whose frequencies go from zero to a certain cutoff frequency (cf. Complement JV ). (30) must then be summed over the different possible frequencies (taking into account the fact that the number of modes whose frequencies are included between and + d depends on ). Thus one finds an expression for the specific heat which, at low temperatures, varies like 3 (this is confirmed experimentally).

4.

Probability distribution of the observable

4-a.

Definition of the probability density ( )

Let us return to the one-dimensional harmonic oscillator in thermodynamic equilibrium. We seek the probability ( ) d of finding, in a measurement of the position of the particle, a result included between and + d . It is clear that ( ) plays an important role in a large number of physical problems. For example, for a solid described by Einstein’s model, the width of ( ) gives an idea of the amplitude of atomic vibrations; the study of the variation of this width with respect to makes it possible to understand the phenomenon of melting [which occurs when the width of ( ) is no longer negligible compared to the interatomic distance]. When the oscillator is in the stationary state , the corresponding probability density ( ) is: ( )2=

( )=

(31)

At thermodynamic equilibrium, the oscillator is described by a statistical mixture of the 1 states with the weights: e . The probability density ( ) is then: ( )=

1

( )e

(32)

( ) is the weighted sum of the probability densities ( ) associated with the various states . Some of the ( ) are shown in Figures 5 and 6 of Chapter V. We shall see later that the oscillations of the functions ( ) which are visible in these figures disappear in the summation over : we shall show that ( ) is simply a Gaussian function. The probability density ( ) defined in (32) is related very simply to the density operator of the harmonic oscillator in thermodynamic equilibrium. Using (31) and (32), we obtain: ( )=

1

e

(33)

On the right-hand side, we can bring in the operator e the closure relation for the states , can be written: e

=e

=

e

which, taking into account

(34) 655



COMPLEMENT LV

We then see that: 1

( )=

e

=

(35)

where the density operator is given by formula (1). ( ) can then be seen to be the diagonal element of which corresponds to the ket . 4-b.

Calculation of ( ) We know that: =~

+

1 2

(36)

so that ( ) can be written in the form: 1

( )=

2

e

( )

(37)

with: ~

=

(38)

and: ( )=

e

(39)

In order to know ( ), all we need to do, therefore, is evaluate this diagonal matrix element. To do this, let us calculate the variation in ( ) when is changed to + d . Since the ket + d is given by [cf. Complement EII , relation (20)]: +d

d ~

= 1

(40)

we obtain, substituting this relation and the adjoint relation into (39) (neglecting infinitesimals of second order in d ): ( +d )=

d ~

( )+

e

(41)

The matrix element appearing on the right-hand side of (41) involves the operator , which is proportional to ( ). Now, it is the operator, proportional to ( + ), which acts in a simple way on the kets . We shall therefore transform [ e ] so as to make appear. We shall begin by seeking the relation between e and e . This can be obtained very simply in the representation: e

=

e

e e

=

(

1

(42a)

1

(42b)

1)

that is: e

=e

e

(43)

which can also be written: 1

656

tanh

2

e

= 1 + tanh

2

e

(44)



ONE-DIMENSIONAL HARMONIC OSCILLATOR IN THERMODYNAMIC EQUILIBRIUM AT A TEMPERATURE

Similarly, it can be shown that: e

=e

e

(45)

that is: 1 + tanh

e

2

= 1

tanh

e

2

(46)

We now subtract, term by term, relations (44) and (46); we obtain: e

=

where the symbol [ [

]+ =

tanh

+

2

e

(47) +

]+ denotes the anticommutator:

+

(48)

If we take into account the numerical factors which result from formulas (B-1) and (B-7) in Chapter V, (47) finally becomes: e

=

tanh

e

2

(49) +

Substituting this result into relation (41): ( +d )

( )=

d

tanh

~ =

2

tanh ~

e

2

2

+

( )d

(50)

( ) therefore satisfies the differential equation: d d

( )+

2

( )=0

2

(51)

where , which has the dimensions of a length, is defined by: =

~

coth

2

~

=

coth

~ 2

(52)

Equation (51) can be integrated directly: ( )=

2

(0) e

2

(53)

Therefore, we know ( ) to within a constant factor, since, according to (37): 1

( )=

e

2

(0) e

2

2

(54)

Since we know that the integral of ( ) over the whole -axis must be equal to 1, we obtain finally: ( )=

1

e

2

2

The function ( ) is thus a Gaussian, whose width is characterized by the length (52).

(55) defined in
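Since the intermediate steps above are compact, a short numerical check may help; this is an illustration of mine (mass, frequency and temperature are arbitrary assumptions). It verifies the quoted result that, at equilibrium, (ΔX)² = ξ²/2 with ξ² = (ħ/mω) coth(ħω/2kT), as stated in (52) and (56).

```python
import numpy as np

hbar, k = 1.054571817e-34, 1.380649e-23
m, omega, T = 1e-26, 1e13, 300.0          # illustrative mass, frequency and temperature

nmax = 200
n = np.arange(nmax)
p = np.exp(-(n + 0.5) * hbar * omega / (k * T))
p /= p.sum()                               # Boltzmann populations of the states |phi_n>

# X in the Fock basis: X = sqrt(hbar / 2 m omega) (a + a+), so <n|X^2|n> = (hbar / 2 m omega)(2n + 1)
x2_mean = np.sum(p * (hbar / (2 * m * omega)) * (2 * n + 1))

xi2 = (hbar / (m * omega)) / np.tanh(hbar * omega / (2 * k * T))
print(x2_mean, xi2 / 2)                    # the two values agree: (Delta X)^2 = xi^2 / 2
```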

657

COMPLEMENT LV

4-c.



Discussion

Starting from the probability density (55), it is easy to calculate: =0 2 2

= (∆ )2 =

(56)

2

Figure 5 shows the variation of (∆ )2 with respect to . We see from (52) that (∆ )2 is equal to ~ 2 when = 0. This result is not surprising: at = 0, the oscillator is in its ground state, and ( ) is equal to 0 ( ) 2 ; ∆ is found to be the root mean square deviation of in the ground state [cf. formula (D-5a) of Chapter V]. Then, when increases, so does (∆ )2 ; when ~ , we have: (∆ )2

(57)

2

In this case, ( ) becomes identical to the probability density of a classical oscillator in thermodynamic equilibrium at the temperature : 2 2

( )=

( )

e

=

+

e

( )

d

1

e

2

2

(58)

2

2 which leads to (∆ )2 = (the straight dashed line in Figure 5). Here again, classical and quantum mechanical predictions meet for ~ . Now, let us apply the preceding results to the problem of melting of a solid body (for simplicity, we shall choose the one-dimensional Einstein model; see Complement AV ). Experiments show that the solid melts when ∆ is of the order of an appreciable fraction of the interatomic distance . Consequently, the melting point temperature is given

(∆X)2

ħ 2mω 0

T

Figure 5: Variation with respect to the temperature of (∆ )2 , for a harmonic oscillator in thermodynamic equilibrium. When , ∆ is identical to the classical value, shown by the dashed line; at low temperatures, quantum mechanical effects (Heisenberg uncertainty relation) prevent ∆ from approaching zero. 658



ONE-DIMENSIONAL HARMONIC OSCILLATOR IN THERMODYNAMIC EQUILIBRIUM AT A TEMPERATURE

approximately by: 2 2

2

(59)

2

where can be replaced by its expression (52), with = . Assuming that is large enough that ~ , we can use2 in (59) the asymptotic form (57), and we obtain the law for : 2 2

(60)

Θ

(61)

2

If we set: =

~

(Θ is called the “Einstein temperature”) and if we note that does not vary very much from one substance to another (anyway, much less than , that is, Θ ), we find the approximate law: (62)

Θ2

The melting point temperature of a crystal is therefore approximately proportional to the square of a vibrational frequency which is characteristic of the crystal. 4-d.

Bloch’s theorem Consider the operator e

e [where

= Tr [ e

, where

is a real variable. Its mean value:

]

(63)

is given by (1)] is a function of , which we shall denote by ( ):

( )= e

(64)

In probability theory, ( ) is called the characteristic function of the random variable . It is easy to calculate ( ) if we place ourselves in the representation: +

( )=

d

e

+

=

d

e

+

=

d

( )e

(65)

To within a factor of 2 , ( ) is therefore the Fourier transform of the function ( ) calculated above (§ 4-b). Since ( ) is a Gaussian [formula (55)], ( ) is also a Gaussian [cf. Appendix I, relation (50)]: ( )=e

2 2

4

(66)

2 This is not always possible. Recall that helium remains liquid at atmospheric pressure, even at = 0: is never negligible compared to , whatever may be (cf. Complement AV ).

659



COMPLEMENT LV

which, according to formula (56), can be written: 2

e

able

=e

2

(67)

2

We could perform calculations analogous to those of §§ 4-a and 4-b above for the observinstead of . We would then define the probability density ( ) by: 1

( )=

( )2

e

(68)

Formula (24) of Complement DV shows that: ( )=

1

=

(69)

Therefore: ( )=

2

1

2

e

2 2

Consequently, the study of e 2

e

=e

(70) would lead to the same result as in (67):

2

2

(71)

The generalization of formulas (67) and (71) is known as Bloch’s theorem: if ( ) is an arbitrary linear combination of the position and the momentum of a one-dimensional harmonic oscillator in thermodynamic equilibrium at the temperature , then: 2

e

=e

2

2

(72)

This theorem is used in solid state physics, for example in the theory of emission without recoil by the nuclei of a crystalline lattice (the Mössbauer effect). References and suggestions for further reading:

Specific heats: Kittel (8.2), Chap. 6, p. 91 and 100; Kittel (13.2), Chap. 6; Seitz (13.4), Chap. III; Ziman (13.3), Chap. 2. Blackbody radiation: Eisberg and Resnick (1.3), Chap. 1; Kittel (8.2), Chap. 15; Reif (8.4) § 9-13 to 9-15; Bruhat (8.3), Chap. XXII. Bloch’s theorem: Messiah (1.17), Chap. XII, § 11-12.

660



EXERCISES

Complement MV Exercises 1. Consider a harmonic oscillator of mass the state of this oscillator is given by:

and angular frequency

. At time

= 0,

(0) = where the states

are stationary states with energies ( + 1 2)~ .

What is the probability that a measurement of the oscillator’s energy performed at an arbitrary time 0, will yield a result greater than 2~ ? When = 0, what are the non-zero coefficients ? From now on, assume that only 0 and 1 are different from zero. Write the normalization condition for (0) and the mean value of the energy in terms of 0 and 1 . With the additional requirement = ~ , calculate 0 2 and 1 2 . As the normalized state vector (0) is defined only to within a global phase factor, we fix this factor by choosing 0 real and positive. We set: 1 = 1 e 1 . We assume that = ~ and that: =

1 2

Calculate

~ 1.

With (0) so determined, write ( ) for 0 and calculate the value of . Deduce the mean value ( ) of the position at .

2. Anisotropic three-dimensional harmonic oscillator In a three-dimensional problem, consider a particle of mass energy: 2

(

)=

where

and 0

2

1+

2 3

2

+

2

+ 1

4 3

1

at

and of potential

2

are constants which satisfy: 0

3 4

What are the eigenstates of the Hamiltonian and the corresponding energies? Calculate and discuss, as functions of , the variation of the energy, the parity and the degree of degeneracy of the ground state and the first two excited states. 661



COMPLEMENT MV

3. Harmonic oscillator: two particles Two particles of the same mass , with positions , are subjected to the same potential: 2 ( )=

1 2

2

1

and

2

and momenta

1

and

2

The two particles do not interact. Write the operator can be written: =

1

+

, the Hamiltonian of the two-particle system. Show that

2

where 1 and 2 act respectively only in the state space of particle (1) and in that of particle (2). Calculate the energies of the two-particle system, their degrees of degeneracy, and the corresponding wave functions. Does form a C.S.C.O.? Same question for the set 1 2 . We denote by Φ 1 2 the eigenvectors common to 1 and 2 . Write the orthonormalization and closure relations for the states Φ 1 2 . Consider a system which, at = 0, is in the state: (0) =

1 ( Φ0 0 + Φ1 0 + Φ0 1 + Φ1 1 ) 2

What results can be found, and with what probabilities, if at this time one measures: – the total energy of the system? – the energy of particle (1)? – the position or velocity of this particle?

4. (This exercise is a continuation of the preceding one and uses the same notation.) The two-particle system, at = 0, is in the state (0) given in exercise 3. At = 0, one measures the total energy

and one finds the result 2~ .

Calculate the mean values of the position, the momentum, and the energy of particle (1) at an arbitrary positive . Same question for particle (2). At 0, one measures the energy of particle (1). What results can be found, and with what probabilities? Same question for a measurement of the position of particle (1); trace the curve for the corresponding probability density. Instead of measuring the total energy , at = 0, one measures the energy 2 of particle (2); the result obtained is ~ 2. What happens to the answers to questions and of ? 662



EXERCISES

5. (This exercise is a continuation of exercise 3 and uses the same notation.) We denote by Φ 1 2 the eigenstates common to 1 and 2 , of eigenvalues ( 1 2)~ and ( 2 + 1 2)~ . The “two particle exchange” operator is defined by: Φ

1

2

= Φ

2

1+

1

1 Prove that = and that is unitary. What are the eigenvalues of ? Let = be the observable resulting from the transformation by of an arbitrary observable . Show that the condition = ( invariant under exchange of the two particles) is equivalent to [ ] = 0.

Show that: 1

=

2

2

=

1

Does commute with , . 2 2

? Calculate the action of

Construct a basis of eigenvectors common to and form a C.S.C.O.? What happens to the spectrum of eigenvalues if one retains only the eigenvectors Φ of

on the observables

1,

1,

. Do these two operators and the degeneracy of its for which Φ = Φ?

6. Charged harmonic oscillator in a variable electric field A one-dimensional harmonic oscillator is composed of a particle of mass , charge 2 2 and potential energy ( ) = 12 . We assume in this exercise that the particle is placed in an electric field E ( ) parallel to and time-dependent, so that to ( ) must be added the potential energy: ()=

E( )

Write the Hamiltonian ( ) of the particle in terms of the operators Calculate the commutators of and with ( ).

and

.

Let ( ) be the number defined by: ()=

()

()

where ( ) is the normalized state vector of the particle under study. Show from the results of the preceding question that ( ) satisfies the differential equation: d ()= d

( )+

()

where ( ) is defined by: ()=

2 ~

E( )

Integrate this differential equation. At time , what are the mean values of the position and momentum of the particle? 663



COMPLEMENT MV

The ket

( ) is defined by:

() =[

( )] ( )

where ( ) has the value calculated in . Using the results of questions show that the evolution of ( ) is given by: ~

d d

and ,

( ) =[ ( )+~ ] ( )

How does the norm of

( ) vary with time?

Assuming that (0) is an eigenvector of with the eigenvalue ( ) is also an eigenvector of , and calculate its eigenvalue.

(0), show that

Find at time the mean value of the unperturbed Hamiltonian 0 = ( ) () as a function of (0). Give the root mean square deviations ∆ , ∆ and ∆ 0 ; how do they vary with time? Assume that at = 0, the oscillator is in the ground state 0 . The electric field acts between times 0 and and then falls to zero. When , what is the evolution of the mean values ( ) and ( )? Application: assume that between 0 and , the field E ( ) is given by E ( ) = E0 cos( ); discuss the phenomena observed (resonance) in terms of ∆ = . If, at , the energy is measured, what results can be found, and with what probabilities? 7. Consider a one-dimensional harmonic oscillator of Hamiltonian states :

and stationary

= ( + 1 2)~ The operator

( ) is defined by:

( )=e where Is

is real. ( ) unitary? Show that, for all , its matrix elements satisfy the relation: 2

( )

=1

Express ( ) in terms of the operators and . Use Glauber’s formula [formula (63) of complement BII ] to put ( ) in the form of a product of exponential operators. Establish the relations: e e where 664

0

=

0

=

0

!

is an arbitrary complex parameter.

• Find the expression, in terms of element:

= ~2

2

2

and

EXERCISES

= ~ , for the matrix

( )

0

What happens when directly?

8. The evolution operator

approaches zero? Could this result have been predicted

( 0) of a one-dimensional harmonic oscillator is written:

~

( 0) = e with: =~

+

1 2

Consider the operators: ˜( ) = ˜ ()=

( 0)

( 0)

( 0)

( 0)

By calculating their action on the eigenkets and ˜ ( ) in terms of and .

of

, find the expression for ˜( )

Calculate the operators ˜ ( ) and ˜ ( ) obtained from transformation: ˜( ) =

( 0)

( 0)

˜( ) =

( 0)

( 0)

and

by the unitary

How can the relations so obtained be interpreted? Show that

2 larly, establish that

0

is an eigenvector of 2

0

and specify its eigenvalue. Simi-

is an eigenvector of

.

At = 0, the wave function of the oscillator is ( 0). How can one obtain from ( 0) the wave function of the oscillator at all subsequent times = 2 (where is a positive integer)? Choose for ( 0) the wave function ( ) associated with a stationary state. From the preceding question derive the relation which must exist between ( ) and its Fourier transform ¯ ( ). Describe qualitatively the evolution of the wave function in the following cases: ()

(

0) = e

( )

(

0) = e

where , real, is given. where

is real and positive. 665



COMPLEMENT MV

(

) 1

= (

=0 ( )

666

(

if

0) =

0) = e

2

2

2

2

everywhere else

where

is real.

Chapter VI

General properties of angular momentum in quantum mechanics

A. Introduction: the importance of angular momentum
B. Commutation relations characteristic of angular momentum
 B-1. Orbital angular momentum
 B-2. Generalization: definition of an angular momentum
 B-3. Statement of the problem
C. General theory of angular momentum
 C-1. Definitions and notation
 C-2. Eigenvalues of J² and J_z
 C-3. “Standard” {|k, j, m⟩} representations
D. Application to orbital angular momentum
 D-1. Eigenvalues and eigenfunctions of L² and L_z
 D-2. Physical considerations

A. Introduction: the importance of angular momentum

The present chapter is the first in a series of four Chapters (VI, VII, IX and X) devoted to the study of angular momenta in quantum mechanics. This is an extremely important problem, and the results we are going to establish are used in many domains of physics: the classification of atomic, molecular and nuclear spectra, the spin of elementary particles, magnetism, etc... We already know the important role played by angular momentum in classical mechanics; the total angular momentum of an isolated physical system is a constant of


the motion. Furthermore, this is also true in certain cases in which the system is not isolated. For example, if a point particle M, of mass m, is moving in a central potential (one which depends only on the distance between M and a fixed point O), the force to which M is subjected is always directed towards O. Its moment with respect to O is consequently zero, and the angular momentum theorem implies that:

\frac{d\boldsymbol{\mathscr{L}}}{dt} = 0    (A-1)

where 𝓛 is the angular momentum of M with respect to O. This fact has important consequences: the motion of the particle is limited to a fixed plane (the plane passing through O and perpendicular to the angular momentum 𝓛); moreover, this motion obeys the law of constant areas (Kepler’s second law). All these properties have their equivalents in quantum mechanics. With the angular momentum 𝓛 of a classical system is associated an observable L, actually a set of three observables, L_x, L_y and L_z, which correspond to the three components of 𝓛 in a Cartesian frame. If the physical system under study is a point particle moving in a central potential, we shall see in Chapter VII that L_x, L_y and L_z are constants of the motion in a quantum mechanical sense, that is, they commute with the Hamiltonian H describing the particle in the central potential V(r). This important property considerably simplifies the determination and classification of the eigenstates of H. Also, we described the Stern-Gerlach experiment in Chapter IV and this revealed the quantization of angular momentum: the component, along a fixed axis, of the intrinsic angular momentum of an atom can take on only certain discrete values. We shall see that all angular momenta are quantized in this way. This enables us to understand atomic magnetism, the Zeeman effect, etc... Furthermore, to analyze all these phenomena, we must introduce typically quantum mechanical angular momenta, which have no classical equivalents (intrinsic angular momenta of elementary particles, Chap. IX). From now on, we shall denote by orbital angular momentum any angular momentum that has a classical equivalent (and by L, the corresponding observables), and by spin angular momentum any intrinsic angular momentum of an elementary particle (for which we reserve the letter S). In a complex system, such as a nucleus, an atom, or a molecule, the orbital angular momenta L of the various elementary particles which constitute the system (electrons, protons, neutrons, ...) combine with each other and with the spin angular momenta S of these same particles to form the total angular momentum J of the system. The way in which angular momenta are combined in quantum mechanics (coupling of angular momenta) will be studied in Chapter X. Finally, let us add that J will also be used to denote an arbitrary angular momentum when it is not necessary to specify whether we are dealing with an orbital angular momentum, a spin, or a combination of several angular momenta. Before beginning the study of the physical problems just mentioned (energy levels of a particle in a central potential, spin, the Zeeman effect, addition of angular momenta, ...), we shall establish, in this chapter, the general quantum mechanical properties associated with all angular momenta, whatever their nature. These properties follow from commutation relations satisfied by the three observables J_x, J_y and J_z, the components of an arbitrary angular momentum J. The origin of these commutation relations is discussed in § B: for an orbital angular momentum, they are simply consequences of the quantization rules (§ B-5 of Chapter III) and the canonical commutation relations [formulas (E-30) of Chapter II]; for spin angular momenta,
The origin of these commutation relations is discussed in § B: for an orbital angular momentum, they are simply consequences of the quantization rules (§ B-5 of Chapter III) and the canonical commutation relations [formulas (E-30) of Chapter II]; for spin angular momenta, which have no classical equivalents, they actually serve as definitions of the corresponding observables¹. In § C, we study the consequences of these commutation relations, which are characteristic of angular momenta. In particular, we discuss space quantization, that is, the fact that any component of an angular momentum possesses a discrete spectrum. Finally, the general results so obtained are applied, in § D, to the orbital angular momentum of a particle.

B. Commutation relations characteristic of angular momentum

B-1. Orbital angular momentum

To obtain the observables $L_x$, $L_y$, $L_z$ associated in quantum mechanics with the three components of the angular momentum $\boldsymbol{\mathscr{L}}$ of a spinless particle, we simply apply the quantization rules stated in § B-5 of Chapter III. Consider, for instance, the component $\mathscr{L}_x$ of the classical angular momentum:
$$\mathscr{L}_x = y\,p_z - z\,p_y \tag{B-1}$$
We associate with the position variables $y$ and $z$ the observables $Y$ and $Z$, and with the momentum variables $p_y$ and $p_z$ the observables $P_y$ and $P_z$. Although formula (B-1) involves products of two classical variables, no precaution needs to be taken in replacing them by the corresponding observables, since $Y$ and $P_z$ commute, as do $Z$ and $P_y$ [see the canonical commutation relations (E-30) of Chapter II]. We therefore do not need to symmetrize expression (B-1) in order to obtain the operator $L_x$:
$$L_x = Y P_z - Z P_y \tag{B-2}$$
For the same reason ($Y$ and $P_z$ commute, as do $Z$ and $P_y$), the operator thus obtained is Hermitian. In the same way, we find the operators $L_y$ and $L_z$ corresponding to the components $\mathscr{L}_y$ and $\mathscr{L}_z$ of the classical angular momentum. This allows us to write:
$$\mathbf{L} = \mathbf{R} \times \mathbf{P} \tag{B-3}$$
Since we know the canonical commutation relations of the position $\mathbf{R}$ and momentum $\mathbf{P}$ observables, we can easily calculate the commutators of the operators $L_x$, $L_y$ and $L_z$. For example, let us evaluate $[L_x, L_y]$:
$$[L_x, L_y] = [Y P_z - Z P_y,\; Z P_x - X P_z] = [Y P_z, Z P_x] + [Z P_y, X P_z] \tag{B-4}$$
since $Y P_z$ commutes with $X P_z$, and $Z P_y$ with $Z P_x$. We then have:
$$[L_x, L_y] = Y P_x\,[P_z, Z] + X P_y\,[Z, P_z] = -i\hbar\, Y P_x + i\hbar\, X P_y = i\hbar\, L_z \tag{B-5}$$

¹ The fundamental origin of these commutation relations is purely geometrical. We shall discuss this point in detail in Complement BVI, in which we demonstrate the intimate relation between the angular momentum of a system with respect to a point $O$ and the geometrical rotations of this system about $O$.


Analogous calculations yield the other two commutators, and we obtain, finally:
$$[L_x, L_y] = i\hbar\, L_z \qquad [L_y, L_z] = i\hbar\, L_x \qquad [L_z, L_x] = i\hbar\, L_y \tag{B-6}$$
Thus we have established the commutation relations for the components of the angular momentum of a spinless particle. This result can be generalized to a system of $N$ spinless particles. The total angular momentum of such a system is, in quantum mechanics:
$$\mathbf{L} = \sum_{i=1}^{N} \mathbf{L}_i \tag{B-7}$$
with:
$$\mathbf{L}_i = \mathbf{R}_i \times \mathbf{P}_i \tag{B-8}$$
Now, each of the individual angular momenta $\mathbf{L}_i$ satisfies the commutation relations (B-6) and commutes with $\mathbf{L}_j$ when $j$ is not equal to $i$ (these operators act in the state spaces of different particles). Thus we see that relations (B-6) remain valid for the total angular momentum $\mathbf{L}$.
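As a quick consistency check, relation (B-5) can be reproduced in the position representation, where $\mathbf{L} = \mathbf{R}\times\mathbf{P}$ acts through first-order differential operators. The following short symbolic computation is a sketch of ours (using the sympy library; the helper functions are not from the text): it applies $[L_x, L_y] - i\hbar L_z$ to an arbitrary test function and verifies that the result vanishes.

```python
import sympy as sp

x, y, z, hbar = sp.symbols('x y z hbar', real=True)
f = sp.Function('f')(x, y, z)        # arbitrary test wave function

# Position-representation angular momentum operators, L = R x P with P = -i*hbar*grad
def Lx(g): return -sp.I * hbar * (y * sp.diff(g, z) - z * sp.diff(g, y))
def Ly(g): return -sp.I * hbar * (z * sp.diff(g, x) - x * sp.diff(g, z))
def Lz(g): return -sp.I * hbar * (x * sp.diff(g, y) - y * sp.diff(g, x))

# [Lx, Ly] f - i*hbar Lz f should be identically zero, cf. (B-5) and (B-6)
residual = sp.simplify(Lx(Ly(f)) - Ly(Lx(f)) - sp.I * hbar * Lz(f))
print(residual)   # prints 0
```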

B-2. Generalization: definition of an angular momentum

The three operators associated with the components of an arbitrary classical angular momentum therefore satisfy the commutation relations (B-6). It can be shown, moreover (cf. Complement BVI), that the origin of these relations lies in the geometric properties of rotations in three-dimensional space. This is why we shall adopt a more general point of view and define an angular momentum $\mathbf{J}$ as any set of three observables $J_x$, $J_y$, $J_z$ that satisfies:
$$[J_x, J_y] = i\hbar\, J_z \qquad [J_y, J_z] = i\hbar\, J_x \qquad [J_z, J_x] = i\hbar\, J_y \tag{B-9}$$
We then introduce the operator:
$$\mathbf{J}^2 = J_x^2 + J_y^2 + J_z^2 \tag{B-10}$$
the (scalar) square of the angular momentum $\mathbf{J}$. This operator is Hermitian, since $J_x$, $J_y$ and $J_z$ are Hermitian. We shall assume that it is an observable. Let us show that $\mathbf{J}^2$ commutes with the three components of $\mathbf{J}$:
$$[\mathbf{J}^2, \mathbf{J}] = 0 \tag{B-11}$$
We perform the calculation for $J_x$, for example:
$$[\mathbf{J}^2, J_x] = [J_x^2 + J_y^2 + J_z^2,\; J_x] = [J_y^2, J_x] + [J_z^2, J_x] \tag{B-12}$$


since $J_x$ obviously commutes with itself and, therefore, with its square. The other two commutators can be obtained from (B-9):
$$[J_y^2, J_x] = J_y\,[J_y, J_x] + [J_y, J_x]\,J_y = -i\hbar\, J_y J_z - i\hbar\, J_z J_y \tag{B-13a}$$
$$[J_z^2, J_x] = J_z\,[J_z, J_x] + [J_z, J_x]\,J_z = i\hbar\, J_z J_y + i\hbar\, J_y J_z \tag{B-13b}$$
The sum of these two commutators, which enters into (B-12), is indeed zero.

Angular momentum theory in quantum mechanics is founded entirely on the commutation relations (B-9). Note that these relations imply that it is impossible to measure simultaneously all three components of an angular momentum; however, $\mathbf{J}^2$ and any one component of $\mathbf{J}$ are compatible.

B-3. Statement of the problem

Let us return to the example of a spinless particle in a central potential, mentioned in the introduction. We shall see in Chapter VII that, in this case, the three components of the angular momentum $\mathbf{L}$ of the particle commute with the Hamiltonian $H$; this is therefore also true for the operator $\mathbf{L}^2$. We then have at our disposal four constants of the motion: $\mathbf{L}^2$, $L_x$, $L_y$, $L_z$. But these four operators do not all commute; to form a complete set of commuting observables with $H$, we must pick only $\mathbf{L}^2$ and one of the three other operators, $L_z$ for example. For a particle subjected to a central potential, we can then look for eigenstates of the Hamiltonian which are also eigenvectors of $\mathbf{L}^2$ and $L_z$, without restricting the generality of the problem. However, it is impossible to obtain a basis of the state space composed of eigenvectors common to the three components of $\mathbf{L}$, as these three observables do not commute. The situation is the same in the general case: since the components of an arbitrary angular momentum $\mathbf{J}$ do not commute, they are not simultaneously diagonalizable. We shall therefore seek the system of eigenvectors common to $\mathbf{J}^2$ and $J_z$, observables corresponding to the square of the absolute value of the angular momentum and to its component along the $Oz$ axis.

C. General theory of angular momentum

In this section, we shall determine the spectrum of $\mathbf{J}^2$ and $J_z$ for the general case and study their common eigenvectors. The reasoning will be analogous to the one used in Chapter V for the harmonic oscillator.

C-1. Definitions and notation

C-1-a. The $J_+$ and $J_-$ operators

Instead of using the components $J_x$ and $J_y$ of the angular momentum $\mathbf{J}$, it is more convenient to introduce the following linear combinations:
$$J_+ = J_x + iJ_y \qquad J_- = J_x - iJ_y \tag{C-1}$$
Like the operators $a$ and $a^\dagger$ of the harmonic oscillator, $J_+$ and $J_-$ are not Hermitian: they are adjoints of each other. In the rest of this section, we shall use only the operators $J_+$, $J_-$, $J_z$ and $\mathbf{J}^2$. It is straightforward, using (B-9) and (B-11), to show that these operators satisfy the commutation relations:
$$[J_z, J_+] = \hbar\, J_+ \tag{C-2}$$
$$[J_z, J_-] = -\hbar\, J_- \tag{C-3}$$
$$[J_+, J_-] = 2\hbar\, J_z \tag{C-4}$$
$$[\mathbf{J}^2, J_+] = [\mathbf{J}^2, J_-] = [\mathbf{J}^2, J_z] = 0 \tag{C-5}$$
Let us calculate the products $J_+J_-$ and $J_-J_+$. We find:
$$J_+J_- = (J_x + iJ_y)(J_x - iJ_y) = J_x^2 + J_y^2 - i\,[J_x, J_y] = J_x^2 + J_y^2 + \hbar J_z \tag{C-6a}$$
$$J_-J_+ = (J_x - iJ_y)(J_x + iJ_y) = J_x^2 + J_y^2 + i\,[J_x, J_y] = J_x^2 + J_y^2 - \hbar J_z \tag{C-6b}$$
Using definition (B-10) of the operator $\mathbf{J}^2$, we can write these expressions in the form:
$$J_+J_- = \mathbf{J}^2 - J_z^2 + \hbar J_z \tag{C-7a}$$
$$J_-J_+ = \mathbf{J}^2 - J_z^2 - \hbar J_z \tag{C-7b}$$
Adding relations (C-7), we obtain:
$$\mathbf{J}^2 = \frac{1}{2}\left(J_+J_- + J_-J_+\right) + J_z^2 \tag{C-8}$$

C-1-b. Notation for the eigenvalues of $\mathbf{J}^2$ and $J_z$

According to (B-10), J2 is the sum of the squares of three Hermitian operators. Consequently, for any ket , the matrix element J2 is positive or zero: J2

= =

2

+ 2

+

2

+ 2

+

2 2

0

(C-9)

Note that this could have been expected, since J2 corresponds to the square of the absolute value of the angular momentum J. From this we see, in particular, that all the eigenvalues of J2 are positive or zero, since if is an eigenvector of J2 , J2 is the product of the corresponding eigenvalue and the square of the norm of , which is always positive. We shall write the eigenvalues of J2 in the form ( + 1)~2 , with the convention: 0 672

(C-10)

C. GENERAL THEORY OF ANGULAR MOMENTUM

This notation is intended to simplify the arguments which follow; it does not influence the result. Since J has the dimensions of ~, an eigenvalue of J2 is necessarily of the form ~2 , where is a real dimensionless number. We have just seen that must be positive or zero; it can then be shown that the second-degree equation in : ( + 1) =

(C-11)

always has one and only one positive or zero root. Therefore, if we use relation (C-10), the specification of $j(j+1)$ determines $j$ uniquely; any eigenvalue of $\mathbf{J}^2$ can thus be written $j(j+1)\hbar^2$, with $j$ positive or zero. As for the eigenvalues of $J_z$, which have the same dimensions as $\hbar$, they are traditionally written $m\hbar$, where $m$ is a dimensionless number.

C-1-c. Eigenvalue equations for $\mathbf{J}^2$ and $J_z$

We shall label the eigenvectors common to J2 and by the indices and which characterize the associated eigenvalues. However, J2 and do not in general constitute a C.S.C.O. (see, for example, § A of Chapter VII), and it is necessary to introduce a third index in order to distinguish between the different eigenvectors corresponding to the same eigenvalues ( + 1)~2 and ~ of J2 and (this point will be expanded in § C-3-a below). We shall call this index (which does not necessarily imply that it is always a discrete index). We shall therefore try to solve the simultaneous eigenvalue equations: J2

= ( + 1)~2 =

(C-12)

~

C-2. Eigenvalues of $\mathbf{J}^2$ and $J_z$

As in § B-2 of Chapter V, we shall begin by proving three lemmas which will then enable us to determine the spectrum of $\mathbf{J}^2$ and $J_z$.

C-2-a. Lemmas

Lemma I (Properties of the eigenvalues of J2 and 2

2

If ( + 1)~ and ~ are the eigenvalues of J and eigenvector , then and satisfy the inequality:

) associated with the same

(C-13) To prove this, consider the vectors of their norms is positive or zero: +

2

=

2

=

+ +

and

+

, and note that the square

0

(C-14a)

0

(C-14b) 673

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

To calculate the left-hand sides of these inequalities, we can use formulas (C-7). We find (if we assume to be normalized): (J2

=

+

2

2

= ( + 1)~

~2

2

)

~

2

=

+

)

~

2 2

(J 2

+~

2 2

= ( + 1)~

(C-15a)

~2

~ +

(C-15b)

Substituting these expressions into inequalities (C-14), we obtain: ( + 1)

(

+ 1) = (

( + 1)

(

1) = (

)( +

+ 1)

+ 1)( +

)

0

(C-16a)

0

(C-16b)

that is: ( + 1)

(C-17a) +1

(C-17b)

These two conditions are satisfied simultaneously only if .

Lemma II (Properties of the vector

) 2

Let

satisfies inequality (C-13).

be an eigenvector of J and

with the eigenvalues ( + 1)~2 and

~. ( ) If

=

,

= 0. is a non-null eigenvector of J2 and 1)~.

( ) If , ( + 1)~2 and (

with the eigenvalues

(i) According to (C-15b), the square of the norm of the ket is equal to ~2 [ ( + 1) ( 1)] and therefore goes to zero for = . Since the norm of a vector goes to zero if and only if this vector is a null vector, we conclude that all vectors are null: =

=

=0

(C-18)

It is easy to establish the converse of (C-18): =0= Letting obtain:

+

=

(C-19)

act on both sides of the equation appearing in (C-19), and using (C-7a), we

~2 [ ( + 1)

2

+

]

= ~2 ( +

Using (C-13), (C-20) has only one solution,

)(

+ 1) =

=0

.

(ii) Now assume to be greater than . According to (C-15b), then a non-null vector since the square of its norm is different from zero. 674

(C-20)

is

C. GENERAL THEORY OF ANGULAR MOMENTUM

Let us show that it is an eigenvector of J2 and commute; consequently: [J2

]

. The operators

=0

and J2 (C-21)

which can be written: J2

J2

=

= ( + 1)~2

(C-22)

This relation expresses the fact that is an eigenvector of J2 with the eigenvalue 2 ( + 1)~ . Moreover, if we apply operator equation (C-3) to : [

]

=

(C-23)

~

that is: =

~

=

~

=(

~ 1)~

(C-24)

is therefore an eigenvector of .

with the eigenvalue (

Lemma III (Properties of the vector

+ 2

Let

be an eigenvector of J and

1)~.

) with the eigenvalues ( + 1)~2 and

~. ( ) If

= ,

= 0.

+

( ) If , + ( + 1)~2 and (

is a non-null eigenvector of J2 and + 1)~.

with the eigenvalues

(i) The argument is similar to that of (§ C-2-a- ). According to (C-14a), the square of the norm of + is zero if = . Therefore: =

=

+

=0

(C-25)

The converse can be proved in the same way: =0

+

=

(C-26)

(ii) If , an argument analogous to that of § C-2-a- -ii yields, using formulas (C-5) and (C-2): J2

+

= ( + 1)~2

+

=(

+ 1)~

+

+

(C-27)

(C-28) 675

C-2-b. Determination of the spectrum of $\mathbf{J}^2$ and $J_z$

We shall now show that the three lemmas above enable us to determine the possible values of and . Let be a non-null eigenvector of J2 and with the eigenvalues ( + 1)~2 and ~. According to lemma I, . It is therefore certain that a positive or zero integer exists such that: +1

(C-29)

Now consider the series of vectors: (

)

(C-30)

According to lemma II, each of the vectors ( ) of this series ( = 0 1 ) is a non-null eigenvector of J2 and with the eigenvalues ( + 1)~2 and ( )~. The proof is by iteration. By hypothesis, is non-null and corresponds to the eigenvalues ( + 1)~2 and ~. ( ) is obtained by the action of on ( ) 1 , which is an eigenvector of J2 and with the eigenvalues ( + 1)~2 and ( + 1)~. The latter eigenvalue is necessarily greater than since, according to (C-29): +1

+1

+1

(C-31)

It follows, according to point (ii) of lemma II, that ( ) is a non-null eigenvector of J2 and , the corresponding eigenvalues being ( + 1)~2 and ( )~. Now let act on ( ) . Let us first assume that the eigenvalue ( )~ of associated with ( ) is greater than ~, that is, that: (C-32) By point ( ) of lemma II, eigenvalues ( + 1)~2 and ( according to (C-29):

(

1

)

is then non-null and corresponds to the 1)~. This is in contradiction with lemma I since, (C-33)

We must therefore have equal to . In this case, ( ) corresponds to the eigenvalue of , and, according to point (i) of lemma II, ( ) is zero. The vector series (C-30) obtained by the repeated action of on is therefore limited and the contradiction with lemma I is removed. We have now shown that there exists a positive or zero integer such that: =

(C-34)

An analogous argument, based on lemma III, would show that there exists a positive or zero integer such that: + =

(C-35)

since the vector series: +

676

(

+)

(C-36)

C. GENERAL THEORY OF ANGULAR MOMENTUM

must be limited if there is to be no contradiction with lemma I. Combining (C-34) and (C-35), we obtain:
$$p + q = 2j \tag{C-37}$$
$j$ is therefore equal to a positive or zero integer divided by 2. It follows that $j$ is necessarily integral or half-integral². Furthermore, if there exists a non-null vector $|j, m\rangle$, all the vectors of series (C-30) and (C-36) are also non-null and eigenvectors of $\mathbf{J}^2$ with the eigenvalue $j(j+1)\hbar^2$, as well as of $J_z$ with the eigenvalues:
$$-j\hbar,\ (-j+1)\hbar,\ (-j+2)\hbar,\ \ldots,\ (j-2)\hbar,\ (j-1)\hbar,\ j\hbar \tag{C-38}$$

We summarize the results obtained above as follows. Let $\mathbf{J}$ be an arbitrary angular momentum, obeying the commutation relations (B-9). If $j(j+1)\hbar^2$ and $m\hbar$ denote the eigenvalues of $\mathbf{J}^2$ and $J_z$, then:
– the only possible values for $j$ are the positive integers, the positive half-integers, and zero, that is: 0, 1/2, 1, 3/2, 2, ... (these values are the only ones possible, but they are not all necessarily realized for all angular momenta);
– for a fixed value of $j$, the only possible values for $m$ are the $(2j+1)$ numbers $-j$, $-j+1$, ..., $j-1$, $j$; $m$ is therefore integral if $j$ is integral and half-integral if $j$ is half-integral. All these values of $m$ are realized once one of them is.

C-3. "Standard" $\{|k, j, m\rangle\}$ representations

We shall now study the eigenvectors common to $\mathbf{J}^2$ and $J_z$, which form a basis of the state space since $\mathbf{J}^2$ and $J_z$ are, by hypothesis, observables.

C-3-a. The basis states

Consider an angular momentum J acting in a state space . We shall show how to construct an orthonormal basis in composed of eigenvectors common to J2 and . Take a pair of eigenvalues, ( +1)~2 and ~, that are actually found in the case we are considering. The set of eigenvectors associated with this pair of eigenvalues forms a vector subspace of which we shall denote by ( ); the dimension ( ) of this subspace may well be greater than 1, since J2 and do not generally constitute a C.S.C.O. We choose in ( ) an arbitrary orthonormal basis, ; =1 2 ( ) . If is not equal to , there must exist another subspace ( + 1) in composed of eigenvectors of J2 and associated with the eigenvalues ( + 1)~2 and ( + 1)~. Similarly, if is not equal to , there exists a subspace ( 1). In the case where is not equal to or , we shall construct orthonormal bases in ( + 1) and in ( 1), starting with the one chosen in ( ). First, let us show that, if 1 is not equal to 2 , + 1 and + 2 are orthogonal, as are and . We can find the scalar product 1 2 2A

number is said to be “half-integral” if it is equal to an odd number divided by 2.

677

CHAPTER VI

of

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

and

1 2

by using formulas (C-7):

2

=

1

(J2

2

= [ ( + 1)

(

1)]~

)

~

(

These scalar products are therefore zero if mal; if 1 = 2 , the square of the norm of [ ( + 1)

2

1)]~ 1

= 1

2

2

1 2

(C-39)

1

since the basis of ( is equal to:

) is orthonor-

2

Now let us consider the set of the (

) vectors defined by:

1

+1 =

( + 1)

~

(

+ 1)

(C-40)

+

Because of what we have just shown, these vectors are orthonormal. We shall show that they constitute a basis in ( + 1). Assume that there exists, in ( + 1), a vector + 1 orthogonal to all the + 1 obtained from (C-40). The vector + 1 would not be null since ( + 1) cannot be equal to ; it would belong to ( ) and would be orthogonal to all vectors +1 . Now, according to relation (C-40), the ket +1 is proportional to , that is, to [formula + (C-7b)]. Therefore, + 1 would be a non-null vector of ( ) which would be orthogonal to all vectors of the basis. But this is impossible. Consequently, the set of vectors (C-40) constitutes a basis in ( + 1). It can be shown, using an analogous argument, that the vectors 1 defined by: 1

1 =

( + 1)

~

(

(C-41)

1)

form an orthonormal basis in ( 1). We see, in particular, that the dimension of subspaces ( + 1) and ( is equal to that of ( ). In other words, this dimension is independent3 of : (

+ 1) = (

1) = (

)= ( )

1) (C-42)

We then proceed as follows. For each value of actually found in the problem under consideration, we choose one of the subspaces associated with this value of , for example ( ) corresponding to = . In this subspace, we choose an arbitrary orthonormal basis, ; =1 2 ( ) . Then, using formula (C-41), we construct, by iteration, the basis to which each of the other 2 subspaces ( ) will be related: the arrows of table (VI-1) indicate the method used. By treating all the values of found in the problem in this way, we arrive at what is called a standard basis of the state space . The orthonormalization and closure relations for such a basis are: = +

( )

=

=1

(C-43a) =1

(C-43b)

3 If this dimension is infinite, the result must be interpreted as follows: there is a one-to-one correspondence between the basis vectors of two subspaces corresponding to the same value of .

678

C. GENERAL THEORY OF ANGULAR MOMENTUM

Comments:

( ) The use of formula (C-41) implies a choice of phases: the basis vectors in ( 1) are chosen to be proportional, with a real and positive coefficient, to the vectors obtained by application of to the basis of ( ). ( ) Formulas (C-40) and (C-41) are compatible, since, if we apply + to both sides of (C-41) and take (C-7a) into account, we find (C-40) [with replaced by ( 1)]. This means that one is not obliged to start, as we did, with the maximum value = and (C-41) in order to construct bases of the subspaces ( ) corresponding to a given value of .

( ) different values of =1

(

= )

( (2 + 1) spaces ( )

=

1)

1

1

1

.. . (

( )

2

( )

2

1

.. . )

1

.. . (

=2

.. . 2

.. . )

1

( =1 )

( )

1 .. .

( ) .. .

2

( =2 )

.. . ( )

( = ( ) )

( ) spaces ( ) Table VI-1: Schematic representation of the construction of the (2 + 1) ( ) vectors of a “standard basis” associated with a fixed value of . Starting with each of the ( ) vectors of the first line, one uses the action of to construct the (2 + 1) vectors of the corresponding column. Each subspace ( ) is spanned by the ( ) vectors situated in the same row. Each subspace ( ) is spanned by the (2 + 1) vectors of the corresponding column.

679

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

In most cases, in order to define a standard basis, one uses observables that commute4 with the three components of , and form a C.S.C.O. with J2 and (we shall see a concrete example of this in § A of Chapter VII): [

J] = [

J] =

=0

(C-44)

For the sake of simplicity, we shall assume that only one of these observables is required to make a C.S.C.O. with J2 and . Under these conditions, each of the subspaces ( ) defined above is globally invariant under the action of : if is an arbitrary vector of ( ), is still, according to (C-44), an eigenvector of J2 and : J2

=

J2

=

= ( + 1)~2 =

~

(C-45)

with the same eigenvalues as . Thus also belongs to ( ). If we then choose a value of , we can diagonalize inside the corresponding subspace ( ). We denote by the various eigenvalues found in this way: the index indicates in which space ( ) they were found, and the index (assumed to be discrete, for simplicity) distinguishes between them. A single vector (written ) of ( ) is associated with each eigenvalue , since , J2 and form, by hypothesis, a C.S.C.O.: =

(C-46)

The set ; fixed; = 1 2 ( ) constitutes an orthonormal basis in ( ), from which we construct, using the method described above, a basis in the other subspaces ( ) related to the value of chosen. By applying this procedure successively for all values of , we arrive at a “standard” basis, of the state space, all of whose vectors are eigenstates, not only of J2 and , but also of : =

(C-47)

This can be shown as follows. If hypothesis (C-44) is satisfied, commutes with , which means that , that is 1 , is an eigenvector of with the same eigenvalue as : =

=

(C-48)

By repeating this process, it is easy to prove relation (C-47).

Comments: ( ) An observable that commutes with J2 and does not necessarily commute with and ( is itself an example). Consequently, it should not be necessary, in order to form a C.S.C.O. with J2 and , to choose observables that commute with the three components of J as in (C-44). However, if did not commute with + and (that is, with and ), would not necessarily be an eigenvector of with the same eigenvalue as . ( ) The spectrum of is the same in all the subspaces ( ) associated with the same value of . However, the eigenvalues generally depend on (this point will be illustrated by concrete examples in §§ A and C of Chapter VII). 4 An operator that commutes with the three components of the total angular momentum of a physical system is said to be “scalar” (cf. Complement BVI ).

680

C. GENERAL THEORY OF ANGULAR MOMENTUM

C-3-b.

The spaces

(

)

In the preceding section, we introduced a “standard basis” of the state space by starting with a basis chosen in the subspace ( = ) and constructing a basis of ( = 1), then one of ( = 2),..., ( ), etc... The state space can be considered to be the direct sum of all the orthogonal subspaces ( ), where varies by integral jumps from to + and takes on all the values actually found in the problem. This means that any vector of can be written in one and only one way as a sum of vectors, each belonging to a particular subspace ( ). Nevertheless, the use of the subspaces ( ) presents certain disadvantages. First of all, their dimension ( ) depends on the physical system being considered and is not necessarily known. In addition, the subspaces ( ) are not invariant under the action of J since, by the very means of construction of the vectors , + and have non-zero matrix elements between vectors of ( ) and those of ( 1). We shall therefore introduce other subspaces of , the spaces ( ). Instead of grouping together the kets with fixed indices and [which span ( )], we shall now group together those for which and have given values, and we shall call ( ) the subspace which they span. This amounts to associating, in table (VI-1), the (2 + 1) vectors of one column [instead of the ( ) vectors of one row]. can then be seen to be the direct sum of the orthogonal subspaces ( ), which have the simpler properties: – the dimension of ( ) is (2 + 1), whatever the value of physical system under consideration. –

and whatever the

( ) is globally invariant under the action of J: any component of J [or a function (J) of J], acting on a ket of ( ), yields another ket also belonging5 to ( ). This result is not difficult to establish, since [or (J)] can always be expressed in terms of , + and . Now, , acting on , yields a ket proportional to ; + , a ket proportional to + 1 ; and , a ket proportional to 1 . The existence of the property in question therefore follows from the very means of construction of the “standard basis” .

C-3-c.

Matrices representing the angular momentum operators

Using the subspaces ( ) considerably simplifies the search for the matrix which represents, in a “standard” basis, a component of J [or an arbitrary function (J)]. The matrix elements of such operators between two basis kets belonging to two different subspaces ( ) are zero. The matrix therefore has the following form:

5 It

( J.

can also be shown that ( ) is “irreducible” with respect to J: there exists no subspace of ) other than ( ) itself which is globally invariant under the action of the various components of

681

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

(

)

(

)

( ) matrix (2 +1) (2 +1) 0

(

) .. .

( 0

)

( 0

)

... 0

0

0

0

matrix (2 +1) (2 +1) 0

matrix (2 +1) (2 +1)

0

0

0

0

... (C-49)

All we must then do is calculate the finite-dimensional matrices that represent the operator under consideration inside each of the subspaces $\mathscr{E}(k, j)$. Another very important simplification arises from the fact that each of these finite submatrices is independent of $k$ and of the physical system under study; it depends only on $j$ and, of course, on the operator which we want to represent. To see this, note that the definition of the $|k, j, m\rangle$ [cf. (C-12), (C-40) and (C-41)] implies that:
$$J_z|k,j,m\rangle = m\hbar\,|k,j,m\rangle$$
$$J_+|k,j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\;|k,j,m+1\rangle$$
$$J_-|k,j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|k,j,m-1\rangle \tag{C-50}$$
that is:
$$\langle k,j,m'|J_z|k,j,m\rangle = m\hbar\,\delta_{m'm} \qquad \langle k,j,m'|J_\pm|k,j,m\rangle = \hbar\sqrt{j(j+1) - m(m\pm1)}\;\delta_{m',m\pm1} \tag{C-51}$$
These relations show that the matrix elements representing the components of $\mathbf{J}$ depend only on $j$ and $m$, and not on $k$. In order to know, in all cases, the matrix associated with an arbitrary component of $\mathbf{J}$ in a standard basis, all we need to do is calculate, once and for all, the "universal" matrices $(J)^{(j)}$ which represent the components of $\mathbf{J}$ inside the subspaces $\mathscr{E}(k, j)$ for all possible values of $j$ ($j = 0$, 1/2, 1, 3/2, ...). When we study a particular physical system and its angular momentum $\mathbf{J}$, we shall determine the values of $j$ actually found in the problem, as well as the number of subspaces $\mathscr{E}(k, j)$ associated with each of them [that is, its degree of degeneracy $(2j+1)\,g(j)$]. We know that the matrix representing a given component in this particular case has the "block-diagonal" form (C-49), and we can therefore construct it from the "universal" matrices which we have just defined: for each value of $j$, we shall have $g(j)$ "blocks" identical to $(J)^{(j)}$.

Let us give some examples of $(J)^{(j)}$ matrices:

(i) $j = 0$
The subspaces $\mathscr{E}(k, j = 0)$ are one-dimensional, since $m = 0$ is the only possible value of $m$. The $(J)^{(0)}$ matrices therefore reduce to numbers, which, according to (C-51), are zero.

(ii) $j = 1/2$
The subspaces $\mathscr{E}(k, j = 1/2)$ are two-dimensional ($m = +1/2$ or $-1/2$). If we choose the basis vectors in this order ($m = +1/2$, then $m = -1/2$), we find, using (C-51):
$$(J_z)^{(1/2)} = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{C-52}$$
and:
$$(J_+)^{(1/2)} = \hbar\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \qquad (J_-)^{(1/2)} = \hbar\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{C-53}$$
that is, using (C-1):
$$(J_x)^{(1/2)} = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (J_y)^{(1/2)} = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{C-54}$$
The matrix representing $\mathbf{J}^2$ is therefore:
$$(\mathbf{J}^2)^{(1/2)} = \frac{3}{4}\hbar^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{C-55}$$
We thus find the matrices that were introduced without justification in Chapter IV, § A-2.

(iii) $j = 1$
We now have (order of the basis vectors: $m = 1, 0, -1$):
$$(J_z)^{(1)} = \hbar\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{C-56}$$
$$(J_+)^{(1)} = \hbar\sqrt{2}\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \qquad (J_-)^{(1)} = \hbar\sqrt{2}\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \tag{C-57}$$
and therefore:
$$(J_x)^{(1)} = \frac{\hbar}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \qquad (J_y)^{(1)} = \frac{\hbar}{\sqrt{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix} \tag{C-58}$$
and:
$$(\mathbf{J}^2)^{(1)} = 2\hbar^2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{C-59}$$
It can be verified that matrices (C-56) and (C-58) satisfy the commutation relations (B-9).
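This verification is straightforward to automate. The sketch below (our own illustration in Python/numpy; the function name `angular_momentum_matrices` is not from the text) builds $(J_x)^{(j)}$, $(J_y)^{(j)}$, $(J_z)^{(j)}$ for any $j$ directly from the matrix elements (C-51), then checks the commutation relations (B-9) and that $\mathbf{J}^2$ reduces to $j(j+1)\hbar^2$ times the unit matrix. For $j = 1/2$ and $j = 1$ it reproduces exactly the matrices (C-52)–(C-58).

```python
import numpy as np

def angular_momentum_matrices(j, hbar=1.0):
    """(Jx, Jy, Jz) in the standard basis |j,m>, m = j, j-1, ..., -j, from (C-51)."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                     # m values in decreasing order
    Jz = hbar * np.diag(m).astype(complex)
    # <j, m+1 | J+ | j, m> = hbar * sqrt(j(j+1) - m(m+1)), just above the diagonal
    Jplus = np.diag(hbar * np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Jminus = Jplus.conj().T                    # J- is the adjoint of J+
    Jx = (Jplus + Jminus) / 2
    Jy = (Jplus - Jminus) / (2 * 1j)
    return Jx, Jy, Jz

def comm(A, B):
    return A @ B - B @ A

hbar = 1.0
for j in (0.5, 1.0, 1.5, 2.0):
    Jx, Jy, Jz = angular_momentum_matrices(j, hbar)
    assert np.allclose(comm(Jx, Jy), 1j * hbar * Jz)        # (B-9)
    assert np.allclose(comm(Jy, Jz), 1j * hbar * Jx)
    assert np.allclose(comm(Jz, Jx), 1j * hbar * Jy)
    J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz                        # (B-10)
    assert np.allclose(J2, j * (j + 1) * hbar**2 * np.eye(len(Jz)))
print("(B-9) and J^2 = j(j+1) hbar^2 * 1 verified for j = 1/2, 1, 3/2, 2")
```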

(iv) $j$ arbitrary

We use relations (C-51), which, according to (C-1), can also be written:
$$J_x|k,j,m\rangle = \frac{\hbar}{2}\left[\sqrt{j(j+1)-m(m+1)}\;|k,j,m+1\rangle + \sqrt{j(j+1)-m(m-1)}\;|k,j,m-1\rangle\right] \tag{C-60}$$
and:
$$J_y|k,j,m\rangle = \frac{\hbar}{2i}\left[\sqrt{j(j+1)-m(m+1)}\;|k,j,m+1\rangle - \sqrt{j(j+1)-m(m-1)}\;|k,j,m-1\rangle\right] \tag{C-61}$$
As for the matrix $(J_z)^{(j)}$, it is diagonal and its elements are the $(2j+1)$ values $m\hbar$. The only non-zero matrix elements of $(J_x)^{(j)}$ and $(J_y)^{(j)}$ are those directly above and directly below the diagonal: $(J_x)^{(j)}$ is symmetric and real, and $(J_y)^{(j)}$ is antisymmetric and pure imaginary. Since the kets $|k,j,m\rangle$ are, by construction, eigenvectors of $\mathbf{J}^2$, we have:
$$\mathbf{J}^2|k,j,m\rangle = j(j+1)\hbar^2\,|k,j,m\rangle \tag{C-62}$$
The matrix $(\mathbf{J}^2)^{(j)}$ is therefore proportional to the $(2j+1)\times(2j+1)$ unit matrix: its diagonal elements are all equal to $j(j+1)\hbar^2$.

Comment:

The $Oz$ axis which we have chosen as the "quantization axis" is entirely arbitrary. All directions in space are physically equivalent, and we should expect the eigenvalues of $J_x$ or $J_y$ to be the same as those of $J_z$ (their eigenvectors, however, are different, since $J_x$ and $J_y$ do not commute with $J_z$). It can indeed be verified that the eigenvalues of the $(J_x)^{(1/2)}$ and $(J_y)^{(1/2)}$ matrices [formulas (C-54)] are $\pm\hbar/2$, and that those of the $(J_x)^{(1)}$ and $(J_y)^{(1)}$ matrices [formulas (C-58)] are $+\hbar$, $0$, $-\hbar$. This result is general: inside a given subspace $\mathscr{E}(k, j)$, the eigenvalues of $J_x$ and $J_y$ (like those of the component $J_u = \mathbf{J}\cdot\mathbf{u}$ of $\mathbf{J}$ along an arbitrary unit vector $\mathbf{u}$) are $j\hbar$, $(j-1)\hbar$, ..., $(-j+1)\hbar$, $-j\hbar$. The corresponding eigenvectors (eigenvectors common to $\mathbf{J}^2$ and $J_x$, $\mathbf{J}^2$ and $J_y$, or $\mathbf{J}^2$ and $J_u$) are linear combinations of the $|k,j,m\rangle$ with $k$ and $j$ fixed.
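This can be confirmed numerically (again a sketch of ours, independent of the text): the matrices built from (C-51) for several values of $j$ have, for $J_x$ and $J_y$, exactly the spectrum $j\hbar$, $(j-1)\hbar$, ..., $-j\hbar$.

```python
import numpy as np

hbar = 1.0
for j in (0.5, 1.0, 1.5, 2.0):
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                                   # j, j-1, ..., -j
    Jplus = np.diag(hbar * np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Jx = (Jplus + Jplus.conj().T) / 2
    Jy = (Jplus - Jplus.conj().T) / (2 * 1j)
    for A in (Jx, Jy):
        eigenvalues = np.sort(np.linalg.eigvalsh(A))         # A is Hermitian
        assert np.allclose(eigenvalues, np.sort(hbar * m))
print("spectra of Jx and Jy: m*hbar with m = -j, ..., +j, the same as Jz")
```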

To conclude this section devoted to "standard" representations, we summarize:

An orthonormal basis $\{|k, j, m\rangle\}$ of the state space, composed of eigenvectors common to $\mathbf{J}^2$ and $J_z$:
$$\mathbf{J}^2|k,j,m\rangle = j(j+1)\hbar^2\,|k,j,m\rangle \qquad J_z|k,j,m\rangle = m\hbar\,|k,j,m\rangle$$
is called a "standard basis" if the action of the operators $J_+$ and $J_-$ on the basis vectors is given by:
$$J_+|k,j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\;|k,j,m+1\rangle$$
$$J_-|k,j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\;|k,j,m-1\rangle$$

D. Application to orbital angular momentum

In § C, we studied the general properties of angular momenta, derived solely from the commutation relations (B-9). We shall now return to the orbital angular momentum $\mathbf{L}$ of a spinless particle [formula (B-3)] and see how the general theory just developed applies to this particular case. Using the $\{|\mathbf{r}\rangle\}$ representation, we shall show that the eigenvalues of the operator $\mathbf{L}^2$ are the numbers $l(l+1)\hbar^2$ corresponding to all positive integral or zero $l$: of the possible values for $j$ found in § C-2-b, the only ones allowed in this case are the integral values, all of which are present. Then we shall indicate the eigenfunctions common to $\mathbf{L}^2$ and $L_z$ and their principal properties. Finally, we shall study these eigenstates from a physical point of view.

D-1. Eigenvalues and eigenfunctions of $\mathbf{L}^2$ and $L_z$

D-1-a. Eigenvalue equation in the $\{|\mathbf{r}\rangle\}$ representation

In the $\{|\mathbf{r}\rangle\}$ representation, the observables $\mathbf{R}$ and $\mathbf{P}$ correspond respectively to multiplication by $\mathbf{r}$ and to the differential operator $(\hbar/i)\boldsymbol{\nabla}$. The three components of the angular momentum $\mathbf{L}$ can then be written:
$$L_x = \frac{\hbar}{i}\left(y\,\frac{\partial}{\partial z} - z\,\frac{\partial}{\partial y}\right) \tag{D-1a}$$
$$L_y = \frac{\hbar}{i}\left(z\,\frac{\partial}{\partial x} - x\,\frac{\partial}{\partial z}\right) \tag{D-1b}$$
$$L_z = \frac{\hbar}{i}\left(x\,\frac{\partial}{\partial y} - y\,\frac{\partial}{\partial x}\right) \tag{D-1c}$$
It is more convenient to work in spherical (or polar) coordinates, since, as we shall see, the various angular momentum operators act only on the angular variables $\theta$ and $\varphi$, and not on the variable $r$. Instead of characterizing the vector $\mathbf{r}$ by its Cartesian components $x$, $y$, $z$, we label the corresponding point $M$ in space ($\mathbf{OM} = \mathbf{r}$) by its spherical coordinates $r$, $\theta$, $\varphi$ (Fig. 1):
$$x = r\sin\theta\cos\varphi \qquad y = r\sin\theta\sin\varphi \qquad z = r\cos\theta \tag{D-2}$$
with:
$$r \ge 0 \qquad 0 \le \theta \le \pi \qquad 0 \le \varphi < 2\pi$$

Figure 1: Definition of the spherical coordinates $r$, $\theta$, $\varphi$ of an arbitrary point $M$ in space.

The volume element $\mathrm{d}^3r = \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z$ is written in spherical coordinates:
$$\mathrm{d}^3r = r^2\sin\theta\,\mathrm{d}r\,\mathrm{d}\theta\,\mathrm{d}\varphi = r^2\,\mathrm{d}r\,\mathrm{d}\Omega \tag{D-3}$$
where:
$$\mathrm{d}\Omega = \sin\theta\,\mathrm{d}\theta\,\mathrm{d}\varphi \tag{D-4}$$
is the solid angle element about the direction of polar angles $\theta$ and $\varphi$.

Applying the classical technique of changing variables, we obtain, from formulas (D-1) and (D-2), the following expressions (the calculations are rather time-consuming but pose no great problem):
$$L_x = i\hbar\left(\sin\varphi\,\frac{\partial}{\partial\theta} + \frac{\cos\varphi}{\tan\theta}\,\frac{\partial}{\partial\varphi}\right) \tag{D-5a}$$
$$L_y = i\hbar\left(-\cos\varphi\,\frac{\partial}{\partial\theta} + \frac{\sin\varphi}{\tan\theta}\,\frac{\partial}{\partial\varphi}\right) \tag{D-5b}$$
$$L_z = \frac{\hbar}{i}\,\frac{\partial}{\partial\varphi} \tag{D-5c}$$
which yields:
$$\mathbf{L}^2 = -\hbar^2\left(\frac{\partial^2}{\partial\theta^2} + \frac{1}{\tan\theta}\,\frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta}\,\frac{\partial^2}{\partial\varphi^2}\right) \tag{D-6a}$$
$$L_+ = \hbar\,e^{i\varphi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right) \tag{D-6b}$$
$$L_- = \hbar\,e^{-i\varphi}\left(-\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right) \tag{D-6c}$$

In the r representation, the eigenfunctions associated with the eigenvalues ( + 1)~2 of L2 and ~ of are the solutions of the partial differential equations: 2 2

+

1 tan

+

(

)=

2

1 sin2

(

2

(

) = ( + 1)

)

(

)

(D-7a) (D-7b)

Since the general results of § C are applicable to the orbital angular momentum, we already know that is integral or half-integral and that, for fixed , can take on only the values , + 1, ..., 1, . In equations (D-7), does not appear in any differential operator, so we can consider it to be a parameter and take into account only the - and -dependence of . Thus, we denote by ( ) a common eigenfunction of L2 and which corresponds to the 2 eigenvalues ( + 1)~ and ~: L2

(

) = ( + 1)~2

(

)=

~

(

(

)

(D-8a)

)

(D-8b)

To be completely rigorous, we would have to introduce an additional index in order to distinguish between the various solutions of (D-8) corresponding to the same pair of values of and . In fact, as we shall see further on, these equations have only one solution (to within a constant factor) for each pair of allowed values of and ; this is why the indices and are sufficient. Comments:

( ) Equations (D-8) give the - and -dependence of the eigenfunctions of L2 and . Once the solution ( ) of these equations has been found, these eigenfunctions will be obtained in the form: (

)= ( )

(

)

(D-9) 687

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

where ( ) is a function6 of that appears as an integration constant for the partial differential equations (D-7). The fact that ( ) is arbitrary shows that L2 and do not form a C.S.C.O. in the space of functions of r (or of ). ( ) In order to normalize ( ), it is convenient to normalize ( ) and ( ) separately (as we shall do here). We then have, taking (D-4) into account: 2

d

sin

0

)2d =1

(

(D-10)

0

and: 2

( )2 d =1

(D-11)

0

D-1-b. Values of $l$ and $m$

α. $l$ and $m$ must be integral

Using expression (D-5c) for $L_z$, we can write (D-8b) in the form:
$$-i\hbar\,\frac{\partial}{\partial\varphi}\,Y_l^m(\theta,\varphi) = m\hbar\,Y_l^m(\theta,\varphi) \tag{D-12}$$
which shows that $Y_l^m(\theta,\varphi)$ is equal to:
$$Y_l^m(\theta,\varphi) = F_{l,m}(\theta)\,e^{im\varphi} \tag{D-13}$$
We can cover all space by letting $\varphi$ vary between $0$ and $2\pi$. Since a wave function must be continuous at all points in space⁷, we must have, in particular:
$$Y_l^m(\theta,\varphi = 0) = Y_l^m(\theta,\varphi = 2\pi) \tag{D-14}$$
which implies that:
$$e^{2im\pi} = 1 \tag{D-15}$$
According to the results of § C, $m$ is integral or half-integral. Relation (D-15) shows that, in the case of an orbital angular momentum, $m$ must be an integer ($e^{2im\pi}$ would be equal to $-1$ if $m$ were half-integral). But we know that $l$ and $m$ are either both integral or both half-integral: it follows that $l$ must also be an integer.

⁶ $f(r)$ must be such that $f(r)\,Y_l^m(\theta,\varphi)$ is square-integrable.
⁷ If $Y_l^m(\theta,\varphi)$ were not continuous at $\varphi = 0$, it would not be differentiable there and could not be an eigenfunction of the differential operators (D-5c) and (D-6a): its derivative with respect to $\varphi$ would produce a delta function, which is incompatible with (D-12).

β. All integral values (positive or zero) of $l$ can be found

Choose an integral value of $l$ (positive or zero). We know from the general theory of § C that $Y_l^l(\theta,\varphi)$ must satisfy:
$$L_+\,Y_l^l(\theta,\varphi) = 0 \tag{D-16}$$
which yields, taking (D-6b) and (D-13) into account:
$$\left[\frac{\mathrm{d}}{\mathrm{d}\theta} - l\cot\theta\right]F_{l,l}(\theta) = 0 \tag{D-17}$$
This first-order equation can be integrated immediately if we notice that:
$$\cot\theta\,\mathrm{d}\theta = \frac{\mathrm{d}(\sin\theta)}{\sin\theta} \tag{D-18}$$
Its general solution is:
$$F_{l,l}(\theta) = c_l\,(\sin\theta)^l \tag{D-19}$$
where $c_l$ is a normalization constant⁸. Consequently, for each positive or zero integral value of $l$, there exists a function $Y_l^l(\theta,\varphi)$ which is unique (to within a constant factor):
$$Y_l^l(\theta,\varphi) = c_l\,(\sin\theta)^l\,e^{il\varphi} \tag{D-20}$$
Through the repeated action of $L_-$, we then construct $Y_l^{l-1}$, ..., $Y_l^m$, ..., $Y_l^{-l}$. Thus we see that there corresponds to the pair of eigenvalues $l(l+1)\hbar^2$ and $m\hbar$ (where $l$ is an arbitrary positive integer or zero and $m$ is another integer such that $-l \le m \le l$) one and only one eigenfunction $Y_l^m(\theta,\varphi)$, which can be unambiguously calculated from (D-20). The eigenfunctions $Y_l^m(\theta,\varphi)$ are called spherical harmonics.
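The construction (D-16)–(D-20) can be checked symbolically. The short sympy sketch below (our own illustration, not part of the text) applies the differential operators (D-5c) and (D-6b) to $Y_l^l \propto (\sin\theta)^l e^{il\varphi}$ and confirms that $L_z$ gives the eigenvalue $l\hbar$ and that $L_+$ gives zero, for a symbolic positive integer $l$.

```python
import sympy as sp

theta, phi, hbar = sp.symbols('theta phi hbar', positive=True)
l = sp.symbols('l', integer=True, positive=True)

Y_ll = sp.sin(theta)**l * sp.exp(sp.I * l * phi)       # (D-20), unnormalized

# Lz and L+ in spherical coordinates, formulas (D-5c) and (D-6b)
def Lz(f):
    return -sp.I * hbar * sp.diff(f, phi)
def Lplus(f):
    return hbar * sp.exp(sp.I * phi) * (sp.diff(f, theta)
                                        + sp.I * sp.cot(theta) * sp.diff(f, phi))

print(sp.simplify(Lz(Y_ll) - l * hbar * Y_ll))   # 0: eigenvalue m = l
print(sp.simplify(Lplus(Y_ll)))                  # 0: Y_l^l is annihilated by L+
```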

D-1-c.

Fundamental properties of the spherical harmonics

The spherical harmonics ( ) will be studied in greater detail in Complement AVI . Here we shall confine ourselves to summarizing this study by stating without proof its principal results. .

Recurrence relations According to the general results of § C, we have: (

)=~

( + 1)

(

1)

1

(

)

(D-21)

Using expressions (D-6b) and (D-6c) for the operators + and and the fact that ( ) is the product of a function of alone and e , we obtain: e ( e

cot ) (

cot )

(

)= (

)=

( + 1) ( + 1)

(

+1

+ 1) (

1)

(

) 1

(

(D-22a) )

(D-22b)

8 Inversely,

one can easily show that the function obtained in this way is actually an eigenfunction of and with eigenvalues ( + 1)~2 and ~. According to (D-5c) and (D-13), it is immediately seen that ( )= ~ ( ). Then, using this equation and (D-16), as well as (C-7b), one can show that ( ) is also an eigenfunction of L2 with the expected eigenvalue. L2

689

CHAPTER VI

.

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

Orthonormalization and closure relations

Equations (D-7) determine the spherical harmonics only to within a constant factor. We now choose this factor so as to orthonormalize the ( ) (as functions of the angular variables and ): 2

d

sin

0

d

(

)

(

)=

(D-23)

0

Furthermore, any function of spherical harmonics:

and

,

(

), can be expanded in terms of the

+

(

)=

( =0

)

(D-24)

=

with: 2

=

d 0

sin

d

(

) (

)

(D-25)

0

The spherical harmonics therefore constitute an orthonormal basis in the space functions of and . This fact is expressed by the closure relation: ( =0

)

(

) = (cos

cos ) (



of

)

=

=

1 sin

(

) (

)

(D-26)

[it is (cos cos ), and not ( ), that enters into the right-hand side of the closure relation, because the integrations over the variable are performed using the differential element sin d = d(cos )]. .

Parity and complex conjugation

First of all, recall that the change from r to r (reflection through the coordinate origin) is expressed in spherical coordinates by (Fig. 2): = =

(D-27)

=

+

It is simple (see Complement AVI ) to show that: (

+ ) = ( 1)

(

)

(D-28)

The spherical harmonics are therefore functions with a definite parity, which is independent of ; they are even if is even and odd if is odd. Also, it can easily be seen that: [

690

(

)] = ( 1)

(

)

(D-29)

D. APPLICATION TO ORBITAL ANGULAR MOMENTUM

z

M N′

Figure 2: Transformation in spherical coordinates of an arbitrary point by reflection through the origin; is not changed, becomes , and becomes + .

θ

π–θ

π+φ

y N

φ M′

x

D-1-d.

“Standard” bases of the wave function space of a spinless particle

As we have already noted [comment ( ) of § D-1-a], L2 and do not constitute a C.S.C.O. in the wave function space of a spinless particle. We shall now indicate, relying on the reasoning and results of § C-3, the form of the “standard” bases of this space. Let ( = ) be the subspace of eigenfunctions common to L2 and , of eigen2 values ( + )~ and ~, where is a fixed positive integer or zero. The first step in the construction of a “standard” basis (cf. § C-3) consists of choosing an arbitrary orthonormal basis in each of the ( = ). We shall denote by (r) the functions that constitute the basis chosen in ( = ), the index (assumed to be discrete for simplicity) serving to distinguish between the various functions of this basis. By repeated application of the operator on the (r), we then construct the functions (r) which complete the “standard” basis for = ; they satisfy equations (C-12) and (C-50), which become here: L2

(r) = ( + 1)~2 (r) =

(r)

(D-30)

(r)

~

and: (r) = ~

( + 1)

(

1)

1 (r)

(D-31)

But we saw in § D-1-a that all eigenfunctions common to L2 and that correspond to given eigenvalues ( + 1)~2 and ~ have the same angular dependence, that of ( ); only their radial dependence differs. From equations (D-30), we therefore deduce that (r) has the form: (r) =

( )

(

)

(D-32) 691

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

Let us now show that, if the (r) constitute a “standard” basis, the radial functions (r) are independent of . Since the differential operators do not act on the -dependence, we have, according to (D-21): (r) =

(r)

=~

(

( + 1)

)

(

1)

1

( )

(

)

(D-33)

Comparison with (D-31) shows that the radial functions must satisfy, for all : 1(

)=

( )

(D-34)

and are consequently independent of . The functions (r) of a “standard” basis of the wave function space of a (spinless) particle are therefore necessarily of the form: (r) =

( )

(

)

(D-35)

The orthonormalization relation for such a basis is: d3

(r)

2

(r) =

d

( )

( )

0 2

d 0

sin d

(

)

(

)=

(D-36)

0

Since the spherical harmonics are orthonormal [formula (D-23)], we obtain, finally: 2

d

( )

( )=

(D-37)

0

The radial functions ( ) are therefore normalized with respect to the variable ; moreover, two radial functions corresponding to the same value of but to different indices are orthogonal.

Comments:

( ) Relation (D-37) is simply a consequence of the orthonormality of the functions (r) = ( ) ( ), which have been chosen as a basis in the subspace ( = ). It is therefore essential that the index be the same for the two functions appearing on the left-hand side. For = , (r) and (r) are orthogonal anyway because of their angular dependence (they are eigenfunctions of the Hermitian operator L2 with different eigenvalues). The integral: 2

d

( )

( )

(D-38)

0

may therefore take on any value a priori if and 692

are different.

D. APPLICATION TO ORBITAL ANGULAR MOMENTUM

( ) In general, the radial functions ( ) depend on , for the following reason. A function of the form ( ) ( ) can be continuous at the coordinate origin ( = 0, and arbitrary) only if ( ) reduces to a constant or if ( ) goes to zero at = 0 [since if ( ) depends on and , the limit of ( ) ( ) when 0 depends on the direction along which one approaches the origin if (0) is not zero]. Consequently, if we want the basis functions (r) to be continuous, only the radial functions corresponding to = 0 can be non-zero at = 0 [ 00 ( ) is indeed a constant]. Similarly, if we require the (r) to be differentiable (once or several times) at the origin, we obtain conditions for the (r) that depend on the value of . D-2.

Physical considerations

D-2-a. Study of an $|l, m\rangle$ state

Consider a (spinless) particle in an eigenstate $|\varphi_{l,m}\rangle$ of $\mathbf{L}^2$ and $L_z$ [whose associated wave function is $\psi_{l,m}(\mathbf{r})$], that is, a state in which the square of its angular momentum and the projection of this angular momentum along the $Oz$ axis have well-defined values [$l(l+1)\hbar^2$ and $m\hbar$ respectively]. Suppose that we want to measure the component along the $Ox$ or $Oy$ axis of the angular momentum of this particle. Since $L_x$ and $L_y$ do not commute with $L_z$, $|\varphi_{l,m}\rangle$ is an eigenstate neither of $L_x$ nor of $L_y$; we cannot, therefore, predict with certainty the result of such a measurement. Let us calculate the mean values and root mean square deviations of $L_x$ and $L_y$ in the state $|\varphi_{l,m}\rangle$. These calculations can be performed very simply if we express $L_x$ and $L_y$ in terms of $L_+$ and $L_-$. We invert formulas (C-1):
$$L_x = \frac{1}{2}\left(L_+ + L_-\right) \qquad L_y = \frac{1}{2i}\left(L_+ - L_-\right) \tag{D-39}$$
Thus we see that $L_x$ and $L_y$ are linear combinations of $L_+$ and $L_-$, which, acting on $|\varphi_{l,m}\rangle$, yield kets proportional to $|\varphi_{l,m+1}\rangle$ and $|\varphi_{l,m-1}\rangle$; this leads to:
$$\langle L_x\rangle = \langle L_y\rangle = 0 \tag{D-40}$$
Furthermore:
$$\langle L_x^2\rangle = \frac{1}{4}\,\langle\varphi_{l,m}|\left(L_+^2 + L_-^2 + L_+L_- + L_-L_+\right)|\varphi_{l,m}\rangle$$
$$\langle L_y^2\rangle = -\frac{1}{4}\,\langle\varphi_{l,m}|\left(L_+^2 + L_-^2 - L_+L_- - L_-L_+\right)|\varphi_{l,m}\rangle \tag{D-41}$$
The terms in $L_+^2$ and $L_-^2$ do not contribute to the result, since $L_\pm^2|\varphi_{l,m}\rangle$ is proportional to $|\varphi_{l,m\pm2}\rangle$, which is orthogonal to $|\varphi_{l,m}\rangle$. In addition, formula (C-8) yields:
$$L_+L_- + L_-L_+ = 2\left(\mathbf{L}^2 - L_z^2\right) \tag{D-42}$$
We therefore obtain:
$$\langle L_x^2\rangle = \langle L_y^2\rangle = \frac{1}{2}\,\langle\varphi_{l,m}|\left(\mathbf{L}^2 - L_z^2\right)|\varphi_{l,m}\rangle = \frac{\hbar^2}{2}\left[l(l+1) - m^2\right] \tag{D-43}$$
Thus, in the state $|\varphi_{l,m}\rangle$:
$$\langle L_x\rangle = \langle L_y\rangle = 0 \tag{D-44a}$$
$$\Delta L_x = \Delta L_y = \hbar\,\sqrt{\tfrac{1}{2}\left[l(l+1) - m^2\right]} \tag{D-44b}$$
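Equations (D-40) and (D-44) are easy to confirm numerically with the matrix representation of § C-3-c. In the sketch below (our own check, with ad hoc helper code), the state $|l, m\rangle$ is a basis vector, and $\langle L_x\rangle$ and $\langle L_x^2\rangle$ are computed as matrix elements; they agree with $0$ and $\tfrac{\hbar^2}{2}[l(l+1) - m^2]$.

```python
import numpy as np

hbar = 1.0
l, m = 2, 1                                    # any allowed pair works
dim = 2 * l + 1
mvals = l - np.arange(dim)                     # l, l-1, ..., -l
Lplus = np.diag(hbar * np.sqrt(l * (l + 1) - mvals[1:] * (mvals[1:] + 1)), k=1).astype(complex)
Lx = (Lplus + Lplus.conj().T) / 2
Ly = (Lplus - Lplus.conj().T) / (2 * 1j)

state = np.zeros(dim, dtype=complex)
state[list(mvals).index(m)] = 1.0              # the basis ket |l, m>

mean_Lx  = state.conj() @ Lx @ state
mean_Lx2 = state.conj() @ (Lx @ Lx) @ state
mean_Ly2 = state.conj() @ (Ly @ Ly) @ state
print(mean_Lx.real)                                    # 0, as in (D-40)
print(mean_Lx2.real, mean_Ly2.real,                    # both equal...
      0.5 * hbar**2 * (l * (l + 1) - m**2))            # ...(D-43): 2.5 here
```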

These results suggest the following picture. Consider a classical angular momentum $\mathbf{OL}$, whose modulus is equal to $\hbar\sqrt{l(l+1)}$ and whose projection along $Oz$ is $m\hbar$ (Fig. 3):
$$|\mathbf{OL}| = \hbar\sqrt{l(l+1)} \qquad \overline{OH} = m\hbar \tag{D-45}$$
We denote by $\Theta$ and $\Phi$ the polar angles that characterize its direction. Since the triangle $OHL$ has a right angle at $H$, we have:
$$\overline{HL}^{\,2} = |\mathbf{OL}|^2 - \overline{OH}^{\,2} = \hbar^2\left[l(l+1) - m^2\right] \tag{D-46}$$
Consequently, the components of such a classical angular momentum would be:
$$\mathscr{L}_x = \hbar\sqrt{l(l+1) - m^2}\,\cos\Phi \qquad \mathscr{L}_y = \hbar\sqrt{l(l+1) - m^2}\,\sin\Phi \qquad \mathscr{L}_z = \hbar\sqrt{l(l+1)}\,\cos\Theta = m\hbar \tag{D-47}$$
Now let us assume that $|\mathbf{OL}|$ and $\Theta$ are known and that $\Phi$ is a random variable which can take on any value in the interval $[0, 2\pi]$, all these values being equally probable (an evenly distributed random variable). We then have, averaging over $\Phi$:
$$\frac{1}{2\pi}\int_0^{2\pi}\cos\Phi\,\mathrm{d}\Phi = 0 \tag{D-48a}$$
$$\frac{1}{2\pi}\int_0^{2\pi}\sin\Phi\,\mathrm{d}\Phi = 0 \tag{D-48b}$$
which corresponds to (D-44a). In addition:
$$\overline{\mathscr{L}_x^2} = \hbar^2\left[l(l+1) - m^2\right]\,\frac{1}{2\pi}\int_0^{2\pi}\cos^2\Phi\,\mathrm{d}\Phi = \frac{\hbar^2}{2}\left[l(l+1) - m^2\right] \tag{D-49}$$
and, similarly:
$$\overline{\mathscr{L}_y^2} = \frac{\hbar^2}{2}\left[l(l+1) - m^2\right] \tag{D-50}$$

Figure 3: A classical model for the orbital angular momentum of a particle in a state $|\varphi_{l,m}\rangle$. We assume that the distance $|OL|$ and the angle $\Theta$ are known, but that $\Phi$ is a random variable whose probability density is constant inside the interval $[0, 2\pi]$. The classical mean values of the components of $\mathbf{OL}$, as well as those of the squares of these components, are then equal to the corresponding quantum mechanical mean values.

These mean values are identical to the ones we found in (D-44). Consequently, the angular momentum of a particle in the state $|\varphi_{l,m}\rangle$ behaves, insofar as the mean values of its components and their squares are concerned, like a classical angular momentum of magnitude $\hbar\sqrt{l(l+1)}$ having a projection $m\hbar$ along $Oz$, but for which $\Phi$ is a random variable evenly distributed between $0$ and $2\pi$. Of course, this picture must be used carefully: we have shown throughout this chapter how much the quantum mechanical properties of angular momenta differ from their classical properties. In particular, we must stress the fact that an individual measurement of $L_x$ or $L_y$ on a particle in the state $|\varphi_{l,m}\rangle$ cannot yield an arbitrary value between $-\hbar\sqrt{l(l+1) - m^2}$ and $+\hbar\sqrt{l(l+1) - m^2}$, as the preceding model might lead us to believe. The only possible results are the eigenvalues of $L_x$ or $L_y$ (we saw at the end of § C that these are the same as those of $L_z$), that is, since $l$ is fixed here, one of the $(2l+1)$ values $l\hbar$, $(l-1)\hbar$, ..., $(-l+1)\hbar$, $-l\hbar$.
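The agreement between the quantum mean values (D-44) and the classical model of Figure 3 can be illustrated with a small Monte Carlo sketch (ours, not the book's): draw $\Phi$ uniformly in $[0, 2\pi]$, form the classical components (D-47), and average. The sample averages agree with (D-44) to within statistical error.

```python
import numpy as np

rng = np.random.default_rng(0)
hbar, l, m = 1.0, 2, 1
Phi = rng.uniform(0.0, 2.0 * np.pi, 1_000_000)      # evenly distributed azimuthal angle
rho = hbar * np.sqrt(l * (l + 1) - m**2)             # length |HL| of Figure 3
Lx_cl = rho * np.cos(Phi)                            # classical components (D-47)
Ly_cl = rho * np.sin(Phi)

print(Lx_cl.mean(), Ly_cl.mean())                    # both ~ 0, cf. (D-44a)
print((Lx_cl**2).mean(), (Ly_cl**2).mean(),          # both ~ 2.5 ...
      0.5 * hbar**2 * (l * (l + 1) - m**2))          # ... = (hbar^2/2)[l(l+1) - m^2]
```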

D-2-b. Calculation of the physical predictions concerning measurements of $\mathbf{L}^2$ and $L_z$

Consider a particle whose state is described by the (normalized) wave function: = (r) = (

r

)

(D-51)

We know that a measurement of L2 can yield only the results 0, 2~2 , 6~2 , ( +1)~2 , ..., and a measurement of , only the results 0, ~, 2~, ~, ... How can we calculate the probabilities of these different results from the wave function ( )? .

General formulas

Let us denote by PL2 ( ) the probability of finding, in a simultaneous measurement of L2 and , the results ( + 1)~2 and ~. This probability can be obtained by expanding (r) on a basis composed of eigenfunctions of L2 and ; we shall choose a “standard” basis of the type introduced in § D-1-d: (r) =

( )

(

)

(D-52)

(r) can then be written: (r) =

( )

where the coefficients d3

=

(

)

(D-53)

can be calculated by using the usual formula: (r)

(r) 2

2

=

d

( )

0

d 0

sin d

(

)

(

)

According to the postulates of Chapter III, the probability PL2 under these conditions, by: PL2

(

(D-54)

0

(

) is given,

2

)=

(D-55)

If we measure only L2 , the probability of finding the result ( + 1)~2 is equal to: +

PL2 ( ) =

+

PL2

(

=

( )=

(D-56)

=

Similarly, if it is only P

2

)=

PL2

that we wish to measure, the probability of obtaining (

)=

2

~ is: (D-57)

(the restriction is automatically satisfied, since there are no coefficients for which would be greater than ). Actually, since L2 and act only on and , we see that it is the - and dependence of the wave function (r) that count in the preceding probability calculations. To be more precise, consider ( ) as a function of and depending on the 696

D. APPLICATION TO ORBITAL ANGULAR MOMENTUM

parameter . Like any other function of the spherical harmonics: (

)=

( )

The coefficients

(

and

,

can then be expanded in terms of

)

(D-58)

of this expansion depend on the “parameter”

and are given by:

2

( )=

d

sin d

0

(

)

(

)

(D-59)

0

If we compare expressions (D-58) and (D-53), we see that the of the expansion of ( ) on the functions ( ): ( )=

( )

are the coefficients

(D-60)

with, taking (D-54) and (D-59) into account: 2

=

d

( )

( )

(D-61)

0

Using (D-37) and (D-60), we also obtain: 2

( )2=

d

2

(D-62)

0

The probability PL2 PL2

(

(

) [formula (D-55)] can therefore also be written in the form:

2

)=

d

( )2

(D-63)

0

From this we can deduce, as in (D-56) and (D-57): +

PL2 ( ) =

2 =

d

( )2

(D-64)

0

and: P

2

( )=

d

( )2

(D-65)

0

[here again, ( ) exists only for ]. Consequently, to obtain the physical predictions concerning measurements of L2 and , we may consider the wave function as depending only on and . We then expand it in terms of the spherical harmonics as in (D-58) and apply formulas (D-63), (D-64) and (D-65). Similarly, since acts only on , it is the -dependence of the wave function (r) that counts in the calculation of P ( ). To see this, we shall use the fact that the 697

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

spherical harmonics are products of a function of shall write them in the form: (

)=

alone and a function of

e

( )

alone. We

(D-66)

2

so that each of the functions of the product is normalized, since we have: 2

d

e

e 2

0

=

2

(D-67)

Substituting this formula into the orthonormalization relation (D-23) for the spherical harmonics, we find: sin d

( )

( )=

(D-68)

0

[for reasons analogous to those indicated in comment ( ) of § D-1-d, the same value of is involved in both functions of the left-hand side]. If we consider ( ) to be a function of defined in the interval [0 2 ] and depending on the “parameters” and , we can expand it in a Fourier series: (

)=

(

)

where the coefficients

(

(

(D-69)

2 ) can be calculated from the formula:

2

1 2

)=

e

d e

(

)

(D-70)

0

If we compare formulas (D-69) and (D-70) with (D-58) and (D-59), we see that the ( ) for fixed are the coefficients of the expansion of ( ) on the functions corresponding to the same value of : (

)=

( )

( )

(D-71)

with: ( )=

sin d

( )

(

)

(D-72)

0

With (D-68) taken into account, expansion (D-71) requires that: sin d

)2=

(

( )2

(D-73)

0

Substituting this formula into (D-65), we obtain P P

2

( )= 0

698

d

sin d 0

(

)2

( ) in the form: (D-74)

D. APPLICATION TO ORBITAL ANGULAR MOMENTUM

Therefore, as far as measurements of alone are concerned, all we need to do is consider the wave function as depending solely on and expand it in a Fourier series as in (D-69) in order to calculate the probabilities of the various possible results. We might be tempted to think that an argument analogous to the preceding ones would give PL2 ( ) in terms of the expansion of ( ) with respect to the variable alone. In fact, this is not the case: predictions concerning a measurement of L2 alone involve both the - and the -dependence of the wave function; this is related to the fact that L2 acts on both and . We must therefore use formula (D-64). .

Special cases and examples

Suppose that the wave function (r) representing the state of the particle appears in the form of a product of a function of alone and a function of and : (

)= ( ) (

)

(D-75)

We can always assume ( ) and ( 2

) to be separately normalized:

( )2=1

d

(D-76a)

0 2

d

sin d

0

)2=1

(

(D-76b)

0

To obtain the expansion (D-58) of such a wave function, all we must do is expand ( in terms of the spherical harmonics: (

)=

(

)

)

(D-77)

with: 2

=

d 0

sin d

(

) (

)

In this case, therefore, the coefficients ( ): ( )=

( ) of formula (D-58) are all proportional to

( )

(D-79)

With (D-76a) taken into account, expression (D-63) for the probability PL2 becomes simply: P L2

(

(D-78)

0

)=

2

(

)

(D-80)

This probability is totally independent of the radial part ( ) of the wave function. Similarly, let us consider the case in which the wave function ( ) is the product of three functions of a single variable: (

)= ( ) ( ) ( )

(D-81) 699

CHAPTER VI

GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS

which we shall assume to be separately normalized: 2 2

( )2=

d

( )2=

sin d

0

0

d

( )2=1

(D-82)

0

Of course, (D-81) is a special case of (D-75), and the results we have just established apply here. But, in addition, if we are interested only in a measurement of , all we must do is expand ( ) in the form: e

( )=

(D-83)

2

where: =

2

1 2

d e

( )

(D-84)

0

in order to obtain the equivalent of formula (D-69), with: (

)=

( ) ( )

According to (D-82), P P

(D-85) ( ) is then given by (D-74) as:

2

( )=

(D-86)

The preceding considerations can be illustrated by some very simple examples. First, let us assume that the wave function (r) is in fact independent of and , so that: 1 2 1 ( )= 2 ( )=

(D-87)

We then have: (

)=

1 4

=

0 0(

)

(D-88)

Thus a measurement of L2 or of must yield zero. Now, let us modify only the -dependence, choosing: ( )= ( )=

3 cos 2 1 2

(D-89)

In this case: ( 700

)=

3 cos = 4

0 1(

)

(D-90)

D. APPLICATION TO ORBITAL ANGULAR MOMENTUM

We are again sure of the results of measuring L2 or . For L2 , we can only obtain 2~2 ; for , 0. It can be verified that this modification of the -dependence has not changed the physical predictions concerning the measurement of . On the other hand, if we modify the -dependence by setting, for example: 1 2 e ( )= 2 ( )=

(D-91)

( ) is no longer equal to a single spherical harmonic. According to (D-86), all the probabilities P ( ) are zero except for: P

(

= 1) = 1

(D-92)

But the predictions concerning a measurement of L2 are also changed with respect to the case (D-87). In order to calculate these predictions, we must expand the function: (

)=

1 e 4

(D-93)

on the spherical harmonics. It can be verified that all the ( ), with odd and = 1, actually appear in the expansion of the function (D-93). Consequently, we are no longer sure of the result of a measurement of L2 (the probabilities of the various possible results can be calculated from the expression for the spherical harmonics). We therefore conclude from this example that, as pointed out at the end of § D-2-b- , the dependence of the wave function also enters into the calculation of predictions concerning measurements of L2 . References and suggestions for further reading: Dirac (1.13), §§ 35 and 36; Messiah (1.17), Chap. XIII; Rose (2.19); Edmonds (2.21).


COMPLEMENTS OF CHAPTER VI, READER’S GUIDE

AVI : SPHERICAL HARMONICS

Detailed study of the spherical harmonics ( ); establishes certain properties used in Chapter VI, as well as in certain subsequent complements.

BVI : ANGULAR MOMENTUM AND ROTATIONS

Brings out the close relation that exists between the angular momentum J of a quantum mechanical system and the spatial rotations that can be performed on it. Shows that the commutation relations between the components of J express purely geometrical properties of these rotations; introduces the concept of a scalar or vector observable, which will reappear in other complements (especially DX ). Important theoretically; however, sometimes difficult; can be reserved for later study.

CVI : ROTATION OF DIATOMIC MOLECULES

A simple and direct application of quantum mechanical properties of angular momentum: pure rotational spectra of heteropolar diatomic molecules, and Raman rotational spectra. Elementary level. Because of the importance of the phenomena studied in physics and chemistry, can be recommended for a first reading.

DVI : ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

Can be considered as a worked example. Studies the stationary states of the two-dimensional harmonic oscillator; in order to classify these states by angular momentum, introduces the concept of "circular quanta". Not theoretically difficult. Some results will be used in Complement EVI .

EVI : A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

A general study of the quantum mechanical properties of a charged particle in a magnetic field, followed by a study of the important special case in which the magnetic field is uniform (Landau levels). Not theoretically difficult. Recommended for a first reading, which can, however, be confined to §§ 1-a and 1-b, 2-a and 2-b, 3-a.

FVI : EXERCISES






Complement AVI Spherical harmonics

1. Calculation of spherical harmonics
   1-a. Determination of $Y_l^l(\theta,\varphi)$
   1-b. General expression for $Y_l^m(\theta,\varphi)$
   1-c. Explicit expressions for $l$ = 0, 1 and 2
2. Properties of spherical harmonics
   2-a. Recurrence relations
   2-b. Orthonormalization and closure relations
   2-c. Parity
   2-d. Complex conjugation
   2-e. Relation between the spherical harmonics and the Legendre polynomials and associated Legendre functions

This complement is devoted to the study of the form and principal properties of spherical harmonics. It includes the proofs of certain results that were stated without proof in § D-1-c of Chapter VI.

1. Calculation of spherical harmonics

In order to calculate the various spherical harmonics $Y_l^m(\theta,\varphi)$, we shall use the method indicated in Chapter VI (§ D-1-c): starting with the expression for $Y_l^l(\theta,\varphi)$, we shall use the operator $L_-$ to obtain by iteration the spherical harmonics corresponding to the same value of $l$ and the $(2l+1)$ values of $m$ associated with it. Recall that the operators $L_\pm$ act only on the angular dependence of a wave function and can be written:

$L_\pm = \hbar\, e^{\pm i\varphi}\left[\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right]$   (1)

1-a. Determination of $Y_l^l(\theta,\varphi)$

We have seen (§ D-1-c of Chapter VI) that $Y_l^l(\theta,\varphi)$ can be calculated from the equation:

$L_+\, Y_l^l(\theta,\varphi) = 0$   (2)

and from the fact that:

$Y_l^l(\theta,\varphi) = F_l^l(\theta)\, e^{il\varphi}$   (3)

Thus we obtained:

$Y_l^l(\theta,\varphi) = c_l\,(\sin\theta)^l\, e^{il\varphi}$   (4)




where $c_l$ is an arbitrary constant. First, let us determine the absolute value of $c_l$ by requiring $Y_l^l(\theta,\varphi)$ to be normalized with respect to the angular variables $\theta$ and $\varphi$:

$\int_0^{2\pi} d\varphi \int_0^{\pi}\sin\theta\, d\theta\;\left|Y_l^l(\theta,\varphi)\right|^2 = |c_l|^2 \int_0^{2\pi} d\varphi \int_0^{\pi}\sin\theta\, d\theta\;(\sin\theta)^{2l} = 1$   (5)

We obtain:

$|c_l|^2 = \frac{1}{2\pi I_l}$   (6)

where $I_l$ is given by:

$I_l = \int_0^{\pi}\sin\theta\, d\theta\;(\sin\theta)^{2l} = \int_{-1}^{+1} du\,(1-u^2)^l$   (7)

(setting $u = \cos\theta$). $I_l$ can easily be calculated by recurrence, since:

$I_l = \int_{-1}^{+1} du\,(1-u^2)(1-u^2)^{l-1} = I_{l-1} - \int_{-1}^{+1} du\; u\cdot u\,(1-u^2)^{l-1}$   (8)

An integration by parts of the last integral yields:

$\int_{-1}^{+1} du\; u\cdot u\,(1-u^2)^{l-1} = \frac{1}{2l}\, I_l$   (9)

We therefore have:

$I_l = \frac{2l}{2l+1}\, I_{l-1}$   (10)

with:

$I_0 = \int_{-1}^{+1} du = 2$   (11)

From this, we can immediately derive the value of $I_l$:

$I_l = \frac{(2l)!!}{(2l+1)!!}\, I_0 = \frac{2^{2l+1}\,(l!)^2}{(2l+1)!}$   (12)

$Y_l^l(\theta,\varphi)$ is then normalized if:

$|c_l| = \frac{1}{2^l\, l!}\sqrt{\frac{(2l+1)!}{4\pi}}$   (13)

In order to define $c_l$ completely, we must choose its phase. It is customary to choose:

$c_l = \frac{(-1)^l}{2^l\, l!}\sqrt{\frac{(2l+1)!}{4\pi}}$   (14)

We shall see later that, with this convention, $Y_l^0(\theta)$ (which is independent of $\varphi$) has a real positive value for $\theta = 0$.
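As a quick consistency check (a sketch, not part of the text; it assumes NumPy and SciPy are available), one can verify numerically that the constant (13)-(14) does normalize $Y_l^l = c_l(\sin\theta)^l e^{il\varphi}$:

```python
import numpy as np
from math import factorial, pi
from scipy.integrate import quad

# Check of the normalization constant c_l of Eqs. (13)-(14):
# |c_l|^2 * 2*pi * ∫_0^pi (sin theta)^{2l+1} d theta should equal 1 for every l.
for l in range(6):
    c_l = (-1)**l / (2**l * factorial(l)) * np.sqrt(factorial(2*l + 1) / (4*pi))
    I_l, _ = quad(lambda t: np.sin(t)**(2*l + 1), 0.0, pi)
    print(l, 2*pi * c_l**2 * I_l)    # prints values close to 1.0
```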

1-b. General expression for $Y_l^m(\theta,\varphi)$

We shall obtain the other spherical harmonics $Y_l^m(\theta,\varphi)$ by successive application of the operator $L_-$ to the $Y_l^l(\theta,\varphi)$ we have just determined. First, we shall prove a convenient formula that will enable us to simplify the calculations.

α. The action of $L_\pm$ on a function of the form $e^{im\varphi}F(\theta)$

The action of the operators $L_\pm$ on a function of the form $e^{im\varphi}F(\theta)$ (where $m$ is any integer) is given by:

$L_\pm\left[e^{im\varphi}F(\theta)\right] = \mp\hbar\; e^{i(m\pm1)\varphi}\,(\sin\theta)^{1\pm m}\,\frac{d}{d(\cos\theta)}\left[(\sin\theta)^{\mp m}\, F(\theta)\right]$   (15)

More generally:

$(L_\pm)^p\left[e^{im\varphi}F(\theta)\right] = (\mp\hbar)^p\; e^{i(m\pm p)\varphi}\,(\sin\theta)^{p\pm m}\,\frac{d^{\,p}}{d(\cos\theta)^{p}}\left[(\sin\theta)^{\mp m}\, F(\theta)\right]$   (16)

First, let us prove formula (15). We know that:

$\frac{d}{d(\cos\theta)} = \frac{d\theta}{d(\cos\theta)}\,\frac{d}{d\theta} = -\frac{1}{\sin\theta}\,\frac{d}{d\theta}$   (17)

and therefore:

$(\sin\theta)^{1\pm m}\,\frac{d}{d(\cos\theta)}\left[(\sin\theta)^{\mp m}F(\theta)\right] = -(\sin\theta)^{\pm m}\,\frac{d}{d\theta}\left[(\sin\theta)^{\mp m}F(\theta)\right] = \pm m\cot\theta\; F(\theta) - \frac{dF(\theta)}{d\theta}$   (18)

Consequently:

$\mp\hbar\, e^{i(m\pm1)\varphi}\,(\sin\theta)^{1\pm m}\frac{d}{d(\cos\theta)}\left[(\sin\theta)^{\mp m}F(\theta)\right] = \hbar\, e^{\pm i\varphi}\left[\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right] e^{im\varphi}F(\theta)$   (19)

We recognize expression (1) for the operators $L_+$ and $L_-$; relation (19) is therefore identical to (15). Now, to establish formula (16), we can reason by recurrence since, for $p = 1$, (16) reduces to (15), which we have just proved. Let us therefore assume that relation (16) is true for $(p-1)$:

$(L_\pm)^{p-1}\left[e^{im\varphi}F(\theta)\right] = (\mp\hbar)^{p-1}\, e^{i[m\pm(p-1)]\varphi}\,(\sin\theta)^{p-1\pm m}\,\frac{d^{\,p-1}}{d(\cos\theta)^{p-1}}\left[(\sin\theta)^{\mp m}F(\theta)\right]$   (20)

and let us show that it is then also valid for $p$. To do so, we apply $L_\pm$ to both sides of (20); for the right-hand side, we can use formula (15), making the substitutions:

$m \;\Longrightarrow\; m\pm(p-1)$
$F(\theta) \;\Longrightarrow\; (\sin\theta)^{p-1\pm m}\,\dfrac{d^{\,p-1}}{d(\cos\theta)^{p-1}}\left[(\sin\theta)^{\mp m}F(\theta)\right]$   (21)

We then obtain:

$(L_\pm)^{p}\left[e^{im\varphi}F(\theta)\right] = (\mp\hbar)^{p}\, e^{i(m\pm p)\varphi}\,(\sin\theta)^{p\pm m}\,\frac{d}{d(\cos\theta)}\left\{(\sin\theta)^{\mp m-(p-1)}\,(\sin\theta)^{p-1\pm m}\,\frac{d^{\,p-1}}{d(\cos\theta)^{p-1}}\left[(\sin\theta)^{\mp m}F(\theta)\right]\right\}$
$\qquad\qquad\qquad = (\mp\hbar)^{p}\, e^{i(m\pm p)\varphi}\,(\sin\theta)^{p\pm m}\,\frac{d^{\,p}}{d(\cos\theta)^{p}}\left[(\sin\theta)^{\mp m}F(\theta)\right]$   (22)

Formula (16) is therefore proven by recurrence.
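Formula (15) can also be checked symbolically. The sketch below (an illustration under the assumption that SymPy is available; the value m = 2 is an arbitrary choice and F is left as an unspecified function of θ) compares the direct action of $L_-$ with the right-hand side of (15):

```python
import sympy as sp

theta, phi, hbar = sp.symbols('theta phi hbar', positive=True)
m = 2                                    # any integer works here
F = sp.Function('F')(theta)
psi = sp.exp(sp.I*m*phi) * F             # function of the form e^{i m phi} F(theta)

# L_- = hbar e^{-i phi} ( -d/d theta + i cot(theta) d/d phi ), acting on psi:
lhs = hbar*sp.exp(-sp.I*phi)*(-sp.diff(psi, theta)
                              + sp.I*sp.cot(theta)*sp.diff(psi, phi))

# Right-hand side of (15), lower sign:
#   + hbar e^{i(m-1) phi} (sin theta)^{1-m} d/d(cos theta)[ (sin theta)^m F ]
# with d/d(cos theta) = -(1/sin theta) d/d theta, Eq. (17).
inner = sp.sin(theta)**m * F
rhs = hbar*sp.exp(sp.I*(m-1)*phi)*sp.sin(theta)**(1 - m) \
      * (-1/sp.sin(theta))*sp.diff(inner, theta)

print(sp.simplify(lhs - rhs))            # expected output: 0
```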

β. Calculation of $Y_l^m(\theta,\varphi)$ from $Y_l^l(\theta,\varphi)$

As we have already indicated (Chapter VI, § D-1-c), the spherical harmonics $Y_l^m(\theta,\varphi)$ must satisfy:

$L_\pm\, Y_l^m(\theta,\varphi) = \hbar\sqrt{l(l+1) - m(m\pm1)}\;\, Y_l^{m\pm1}(\theta,\varphi) = \hbar\sqrt{(l\mp m)(l\pm m+1)}\;\, Y_l^{m\pm1}(\theta,\varphi)$   (23)

These relations automatically insure that $Y_l^{m\pm1}$ is normalized if $Y_l^m$ is. Also, they fix the relative phases of spherical harmonics corresponding to the same value of $l$ and different values of $m$. In particular, we can calculate $Y_l^m(\theta,\varphi)$ from $Y_l^l(\theta,\varphi)$ by using the operator $L_-$ given by (1) and formula (23). Thus, we shall obtain directly a normalized function $Y_l^m(\theta,\varphi)$ whose phase will be determined by the convention used for $Y_l^l(\theta,\varphi)$ [formula (14)]. To go from $Y_l^l(\theta,\varphi)$ to $Y_l^m(\theta,\varphi)$, we must apply $(l-m)$ times the operator $L_-$; according to (23), we thus obtain:

$(L_-)^{l-m}\, Y_l^l(\theta,\varphi) = \hbar^{\,l-m}\sqrt{(2l)(1)}\,\sqrt{(2l-1)(2)}\cdots\sqrt{(l+m+1)(l-m)}\;\; Y_l^m(\theta,\varphi)$   (24)

that is:

$Y_l^m(\theta,\varphi) = \sqrt{\frac{(l+m)!}{(2l)!\,(l-m)!}}\;\left(\frac{L_-}{\hbar}\right)^{l-m} Y_l^l(\theta,\varphi)$   (25)

Finally, using expression (4) for $Y_l^l$ [where the coefficient $c_l$ is given by (14)] and formula (16) (with $m = l$ and $p = l-m$), we can write (25) explicitly in the form:

$Y_l^m(\theta,\varphi) = \frac{(-1)^l}{2^l\, l!}\sqrt{\frac{2l+1}{4\pi}\,\frac{(l+m)!}{(l-m)!}}\;\, e^{im\varphi}\,(\sin\theta)^{-m}\,\frac{d^{\,l-m}}{d(\cos\theta)^{l-m}}\,(\sin\theta)^{2l}$   (26)

γ. Calculation of $Y_l^m(\theta,\varphi)$ from $Y_l^{-l}(\theta,\varphi)$

In order to obtain expression (26), we started with the result of § 1-a. It is, of course, just as easy to calculate $Y_l^{-l}(\theta,\varphi)$ first and then use the operator $L_+$. The expression thus obtained for $Y_l^m$ is different from (26), although the two are completely equivalent.

Let us therefore calculate $Y_l^{-l}(\theta,\varphi)$ from (26)¹. Since:

$(\sin\theta)^{2l} = (1 - \cos^2\theta)^l$   (27)

is a polynomial of degree $2l$ in $\cos\theta$, only its highest-order term contributes to $Y_l^{-l}(\theta,\varphi)$:

$\frac{d^{\,2l}}{d(\cos\theta)^{2l}}\,(\sin\theta)^{2l} = (-1)^l\,(2l)!$   (28)

We therefore immediately find that:

$Y_l^{-l}(\theta,\varphi) = \frac{1}{2^l\, l!}\sqrt{\frac{(2l+1)!}{4\pi}}\;\, e^{-il\varphi}\,(\sin\theta)^l$   (29)

$Y_l^m(\theta,\varphi)$ can then be obtained by applying $(l+m)$ times the operator $L_+$ to $Y_l^{-l}$. Using (23) and (16), we finally arrive at:

$Y_l^m(\theta,\varphi) = \frac{(-1)^{l+m}}{2^l\, l!}\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;\, e^{im\varphi}\,(\sin\theta)^{m}\,\frac{d^{\,l+m}}{d(\cos\theta)^{l+m}}\,(\sin\theta)^{2l}$   (30)

1-c. Explicit expressions for $l$ = 0, 1 and 2

General formulas (26) and (30) yield the expressions of the spherical harmonics for the first values of $l$:

$Y_0^0 = \frac{1}{\sqrt{4\pi}}$   (31)

$Y_1^{\pm1}(\theta,\varphi) = \mp\sqrt{\frac{3}{8\pi}}\,\sin\theta\; e^{\pm i\varphi}$
$Y_1^{0}(\theta,\varphi) = \sqrt{\frac{3}{4\pi}}\,\cos\theta$   (32)

$Y_2^{\pm2}(\theta,\varphi) = \sqrt{\frac{15}{32\pi}}\,\sin^2\theta\; e^{\pm 2i\varphi}$
$Y_2^{\pm1}(\theta,\varphi) = \mp\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\; e^{\pm i\varphi}$   (33)
$Y_2^{0}(\theta,\varphi) = \sqrt{\frac{5}{16\pi}}\,(3\cos^2\theta - 1)$

¹ We could obviously calculate $Y_l^{-l}(\theta,\varphi)$ from the equation $L_-\, Y_l^{-l}(\theta,\varphi) = 0$. However, its phase would then remain arbitrary. By using (26), we determine $Y_l^{-l}(\theta,\varphi)$ completely, and its phase is a consequence of the convention chosen in § 1-a.
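For readers who want to check these expressions against a library implementation, the sketch below (not from the text) compares (31)-(33) with SciPy's spherical harmonics, which use the same Condon-Shortley phase convention; the angles chosen are arbitrary:

```python
import numpy as np
from scipy.special import sph_harm

# SciPy's argument convention is sph_harm(m, l, azimuthal, polar).
theta, phi = 0.7, 1.3            # arbitrary polar and azimuthal angles

explicit = {
    (0, 0):  1/np.sqrt(4*np.pi),
    (1, 0):  np.sqrt(3/(4*np.pi))*np.cos(theta),
    (1, 1): -np.sqrt(3/(8*np.pi))*np.sin(theta)*np.exp(1j*phi),
    (2, 0):  np.sqrt(5/(16*np.pi))*(3*np.cos(theta)**2 - 1),
    (2, 1): -np.sqrt(15/(8*np.pi))*np.sin(theta)*np.cos(theta)*np.exp(1j*phi),
    (2, 2):  np.sqrt(15/(32*np.pi))*np.sin(theta)**2*np.exp(2j*phi),
}
for (l, m), value in explicit.items():
    print(l, m, np.allclose(value, sph_harm(m, l, phi, theta)))   # all True
```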



2. Properties of spherical harmonics

2-a. Recurrence relations

By their very construction, the spherical harmonics satisfy relations (23); that is, using (1):

$e^{\pm i\varphi}\left[\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right] Y_l^m(\theta,\varphi) = \sqrt{l(l+1) - m(m\pm1)}\;\; Y_l^{m\pm1}(\theta,\varphi)$   (34)

Also note the following formula, which is often useful:

$\cos\theta\;\, Y_l^m(\theta,\varphi) = \sqrt{\frac{(l+m+1)(l-m+1)}{(2l+1)(2l+3)}}\;\, Y_{l+1}^{m}(\theta,\varphi) + \sqrt{\frac{(l+m)(l-m)}{(2l+1)(2l-1)}}\;\, Y_{l-1}^{m}(\theta,\varphi)$   (35)

Here is an outline of its proof. According to (25):

$\cos\theta\;\, Y_l^m = \sqrt{\frac{(l+m)!}{(2l)!\,(l-m)!}}\;\cos\theta\;\left(\frac{L_-}{\hbar}\right)^{l-m} Y_l^l(\theta,\varphi)$   (36)

Now, using expression (1) for $L_-/\hbar$, it is easy to verify that:

$\left[L_-\,,\;\cos\theta\right] = \hbar\; e^{-i\varphi}\sin\theta$   (37)

and:

$\left[L_-\,,\; e^{-i\varphi}\sin\theta\right] = 0$   (38)

Using a recurrence argument, we can then calculate the commutator of $(L_-/\hbar)^{k}$ and $\cos\theta$: if we assume that:

$\left[\left(\frac{L_-}{\hbar}\right)^{k-1},\;\cos\theta\right] = (k-1)\; e^{-i\varphi}\sin\theta\,\left(\frac{L_-}{\hbar}\right)^{k-2}$   (39)

we obtain:

$\left[\left(\frac{L_-}{\hbar}\right)^{k},\;\cos\theta\right] = \frac{L_-}{\hbar}\left[\left(\frac{L_-}{\hbar}\right)^{k-1},\;\cos\theta\right] + \left[\frac{L_-}{\hbar}\,,\;\cos\theta\right]\left(\frac{L_-}{\hbar}\right)^{k-1}$
$\qquad\qquad\qquad = (k-1)\,\frac{L_-}{\hbar}\; e^{-i\varphi}\sin\theta\left(\frac{L_-}{\hbar}\right)^{k-2} + e^{-i\varphi}\sin\theta\left(\frac{L_-}{\hbar}\right)^{k-1}$   (40)

that is, taking (38) into account:

$\left[\left(\frac{L_-}{\hbar}\right)^{k},\;\cos\theta\right] = k\; e^{-i\varphi}\sin\theta\,\left(\frac{L_-}{\hbar}\right)^{k-1}$   (41)

This relation has therefore been established by recurrence. We can use it to write (36) in the form:

$\cos\theta\;\, Y_l^m = \sqrt{\frac{(l+m)!}{(2l)!\,(l-m)!}}\left\{\left(\frac{L_-}{\hbar}\right)^{l-m}\!\cos\theta \;-\; (l-m)\; e^{-i\varphi}\sin\theta\left(\frac{L_-}{\hbar}\right)^{l-m-1}\right\} Y_l^l(\theta,\varphi)$   (42)

Using (4) and (14), we can easily show that:

$e^{-i\varphi}\sin\theta\;\, Y_l^l = -\sqrt{\frac{2l+1}{2l}}\;(1 - \cos^2\theta)\;\, Y_{l-1}^{l-1}$   (43)

If we then calculate the explicit expressions for $Y_{l+1}^{l}$ and $Y_{l+1}^{l-1}$ from the general expression (26), we find that:

$\cos\theta\;\, Y_l^l = \frac{1}{\sqrt{2l+3}}\;\, Y_{l+1}^{l}$   (44a)

$(1 - \cos^2\theta)\;\, Y_{l-1}^{l-1} = -\frac{2}{2l+1}\sqrt{\frac{l}{2l+3}}\;\, Y_{l+1}^{l-1} + \frac{2l}{2l+1}\;\, Y_{l-1}^{l-1}$   (44b)

Substituting relations (43) and (44) into (42) and using (23), we obtain (35).
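Relation (35) is easily tested numerically; the following sketch (assuming SciPy's `sph_harm`; the (l, m) pairs and angles are arbitrary) evaluates both sides at random directions:

```python
import numpy as np
from scipy.special import sph_harm

# Numerical check of the recurrence relation (35):
# cos(theta) Y_l^m = A(l,m) Y_{l+1}^m + B(l,m) Y_{l-1}^m.
rng = np.random.default_rng(0)
for l, m in [(1, 0), (2, 1), (3, -2), (5, 4)]:
    theta = rng.uniform(0.1, np.pi - 0.1)
    phi = rng.uniform(0.0, 2*np.pi)
    A = np.sqrt((l + m + 1)*(l - m + 1)/((2*l + 1)*(2*l + 3)))
    B = np.sqrt((l + m)*(l - m)/((2*l + 1)*(2*l - 1)))
    lhs = np.cos(theta)*sph_harm(m, l, phi, theta)
    rhs = A*sph_harm(m, l + 1, phi, theta) + B*sph_harm(m, l - 1, phi, theta)
    print(l, m, np.allclose(lhs, rhs))   # all True
```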

2-b. Orthonormalization and closure relations

Because of the way we constructed them, the spherical harmonics constitute a set of normalized functions; they are also orthogonal, since they are eigenfunctions of the Hermitian operators $\mathbf{L}^2$ and $L_z$ with different eigenvalues. The corresponding orthonormalization relation is:

$\int_0^{2\pi} d\varphi \int_0^{\pi}\sin\theta\, d\theta\;\left[Y_l^m(\theta,\varphi)\right]^*\, Y_{l'}^{m'}(\theta,\varphi) = \delta_{ll'}\,\delta_{mm'}$   (45)

It can be shown (here, we shall simply assume it) that any square-integrable function of $\theta$ and $\varphi$ can be expanded in one and only one way on the spherical harmonics:

$f(\theta,\varphi) = \sum_{l=0}^{\infty}\;\sum_{m=-l}^{+l} c_{l,m}\;\, Y_l^m(\theta,\varphi)$   (46)

with:

$c_{l,m} = \int_0^{2\pi} d\varphi \int_0^{\pi}\sin\theta\, d\theta\;\left[Y_l^m(\theta,\varphi)\right]^*\, f(\theta,\varphi)$   (47)

The set of spherical harmonics therefore constitutes an orthonormal basis of the space of square-integrable functions of $\theta$ and $\varphi$. This can be expressed by the closure relation:

$\sum_{l=0}^{\infty}\;\sum_{m=-l}^{+l} Y_l^m(\theta,\varphi)\left[Y_l^m(\theta',\varphi')\right]^* = \delta(\cos\theta - \cos\theta')\,\delta(\varphi - \varphi')$   (48)
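The orthonormalization relation (45) can be verified by brute-force quadrature; the sketch below (an illustration assuming SciPy; only the real part of the integrand is integrated, which is sufficient here since the exact overlaps are real) computes a few of the overlaps:

```python
import numpy as np
from scipy.special import sph_harm
from scipy.integrate import dblquad

def overlap(l, m, lp, mp):
    # ∫ dphi ∫ sin(theta) dtheta [Y_l^m]* Y_lp^mp
    integrand = lambda theta, phi: np.real(
        np.conj(sph_harm(m, l, phi, theta)) * sph_harm(mp, lp, phi, theta) * np.sin(theta))
    val, _ = dblquad(integrand, 0.0, 2*np.pi, 0.0, np.pi)
    return val

print(overlap(2, 1, 2, 1))   # close to 1
print(overlap(2, 1, 3, 1))   # close to 0
print(overlap(1, 0, 1, 1))   # close to 0
```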

2-c. Parity

The parity operation on a function defined in ordinary space (cf. Complement FII) consists of replacing in this function the coordinates of any point in space by those of the point symmetric to it with respect to the origin of the reference frame:

$\mathbf{r} \;\Longrightarrow\; -\mathbf{r}$   (49)

In spherical coordinates, this operation is expressed by the substitutions (Fig. 2 of Chapter VI):

$r \;\Longrightarrow\; r$
$\theta \;\Longrightarrow\; \pi - \theta$   (50)
$\varphi \;\Longrightarrow\; \pi + \varphi$

Consequently, if we are using a standard basis for the wave function space of a spinless particle (§ D-1-d of Chapter VI), the radial part of the basis functions $\varphi_{k,l,m}(\mathbf{r}) = R_{k,l}(r)\, Y_l^m(\theta,\varphi)$ is unchanged by the parity operation. The only transformation is that of the spherical harmonics, which we shall now describe.

First, note that under the substitutions (50):

$\sin\theta \;\Longrightarrow\; \sin\theta$
$\cos\theta \;\Longrightarrow\; -\cos\theta$   (51)
$e^{im\varphi} \;\Longrightarrow\; (-1)^m\, e^{im\varphi}$

Under these conditions, the function $Y_l^l(\theta,\varphi)$ which we calculated in § 1-a is transformed into:

$Y_l^l(\pi-\theta,\,\pi+\varphi) = (-1)^l\; Y_l^l(\theta,\varphi)$   (52)

Moreover:

$\frac{\partial}{\partial\theta} \;\Longrightarrow\; -\frac{\partial}{\partial\theta}\;;\qquad \frac{\partial}{\partial\varphi} \;\Longrightarrow\; \frac{\partial}{\partial\varphi}$   (53)

Relations (51) and (53) show that the operators $L_+$ and $L_-$ [formulas (1)] remain unchanged [which means that $L_+$ and $L_-$ are even operators, in the sense defined in Complement FII (§ 2-a)]. Consequently, according to result (52) and formula (25), which enables us to calculate $Y_l^m(\theta,\varphi)$:

$Y_l^m(\pi-\theta,\,\pi+\varphi) = (-1)^l\; Y_l^m(\theta,\varphi)$   (54)

The spherical harmonics are therefore functions whose parity is well defined and independent of $m$: they are even for $l$ even and odd for $l$ odd.

2-d. Complex conjugation

Because of their $\varphi$-dependence, the spherical harmonics are complex-valued functions. It can be seen directly, by comparing (26) and (30), that:

$\left[Y_l^m(\theta,\varphi)\right]^* = (-1)^m\; Y_l^{-m}(\theta,\varphi)$   (55)
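Properties (54) and (55) can both be spot-checked numerically, as in the following sketch (assuming SciPy's `sph_harm`; the angles are arbitrary):

```python
import numpy as np
from scipy.special import sph_harm

theta, phi = 0.9, 2.1                     # arbitrary angles
for l in range(4):
    for m in range(-l, l + 1):
        y = sph_harm(m, l, phi, theta)
        parity = sph_harm(m, l, np.pi + phi, np.pi - theta)   # Y_l^m(pi - theta, pi + phi)
        conj = sph_harm(-m, l, phi, theta)
        assert np.allclose(parity, (-1)**l * y)               # Eq. (54)
        assert np.allclose(np.conj(y), (-1)**m * conj)        # Eq. (55)
print("parity and conjugation properties verified")
```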

2-e. Relation between the spherical harmonics and the Legendre polynomials and associated Legendre functions

The $\theta$-dependence of the spherical harmonics resides in functions known as Legendre polynomials and associated Legendre functions. We shall neither prove nor even enumerate all the properties of these functions, but shall simply indicate their relation with the spherical harmonics.

α. $Y_l^0(\theta)$ is proportional to a Legendre polynomial

For $m = 0$, formulas (26) and (30) yield:

$Y_l^0(\theta) = \frac{(-1)^l}{2^l\, l!}\sqrt{\frac{2l+1}{4\pi}}\;\frac{d^{\,l}}{d(\cos\theta)^{l}}\,(\sin\theta)^{2l}$   (56)

which can be written in the form:

$Y_l^0(\theta) = \sqrt{\frac{2l+1}{4\pi}}\; P_l(\cos\theta)$   (57)

setting:

$P_l(u) = \frac{(-1)^l}{2^l\, l!}\,\frac{d^{\,l}}{du^{l}}\,(1-u^2)^l$   (58)

According to its definition (58), $P_l(u)$ is an $l$-th degree polynomial in $u$, of parity² $(-1)^l$:

$P_l(-u) = (-1)^l\, P_l(u)$   (59)

$P_l(u)$ is the $l$-th order Legendre polynomial. It is easy to show that it has $l$ zeros in the interval $[-1, +1]$, and that the numerical coefficient in (58) insures that:

$P_l(1) = 1$   (60)

It can also be proven that the Legendre polynomials form a set of orthogonal functions:

$\int_{-1}^{+1} du\; P_l(u)\, P_{l'}(u) = \int_0^{\pi}\sin\theta\, d\theta\; P_l(\cos\theta)\, P_{l'}(\cos\theta) = \frac{2}{2l+1}\,\delta_{ll'}$   (61)

on which functions of $\theta$ alone can be expanded:

$f(\theta) = \sum_{l=0}^{\infty} d_l\; P_l(\cos\theta)$   (62)

with:

$d_l = \frac{2l+1}{2}\int_0^{\pi}\sin\theta\, d\theta\; P_l(\cos\theta)\, f(\theta)$   (63)

Comment:

According to (57) and (60):

$Y_l^0(0) = \sqrt{\frac{2l+1}{4\pi}}$   (64)

As we pointed out in § 1-a, the phase convention chosen for $Y_l^l(\theta,\varphi)$ gives a real positive value to $Y_l^0(0)$.

² Parity with respect to the variable $u$. Note, however, that the parity operation in space [formulas (50)] amounts to changing $\cos\theta$ into $-\cos\theta$; property (59) can be expressed by:

$Y_l^0(\pi - \theta) = (-1)^l\; Y_l^0(\theta)$

which is a special case of (54).
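The normalization (60) and the orthogonality relation (61) can be confirmed with SciPy's Legendre routines; the sketch below (the chosen values of l are arbitrary) uses `eval_legendre` and a numerical quadrature:

```python
import numpy as np
from scipy.special import eval_legendre
from scipy.integrate import quad

for l in range(6):
    assert np.isclose(eval_legendre(l, 1.0), 1.0)                        # Eq. (60)

for l, lp in [(2, 2), (2, 4), (3, 3), (1, 5)]:
    val, _ = quad(lambda u: eval_legendre(l, u)*eval_legendre(lp, u), -1.0, 1.0)
    expected = 2/(2*l + 1) if l == lp else 0.0                           # Eq. (61)
    assert np.isclose(val, expected, atol=1e-8)
print("Legendre checks passed")
```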

β. $Y_l^m(\theta,\varphi)$ is proportional to an associated Legendre function

For $m$ positive, $Y_l^m(\theta,\varphi)$ can be obtained by applying $(L_+/\hbar)^m$ to $Y_l^0(\theta)$; using (23):

$Y_l^m(\theta,\varphi) = \sqrt{\frac{(l-m)!}{(l+m)!}}\,\left(\frac{L_+}{\hbar}\right)^{m} Y_l^0(\theta)\qquad (m \geq 0)$   (65)

Using formulas (1) and (16), we then find:

$Y_l^m(\theta,\varphi) = (-1)^m\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;\, P_l^m(\cos\theta)\; e^{im\varphi}\qquad (m \geq 0)$   (66)

where $P_l^m$ is an associated Legendre function, defined by:

$P_l^m(u) = \sqrt{(1-u^2)^m}\;\frac{d^{\,m}}{du^{m}}\, P_l(u)\qquad (-1 \leq u \leq +1)$   (67)

$P_l^m(u)$ is the product of $\sqrt{(1-u^2)^m}$ and a polynomial of degree $(l-m)$ and parity $(-1)^{l-m}$; $P_l^0(u)$ is the Legendre polynomial $P_l(u)$. The set of the $P_l^m(u)$ for fixed $m$ constitutes an orthogonal system of functions:

$\int_{-1}^{+1} du\; P_l^m(u)\, P_{l'}^m(u) = \int_0^{\pi}\sin\theta\, d\theta\; P_l^m(\cos\theta)\, P_{l'}^m(\cos\theta) = \frac{2}{2l+1}\,\frac{(l+m)!}{(l-m)!}\;\delta_{ll'}$   (68)

on which functions of $\theta$ alone can be expanded. Formula (66) is valid for $m$ positive (or zero); for negative $m$, it suffices to use relation (55) to obtain:

$Y_l^m(\theta,\varphi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l+m)!}{(l-m)!}}\;\, P_l^{-m}(\cos\theta)\; e^{im\varphi}\qquad (m \leq 0)$   (69)

γ. Spherical harmonic addition theorem

Consider two arbitrary directions in space, u and u′, defined respectively by the polar angles $(\theta,\varphi)$ and $(\theta',\varphi')$, and call $\alpha$ the angle between them. The following relation can be proven:

$\frac{2l+1}{4\pi}\, P_l(\cos\alpha) = \sum_{m=-l}^{+l}(-1)^m\; Y_l^m(\theta,\varphi)\; Y_l^{-m}(\theta',\varphi')$   (70)

(where $P_l$ is the $l$-th order Legendre polynomial). It is known as the “spherical harmonic addition theorem”.

We shall indicate the main steps of an elementary proof of relation (70). First of all, note that, if $\cos\alpha$ is expressed in terms of the polar angles $(\theta,\varphi)$ and $(\theta',\varphi')$, the left-hand side of (70) can be considered to be a function of $\theta$ and $\varphi$; it can therefore be expanded on the spherical harmonics $Y_{l'}^{m'}(\theta,\varphi)$. The coefficients of this expansion, which are, of course, functions of the other two variables $\theta'$ and $\varphi'$, can themselves be expanded on the spherical harmonics $Y_{l''}^{m''}(\theta',\varphi')$. We must therefore have:

$\frac{2l+1}{4\pi}\, P_l(\cos\alpha) = \sum_{l',m'}\;\sum_{l'',m''} c_{l',m';\,l'',m''}\;\; Y_{l'}^{m'}(\theta,\varphi)\; Y_{l''}^{m''}(\theta',\varphi')$   (71)

where the problem is to calculate the coefficients $c_{l',m';\,l'',m''}$. They can be obtained by the following process:

(i) In the first place, these coefficients are different from zero only for:

$l' = l'' = l$   (72)

To show this, first fix the direction u′; $P_l(\cos\alpha)$ then depends only on $\theta$ and $\varphi$. If the $Oz$ axis is chosen along u′, $\cos\alpha = \cos\theta$ and $P_l(\cos\alpha)$ is proportional to $Y_l^0(\theta)$ [cf. relation (57)]. To generalize to the case in which the direction of u′ is arbitrary, we perform a rotation that takes $Oz$ onto this direction: $\cos\alpha$ remains unchanged, as does $P_l(\cos\alpha)$. Since the rotation operators (Complement BVI, § 3-c) commute with $\mathbf{L}^2$, the transform of $Y_l^0(\theta)$ remains an eigenfunction of $\mathbf{L}^2$ with the eigenvalue $l(l+1)\hbar^2$, that is, a linear combination of the spherical harmonics $Y_l^{m'}(\theta,\varphi)$; we therefore have $l' = l$. Similarly, it can be established that $l'' = l$.

(ii) Under a rotation of both directions u and u′ through an angle $\beta$ about $Oz$, the angle $\alpha$ is not changed, and neither are $\theta$ and $\theta'$, while $\varphi$ and $\varphi'$ become $\varphi + \beta$ and $\varphi' + \beta$. The left-hand side of (71) therefore does not change in value, and each term of the right-hand side is multiplied by $e^{i(m'+m'')\beta}$. Consequently, the only non-zero coefficients in the sum of the right-hand side are those which satisfy:

$m' + m'' = 0$   (73)

(iii) Combining results (72) and (73), we see that formula (71) can be written in the form:

$\frac{2l+1}{4\pi}\, P_l(\cos\alpha) = \sum_{m=-l}^{+l} c_m\,(-1)^m\; Y_l^m(\theta,\varphi)\; Y_l^{-m}(\theta',\varphi')$   (74)

If we set $\theta' = \theta$ and $\varphi' = \varphi$ (that is, $\alpha = 0$), we obtain, according to (60):

$\frac{2l+1}{4\pi} = \sum_{m=-l}^{+l} c_m\,(-1)^m\; Y_l^m(\theta,\varphi)\; Y_l^{-m}(\theta,\varphi)$   (75)

Since $(-1)^m\, Y_l^{-m}$ is simply $[Y_l^m]^*$, the integration of (75) with respect to $d\Omega = \sin\theta\, d\theta\, d\varphi$ yields, with the orthonormalization relation (45):

$2l+1 = \sum_{m=-l}^{+l} c_m$   (76)

We now take the square of the modulus of both sides of (74) and integrate over $d\Omega$ and $d\Omega'$. Using relation (45), it is easy to see that the right-hand side yields $\sum_{m=-l}^{+l} c_m^2$. As far as the left-hand side is concerned, we can again take advantage of the invariance of the angle $\alpha$ under a rotation in order to show that $\int d\Omega\; |P_l(\cos\alpha)|^2$ is actually independent of $(\theta',\varphi')$. If we then choose $Oz$ along u′ to evaluate this integral, we find, according to relation (61):

$\int d\Omega\; |P_l(\cos\alpha)|^2 = \int d\Omega\; |P_l(\cos\theta)|^2 = 2\pi\,\frac{2}{2l+1}$   (77)

Integrating over $d\Omega'$, we find a second relation between the coefficients $c_m$:

$2l+1 = \sum_{m=-l}^{+l} c_m^2$   (78)

(iv) Equations (76) and (78) suffice for the determination of the $(2l+1)$ coefficients $c_m$: they are all equal to 1. To show this, consider, in a normed $(2l+1)$-dimensional vector space, the vector X of components $x_m = c_m$ ($m = -l, \ldots, +l$) and the vector Y of components $y_m = 1$ ($m = -l, \ldots, +l$). The Schwarz inequality indicates that:

$|\mathrm{X}\cdot\mathrm{Y}|^2 \leq (\mathrm{X}\cdot\mathrm{X})\,(\mathrm{Y}\cdot\mathrm{Y})$   (79)

where there is equality if and only if X and Y are proportional. Now, (76) and (78) show that this is indeed the case here, since $\mathrm{X}\cdot\mathrm{Y} = \sum_m c_m = 2l+1$, $\mathrm{X}\cdot\mathrm{X} = \sum_m c_m^2 = 2l+1$ and $\mathrm{Y}\cdot\mathrm{Y} = 2l+1$. The $c_m$ are therefore independent of $m$, and we have, necessarily, $c_m = 1$. This concludes the proof of formula (70).
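The addition theorem (70) lends itself to a direct numerical test; in the sketch below (assuming SciPy; the two directions are drawn at random), $\cos\alpha$ is computed from the polar angles and both sides of (70) are compared for several values of l:

```python
import numpy as np
from scipy.special import sph_harm, eval_legendre

rng = np.random.default_rng(1)
theta, phi = rng.uniform(0, np.pi), rng.uniform(0, 2*np.pi)
thetap, phip = rng.uniform(0, np.pi), rng.uniform(0, 2*np.pi)
# cos(alpha) between the directions (theta, phi) and (theta', phi'):
cos_alpha = (np.sin(theta)*np.sin(thetap)*np.cos(phi - phip)
             + np.cos(theta)*np.cos(thetap))

for l in range(6):
    lhs = (2*l + 1)/(4*np.pi) * eval_legendre(l, cos_alpha)
    rhs = sum((-1)**m * sph_harm(m, l, phi, theta) * sph_harm(-m, l, phip, thetap)
              for m in range(-l, l + 1))
    print(l, np.allclose(lhs, rhs))   # all True
```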

References Messiah (1.17), App. B, § IV; Arfken (10.4), Chap. 12; Edmonds (2.21), Table 1; Butkov (10.8), Chap. 9, §§ 5 and 8; Whittaker and Watson (10.12), Chap. XV; Bateman (10.39), Chap. III; Bass (10.1), vol. I, § 17-7.





Complement BVI Angular momentum and rotations

1. Introduction
2. Brief study of geometrical rotations R
   2-a. Definition. Parametrization
   2-b. Infinitesimal rotations
3. Rotation operators in state space. Example: a spinless particle
   3-a. Existence and definition of rotation operators
   3-b. Properties of rotation operators
   3-c. Expressing rotation operators in terms of angular momentum observables
4. Rotation operators in the state space of an arbitrary system
   4-a. System of several spinless particles
   4-b. An arbitrary system
5. Rotation of observables
   5-a. General transformation law
   5-b. Scalar observables
   5-c. Vector observables
6. Rotation invariance
   6-a. Invariance of physical laws
   6-b. Consequence: conservation of angular momentum
   6-c. Applications

1. Introduction

We indicated in Chapter VI (§ B-2) that the commutation relations between the components of an angular momentum are actually the expression of the geometrical properties of rotations in ordinary three-dimensional space. This is what we intend to show in this complement, where we investigate the relation between rotations and angular momentum operators.

Consider a physical system (S) whose quantum mechanical state, at a given time, is characterized by the ket $|\psi\rangle$ of the state space $\mathscr{E}$. We perform a rotation R on this system; in its new position, the state of the system is described by a ket $|\psi'\rangle$ which is different from $|\psi\rangle$. Given the geometrical transformation R, the problem is to determine $|\psi'\rangle$ from $|\psi\rangle$. We shall see that it has the following solution: with every geometrical rotation R can be associated a linear operator $\hat{R}$ acting in the state space $\mathscr{E}$ such that:

$|\psi'\rangle = \hat{R}\,|\psi\rangle$   (1)




Let us immediately stress the necessity of distinguishing between the geometrical rotation R, which operates in ordinary space, and its “image” $\hat{R}$, which acts in the state space:

$\mathrm{R} \;\Longrightarrow\; \hat{R}$   (2)

We shall begin (§ 2) by reviewing the principal properties of geometrical rotations R. We shall not embark upon a detailed study of them; rather, we shall simply note some results which will be useful to us later. Then, in § 3, we shall use the example of a spinless particle to define the rotation operators $\hat{R}$ precisely, to study their most important properties, and to determine their relation to the angular momentum operators L. We shall then be able to interpret the commutation relations amongst the components of the angular momentum L as the image, in the space $\mathscr{E}_{\mathbf{r}}$, of purely geometrical characteristics of rotations R. We shall then generalize (§ 4) these concepts to arbitrary quantum mechanical systems. In § 5, we shall examine the behavior of the observables describing the physical quantities measurable in this system, upon rotation of the system. This will lead us to classify observables according to how they transform under a rotation (scalar, vector, tensor observables). Finally, in § 6, we shall briefly consider the problem of rotation invariance and indicate some important consequences of this invariance.

2. Brief study of geometrical rotations R

2-a. Definition. Parametrization

A rotation R is a one-to-one transformation of three-dimensional space that conserves a point of this space, the angles and the distances, as well as the handedness of the reference frames¹. We shall be concerned here with the set of rotations that conserve a given point O, which we shall choose as the origin of the reference frame. A rotation can then be characterized by the axis of rotation (given by its unit vector u or its polar angles $\theta$ and $\varphi$) and the angle of rotation $\alpha$ ($0 \leq \alpha < 2\pi$). To determine a rotation, three parameters are required; they can be chosen to be the components of the vector:

$\boldsymbol{\alpha} = \alpha\, \mathbf{u}$   (3)

whose absolute value is equal to the angle of rotation and whose direction defines the axis of rotation. Note that a rotation can also be characterized by three angles, called Euler angles. We shall denote by $\mathrm{R}_{\mathbf{u}}(\alpha)$ the geometrical rotation through an angle $\alpha$ about the axis defined by the unit vector u. The set of rotations R constitutes a group: the product of two rotations (that is, the transformation resulting from the successive application of these two rotations) is also a rotation; there exists an identity rotation (rotation through a zero angle about an arbitrary axis); for every rotation $\mathrm{R}_{\mathbf{u}}(\alpha)$ there is an inverse rotation, $\mathrm{R}_{\mathbf{u}}(-\alpha)$. The group of rotations is not commutative: in general, the product of two rotations depends on the order in which they are performed²:

$\mathrm{R}_{\mathbf{u}}(\alpha)\,\mathrm{R}_{\mathbf{u}'}(\alpha') \neq \mathrm{R}_{\mathbf{u}'}(\alpha')\,\mathrm{R}_{\mathbf{u}}(\alpha)$   (4)

¹ This last property is imposed in order to exclude reflections with respect to a point or a plane.
² When one writes R₂R₁, this means that rotation R₁ must be performed first, R₂ being applied subsequently to the result obtained.

ANGULAR MOMENTUM AND ROTATIONS

Recall, however, that two rotations performed about the same axis always commute: Ru ( ) Ru ( ) = Ru ( ) Ru ( ) = Ru ( + (if necessary, 2 is subtracted from 2-b.

+

)

(5)

, to keep it within the interval [0 2 ]).

Infinitesimal rotations

An infinitesimal rotation is defined as a rotation that is infinitesimally close to the identity rotation, that is, a rotation Ru (d ) through an infinitesimal angle d about an arbitrary axis u. It is easy to see that the transform of a vector OM under the infinitesimal rotation Ru (d ) can be written, to first order in d : Ru (d ) OM = OM + d u

(6)

OM

Every finite rotation can be decomposed into an infinite number of infinitesimal rotations, since the angle of rotation can vary continuously and since, according to (5): Ru ( + d ) = Ru ( ) Ru (d ) = Ru (d ) Ru ( )

(7)

where Ru (d ) is an infinitesimal rotation. Thus, the study of the rotation group can be reduced to an examination of infinitesimal rotations3 . Before ending this rapid survey of the properties of geometrical rotations, we note the following relation which will be useful to us later: Re ( d ) Re (d ) Re (d ) Re ( d ) = Re (d d )

(8)

where e , e and e denote the unit vectors of the three coordinate axes , and respectively. If d and d are first-order infinitesimal angles, this relation is correct to the second order. It describes, in a special case, the non-commutative structure of the rotation group. To prove relation (8), let us apply its left-hand side to an arbitrary vector OM. We use formula (6) to find the vector OM , the transform of OM under the succesive action of the four infinitesimal rotations. It can be seen immediately that if d is zero, the left-hand side of (8) reduces to the product Re ( d )Re (d ), which is equal to the identity rotation [see (5)]; the vector OM OM must therefore be proportional to d . For an analogous reason, it must also be proportional to d . Consequently, the difference OM OM is proportional to d d . Therefore, to calculate OM to second order, we may restrict ourselves to first order in each of the two infinitesimal angles d and d . First of all, according to (6): Re ( d ) OM = OM

d

e

(9)

OM

We must then apply Re (d ) to this vector; this can be done by again using (6): Re (d ) Re ( d ) OM =(OM =OM

d d

e e

OM) + d OM + d

e e

(OM OM

d d d

e e

OM) (e

OM)

(10)

3 However,

limiting ourselves to infinitesimal rotations, we lose sight of a “global” property of the finite rotation group: the fact that a rotation through an angle of 2 is the identity transformation. The rotation operators (see § 3) constructed from infinitesimal operators do not always have this global property. In certain cases (see Complement AIX ), the operator associated with a 2 rotation is not the unit operator but its opposite.

719

COMPLEMENT BVI



The action of Re (d ) on the vector appearing on the right-hand side of (10) results in the addition of the following infinitesimal terms to this vector: d

OM + d d

e

e

(e

(11)

OM)

obtained by the vector multiplication of the right-hand side of (10) by d e , with only the first order terms in d being retained. Therefore: Re (d ) Re (d )Re ( d ) OM = OM + d

e

OM + d d [e

(e

OM)

e

(e

OM)]

Finally, OM is equal to the sum of the vector just obtained and its vector product by To first order in d , this vector product can be written simply: d

e

(12) d e . (13)

OM

which means that: Re ( d ) Re (d ) Re (d ) Re ( d ) OM = OM + d d [e

(e

OM)]

(14)

= Re (d d ) OM

(15)

OM)

e

(e

It is then easy to transform the double vector products; we find: Re ( d ) Re (d ) Re (d ) Re ( d ) OM = OM + d d

e

OM

Since this relation is true for any vector OM, expression (8) is verified.
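A relation of the type of (8) can also be checked numerically with ordinary 3 × 3 rotation matrices. The sketch below is an illustration, not taken from the text: the particular ordering of the four factors is the one verified here, and the small angles are arbitrary. It shows that the group commutator of two infinitesimal rotations about Ox and Oy is, to second order, a rotation about Oz through the product of the two angles:

```python
import numpy as np

def Rx(a):
    return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])
def Ry(b):
    return np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]])
def Rz(c):
    return np.array([[np.cos(c), -np.sin(c), 0], [np.sin(c), np.cos(c), 0], [0, 0, 1]])

da, db = 1e-4, 2e-4
lhs = Ry(-db) @ Rx(da) @ Ry(db) @ Rx(-da)      # group commutator of the two rotations
print(np.allclose(lhs, Rz(da*db)))             # True (agreement to second order)
print(np.max(np.abs(lhs - Rz(da*db))))         # residual of third order in the angles
```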

3.

Rotation operators in state space. Example: a spinless particle

In this section, we consider a physical system composed of a single (spinless) particle in three-dimensional space. 3-a.

Existence and definition of rotation operators

At a given time, the quantum mechanical state of the particle is characterized, in the state space r , by the ket with that is associated the wave function (r) = r . Let us perform a rotation R on this system which associates with the point r0 ( 0 0 0 ) of space the point r0 ( 0 0 0 ) such that: r0 = Rr0

(16)

Let be the state vector of the system after rotation, and (r) = r , the corresponding wave function. It is natural to assume that the value of the initial wave function (r) at the point r0 will be found, after rotation, to be the value of the final wave function (r) at the point r0 given by (16): (r0 ) = (r0 )

(17)

that is: (r0 ) = (R 720

1

r0 )

(18)



ANGULAR MOMENTUM AND ROTATIONS

Since this equation is valid for any point (r0 ) in space, it can be written in the form: 1

(r) = (R

(19)

r)

By definition, the operator in the state space r associated with the geometrical rotation R being treated is the one that acts on the state before rotation to yield the state after the rotation R: =

(20)

is called a “rotation operator”. Relation (19) characterizes its action in the representation: = R

r

1

r

(21)

r

where R 1 r is the basis ket of this representation determined by the components of the vector R 1 r.

Comment: If the state of the particle after rotation were e (where is an arbitrary real number) instead of , its physical properties would not be modified. In other words, relation (17) could be replaced by: ( 0) = e

( 0)

(22)

would obviously be independent of not treat this difficulty here. 3-b.

0,

but could depend on the rotation R. We shall

Properties of rotation operators

.

is a linear operator

This essential property of rotation operators follows from their very definition. If the state before the rotation is a linear superposition of states, for example: =

1

+

1

2

(23)

2

formula (21) indicates that: r

=

1

R

=

1

r

1

r

1

+

+

1

2

2

R

r

1

r

2

(24)

2

Since this relation is true for any ket of the operator: =

[

1

1

+

2

2

]=

1

1

+

2

basis, we deduce that

r

2

is a linear

(25) 721



COMPLEMENT BVI

.

is unitary

In formula (21), the ket bra r is therefore given by: = R

r

1

can be arbitrary. The action of the operator

on the (26)

r

Taking the Hermitian conjugate of both sides of equation (26), we obtain: r = R

1

(27)

r

Moreover, if we recall that the ket r represents a state in which the particle is perfectly localized at the point r, we see that: r = Rr

(28)

This equation simply expresses the fact that if the particle was localized at the point r before the rotation, it will be localized at the point r = Rr after the rotation. To get (28) from (21), we choose a basis state r0 for : r0 = R

r

1

1

r r0 = [(R

r)

r0 ]

(29) 4

where we have used the orthonormalization relation of the r basis. Furthermore : [(R

1

r)

r0 ] = [r

(Rr0 )]

(30)

Substituting (30) into (29), we indeed find that: r

0

(Rr0 )] = r Rr0

= [r

that is, since

is a basis of

r

(31)

r:

r0 = Rr0

(32)

Starting with formulas (27) and (28), it is easy to show that: =

=1

since the action of for example: r =

R

The operator

(33) or

1

on any vector of the

r = RR

1

r = r

r

basis yields the same vector; (34)

is therefore unitary.

Comment:

The operator therefore conserves the scalar product and the norm of vectors that it transforms: = =

=

=

(35)

This property is very important from the physical point of view since the probability amplitudes which yield physical predictions appear in the form of scalar products of two kets. 4 Relation (30) can easily be established by using the definition of delta “functions” and the fact that a rotation conserves the infinitesimal volume element.

722

• .

The set of operators

ANGULAR MOMENTUM AND ROTATIONS

constitutes a representation of the rotation group

We have pointed out (§ 2) that the geometrical rotations form a group; in particular, the product of two rotations R1 and R2 is always a rotation: R2 R1 = R3

(36)

With the three geometrical rotations R1 , R2 and R3 are associated, in the state space r , three rotation operators 1 , 2 and 3 , respectively. If the three geometrical rotations satisfy relation (36), we shall show that the corresponding rotation operators are such that: 2

(

1

=

(37)

3

is a product of operators of r as defined in Chapter II, § B-3-a). Consider a particle whose state is described by an arbitrary ket r of the basis characterizing the r representation. If we perform the rotation R1 on this particle, its state becomes: 2

1

1

r = R1 r

(38)

by definition of 1 . Now, we perform the rotation R2 on the new state we have just obtained; the state of the particle after this second rotation is, according to (38) and the definition of 2 : 2

1

r =

2

R1 r = R2 R1 r

(39)

If we take (36) into account, we see that relation (39) is equivalent to: 2

1

r = R3 r

Now, the operator 3

(40) 3,

associated with the rotation R3 , is such that:

r = R3 r

(41)

Relation (37) is therefore proven, since the ket r under consideration can be chosen in an arbitrary way from the kets of the r basis. To express the important result we have just established, one says that the correspondence R = between geometrical rotations and rotation operators conserves the group law, or that the set of operators constitutes a “representation” of the rotation group. Of course, with the identity rotation, we associate the identity operator in r , 1 and with the rotation R 1 (the inverse of a rotation R), the operator , which is the 1 inverse of the one corresponding to R (we showed in § 3-b- that = ). 3-c. Expression for rotation operators in terms of angular momentum observables

.

Infinitesimal rotation operators

First of all, let us consider an infinitesimal rotation about the axis, Re (d ). If we apply it to a particle whose state is described by the wave function (r), we know 723

COMPLEMENT BVI



from (19) that the wave function rotation satisfies:

(r) associated with the state of the particle after

(r) = [Re 1 (d )r]

(42)

) are the components of r, those of R

But if ( (6):

1

(d )r can easily be calculated from

+ Re (d )r = R 1

e

(d )r = (r

d e

d d

r)

(43)

Equation (42) can then be written in the form: (

)= ( +

d

d

)

(44)

which yields, to first order in d : (

)= (

)+d

= (

)

d

(

)

Within the brackets, we recognize, to within a factor of ~ , the expression in the representation for the operator = . We therefore obtain the result: (r) = r

= r

1

d

(45) r

(46)

~ Now, by definition of the operator =

e

(d )

(47)

Therefore, since the original state

e

(d ) associated with the rotation R (d ):

(d ) = 1

is arbitrary, we finally find that:

d

(48)

~

The preceding argument can easily be generalized to an infinitesimal rotation about an arbitrary axis. We therefore have, in general: u (d

)=1

d L u

(49)

~

Comment:

(46) can also be quickly established by using the spherical coordinates ( since then corresponds to the differential operator ~ . 724

),

• .

ANGULAR MOMENTUM AND ROTATIONS

Interpretation of the commutation relations for the components of the angular momentum L

What then is the “image” in the state space r of relation (8)? According to the results of § 3-b- and the expressions we have just obtained, this relation implies that, to first order with respect to each of the angles d and d : 1+

d

1

d

~

1

d

~

1+

~

d

=1

~

Expanding the left-hand side and setting the coefficients of d d that relation (50) reduces to: [

]

= ~

d d

(50)

~ equal, we easily find

(51)

Of course, the two other commutation relations for the components of L can be found, by an analogous argument, from the formulas obtained from (8) by cyclic permutation of the vectors e , e and e . Thus, the commutation relations of the orbital angular momentum of a particle can be seen to be consequences of the non-commutative structure of the geometrical rotation group. .
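The commutation relation (51) can be illustrated with explicit matrices. In the sketch below (a hypothetical example using the standard l = 1 matrices of L₊, L₋ and L_z in the basis {|1,1⟩, |1,0⟩, |1,−1⟩}, with ħ set to 1), the commutator of L_x and L_y is compared with iħL_z:

```python
import numpy as np

hbar = 1.0
Lz = hbar*np.diag([1.0, 0.0, -1.0])
Lp = hbar*np.sqrt(2)*np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=complex)  # L+
Lm = Lp.conj().T                                                                 # L-
Lx, Ly = (Lp + Lm)/2, (Lp - Lm)/(2j)

comm = Lx @ Ly - Ly @ Lx
print(np.allclose(comm, 1j*hbar*Lz))   # True: [Lx, Ly] = i hbar Lz
```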

Finite rotation operators

Now, consider a rotation Re ( ) through an arbitrary angle about the axis. According to formula (7), the operator e ( ) associated with such a rotation must satisfy (again using the results of § 3-b- ): e

( +d )=

e

( )

e

(d )

(52)

where the two operators of the right-hand side commute. But we know the expression for e (d ), so we have: e

( +d )=

e

( ) 1

e

( )=

d

(53)

~

that is: e

( +d )

d ~

e

( )

(54)

Here again, e ( ) and must commute. Although we are dealing with operators, the solution of equation (54) is formally the same as it would be if we were considering an ordinary function of the variable : e

( )=e

~

(55)

Indeed, if we recall (cf. Complement BII , § 4) that the exponential of an operator is defined by the corresponding power series expansion, it is easy to verify that expression (55) is the solution of equation (54). Moreover, the “integration constant” is equal to 1, since we know that: e

(0) = 1

(56)

725



COMPLEMENT BVI

As in § 3-caxis: u(

above, it is easy to generalize this result to a finite rotation about an arbitrary

)=e

Lu

~

(57)

Comments:

(i) Formula (57) can be written explicitly in the form: u(

)=e

where , that, since u(

(

~

+

+

)

(58)

and are the components of the unit vector u. Recall, however, , and do not commute:

)=e

e

~

e

~

( ) It can be seen from expression (57) that the operator the components of L are Hermitian: [

u(

(59)

~

u(

Lu

)] = e ~

) is unitary. Since (60)

we have (as L u obviously commutes with itself): [ (

u(

)]

u(

)=

u(

)[

u(

)] = 1

(61)

) In the special case envisaged in this section, we find that: u (2

)=1

(62)

We shall confine ourselves to proving this result for the rotation through 2 about the axis (the generalization of this proof involves no difficulties). To this end, consider an arbitrary ket , and expand it on a basis composed of eigenvectors of the observable : =

(63)

with: =

(64)

~

( symbolizes the indices other than that are necessary to specify the vectors of the basis used, such as a “standard” basis – cf. § C-3 of Chapter VI). The action of e ( ) on is then easy to obtain: e

726

( )

=

e

=

e

~

(65)



ANGULAR MOMENTUM AND ROTATIONS

But we know that, for the orbital angular momentum of a particle, integral. Consequently, when attains the value 2 , all the factors e equal to 1, and: e

(2 )

=

=

Since this relation is satisfied for all operator.

is always become

(66) , we deduce that

e

(2 ) is the identity

The preceding argument clearly indicates that formula (62) would not be valid if half-integral values of were not excluded. Indeed, we shall see in Complement AIX that, for a spin 1/2, the operator associated with a rotation of 2 is equal to 1 and not 1; this result is related to the fact that we constructed the finite rotations from infinitesimal rotations (cf. footnote 3). 4.
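The contrast discussed above can be made concrete with a small numerical experiment (a sketch assuming SciPy's `expm`; the matrices used are the standard L_z for l = 1 and S_z for spin 1/2): exponentiating −2πiL_z/ħ gives the identity for an integral angular momentum, but minus the identity for spin 1/2:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
Lz_l1 = hbar*np.diag([1.0, 0.0, -1.0])        # l = 1
Sz_half = hbar*np.diag([0.5, -0.5])           # spin 1/2 (cf. Complement A_IX)

R_l1 = expm(-1j*2*np.pi*Lz_l1/hbar)
R_half = expm(-1j*2*np.pi*Sz_half/hbar)
print(np.allclose(R_l1, np.eye(3)))           # True: rotation by 2*pi is the identity
print(np.allclose(R_half, -np.eye(2)))        # True: for spin 1/2 it is minus the identity
```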

Rotation operators in the state space of an arbitrary system

We shall now generalize the concepts we introduced and the results we obtained for a special case (in § 3). 4-a.

System of several spinless particles

First of all, the arguments of § 3 can be extended without difficulty to systems composed of several spinless particles. We shall quickly demonstrate this, choosing as an example a system of two spinless particles, (1) and (2). The state space of such a system is the tensor product of the state spaces r1 and r2 of the two particles: =

r1

(67)

r2

We shall use the same notation as in § F-4-b of Chapter II. Starting from the position and momentum observables (R1 and P1 on the one hand, R2 and P2 on the other), we can define an orbital angular momentum for each of the particles: L1 = R1

P1

L2 = R2

P2

(68)

The components of L1 , as well as those of L2 , satisfy the commutation relations characteristic of angular momenta. Consider a vector which is a tensor product of a vector of r1 and a vector of r2 : =

(1)

(2)

(69)

represents the state of the system formed by particle (1) in the state (1) and particle (2) in the state (2) . If we perform a rotation through an angle about u on the two-particle system, the state of the system after the rotation corresponds to the two particles in the “rotated states” (1) and (2) respectively: =

(1)

(2) = [

1 u(

) (1) ]

[

2 u(

) (2) ]

(70) 727



COMPLEMENT BVI

where 1 u( 2 u(

1 u(

2 u(

) and

)=e )=e

) are the rotation operators in

r1

and

r2 :

~

L1 u

(71a)

~

L2 u

(71b)

Relation (70) can also be written, by definition of the tensor product of two operators (Chap. II, § F-2-b): =[

1 u(

2 u(

)

)] (1)

(2)

(72)

Since every vector of is a linear combination of vectors analogous to (69), the rotation transform of an arbitrary vector of is: =[

1 u(

2 u(

)

)]

(73)

Using formula (F-14) of Chapter II and the fact that L1 and L2 commute (they are operators relating to different particles), we obtain for the rotation operators in : 1 u(

2 u(

)

)=e

~

L1 u

e

~

L2 u

=e

~

Lu

(74)

where: L = L1 + L2

(75)

is the total angular momentum of the two-particle system. All the formulas of the preceding section therefore remain valid as long as L represents the total angular momentum.

Comments:

(i) L is an operator that acts in . In (75), L1 is, rigorously, the extension of the operator L1 acting in r1 into (an analogous comment could be made for L2 ). To simplify the notation, we shall not use different symbols for L1 and its extension into (cf. Chap. II, § F-2-c). ( ) We might consider performing a rotation on only one of the two particles, for example, particle (1). In the course of such a “partial rotation”, a vector such as (69) transforms into: [

1 u(

) (1) ]

(2)

(76)

where only the state of particle (1) is modified. As above, it can be shown that the effect of a rotation performed only on particle (1) on an arbitrary state of is described by the operator: 1 u(

where 728

)

(2) = e

~

L1 u

is the unit operator in

(77) r2

[in (77), L1 acts in ].

• 4-b.

ANGULAR MOMENTUM AND ROTATIONS

An arbitrary system

The starting point of the arguments elaborated thus far is equation (19), which gives the transformation law of the state vector of the system in terms of that of its wave function. In the case of an arbitrary quantum mechanical system (which does not necessarily have a classical analogue), one cannot use the same method. For example, for a particle with spin, the operators , and no longer form a C.S.C.O., and the state of the particle can no longer be defined by a wave function ( ) (cf. Chap. IX). One must reason directly in the state space of the system. Without going into detail, we shall assume here that an operator acting in can be associated with any geometrical rotation R; if the system is initially in the state , the rotation R takes it into the state: =

(78)

where the operator is linear and unitary (cf. comment of § 3-b- ). As far as the group law of the rotations R is concerned, it is conserved by the operators , but only locally: the product of two geometrical rotations, at least one of which is infinitesimal, is represented in the state space by the product of the corresponding operators (which implies, in particular, that the “image” of a rotation through an angle equal to zero is the identity operator). However, the operator associated with a geometrical rotation through an angle 2 is not necessarily the identity operator [cf. comment ( ) of § 3-c- and Complement AIX ]. Now, let us consider an infinitesimal rotation Re (d ) about the axis. Since the group law is conserved for infinitesimal rotations, the operator e (d ) is necessarily of the form: e

(d ) = 1

d

(79)

~

where is a Hermitian operator since e (d ) is unitary (cf. Complement CII , § 3). This relation is the definition of . Similarly, the Hermitian operators and can be introduced by starting with infinitesimal rotations about the and axes. The total angular momentum J of the system is then defined in terms of its three components , and . Now we can use the reasoning of § 3-c- : the geometrical relation (8) implies that the components of J satisfy commutation relations which are identical to those of orbital angular momenta. Thus, the total angular momentum of any quantum mechanical system is related to the corresponding rotation operators; the commutation relations amongst its components follow directly, which enables us to use them, as in Chapter VI (§ B-2), to characterize any angular momentum. Finally, let us show that, with , and defined as we have just indicated, the operator u (d ) associated with an arbitrary infinitesimal rotation is written ( , and being the components of the unit vector u): u (d

)=1

d (

+

+

)

(80)

~

which can then be condensed into the form: u (d

)=1

d J u

(81)

~ 729

COMPLEMENT BVI



Formula (80) is simply a consequence of the geometrical relation: Ru (d ) = Re (

d ) Re (

d ) Re (

d )

(82)

valid to first order in d , and which can be obtained directly from formula (6). We have thus generalized expressions (48) and (49) for infinitesimal rotation operators. Since the group law is conserved locally (see above), relation (52) and the argument following it remain valid. Consequently, the finite rotation operators have expressions analogous to (55) and (57): u(

5.

)=e

~

Ju

(83)

Rotation of observables

We just showed how the vector representing the state of a quantum mechanical system transforms under rotation. But in quantum mechanics, the state of a system and the physical quantities are described independently. Therefore, we shall now indicate what happens to observables upon rotation. 5-a.

General transformation law

Consider an observable , relating to a given physical system; we shall assume, to simplify the notation, the spectrum of to be discrete and non-degenerate: =

(84)

In order to understand how this observable is affected by a rotation, we shall imagine that we have a device which can measure in the physical system under consideration. Now, the observable , the transform of with respect to the geometrical rotation R, is by definition what is measured by the device when it has been subjected to the rotation R. Let us assume the system to be in the eigenstate of : the device for measuring in this system will give the result without fail. But just before performing the measurement, we apply a rotation R to the physical system and, simultaneously, to the measurement device; their relative positions are unchanged. Consequently, if the observable which we are considering describes a physical quantity attached only to the system which we have rotated (that is, independent of other systems or devices which we have not rotated), then, in its new position, the measurement device will still give the same result without fail. Now, after rotation, the device, by definition, measures , and the system is in the state: =

(85)

We must therefore have: =

=

=

(86)

Combining (85) and (86), we find: = 730

(87)



ANGULAR MOMENTUM AND ROTATIONS

that is: =

(88)

since the inverse of is . The set of vectors ( is an observable), so we have:

constitutes a basis in the state space

=

(89)

that is: =

(90)

In the special case of an infinitesimal rotation Ru (d ), the general expression (81), substituted into (90), gives, to first order in d : =

1

d J u

1+

~ =

d [J u

d J u ~

]

(91)

~ Comments:

(i) In the case of a spinless particle, relation (90) implies that: r

r = r

(92)

r

Using (26) and (27), we therefore obtain: r

r = R

1

r

R

1

r

(93)

The transformation which enables us to obtain from is therefore completely analogous to the one which gives in terms of [cf. (19)]. ( ) Consider the case in which the observable is associated with a classical quantity A . A is then a function of the positions r and momenta p of the particles which constitute the system; the operator is obtained from this function by applying the quantization rules given in Chapter III. We know how to find the quantity A associated with A by a rotation R in classical mechanics: for example, if A is a scalar, A is the same as A ; if A is the component along an axis of a vectorial quantity, A is the component of this same vectorial quantity along the axis which is the result of the transformation of by the rotation R. We can also construct the quantum mechanical operator corresponding to A , by applying the same quantization rules as above. It can be shown that this operator is the same as the operator given in (90); this is what is shown in Figure 1.

731

COMPLEMENT BVI



ℛ 𝒜′

𝒜

R

A

5-b.

Figure 1: Behavior, under a rotation R, of a classical physical quantity A and of the associated observable .

A′

Scalar observables

An observable

is said to be scalar if:

= for all [

(94) . According to (91), this implies that:

J] = 0

(95)

A scalar observable commutes with the three components of the total angular momentum. There are numerous examples of scalar observables. J2 is always a scalar (this results, as we saw in § B-2 of Chapter VI, from the commutation relations that characterize an angular momentum). For a spinless particle, R2 , P2 and R P, which correspond to classical scalar quantities, are scalars. It is easy to show, moreover (cf. § 5-c below) that R2 , P2 and R P satisfy (95). We shall also see later (§ 6) that the Hamiltonian of an isolated physical system is a scalar. 5-c.

Vector observables

A vector observable V is a set of three observables (its Cartesian components) that is transformed by rotation according to the characteristic law of vectors. The transform, under a rotation R, of the component = V u of V along a given axis (of unit vector u) must be the component u = V u of V along the axis derived from by the rotation R. Consider, for example, the component of such an observable. We shall examine its behavior under infinitesimal rotations about each of the coordinate axes. is obviously unchanged by a rotation about ; according to (91), this can be expressed in the form: [

]=0

(96)

If we perform a rotation Re (d ) about the ( ) given by (91) as: (

) =

d [ ~

732

]

axis, the transform of

is the observable

(97)

• But is the component of V along the takes e onto e such that [formula (6)]: e =e +d e =e

axis, of unit vector

d e

) =V e =

. The rotation Re (d )

e (98)

Consequently, if V is a vector observable, ( (

ANGULAR MOMENTUM AND ROTATIONS

) must be the same as V e :

d V e d

(99)

Comparing (97) and (99), we see that: [

]= ~

(100)

For an infinitesimal rotation Re (d ) about the one above leads to the relation: [

axis, an argument analogous to the

]= ~

(101)

By studying the behavior of and under infinitesimal rotations, one can prove the formulas which can be derived from (96), (100) and (101) by cyclic permutation of the indices . The set of relations obtained in this way is characteristic of a vector observable: they imply that an arbitrary infinitesimal rotation transforms V u into V u , where u is the transform of u with respect to the rotation under consideration. It is clear that the angular momentum J itself is a vector observable; (96), (100) and (101) then follow from the commutation relations characterizing angular momenta. For a system composed of a single spinless particle, R and P are vector observables, as can easily be verified from the canonical commutation relations. Thus the vector notation we use for R P L and J is justified. Comments:

(i) The scalar product V W of two vector observables, defined by the customary formula: V W=

+

+

(102)

is a scalar operator. To verify this, we can calculate, for example, the commutator of V W with : [V W

]=[

]+[

=

[

=

~

=0

]+[ ~

] ]

+ + ~

[

]+[

]

+ ~ (103)

We have already pointed out that J2 , R2 , P2 and R P are scalar observables. ( ) It is the total angular momentum of the system under study that appears in relations (96), (100) and (101). The following example illustrates the importance of this fact: if, for a two-particle system, we were to use L1 instead of L = L1 + L2 , R2 would appear to be a set of three scalar observables and not a vector observable. 733

COMPLEMENT BVI

6.



Rotation invariance

The discussion presented in the preceding sections does not have as its sole purpose the justification of the definition of angular momenta in terms of the commutation relations. The importance of rotations in physics is essentially related to the fact that physical laws are rotation invariant. We are going to explain in this section exactly what this means, and we shall indicate some of the consequences of this fundamental property. 6-a.

Invariance of physical laws

Consider a physical system ( ), classical or quantum mechanical, which we subject to a rotation R at some given time. If we take the precaution of rotating, at the same time as the system ( ) under consideration, all other systems or devices that can influence it, the physical properties and behavior of ( ) are not modified. This means that the physical laws governing the system have remained the same: the physical laws are said to be rotation invariant. Note that this property is not at all obvious: there exist transformations – those of similarity5 , for example – with respect to which the physical laws are not invariant6 . It is therefore advisable to consider rotation invariance to be a postulate which is justified by the experimental verification of its consequences. When we say that the physical properties and behavior of a system are unchanged by a rotation performed at the time 0 , this statement covers two observations: (i) the properties of the system at this time are not modified (although the description of the state of the system and the physical quantities are; see preceding sections). In quantum mechanics, this implies that the transform of an arbitrary observable has the same spectrum, and that the probability of finding one of the eigenvalues of this spectrum in a measurement of on the system after rotation is the same as it was for the measurement of on the system before rotation. From this, it can be deduced that the operators describing rotations in state space are linear and unitary, or antilinear and unitary (that is, anti-unitary7 ). ( ) the time evolution of the system is not affected. To state this point more precisely, let us denote the state of the system by ( 0 ) ; under a rotation performed at 0 , this state becomes: ( 0) =

( 0)

(104)

We now let the system evolve freely and compare its state ( ) at a subsequent time to the state ( ) which it would have attained if it had been allowed to evolve freely from ( 0 ) . If the behavior of the system is not modified, we must have: () =

()

(105)

that is, for all , the state ( ) must be obtained from ( ) by the same rotation as in (104). Therefore, if ( ) is a solution of the Schrödinger equation, ( ) is also 5 Consider, for example, a hydrogen atom. If we multiply the distance between the proton and the electron by a constant = 1 (without modifying the charges and masses of the particles), we obtain a system whose evolution no longer obeys either classical or quantum mechanical physical laws. 6 Let us also point out that experiments have shown that the laws governing the -decay of nuclei are not invariant under reflection with respect to a plane (non-conservation of parity). 7 All transformations which leave the physical laws invariant are described by unitary operators, except for time reversal, with which is associated an anti-unitary operator.

734



ANGULAR MOMENTUM AND ROTATIONS

a solution of this equation: the transform of a possible motion of the system is also a possible motion. We shall see in § 6-b that this implies that the Hamiltonian of the system is a scalar observable. The invariance of physical laws under rotation is related to the symmetry properties of the equations which state these laws mathematically. To understand the origin of these symmetries, consider, for example, a system composed of a single (spinless) particle. The expression of the physical laws governing such a system explicitly involves the parameters r( ) and p( ) which characterize the position of the particle and its momentum: in classical mechanics, r and p define at each instant the state of the particle; in quantum mechanics, although the meaning of these parameters is a little less simple, they appear in the wave function (r) and its Fourier transform (p). When the particle is subjected to an instantaneous rotation R, r and p are transformed into r and p such that: r =Rr p =Rp

(106)

If we replace r by R 1 r and p by R 1 p in the equations which express the physical laws, we obtain relations that now involve r and p . The invariance of physical laws under the rotation R thus implies that the form of the equations for r and p is the same as that of the equations for r and p: simply omitting the primes labeling the new parameters must give us back the original equations. It is clear that this considerably restrains the number of possible forms of these equations.

Comments: (i) What happens when we perform a rotation on a system which is not isolated? Consider, for example, a particle in an external potential. If we rotate this system, without simultaneously rotating the sources of the external potential, the subsequent evolution of the system is, in general, modified8 . In classical mechanics, the forces exerted on the particle are not the same in its new position. In quantum mechanics, (r ) = (R 1 r ) is a solution of a Schrödinger equation in which the potential (r) is replaced by (R 1 r), which is, in general, different from (r). Therefore, the transform of a possible motion is no longer a possible motion. The presence of the external potential destroys, so to speak, the homogeneity of the space in which the system under study evolves. However, the external potential may present certain symmetries that allow certain rotations to be performed on the physical system without its behavior being modified. If there exist rotations R0 such that (R0 1 r) is the same as (r), the properties of the system are unchanged by one of these rotations R0 . This is the case, for example, for 8 If the particle is placed in a vector potential, its properties immediately after the rotation may be profoundly modified. Consider, for example, a spinless particle in an external magnetic field. According to the transformation law (19), the probability current given by formula (D-20) of Chapter III cannot in general be derived by rotation of the initial current, as it depends on the vector potential describing the magnetic field.The physical interpretation of this phenomenon is as follows. We can imagine that, instead of rotating the particle, we rotate the magnetic field rapidly and in the opposite direction. The wave function has not had the time to change: this corresponds to formula (19). If the physical properties are modified, it is because an induced electromotive field has appeared and acted on the particle. This action does not depend on the exact way in which we rotate the magnetic field, provided that we do so quickly enough.

735

COMPLEMENT BVI



central potentials, that is, those depending only on the distance to a fixed point rotations R0 are then all rotations that conserve the point (cf. Chap. VII).

: the

( ) Let us return to the case of isolated physical systems. Thus far, we have adopted an “active” viewpoint: the observer remains fixed, and the physical system is rotated. We can also define a “passive” viewpoint: the observer rotates, and, without touching the system being studied, uses a new coordinate frame, derived from the initial frame by the given rotation. Rotation invariance is then expressed in the following way: in his new position (that is, using his new coordinate axes), the observer describes physical phenomena by laws that have the same form as in the old frame. Nothing allows him to assert that one of his positions is more fundamental than the other: it is impossible to define an absolute orientation in space by the study of any physical phenomenon. It is clear that, for an isolated system, a “passive” rotation is equivalent to the “active” rotation through an equal angle about the opposite axis. 6-b.

Consequence: conservation of angular momentum

We indicated in § 6-a that rotation invariance is related to symmetry properties of the equations expressing the physical laws. Here, we shall study the case of the Schrödinger equation, and we shall show that the Hamiltonian of an isolated physical system is a scalar observable. Consider an isolated system in the state ( 0 ) . We perform an arbitrary rotation R at time 0 ; the state of the system becomes: ( 0) =

( 0)

(107)

where is the “image” of the rotation R. If we now let the system evolve freely from ( 0 ) , its state at the instant 0 + d , according to the Schrödinger equation, will be: (

0

+d ) =

( 0) +

d ~

( 0)

Now, if we had not performed the rotation, the state of the system at time have been: (

0

+d ) =

( 0) +

d ~

( 0)

(108) 0

+ d would

(109)

Rotation invariance implies (cf. § 6-a) that: ( where that:

0

+d ) =

(

0

+d )

(110)

is the same as in (107). According to the two preceding equations, this implies ( 0) =

( 0)

(111)

that is: ( 0) =

( 0)

(112)

Since ( 0 ) is arbitrary, it follows that commutes with all rotation operators. For this to be so, it is necessary and sufficient that commute with the infinitesimal rotation 736



ANGULAR MOMENTUM AND ROTATIONS

operators, that is, with the three components of the total angular momentum J of the system: [

J] = 0

(113)

is therefore a scalar observable. Rotation invariance is therefore related to the fact that the total angular momentum of an isolated system is a constant of the motion: conservation of angular momentum can be seen to be a consequence of rotation invariance.

Comments: (i) The Hamiltonian of a non-isolated system is not, in general, a scalar. However, if certain rotations exist that leave the system invariant [comment (i) of § 6-a], the Hamiltonian commutes with the corresponding operators. Thus, the Hamiltonian of a particle in a central potential commutes with the operator L associated with the angular momentum of the particle with respect to the center of forces. ( ) For an isolated system composed of several interacting particles, the Hamiltonian commutes with the total angular momentum. However, it does not generally commute with the individual angular momentum of each particle. For the transform of a possible motion to remain a possible motion, the rotation must be performed on the whole system, not on only some of the particles. 6-c.

Applications

We have just shown that rotation invariance means that the total angular momentum J of an isolated system is a constant of the motion in the quantum mechanical sense. It is therefore useful to determine the stationary states of such a system (eigenstates of the Hamiltonian) which are also eigenstates of J2 and . We can then choose for the state space a standard basis , composed of eigenstates common to , J2 and : = 2

J

= ( + 1)~2 =

.

~

(114)

Essential rotational degeneracy

Since the Hamiltonian is a scalar observable, it commutes with + and . From this fact, we deduce that +1 and 1 , which are respectively proportional to + and , are eigenstates of with the same eigenvalue as [the argument is the same as for formula (C-48) of Chapter VI]. Thus it can be shown by iteration that the (2 + 1) vectors of the standard basis characterized by the given values of and have the same energy. The corresponding degeneracy of the eigenvalues of is called “essential” because it arises from rotation invariance and occurs for any form of the Hamiltonian . Of course, in certain cases, the energy levels can present additional degeneracies, which are called “accidental”. We shall see an example of this in Chapter VII, § C. 737

COMPLEMENT BVI

.



Matrix elements of observables in a standard basis

When we study a physical quantity in an isolated system, the knowledge of the behavior of the associated observable under a rotation enables us to establish some of its properties, without having to consider its precise form. We can predict that only some of its matrix elements in a standard basis such as will be different from zero, and we can give the relations between them. Thus, a scalar observable has matrix elements only between two basis vectors whose values of are equal, as are their values of [this results from the fact that this observable commutes with J2 and ; cf. theorems on commuting observables, § D-3-a of Chapter II]. Moreover, these non-zero elements are independent of (since the scalar observable also commutes with + and ). For vector or tensor observables, these properties are contained in the Wigner-Eckart theorem, which we shall prove later in a special case (cf. Complement DX ), and which is frequently used in the areas of physics in which phenomena are treated by quantum mechanics (atomic, molecular, and nuclear physics, elementary particle physics, etc.). References and suggestions for further reading: Symmetry and conservation laws: Feynman III (1.2), Chap. 17; Schiff (1.18), Chap. 7; Messiah (1.17), Chap. XV; see also the articles by Morrisson (2.28), Feinberg and Goldhaber (2.29), Wigner (2.30). Relation with group theory: Messiah (1.17), Appendix D; Meijer and Bauer (2.18), Chaps. 5 and 6; Bacry (10.31), Chap. 6; Wigner (2.23), Chaps. 14 and 15. See also Omnès (16.13), in particular Chap. III.

738



ROTATION OF DIATOMIC MOLECULES

Complement CVI Rotation of diatomic molecules

1 2

3

4

1.

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . Rigid rotator. Classical study . . . . . . . . . . . . . . . . . . 2-a Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-b Motion of the rotator. Angular momentum and energy . . . . 2-c The fictitious particle associated with the rotator . . . . . . . Quantization of the rigid rotator . . . . . . . . . . . . . . . . 3-a The quantum mechanical state and observables of the rotator 3-b Eigenstates and eigenvalues of the Hamiltonian . . . . . . . . 3-c Study of the observable Z . . . . . . . . . . . . . . . . . . . . Experimental evidence for the rotation of molecules . . . . 4-a Heteropolar molecules. Pure rotational spectrum . . . . . . . 4-b Homopolar molecules. Raman rotational spectra . . . . . . .

739 740 740 741 741 741 741 742 744 746 746 749

Introduction

In § 1 of Complement AV , we studied the vibrations of the two nuclei of a diatomic molecule about their equilibrium position, neglecting the rotation of these two nuclei about their center of mass. We obtained stationary vibrational states of energies whose wave functions ( ) depended only on the distance between the nuclei. Here, we shall adopt a complementary point of view: we shall study the rotation of the two nuclei about their center of mass, neglecting their vibrations. That is, we shall assume that the distance between them remains fixed and equal to (where represents the distance between the two nuclei in the stable equilibrium position of the molecule; see Figure 1 of Complement AV ). The wave functions of the stationary rotational states then can depend only on the polar angles and which define the direction of the molecular axis. We shall see that these wave functions are the spherical harmonics ( ) [studied in Chapter VI (§ D-1) and in Complement AVI ], and that they correspond to a rotational energy that depends only on . Actually, in the center of mass frame, the molecule both rotates and vibrates, and the wave functions of its stationary states must be functions of the three variables and . We shall show in Complement FVII that, to a first approximation, these wave functions are of the form ( ) ( ) and correspond to the energy + . This result justifies the approach adopted here, which consists of considering only one degree of freedom – rotational or vibrational – at a time1 . We shall begin in § 2 by presenting the classical study of a system of two masses separated by a fixed distance (rigid rotator). The quantum mechanical treatment of this problem will then be taken up in § 3, where 1 In Complement F VII , we shall also study the corrections introduced by the coupling between the vibrational and rotational degrees of freedom.

739



COMPLEMENT CVI

z

M2 θ

r2

O

r1

y

M1

Figure 1: Parameters defining the position of the rigid rotator 1 2 whose center of mass is at the origin of the reference frame; the distances 1 and 2 are fixed; only the polar angles and can vary.

φ

x

we shall use the results of Chapter VI concerning orbital angular momentum. Finally in § 4, we shall describe some experimental manifestations of the rotation of diatomic molecules (pure and Raman rotational spectra). 2.

Rigid rotator. Classical study

2-a.

Notation

Two particles, of mass 1 and 2 , are separated by a fixed distance . Their center of mass is chosen as the origin of a coordinate frame with respect to which the direction of the axis connecting them is defined by means of the polar angles and (Fig. 1). The distances 1 and 2 are denoted respectively by 1 and 2 ; by definition of the center of mass: 1 1

=

(1)

2 2

which allows us to write: 1

2

=

2

=

1

1

+

The moment of inertia =

2 1 1

(2) 2

of the system with respect to

2 2 2

+

is equal to: (3)

Introducing the reduced mass: 1

= 1

+

2

(4) 2

and using (2), we can put = 740

2

in the form: (5)

• 2-b.

ROTATION OF DIATOMIC MOLECULES

Motion of the rotator. Angular momentum and energy

If no external force acts on the rotator, the total angular momentum of the system with respect to the point is a constant of the motion. The rotator therefore rotates about in a plane perpendicular to the fixed vector , with a constant angular velocity . The modulus of is related to by: =

+

1 1 1

2 2 2

=

(6)

that is, using (5): 2

=

(7)

The rotational frequency of the system = 2 is proportional to the angular momentum and inversely proportional to the moment of inertia . In the center of mass frame, the total energy of the system reduces to the rotational kinetic energy: =

1 2

2

(8)

which can also be written, using (6) and (5): 2

=

2

2-c.

2

=

2

(9)

2

The fictitious particle associated with the rotator

Formulas (5), (7) and (9) show that the problem we are studying here is formally equivalent to that of a fictitious particle of mass forced to remain at a fixed distance from the point , about which it rotates with the angular velocity . is the angular momentum of this fictitious particle with respect to . 3.

Quantization of the rigid rotator

3-a.

The quantum mechanical state and observables of the rotator

Since is fixed, the parameters defining the position of the rotator (or that of the associated fictitious particle) are the polar angles and of Figure 1. The quantum mechanical state of the rotator will then be described by a wave function ( ) which depends only on these two parameters. ( ) is square-integrable; we shall assume it to be normalized: 2

d

sin d

0

(

)2=1

(10)

0

The physical interpretation of ( ) is the following: ( ) 2 sin d d represents the probability of finding the axis of the rotator pointing in the solid angle element dΩ = sin d d about the direction of polar angles and . Using Dirac notation, we associate with every square-integrable function ( ), a ket of the state space Ω : (

)



(11) 741

COMPLEMENT CVI



The scalar product of =

dΩ

and

(

)

is, by definition:

(

)

(12)

where ( ) and ( ) are the wave functions associated with and . The quantum mechanical Hamiltonian of the rotator (or of the associated fictitious particle) can be obtained by replacing 2 in expression (9) for the classical energy by the operator L2 studied in § D of Chapter VI: =

L2 2 2

(13)

is an operator acting in Ω . According to formula (D-6a) of Chapter VI, if represented by the wave function ( ), is represented by: ~2 2

2 2

2

+

1 tg

+

1 sin2

is

2 2

(

)

(14)

Other observables of interest, which we shall study later, are those which correspond to the three algebraic projections of the segment 1 2 ( are also the coordinates of the fictitious particle): =

sin cos

=

sin sin

=

cos

(15)

The importance of these variables will be seen in § 4-a. The observables corresponding to act in Ω . With the kets , are associated the functions: sin

cos

(

)

sin

sin

(

)

cos

(

(16)

)

Comment: As we have already pointed out in the introduction, the true wave functions of the molecule depend on . Similarly, the observables of this molecule, obtained from the corresponding classical quantities by the quantization rules of Chapter III, act on these functions of three variables and not solely on the functions of and . In Complement FVII , we shall justify the point of view we are adopting here, namely, ignoring the radial part of the wave functions and considering to be a fixed parameter that is equal to [cf. formulas (14) and (16)]. 3-b.

Eigenstates and eigenvalues of the Hamiltonian

We determined the eigenvalues of the operator L2 in § D of Chapter VI: they are of the form ( + 1)~2 , where is any non-negative integer. Furthermore, we know an 742



ROTATION OF DIATOMIC MOLECULES

orthonormal system of eigenfunctions of L2 : the spherical harmonics ( ), which constitute a basis in the space of functions that are square-integrable in and (§ Dl-c- of Chapter VI). We shall denote by the ket of Ω associated with ( ): (

)

(17)

We see from (13) that: ( + 1)~2 2 2

=

(18)

It is customary to set: =

~ 4

~

=

(19)

2

4

is called the “rotational constant” and has the dimensions of a frequency2 . The eigenvalues of are thus of the form: =

( + 1)

(20)

Since, for a given value of , there exist (2 + 1) spherical harmonics ( ), where = +1 , we see that each eigenvalue is (2 + 1)-fold degenerate. Figure 2 represents the first energy levels of the rotator. The separation of two adjacent levels, and 1, is equal to: 1

=

( + 1)

(

1) = 2

(21)

and increases linearly with . The eigenstates of satisfy the following orthogonality and closure relations (deduced from those satisfied by the spherical harmonics, § D-1-c- of Chapter VI): = +

=1 =0

(22)

=

The most general quantum state of the rotator can be expanded on the states

:

+

() =

() =0

(23)

=

The component: ()=

() =

dΩ

(

)

(

; )

(24)

evolves in time in accordance with the equation: ()= 2 The

(0) e

~

speed of light is sometimes placed in the denominator of the right-hand side of (19). has the dimensions of an inverse length and is expressed in cm 1 (in the system).

(25) then

743

COMPLEMENT CVI



l=5 10 Bh

Figure 2: First levels of the rigid rotator, of energies: = ( + 1) (with = 0 1 2...). Each level for which 1 is separated from the next lower level by an energy 2 .

l=4 8 Bh l=3 6 Bh l=2 4 Bh l=1 l=0

2 Bh

3-c.

Study of the observable Z

Earlier, we introduced the observables which correspond to the projections onto the three axes of the segment 1 2 . In this section, we shall study the evolution of the mean values of these observables and compare the results obtained with those predicted by classical mechanics. We shall confine ourselves to the calculation of () since ( ) and ( ) have analogous properties. A Bohr frequency ( ) can appear in the function ( ), if has a nonzero matrix element between a state of energy and a state of energy . The first problem is therefore to find the non-zero matrix elements of . To solve this problem, we shall use the following relation, which can be established by using the mathematical properties of spherical harmonics [Complement AVI , formula (35)]: 2

cos

(

)=

4

2 2

1(

1

)+

( + 1)2 4( + 1)2

2

1

+1 (

)

(26)

From this, we deduce, using (16), (17) and (22): 2

=

1

4

2 2

1

+

+1

( + 1)2 4( + 1)2

2

1

(27)

Comment: According to (27), the selection rules satisfied by are: ∆ = 1 ∆ = 0. It can be shown that for and we have: ∆ = 1, ∆ = 1. Since the energies depend only on , the Bohr frequencies are the same for and .

The operator can therefore connect only states belonging to two adjacent levels of Figure 2 (the corresponding transitions are represented by vertical arrows in Figure 2). The only Bohr frequencies which appear in the evolution of ( ) are thus of the form: 1

=

1

=2

They form a series of equidistant frequencies, separated by the interval 2 744

(28) (Fig. 3).

• 1

0

2

1

3

2

4

ROTATION OF DIATOMIC MOLECULES

3

5

4 v

0

2B

4B

6B

8B

10B

Figure 3: Frequencies appearing in the evolution of the mean value of the observable . Because of the selection rule ∆ = 1, only the Bohr frequencies 2 (with 1), associated with two adjacent levels and 1 in Figure 2, are observed.

The mean value ( ) can evolve only at a well-defined series of frequencies, This is unlike the classical case, in which the frequency of rotation of the rotator can take on any value. According to (27), if the system is in a stationary state ( ) is always zero, even for large . To obtain a quantum mechanical state in which behaves like the corresponding classical variable , one must superpose a large number of states . If we assume that the state of the system is given by formula (23), and that the numbers (0) 2 have values which vary with as is shown in Figure 4, the most probable value of , M , is very large; the spread ∆ of the values of is also very large in absolute value, but very small in relative value: ∆ ∆

1

(29a)

1

(29b)

cl, m(0) 2 Δl

l 0

Figure 4: Square of the moduli of the expansion coefficients of a “quasi-classical” state on the stationary states of the rigid rotator. The spread ∆ is large; however, since the most probable value of , M , is very large, we have ∆ M 1, and the relative accuracy with respect to is very good.

lM

It can then be shown that, in such a state: L

2

L2

(

+ 1)~2

2

~2

In addition, the Bohr frequencies that appear in value) to: =2

(30) ( ) are then all very close (in relative (31) 745

COMPLEMENT CVI

Eliminating

• between (30) and (31), we obtain, according to (19):

2

L

L 2

=

~

(32)

which is the equivalent of the classical relation (6).

Comment: It is interesting to study in greater detail the motion of the wave packet corresponding to the state of Figure 4. It is represented by a function of and and can be considered to evolve on the sphere of unit radius. The preceding discussion shows that this wave packet rotates on the sphere with the average frequency . Because of the spread ∆ of and the corresponding spread 2 ∆ of the Bohr frequencies which enter into , and , the wave packet becomes distorted over time. This distortion becomes appreciable after a time of the order of: 1 2 ∆

(33)

Since the spread of

is small in relative value, we have:

1



(34)

The distortion of the wave packet is therefore slow, relative to its rotation. In fact, the Bohr frequencies of the system form a discrete series of equidistant frequencies, separated by the interval 2 . The motion which results from the superposition of these frequencies is therefore periodic, of period: =

1 2

(35)

with, according to (29a): 1

(36)

The distortion of the wave packet is therefore not irreversible; it follows a cycle which is repeated periodically. This is related to the fact that the wave packet evolves on the unit sphere, which is a bounded surface. This behavior should be compared with that of free wave packets (irreversible spreading; Complement GI ) and that of the quasi-classical states of the harmonic oscillator (oscillation without distortion; Complement GV ).

4. 4-a.

.

Experimental evidence for the rotation of molecules Heteropolar molecules. Pure rotational spectrum

Description of the spectrum

If the molecule is composed of two different atoms, the electrons are attracted by the more electronegative atom, and the molecule generally has a permanent electric dipole moment 0 , directed along the molecular axis. The projection of the electric dipole moment onto becomes, in quantum mechanics, an observable proportional to . We have seen that ( ) evolves at all the Bohr frequencies 2 ( = 1 2 3 ) shown in 746



ROTATION OF DIATOMIC MOLECULES

Figure 3. Thus we see how the molecule is coupled to the electromagnetic field and can absorb or emit radiation polarized parallel3 to Oz, on the condition that the frequency of this radiation is equal to one of the Bohr frequencies 2 . The corresponding absorption or emission spectrum of the molecule is called the “pure rotational spectrum”. It is composed of a series of equidistant lines, the frequency separation between two successive lines being equal to 2 , as in Figure 3. The absorption (or emission) of the line of frequency 2 corresponds to the passage of the molecule from the level 1 to the level (or from the level to the level 1), at the same time that a photon of frequency 2 is absorbed (or emitted). Figure 5 represents this process schematically [(5-a) represents the absorption and (5-b), the emission of a photon of frequency 2 ]. The pure rotational spectra of diatomic molecules therefore provide direct experimental proof of the quantization of the observable L2 . El a

2Bl

b

El – 1

.

Figure 5: Schematic representation of the passage of the molecule from a given rotational level to the neighboring level with absorption (fig. a) or emission (fig. b) of a photon.

Comparison with the “pure vibrational” spectrum

In § 1-c- of Complement AV , we studied the “pure vibrational” spectrum of a heteropolar diatomic molecule. It is interesting to compare this spectrum with the pure rotational spectrum we are studying here. (i) The rotational frequencies of a diatomic molecule are generally much lower than the vibrational frequencies. The separation 2 of two rotational lines varies between a few tenths of a cm 1 and a few dozen cm 1 . For small values of , the rotational frequencies 2 therefore correspond to wavelengths of the order of a centimeter or a millimeter. Taking as an example, the separation 2 is equal to 20.8 cm 1 , while 1 the vibration frequency, which corresponds to 2 886 cm , is more than a hundred times greater. Pure rotational spectra therefore fall in the very far infrared or the microwave domain.

Comment:

As we shall show in Complement FVII , the rotation of molecules is also responsible for a fine structure of vibrational spectra (vibration-rotation spectra). 2 can then be measured in a domain of wavelengths which is no longer that of microwaves. The same comment applies to the Raman rotational effect (§ 4-b below), which appears as a rotational structure of an optical line. 3 If we study the motion of radiation polarized parallel to

( ) and or .

( ), we see that the molecule can also absorb or emit

747

COMPLEMENT CVI



( ) The “pure vibrational” spectrum studied in Complement AV has only one vibrational line. This is due to the fact that the various vibrational levels are equidistant (if the anharmonicity of the potential is neglected) and, consequently, only one Bohr frequency appears in the dipole motion (selection rule ∆ = 1). On the other hand, the pure rotational spectrum consists of a series of equidistant lines. ( ) We indicated in Complement AV that the permanent electric dipole moment of the heteropolar molecule can be expanded in powers of in the neighborhood of the stable equilibrium position of the molecule: ( )=

0

+

1(

)+

(37)

For the pure vibrational spectrum to appear, ( ) must vary with : 1 must therefore be different from zero. On the other hand, even if remains fixed and equal to , the rotation of the molecule modulates the projection of the electric dipole onto one of the axes, provided that 0 is different from zero. Thus we see that the study of the intensity of vibrational and rotational lines permits the separate measurement of the coefficients 1 and 0 of (37). .

Applications

The study of pure rotational spectra has some interesting applications; we shall mention three examples. (i) Measurement of the separation 2 of two neighboring lines yields the moment of inertia of the molecule, according to (19). If we know 1 and 2 , we can deduce , the separation of the two nuclei in the stable equilibrium position of the molecule [ is the abscissa of the minimum of the curve ( ) of Figure 1 in Complement AV ]. Recall that measurement of the vibrational frequency yields the curvature of ( ) at = . ( ) Consider two diatomic molecules and , in which two isotopes and of the same element are bound to the same atom . Since the distances between the nuclei are equal in the two molecules, measurement of the ratio of the corresponding coefficients , which can be performed with great accuracy, yields the ratio of the masses of the two isotopes and . One could also compare the vibrational frequencies of the two molecules, but it is preferable to use the rotational spectrum, since the rotational frequencies vary with 1 [formula (19)], while the vibrational frequencies vary with 1 [formula (5) of AV ]. ( ) In the study of a sample containing a great number of identical molecules, the relative intensities of the lines (in absorption or emission) of the pure rotational spectrum yields information about the distribution of the molecules among the various levels . Unlike what happens in the case of the vibrational spectrum, transitions between two given adjacent levels (arrows of Figure 2) occur at a particular frequency, which is characteristic of these two levels. Thus, the intensity of the corresponding line depends on the number of molecules that are found in the two levels. This information can be used to determine the temperature of a medium4 . If thermodynamic equilibrium has been attained, we know that the probability that a given molecule is in a particular state of energy is proportional to e ; since the degeneracy of the rotational level is (2 + 1), the total probability P of finding the 4 Actually, the vibration-rotation or Raman rotational spectra are more often used, since they fall into more convenient frequency ranges than does the pure rotational spectrum.

748



ROTATION OF DIATOMIC MOLECULES

𝒫l hB

0.3

kT

=

1 10

0.2 0.1

1

0

2

3

4

5

6

7

l

Figure 6: Population P of the various rotational levels at thermodynamic equilibrium. The fact that P begins by increasing with arises from the (2 + 1)-fold degeneracy of the levels . When becomes sufficiently large, the Boltzmann factor e prevails and is responsible for the decrease in P .

molecule being considered in one of the states of the level level ) is: P = =

1 1

(the “population” of the

(2 + 1)e (2 + 1)e

( +1)

(38)

where: =

(2 + 1)e

( +1)

(39)

=0

is the associated partition function. If we are studying a system containing a large number of molecules whose interactions can be neglected, P gives the fraction of them whose energy is . At ordinary temperatures, is much smaller than , so that several rotational levels are populated. Note that the presence of the factor (2 + 1) means that it is not the lowest levels that are the most populated: Figure 6 indicates the shape of P as a function of for a temperature such that is of the order of 1 10. Recall that the vibrational levels, on the other hand, are non-degenerate, and that their separation is much greater than ; consequently, when the distribution of the molecules between the two rotational levels is that of Figure 6, they are practically all in the vibrational ground state ( = 0). 4-b.

Homopolar molecules. Raman rotational spectra

As we pointed out in § 1-c- of Complement AV , a homopolar molecule (that is, a molecule composed of two identical atoms) has no permanent electric dipole moment: in formula (37), we have 0 = 1 = = 0. The vibration and rotation of the molecule 749



COMPLEMENT CVI

induce no coupling with the electromagnetic field, and the molecule is consequently “inactive” in the near infrared (vibration) and the microwave (rotation) domains. Like the vibration (cf. § 1-c- of AV ), the rotation of the molecule can, however, be observed via the inelastic scattering of light (the Raman effect). .

The Raman rotational effect. Classical treatment

We have already introduced, in Complement AV , the susceptibility of a molecule in the optical domain: an incident light wave, whose electric field is E e Ω , sets the electrons of the molecule in forced motion and causes an electric dipole D e Ω , oscillating at the same frequency as the incident wave, to appear. is the coefficient of proportionality between D and E. If E is parallel to the axis of the molecule, depends on the distance between the two nuclei: when the molecule vibrates, vibrates at the same frequency. This is the origin of the Raman vibrational effect described in § 1-c- of AV . Actually, a diatomic molecule is an anisotropic system. When the angle between the molecular axis and E is arbitrary, D is not generally parallel to E: the relation between D and E is tensorial ( is the “susceptibility tensor”). D is parallel to E in the two following simple cases: E parallel to the molecular axis (we then have = ), and E perpendicular to this axis ( = ). In the general case, we choose the axis along the electric field E of the light wave (assumed to be polarized); we consider a molecule whose axis points in the direction of polar angles and and calculate the component along of the dipole induced by E on this molecule. E can be decomposed into a component E , parallel to the molecular axis, and a component E , perpendicular to and 1 2 (Fig. 7). The dipole induced 1 2 and contained in the plane formed by on the molecule by the field E cos Ω is then equal to: D=(

E

+

E ) cos Ω

(40)

z

Figure 7: Decomposition of the electric field E into a component E parallel to the molecular axis and a component E perpendicular to this axis. These fields induce electric dipoles E and E collinear with the corresponding fields. However, since and have different values (the molecule is anisotropic), the induced electrical dipole D = E + E is not collinear with E.

E D

E⊥

M2

D⊥

D//

θ E// O M1

Its projection onto = (cos 2

= (cos =[ 750



+(

can be calculated immediately:

E + sin



+ sin

E ) cos Ω

2

) cos Ω 2

) cos

] cos Ω

(41)



ROTATION OF DIATOMIC MOLECULES

We see that depends on , since and are not equal (the molecule is anisotropic). To see what happens when the molecule rotates, we shall begin by reasoning classically. The fact that the molecule rotates at the frequency 2 means that cos oscillates at the same frequency: cos =

cos(

)

(42)

where and depend on the initial conditions and the orientation of the angular momentum (which is fixed). Thus we see that the term in cos2 of (41) gives rise to components of that oscillate at frequencies of (Ω 2 ) 2 , in addition to the component that varies at the frequency Ω 2 . The fact that the rotation of the molecule at the frequency 2 modulates its polarizability at twice its frequency is easy to understand: after half a rotation, performed in half a period, the molecule returns to the same geometrical position with respect to the incident light wave. The light re-emitted with a polarization parallel to is that which is radiated by . It has an unshifted line of frequency Ω 2 (Rayleigh line), as well as two shifted lines, one on each side of the Rayleigh line, of frequencies (Ω 2 ) 2 (Raman-Stokes line) and (Ω + 2 ) 2 (Raman-anti-Stokes line). .

Quantum mechanical selection rules. Form of the Raman spectrum

Quantum mechanically, Raman scattering corresponds to an inelastic scattering process in which the molecule goes from level to level , while the energy ~Ω of the photon becomes ~Ω + (the total energy of the system is conserved during this process). The quantum theory of the Raman effect (which we shall not discuss here) indicates that the probability of such a process involves the matrix elements of ( ) cos2 + between the initial state ( ) and the final state ( ) of the molecule: dΩ

(

) (

) cos2 +



(

)

(43)

It can be shown, using the properties of the spherical harmonics, that such a matrix element is different from zero only if5 : = 0 +2

2

(44)

There is only one Rayleigh line (which corresponds to = ). Since the rotational levels are not equidistant, there are several Raman-anti-Stokes lines (which correspond to = 2), of frequencies: Ω + 2 with

+2

=

Ω +4 2

+

3 2

(45)

=0 1 2

and several Raman-Stokes lines (which correspond to Ω Ω +2 + = 2 2 with = 0 1 2

4

+

3 2

= + 2), of frequencies: (46)

5 The integral (43) is also zero if = . If we consider light re-emitted with a different polarization state from that of the incident light wave, we obtain the following selection rules for : ∆ = 0 1 2.

751

COMPLEMENT CVI



Rayleigh line

Raman-Stokes lines 4B 6B

Raman-anti-Stokes lines 6B 4B v

3→5 2→4 1→3 0→2

l→l′

2→0 3→1 4→2 5→3

Figure 8: Raman rotational spectrum of a molecule. This molecule, initially in the rotational level , inelastically scatters an incident photon of energy ~Ω. After the scattering, the molecule has moved to the rotational level , and the energy of the photon is ~Ω + (conservation of energy). If = , the scattered photon has the same frequency = Ω 2 as the incident photon; this yields the Rayleigh line. But it is also possible to have = 2; if = + 2, the frequency of the scattered photon is lower (Stokes scattering); if = 2, it is higher (anti-Stokes scattering). Since the rotational levels are not equidistant (cf. Fig. 2), there are as many Stokes or anti-Stokes lines as there are values of . These lines are labeled in the figure by (with = 2).

The form of the Raman rotational spectrum is shown in Figure 8. The Stokes and anti-Stokes lines occur symmetrically with respect to the Rayleigh line. The separation between two adjacent Stokes (or anti-Stokes) lines is equal to 4 , that is, to twice the separation which would be found between two adjacent lines of the pure rotational spectrum if it existed. Moreover, since the vibrational frequency is much larger than , the Raman-Stokes and anti-Stokes vibrational lines are situated much further to the left and to the right of the Rayleigh line than the rotational Raman lines and hence do not appear in the figure (these lines also have rotational structures similar to that of Figure 8). Comments:

(i) Consider a wave packet like those studied in § 3-c, that is, one for which the values of are grouped about a very large integer (Fig. 4). According to (45) and (46), the frequencies of the various Stokes and anti-Stokes lines will be very close (in relative value) to: Ω 4 2 that is, according to (31): Ω 2 752

2

(47)

(48)



ROTATION OF DIATOMIC MOLECULES

where is the average rotational frequency of the molecule. Thus, the quantum mechanical treatment, at the classical limit, yields the results of § 4-b- . ( ) In Raman rotational spectra, the Stokes and anti-Stokes lines appear with comparable intensities since levels of large have large populations, as is much smaller than . This is necessary for the observation of anti-Stokes lines, for which the initial state of the molecule must be at least = 2. However, the anti-Stokes vibrational line has a much smaller intensity than the Stokes line. The vibrational energy is much larger than ; the population of the vibrational ground state = 0 is much larger than the others, and Stokes processes = 0 =1 are much more frequent than anti-Stokes processes = 1 = 0. (

) The Raman rotational effect also exists for heteropolar molecules.

References and suggestions for further reading:

Karplus and Porter (12.1), § 7.4; Herzberg (12.4), Vol. I, Chap. III, §§ 1 and 2; Landau and Lifshitz (1.19), Chaps. XI and XIII; Townes and Schawlow (12.10), Chaps. 1 to 4.

753



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

Complement DVI Angular momentum of stationary states of a two-dimensional harmonic oscillator

1

2

3

4

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-a Review of the classical problem . . . . . . . . . . . . . . . . . 1-b The problem in quantum mechanics . . . . . . . . . . . . . . Classification of the stationary states by the quantum numbers and . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-a Energies; stationary states . . . . . . . . . . . . . . . . . . . . 2-b does not constitue a C.S.C.O. in . . . . . . . . . . . Classification of the stationary states in terms of their angular momenta . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-a Significance and properties of the operator . . . . . . . . . 3-b Right and left circular quanta . . . . . . . . . . . . . . . . . . 3-c Stationary states of well-defined angular momentum . . . . . 3-d Wave functions associated with the eigenstates common to and . . . . . . . . . . . . . . . . . . . . . . . . . . . . Quasi-classical states . . . . . . . . . . . . . . . . . . . . . . . 4-a Definition of the states and . . . . . . . . . . 4-b Mean values and root mean square deviations of the various observables . . . . . . . . . . . . . . . . . . . . . . . . . . . .

755 755 757 759 759 760 761 761 761 762 763 765 766 767

In this complement, we shall be concerned with the quantum mechanical properties of a two-dimensional harmonic oscillator. The quantum mechanical problem is exactly soluble and does not involve complicated calculations. Furthermore, this subject provides an opportunity to study a simple application of the properties of the orbital angular momentum L, since, as we shall see, the stationary states of such an oscillator can be classified with respect to the possible values of the observable . In addition, the results obtained will be useful in the subsequent Complement EVI . 1.

Introduction

1-a.

Review of the classical problem

A physical particle always moves in three-dimensional space. However, if its potential energy depends only on and , the problem can be treated in two dimensions. We shall assume here that this potential energy can be written: (

)=

2

2

(

2

+

2

)

where is the mass of the particle and system is then: =

+

(1) is a constant. The classical Hamiltonian of the (2) 755



COMPLEMENT DVI

with: 1 2 ( + 2 1 2 = 2

2

=

)+

1 2

2

(

2

+

2

) (3)

where , , are the three components of the momentum p of the particle. two-dimensional harmonic oscillator Hamiltonian. The equations of motion can easily be integrated to yield: ()= ()=

0 0

+

cos(

()= ()=

cos(

where 0 , assume

(4)

0

()= ()=

) sin(

0,

and

(5)

) )

sin( ,

is a

(6)

)

, , are constants which depend on the initial conditions (we to be positive). y A

yM

B

– xM

+ xM O

D

– yM

x

C

Figure 1: Projection of the classical trajectory of a particle in a two-dimensional harmonic potential onto the plane; we obtain an ellipse inscribed in the rectangle ABCD.

We see that the projection of the particle onto describes a uniform motion with a velocity of 0 . The projection onto the plane describes an ellipse inscribed in the rectangle ABCD of Figure 1. The direction the particle takes on this ellipse depends on the phase difference . When = , the ellipse reduces to the line AC. When is between and 0, the particle moves clockwise on the ellipse (“lefthanded” motion), with the axes of the ellipse parallel to and for = 2. When = 0, the ellipse reduces to the line BD. Finally, when is between 0 756



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

and , the particle moves counterclockwise on the ellipse (“right-handed” motion), with the axes parallel to and for = 2. Note that the ellipse reduces to a circle if = 2 and = . It is easy to determine several constants of the motion related to the projection of the motion onto the plane: – the total energy , which, according to (3), (5), (6), is equal to: =

1 2

2

(

2

+

2

)

(7)

– the energies: 1 2 1 = 2 =

2 2

(8a)

2 2

(8b)

of the projections of the motion onto and ; – the component of the orbital angular momentum

of the particle along

:

=

(9)

which, according to (5) and (6), is equal to: =

sin(

)

(10)

We see that is positive or negative depending on whether the motion is counterclockwise (0 ) or clockwise ( 0). is zero for the two rectilinear motions ( = and = 0). Finally, for a motion at a given energy, that is, according to (7), for a fixed value of 2 + 2 , is maximal when = 2 and the product is maximal, which implies = . Of all motions at a given energy, it is the counterclockwise (clockwise) motion which corresponds to the maximal (minimal) algebraic value of . 1-b.

The problem in quantum mechanics

,

The quantization rules of Chapter III enable us to obtain . The stationary states of the particle are given by: =(

+

)

=

1 2

2

,

,

from

,

(11)

with: 2

=

+ 2

2

+

(

2

+

2

)

(12a)

2

=

2

(12b)

According to the results of Complement FI , we know that we can choose a basis of eigenstates of composed of vectors of the form: =

(13) 757

COMPLEMENT DVI

where and :



is an eigenvector of

in the state space

associated with the variables

= and

(14)

is an eigenvector of

in the space

associated with the variable :

=

(15)

The total energy associated with the state (13) is then: =

+

(16)

Now, equation (15), which in fact describes the stationary states of a free particle in a one-dimensional problem, can be solved immediately; it yields: = (where

1 e 2 ~

~

(17)

is an arbitrary real constant), with: 2

=

2

(18)

The problem therefore reduces to the determination of the solutions of equation (14), that is, the energies and stationary states of a two-dimensional harmonic oscillator. This is the problem we shall now try to solve. We shall see that the eigenvalues of are degenerate: alone does not constitute a C.S.C.O. in . We must therefore add one or several other observables to in order to construct a C.S.C.O. In fact, we find in quantum mechanics the same constants of the motion as in classical mechanics: and , the energies of the projection of the motion onto and ; and , the component along of the orbital angular momentum L. Since commutes with neither nor , we shall see that a C.S.C.O. can be formed of , and (§ 2) or of and (§ 3).

Comments: (i) Formula (18) indicates that the eigenvalues of are all two-fold degenerate in the space . Furthermore, the degeneracy in = of the eigenvalues (16) of the total Hamiltonian is not due solely to the degeneracy of in and of in : two eigenvectors of of the form (13) can have the same total energy without their corresponding values of (and of ) being equal. ( ) commutes with the component of L, but not with and . This results from the fact that the potential energy written in (1) is rotation-invariant only about . Moreover, of the three operators , and , only one, , acts only in . In the study of the two-dimensional harmonic oscillator, therefore, we shall use only the observable . In Complement BVII , we shall study the isotropic three-dimensional harmonic oscillator, whose potential energy is invariant with respect to any rotation about an axis which passes through the origin; we shall see that all the components of L then commute with the Hamiltonian.

758



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

2.

Classification of the stationary states by the quantum numbers

2-a.

and

Energies; stationary states

To obtain the solutions of the eigenvalue equation (14), note that written: =

+

where

(19)

and

are both Hamiltonians of one-dimensional harmonic oscillators:

2

=

2

+

1 2

2

2

+

1 2

2

2

2

=

, can be

2

(20)

We know the eigenstates of in and the eigenstates of in . Their energies are, respectively, = ( + 1 2)~ and = ( + 1 2)~ (where and are positive integers or zero). The eigenstates of can thus be chosen in the form: =

(21)

where the corresponding energy =

+

=(

+

1 2

~ +

+

is given by: 1 2

~

+ 1)~

(22)

According to the properties of the one-dimensional harmonic oscillator, is nondegenerate in , and in . Consequently, a vector of , which is unique to within a constant factor, corresponds to a pair : and form a C.S.C.O. in . It will prove convenient to use the operators and (destruction operators of a quantum, relative to and respectively), defined by: 1 2 1 = 2 =

+ ~ (23)

+ ~

with: =

(24) ~

Since and act in different spaces, the four operators , , , , are: =

=1

and

, the only non-zero commutators between

(25) 759



COMPLEMENT DVI

The operators (the number of quanta relative to the of quanta relative to the axis) are given by:

axis) and

(the number

= =

(26)

which enables us to write =

+

=(

in the form: +

+ 1)~

(27)

We have, obviously: = =

(28)

The ground state 0 0

=

=0

is given by:

0 0

(29)

=0

The state defined by (21) can be obtained from cation of the operators and : 1 !

=

!

(

) (

)

)=

(2)

+

(

by the successive appli-

(30)

0 0

The corresponding wave function is the product of ment BV , formula (35)]: (

0 0

)!(

)!

e

2

(

2

+

2

) 2

( ) and

(

)

where the

are the Hermite polynomials (Complement BV , § 1).

2-b.

does not constitue a C.S.C.O. in

We see from (22) that the eigenvalues of =

( ) [cf. Comple-

(

)

(31)

are of the form:

= ( + 1)~

(32)

where: =

+

(33)

is a positive integer or zero. To each value of the energy correspond the various orthogonal eigenvectors: =

=0

=

1

=1

=0

=

(34)

Since there are ( +1) of these vectors, the eigenvalue is ( +1)-fold degenerate in . alone does not, therefore, constitute a C.S.C.O. On the other hand, we have seen that is a C.S.C.O.; this is also, obviously, true of et . 760



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

3.

Classification of the stationary states in terms of their angular momenta

3-a.

Significance and properties of the operator

In the preceding section, we identified the stationary states by the quantum numbers and . But the and axes do not enjoy a privileged position in this problem. Since the potential energy is invariant under rotation about , we could just as well have chosen another system of orthogonal axes and in the plane; we would have then obtained stationary states different from the preceding ones. Therefore, in order to take better advantage of the symmetry of the problem, we shall now consider the component of the angular momentum, defined by: =

(35)

Expressing get:

and

in terms of

and

, and

and

in terms of

and

= ~

, we

(36)

Now, the expression for =

+

in terms of the same operators is:

+1 ~

(37)

Since: +

=

+

=

=0 +

=0

(38)

we find that: [

]=0

(39)

We shall therefore look for a basis of eigenvectors common to 3-b.

and

.

Right and left circular quanta

We introduce the operators 1 ( 2 1 = ( 2

=

and

defined by:

) +

)

(40)

We see from this definition that the action of (or ) on yields a state which is a linear combination of and , that is, a stationary state which has 1 1 one less energy quantum ~ . Similarly, the action of (or ) on yields another stationary state which has one more energy quantum. In fact, we shall see that (or ) is similar to (or ), and that and can be interpreted as being destruction operators of a right and left “circular quantum” respectively. 761



COMPLEMENT DVI

First of all, using (40) and (25), it is simple to verify that the only non-zero commutators between the four operators , , , are: [

]=[

]=1

(41)

These relations are indeed analogous to (25). Moreover, these operators, in a way that is similar to (37); since: 1 2 1 = 2 =

+

can be written, in terms of

+

+

+

(42)

+1 ~

(43)

we have: =

+

In addition, using (36), we see that: =~

(44)

If we introduce the operators

and

(the number of right and left “circular quanta”):

= =

(45)

formulas (43) and (44) become: =(

+

+ 1)~

= ~(

)

Thus, while maintaining 3-c.

(46) in a form as simple as (27), we have simplified that of

.

Stationary states of well-defined angular momentum

Using the operators and , we can now go through the same arguments we used for and . It follows that the spectra of and are composed of all positive integers and zero. In addition, specifying a pair of such integers determines uniquely (to within a constant factor) the eigenvector common to and , associated with these eigenvalues, which is written: =

1 (

)!( )!

( ) ( )

0 0

(47)

and therefore form a C.S.C.O. in . Thus we see, by using (46), that is also an eigenvector of and of , with the eigenvalues ( + 1)~ and ~, where and are given by: = = 762

+ (48)



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

Equations (48) enable us to understand the origin of the name of right or left “circular quanta”. The action of the operator on yields a state with one more quantum, to which, since has increased by one, an additional angular momentum +~ must be attributed (this corresponds to a counterclockwise rotation about ). Similarly, yields a state with one more quantum, of angular momentum ~ (clockwise rotation). Since and are positive integers (or zero), our results are in agreement with those of the preceding section: the eigenvalues of are of the form ( + 1)~ , where is a positive integer or zero; their degree of degeneracy is ( + 1) since, for fixed , we can have: = = .. .

; 1;

=0 =1 (49)

=0

;

=

Furthermore, we see that the eigenvalues of are of the form ~, where is a positive or negative integer or zero, which is the result that was established for the general case in Chapter VI. In addition, table (49) tells us which values of are associated with a given value of . For example, for the ground state, we have = 0, and therefore, necessarily, = 0; for the first excited state, we can have = 1 and = 0, or =0 and = 1, which yields either = +1 or = 1. In general, formulas (48) and (49) show that, for a given energy level ( + 1)~ , the possible values of are: =

2

4

+2

It follows that, to a pair of values of a constant factor): =

+ 2

and

=

(50) and

, there corresponds a single vector (to within

2

therefore form a C.S.C.O. in

.

Comment:

For a given value of the total energy (labeled by ), the states = =0 and correspond to the maximal ( ~) and minimal ( ~) values of . =0 = These states therefore recall the classical right and left circular motions associated with a given value of the total energy, for which takes on its maximal and minimal values (see § 1-a). 3-d.

Wave functions associated with the eigenstates common to

and

To conserve the symmetry of the problem with respect to rotation about use polar coordinates, setting: = cos = sin

, we shall

0 0

2

(51)

763

COMPLEMENT DVI



Now, what is the action of the operators and on a function of and ? We shall begin by determining their action on a function of and . Knowing that of and and therefore that of (and, by analogy, that of ), we can use (40), which yields: 1 2

=

(

1

)+

(52)

According to the rules for differentiating functions of several variables, we then obtain: e

=

1

+

2

(53)

Similarly: =

e 2

=

e 2

1

(54)

and: 1

+

e

=

+ 1

+

2

(55)

To calculate the wave functions resenting and to the function 0 0 ( 0 0(

)=

2 2

e

( ), simply apply the differential operators rep), which is, according to (31):

2

(56)

Now it can be seen from (54) and (55) that the action of e ( ) is given by: e

( ) =

e

( ) =

e(

+1)

+

2 e(

( )

1d d

( )

1d d

1)

2

(or of

) on a function of the form

(57)

Through the repeated application of these relations to (56), we see that the -dependence of ) ( ) is simply given by: e ( . This is a general result, established in Chapter VI: the -dependence of an eigenfunction of of eigenvalue ~ is e . 2 2 2 If, in (57), we choose ( ) = e , then: e

e

2 2

2

Applying the operator 0(

e(

=

+1

to the function

)=

e (

+1)

(

)

e

e

2 2

0 0( 2 2

) 2

2

(58) times, we obtain: (59)

)!

An analogous calculation yields: 0

764

(

)=

e ( )!

(

)

e

2 2

2

(60)



ANGULAR MOMENTUM OF STATIONARY STATES OF A TWO-DIMENSIONAL HARMONIC OSCILLATOR

These wave functions are normalized. For a given energy level ( + 1)~ , the wave functions (59) and (60) correspond to the limiting values + and of the quantum number . Their dependence is particularly simple: their modulus reaches a maximum for = . Therefore (as in the case of a one-dimensional harmonic oscillator), the spatial spread of these wave functions increases with the energy ( + 1)~ with which they are associated. In the same way, application of the operators (or ) to (59) and (60) permits the construction of the functions ( ) for any and . The results obtained for the first excited levels are given in Table I.

=0

=0

0 0(

=1

)=

2 2

e

1 0(

)=

e

0 1(

)=

e

=2

2 0(

)=

=0

1 1(

0 2(

=1 =

=2

1

=

2

2

2 2

2

e

2 2

2

e

(

)2 e

)=

(

)2

)=

(

)2 e

2

2

Table I: Eigenfunctions common to the Hamiltonian levels of the two-dimensional harmonic oscillator.

2 2

2

2 2

1 e 2 2

2

)=e

2 2

2

(

e

2

2

and the observable

Comment: The functions ) given in (59) are proportional to e 0( generally, all their linear combinations are of the form: (

e2

2 2

2

(

, for the first

e )

e )

(61)

(where is an arbitrary function of one variable) and are eigenfunctions of eigenvalue zero. As expected, it can easily be shown from (55) that: (

)=0

4.

)=e

with the (62)

Similarly, the subspace of eigenfunctions of of the form: (

. More

2 2

2

(

e

)

of eigenvalue zero is composed of functions (63)

Quasi-classical states

Using the properties of the one-dimensional harmonic oscillator, we can easily calculate the time evolution of the state vector and the mean values of the various observables 765

COMPLEMENT DVI



of the two-dimensional oscillator. For example, it is not difficult to show that in the mean values ( ) and ( ), as well as ( ) and ( ), only the Bohr frequency appears. Moreover, it can be shown that these mean values exactly obey the classical equations of motion. In this section, we shall be concerned with the properties and evolution of the quasi-classical states of the two-dimensional harmonic oscillator. 4-a.

Definition of the states

and

To construct a quasi-classical state of the two-dimensional harmonic oscillator, we can base our reasoning on the one-dimensional oscillator (cf. Complement GV). Recall that, in a quasi-classical state associated with a given classical motion, the mean values $\langle X\rangle(t)$ and $\langle P\rangle(t)$ coincide at each instant with the classical values $x(t)$ and $p(t)$. Similarly, the mean value of the Hamiltonian is equal (to within a half-quantum $\hbar\omega/2$) to the classical energy. We showed in Complement GV that, at any time, the quasi-classical states are eigenstates of the destruction operator $a$ and can be written:
$$|\alpha\rangle = \sum_{n=0}^{\infty} c_n(\alpha)\,|\varphi_n\rangle \tag{64}$$
where $\alpha$ is the eigenvalue of $a$, and:
$$c_n(\alpha) = \frac{\alpha^n}{\sqrt{n!}}\;e^{-|\alpha|^2/2} \tag{65}$$
In the case which concerns us here, we can use the rules of the tensor product to obtain the quasi-classical states in the form:
$$|\psi\rangle = |\alpha_x, \alpha_y\rangle = \sum_{n_x=0}^{\infty}\sum_{n_y=0}^{\infty} c_{n_x}(\alpha_x)\,c_{n_y}(\alpha_y)\,|\varphi_{n_x}\varphi_{n_y}\rangle \tag{66}$$
with:
$$a_x|\psi\rangle = \alpha_x|\psi\rangle \qquad a_y|\psi\rangle = \alpha_y|\psi\rangle \tag{67}$$
We are then sure that $\langle X\rangle$, $\langle Y\rangle$, $\langle P_x\rangle$, $\langle P_y\rangle$, $\langle H_{xy}\rangle$ are the same as the corresponding classical quantities. Now, returning to definition (40) and using (67), we see that:
$$a_d|\psi\rangle = \alpha_d|\psi\rangle \qquad a_g|\psi\rangle = \alpha_g|\psi\rangle \tag{68}$$
with:
$$\alpha_d = \frac{1}{\sqrt{2}}\,(\alpha_x - i\alpha_y) \qquad \alpha_g = \frac{1}{\sqrt{2}}\,(\alpha_x + i\alpha_y) \tag{69}$$
Therefore, the state $|\psi\rangle$ is also an eigenvector of $a_d$ and $a_g$, with the eigenvalues given in (69). We shall denote by $|\alpha_d, \alpha_g\rangle$ the eigenvector common to $a_d$ and $a_g$ associated with the eigenvalues $\alpha_d$ and $\alpha_g$. It is easy to show that the expansion of $|\alpha_d, \alpha_g\rangle$ on the $\{|\chi_{n_d,n_g}\rangle\}$ basis has the same form as that of $|\alpha_x, \alpha_y\rangle$ on the $\{|\varphi_{n_x}\varphi_{n_y}\rangle\}$ basis:
$$|\alpha_d, \alpha_g\rangle = \sum_{n_d=0}^{\infty}\sum_{n_g=0}^{\infty} c_{n_d}(\alpha_d)\,c_{n_g}(\alpha_g)\,|\chi_{n_d,n_g}\rangle \tag{70}$$
where the coefficients $c_n$ are given by (65). It follows from (68) and (69) that:
$$|\alpha_d|^2 + |\alpha_g|^2 = |\alpha_x|^2 + |\alpha_y|^2 \tag{71}$$
Because of the properties of the states $|\alpha\rangle$ (cf. Complement GV, § 3-a), we see that if:
$$|\psi(0)\rangle = |\alpha_x, \alpha_y\rangle = |\alpha_d, \alpha_g\rangle \tag{72}$$
the state vector at the instant $t$ will be:
$$|\psi(t)\rangle = e^{-i\omega t}\,|\alpha_x e^{-i\omega t}, \alpha_y e^{-i\omega t}\rangle = e^{-i\omega t}\,|\alpha_d e^{-i\omega t}, \alpha_g e^{-i\omega t}\rangle \tag{73}$$

4-b. Mean values and root mean square deviations of the various observables

We set:
$$\alpha_x = \lambda_x\,e^{i\varphi_x} \qquad \alpha_y = \lambda_y\,e^{i\varphi_y} \tag{74}$$
Using formulas (93) of Complement GV, we obtain:
$$\langle X\rangle(t) = \sqrt{\frac{2\hbar}{m\omega}}\,\lambda_x\cos(\omega t - \varphi_x) \qquad \langle Y\rangle(t) = \sqrt{\frac{2\hbar}{m\omega}}\,\lambda_y\cos(\omega t - \varphi_y) \tag{75}$$
$$\langle P_x\rangle(t) = -\sqrt{2m\hbar\omega}\,\lambda_x\sin(\omega t - \varphi_x) \qquad \langle P_y\rangle(t) = -\sqrt{2m\hbar\omega}\,\lambda_y\sin(\omega t - \varphi_y) \tag{76}$$
Comparing (75) and (76) with (5) and (6), we see that:
$$\alpha_x = \sqrt{\frac{m\omega}{2\hbar}}\,x_M\,e^{i\varphi_x} \qquad \alpha_y = \sqrt{\frac{m\omega}{2\hbar}}\,y_M\,e^{i\varphi_y} \tag{77}$$
where $x_M$, $y_M$, $\varphi_x$, $\varphi_y$ are the parameters defining the classical motion which the state $|\psi\rangle$ best reproduces.
Also:
$$\langle N_x\rangle = \lambda_x^2 \qquad \langle N_y\rangle = \lambda_y^2 \qquad \langle N_d\rangle = \lambda_d^2 = |\alpha_d|^2 \qquad \langle N_g\rangle = \lambda_g^2 = |\alpha_g|^2 \tag{78}$$
and:
$$\lambda_d^2 = \frac12\bigl[\lambda_x^2 + \lambda_y^2 + 2\lambda_x\lambda_y\sin(\varphi_y - \varphi_x)\bigr] \qquad \lambda_g^2 = \frac12\bigl[\lambda_x^2 + \lambda_y^2 - 2\lambda_x\lambda_y\sin(\varphi_y - \varphi_x)\bigr] \tag{79}$$
that is, according to (46):
$$\langle H_{xy}\rangle = \hbar\omega\bigl(\lambda_d^2 + \lambda_g^2 + 1\bigr) = \hbar\omega\bigl(\lambda_x^2 + \lambda_y^2 + 1\bigr) \tag{80}$$
and:
$$\langle L_z\rangle = 2\hbar\,\lambda_x\lambda_y\sin(\varphi_y - \varphi_x) = \hbar\bigl(\lambda_d^2 - \lambda_g^2\bigr) \tag{81}$$
According to (77), $\langle L_z\rangle$ is the same as the classical value of $\mathcal L_z$ [formula (10)].
Now let us consider the root mean square deviations of the position and momentum and then of the energy and angular momentum in a state $|\alpha_x, \alpha_y\rangle$. Directly applying the results of Complement GV, we obtain:
$$\Delta X = \Delta Y = \sqrt{\frac{\hbar}{2m\omega}} \qquad \Delta P_x = \Delta P_y = \sqrt{\frac{m\hbar\omega}{2}} \tag{82}$$
The root mean square deviations of the position and momentum are independent of $\lambda_d$ and $\lambda_g$; if $\lambda_d$ and $\lambda_g$ are much greater than 1, the position and momentum of the oscillator have a very small relative spread about $\langle X\rangle$, $\langle Y\rangle$ and $\langle P_x\rangle$, $\langle P_y\rangle$. Finally, let us calculate the root mean square deviations $\Delta H_{xy}$ for the energy and $\Delta L_z$ for the angular momentum. As in Complement GV:
$$\Delta N_d = \lambda_d \qquad \Delta N_g = \lambda_g \tag{83}$$
But the Hamiltonian $H_{xy}$ involves $N_d + N_g$, and $L_z$ is proportional to $N_d - N_g$. We must now calculate, for example:
$$\bigl[\Delta(N_d + N_g)\bigr]^2 = \bigl\langle(N_d + N_g)^2\bigr\rangle - \bigl\langle N_d + N_g\bigr\rangle^2 = (\Delta N_d)^2 + (\Delta N_g)^2 + 2\bigl[\langle N_d N_g\rangle - \langle N_d\rangle\langle N_g\rangle\bigr] \tag{84}$$
According to (66), the state of the system is a tensor product, which means that the observables $N_d$ and $N_g$ are not correlated:
$$\langle N_d N_g\rangle = \langle N_d\rangle\langle N_g\rangle \tag{85}$$
It follows that:
$$\bigl[\Delta(N_d + N_g)\bigr]^2 = (\Delta N_d)^2 + (\Delta N_g)^2 \tag{86}$$
that is:
$$\Delta H_{xy} = \hbar\omega\sqrt{\lambda_d^2 + \lambda_g^2} = \hbar\omega\sqrt{\lambda_x^2 + \lambda_y^2} \tag{87}$$
Similarly:
$$\Delta L_z = \hbar\sqrt{\lambda_d^2 + \lambda_g^2} = \hbar\sqrt{\lambda_x^2 + \lambda_y^2} \tag{88}$$
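Relation (81) lends itself to a quick numerical check. The following sketch (not part of the original text) builds truncated coherent states of the two one-dimensional oscillators and evaluates $\langle L_z\rangle = i\hbar(a_y^{\dagger}a_x - a_x^{\dagger}a_y)$; units with $\hbar = m = \omega = 1$ and the particular values of $\alpha_x$, $\alpha_y$ are arbitrary assumptions made only for the illustration.

```python
import numpy as np
from math import factorial

# Numerical check of <L_z> = 2*hbar*lambda_x*lambda_y*sin(phi_y - phi_x)
# for the quasi-classical state |alpha_x, alpha_y>, with hbar = 1 and a
# basis truncated at N oscillator levels (assumed illustrative values).
N = 40
a = np.diag(np.sqrt(np.arange(1, N)), 1)      # destruction operator a

def coherent(alpha):
    c = np.array([alpha**k / np.sqrt(factorial(k)) for k in range(N)])
    return c / np.linalg.norm(c)              # renormalize after truncation

alpha_x = 1.3 * np.exp(1j * 0.4)              # assumed values
alpha_y = 0.8 * np.exp(1j * 1.1)
psi = np.kron(coherent(alpha_x), coherent(alpha_y))

# L_z = i*hbar*(a_y^dag a_x - a_x^dag a_y) = hbar*(N_d - N_g)
Lz = 1j * (np.kron(a, a.T) - np.kron(a.T, a))
mean_Lz = np.vdot(psi, Lz @ psi).real

lx, phx = abs(alpha_x), np.angle(alpha_x)
ly, phy = abs(alpha_y), np.angle(alpha_y)
print(mean_Lz, 2 * lx * ly * np.sin(phy - phx))   # the two numbers agree
```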




Complement EVI A charged particle in a magnetic field: Landau levels

1. Review of the classical problem
   1-a. Motion of the particle
   1-b. The vector potential. The classical Lagrangian and Hamiltonian
   1-c. Constants of the motion in a uniform field
2. General quantum mechanical properties of a particle in a magnetic field
   2-a. Quantization. Hamiltonian
   2-b. Commutation relations
   2-c. Physical consequences
3. Case of a uniform magnetic field
   3-a. Eigenvalues of the Hamiltonian
   3-b. The observables in a particular gauge
   3-c. The stationary states
   3-d. Time evolution

Thus far, we have been considering, for various special cases, the properties of a particle subjected to a scalar potential $V(\mathbf r)$ (representing, for example, the effect of an electric field on a charged particle). Chapter V (the harmonic oscillator) and Chapter VII (particle subjected to a central potential) treat other examples of scalar potentials. Here we shall be concerned with a complementary problem, that of the properties of a particle subjected to a vector potential $\mathbf A(\mathbf r)$ (a charged particle placed in a magnetic field). We shall encounter a number of purely quantum mechanical effects, such as equally spaced energy levels in a uniform magnetic field (Landau levels)¹. Before studying the problem from a quantum mechanical point of view, we shall rapidly review some classical results.

1. Review of the classical problem

1-a. Motion of the particle

When a particle of position $\mathbf r$ and charge $q$ is subjected to a magnetic field $\mathbf B(\mathbf r)$, the force $\mathbf f$ exerted on it is the Lorentz force:
$$\mathbf f = q\,\mathbf v\times\mathbf B(\mathbf r) \tag{1}$$
where:
$$\mathbf v = \frac{\mathrm d\mathbf r}{\mathrm dt} \tag{2}$$

1 This equal spacing is, as we shall show, a consequence of the properties of the harmonic oscillator, and it could have been treated in Chapter V. However, we shall also see that the properties of angular momentum are useful in the study and classification of the stationary states of the particle. This is why this complement follows Chapter VI.


is the velocity of the particle. Its motion obeys the fundamental law of dynamics:
$$m\frac{\mathrm d\mathbf v}{\mathrm dt} = \mathbf f \tag{3}$$
(where $m$ is the mass of the particle). In the rest of this complement, we shall often be considering the case in which the magnetic field is uniform; we shall choose an axis $Oz$ parallel to this field. By solving the equation of motion (3), one can show that in this case the three coordinates $x(t)$, $y(t)$ and $z(t)$ of the particle are given by:
$$x(t) = x_0 + r\cos(\omega t - \varphi_0) \qquad y(t) = y_0 + r\sin(\omega t - \varphi_0) \qquad z(t) = z_0 + v_{0z}\,t \tag{4}$$
where $x_0$, $y_0$, $z_0$, $r$, $\varphi_0$ and $v_{0z}$ are six constant parameters which depend on the initial conditions; the cyclotron frequency $\omega$ is given by:
$$\omega = -\frac{qB}{m} \tag{5}$$
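As a rough order of magnitude, formula (5) can be evaluated for an electron; the field value and the transverse speed below are illustrative assumptions, not taken from the text.

```python
# Cyclotron frequency omega = -qB/m and radius r = v_perp/omega for an
# electron; B and v_perp are assumed illustrative values.
q = -1.602176634e-19      # electron charge (C)
m = 9.1093837015e-31      # electron mass (kg)
B = 1.0                   # magnetic field (T), assumed
v_perp = 1.0e6            # speed in the xOy plane (m/s), assumed

omega = -q * B / m        # positive for q < 0, as in equation (5)
r = v_perp / omega        # radius of the circular projection of the helix
print(f"omega = {omega:.3e} rad/s, r = {r:.3e} m")
# omega is about 1.76e11 rad/s and r about 5.7e-6 m
```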

Equations (4) show that the projection $Q$ of the position of the particle onto the $xOy$ plane performs a uniform circular motion, of angular velocity $\omega$ and initial phase $\varphi_0$, on a circle of radius $r$ whose center is the point $C_0$, with coordinates $x_0$ and $y_0$. The motion of the projection of $M$ onto $Oz$ is simply rectilinear and uniform. It follows that the particle moves in space along a circular helix (cf. Fig. 1), whose axis is parallel to $Oz$ and passes through $C_0$. If we are concerned only with the motion of the point $Q$, the projection of $M$ onto the $xOy$ plane, we study the behavior of the vector:
$$\boldsymbol\rho = x\,\mathbf e_x + y\,\mathbf e_y \tag{6}$$
(where $\mathbf e_x$ and $\mathbf e_y$ are the unit vectors of the $Ox$ and $Oy$ axes). The velocity of $Q$ is:
$$\mathbf v_\perp = \frac{\mathrm d\boldsymbol\rho}{\mathrm dt} \tag{7}$$
It is therefore convenient to introduce the components $\xi$ and $\eta$ of the vector $\mathbf{C_0Q}$:
$$\xi = x - x_0 \qquad \eta = y - y_0 \tag{8}$$
Since $Q$ performs a uniform circular motion about $C_0$, we have:
$$\mathbf v_\perp = \omega\,\mathbf e_z\times\mathbf{C_0Q} \tag{9}$$
(where $\mathbf e_z$ is the unit vector of the $Oz$ axis). This implies that the coordinates $x_0$ and $y_0$ of $C_0$ are related to the coordinates of $M$ and to the components of $\mathbf v_\perp$ by:
$$x_0 = x - \frac{1}{\omega}\,v_y \qquad y_0 = y + \frac{1}{\omega}\,v_x \tag{10}$$

(10)



Figure 1: Classical trajectory of a charged particle in a uniform magnetic field $\mathbf B$ parallel to $Oz$. The particle moves at constant velocity along a circular helix whose axis, parallel to $Oz$, passes through the point $C_0$. The figure is drawn for $q < 0$ (the case of the electron), that is, $\omega > 0$.

1-b. The vector potential. The classical Lagrangian and Hamiltonian

To describe the magnetic field $\mathbf B(\mathbf r)$, one can use a vector potential $\mathbf A(\mathbf r)$ which is, by definition, related to $\mathbf B(\mathbf r)$ by:
$$\mathbf B(\mathbf r) = \boldsymbol\nabla\times\mathbf A(\mathbf r) \tag{11}$$
For example, if the field $\mathbf B$ is uniform, one can choose:
$$\mathbf A(\mathbf r) = -\frac12\,\mathbf r\times\mathbf B \tag{12}$$
We know, furthermore, that when $\mathbf B(\mathbf r)$ is given, condition (11) does not determine $\mathbf A(\mathbf r)$ uniquely: the gradient of an arbitrary function of $\mathbf r$ can be added to $\mathbf A(\mathbf r)$ without changing² $\mathbf B(\mathbf r)$. It can be shown (cf. Appendix III, § 4-b) that the Lagrange function $\mathscr L(\mathbf r, \mathbf v)$ of the particle is given by:
$$\mathscr L(\mathbf r, \mathbf v) = \frac12\,m\mathbf v^2 + q\,\mathbf v\cdot\mathbf A(\mathbf r) \tag{13}$$

2 For example, for a uniform field parallel to $Oz$, one could choose, instead of the vector $\mathbf A(\mathbf r)$ given by (12), the vector whose components are $A_x = 0$, $A_y = Bx$, $A_z = 0$.
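Both gauges mentioned here describe the same uniform field; this is easy to verify symbolically. In the sketch below the second gauge is assumed to be $\mathbf A = (0, Bx, 0)$, as reconstructed in the footnote above.

```python
import sympy as sp

x, y, z, B = sp.symbols('x y z B')

def curl(A):
    Ax, Ay, Az = A
    return (sp.diff(Az, y) - sp.diff(Ay, z),
            sp.diff(Ax, z) - sp.diff(Az, x),
            sp.diff(Ay, x) - sp.diff(Ax, y))

A_symmetric = (-B*y/2, B*x/2, 0)   # gauge (12) with B along Oz
A_landau    = (0, B*x, 0)          # gauge of the footnote (assumed form)

print(curl(A_symmetric))   # (0, 0, B)
print(curl(A_landau))      # (0, 0, B)
```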


It follows that $\mathbf p$, the conjugate momentum of the position $\mathbf r$, is related to $\mathbf v$ and $\mathbf A(\mathbf r)$ by:
$$\mathbf p = \frac{\partial\mathscr L}{\partial\mathbf v}(\mathbf r, \mathbf v) = m\mathbf v + q\mathbf A(\mathbf r) \tag{14}$$
The classical Hamiltonian $\mathscr H(\mathbf r, \mathbf p)$ is then:
$$\mathscr H(\mathbf r, \mathbf p) = \frac{1}{2m}\bigl[\mathbf p - q\mathbf A(\mathbf r)\bigr]^2 \tag{15}$$
It will prove convenient to set:
$$\mathscr H(\mathbf r, \mathbf p) = \mathscr H_\perp(\mathbf r, \mathbf p) + \mathscr H_\parallel(\mathbf r, \mathbf p) \tag{16}$$
with:
$$\mathscr H_\perp(\mathbf r, \mathbf p) = \frac{1}{2m}\Bigl\{\bigl[p_x - qA_x(\mathbf r)\bigr]^2 + \bigl[p_y - qA_y(\mathbf r)\bigr]^2\Bigr\} \qquad \mathscr H_\parallel(\mathbf r, \mathbf p) = \frac{1}{2m}\bigl[p_z - qA_z(\mathbf r)\bigr]^2 \tag{17}$$

Comment:

In this case, unlike that of a scalar potential $V(\mathbf r)$, relation (14) shows that the momentum $\mathbf p$ is not equal to the mechanical momentum $m\mathbf v$. Also, comparing (14) with (15), we see that $\mathscr H$ is equal to the kinetic energy $m\mathbf v^2/2$ of the particle; this results from the fact that since the Lorentz force written in (1) is always perpendicular to $\mathbf v$, it does no work during the motion. Similarly, it must be noted that the angular momentum:
$$\boldsymbol{\mathcal L} = \mathbf r\times\mathbf p \tag{18}$$
is different from the moment of the mechanical momentum $m\mathbf v$:
$$\boldsymbol\lambda = \mathbf r\times m\mathbf v \tag{19}$$

1-c.

Constants of the motion in a uniform field

Consider the special case in which the field $\mathbf B$ is uniform. The motion of the particle (§ 1-a) is such that $\mathscr H_\perp$ and $\mathscr H_\parallel$, defined in (17), are constants of the motion³. If we substitute (14) into (10), we obtain:
$$x_0 = x - \frac{1}{m\omega}\bigl[p_y - qA_y(\mathbf r)\bigr] \qquad y_0 = y + \frac{1}{m\omega}\bigl[p_x - qA_x(\mathbf r)\bigr] \tag{20}$$
It follows that the radius $r$ of the helical trajectory satisfies:
$$r^2 = (x - x_0)^2 + (y - y_0)^2 = \frac{1}{m^2\omega^2}\Bigl\{\bigl[p_x - qA_x(\mathbf r)\bigr]^2 + \bigl[p_y - qA_y(\mathbf r)\bigr]^2\Bigr\} \tag{21}$$

3 This follows from the fact that, according to (14) and (17), $\mathscr H_\perp$ and $\mathscr H_\parallel$ are equal, respectively, to the kinetic energies $m\mathbf v_\perp^2/2$ and $m v_z^2/2$ associated with the motions perpendicular and parallel to $Oz$.

$r^2$ is therefore proportional to the Hamiltonian $\mathscr H_\perp$. Similarly, let $\boldsymbol\lambda_\theta$ be the moment of the mechanical momentum $m\mathbf v$ with respect to the center $C_0$ of the circle:
$$\boldsymbol\lambda_\theta = \mathbf{C_0M}\times m\mathbf v \tag{22}$$
The component $\lambda_{\theta z}$ of this moment can then be written, with (20) taken into account:
$$\lambda_{\theta z} = m\bigl[(x - x_0)v_y - (y - y_0)v_x\bigr] = \frac{1}{m\omega}\Bigl\{\bigl[p_x - qA_x(\mathbf r)\bigr]^2 + \bigl[p_y - qA_y(\mathbf r)\bigr]^2\Bigr\} = \frac{2}{\omega}\,\mathscr H_\perp \tag{23}$$
$\lambda_{\theta z}$ is therefore a constant of the motion, as might have been expected. On the other hand, the component $\lambda_z$ of the moment of the mechanical momentum $m\mathbf v$ with respect to $O$ is not, in general, constant, since:
$$\lambda_z = \lambda_{\theta z} + m\bigl[x_0\,v_y(t) - y_0\,v_x(t)\bigr] \tag{24}$$
Therefore, according to (4), $\lambda_z$ varies sinusoidally in time. Finally, consider the projection onto $Oz$ of the angular momentum $\boldsymbol{\mathcal L} = \mathbf r\times\mathbf p$:
$$\mathcal L_z = x\,p_y - y\,p_x \tag{25}$$
According to (14), it can be written:
$$\mathcal L_z = x\bigl[m v_y + qA_y(\mathbf r)\bigr] - y\bigl[m v_x + qA_x(\mathbf r)\bigr] \tag{26}$$
It therefore depends explicitly on the gauge chosen, that is, on the vector potential $\mathbf A$ picked to describe the magnetic field. In most cases, $\mathcal L_z$ is not a constant of the motion. Nevertheless, if one chooses the gauge given in (12), one obtains from (4):
$$\mathcal L_z = \frac{m\omega}{2}\bigl[r^2 - x_0^2 - y_0^2\bigr] \tag{27}$$
$\mathcal L_z$ is then a constant of the motion. Relation (27) does not have a simple physical interpretation, since it is valid only in a particular gauge. However, it will prove useful to us in the following sections for the quantum mechanical study of the problem.
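These conservation properties are easy to confirm numerically by integrating the equation of motion (3) with the Lorentz force (1). The sketch below is only an illustration under assumed values of $q$, $m$, $B$ and the initial conditions; it uses SciPy's general-purpose integrator rather than anything specific to this problem.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Numerical check that H_perp and the gauge-(12) value of L_z in (27)
# are conserved along the classical motion; all numbers are assumed.
q, m, B = -1.0, 1.0, 1.0
omega = -q * B / m

def rhs(t, s):
    x, y, z, vx, vy, vz = s
    # f = q v x B with B along Oz
    return [vx, vy, vz, (q/m)*vy*B, -(q/m)*vx*B, 0.0]

sol = solve_ivp(rhs, (0.0, 20.0), [1.0, 0.0, 0.0, 0.0, 1.5, 0.3],
                dense_output=True, rtol=1e-9, atol=1e-12)
x, y, z, vx, vy, vz = sol.sol(np.linspace(0.0, 20.0, 5))

H_perp = 0.5 * m * (vx**2 + vy**2)                        # constant
Lz_gauge12 = m*(x*vy - y*vx) - 0.5*m*omega*(x**2 + y**2)  # constant, cf. (27)
print(H_perp)
print(Lz_gauge12)
```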

2. General quantum mechanical properties of a particle in a magnetic field

2-a. Quantization. Hamiltonian

Consider a particle placed in an arbitrary magnetic field described by the vector potential $\mathbf A(\mathbf r)$. In quantum mechanics, the vector potential becomes an operator, a function of the three observables $X$, $Y$ and $Z$. The operator $H$, the Hamiltonian of the particle, can be obtained from (15):
$$H = \frac{1}{2m}\bigl[\mathbf P - q\mathbf A(\mathbf R)\bigr]^2 \tag{28}$$
According to (14), the operator $\mathbf V$ associated with the velocity of the particle is given by:
$$\mathbf V = \frac{1}{m}\bigl[\mathbf P - q\mathbf A(\mathbf R)\bigr] \tag{29}$$



COMPLEMENT EVI

which enables us to write =

2

2-b.

in the form:

V2

(30)

Commutation relations

The observables R and P satisfy the canonical commutation relations: [

]=[

]=[

]= ~

(31)

The other commutators between components of R and P are zero. Two components of P therefore commute. However, we see from (29) that the same is not true for V; for example: [

]=

2

[

(R)] + [

(R)

]

(32)

This expression is easy to calculate, using the rule given in Complement BII [cf. formula (48)]: [

~

]=

=

2

~ 2

(R)

(33a)

Similarly, it can be shown that: ~

[

]=

2

[

]=

2

~

(R)

(33b)

(R)

(33c)

The magnetic field therefore enters explicitly into the commutation relations for the velocity. However, since A(R) commutes with , and , relation (29) implies that: [

]=

1

[

]=

~

(34a)

and, similarly: [

]=[

]=

~

(34b)

(the other commutators between a component of R and a component of V are zero). From these relations, it can be deduced (cf. Complement CIII ) that: ∆



~ 2

(35)

(with analogous inequalities for the components along and ). The physical consequences of the Heisenberg uncertainty relations are therefore not modified by the presence of a magnetic field. 776
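The commutator (33a) can also be verified symbolically by letting the velocity operators act on an arbitrary wave function in the $\{|\mathbf r\rangle\}$ representation. This is only a check of the algebra above, written with SymPy.

```python
import sympy as sp

# Symbolic check of (33a): acting on an arbitrary function f, the commutator
# [V_x, V_y] reduces to (i*hbar*q/m**2)*B_z, with
# V_j = (-i*hbar*d/dx_j - q*A_j)/m in the position representation.
x, y, z = sp.symbols('x y z')
hbar, q, m = sp.symbols('hbar q m', real=True)
f  = sp.Function('f')(x, y, z)
Ax = sp.Function('A_x')(x, y, z)
Ay = sp.Function('A_y')(x, y, z)

Vx = lambda g: (-sp.I*hbar*sp.diff(g, x) - q*Ax*g) / m
Vy = lambda g: (-sp.I*hbar*sp.diff(g, y) - q*Ay*g) / m

commutator = sp.expand(Vx(Vy(f)) - Vy(Vx(f)))
Bz = sp.diff(Ay, x) - sp.diff(Ax, y)
print(sp.simplify(commutator - sp.I*hbar*q/m**2 * Bz * f))   # prints 0
```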



A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

Finally, let us calculate the commutation relations between the components of the operator: Λ=

R

(36)

V

associated with the moment with respect to [Λ

Λ ]=

2

=

2

[

of the mechanical momentum4 . We obtain:

] [

]

+ +

[

]

2

[

2

2

[

]

]+[

]

(37)

+

(38)

that is, with (33) and (34) taken into account: [Λ

Λ ]= ~

+

+

2

+

It follows that: [Λ

Λ ]= ~ Λ +

(39)

R B(R)

(the other commutators can be obtained by cyclic permutation of the indices , and ). When the field B is not zero, the commutation relations of Λ are completely different from those of L. The operator Λ therefore does not, a priori, possess the properties of angular momenta proved in Chapter VI. 2-c.

.

Physical consequences

Evolution of R

The time variation of the mean position of the particle is given by Ehrenfest’s theorem: ~

d R = [R d

] =

R

2

V2

(40)

[according to formula (30)]. Equations (34) are not difficult to interpret, since, substituted into (40), they yield: d R = V d

(41)

As in the case in which the magnetic field is zero, the mean velocity is therefore equal to the derivative of R . Equation (41) is the quantum mechanical analogue of (2). .

Evolution of V . The Lorentz force Let us calculate the time derivative of the mean value V of the velocity: ~

d V = d

V

2

V2

4 Of course, the components of the angular momentum L = R tation relations.

(42) P always satisfy the usual commu-

777

COMPLEMENT EVI



Since, according to relations (33): V2

=

2

=

[ ~

=

+

2

2

+ ]+[

]

(R)

2

+

[

(R)

+

]+[

]

(R) +

(R)

(43)

it is easy to see that: d V = F(R V) d where the operator F(R V) is defined by:

(44)

F(R V) =

V B(R) B(R) V (45) 2 The last two relations are simply the analogues of the classical relations (1) and (3). Here, we obtain a symmetrized expression for F(R V) (cf. Chap. III, § B-5), since R and V do not commute; a minus sign appears since the vector product is antisymmetric. Evolution of Λ

.

Now let us evaluate: d ~ Λ = [Λ ] d To do so, let us calculate, for example, the commutator [ [

]= =

But and equal to: [

[ ~

]+[ (

commute, as do ]= =

[ ~

(

]

[

]+[

]:

]

) + ~( and

(46)

[

]

)

(47)

. The commutator we are calculating is therefore ]

[

]

) + ~(

]

)

Taking half the sum of these two expressions, we find d 1 Λ = + d 2 Analogous arguments give the derivative of Λ

[

d Λ d

(48) in the form: (49)

and Λ ; finally:

d 1 Λ = R F(R V) F(R V) R (50) d 2 The classical analogue of this relation is: d λ = r f (r v) (51) d which expresses a well-known theorem: the time derivative of the moment of the mechanical momentum with respect to a fixed point is equal to the moment with respect to of the force exerted on the particle. 778

3. Case of a uniform magnetic field

When the magnetic field is uniform, the preceding general study can easily be pursued further. We choose the direction of the field $\mathbf B$ as the $Oz$ axis. The commutation relations (33) then become, using definition (5):
$$[V_x, V_y] = -\frac{i\hbar\omega}{m} \tag{52a}$$
$$[V_x, V_z] = [V_y, V_z] = 0 \tag{52b}$$

Comment:

Applying the results of Complement CIII to $V_x$ and $V_y$, we can see from (52a) that their root mean square deviations satisfy:
$$\Delta V_x\cdot\Delta V_y \geq \frac{\hbar\omega}{2m} \tag{53}$$
The components of the velocity $\mathbf V$ are therefore incompatible physical quantities.

3-a.

Eigenvalues of the Hamiltonian

By analogy with (16), =

+

can be written in the form: (54)



with: =

=

2

2

+

2

(55a)

2

(55b)

2

According to (52b): [

]

=0

(56)

We can now look for a basis of eigenvectors common to (eigenvalues ); they will automatically be eigenvectors of = .

+

(eigenvalues ) and , with the eigenvalues:

(57)



Eigenvalues of



The eigenvectors of the operator are also eigenvectors of are two Hermitian operators which satisfy the relation: [

]=

~



.

Now,

and

(58) 779

COMPLEMENT EVI



We can therefore apply to them the results of Complement EII ; in particular, the spectrum of includes all the real numbers. Consequently, the eigenvalues of are of the form:

=

2

(59)

2

where is a real arbitrary constant. The spectrum of is therefore continuous: the energy can take on any positive value or zero. The interpretation of this result is obvious: describes the kinetic energy of a free particle moving along (as in classical mechanics; § 1-a). .

Eigenvalues of

We shall assume, as an example, that the particle under consideration has a negative charge ; the cyclotron frequency is then positive [formula (5)]5 . We set: ˆ= ~ ˆ=

(60) ~

Relation (52a) can then be written: [ ˆ ˆ] = and

(61)

becomes: =

~ 2

ˆ 2 + ˆ2

(62)

then takes on the form of the Hamiltonian of a one-dimensional harmonic oscillator [cf. Chap. V, relation (B-4)]. ˆ and ˆ, which satisfy (61), play the roles of the position ˆ and momentum ˆ of this oscillator. The arguments set forth in § B-2 of Chapter V for the operators ˆ and ˆ can be repeated here for ˆ and ˆ. For example, it can easily be shown that if is an eigenvector of : =

(63)

the kets: 1 2 1 = 2 =

ˆ+ ˆ

(64a)

ˆ

(64b)

ˆ

5 For a positive charge , one can keep the convention of positive axis opposite to the magnetic field.

780

by choosing the direction of the

• are also eigenvectors of

A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

:

=(

~

)

(65a)

=(

+~

)

(65b)

From this we deduce that the possible values of = where .

+

1 2

are given by: (66)

~

is a positive integer or zero. Eigenvalues of

According to the preceding results, the eigenvalues of the total Hamiltonian of the form: (

)=

+

1 2

~

+

1 2

2

are

(67)

The corresponding levels are called Landau levels. For a given value of , all possible values of (positive integers or zero) are actually 1 ˆ ˆ found. From (64) and 65 we see that the repeated action of the operators 2 on an eigenvector of of eigenvalue ( ) provides an energy state ( ) , where is any integer but where has not changed (since ˆ and ˆ commute with ). Therefore, although the energy of the motion along is not quantized, that of the motion projected onto is.

Comment:

We showed in Chapter V (§ B-3) that the energy levels of the one-dimensional harmonic oscillator are non-degenerate in . The situation is different here, since the particle under study is moving in three-dimensional space. Since the destruction 1 ˆ+ ˆ = operator of a quantum ~ is ( + ) , the eigenvectors 2~ 2 of corresponding to = 0 are solutions of the equation: (

+

)

=0

(68)

On the one hand, vectors that are solutions of (68) can be eigenvectors of with an arbitrary (positive) eigenvalue. On the other, even for a fixed value of , equation (68) is a partial differential equation with respect to and , and has an infinite number of solutions. The energies ( = 0 ) are therefore infinitely degenerate. By using the creation operator for a quantum, it can easily be shown that this is true for all the levels ( ), for any (non-negative integer). 781
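The quantization of the transverse motion, equations (62) and (66), can be illustrated numerically: writing $H_\perp$ in a truncated oscillator basis reproduces the evenly spaced Landau ladder $(n + 1/2)\hbar\omega$. The sketch below uses $\hbar\omega = 1$ and an assumed basis size; only the highest truncated levels are affected by the cutoff.

```python
import numpy as np

# H_perp = (hbar*omega/2)(Q^2 + P^2) with [Q, P] = i has the Landau
# spectrum (66): E_n = (n + 1/2) hbar*omega. Illustration with hbar*omega = 1
# in a basis truncated at N oscillator states (assumed size).
N = 60
a = np.diag(np.sqrt(np.arange(1, N)), 1)        # destruction operator
Q = (a + a.T) / np.sqrt(2.0)
P = (a - a.T) / (np.sqrt(2.0) * 1j)
H_perp = 0.5 * (Q @ Q + P @ P)

E = np.sort(np.linalg.eigvalsh(H_perp))
print(E[:6])   # approximately [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
```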



COMPLEMENT EVI

3-b.

The observables in a particular gauge

In order to state the above results more precisely, we shall calculate the stationary states of the system. This will enable us to study their physical properties. It is now necessary to choose a gauge; we shall choose the one given by (12). The components of the velocity are then: =

2

=

+

2

= .

(69)

The Hamiltonians oscillator

and

.

Relation with the two-dimensional harmonic

Substituting (69) into (55), we obtain: 2

=

+ 2

2

2

+

2

+

2

8

+

2

(70a)

2

=



(70b)

2

where is the component along of the angular momentum L = R P. In the r representation, is an operator that acts only on the variable , while acts only on the variables and . We can therefore find a basis of eigenvectors of by solving in the eigenvalue equation of , and then, in , that of . All we must then do is take the tensor products of the vectors obtained. Actually, the eigenvalue equation of simply leads to the wave functions: 1 e 2 ~

( )=

~

(71)

with: 2

=

(72)

2

[we again find (59)]. Therefore, we shall concentrate on solving the eigenvalue equation of in ; the wave functions we shall be considering now depend on and , and not on . Comparing (70a) with expression (12a) of Complement DVI , we see that can be expressed simply in terms of the Hamiltonian of a two-dimensional harmonic oscillator: =

+

if we choose for the value of the constant that enters into = 782

2

(73)

2 :

(74)



A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

Now, in Complement DVI , we saw that and form a C.S.C.O. in , and we constructed a basis of eigenvectors common to these two observables [cf. formula (47) of DV ]. The are also eigenvectors of ; Complement DVI therefore gives the solutions to the eigenvalue equation of . Comments:

(i) In § 3-a, we saw that can be written in a form that is analogous to that of a Hamiltonian of a one-dimensional harmonic oscillator. Here, we find that, in a particular gauge, this same operator is also simply related to the Hamiltonian of a two-dimensional harmonic oscillator. These two results are not contradictory; they simply correspond to two different decompositions of the same Hamiltonian, which must obviously lead to the same physical conclusions. ( ) One must not lose sight of the fact that the Hamiltonian involves a physical problem which is completely different from that of the two-dimensional harmonic oscillator: the charged particle is subjected to a vector potential (describing a uniform magnetic field) and not a harmonic scalar potential (which would describe, for example, a non-uniform electric field). It so happens that, in the gauge chosen, the effects of the magnetic field can be likened to those of a fictitious harmonic scalar potential. .

Expression for the observables in terms of the creation and destruction operators of circular quanta

First of all, we shall express the observables describing the quantities associated with the particle in terms of the operators and [defined by equations (40) of Complement DVI ] and their adjoints and (we shall also use the operators = and = ). Substituting relations (46) of DVI into (73), we obtain6 : =

+

1 2

(75)

~

The energy associated with the state =

+

1 2

is therefore: (76)

~

as we found in (66). Moreover, since is independent of eigenvalues of are infinitely degenerate. Using relations (23) and (40) of DVI , we can see that: = =

1 2 2

+

+

, we see that all the

+ + (77)

6 Recall

that we have assumed to be positive. If were negative, the indices and would have to be inverted in a certain number of the following formulas; for example, (75) would become: = ( + 1 2)~ .

783



COMPLEMENT EVI

where, using (74), =

is defined by: (78)

2~

Similarly: ~ 2 ~ = 2 =

+

+

+

(79)

These expressions, substituted into (69), yield: = =

2 +

2

(80)

Since and do not commute with , it can be seen by using (75) that, as in classical mechanics, and are not constants of the motion; in addition, using the commutation relations of and , we indeed obtain (52a). It is also interesting to study the quantum mechanical operators associated with the various variables introduced in the description of the classical motion (§ 1): the coordinates ( 0 , 0 ) of the center 0 of the classical trajectory, the components ( ) of the vector C0 Q, etc. As above, we shall denote each of these operators by the capital letter corresponding to the small letter which designates the corresponding classical variable. By analogy with (10), we therefore set: 0

=

0

=

1 +

=

1

=

1 2

+

(81b)

2

The operators and commute with ; it follows that the motion. Formulas (81) also imply that: [

0]

0

=

=

2

2

(81a)

~

0

and

0

are constants of

(82)

Consequently, 0 and 0 are incompatible physical quantities, their root mean square deviations being related by: ∆

0



~

0

(83)

2

We also define: = =

784

0

0

=

=

1 2 2

+ (84)

• We immediately see that motion; moreover, and

A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

and , as in classical mechanics, are not constants of the are simply proportional to and respectively:

= =

(85)

like the corresponding classical variables [formula (9)]. According to (53), equations (85) imply: ∆

~



(86)

2

Let Σ2 be the operator corresponding to

2

(square of the radius of the classical trajec-

tory): Σ2 = (

0)

2

+(

0)

2

(87)

According to (81), we have: 1

Σ2 =

2

2

+

2

=

2

(88)

2

Σ2 is therefore a constant of the motion, as is 2 in classical mechanics. Finally, the operator associated with the moment of the mechanical momentum respect to is: Θ =

[(

0)

(

0)

]

v with

(89)

and formulas (81) indicate that: Θ =

2

(90)

as in (23). Θ is therefore a constant of the motion. On the other hand, the operator Λ , the component along of R V, is: Λ =

2

+~

+

and therefore does not commute with

3-c.

(91) .

The stationary states

We indicated above that the eigenvalues of the Hamiltonian are all infinitely degener( ) ate in . For each positive or zero integer , there exists an infinite-dimensional subspace of , all of whose kets are eigenvectors of with the same eigenvalues ( + 1 2)~ . In this section, we shall study different bases which can be chosen in each of these subspaces. First, we shall indicate the general properties of the stationary states, valid for any basis of eigenstates of .

785



COMPLEMENT EVI

.

General properties

Relations (88) and (90) show that an arbitrary stationary state is necessarily an eigenvector of Σ2 and Θ ; the corresponding physical quantities are therefore always well-defined in such a state and are equal to: (2 + 1)

~

for Σ2

(2 + 1)~

for Θ

(92)

The values of Σ2 and Θ are proportional to the energy; this corresponds to the classical description of the motion (cf. § 1). It follows from (80) and (84) that , , and have no matrix elements inside a ( ) given subspace ; it follows, for a stationary state, that: =

=0

=

=0

(93)

Nevertheless, since and (and therefore and ) are not constants of the motion, the corresponding physical quantities do not have perfectly well-defined values in a stationary state. In fact, by using (80), (84) and the properties of the one-dimensional harmonic oscillator [cf. Chap. V, relation (D-5)], it can be shown that: ∆

=



=

+

1 2

~



=



=

+

1 2

~

(94)

in agreement with (53). Moreover, we see that the only stationary states in which the product ∆ ∆ (or ∆ ∆ ) takes on its minimal value are the ground states ( = 0).

Comment: The various ground states are solutions of the equation: =0

(95a)

that is, using (80): (

+

)

=0

(95b)

as we found in (68).

.

The states

As we saw in Complement DVI , the fact that and form a C.S.C.O. in can be used to construct a basis of eigenvectors common to these two observables. This basis is composed of the vectors , since, according to (75) and formula (46) of Complement DVI :

= =(

786

+

1 ~ 2 )~

(96a) (96b)



A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

( )

The subspace defined by specifying the (non-negative) integer is therefore spanned by the set of vectors such that = . The eigenvalues of associated with these different vectors are of the form ~, and, for fixed , is an integer which can vary between and (for example, all the ground states correspond to negative values of ; this is related to the hypothesis 0 posed above). The wave functions associated with the states were calculated in Complement DVI (§ 3-a). Note that the states are eigenstates of the operator , but not of the operator Λ associated with the moment of the mechanical momentum. This can be seen directly from formula (91). In a state , the mean values 0 and 0 are zero, according to (81). However, neither 0 nor 0 corresponds to perfectly well-defined physical quantities, since, by using the properties of the one-dimensional harmonic oscillator, it can easily be shown that, in a state : ∆ ∆

0

=

+

1 2

~

0

=

+

1 2

~

(97)

The minimal value of the product ∆ 0 ∆ 0 is therefore attained for the states =0 , that is, the states of each energy level = ( + 1 2)~ for which takes on its maximal value ~ [cf. (96)]. However, let us define the operator: Γ2 =

2 0

+

2 0

(98)

It corresponds to the square of the distance from the center Using (81), we easily find: Γ2 =

~

=

~

0

of the trajectory to the origin.

+ (2

+ 1)

(99)

is therefore an eigenstate of Γ2 with the eigenvalue

The state

~

(2

+ 1); the fact

that this value can never go to zero is related to the non-commutativity of the operators 0.

Comment: The operator

0

and

, according to (75) and (99), is given by:

= ~(

)=~

1 2

~

2~

Γ2 +

1 2

(100)

that is, according to (88): =

2

Σ2

Γ2 =

2

Γ2

Σ2

(101)

which is the equivalent of the classical relation (27).

787

COMPLEMENT EVI

.



Other types of stationary states

Any linear combination of vectors associated with the same value of is an eigenstate of and therefore possesses the properties stated in § 3-c- . By a suitable choice of the coefficients of the linear combination, one can obtain stationary states that possess other interesting properties as well. We know, for example (§ 3-b- ), that 0 and 0 are constants of the motion. However, since 0 and 0 do not commute, there are no eigenstates common to these two operators. This means that, in quantum mechanics, it is not possible to obtain a state in which the two coordinates of the point 0 are known. To construct the eigenstates common to and 0 , we can use the properties of the one-dimensional harmonic oscillator; formula (81a) shows that 0 has the same expression, to within a constant factor, as the position operator of a one-dimensional oscillator whose destruction operator is : 0

=

1

ˆ

(102)

2

Since we know the wave functions ˆ (ˆ) associated with the stationary states ˆ of a onedimensional harmonic oscillator (cf. Complement BV , § 2-b), we know how to write the eigenvectors ˆ of the position operator as linear combinations of the states ˆ : ˆ =

ˆ

ˆ ˆ

=0

=

ˆ (ˆ) ˆ

(103)

=0

In order to obtain the eigenstates common to states = ; the vector: 0

=

ˆ (

2

0)

=

and

0

it suffices to apply this result to the

(104)

=

=0

is a common eigenvector of and 0 with the eigenvalues ( + 1 2)~ and 0 . The eigenstates common to and 0 can be found in an analogous fashion. 0 Relation (81b) indicates that 0 is proportional to the momentum operator of the ficticious one-dimensional oscillator just used: 0

=

1

ˆ

(105)

2

Consequently [see formula (20) of Complement DV ]: 0

=

ˆ (

2

0)

=

=

(106)

=0

We have just constructed the states in which either 0 or 0 is perfectly well-defined. We can also determine the stationary states in which the product ∆ 0 ∆ 0 reaches its minimal value, given by (83). For a one-dimensional harmonic oscillator, we studied in Complement GV the states in which the product ∆ ˆ ∆ ˆ is minimal; these are the quasi-classical states, given by: =

( ) =0

788

(107)



A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

with: ( )=

!

2

e

2

(108)

In these states: 1 2

∆ˆ =∆ˆ =

(109)

It follows that, in the case which interests us here, the state:

0

=

(

0)

=

(110)

=

=0

yields, for ∆

0

0

=∆

and =

0

The product ∆

0,

root mean square deviations:

1 2

0



(111) 0

is therefore minimal.

Comment: Since the magnetic field is uniform, the physical problem we are considering is invariant with respect to translation. Thus far, this symmetry has been masked by the choice of the particular gauge (12), which gives the origin a privileged position with respect to all other points in space. Consequently, neither the Hamiltonian nor its eigenstates are invariant with respect to translation. We know, however (cf. Complement HIII ) that the physical predictions of quantum mechanics are gauge-invariant. These predictions must remain the same if, by a change of gauge, we give a point other than a privileged position. Consequently, the translation symmetry must reappear when we study the physical properties of a given state. To show this more precisely, let us assume that, at a given instant, the state of the particle is characterized in the gauge (12) by the ket with which the wave function r = (r) is associated. We then perform a translation defined by the vector a, and consider the ket defined by: =e

~

Pa

(112)

with which, according to the results of Complement EII , is associated the wave function: (r) = r

= (r

(113)

a)

The same translation can be applied to the vector potential, which becomes: A (r) = A(r

a) =

1 (r 2

a)

(114)

B

A (r) clearly describes the same magnetic field as A(r). Since the physical properties attached to a given state vector depend only on this state vector and the potential A chosen, they must undergo the translation when (r) and A(r) are replaced by expressions (113) and (114). It is simple to use these relations to obtain the expression for the probability density associated with : (r) =

(r) 2 =

(r

a) 2 = (r

a)

(115)

789



COMPLEMENT EVI

and that for the current J (r), calculated with the vector potential A (r): 1 2 1 = 2 = J(r

(r)

~

(r

a)

J (r) =

+ ~

2

(r +

a) 2

(r

(r) +

B a)

B

(r

a) + (116)

a)

[where J(r) is the probability current associated with (r) in the gauge (12)]. The ket therefore describes, in the new gauge A (r), a state whose physical properties are related by the translation to those corresponding to the ket in the gauge A(r). Let us show, moreover, that the translation of a possible motion yields another possible motion; this will conclude the proof of the translation invariance of the problem. To do so, consider the Schrödinger equation in the r representation, in the gauge A(r): (r ) =

~

1 2

Changing r to r

A(r)

(r )

(117)

a in this equation, we obtain, using (113) and (114):

(r ) =

~

2

~

1 2

2

~

A (r)

(r )

(118)

The operator appearing on the right-hand side of (118) is none other than the Hamiltonian in the gauge A (r). Consequently, if (r ) describes, in the gauge A(r), a possible motion of the system, (r ) describes, in the equivalent gauge A (r), another possible motion, which, according to what we have just shown, is nothing more than the result of a translation of the first motion. In particular, if: ~

(r ) = (r) e

is a stationary state [in the gauge A(r)], (r ) =

~

(r) e

is another stationnary state of the same energy [in the gauge A (r)]. If we want to continue to use the gauge (12) after having performed the translation on the physical state of the particle, we must describe the translated state by a mathematical ket which is different from . According to § 3-b- of Complement HIII , the ket can be obtained from by a unitary transformation: =

(119)

The operator = e~

is given by: (R)

(120)

where (r) is the function characterizing the gauge transformation performed. Here, the potential after the gauge change is: A(r) =

1 r 2

B = A (r)

1 a 2

B

(121)

so that: (r) =

790

1 r (a 2

B)

(122)



A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

Substituting (112), (120) and (122) into (119), we finally obtain: =

(a)

(123)

with: (a) = e

2~

R (a B)

e

~

Pa

(124)

Therefore, if we remain in the gauge A(r), the translation operator is (a), given by (124). The components of R and P along two perpendicular axes enter into formula (124); they therefore commute, so we can write: (a) = e

2~

R (a B)

~

When a is a vector of the yields: (a) = e

~

Pa

(125) plane, a simple calculation, using formulas (10) and (69),

(a R0 ) B

(126)

with: R0 =

0

e +

0

The operators 0 and with translations along 3-d.

.

(127)

e 0

(coordinates of the center of the circle) are therefore associated and respectively.

Time evolution

Mean values of the observables

We have already encountered a certain number of physical quantities that are constants of the motion: 0 , 0 , Θ , Σ2 . Whatever the state of the system, their mean values are time-independent. Let us examine the time evolution of the mean values , , , and , . We immediately see from the expressions given in § 3-b- that the corresponding operators have matrix elements only between states whose values of differ by 1 (or 0). The evolution of these mean values therefore involves only one Bohr frequency, which is none other than the cyclotron frequency 2 defined in (5). This result is completely analogous to the one given by classical mechanics. .

Quasi-classical states Assume that at = 0 the state of the particle is: (0) =

(128)

where the ket is defined by expression (70) of Complement DVI . Since expression (75) for involves but not , the state vector ( ) at the instant is obtained by changing to e : () =e

2

e

(129)

[cf. expression (92) of Complement GV ]. 791



COMPLEMENT EVI

We set: =

e

=

e

(130)

Relations (80), (81) and (84) then show that: 0 0

=

1 ( 2

=

(

2

+

cos

)=

sin

(131)

()=

1 ( 2

e

()=

(

e

2

)=

+

e

)=

cos(

)

e

)=

sin(

)

(132)

and: ()=

sin(

) (133)

()=

cos(

)

Moreover, the properties of the states 2

=~ Θ

1 2

1 2 1 2 + 2

2

= 2~

Σ2 =

+

imply that:

1 2

+

(134)

All these results are extremely close to those given by classical mechanics [cf. (4)]. We see that is related to the radius of the classical trajectory, and to the initial phase 0 , while is related to the distance corresponds to the polar angle 0 , and of the vector OC0 . Furthermore, the properties of the states can be used to show that: ∆ ∆

0

=∆ =∆

0

=∆ =

(the products ∆ and: ∆ 792

=~

=∆

=

1 2

(135a) (135b)

2 0



0,





∆Θ = 2~

and ∆



∆Σ2 =

therefore take on their minimal values), 1 2

(136)

• As for the deviations ∆ 2

() =e [where ∆ (∆

A CHARGED PARTICLE IN A MAGNETIC FIELD: LANDAU LEVELS

and ∆ , they can be calculated by using the fact that: e

=

+ 2

=

e 2

(137)

is defined by relation (66) of DVI ], which yields: =∆

~

=

=

1 2

(138)

and ∆ can easily be obtained in the same way). If the conditions: 1

1

(139)

are satisfied, we see, therefore, that the various physical quantities (position, velocity, energy, ...) are, in relative value, very well defined. The states (129) therefore represent “quasi-classical” states of the charged particle placed in a uniform magnetic field.

Comment:

If

= 0, we obtain: 1 ~ 2 =0

= ∆

(140)

The states: =

(141)

therefore correspond to the ground state. References and suggestions for further reading:

Landau and Lifshitz (1.19), Chap. XVI, §§ 124 and 125; Ter Haar (1.23), Chap. 6. Application to solid state physics: Mott and Jones (13.7), Chap. VI, § 6; Kittel (13.2), Chap. 8, p. 239 and Chap. 9, p. 290.


Complement FVI
Exercises

1. Consider a system of angular momentum $l = 1$, whose state space is spanned by the basis $\{|+1\rangle, |0\rangle, |-1\rangle\}$ of three eigenvectors common to $\mathbf L^2$ (eigenvalue $2\hbar^2$) and $L_z$ (respective eigenvalues $+\hbar$, $0$ and $-\hbar$). The state of the system is:
$$|\psi\rangle = \alpha\,|+1\rangle + \beta\,|0\rangle + \gamma\,|-1\rangle$$
where $\alpha$, $\beta$, $\gamma$ are three given complex parameters.
a. Calculate the mean value $\langle\mathbf L\rangle$ of the angular momentum in terms of $\alpha$, $\beta$ and $\gamma$.
b. Give the expression for the three mean values $\langle L_x^2\rangle$, $\langle L_y^2\rangle$ and $\langle L_z^2\rangle$ in terms of the same quantities.

2. Consider an arbitrary physical system whose four-dimensional state space is spanned by a basis of four eigenvectors common to J2 and ( = 0 or 1; 2 + ), of eigenvalues ( + 1)~ and ~, such that: =~

( + 1)

=

+

(

1)

1

=0 , the eigenstates common to J2 and

Express in terms of the kets denoted by .

, to be

Consider a system in the normalized state: =

=1 +

=1 + =1

=1 =

1 +

=0 =0

=0

( ) What is the probability of finding 2~2 and ~ if J2 and taneously?

are measured simul-

( ) Calculate the mean value of when the system is in the state , and the probabilities of the various possible results of a measurement bearing only on this observable. (

) Same questions for the observable J2 and for

( )

.

2

is now measured; what are the possible results, their probabilities, and their mean value?

3. Let L = R P be the angular momentum of a system whose state space is Prove the commutation relations: [

]= ~

[ [

r.

]= ~ 2

P ]=[

R2 ] = [

R P] = 0 795

COMPLEMENT FVI

where and



, , denote arbitrary components of L, R, P in an orthonormal system, is defined by: = 0 if two (or three) of the indices , , are equal = 1 if these indices are an even permutation of , , = 1 if the permutation is odd.

4. Rotation of a polyatomic molecule Consider a system composed of different particles, of positions R1 and momenta P1 P P . We set: J=

R

R ,

L

with: L

=R

P

Show that the operator J satisfies the commutation relations that define an angular momentum. Deduce from this that, if V and V denote two ordinary vectors of three-dimensional space, then: [J V

J V ] = ~(V

V) J

Calculate the commutators of J with the three components of R of P . Show that: [J

R

and with those

R ]=0

Prove that: [J

J R ]=0

and deduce from this the relation: [J R

J R

] = ~(R

R ) J = ~ J (R

R )

We set: W=

R

W =

R

where the coefficients [J W J W ] =

and ~(W

are given. Show that: W) J

Conclusion: what is the difference between the commutation relations of the components of J along fixed axes and those of the components of J along the moving axes of the system being studied? 796



EXERCISES

Consider a molecule which is formed by unaligned atoms whose relative distances are assumed to be invariant (a rigid rotator). J is the sum of the angular momenta of the atoms with respect to the center of mass of the molecule, situated at a fixed point ; the axes constitute a fixed orthonormal frame. The three principal inertial axes of the system are denoted by , and , with the ellipsoid of inertia assumed to be an ellipsoid of revolution about (a symmetrical rotator). The rotational energy of the molecule is then: =

2

2

1 2

+

+

2



where , and are the components of J along the unit vectors w , w and w of the moving axes , , attached to the molecule, and and are the corresponding moments of inertia. We grant that: 2

+

2

+

2

=

2

+

2

+

2

= J2

( ) Derive the commutation relations of

,

,

from the results of .

( ) We introduce the operators = . Using the general arguments of Chapter VI, show that one can find eigenvectors common to J2 and , of eigenvalues ( + 1)~2 and ~, with = +1 1 (

of the rotator in terms of J2 and

) Express the Hamiltonian eigenvalues.

2

. Find its

( ) Show that one can find eigenstates common to J2 , and , to be denoted by [the respective eigenvalues are ( + 1)~2 , ~, ~]. Show that these states are also eigenstates of . ( ) Calculate the commutators of and with J2 , , . Derive from them the action of and on . Show that the eigenvalues of are at least 2(2 + 1)-fold degenerate if = 0, and (2 + 1)-fold degenerate if = 0. ( ) Draw the energy diagram of the rigid rotator ( is an integer since J is a sum of orbital angular momenta; cf. Chapter X). What happens to this diagram when = (spherical rotator)? 5. A system whose state space is (

)=

( + + )e

r2

r

has for its wave function:

2

where , which is real, is given and

is a normalization constant.

The observables and L2 are measured; what are the probabilities of finding 0 2 and 2~ ? Recall that: 0 1(

)=

3 cos 4 797

COMPLEMENT FVI



If one also uses the fact that: 1 1

(

3 sin e 8

)=

is it possible to predict directly the probabilities of all possible results of measurements of L2 and in the system of wave function ( )? 6. Consider a system of angular momentum = 1. A basis of its state space is formed by the three eigenvectors of : +1, 0, 1 , whose eigenvalues are, respectively, +~, 0, and ~, and which satisfy: =~ 2

1

1 =

+

1 =0

This system, which possesses an electric quadrupole moment, is placed in an electric field gradient, so that its Hamiltonian can be written: =

0

(

2

2

)

~ where and are the components of L along the two directions and plane that form angles of 45 with and ; 0 is a real constant.

of the

Write the matrix representing in the + 1 0 1 basis. What are the stationary states of the system, and what are their energies? (These states are to be written 1 , 2 , 3 , in order of decreasing energies.) At time = 0, the system is in the state: (0) =

1 [ +1 2

1]

What is the state vector ( ) at time ? At , probabilities of the various possible results? Calculate the mean values performed by the vector L ? At , a measurement of

2

( ),

( ) and

is measured; what are the ( ) at . What is the motion

is performed.

( ) Do times exist when only one result is possible? ( ) Assume that this measurement has yielded the result ~2 . What is the state of the system immediately after the measurement? Indicate, without calculation, its subsequent evolution. 7. Consider rotations in ordinary three-dimensional space, to be denoted by Ru ( ), where u is the unit vector which defines the axis of rotation and is the angle of rotation. Show that, if then:

is the transform of

OM = OM + u 798

OM

under an infinitesimal rotation of angle ,

• If OM is represented by the column vector

EXERCISES

, what is the matrix associated

with Ru ( )? Derive from it the matrices representing the components of the operator defined by: Ru ( ) = 1 +

u

Calculate the commutators: [

]

;

[

]

;

[

]

What are the quantum mechanical analogues of the purely geometrical relations obtained? Starting with the matrix representing , calculate the one that represents e ; show that R ( ) = e ; what is the analogue of this relation in quantum mechanics? 8. Consider a particle in three-dimensional space, whose state vector is , and whose wave function is (r) = r . Let be an observable that commutes with L = R P, the orbital angular momentum of the particle. Assuming that , L2 and form a C.S.C.O. in r , call their common eigenkets, whose eigenvalues are, respectively, (the index is assumed to be discrete), ( + 1)~2 and ~. Let ( ) be the unitary operator defined by: ( )=e

~

where is a real dimensionless parameter. For an arbitrary operator transform of by the unitary operator ( ): ˜ =

( )

, we call ˜ the

( )

We set + = + , = . Calculate ˜ + and show that + and ˜ + are proportional; calculate the proportionality constant. Same question for and ˜ . Express ˜ , ˜ and ˜ in terms of , and . What geometrical transformation ˜ can be associated with the transformation of L into L? Calculate the commutators [ ] and [ ]. Show that the kets ( ) and are eigenvectors of and calculate their eigenvalues. What relation must exist between and for the matrix element to be non-zero? Same question for . By comparing the matrix elements of ^ and ˜ with those of and calculate ˜ , ˜ , ˜ in terms of , , . Give a geometrical interpretation.

,

799



COMPLEMENT FVI

9. Consider a physical system of fixed angular momentum , whose state space is , and whose state vector is ; its orbital angular momentum operator is denoted by L. We assume that a basis of is composed of 2 +1 eigenvectors of ( + ), associated with the wave functions ( ) ( ). We call L = L the mean value of L. We begin by assuming that: =

=0

Out of all the possible states of the system, what are those for which the sum (∆ )2 + (∆ )2 + (∆ )2 is minimal? Show that, for these states, the root mean square deviation ∆ of the component of L along an axis making an angle with is given by: ∆

=~

2

sin

We now assume that L has an arbitrary direction with respect to the axes. We denote by a frame whose axis is directed along L , with the axis in the plane. )2 + (∆

( ) Show that the state 0 of the system for which (∆ is minimal is such that: (

( ) Let

+

)

0

=0

0

= ~

0

= cos2 = sin

2 0

2

)2

0

0 be the angle between ; prove the relations:

+

)2 + (∆

e

cos

and

0

0

2

, and

sin2

+

e

0

+

0

2

e

+ sin

0,

the angle between

sin

0

0

2

cos

0

2

e

and

0 0

+ cos

0

If we set: 0

=

show that: = tan Express (

800

) To calculate ( + )

0

2

e

0

+

in terms of

+1 +1

,

0,

0

and .

, show that the wave function associated with ( ) [where

0

is

0(

)=

is defined by equation (D-20) of Chapter VI], the

one associated with

( +

being

this expression for 0 ( value of and the relation:

)

( ). By replacing

sin

0

cos

2

EXERCISES

,

and

in

) by their values in terms of , , , find the

+

=



0

2

e

0

( +

(2 )! )!(

)!

( ) With the system in the state 0 , is measured. What are the probabilities of the various possible results? What is the most probable result? Show that, if is much greater than 1, the results correspond to the classical limit.

10. Let J be the angular momentum operator of an arbitrary physical system whose state vector is . Can states of the system be found for which the root mean square deviations ∆ ∆ and ∆ are simultaneously zero?

,

Prove the relation: ∆

~ 2



and those obtained by cyclic permutation of , , . Let J be the mean value of the angular momentum of the system. The axes are assumed to be chosen in such a way that = = 0. Show that: (∆

)2 + (∆

)2

~

Show that the two inequalities proven in question . both become equalities if and only if + = 0 or = 0. The system under consideration is a spinless particle for which J = L = R P. ~ Show that it is not possible to have both ∆ ∆ = and (∆ )2 + 2 (∆ )2 = ~ unless the wave function of the system is of the form: (

)=

( sin

e

)

11. Consider a three-dimensional harmonic oscillator, whose state vector

is:

= where , and are quasi-classical states (cf. Complement GV ) for onedimensional harmonic oscillators moving along , and , respectively. Let L = R P be the orbital angular momentum of the three-dimensional oscillator. 801

COMPLEMENT FVI



Prove: = ~ ∆

2

=~

2

+

and the analogous expressions for the components of L along

and

.

We now assume that: =

=0

Show that minimize ∆ =

= ~

0

must be zero. We then fix the value of . Show that, in order to + ∆ , we must choose: =

2

e

0

(where 0 is an arbitrary real number). Do the expressions ∆ ∆ and (∆ )2 + (∆ )2 in this case have minimum values compatible with the inequalities obtained in question . of the preceding exercise? Show that the state of a system for which the preceding conditions are satisfied is necessarily of the form: =

(

)

=

=0

=0

with: =

=0

=0

+

=

( )=

2 !

e

=0

! 2

2

;

=0

=e

=0

0

(the results of Complement GV and of § 4 of Complement DVI can be used). Show that the angular dependence of = =0 =0 is (sin e ) . L2 is measured on a system in the state . Show that the probabilities of the various possible results are given by a Poisson distribution. What results can be obtained in a measurement of that follows a measurement of L2 whose result 2 was ( + 1)~ ? Exercise 4 :

Reference: Landau and Lifshitz (1.19), § 101; Ter Haar (1.23), §§ 8.13 and 8.14.

802

Chapter VII

Particle in a central potential. The hydrogen atom A

B

C

Stationary states of a particle in a central potential . . . . A-1 Outline of the problem . . . . . . . . . . . . . . . . . . . . . . A-2 Separation of variables . . . . . . . . . . . . . . . . . . . . . . A-3 Stationary states of a particle in a central potential . . . . . . Motion of the center of mass and relative motion for a system of two interacting particles . . . . . . . . . . . . . . . B-1 Motion of the center of mass and relative motion in classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2 Separation of variables in quantum mechanics . . . . . . . . . The hydrogen atom . . . . . . . . . . . . . . . . . . . . . . . . C-1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2 The Bohr model . . . . . . . . . . . . . . . . . . . . . . . . . C-3 Quantum mechanical theory of the hydrogen atom . . . . . . C-4 Discussion of the results . . . . . . . . . . . . . . . . . . . . .

804 804 807 810 812 812 814 818 818 819 820 824

In this chapter, we shall consider the quantum mechanical properties of a particle placed in a central potential [that is, a potential ( ) which depends only on the distance from the origin]. This problem is closely related to the study of angular momentum presented in the preceding chapter. As we shall see in § A, the fact that ( ) is invariant under any rotation about the origin means that the Hamiltonian of the particle commutes with the three components of the orbital angular momentum operator L. This considerably simplifies the determination of the eigenfunctions and eigenvalues of , since these functions can be required to be eigenfunctions of L2 and as well. This immediately defines their angular dependence, and the eigenvalue equation of can be replaced by a differential equation involving only the variable . The importance of this problem derives from a property that will be established in § B: a two-particle system in which the interaction is described by a potential energy Quantum Mechanics, Volume I, Second Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

that depends only on the relative positions of the particles can be reduced to a simpler problem involving only one fictitious particle. In addition, when the interaction potential of the two particles depends only on the distance between them, the fictitious particle’s motion is governed by a central potential. This explains why the problem considered in this chapter is of such general interest: it is encountered in quantum mechanics whenever we investigate the behavior of an isolated system composed of two interacting particles. In § C, we shall apply the general methods already described to a special case: that in which ( ) is a Coulomb potential. The hydrogen atom, composed of an electron and a proton which electrostatically attract each other, supplies the simplest example of a system of this type. It is not the only one: in addition to hydrogen isotopes (deuterium, tritium), there are the hydrogenoid ions, which are systems composed of a single electron and a nucleus, such as the ions He+ , Li++ , etc... (other examples will be given in Complement AVII ). For these systems, we shall explicitly calculate the energies of the bound states and the corresponding wave functions. We also recall the fact that, historically, quantum mechanics was introduced in order to explain atomic properties (in particular, those of the simplest atom, hydrogen), which could not be accounted for by classical mechanics. The remarkable agreement between the theoretical predictions and the experimental observations constitutes one of the most spectacular successes of this branch of physics. Finally, it should be noted that the exact results concerning the hydrogen atom serve as the basis of all approximate calculations relating to more complex atoms (having several electrons). A.

Stationary states of a particle in a central potential

In this section, we consider a (spinless) particle of mass , subjected to a central force derived from the potential ( ) (the center of force is chosen as the origin). A-1.

Outline of the problem

A-1-a.

Review of some classical results

The force acting on the classical particle situated at the point is equal to: F=

∇ ( )=

d r d

F is always directed towards always zero. If: =r

(A-1) , and its moment with respect to this point is therefore

p

is the angular momentum of the particle with respect to theorem implies that: d =0 d

(with OM = r)

(A-2) , the angular momentum

(A-3)

is therefore a constant of the motion, so that the particle’s trajectory is necessarily situated in the plane passing through and perpendicular to . 804

A. STATIONARY STATES OF A PARTICLE IN A CENTRAL POTENTIAL

v v⊥ vr

Figure 1: Radial component v and tangential component v of a particle’s velocity.

M O

Now let us consider (Fig. 1) the position (denoted by OM = r) and velocity v of the particle at the instant . The two vectors r and v lie in the plane of the trajectory and the velocity v can be decomposed into the radial component v (along the axis defined by r) and the tangential component v (along the axis perpendicular to r). The radial velocity, the algebraic value of v , is the time derivative of the distance of the particle from the point : r

=

d d

(A-4)

The tangential velocity can be expressed in terms of since: v =

r

,

(A-5)

v

so that the modulus of the angular momentum = r

and the angular momentum

v =

is equal to: (A-6)

v

The total energy of the particle: =

1 2 v + 2

( )=

1 2 1 2 v + v + 2 2

( )

(A-7)

can be written: =

2

1 2 v + 2 2

2

+

( )

(A-8)

The classical Hamiltonian of the system is then: 2

2

=

+

2

2

2

+

( )

(A-9)

where: =

d d

(A-10) 805

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

is the conjugate momentum of and , and their conjugate momenta , 2

=

2

+

1 sin2

2

,

must be expressed in terms of the variables , . One finds (cf. Appendix III, § 4-a):

2

(A-11)

In expression (A-9), the kinetic energy is broken into two terms: the radial kinetic energy and the kinetic energy of rotation about . The reason is that, since ( ) is independent of and in this case, the angular variables and their conjugate momenta appear only in the 2 term. In fact, if we are interested in the evolution of , we can use the fact that is a constant of the motion, and replace 2 by a constant in expression (A-9). The Hamiltonian then appears as a function only of the radial variables and 2 ( plays the role of a parameter), and the result is a differential equation involving r only one variable, : d r = d

d2 = d2

(A-12a)

that is: 2

d2 = d2

d d

3

(A-12b)

It is just as if we had a one-dimensional problem (with varying only between 0 and + ), with a particle of mass subjected to the “effective potential”: 2 eff ( ) =

( )+

2

(A-13)

2

We shall see that the situation is analogous in quantum mechanics. A-1-b.

The quantum mechanical Hamiltonian

In quantum mechanics, we want to solve the eigenvalue equation of the Hamiltonian , the observable associated with the total energy. This equation is written, in the r representation: ~2 ∆+ 2

( )

(r) =

(r)

(A-14)

Since the potential depends only on the distance of the particle from the origin, spherical coordinates (cf. § D-1-a of Chapter VI) are best adapted to the problem. We therefore express the Laplacian ∆ in spherical coordinates1 : ∆=

1

2 2

+

1 2

2 2

+

1 tan

+

1 sin2

2 2

(A-15)

and look for eigenfunctions (r) that are functions of the variables , , . 1 Expression (A-15) gives the Laplacian only for non-zero . This is because of the privileged position of the origin in spherical coordinates; it can be seen, moreover, that expression (A-15) is not defined for = 0.

806

A. STATIONARY STATES OF A PARTICLE IN A CENTRAL POTENTIAL

If we compare expression (A-15) with the one for the operator L2 [formula (D-6a) of Chapter VI], we see that the quantum mechanical Hamiltonian can be put in a form completely analogous to (A-9): =

~2 1 2

2

+

2

1 2

2

L2 +

( )

(A-16)

The angular dependence of the Hamiltonian is contained entirely in the L2 term, which is an operator here. We could, in fact, perfect the analogy by defining an operator r , which would allow us to write the first term of (A-16) like the one in (A-9). We shall now show how one can solve the eigenvalue equation: ~2 1 2 A-2.

2

+

2

1 2

2

L2 +

( )

(

)=

(

)

(A-17)

Separation of variables

A-2-a.

Angular dependence of the eigenfunctions

We know [cf. formulas (D-5) of Chapter VI] that the three components of the angular momentum operator L act only on the angular variables and ; consequently, they commute with all operators acting only on the -dependence. In addition, they commute with L2 . Therefore, according to expression (A-16) for the Hamiltonian, the three components of L are constants of the motion 2 in the quantum mechanical sense: [

L] = 0

(A-18)

Obviously, also commutes with L2 . Although we have at our disposition four constants of the motion ( , , and L2 ), we cannot use all four of them to solve equation (A-17) because they do not commute with each other; we shall use only L2 and . Since the three observables L2 and commute, we can find a basis of the state space r of the particle composed of eigenfunctions common to these three observables. We can, therefore, without restricting the generality of the problem outlined in § A-1 above, require the functions ( ), solutions of equation (A-17), to be eigenfunctions of L2 and as well. We must then solve the system of differential equations: (r) = 2

L

(r)

(r) = ( + 1)~ (r) =

(A-19a) 2

(r)

~ (r)

(A-19b) (A-19c)

But we already know the general form of the common eigenfunctions of L2 and (Chap. VI, § D-1): the solutions (r) of equations (A-19), corresponding to fixed values of and , are necessarily products of a function of alone and the spherical harmonic ( ): (r) =

( )

(

)

(A-20)

2 Equation

point about

(A-18) expresses the fact that is a scalar operator with respect to rotations about the (see Complement BVI ). This is true because the potential energy is invariant under rotations .

807

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

Whatever the radial function ( ), (r) is a solution of equations (A-19b) and (A-19c). The only problem which remains to be solved is therefore how to determine ( ) such that (r) is also an eigenfunction of [equation (A-19a)]. A-2-b.

The radial equation

We shall now substitute expressions (A-16) and (A-20) into equation (A-19a). Since (r) is an eigenfunction of L2 with the eigenvalue ( + 1)~2 , we see that ( ) is a common factor on both sides. After simplifying, we obtain the radial equation: ~2 1 d2 ( + 1)~2 + + 2 d 2 2 2

( )

( )=

( )

(A-21)

Actually, a solution of (A-21), substituted into (A-20), does not necessarily yield a solution of the eigenvalue equation (A-14) of the Hamiltonian. As we have already pointed out (cf. footnote 1), expression (A-15) for the Laplacian is not necessarily valid at = 0. We must therefore make sure that the behavior of the solutions ( ) of (A-21) at the origin is sufficiently regular for (A-20) to be in fact a solution of (A-14). Instead of solving the partial differential equation (A-17) involving the three variables , , , we must now solve a differential equation involving only the variable , but dependent on a parameter : we are looking for eigenvalues and eigenfunctions of an operator which is different for each value of . In other words, we consider separately, in the state space r , the subspaces ( ) corresponding to fixed values of and (cf. Chap. VI, § C-3-a), studying the eigenvalue equation of in each of these subspaces (which is possible because commutes with L2 and ). The equation to be solved depends on , but not on ; it is therefore the same in the (2 + 1) subspaces ( ) associated with a given value of . We shall denote by the eigenvalues of , that is, the eigenvalues of the Hamiltonian inside a given subspace ( ). The index , which can be discrete or continuous, represents the various eigenvalues associated with the same value of . As for the eigenfunctions of , we shall label them with the same two indices as the eigenvalues: ( ). It is not obvious that this is sufficient: several radial functions might exist and be eigenfunctions of the same operator with the same eigenvalue ; we shall see in § A-3-b that this is not the case and that, consequently, the two indices and are sufficient to characterize the different radial functions. We shall therefore rewrite equation (A-21) in the form: ~2 1 d2 2 d 2

+

( + 1)~2 + 2 2

( )

( )=

( )

(A-22)

We can simplify the differential operator to be studied by a change in functions. We set: ( )=

1

( )

(A-23)

Multiplying both sides of (A-22) by , we obtain for equation: ~2 d2 ( + 1)~2 + + 2 d 2 2 2 808

( )

( )=

( )

( ) the following differential

(A-24)

A. STATIONARY STATES OF A PARTICLE IN A CENTRAL POTENTIAL

This equation is analogous to the one we would have to solve if, in a one-dimensional problem, a particle of mass were moving in an effective potential eff ( ): eff (

)=

( )+

( + 1)~2 2 2

(A-25)

Nevertheless, we must not lose sight of the fact that the variable can take on only non-negative real values. The term ( + 1)~2 2 2 which is added to the potential ( ) is always positive or zero; the corresponding force (equal to minus the gradient of this term) always tends to repel the particle from the force center ; this is why this term is called the centrifugal potential (or centrifugal barrier). Figure 2 represents the shape of the effective potential eff ( ) for various values of in the case where ( ) is an attractive 2 Coulomb potential [ ( ) = ]: for 1, the presence of the centrifugal term, which predominates for small values, causes eff to be repulsive for short distances. Veff (r) ×

ao e2

1

l=2

0

5 1

l=1

r/a0

Figure 2: Shape of the effective potential in the case eff ( ) for the first values of 2 where ( ) = . When = 0, eff ( ) is simply equal to ( ). When takes on the values 1, 2, etc., eff ( ) is obtained by adding to ( ) the centrifugal potential ( +1)~2 2 2 , which approaches + when approaches zero.

l=0

A-2-c.

Behavior of the solutions of the radial equation at the origin

We have already pointed out that it is necessary to examine the behavior of the solutions ( ) of the radial equation (A-21) at the origin in order to know if they are really solutions of (A-14). We shall assume that when approaches zero, the potential ( ) remains finite, or at least approaches infinity less rapidly than 1 (this hypothesis is true in most cases encountered in physics and, in particular, in the case of the Coulomb potential, to be studied in § C). We shall consider a solution of (A-22) and assume that it behaves at the origin like : ( )

0

(A-26) 809

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

Substituting (A-26) into (A-22), and setting the coefficient of the dominant term equal to zero, we obtain the equation: ( + 1) + ( + 1) = 0

(A-27)

and, consequently: either

=

(A-28a)

or

( + 1)

(A-28b)

=

For a given value of , there are therefore two linearly independent solutions of the second-order equation (A-22), behaving at the origin like and 1 +1 , respec+1 tively. But those which behave like 1 must be rejected, since it can be shown3 that 1 ( ) is not a solution of the eigenvalue equation (A-14) for = 0. From this, +1 we see that acceptable solutions of (A-24) go to zero at the origin for all , since: ( )

+1

(A-29)

0

Consequently, to equation (A-24) must be added the condition: (A-30)

(0) = 0

Comment: In equation (A-24), , the distance of the particle from the origin, varies only between 0 and + . However, thanks to condition (A-30), we can assume that we are actually dealing with a one-dimensional problem, in which the particle can theoretically move along the entire axis, but in which the effective potential is infinite for all negative values of the variable. We know that, in such a case, the wave function must be identically zero on the negative half-axis; condition (A-30) insures the continuity of the wave function at = 0. A-3.

Stationary states of a particle in a central potential

A-3-a.

Quantum numbers

We can summarize the results of § 2 as follows: the fact that the potential ( ) is independent of and makes it possible: ( ) to require the eigenfunctions of to be simultaneous eigenfunctions of L2 and , which determines their angular dependence: (r) =

( )

(

)=

1

( )

(

)

(A-31)

( ) to replace the eigenvalue equation of , an equation involving partial derivatives with respect to , , , by a differential equation involving only the variable and depending on a parameter [equation (A-24)], with condition (A-30) imposed. 3 This

is because the Laplacian of

end of § 4).

810

1 +1

(

) involves the th derivatives of (r) (cf. Appendix II,

A. STATIONARY STATES OF A PARTICLE IN A CENTRAL POTENTIAL

These results can be compared with those recalled in § 1-a, of which they are the quantum mechanical analogues. In principle, the functions ( ) must be square-integrable, that is, normalizable: (

)

2 2

d dΩ = 1

(A-32)

Their form (A-31) allows us to separate radial and angular integrations: (

)

2 2

2

(A-33)

) are normalized with respect to

and ; condition

2

d dΩ =

d

( )

2

dΩ

(

)

0

But the spherical harmonics (A-32) therefore reduces to: 2

d

( )

0

2

=

(

d

( )

2

=1

(A-34)

0

Actually, we know that it is often convenient to accept eigenfunctions of the Hamiltonian that are not square-integrable. If the spectrum of has a continuous part, we shall require only that the corresponding eigenfunctions be orthonormalized in the extended sense, that is, that they satisfy a condition of the form: 2 0

d

( )

( )=

d

( )

( )= (

)

(A-35)

0

where

is a continuous index. In (A-34) and (A-35), the integrals converge at their lower limit, = 0 [condition (A-30)]. This is physically satisfying since the probability of finding the particle in any volume of finite dimensions is then always finite. It is therefore only because of the behavior of the wave functions for that, in the case of a continuous spectrum, the normalization integrals (A-35) diverge if = . Finally, the eigenfunctions of the Hamiltonian of a particle placed in a central potential ( ) depend on at least three indices [formula (A-31)]: ( )= ( ) ( ) is a simultaneous eigenfunction of L2 and with the respective eigenvalues , ( + 1)~2 and ~. is called the radial quantum number; , the azimuthal quantum number; and , the magnetic quantum number. The radial part 1 ( )= ( ) of the eigenfunction and the eigenvalue of are independent of the magnetic quantum number and are given by the radial equation (A-24). The angular part of the eigenfunction depends only on and and not on ; it does not depend on the form of the potential ( ). A-3-b.

Degeneracy of the energy levels

Finally, we shall consider the degeneracy of the energy levels, that is, of the eigenvalues of the Hamiltonian . The (2 + 1) functions ( ) with and fixed and varying from to + are eigenfunctions of with the same eigenvalue [these (2 + 1) functions are clearly orthogonal, since they correspond to different eigenvalues of ]. The level is therefore at least (2 + 1)-fold degenerate. This degeneracy, 811

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

which exists for all potentials ( ), is called an essential degeneracy: it is due to the fact that the Hamiltonian contains L2 , but not , which means that does not 4 appear in the radial equation . It is also possible for one of the eigenvalues of the radial equation corresponding to a given value of to be the same as an eigenvalue , associated with another radial equation, characterized by = . This occurs only for certain potentials ( ). The resulting degeneracies are called accidental (we shall see in § C that the energy states of the hydrogen atom present accidental degeneracies). We must now show that, for a fixed value of , the radial equation has at most one physically acceptable solution for each eigenvalue . This actually results from condition (A-30). The radial equation, since it is a second-order differential equation, has a priori two linearly independent solutions for each value of . Condition (A-30) eliminates one of them, so there is at most one acceptable solution for each value of . We must also consider the behavior of the solutions for approaching infinity; if ( ) 0 when , the negative values of for which the solution we have just chosen is also acceptable at infinity (that is, bounded) form a discrete set (see example of § C below and Complement BVII ). It follows from the preceding considerations that L2 and constitute a C.S.C.O.5 . 2 If we fix three eigenvalues , ( + 1)~ and ~, there corresponds to them a single function (r). The eigenvalue of L2 indicates which equation yields the radial function; the eigenvalue of determines this radial function ( ) uniquely, as we have just seen; finally, there exists only one spherical harmonic ( ) for a given and . B.

Motion of the center of mass and relative motion for a system of two interacting particles

Consider a system of two spinless particles, of masses 1 and 2 and positions r1 and r2 . We assume that the forces exerted on these particles are derived from a potential energy (r1 r2 ) which depends only on r1 r2 . This is true if there are no forces originating outside the system (that is, the system is isolated), and if the interactions between the two particles are derived from a potential. This potential must depend only on r1 r2 , since only the relative positions of the two particles are involved. We shall show that the study of such a system can be reduced to that of a single particle placed in the potential (r). B-1.

Motion of the center of mass and relative motion in classical mechanics

In classical mechanics, the two-particle system is described by the Lagrangian (cf. Appendix III): (r1 r˙ 1 ; r2 r˙ 2 ) =

=

1 2

˙ 21 1r

+

1 2

˙ 22 2r

(r1

r2 )

(B-1)

4 This essential degeneracy appears whenever the Hamiltonian is rotation-invariant (cf. Complement BVI ). This is why it is encountered in numerous physical problems. 5 Actually, we have not proven that these operators are observables, that is, that the set of (r) form a basis in the state space r .

812

B. MOTION OF THE CENTER OF MASS AND RELATIVE MOTION FOR A SYSTEM OF TWO INTERACTING PARTICLES

and the conjugate momenta of the six coordinates of the two particles are the components of the mechanical momenta: p1 =

˙1 1r

p2 =

˙2 2r

(B-2)

The study of the motion of the two particles is simplified by replacing the positions r by the three coordinates of the center of mass (or center of gravity): + 1+

1 r1

r =

2 r2

(B-3)

2

and the three relative coordinates 6 : r = r1

(B-4)

r2

Formulas (B-3) and (B-4) can be inverted to yield: 2

r1 = r + 1

+

1

+

r 2

1

r2 = r

(B-5)

r 2

The Lagrangian can then be written, in terms of the new variables r (r

1 2 1 = 2

r˙ ; r r˙ ) =

2 1

2

r˙ + 1

r˙ 2 +

+

1 2 r˙ 2

r˙ 2

+

1 2

and r: 2

2

1

r˙ 1

+



(r)

2

(r)

(B-6)

where: =

+

1

(B-7)

2

is the total mass of the system, and: =

1

2

1+

(B-8a) 2

is its reduced mass (the geometrical mean of the two masses given by: 1

=

1

+

1

and

1

1

2 ),

which is also

(B-8b)

2

The conjugate momenta of the variables r and r are obtained by differentiating expression (B-6) with respect to the components of r˙ and r˙ . Using (B-3), (B-4) and (B-2), we find: p =

r˙ =

˙1 1r

+

p = r˙ = 6 Definition

˙2 2r

= p1 + p2 p 2 1 1 p2 1+ 2

(B-9a) (B-9b)

(B-4) introduces a slight asymmetry between the two particles.

813

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

or: p

=

p1

p2

1

2

(B-9c)

p is the total momentum of the system, and p is called the relative momentum of the two particles. We can express the classical Hamiltonian of the system in terms of the new dynamical variables we have just introduced: (r

p ; r p) =

p2 p2 + + 2 2

(r)

(B-10)

This leads to the following equations of motion [formulas (27) of Appendix III]: p˙ = 0 p˙ = ∇ (r)

(B-11) (B-12)

The first term of expression (B-10) represents the kinetic energy of a fictitious particle whose mass would be the sum 1 + 2 of the masses of the two real particles, whose position would be that of the center of mass of the system [formula (B-3)], and whose momentum p would be the total momentum p1 +p2 of the system. Equation (B11) indicates that this fictitious particle is in uniform rectilinear motion (free particle). This result is well known in classical mechanics: the center of mass of a system of particles moves like a single particle whose mass is the total mass of this system, subjected to the resultant of all the forces exerted on the various particles. Here, this resultant is zero since the only forces present are internal ones obeying the principle of action and reaction. Since the center of mass is in uniform rectilinear motion with respect to the initially chosen frame, the frame in which it is at rest (p = 0) is also an inertial frame. In this center of mass frame, the first term of (B-10) is zero. The classical Hamiltonian, that is, the total energy of the system, then reduces to: =

p2 + 2

(r)

(B-13)

is the energy associated with the relative motion of the two particles. It is obviously this relative motion that is the most interesting in the study of the two interacting particles. It can be described by introducing a fictitious particle, called the relative particle: its mass is the reduced mass of the two real particles, its position is characterized by the relative coordinates r, and its momentum is the relative momentum p. Since its motion obeys equation (B-12), it behaves as if it were subjected to a potential (r) equal to the potential energy of interaction between the two real particles. The study of the relative motion of two interacting particles therefore reduces to that of the motion of a single fictitious particle, characterized by formulas (B-4), (B-8) and (B-9c). This last equation expresses the fact that the velocity p of the relative particle is indeed the difference between the velocities of the two particles, that is, their relative velocity. B-2.

Separation of variables in quantum mechanics

The considerations of the preceding section can easily be transposed to quantum mechanics, as we shall now show. 814

B. MOTION OF THE CENTER OF MASS AND RELATIVE MOTION FOR A SYSTEM OF TWO INTERACTING PARTICLES

B-2-a.

Observables associated with the center of mass and the relative particle

The operators R1 , P1 and R2 , P2 , which describe the positions and momenta of the two particles of the system, satisfy the canonical commutation relations: [

1

1

]= ~

[

2

2

]= ~

(B-14)

with analogous expressions for the components along and . All the observables labeled by the index 1 commute with all those of index 2, and all the observables relating to one of the axes , or commute with those corresponding to another one of these axes. Now let us define the observables R and R by formulas similar to (B-3) and (B-4): R =

+ + R2

1 R1

1

R = R1

2 R2

(B-15b)

and the observables P P = P1 + P2 2 P1 P= 1+

(B-15a)

2

and P by formulas similar to (B-9): (B-16a)

1 P2

(B-16b)

2

It is easy to calculate the various commutators of these new observables. The results are as follows: [ [

]= ~

(B-17a)

]= ~

(B-17b)

with analogous expressions for the components along and ; all the other commutators are zero. Consequently, R and P, like R and P , satisfy canonical commutation relations. Moreover, every observable of the set R P commutes with every observable of the set R P . We can also interpret R and P, on the one hand, and R and P , on the other, as being the position and momentum observables of two distinct fictitious particles. B-2-b.

Eigenvalues and eigenfunctions of the Hamiltonian

The Hamiltonian operator of the system is obtained from formulas (B-1) and (B-2) and the quantization rules of Chapter III: =

P21 P2 + 2 + 2 1 2 2

(R1

R2 )

(B-18)

Since definitions (B-15) and (B-16) are formally identical to (B-3), (B-4) and (B-9), and since all the momentum operators commute, a simple algebraic calculation yields the equivalent of expression (B-10). =

P2 P2 + + 2 2

(R)

(B-19) 815

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

The Hamiltonian =

then appears as the sum of two terms:

+

(B-20)

with: P2 2 P2 = + 2 =

(B-21a) (R)

(B-21b)

which commute, according to the results of § B-2-a: [

]=0

(B-22)

and therefore commute with . It follows that there exists a basis of eigenvectors of that are also eigenvectors of and ; we shall therefore look for solutions of the system: = =

(B-23)

which immediately implies, according to (B-20): =

(B-24)

with: =

+

(B-25)

Consider the r r representation, whose basis vectors are the eigenvectors common to the observables R and R. In this representation, a state is characterized by a wave function (r r) which is a function of six variables. The action of the operators R and R is expressed by the multiplication of the wave functions by the variables r ~ ~ and r respectively. P and P become the differential operators ∇ and ∇ (where ∇ denotes the set of three operators , and ). The state space of the system can then be considered to be the tensor product r r of the state space and the space r associated with R. and r associated with the observable R then appear as the extensions into of operators actually acting only in r and r , respectively. We can therefore, as we saw in § F of Chapter II, find a basis of eigenvectors satisfying (B-23), in the form: =

(B-26)

r

with: =

(B-27a)

r r r

816

r

= r

r

r

(B-27b)

B. MOTION OF THE CENTER OF MASS AND RELATIVE MOTION FOR A SYSTEM OF TWO INTERACTING PARTICLES

Writing these equations in the ~2 ∆ 2 ~2 ∆+ 2

(r ) =

(r)

r (r)

and

r

r

representations respectively, we obtain:

(r )

=

r

r (r)

(B-28a) (B-28b)

The first of these equations, (B-28a), shows that the particle associated with the center of mass of the system is free, as in classical mechanics. We know its solutions: they are, for example, the plane waves: (r ) =

1 (2 ~)3

2

e~p

r

(B-29)

whose energy is equal to: =

p2 2

(B-30)

can take on any positive value or zero; it is the kinetic energy corresponding to a translation of the system as a whole. The more interesting equation from a physical point of view is the second one, (B-28b), which concerns the relative particle. It describes the behavior of the system of the two interacting particles in the center of mass frame. If the interaction potential of the two real particles depends only on the distance between them, r1 r2 , and not on the direction of the vector r1 r2 , the relative particle is subjected to a central potential ( ); the problem is then reduced to the one treated in § A.

Comment:

The total angular momentum of the system of the two real particles is: J = L1 + L2

(B-31)

with: L1 = R1

P1

L2 = R2

P2

(B-32)

It can easily be shown that it can also be written: J=L +L

(B-33)

where: L =R L=R

P P

(B-34)

are the angular momenta of the fictitious particles (according to the results of § B-2-a, L and L satisfy the commutation relations that characterize angular momenta, and the components of L commute with those of L ). 817

CHAPTER VII

C.

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

The hydrogen atom

C-1.

Introduction

The hydrogen atom consists of a proton, of mass: 17

27

10

kg

(C-1)

and charge: 16

19

10

Coulomb

(C-2)

and of an electron, of mass: 0 91

10

30

kg

(C-3)

and charge . The interaction between these two particles is essentially electrostatic. The corresponding potential energy is: 2

( ) where

4

1

2

=

(C-4)

0

denotes the distance between the two particles, and:

2

=

4

2

(C-5)

0

Using the results of § B, we confine ourselves to the study of this system in the center of mass frame. The classical Hamiltonian that describes the relative motion of the two particles is then7 : (r p) = Since close to =

p2 2

2

(C-6) [formulas (C-1) and (C-3)], the reduced mass

of the system is very

:

+

1

(C-7)

(the correction term is on the order of 1/1 800). This means that the center of mass of the system is practically at the position of the proton, and that the relative particle can be identified, to a very good approximation, with the electron. This is why we shall adopt the slightly inaccurate convention of calling the relative particle the electron and the center of mass the proton.

7 Henceforth, we shall omit the index to the relative motion.

818

which was used to label in § B the quantities corresponding

C. THE HYDROGEN ATOM

C-2.

The Bohr model

We shall briefly review the results of the Bohr model, which relate to the hydrogen atom. This model, which is based on the concept of a trajectory, is incompatible with the ideas of quantum mechanics. However, it allows us to introduce, in a very simple way, fundamental quantities such as the ionization energy of the hydrogen atom and a parameter which characterizes atomic dimensions (the Bohr radius 0 ). In addition, it so happens that the energies given by the Bohr theory are the same as the eigenvalues of the Hamiltonian we shall calculate in § C-3. Finally, quantum mechanical theory is in agreement with some of the intuitive images of the Bohr model (§ C-4-c- ). This semi-classical model is based on the hypothesis that the electron describes a circular orbit of radius about the proton, obeying the following equations: =

2

1 2

2

2

(C-8)

2

=

(C-9)

2

= ~ ;

a positive integer

(C-10)

The first two equations are classical ones. (C-8) expresses the fact that the total energy 2 of the electron is the sum of its kinetic energy 2 2 and its potential energy . 2 2 (C-9) is none other than the fundamental equation of Newtonian dynamics ( is the Coulomb force exerted on the electron, and 2 is the acceleration of its uniform circular motion). The third equation expresses the quantization condition, introduced empirically by Bohr in order to explain the existence of discrete energy levels: he postulated that only circular orbits satisfying this condition are possible trajectories for the electron. The different orbits, as well as the corresponding values of the various physical quantities, are labeled by the integer associated with them. A very simple algebraic calculation then yields the expressions for , and : 1

= = =

2

(C-11a)

0

(C-11b)

2

1 0

(C-11c)

with: 4

= 0

=

0

=

2~2 ~2 2

(C-12a) (C-12b)

2

(C-12c) ~

When this model was proposed by Bohr, it marked an important step towards the understanding of atomic phenomena, since it yielded the correct values for the energy levels of the hydrogen atom. These values indeed follow the 1 2 (the Balmer formula) 819

CHAPTER VII

PARTICLE IN A CENTRAL POTENTIAL. THE HYDROGEN ATOM

law indicated by expression (C-11a). Moreover, the experimentally measured ionization energy (the energy which must be supplied to the hydrogen atom in its ground state in order to remove the electron) is equal to the numerical value of : 13 6 eV

(C-13)

Finally, the Bohr radius 0

0

indeed characterizes atomic dimensions:

0 52 ˚ A

(C-14)

Comment:

Complement CI shows how the uncertainty principle, applied to the hydrogen atom, explains the existence of a stable ground state and permits the evaluation of the order of magnitude of its energy and its spatial extension. C-3.

Quantum mechanical theory of the hydrogen atom

We shall now take up the question of the determination of the eigenvalues and eigenfunctions of the Hamiltonian describing the relative motion of the proton and the electron in the center of mass frame [formula (C-6)]. In the r representation, the eigenvalue equation of the Hamiltonian is written: ~2 ∆ 2

2

(r) =

Since the potential (r) are of the form: (r) =

1

2

( )

(r)

(C-15)

is central, we can apply the results of § A: the eigenfunctions

(

)

(C-16)

( ) is given by the radial equation, (A-24), that is: ~2 d2 ( + 1)~2 + 2 d 2 2 2

2

( )=

( )

(C-17)

We add to this equation condition (A-30): (0) = 0

(C-18)

It can be shown that the spectrum of includes a discrete part (negative eigenvalues) and a continuous part (positive eigenvalues). Consider Figure 3, which shows the effective potential for a given value of (the figure is drawn for = 0, but the reasoning remains valid for = 0). For a positive value of , the classical motion is not bounded in space: for the value 0 chosen in Figure 3, it is limited on the left by the abscissa of point , but it is not limited on the right8 . As a result (cf. Complement MIII ) equation (C-17) has acceptable solutions for any 0. The spectrum of is therefore continuous for 0, and the corresponding eigenfunctions are not square-integrable. 8 For a 1 potential, the classical trajectories are conic sections; unbounded motion follows a hyperbola or a parabola.

820

C. THE HYDROGEN ATOM

On the other hand, for 0, the classical motion is bounded: it is confined to the region between the abscissas of the two points and 9 . We shall see later that equation (C-17) has acceptable solutions only for certain discrete values of . The spectrum of is therefore discrete for 0, and the corresponding eigenfunctions are square-integrable. Veff (r)

A E>0 0

1

E∆

(B-16)

This gives us an upper limit for the absolute value of 2

6

1 ∆

ˆ

2:

2

(B-17)

=

which can be written: 2

6

6

1 ∆

ˆ

ˆ

=

1 ∆

ˆ

ˆ

(B-18)

=

The operator which appears inside the brackets differs from the identity operator only by the projector onto the state , since the basis of unperturbed states satisfies the closure relation: +

=1

(B-19)

=

1123

CHAPTER XI

STATIONARY PERTURBATION THEORY

Inequality (B-18) therefore becomes simply: 1 ∆ 1 6 ∆

ˆ [1

6

2

] ˆ

ˆ2

ˆ

2

(B-20)

Multiplying both sides of (B-20) by 2 we obtain an upper limit for the secondorder term in the expansion of ( ), in the form: 2

2

6

1 (∆ ∆

)2

(B-21)

where ∆ is the root-mean-square deviation of the perturbation in the unperturbed state . This indicates the order of magnitude of the error on the energy resulting from taking only the first-order correction into account. C.

Perturbation of a degenerate state

Now assume that the level 0 whose perturbation we want to study is -fold degenerate (where is greater than 1, but finite). We denote by 0 the corresponding eigensubspace of 0 . In this case, the choice: 0

=

0

(C-1)

does not suffice to determine the vector 0 , since equation (A-9) can theoretically be satisfied by any linear combination of the vectors ( =1 2 ). We know only that 0 belongs to the eigensubspace spanned by them. We shall see that, this time, under the action of the perturbation , the level 0 generally gives rise to several distinct “sublevels”. Their number, , is between 1 and . If is less than , some of these sublevels are degenerate, since the total number of orthogonal eigenvectors of associated with the sublevels is always equal to . To calculate the eigenvalues and eigenstates of the total Hamiltonian , we shall limit ourselves, as usually done, to first order in for the energies and to zeroth order for the eigenvectors. To determine 1 and 0 , we can project equation (A-10) onto the basis vectors . Since the are eigenvectors of 0 with the eigenvalue 0 = 0 , we obtain the relations: ˆ 0 =

1

0

(C-2)

We now insert, between the operator ˆ and the vector 0 , the closure relation for the basis: ˆ

0 =

1

0

(C-3)

The vector 0 , which belongs to the eigensubspace associated with 0 , is orthogonal to all the basis vectors for which is different from Consequently, on the left-hand 1124

C. PERTURBATION OF A DEGENERATE STATE

side of (C-3), the sum over the index ˆ

0 =

1

reduces to a single term ( = ), which gives:

0

(C-4)

=1

ˆ We arrange the 2 numbers (where is fixed and , = 1 2 ) in a matrix of row index and column index . This square matrix, which we shall denote by ( ˆ ( ) ) is, so to speak, cut out of the matrix which represents ˆ in the basis: ( ˆ ( ) ) is the part which corresponds to 0 . Equations (C-4) then show that the column vector of elements 0 ( = 1 2 ) is an eigenvector of ( ˆ ( ) ) with the eigenvalue 1 . System (C-4) can, moreover, be transformed into a vector equation inside 0 . All we need to do is define the operator ˆ ( ) , the restriction of 6 ˆ to the subspace 0 . ˆ ( ) acts only in 0 , and it is represented in this subspace by the matrix of elements ˆ , that is, by ( ˆ ( ) ). System (C-4) is thus equivalent to the vector equation: ˆ(

)

0 =

1

0

(C-5)

[We stress the fact that the operator ˆ ( ) is different from the operator ˆ of which it is the restriction: equation (C-5) is an eigenvalue equation inside 0 , and not in all space]. Therefore, to calculate the eigenvalues (to first order) and the eigenstates (to zeroth order) of the Hamiltonian corresponding to a degenerate unperturbed state 0 , diagonalize the matrix ( ( ) ), which represents the perturbation 7 , inside the eigensubspace 0 0 . associated with Let us examine more closely the first-order effect of the perturbation on the (1) degenerate state 0 . Let 1 ( = 1 2 ) be the various distinct roots of the characteristic equation of ( ˆ ( ) ). Since ( ˆ ( ) ) is Hermitian, its eigenvalues are all real, and (1) the sum of their degrees of degeneracy is equal to ( 6 ). Each eigenvalue introduces a different energy correction. Therefore, under the influence of the perturbation (1) = ˆ , the degenerate level splits, to first order, into distinct sublevels, whose energies can be written: ( )=

0

+

1

=1 2

(1)

6

(C-6)

(1)

If = , we say that, to first order, the perturbation completely removes the (1) degeneracy of the level 0 . If , the degeneracy, to first order, is only partially (1) removed (or not at all if = 1). We shall now choose an eigenvalue 1 of ˆ ( ) . If this eigenvalue is non-degenerate, the corresponding eigenvector 0 is uniquely determined (to within a phase factor) by (C-5) [or by the equivalent system (C-4)]. There then exists a single eigenvalue ( ) of ( ) which is equal to 0 + 1 , to first order, and this eigenvalue is non-degenerate8 . On 6 If

ˆ

is the projector onto the subspace

0,

ˆ

( )

can be written (Complement BII , § 3): ˆ

( )

=

.

is simply equal to ( ˆ ( ) ); this is why its eigenvalues yield directly the corrections 1 . proof of this point is analogous to the one that shows that a non-degenerate level of 0 gives rise to a non-degenerate level of ( ) (cf. end of § A-2). 7(

( ))

8 The

1125

CHAPTER XI

STATIONARY PERTURBATION THEORY

the other hand, if the eigenvalue 1 of ˆ ( ) being considered presents a -fold degeneracy, (1) (C-5) indicates only that 0 belongs to the corresponding -dimensional subspace . This property of 1 can, actually, reflect two very different situations. One could distinguish between them by pursuing the perturbation calculation to higher orders of , and seeing whether the remaining degeneracy is removed. These two situations are the following: ( ) Suppose that there is only one exact energy ( ) that is equal, to first order, to + 1 , and that this energy is -fold degenerate [in Figure 1, for example, the energy ( ) that approaches 40 when 0 is two-fold degenerate, for any value of ]. A -dimensional eigensubspace then corresponds to the eigenvalue ( ), whatever , so that the degeneracy of the approximate eigenvalues will never be removed, to any order of . In this case, the zeroth-order eigenvector 0 of ( ) cannot be completely specified, since the only condition imposed on 0 is that of belonging to a subspace which is the limit, when 0, of the -dimensional eigensubspace of ( ) corresponding to ( ). This limit is none (1) other than the eigensubspace of ( ˆ ( ) ) associated with the eigenvalue 1 chosen. This first case often arises when 0 and possess common symmetry properties, implying an essential degeneracy for ( ). Such a degeneracy then remains to all orders in perturbation theory. 0

( ) It may also happen that several different energies ( ) are equal, to first order, to + 1 (the difference between these energies then appears in a calculation at second or higher orders). (1) In this case, the subspace obtained to first order is only the direct sum of the limits, for 0, of several eigensubspaces associated with these various energies ( ). In other words, all the eigenvectors of ( ) corresponding to these energies certainly approach kets of (1) (1) , but, inversely, a particular ket of is not necessarily the limit 0 of an eigenket of ( ). In this situation, going to higher order terms allows one, not only to improve the accuracy of the energies, but also to determine the zeroth-order kets 0 . However, in practice, the partial information contained in equation (C-5) is often considered sufficient. 0

Comments:

(i) When we use the perturbation method to treat all the energies9 of the spectrum of 0 , we must diagonalize the perturbation inside each of the eigensubspaces 0 corresponding to these energies. It must be understood that this problem is much simpler than the initial problem, which is the complete diagonalization of the Hamiltonian in the entire state space. Perturbation theory enables us to ignore completely the matrix elements of between vectors belonging to different subspaces 0 . Therefore, instead of having to diagonalize a generally infinite matrix, we need only diagonalize, for each of the energies 0 in which we are interested, a matrix of smaller dimensions, generally finite. (ii) The matrix ( ˆ ( ) ) clearly depends on the basis initially chosen in this subspace 0 (although the eigenvalues and eigenkets of ˆ ( ) obviously do not depend on it). Therefore, before we begin the perturbation calculation, it is advantageous to find a basis that simplifies as much as possible the form of 9 The perturbation of a non-degenerate state, studied in § B, can be seen as a special case of that of a degenerate state.

1126

C. PERTURBATION OF A DEGENERATE STATE

( ( ) ) for this subspace, and, consequently, the search for its eigenvalues and eigenvectors (the simplest situation is obviously the one in which this matrix is obtained directly in a diagonal form). To find such a basis, we often use observables which commute both10 with 0 and . Assume that we have an observable which commutes with 0 and Since 0 and commute, we can choose for the basis vectors eigenstates common to 0 and Furthermore, since commutes with , its matrix elements are zero between eigenvectors of associated with different eigenvalues. The matrix ( ( ) ) then contains numerous zeros, which facilitates its diagonalization. (iii) Just as for non-degenerate levels (cf. comment of § B-1-b), the method described in this section is valid only if the matrix elements of the perturbation are much smaller than the differences between the energy of the level under study and those of the other levels (this conclusion would have been evident if we had calculated higher-order corrections). However, it is possible to extend this method to the case of a group of unperturbed levels that are very close to each other (but distinct) and very far from all the other levels of the system being considered. This means, of course, that the matrix elements of the perturbation are of the same order of magnitude as the energy differences inside the group, but are negligible compared to the separation between a level in the group and one outside. We can then approximately determine the influence of the perturbation by diagonalizing the matrix which represents = 0+ inside this group of levels. It is by relying on an approximation of this type that we can, in certain cases, reduce the study of a physical problem to that of a two-level system, such as those described in Chapter IV (§ C). References and suggestions for further reading:

For other perturbation methods, see, for example: Brillouin-Wigner series (an expansion which is simple for all orders but which involves the perturbed energies in the energy denominators): Ziman (2.26), § 3.1. The resolvent method (an operator method which is well suited for the calculation of higher-order corrections): Messiah (1.17), Chap. XVI, § 111; Roman (2.3), § 4-5-d. Method of Dalgarno and Lewis (which replaces the summations over the intermediate states by differential equations): Borowitz (1.7). § 14-5; Schiff (1.18), Chap. 8, § 33. Original references: (2.34), (2.35), (2.36). The W.K.B. method, applicable to quasi-classical situations: Landau and Lifshitz (1.19), Chap. 7; Messiah (1.17), Chap. VI, § 11; Merzbacher (1.16), Chap. VII; Schiff (1.18), § 34; Borowitz (1.7), Chaps. 8 and 9. The Hartree and Hartree-Fock methods: see Complement EXV ; Messiah (1.17), Chap. XVIII, § 11; Slater (11.8), Chaps. 8 and 9 (Hartree) and 17 (Hartree-Fock); Bethe and Jackiw (1.21), Chap. 4. See also references of Complement AXIV .

10 Recall

that this does not imply that

0

and

commute.

1127

COMPLEMENTS OF CHAPTER XI, READER’S GUIDE

AXI , BXI , CXI and DXI : illustrations of stationary perturbation theory using simple and important examples.

AXI : A ONE-DIMENSIONAL HARMONIC OSCILLATOR SUBJECTED TO A PERTURBING POTENTIAL IN , 2 , 3

Study of a one-dimensional harmonic oscillator perturbed by a potential in , 2 , 3 . Simple, advised for a first reading. The last example (perturbing potential in 3 ) permits the study of the anharmonicity in the vibration of a diatomic molecule (a refinement on the model presented in Complement AV ).

BXI : INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPIN 1/2 PARTICLES

Can be considered as a worked example, illustrating perturbation theory for non-degenerate as well as degenerate states. Familiarizes the reader with the dipole-dipole interaction between magnetic moments of two spin 1 2 particules. Simple.

CXI : VAN DER WAALS FORCES

Study of the long-distance forces between two neutral atoms using perturbation theory (Van der Waals forces). The accent is placed on the physical interpretation of the results. A little less simple than the two preceding complements: can be reserved for later study.

DXI : THE VOLUME EFFECT: THE INFLUENCE OF THE SPATIAL EXTENSION OF THE NUCLEUS ON THE ATOMIC LEVELS

Study of the influence of the nuclear volume on the energy levels of hydrogen-like atoms. Simple. Can be considered as a sequel of Complement AVII .

EXI : THE VARIATIONAL METHOD

Presentation of another approximation method, the variational method. Important, since the applications of the variational method are very numerous.

1129

FXI : ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

GXI : A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

HXI : EXERCISES

1130

FXI , GXI : two important applications of the variational method. Introduction, using the strong-bonding approximation, of the concept of an allowed energy band for the electrons of a solid. Essential, because of its numerous applications. Moderetely difficult. The accent is placed on the interpretation of the results. The view point adopted is different from that of Complement OIII and somewhat simpler.

Studies the phenomenon of the chemical bond for the simplest possible case, that of the (ionized) H+ Shows how quantum mechanics 2 molecule. explains the attractive forces between two atoms whose electronic wave fonctions overlap. Includes a proof of the virial theorem. Essential from the point of view of chemical physics. Moderately difficult.

• HARMONIC OSCILLATOR PERTURBED BY A POTENTIAL IN

,

2

,

3

Complement AXI A one-dimensional harmonic oscillator subjected to a perturbing potential in , 2 , 3

1

Perturbation by a linear potential . . . . . . . . . . . . . . . 1131 1-a

The exact solution . . . . . . . . . . . . . . . . . . . . . . . . 1132

1-b

The perturbation expansion . . . . . . . . . . . . . . . . . . . 1133

2

Perturbation by a quadratic potential . . . . . . . . . . . . . 1133

3

Perturbation by a potential in

3

. . . . . . . . . . . . . . . . 1135

3-a

The anharmonic oscillator . . . . . . . . . . . . . . . . . . . . 1135

3-b

The perturbation expansion . . . . . . . . . . . . . . . . . . . 1136

3-c

Application: the anharmonicity of the vibrations of a diatomic molecule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1137

In order to illustrate the general considerations of Chapter XI by a simple example, we shall use stationary perturbation theory to study the effect of a perturbing potential in , 2 or 3 on the energy levels of a one-dimensional harmonic oscillator (none of these levels is degenerate, cf. Chap. V). The first two cases (a perturbing potential in and in 2 ) are exactly soluble. Consequently, we shall be able to verify in these two examples that the perturbation expansion coincides with the limited expansion of the exact solution with respect to the parameter that characterizes the strength of the perturbation. The last case (a perturbing potential in 3 ) is very important in practice for the following reason. Consider a potential ( ) which has a minimum at = 0. To a first approximation, ( ) can be replaced by the first term (in 2 ) of its Taylor series expansion, in which case we are considering a harmonic oscillator and, therefore, an exactly soluble problem. The next term of the expansion of ( ), which is proportional to 3 , then constitutes the first correction to this approximation. Calculation of the eflect of the term in 3 , consequently, is necessary whenever we want to study the anharmonicity of the vibrations of a physical system. It permits us, for example, to evaluate the deviations of the vibrational spectrum of diatomic molecules from the predictions of the (purely harmonic) model of Complement AV .

1.

Perturbation by a linear potential

We shall use the notation of Chapter V. Let: 2 0

=

2

+

1 2

2

2

(1) 1131



COMPLEMENT AXI

be the Hamiltonian of a one-dimensional harmonic oscillator of eigenvectors eigenvalues1 : 0

with

=

+

1 2

and

(2)

~

=0 1 2 We add to this Hamiltonian the perturbation: = ~ ˆ

(3)

where is a real dimensionless constant much smaller than 1, and ˆ is given by formula (B-1) of Chapter V (since ˆ is of the order of 1, ~ ˆ is of the order of 0 and plays the role of the operator ˆ of Chapter XI). The problem consists of finding the eigenstates and eigenvalues of the Hamiltonian: =

0

1-a.

+

(4)

The exact solution

We have already studied an example of a linear perturbation in : when the oscillator, assumed to be charged, is placed in a uniform electric field , we must add to 0 the electrostatic Hamiltonian: =

~ ˆ

=

(5)

where is the charge of the oscillator. The effect of such a term on the stationary states of the harmonic oscillator was studied in detail in Complement FV . It is therefore possible to use the results of this complement to determine the eigenstates and eigenvalues of the Hamiltonian given by (4) if we perform the substitution: ~

~

(6)

Expression (39) of FV thus yields immediately: =

+

1 2

2

~

2

(7)

~

Similarly, we see from (40) of FV (after having replaced the creation and annihilation operators and ): =e

2

(

)

by its expression in terms of

(8)

The expansion of the exponential then yields: = 1 =

2

(

)+ +1 2

+1

+

2

1

+

(9)

1 To specify that we are considering the unperturbed Hamiltonian, as in Chapter XI, we add the index 0 to the eigenvalue of 0 .

1132

• HARMONIC OSCILLATOR PERTURBED BY A POTENTIAL IN 1-b.

2

,

,

3

The perturbation expansion

We replace ˆ by We obtain: ~

=

1 ( 2

+ ) in (3) [cf. formula (B-7a) of Chapter V].

+

2

(10)

then mixes the state non-zero matrix elements of +1

=

1

=

only with the two states are, consequently:

+1

and

1

. The only

+1 ~ 2 2

(11)

~

According to general expression (B-15) of Chapter XI, we have: 2 0

=

+

+

+

0

0

(12)

= 0

Substituting (11) into (12) and replacing 2 0

= =

+0 +

1 2

0

by (

)~ , we immediately obtain:

2 ( + 1) ~ + ~ + 2 2 2

~

2

~ +

(13)

This shows that the perturbation expansion of the eigenvalue to second order in cides2 with the exact solution (7). Similarly, general formula (B-11) of Chapter XI: =

+

+

0

0

coin-

(14)

=

yields here: +1 2

=

+1

+

2

1

+

(15)

an expression which is identical to expansion (9) of the exact solution. 2.

Perturbation by a quadratic potential

We now assume = 2 It

to have the following form:

1 1 ~ ˆ2 = 2 2

2

2

(16)

can be shown that all terms of order higher than 2 in the perturbation expansion are zero.

1133

COMPLEMENT AXI

where



is a real dimensionless parameter much smaller than 1. 2

=

0

+

=

+

2

1 2

2

can then be written:

2

(1 + )

(17)

In this case, the effect of the perturbation is simply to change the spring constant of the harmonic oscillator. If we set: 2

=

2

(1 + )

(18)

we see that is still a harmonic oscillator Hamiltonian, whose angular frequency has become . In this section, we shall confine ourselves to the study of the eigenvalues of . According to (17) and (18), they can be written simply: =

+

1 2

=

~

1 2

+

1+

~

(19)

that is, expanding the radical: =

+

1 2

2

1+

~

2

+

8

(20)

Let us now find result (20) by using stationary perturbation theory. Expression (16) can also be written: 1 ~ 4 1 = ~ 4 =

2

+ 2

+

2

=

1 ~ 4

+2

2

+

2

+

+

+1

(21)

From this, it can be seen that the only non-zero matrix elements of are: 1 2 1 = 4 1 = 4 = +2

2

+

1 2

~

( + 1)( + 2) (

associated with

1 2

~

1 2

1)

(22)

~

When we use this result to evaluate the varions terms of (12), we find: =

0

=

0

=

+

2

+ +

+ +

1 2

1 2

2

1 2

~ ~

16

( + 1)( + 2) +

2

1 2

1)

~ + 2

2

~

8

+

2

~

1+

2

8

+

which indeed coincides with expansion (20). 1134

2 ~ + ( 2 16

(23)

• HARMONIC OSCILLATOR PERTURBED BY A POTENTIAL IN 3.

Perturbation by a potential in

We now add to = ~ where

3-a.

0

,

2

,

3

3

the perturbation:

ˆ3

(24)

is a real dimensionless number much smaller than 1.

The anharmonic oscillator

2 2 Figure 1 represents the variation with respect to of the total potential 12 + ( ) in which the particle is moving. The dashed line gives the parabolic potential 1 2 2 of the “unperturbed” harmonic oscillalor. We have chosen 0, so that the 2 total potential (the solid curve in the figure) increases less rapidly for 0 than for 0.

1 mω2x2 + W(x) 2

A

B

E

xA

0

xB

x

Figure 1: Variation of the potential associated with an anharmonic oscillator with respect to . We treat the difference between the real potential (solid line) and the harmonic potential (dashed line) of the unperturbed Hamiltonian as a perturbation ( and are the limits of the classical motion of energy ).

When the problem is treated in classical mechanics, the particle with total energy is found to oscillate between two points, and (Fig. 1), which are no longer symmetric with respect to . This motion, while it remains periodic, is no longer sinusoidal: there appears, in the Fourier expansion of ( ), a whole series of harmonics of the fundamental frequency. This is why such a system is called an “anharmonic oscillator” (its motion is no longer harmonic). Finally, let us point out that the period of the motion is no longer independent of the energy , as was the case for the harmonic oscillator. 1135

COMPLEMENT AXI

3-b.



The perturbation expansion

.

Matrix elements of the perturbation

We replace ˆ by 12 ( + ) in (24). Using relations (B-9) and (B-17) of Chapter V, we obtain, after a simple calculation: ~

=

3

23 2

where

=

+

3

+3

+ 3(

+ 1)

(25)

was defined in Chapter V [formula (B-13)].

From this can immediately be deduced the only non-zero matrix elements of associated with :

.

+3

=

3

=

+1

=3

1

=3

( + 3)( + 2)( + 1) 8 (

1)( 8 +1 2

1 2

~

1 2

2)

~

3 2

~

3 2

(26)

~

2

Calculation of the energies

We substitute results (26) into the perturbation expansion of , see relation (12). Since the diagonal element of is zero, there is no first-order correction. The four matrix elements (26) enter, however, into the second-order correction. A simple calculation thus yields:

=

+

1 2

15 4

~

2

+

1 2

2

~

7 16

2

~ +

(27)

The effect of is therefore to lower the levels (whatever the sign of ). The larger , the greater the shift (Fig. 2). The difference between two adjacent levels is equal to:

1

=~

1

15 2

2

(28)

It is no longer independent of , as it was for the harmonic oscillator. The energy states are no longer equidistant and move closer together as increases. 1136

• HARMONIC OSCILLATOR PERTURBED BY A POTENTIAL IN .

,

2

,

3

Calculation of the eigenstates Substituting relations (26) into expansion (14), we easily obtain: =

+1 2

3

3 2

+1

( + 3)( + 2)( + 1) 3 8 +

( 3

1)( 8

2)

1

2 1 2

+3

1 2

+

(29)

the state

is therefore mixed with the states

3

Under the effect of the perturbation +1 , 1 , +3 and 3 . 3-c.

3 2

+3

Application: the anharmonicity of the vibrations of a diatomic molecule

In Complement AV , we showed that a heteropolar diatomic molecule could absorb or emit electromagnetic waves whose frequency coincides with the vibrational frequency of the two nuclei of the molecule about their equilibrium position. If we denote by the displacement of the two nuclei from their equilibrium position , the electric dipole moment of the molecule can be written: ( )=

0

+

1

+

(30)

n+2

n+1

n

n–1

n–2

Figure 2: Energy levels of 0 (dashed lines) and of (solid lines). Under the effect of the perturbation , each level of 0 is lowered, and the higher , the greater the shift.

1137



COMPLEMENT AXI

The vibrational frequencies of this dipole are therefore the Bohr frequencies which can appear in the expression for ( ). For a harmonic oscillator, the selection rules satisfied by are such that only one Bohr frequency can be involved, the frequency 2 (cf. Complement AV ). When we take the perturbation into account, the states of the oscillator are “mixed” [cf. expression (29)], and can connect states and for which = 1: new frequencies can thus be absorbed or emitted by the molecule. To analyze this phenomenon more closely, we shall assume that the molecule is initially in its vibrational ground state 0 (this is practically always the case at ordinary temperatures since, in general, ~ ). By using expression (29), we can calculate, to first order3 in , the matrix elements of ˆ between the state 0 and an arbitrary state . A simple calculation thus yields the following matrix elements (all the others are zero to first order in ): 1

ˆ

0

2

ˆ

0

0

ˆ

0

1 2 1 = 2 3 = 2

=

(31a) (31b) (31c)

From this, we can find the transition frequencies observable in the absorption spectrum of the ground state. We naturally find the frequency: 1

=

1

0

(32a)

ˆ 0 is of which appears with the greatest intensity since, according to (31a), 1 zeroth-order in . Then, with a much smaller intensity [cf. formula (31b)], we find the frequency: 2

=

2

0

(32b)

which is often called the second harmonic (although it is not rigorously equal to twice 1 ). Comment: Result (31c) means that the average value of ˆ is not zero in the ground state. This can easily be understood from Figure 1, since the oscillatory motion is no longer symmetric about . If is negative (the case in Figure 1), the oscillator spends more time in the 0 region than in the 0 region, and the average value of must be positive. We thus understand the sign appearing in (31c).

The preceding calculation reveals only one new line in the absorption spectrum. Actually, the perturbation calculation could be pursued to higher orders in , taking into account higher order terms in expansion (30) of the dipole moment ( ), as well as 3 It would not be correct to keep terms of order higher than 1 in the calculation, since expansion (29) is valid only to first order in .

1138

• HARMONIC OSCILLATOR PERTURBED BY A POTENTIAL IN terms in 4 , frequencies:

5

in the expansion of the potential in the neighborhood of

2

,

3

= 0. All the

0

=

,

(33)

with = 3 4 5 would then be present in the absorption spectrum of the molecule (with intensities decreasing very rapidly when increases). This would finally give, for this spectrum, the form shown in Figure 3. This is what is actually observed.

0

ν1

ν2

ν3

ν4

ν

Figure 3: Form of the vibrational spectrum of a heteropolar diatomic molecule. A series of “harmonic” frequencies 2 , 3 appear in addition to the fundamental frequency 1 . This results from the anharmonicity of the potential, as well as higher order terms in the power series expansion in (the distance between the two atoms) of the molecular dipole moment ( ). Note that the corresponding lines are not quite equidistant and that their intensity decreases rapidly when increases.

Note that the various spectral lines of Figure 3 are not equidistant since, according to formula (28): 1

0=

1

1

=

2

2

2

=

3

3

0

1

2

= = =

15 2

1

2

1

2

15 45 2

1

2

2

(34)

2

(35) 2

(36)

which gives the relation: (

2

1)

1

=(

3

2)

(

2

1)

=

15 4

2

(37)

Thus we see that the study of the precise positions of the lines of the absorption spectrum makes it possible to find the parameter . 1139

COMPLEMENT AXI



Comments: (i) The constant appearing in (52) of Complement FVII can be evaluated by using formula (27) of the present complement. Comparing these two expressions and replacing by in (27), we obtain: =

15 4

2

(38)

Now, the perturbing potential in FVII is equal to it equal to ~ ˆ3 , that is, equal to: 3

5

1 2

3

3

, while here we have chosen

(39)

~ We therefore have: 1 2

~

=

3

5

(40)

which, substituted into (38), finally yields: =

15 4

2 3

~ 5

(41)

(ii) In the expansion of the potential in the neighborhood of = 0, the term in 4 is much smaller than the term in 3 but it corrects the energies to first order, while the term in 3 enters only in second order (cf. § 3-b- above). It is therefore necessary to evaluate these two corrections simultaneously (they may be comparable) when the spectrum of Figure 3 is studied more precisely.

References and suggestions for further reading: Anharmonicity of the vibrations of a diatomic molecule: Herzberg (12.4), vol. I, Chap. III, § 2.

1140

• INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPINS 1/2

Complement BXI Interaction between the magnetic dipoles of two spin 1/2 particles

1

2

3

The interaction Hamiltonian W . . . . . . . . . . . . . . . . . 1-a The form of the Hamiltonian W. Physical interpretation . . . 1-b An equivalent expression for W . . . . . . . . . . . . . . . . . 1-c Selection rules . . . . . . . . . . . . . . . . . . . . . . . . . . Effects of the dipole-dipole interaction on the Zeeman sublevels of two fixed particles . . . . . . . . . . . . . . . . . . . . 2-a Case where the two particles have different magnetic moments 2-b Case where the two particles have equal magnetic moments . 2-c Example: the magnetic resonance spectrum of gypsum . . . . Effects of the interaction in a bound state . . . . . . . . . .

1141 1141 1142 1143 1144 1144 1147 1149 1149

In this complement, we intend to use stationary perturbation theory to study the energy levels of a system of two spin 1/2 particles placed in a static field B0 and coupled by a magnetic dipole-dipole interaction. Such systems do exist. For example, in a gypsum monocrystal (CaSO4 , 2H2 0), the two protons of each crystallization water molecule occupy fixed positions, and the dipole-dipole interaction between them leads to a fine structure in the nuclear magnetic resonance spectrum. In the hydrogen atom, there also exists a dipole-dipole interaction between the electron spin and the proton spin. In this case, however, the two particles are moving relative to each other, and we shall see that the effect of the dipole-dipole interaction vanishes due to the symmetry of the 1 ground state. The hyperfine structure observed in this state is thus due to other interactions (contact interaction; cf. Chap. XII, §§ B-2 and D-2 and Complement AXII ). 1.

The interaction Hamiltonian W

1-a.

The form of the Hamiltonian W. Physical interpretation

Let S1 and S2 be the spins of particles (1) and (2), and M1 and M2 their corresponding magnetic moments: M1 =

1

S1

M2 =

2

S2

(1)

[where 1 and 2 are the gyromagnetic ratios of (1) and (2)]. We call the interaction of the magnetic moment M2 with the field created by M1 at (2). If n denotes the unit vector of the line joining the two particles and , the distance between them (Fig. 1), can be written: =

0

4

1 1 2 3

[S1 S2

3 (S1 n) (S2 n)]

(2) 1141

COMPLEMENT BXI



Figure 1: Relative disposition of the magnetic moments M1 and M2 of particles (1) and (2) ( is the distance between the two particles, and n is the unit vector of the straight line between them).

The calculation which enables us to obtain expression (2) is in every way analogous to the one that will be presented in Complement CXI and which leads to the expression for the interaction between two electric dipoles. 1-b.

An equivalent expression for W

Let ( )=

and 0

be the polar angles of n. If we set: 1 2 3

4

(3)

we get: = ( ) 3[

1

cos + sin ( [

= ( ) 3

1

cos +

2

1

cos

+

1

cos + sin (

1 sin 2

1+ e

cos +

2

+

1 sin 2

2

sin )] cos 1

+

2

sin )]

S1 S2

e 2+ e

+

2

e

S1 S2

(4)

that is: = ( )[ 1142

0

+

0

+

1

+

1

+

2

+

2]

(5)

• INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPINS 1/2 where: 0 0

= 3 cos2

1

1

1 3 cos2 4

=

2

1 (

1+

3 sin cos e ( 1 2 3 sin cos e ( 1 1 = 2 3 sin2 e 2 1+ 2+ 2 = 4 3 sin2 e2 1 2 2 = 4 1

=

+

2 2+

+

2+ )

1 1+

2

) (6)

+

2

1

2

)

Each of the terms (or ) appearing in (5) is, according to (6), the product of a function of and proportional to the second-order spherical harmonic 2 and an operator acting only on the spin degrees of freedom [the space and spin operators appearing in (6) are second-rank tensors; , for this reason, is often called the “tensor interaction”]. 1-c.

Selection rules

, and are the spherical coordinates of the relative particle associated with the system of two particles (1) and (2). The operator acts only on these variables and on the spin degrees of freedom of the two particles. Let be a standard basis in the state space r of the relative particle, and , the basis of eigenvectors common to 1 2 and in the spin state space ( = = ). The state space in which acts is 1 2 1 2 spanned by the basis, in which it is very easy, using expressions (5) 1 2 and (6), to find the selection rules satisfied by the matrix elements of .

Spin degrees of freedom 0

changes neither

0

“flips” both spins: +

1

+

1

nor

2.

and

+

+

flips one of the two spins up: +

2

Similarly, +

1

2

2

or

1

1

+

flips one of the two spins down:

2

Finally,

2

and

2

+ +

or

1

+

1

flip both spins up and down, respectively: and

+ + 1143



COMPLEMENT BXI

.

Orbital degrees of freedom

When we calculate the matrix element of ( ) between the state the state , the following angular integral appears: (

)

2

(

)

(

and

) dΩ

(7)

which, according to the results of Complement CX , is different from zero only for: =

2

=

+2

(8a)

+

(8b)

Note that the case = = 0, although not in contradiction with (8), is excluded because we must always be able to form a triangle with , and 2, which is impossible when = = 0. We must have then: 1 2.

(8c)

Effects of the dipole-dipole interaction on the Zeeman sublevels of two fixed particles

In this section, we shall assume the two particles to be fixed in space. We shall therefore quantize only the spin degrees of freedom, considering the quantities , and as given parameters. The two particles are placed in a static field B0 parallel to . The Zeeman Hamiltonian 0 , describing the interaction of the two spin magnetic moments with B0 , can then be written: =

1 1

1

=

1

0

2

=

2

0

0

+

(9)

2 2

with:

(10)

In the presence of the dipole-dipole interaction becomes: =

0

of the system

+

(11)

We shall assume the field 2-a.

, the total Hamiltonian

0

to be large enough and treat

as a perturbation of

0.

Case where the two particles have different magnetic moments

.

Zeeman levels and the magnetic resonance spectrum in the absence of interaction According to (9), we have: 0

1144

1

2

=

~ ( 2

1 1

+

2 2)

1

2

(12)

• INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPINS 1/2 +, +

ħ (ω + ω2) 2 1

+, +

+, –

ħ (ω – ω2) 2 1

+, –

ħ (ω – ω2) 2 1

–, +





ħ (ω + ω2) 2 1

–, –

–, +

–, –

a

ħΩ

– ħΩ

– ħΩ

ħΩ b

Figure 2: Energy levels of two spin 1/2 particles, placed in a static field B0 parallel to . The two Larmor angular frequencies, 1 = 1 0 and 2 = 2 0 , are assumed to be different. For figure a, the energy levels are calculated without taking account of the dipole-dipole interaction between the two spins. For figure b, we take this interaction into account. The levels undergo a shift whose approximate value, to first order in , is indicated on the right-hand side of the figure. The solid-line arrows join the levels between which 1 has a non-zero matrix element, and the shorter dashed-line arrows those for which 2 does.

Figure 2a represents the energy levels of the two-spin system in the absence of the dipoledipole interaction (we have assumed 1 0). Since 1 = 2 , these levels are all 2 non-degenerate. If we apply a radio-frequency field B1 cos parallel to Ox, we obtain a series of magnetic resonance lines. The frequencies of these resonances correspond to the various Bohr frequencies which can appear in the evolution of 1 1 + 2 2 (the radiofrequency field interacts with the component along Ox of the total magnetic moment). The solid-line (dashed-line) arrows of Figure 2a join levels between which 1 ( 2 ) has a non-zero matrix element. Thus we see that there are two distinct Bohr angular frequencies, equal to 1 and 2 (Fig. 3a), which correspond simply to the resonances of the individual spins, (1) and (2). .

Modifications created by the interaction

Since all the levels of Figure 2a are non-degenerate, the effect of can be obtained to first order by calculating the diagonal elements of 1 2 1 2 . It is clear from expressions (5) and (6) that only the term 0 makes a non-zero contribution to this 1145

COMPLEMENT BXI



ω2

ω1

a





ω2

ω1

b

Figure 3: The Bohr frequencies appearing in the evolution of 1 and 2 give the positions of the magnetic resonance lines that can be observed for the two-spin system (the transitions corresponding to the arrows of Figure 2). In the absence of a dipoledipole interaction, two resonances are obtained, each one corresponding to one of the two spins (fig. a). The dipole-dipole interaction is expressed by a splitting of each of the two preceding lines (fig. b).

matrix element, which is then equal to: 1

2

1

2

= ( ) 3 cos2

1

1 2~

4

2

=

1 2

~Ω

(13)

with: Ω= Since Ω

~ ( ) 3 cos2 4

1 =

is much smaller than 1

2

~ 16

0

0,

1 2 3

3 cos2

1

(14)

we have: (15)

From this we can immediately deduce the level shifts to first order in : ~Ω for + + and , and ~Ω for + and for + (Fig. 2b). What now happens to the magnetic resonance spectrum of Figure 3a? If we are concerned only with lines whose intensities are of zeroth order in (that is, those that approach the lines of Figure 2a when approaches zero), then to calculate the Bohr frequencies appearing in 1 and 2 we simply use the zeroth-order expressions for the eigenvectors1 . It is then the same transitions which are involved (compare the arrows of Figures 2a and 2b). We see, however, that the two lines which correspond to the frequency 1 in the absence of coupling (solid-line arrows) now have different frequencies: 1 If we used higher-order expressions for the eigenvectors, we would see other lines of lower intensity appear (they disappear when 0).

1146

• INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPINS 1/2 2Ω. Similarly, the two lines corresponding to 2 (dashed-line arrows) 1 + 2Ω and 1 now have frequencies of 2 + 2Ω and 2 2Ω. The magnetic resonance spectrum is therefore now composed of two “doublets” centered at 1 and 2 , the interval between the two components of each doublet being equal to 4Ω (Fig. 3b). Thus, the dipole-dipole interaction leads to a fine structure in the magnetic resonance spectrum, for which we can give a simple physical interpretation. The magnetic moment M1 associated with S1 creates a “local field” b at particle (2). Since we assume B0 to be very large, S1 precesses very rapidly about Oz, so we can consider only the 1 component (the local field created by the other components oscillates too rapidly to have a significant effect). The local field b therefore has a different direction depending on whether the spin is in the state + or , that is, depending on whether it points up or down. It follows that the total field “seen” by particle (2), which is the sum of B0 and b, can take on two possible values2 . This explains the appearance of two resonance frequencies for the spin (2). The same argument would obviously enable us to understand the origin of the doublet centered at 1 . 2-b.

Case where the two particles have equal magnetic moments

.

Zeeman levels and the magnetic resonance spectrum in the absence of the interaction Formula (12) remains valid if we choose

1

and

2

to be equal. We shall therefore

set: 1

=

2

=

=

(16)

0

The energy levels are shown in Figure 4a. The upper level, + + , and the lower level, , of energies ~ and ~ , are non-degenerate. On the other hand, the intermediate level, of energy 0, is two-fold degenerate: to it correspond the two eigenstates + and +. The frequencies of the magnetic resonance lines can be obtained by finding the Bohr frequencies involved in the evolution of 1 + 2 (the total magnetic moment is now proportional to the total spin S = S1 + S2 ). We easily obtain the four transitions represented by the arrows in Figure 4a, which correspond to a single angular frequency . This finally yields the spectrum of Figure 5a. .

Modifications created by the interaction

The shifts of the non-degenerate levels + + and can be obtained as they were before, and are both equal to ~Ω [we must replace, however, 1 and 2 by in expression (14) for Ω]. Since the intermediate level is two-fold degenerate, the effect of on this level can now be obtained by diagonalizing the matrix that represents the restriction of to the subspace + + . The calculation of the diagonal elements is performed as above and yields: + 2 Actually,

+

=

since B0

+

+ =

~Ω

(17)

b , it is only the component of b along B0 which is involved.

1147

COMPLEMENT BXI

• S, M +, + ħω

0

+, –

–, +

1, 1

ħΩ

0, 0

– 2ħΩ

1, 0

1, – 1

– ħω

ħΩ

–, – a

b

Figure 4: The two spin 1/2 particles are assumed to have the same magnetic moment and, conseqnently, the same Larmor angular frequency = 0. In the absence of a dipole-dipole interaction, we obtain three levels, one of which is twofold degenerate (fig. a). Under the effect of the dipole-dipole interaction (fig. b), these levels undergo shifts whose approximate values (to first order in ) are indicated on the right-hand side of the figure. To zeroth-order in , the stationary states are the eigenstates of the total spin. The arrows join the levels between which 1 + 2 has a non-zero matrix element.

As for the non-diagonal element + that only the term 0 contributes to it: +

+ = =

+ , we easily see from expressions (5) and (6)

( ) 3 cos2 1 + ( 4 ~2 ( ) 3 cos2 1 = ~Ω 4

1+ 2

+

1

2+ )

+ (18)

We are then led to the diagonalization of the matrix: ~Ω

11 11

(19)

whose eigenvalues are 2~Ω and 0; they are respectively associated with the eigenvectors 1 1 (+ + + ) and 2 = (+ + ). 1 = 2 2 Figure 4b represents the energy levels of the system of two coupled spins. The energies, to first order in are given by the eigenstates to zeroth order. Note that these eigenstates are none other than the eigenstates common to S2 and , where S = S1 + S2 is the total spin. Since the operator commutes with S2 , it can only couple the triplet states, that is, 1 0 to 1 1 and 1 0 to 1 1 . This 1148

• INTERACTION BETWEEN THE MAGNETIC DIPOLES OF TWO SPINS 1/2



a

ω

b

ω

Figure 5: Shape of the magnetic resonance spectrum which can be observed for a system of two spin 1/2 particles, with the same gyromagnetic ratio, placed in a static field 0 . In the absence of a dipole-dipole interaction, we observe a single resonance (fig. a). In the presence of a dipole-dipole interaction (fig. b), the preceding line splits. The separation 6Ω between the two components of the doublet is proportional to 3 cos2 1, where is the angle between the static field 0 and the straight line joining the two particles.

gives the two transitions represented by the arrows in Figure 4b, and to which correspond the Bohr frequencies + 3Ω and 3Ω. The magnetic resonance spectrum is therefore composed of a doublet centered at , the separation between the two components of the doublet being equal to 6Ω (Fig. 5b). 2-c.

Example: the magnetic resonance spectrum of gypsum

The case studied in § 2-b above corresponds to that of two protons of a crystallization water molecule in a gypsum monocrystal (CaSO4 , 2H2 O). These two protons have identical magnetic moments and can be considered to occupy fixed positions in the crystal. Moreover, they are much closer to each other than to other protons (belonging to other water molecules). Since the dipole-dipole interaction decreases very quickly when the distance increases (1 3 law), we can neglect interactions between protons belonging to other water molecules. The magnetic resonance spectrum is indeed observed to contain a doublet3 whose separation depends on the angle between the field B0 and the straight line joining the two protons. If we rotate the crystal with respect to the field B0 , this angle varies, and the separation between the two components of the doublet changes. Thus, by studying the variations of this separation, we can determine the positions of the water molecules relative to the crystal axes. When the sample under study is not a monocrystal, but rather a powder composed of small, randomly oriented monocrystals, takes on all possible values. We then observe a wide band, due to the superposition of doublets having different separations. 3.

Effects of the interaction in a bound state

We shall now assume that the two particles, (1) and (2), are not fixed, but can move with respect to each other. 3 Actually, in a gypsum monocrystal, there are two different orientations for the water molecules, and, consequently, two doublets corresponding to the two possible values of .

1149



COMPLEMENT BXI

Consider, for example, the case of the hydrogen atom (a proton and an electron). When we take only the electrostatic forces into account, the ground state of this atom (in the center of mass frame) is described by the ket 1 0 0 , labeled by the quantum numbers = 1, = 0, = 0 (cf. Chap. VII). The proton and the electron are spin 1/2 particles. The ground state is therefore four-fold degenerate, and a possible basis in the corresponding subspace is made up of the four vectors: 100

1

(20)

2

where 1 , and 2 , equal to + or , represent respectively the eigenvalues of and (S and I: the electron and proton spins). What is the effect on this ground state of the dipole-dipole interaction between S and I? The matrix elements of are much smaller than the energy difference between the 1 level and the excited levels, so that it is possible to treat the effect of by perturbation theory. To first order, it can be evaluated by diagonalizing the 4 4 matrix of elements 100 1 2 1 0 0 1 2 . The calculation of these matrix elements, according to (5) and (6), involves angular integrals of the form: 0 0

(

)

2

(

0 0(

)

) dΩ

(21)

which are equal to zero, according to the selection rules established in § 1-c above [in this particular case, it can be shown very simply that integral (21) is equal to zero: since 00 is a constant, expression (21) is proportional to the scalar product of 2 and 00 , which is equal to zero because of the spherical harmonic orthogonality relations]. The dipole-dipole interaction does not modify the energy of the ground state to first order. It enters, however, into the (hyperfine) structure of the excited levels with 1. We must then calculate the matrix elements 1 2 , that 1 2 is, the integrals: (

)

2

(

)

(

) dΩ

which, according to (8c), become non-zero as soon as

(22) 1.

References and suggestions for further reading: Evidence in nuclear magnetic resonance experiments of the magnetic dipole interactions between two spins in a rigid lattice: Abragam (14.1), Chap. IV, § II and Chap. VII, § IA; Slichter (14.2), Chap. 3; Pake (14.6).

1150



VAN DER WAALS FORCES

Complement CXI Van der Waals forces

1

2

3

4

The electrostatic interaction Hamiltonian for two hydrogen atoms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-a Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-b Calculation of the electrostatic interaction energy . . . . . . . Van der Waals forces between two hydrogen atoms in the 1 ground state . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2-a Existence of a attractive potential . . . . . . . . . . . 2-b Approximate calculation of the constant C . . . . . . . . . . 2-c Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Van der Waals forces between a hydrogen atom in the 1 state and a hydrogen atom in the 2 state . . . . . . . . . . 3-a Energies of the stationary states of the two-atom system. Resonance effect . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-b Transfer of the excitation from one atom to the other . . . . Interaction of a hydrogen atom in the ground state with a conducting wall . . . . . . . . . . . . . . . . . . . . . . . . . . .

1152 1152 1153 1154 1154 1156 1156 1158 1158 1159 1160

The character of the forces exerted between two neutral atoms changes with the order of magnitude of the distance separating these two atoms. Consider, for example, two hydrogen atoms. When is of the order of atomic dimensions (that is, of the order of the Bohr radius 0 ), the electronic wave functions overlap, and the two atoms attract each other, since they tend to form an H2 molecule. The potential energy of the system has a minimum1 for a certain value of the distance between the atoms. The physical origin of this attraction (and therefore of the chemical bond) lies in the fact that the electrons can oscillate between the two atoms (cf. §§ C-2-c and C-3-d of Chapter IV). The stationary wave functions of the two electrons are no longer localized about only one of the nuclei; this lowers the energy of the ground state (cf. Complement GXI ). At greater distances, the phenomena change completely. The electrons can no longer move from one atom to the other, since the probability amplitude of such a process decreases with the decreasing overlap of the wave functions, that is, exponentially with the distance. The preponderant effect is then the electrostatic interaction between the electric dipole moments of the two neutral atoms. This gives rise to a total energy which is attractive and which decreases, not exponentially, but with 1 R6 . This is the origin of the Van der Waals forces, which we intend to study in this complement by using stationary perturbation theory (confining ourselves, for the sake of simplicity, to the case of two hydrogen atoms). It should be clearly understood that the fundamental nature of Van der Waals forces is the same as that of the forces responsible for the chemical bond: the basic 1 At

very short distances, the repulsive forces between the nuclei always dominate.

1151

COMPLEMENT CXI



Figure 1: Relative position of the two hydrogen atoms. is the distance between the two protons, which are situated at and , and n is the unit vector on the line joining them. r and r are the position vectors of the two electrons with respect to points and respectively.

Hamiltonian is electrostatic in both cases. Only the variation of the energies of the quantum stationary states of the two-atom system with respect to allows us to define and differentiate these two types of forces. Van der Waals forces play an important role in physical chemistry, especially when the two atoms under consideration have no valence electrons (forces between rare gas atoms, stable molecules, etc.). They are partially responsible for the differences between the behavior of a real gas and that of an ideal gas. Finally, as we have already said, these are long-range forces, and are therefore involved in the stability of colloids. We shall begin by determining the expression for the dipole-dipole interaction Hamiltonian between two neutral hydrogen atoms (§ 1). This will enable us to study the Van der Waals forces between two atoms in the 1 state (§ 2), or between an atom in the 2 state and an atom in the 1 state (§ 3). Finally, we shall show (§ 4) that a hydrogen atom in the 1 state is attracted by its electrical mirror image in a perfectly conducting wall. 1.

The electrostatic interaction Hamiltonian for two hydrogen atoms

1-a.

points

Notation

The two protons of the two hydrogen atoms are assumed to remain motionless at and (Fig. 1). We shall set:

R = OB = R R n= R 1152

OA

(1) (2) (3)



VAN DER WAALS FORCES

is the distance between the two atoms, and n is the unit vector on the line that joins them. Let r be the position vector of the electron attached to atom ( ) with respect to point , and r , the position vector of the electron attached to atom with respect to . We call: = r

(4)

= r

(5)

the electric dipole moments of the two atoms ( is the electron charge). We shall assume throughout this complement that: r

(6)

r

Although they are identical, the electrons of the two atoms are well separated, and their wave functions do not overlap. It is therefore not necessary to apply the symmetrization postulate (cf. Chap. XIV, § D-2-b). 1-b.

Calculation of the electrostatic interaction energy

Atom ( ) creates at ( ) an electrostatic potential with which the charges of ( ) interact. This gives rise to an interaction energy . We saw in Complement EX that can be calculated in terms of n and the multipole moments of atom ( ). Since ( ) is neutral, the most important contribution to is that of the electric dipole moment . Similarly, since ( ) is neutral, the most important term in comes from the interaction between the dipole moment of ( ) and the electric field E = ∇ which is essentially created by . This explains the name of “dipole-dipole interaction” given to the dominant term of . There exist, of course, smaller terms (dipole-quadrupole and , quadrupole-quadrupole , etc.), and is written: =

+

+

+

+

(7)

To calculate , we shall start with the expression for the electrostatic potential created by at ( ): (R) =

1 4

R

(8)

3

0

which leads to: E=

∇R

=

1 4

0

3

[r

3 (r

(9)

n) n]

and, consequently: =

E

B

=

e2 3

[r

r

3 (r

n) (r

n)]

(10)

We have set 2 = 2 4 0 , and we have used expressions (4) and (5) for and . In this complement, we shall choose the axis parallel to n, so that (10) can be written: 2

=

3

(

+

2

)

(11) 1153

COMPLEMENT CXI



In quantum mechanics, becomes the operator , which can be obtained by replacing in (11) by the corresponding observables , , which act in the state spaces and of the two hydrogen atoms2 : 2

= 2.

3

(

+

2

)

(12)

Van der Waals forces between two hydrogen atoms in the 1 ground state

2-a.

6

Existence of a

.

attractive potential

Principle of the calculation The Hamiltonian of the system is: =

where

(

0

+

0

+

(13)

0 and 0 are the energies of atoms ( ) and ( ) when they are isolated. In the absence of , the eigenstates of are given by the equation:

+

0

0

)

;

=(

+

)

;

(14)

where the and the were calculated in § C of Chapter VII. In particular, the ground state of 0 + 0 is 1 0 0 ; 1 0 0 , of energy 2 . It is non-degenerate (we do not take spins into account). The problem is to evaluate the shift in this ground state due to and, in particular, its -dependence. This shift represents, so to speak, the interaction potential energy of the two atoms in the ground state. Since is much smaller than 0 and 0 , we can calculate this effect by stationary perturbation theory. .

First-order effect of the dipole-dipole interaction Let us show that the first-order correction: 1

=

100

;

100

1 0 0;

100

(15)

is zero; the energy 1 involves, according to expression (12) for , products of the form (and analogous quantities in which is 100 100 100 100 replaced by and by ). These products are zero since, in a stationary state of the atom, the average values of the components of the position operator are zero.

Comment:

The other terms, of expansion (7) involve products of two multipole moments, one relative to ( ) and the other one to ( ), at least one of which is of order higher than 1. Their contributions are also zero to first order: they 2 The translational external degrees of freedom of the two atoms are not quantized: for the sake of simplicity, we assume the two protons to be infinitely heavy and motionless. In (12), is therefore a parameter and not an observable.

1154



VAN DER WAALS FORCES

are expressed in terms of average values in the ground state of multipole operators of order greater than or equal to one, and we know (cf. Complement EX , § 2-c) that such average values are zero in an = 0 state (triangle rule of ClebschGordan-coefficients). Therefore we must find the second-order effect of , which constitutes the most important energy correction. .

Second-order effect of the dipole-dipole interaction

According to the results of Chapter XI, the second-order energy correction can be written: 2

; 2

1 0 0;

=

100

(16)

2

where the notation means that the state 1 0 0 ; 1 0 0 is excluded from the summation3 . Since is proportional to 1 3 , 2 is proportional to 1 6 . Furthermore, all the energy denominators are negative, since we are starting from the ground state. Therefore, the dipole-dipole interaction gives rise to a negative energy proportional to 1 6 : 2

=

(17)

6

Van der Waals forces are therefore attractive and vary with 1 7 . Finally, let us calculate the expansion of the ground state to first order in We find, according to formula (B-11) of Chapter XI: 0

=

100

;

.

100

; +

;

1 0 0;

100

2

+ (18)

Comment: The matrix elements appearing in expressions (16) and (18) above involve the quantities (and analogous quantities in which and 100 100 are replaced by and or and ), which are different from zero only if = 1 and = 1. These quantites are indeed proportional to products of angular integrals (Ω )

1

(Ω )

0 0

(Ω ) dΩ

(Ω )

1

(Ω )

which, according to the results of Complement CX , are zero if therefore, in (16) and (18), replace and by 1.

3 This

of

0

+

0 0

(Ω ) dΩ

= 1 or

= 1. We can

summation is performed not only over the bound states, but also over the continuous spectrum 0

1155

COMPLEMENT CXI

2-b.



Approximate calculation of the constant C

According to (16) and (12), the constant C appearing in (17) is given by: 2

;

= e4

(X

+ 2

2

+

)

1 0 0;

100

+ (19)

2 We must have > 2 and > 2. For bound states, = is smaller than , and the error is not significant if we replace in (19) and by 0. For states in the continuous spectrum, varies between 0 and + . The matrix elements of the numerator become small, however, as soon as the size of becomes appreciable, since the spatial oscillations of the wave function are then numerous in the region in which 1 0 0 (r) is non-zero. To have an idea of the order of magnitude of C, we can therefore replace all the energy denominators of (19) by 2E . Using the closure relation and the fact that the diagonal element of is zero (§ 2-a- ), we then get:

e4 2

1 0 0;

100

(X

+

)2

2

1 0 0;

100

(20)

This expression is simple to calculate: because of the spherical symmetry of the 1 state, the average values of the cross terms of the type are zero. Furthermore, and for the same reason, the various quantities: 2

2

100

100

100

2 100

100 2

are all equal to one third of the average value of R = using the expression for the wave function 1 0 0 (r): e4 2 (where 2

0

6

+

2

+

2

. We finally obtain,

2

R2 3

100

100 2

100

=6

2 5 0

(21)

is the Bohr radius) and, consequently:

6

2

5 0 6

2

=

6

0

5

(22)

The preceding calculation is valid only if 0 (no overlapping of the wave functions). Thus we see that 2 is of the order of the electrostatic interaction between two charges 5 and , multiplied by the reduction factor ( 0 ) 1. 2-c.

.

Discussion

“Dynamical” interpretation of Van der Waals forces

At any given instant, the electric dipole moment (we shall say, more simply, the dipole) of each atom has an average value of zero in the ground state 1 0 0 or 1 0 0 . This does not mean that any individual measurement of a component of this dipole will yield zero. If we make such a measurement, we generally find a non-zero value; however, 1156



VAN DER WAALS FORCES

we have the same probability of finding the opposite value. The dipole of a hydrogen atom in the ground state should therefore be thought of as constantly undergoing random fluctuations. We shall begin by neglecting the influence of one dipole on the motion of the other one. Since the two dipoles are then fluctuating randomly and independently, their mean interaction is zero: this explains the fact that has no first-order effect. However, the two dipoles are not really independent. Consider the electrostatic field created by dipole ( ) at ( ) This field follows the fluctuations of dipole ( ). The dipole it induces at ( ) is therefore correlated with dipole ( ) so the electrostatic field which “returns” to ( ) is no longer uncorrelated with the motion of dipole ( ). Thus, although the motion of dipole ( ) is random, its interaction with its own field, which is “reflected” to it by ( ), does not have a average value of zero. This is the physical interpretation of the second-order effect of . The dynamical aspect is therefore useful for understanding the origin of Van der Waals forces. If we were to think of the two hydrogen atoms in the ground state as two spherical and “static” clouds of negative electricity (with a positive point charge at the center of each one), we would be led to a rigorously zero interaction energy. .

Correlations between the two dipole moments

Let us show more precisely that there exists a correlation between the two dipoles. When we take into account, the ground state of the system is no longer 0 [cf. expression (18)]. As shown below, a simple calculation 1 0 0 ; 1 0 0 , but yields: 0

0

to first order in

=

=

0

0

=0

(23)

.

Consider, for example, the matrix element The zeroth-order term, 0 0 . in the ground 100 1 0 0 ; 1 0 0 is zero, since it is equal to the average value of state 1 0 0 . To first order, the summation appearing in formula (18) must be included. Since contains only products of the form , the coefficients of the kets 1 0 0 ; and ; 1 0 0 in this summation are zero. The first-order terms which could be different from zero are therefore proportional to 1 0 0;

;

1 0 0;

These terms are all zero since

100

with

=0

does not act on

and 100

and

= 0; 100

= 0 for

= 0.

Thus, even in the presence of an interaction, the average values of the components of each dipole are zero. This is not surprising: in the interpretation of § 2-c- , the dipole induced in ( ) by the field of dipole ( ) fluctuates randomly, like this field, and has, consequently, an average value of zero. Let us show, on the other hand, that the two dipoles are correlated, by evaluating the average value of a product of two components, one relative to dipole ( ) and the other, to dipole ( ). We shall calculate 0 ( + 2 ) 0 , for example, which, according to (12), is nothing more than ( 3 2 ) 0 0 . Using (18), 1157

COMPLEMENT CXI



we immediately find, taking (15) and (16) into account, that: 3 0

(

+

2

)

0

=2

2

2

=0

(24)

Thus, the average values of the products , and are not zero, as would be the products of average values , and according to (23). This proves the existence of a correlation between the two dipoles. .

Long-range modification of Van der Waals forces

The description of § 2-c- above enables us to understand that the preceding calculations are no longer valid if the two atoms are too far apart. The field produced by ( ) and “reflected” by ( ) returns to ( ) with a time lag due to the propagation ( ) ( ) ( ), and we have argued as if the interactions were instantaneous. We can see that this propagation time can no longer be neglected when it becomes of the order of the characteristic times of the atom’s evolution, that is, of the order of 2 1 , where 1 = ( 1 ) ~ denotes a Bohr angular frequency. In other words, the calculations performed in this complement assume that the distance between the two atoms is much smaller than the wavelengths 2 1 of the spectrum of these atoms (about 1 000 ˚ A). A calculation which takes propagation effects into account gives an interaction energy which, at large distances, decreases as 1 7 . The 1 6 law which we have found therefore applies to an intermediate range of distances, neither too large (because of the time lag) nor too small (to avoid overlapping of the wave functions). 3.

3-a.

Van der Waals forces between a hydrogen atom in the 1 state and a hydrogen atom in the 2 state Energies of the stationary states of the two-atom system. Resonance effect

The first excited level of the unperturbed Hamiltonian 0 + 0 is eight-fold degenerate. The associated eigensubspace is spanned by the eight states : with = 1 0 +1 ; 1 0 0; 200 ; 2 0 0; 100 ; 1 0 0; 21 ; with = 1 0 +1 , which correspond to a situation in which one 100 21 of the two atoms is in the ground state, while the other one is in a state of the = 2 level. According to perturbation theory for a degenerate state, to obtain the first-order effect of , we must diagonalize the 8 8 matrix representing the restriction of to the eigensubspace. We shall show that the only non-zero matrix elements of are those which connect a state 1 0 0 ; 2 1 to the state 2 1 ; 1 0 0 . The operators appearing in the expression for are odd and can therefore couple 1 0 0 only to one of the 2 1 ; an analogous argument is valid for . Finally, the dipole-dipole interaction is invariant under a rotation of the two atoms about the axis which joins them; therefore commutes with + and can only join two states for which the sum of the eigenvalues of and is the same. Therefore, the preceding 8 8 matrix can be broken down into four 2 2 matrices. One of them is entirely zero (the one which concerns the 2 states), and the other three 1158



VAN DER WAALS FORCES

are of the form: 3

0 3

(25)

0

where we have set: 1 0 0;

21

21

;

100

=

is a calculable constant, of the order of 2 20 , which will not be specified here. We can easily diagonalize matrix (25), obtaining the eigenvalues + 3 , associated respectively with the eigenstates: 1 2

1 0 0;

21

1 0 0;

21

+

21

;

100

21

;

100

(26)

3

3

and

and: 1 2

This reveals the following important results: 3 – The interaction energy varies as and not as 1 6 , since now modifies the energies to first order. The Van der Waals forces are therefore more important than they were between two hydrogen atoms in the 1 state (resonance effect between two different states of the total system with the same unperturbed energy). 3 – The sign of the interaction can be positive or negative (eigenvalues + and 3 ). There exist therefore states of the two-atom system for which there is attraction, and others for which there is repulsion.

3-b.

Transfer of the excitation from one atom to the other

The two states 1 0 0 ; 2 1 and 2 1 ; 1 0 0 have the same unperturbed energy and are coupled by a non-diagonal perturbation. According to the general results of § C of Chapter IV (two-level system), we know that there is oscillation of the system from one level to the other with a frequency proportional to the coupling. Therefore, if the system starts in the state 1 0 0 ; 2 1 at = 0, it arrives after a certain time (the larger , the longer the time) in the state 2 1 ; 1 0 0 . The excitation thus passes from ( ) to ( ), then returns to ( ), and so on.

Comment: If the two atoms are not fixed but, for example, undergo collision, varies over time and the passage of the excitation from one atom to the other is no longer periodic. The corresponding collisions, called resonant collisions, play an important role in the broadening of spectral lines.

1159

COMPLEMENT CXI

• z

rA A

d

O

A′ r′A Figure 2: To calculate the interaction energy of a hydrogen atom with a perfectly conducting wall, we can assume that the electric dipole moment r of the atom interacts with its electrical image r ( is the distance between the proton and the wall).

4.

Interaction of a hydrogen atom in the ground state with a conducting wall

We shall now consider a single hydrogen atom ( ) situated at a distance from a wall which is assumed to be perfectly conducting. The axis is taken along the perpendicular to the wall passing through (Fig. 2). The distance is assumed to be much larger than the atomic dimensions. We can therefore ignore the atomic structure of the wall, and assume that the atom interacts with its electrical image on the other side of this wall (that is, with a symmetrical atom with opposite charges). The dipole interaction energy between the atom and the wall can easily be obtained from expression (12) for by making the following substitutions: 2

2

2 = = = (the change of

2

to

(27)

2

is due to the sign difference of the image charges).

Furthermore it is necessary to divide by 2 since the dipole image is fictitious, 1160



VAN DER WAALS FORCES

proportional to the electric dipole of the atom4 . We then get: 2

=

16

(

3

2

2

+

+2

2

)

(28)

which represents the interaction energy of the atom with the wall [ acts only on the degrees of freedom of ( )]. If the atom is in its ground state, the energy correction to first order in is then:

1

=

100

(29)

100

Using the spherical symmetry of the l state, we obtain: 2 1

=

16

3

4

100

R2 3

100

=

2 2 0 4 3

(30)

We see that the atom is attracted by the wall: the attraction energy varies as 1/ 3 , and, therefore, the force of attraction varies as 1/ 4 . The fact that has an effect even to first order can easily be understood in terms of the discussion of § 2-c above. In the present case, there is a perfect correlation between the two dipoles, since they are images of each other. References and suggestions for further reading:

Kittel (13.2), Chap. 3. p. 82; Davydov (1.20), Chap. XII. §§ 124 and 125; Langbein (12.9). For a discussion of retardation effects, see: Power (2.11), §§ 7.5 and 8.4 (quantum electrodynamic approach); Landau and Lifshitz (7.12), Chap. XIII, § 90 (electromagnetic fluctuation approach). See also Derjaguin’s article (12.12).

4 This 1 2 factor is easily understood if one remembers that the energy of an electrostatic system is proportional to the integral over all space of the square of the electric field. For the system of Fig. 2, the electric field is zero below the xOy plane.

1161



COMPLEMENT DXI

Complement DXI The volume effect: the influence of the spatial extension of the nucleus on the atomic levels

1

2

First-order energy correction . . . . . . . . . 1-a Calculation of the correction . . . . . . . . . 1-b Discussion . . . . . . . . . . . . . . . . . . . . Application to some hydrogen-like systems . 2-a The hydrogen atom and hydrogen-like ions . 2-b Muonic atoms . . . . . . . . . . . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

1164 1164 1165 1166 1166 1167

The energy levels and the stationary states of the hydrogen atom were studied in Chapter VII by assuming the proton to be a charged point particle, which creates an electrostatic 1 Coulomb potential. Actually, this is not quite true. The proton is not strictly a point charge; its charge fills a volume which has a certain size (of the order of 1 fermi = 10 13 cm). When an electron is extremely close to the center of the proton, it “sees” a potential that no longer varies as 1 , and which depends on the spatial charge distribution associated with the proton. This is true, furthermore, for all atoms: inside the volume of the nucleus, the electrostatic potential depends on how the charges are distributed. We thus expect the atomic energy levels, which are determined by the potential to which the electrons are subject at all points of space, to be affected by this distribution: this is what is called the “volume effect”. The experimental and theoretical study of such an effect is therefore important, since it can supply information about the internal structure of nuclei. In this complement, we shall give a simplified treatment of the volume effect of hydrogen-like atoms. To have an idea of the order of magnitude of the energy shifts it causes, we shall confine ourselves to a model in which the nucleus is represented by a sphere of radius 0 , in which the charge is uniformly distributed. In this model, the potential created by the nucleus is (cf. Complement AV , § 4-b): 2

( )= 2

3 0

for >

0

for 6

0

2

2 0

(1)

(we have set 2 = 2 4 0 ). The shape of the variation of ( ) with respect to is shown in Figure 1. The exact solution of the Schrödinger equation for an electron subject to such a potential poses a complicated problem. Therefore, we shall content ourselves with an approximate solution, based on perturbation theory. In a first approximation, we shall consider the potential to be a Coulomb potential [which amounts to setting 0 = 0 in (1)]. The energy levels of the hydrogen atom are then the ones found in § C of Chapter VII. 1162



THE VOLUME EFFECT: THE INFLUENCE OF THE SPATIAL EXTENSION OF THE NUCLEUS ON THE ATOMIC LEVELS

V(r)

ρ0

r

W(r)

0

Figure 1: Variation with respect to of the electrostatic potential ( ) created by the charge distribution of the nucleus, assumed to be uniformly distributed inside a sphere of radius 0 . For 6 0 , the potential is parabolic. For > 0 , it is a Coulomb potential [the extension of this Coulomb potential into the 6 0 zone is represented by the dashed line; ( ) is the difference between ( ) and the Coulomb potential].

We shall treat the difference ( ) between the potential ( ) written in (1) and the Coulomb potential as a perturbation. This difference is zero when is greater than the radius 0 of the nucleus. It is therefore reasonable that it should cause a small shift in the atomic levels (the corresponding wave functions extend over dimensions of the order of 0 0 ), which justifies a treatment by first-order perturbation theory. 1163

COMPLEMENT DXI

1.



First-order energy correction

1-a.

Calculation of the correction

By definition,

( ) is equal to: 2

2

2

( )=

+ 0

2

0

3

if 0 6

0

0

if

>

6

0

(2)

0

Let be the stationary states of the hydrogen-like atom in the absence of the perturbation To evaluate the effect of to first order, we must calculate the matrix elements: =

dΩ

(Ω) 2

(Ω)

d

( )

( )

( )

(3)

0

In this expression, the angular integral simply gives integral, we shall make an approximation and assume1 that: 0

. To simplify the radial (4)

0

that is, that the ( ) is not zero, is much smaller than the 0 region, in which spatial extent of the functions ( ). When 0 , we then have: ( )

(0)

(5)

The radial integral can therefore be written: 2

=

2

(0) 0

2

0

2

2

+

2

0

3

(6)

0

0

which gives: 2

=

10

2 0

(0)

2

(7)

and: 2

=

10

2 0

(0)

2

(8)

We see that the matrix representing in the subspace corresponding to the th level of the unperturbed Hamiltonian is diagonal. Therefore, the first-order energy correction associated with each state can be written simply: 2



=

10

2 0

(0)

2

(9)

1 This is certainly the case for the hydrogen atom. In § 2, we shall examine condition (4) in greater detail.

1164



THE VOLUME EFFECT: THE INFLUENCE OF THE SPATIAL EXTENSION OF THE NUCLEUS ON THE ATOMIC LEVELS

This correction does not depend2 on . Furthermore, since (0) is zero unless = 0 (cf. Chap. VII, § C-4-c), only the states ( = 0 states) are shifted, by a quantity which is equal to: 2



0

2 0

=

10 2 = 5

0

(0)

2

2 2 0

00

(0)

(we have used the fact that

0 0

1-b.

2

=1

(10) 4 ).

Discussion

∆ ∆

0

0

can be written:

3 10

=

(11)

where: 2

=

(12) 0

is the absolute value of the potential energy of the electron at a distance center of the nucleus, and: =

4 3

3 0

0 0 (0)

2

0

from the

(13)

is the probability of finding the electron inside the nucleus. and enter into (11) because the effect of the perturbation ( ) is felt only inside the nucleus. For the method which led us to (10) and (11) to be consistent, the correction ∆ 0 must be much smaller than the energy differences between unperturbed levels. Since is very large (an electron and a proton attract each other very strongly when they are very close), must therefore be extremely small. Before taking up the more precise calculation in § 2, we shall evaluate the order of magnitude of these quantities. Let: 0(

~2

)=

2

(14)

be the Bohr radius when the total charge of the nucleus is . If is not too high, the wave functions 0 0 (r) are practically localized inside a region of space whose volume 3 is approximately [[ 0 ( )] . As for the nucleus, its volume is of the order of 30 , so: 3 0 0(

)

2 This result could have been expected, since the perturbation is a scalar (cf. Complement BVI , § 5-b).

(15)

which is invariant under rotation,

1165



COMPLEMENT DXI

Relation (11) then yields: 3

2



0

0

0(

0

2

2 0(

)

0

)

0(

(16)

)

2 Now, ( ) of the unper0 ( ) is of the order of magnitude of the binding energy turbed atom. The relative value of the correction is therefore equal to: 2



0

0

( )

0(

(17)

)

If condition (4) is met, this correction will indeed be very small. We shall now calculate it more precisely in some special cases. 2.

Application to some hydrogen-like systems

2-a.

The hydrogen atom and hydrogen-like ions

For the ground state of the hydrogen atom, we have [cf. Chap. VII, relation (C39a)]: 1 0(

[where ∆

) = 2( 0 ) 0

10

3 2

e

(18)

0

is obtained by setting

=

2 5

2

2

0

0

= 1 in (14)]. Formula (10) then yields: 2

4 5

=

0

0

(19)

0

Now, we know that, for hydrogen: 0 53 ˚ A = 5 3 10

11

Furthermore, the radius

0

0

0

(proton)

1 F = 10

m

(20)

of the proton is of the order of:

15

m

(21)

If we substitute these numerical values into (19), we obtain: ∆

4 5 10

10

10

6 10

9

eV

(22)

The result is therefore very small. For a hydrogen-like ion, the nucleus has a charge of which amounts to replacing 2 in (19) by e2 , and 0 by ∆

1 0(

where trons), 1166

0

)=

2 5

2 2 0

0(

)

0

. We can then apply (10), ( ) = 0 . We obtain:

2

(23)

0

( ) is the radius of the nucleus, composed of nucleons (protons or neuof which are protons. In practice, the number of nucleons of a nucleus is not very



THE VOLUME EFFECT: THE INFLUENCE OF THE SPATIAL EXTENSION OF THE NUCLEUS ON THE ATOMIC LEVELS

different from 2 ; in addition, the “nuclear density saturation” property is expressed by the approximate relation: 0(

A1

)

3

Z1

3

(24)

The variation of the energy correction with respect to

is then given by:

14 3

∆E1 0 (Z)

Z

(25)

or: ∆E1 0 (Z) E (Z)

Z8

3

(26)

∆E1 0 (Z) therefore varies very rapidly with , under the effect of several concordant factors: when increases, 0 decreases and 0 increases. The volume effect is therefore significantly larger for heavy hydrogen-like ions than for hydrogen. Comment:

The volume effect also exists for all the other atoms. It is responsible for an isotopic shift of the lines of the emission spectrum. For two distinct isotopes of the same chemical element, the number of protons of the nucleus is the same, but the number of neutrons is different; the spatial distributions of the nuclear charges are therefore not identical for the two nuclei. Actually, for light atoms, the isotopic shift is caused principally by the nuclear finite mass effect (cf. Complement AVII , § 1-a- ). On the other hand, for heavy atoms (for which the reduced mass varies very little from one isotope to another), the finite mass effect is small; however, the volume effect increases with and becomes preponderant. 2-b.

Muonic atoms

We have already discussed some simple properties of muonic atoms (cf. Complement AV , § 4 and AVII , § 2-a). In particular, we have pointed out that the Bohr radius associated with them is distinctly smaller than for ordinary atoms (this is caused by the fact that the mass of the muon is approximately equal to 207 times that of the electron). From the qualitative discussion of § 1-b, we may therefore expect an important volume effect for muonic atoms. We shall evaluate it by choosing two limiting cases: a light muonic atom (hydrogen) and a heavy one (lead). .

The muonic hydrogen atom The Bohr radius is then: 0(

+

0

(27) 207 that is, of the order of 250 fermi. It therefore remains, in this case, distinctly greater than 0 . If we replace 0 by 0 207 in (19), we find: ∆E1 0 (

)

+

)

19

10

5

E (

p+ )

5

10

2

eV

(28)

Although the volume effect is much larger than for the ordinary hydrogen atom, it still yields only a small correction to the energy levels. 1167

COMPLEMENT DXI

.



The muonic lead atom The Bohr radius of the muonic lead atom is [cf. Complement AV , relation (25)]: 0(

Pb)

3

10

15

m

(29)

The muon is now very close to the lead nucleus; it is therefore practically unaffected by the repulsion of the atomic electrons which are located at distinctly greater distances. This could lead us to believe that (10), which was proven for hydrogen-like atoms and ions, is directly applicable to this case. Actually, this is not true, since the radius of the lead nucleus is equal to: 0 (Pb)

8 5F = 8 5

10

15

m

(30)

which is not small compared to 0 ( Pb). Equation (10) would therefore lead to large corrections (several MeV), of the same order of magnitude as the energy ( Pb). This means that, in this case, the volume effect can no longer be treated as a perturbation (see discussion of § 4 of Complement AV ). To calculate the energy levels, it is necessary to know the potential ( ) exactly and to solve the corresponding Schrödinger equation. The muon is therefore more inside the nucleus than outside, that is, according to (1), in a region in which the potential is parabolic. In a first approximation, we could consider the potential to be parabolic everywhere (as is done in Complement AV ) and then treat as a perturbation the difference which exists for > 0 between the real potential and the parabolic potential. However, the extension of the wave function corresponding to such a potential is not sufficiently smaller than 0 for such an approximation to lead to precise results, and the only valid method consists of solving the Schrödinger equation corresponding to the real potential. References and suggestions for further reading:

The isotopic volume effect: Kuhn (11.1), Chap. VI, § C-3; Sobel’man (11.12), Chap. 6, § 24. Muonic atoms (sometimes called mesic atoms): Cagnac and Pebay-Peyroula (11.2), Chap. XIX, § 7-C; De Benedetti (11.21); Wiegand (11.22); Weissenberg (16.19), § 4-2.

1168



THE VARIATIONAL METHOD

Complement EXI The variational method

1

2

3

Principle of the method . . . . . . . . . . . . . . . . . . 1-a A property of the ground state of a system . . . . . . . 1-b Generalization: the Ritz theorem . . . . . . . . . . . . . 1-c A special case where the trial functions form a subspace Application to a simple example . . . . . . . . . . . . . 2-a Exponential trial functions . . . . . . . . . . . . . . . . 2-b Rational wave functions . . . . . . . . . . . . . . . . . . Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

1169 1169 1170 1171 1172 1172 1174 1174

The perturbation theory studied in Chapter XI is not the only general approximation method applicable to conservative systems. We shall give a concise description here of another of these methods, which also has numerous applications, especially in atomic and molecular physics, nuclear physics, and solid state physics. First of all, we shall indicate, in § 1, the principle of the variational method. Then we shall use the simple example of the one-dimensional harmonic oscillator to bring out its principal features (§ 2), which we shall briefly discuss in § 3. Complements FXI and GXI apply the variational method to simple models which enable us to understand the behavior of electrons in a solid and the origin of the chemical bond. 1.

Principle of the method

Consider an arbitrary physical system whose Hamiltonian is time-independent. To simplify the notation, we shall assume that the entire spectrum of is discrete and non-degenerate: H

=

;

=0 1 2

(1)

Although the Hamiltonian is known, this is not necessarily the case for its eigenvalues and the corresponding eigenstates . The variational method is, of course, most useful in the cases in which we do not know how to diagonalize exactly. 1-a.

A property of the ground state of a system

Choose an arbitrary ket the Hamiltonian in the state H =

H

of the state space of the system. The average value of is such that:

> E0

(where 0 is the smallest eigenvalue of eigenvector of with the eigenvalue 0 .

(2) ), equality occuring if and only if

is an

1169



COMPLEMENT EXI

To prove inequality (2), we expand the ket =

on the basis of eigenstates of

: (3)

We then have: 2

=

>

2

(4)

0

with, of course: 2

=

(5)

which proves (2). For inequality (4) to become an equality, it is necessary and sufficient that all the coefficients be zero, with the exception of 0 ; is then an eigenvector of with the eigenvalue 0 . This property is the basis for a method of approximate determination of 0 . We choose (in theory, arbitrarily, but in fact, by using physical criteria) a family of kets ( ) which depend on a certain number of parameters which we symbolize by . We calculate the average value ( ) of the Hamiltonian in these states, and we minimize ( ) with respect to the parameters . The minimal value so obtained constitutes an approximation of the ground state 0 of the system. The kets ( ) are called trial kets, and the method itself, the variational method.

Comment:

The preceding proof can easily be generalized to cases in which the spectrum of is degenerate or includes a continuous part. 1-b.

Generalization: the Ritz theorem

We shall show that, more generally, the average value of the Hamiltonian H is stationary in the neighborhood of its discrete eigenvalues. Consider the average value of in the state : =

(6)

as a functional of the state vector , and calculate its increment when becomes + , where is assumed to be infinitely small. To do so, it is useful to write (6) in the form: =

(7)

and to differentiate both sides of this relation: + = 1170

[

+ +

]

(8)

• that is, since

=

THE VARIATIONAL METHOD

is a number:

[

]

The average value

+

[

]

(9)

will be stationary if:

=0

(10)

which, according to (9), means that: [

]

+

[

]

=0

(11)

We set: =[

]

(12)

Relation (11) can then be written simply: +

=0

This last relation must be satisfied for any infinitesimal ket choose: =

(13) . In particular, if we (14)

(where

is an infinitely small real number), (13) becomes:

2

=0

(15)

The norm of the ket is therefore zero, and must consequently be zero. With definition (12) taken into account, this means that: =

(16)

Consequently, the average value is stationary if and only if the state vector to which it corresponds is an eigenvector of and the stationary values of are the eigenvalues of the Hamiltonian. The variational method can therefore be generalized and applied to the approximate determination of the eigenvalues of the Hamiltonian If the function ( ) obtained from the trial kets ( ) has several extrema, they give the approximate values of some of its energies (cf. exercise 10 of Complement HXI ). 1-c.

A special case where the trial functions form a subspace

Assume that we choose for the trial kets the set of kets belonging to a vector subspace of . In this case, the variational method reduces to the resolution of the eigenvalue equation of the Hamiltonian H inside , and no longer in all of . To see this, we simply apply the argument of § 1-b, limiting it to the kets of the subspace . The maxima and minima of , characterized by = 0, are obtained when is an eigenvector of in . The corresponding eigenvalues constitute the 1171



COMPLEMENT EXI

variational method approximation for the true eigenvalues of in . They also provide upper bounds of these eigenvalues: we have seen that the lowest energy that is obtained is larger than the true energy of the ground state, but it is also possible to show (cf. MacDonald’s article in the references of this complement) that the next lowest energy is greater than the energy of the true first excited state, etc. When the dimension of is increased by one unit, one obtains a new series of energies with a new energy above all the others, which themselves decrease (or possibly remain at the same value). We stress the fact that the restriction of the eigenvalue equation of to a subspace of the state space can considerably simplify its solution. However, if is badly chosen, it can also yield results which are rather far from the true eigenvalues and eigenvectors of in (cf. § 3). The subspace must therefore be chosen so as to simplify the problem enough to make it soluble, without too greatly altering the physical reality. In certain cases, it is possible to reduce the study of a complex system to that of a two-level system (cf. Chap. IV), or at least, to that of a system of a limited number of levels. Another important example of this procedure is the method of the linear combination of atomic orbitals, widely used in molecular physics. This method (cf. Complement GXI ) essentially determines the wave functions of electrons in a molecule in the form of linear combinations of eigenfunctions associated with the various atoms which constitute the molecule, treated as if they were isolated. It therefore limits the search for the molecular states to a subspace chosen using physical criteria. Similarly, in Complement FXI , we shall choose as a trial wave function for an electron in a solid a linear combination of atomic orbitals relative to the various ions which constitute this solid.

Comment:

Note that first-order perturbation theory fits into this special case of the variational method: is then an eigensubspace of the unperturbed Hamiltonian 0 . 2.

Application to a simple example

To illustrate the discussion of § 1 and to give an idea of the validity of the approximations obtained with the help of the variational method, we shall apply this method to the one-dimensional harmonic oscillator, whose eigenvalues and eigenstates we know (cf. Chap. V). We shall consider the Hamiltonian: =

~2 2

2 2

+

1 2

2 2

(17)

and we shall solve its eigenvalue equation approximately by variational calculations. 2-a.

Exponential trial functions

Since the Hamiltonian (17) is even, it can easily be shown that its ground state is necessarily represented by an even wave function. To determine the characteristics of this ground state, we shall therefore choose even trial functions. We take, for example, the one-parameter family: ( )=e 1172

2

;

0

(18)

• The square of the norm of the ket +

=

d e

is equal to:

2

2

THE VARIATIONAL METHOD

(19)

and we find: +

= =

~2 2

~2 2

+

+

~2 d2 1 + 2 d 2 2

2

d e 1 8

2

2 2

+

1

d e

2

2

e

2

(20)

so that: ( )=

1 8

2

1

(21)

The derivative of the function =

0

( ) goes to zero for:

1 2 ~

=

(22)

and we then have: (

0)

=

1 ~ 2

(23)

The minimum value of ( ) is therefore exactly equal to the energy of the ground state of the harmonic oscillator. This result is due to the simplicity of the problem that we are studying: the wave function of the ground state happens to be precisely one of the functions of the trial family (18), the one which corresponds to value (22) of the parameter . The variational method, in this case, gives the exact solution of the problem (this illustrates the theorem proven in § 1-a). If we want to calculate (approximately, in theory) the first excited state 1 of the Hamiltonian (17), we should choose trial functions which are orthogonal to the wave function of the ground state. This follows from the discussion of § 1-a, which shows that the lower bound of is 1 , and no longer 0 if the coefficient 0 is zero. We therefore choose the trial family of odd functions: ( )=

2

e

(24)

In this case: +

=

d

2

e

2

2

(25)

and: =

~2 2

3~2 2

+

3 +

1 2

2

3 4

+

d

2

e

2

2

(26)

which yields: ( )=

3 8

2

1

(27) 1173



COMPLEMENT EXI

This function, for the same value to: (

0)

=

0

as above [formula (22)], presents a minimum equal

3 ~ 2

(28)

Here again, we find exactly the energy 1 and the associated eigenstate because the trial family includes the correct wave function. 2-b.

Rational wave functions

The calculations of § 2-a enabled us to familiarize ourselves with the variational method, but they do not really allow us to judge its effectiveness as a method of approximation, since the families chosen always included the exact wave function. Therefore, we shall now choose trial functions of a totally different type, for example1 : ( )=

2

1 +

;

a

0

(29)

A simple calculation then yields: +

=

2

(x2 + )

=

2

(30)

and, finally: ( )=

~2 1 1 + 4 2

2

(31)

The minimum value of this function is obtained for: =

0

=

1 ~ 2

(32)

and is equal to: ( 0) =

1 ~ 2

(33)

This minimum value is therefore equal to 2 times the exact ground state energy ~ 2. To measure the error committed, we can calculate the ratio of ( 0 ) ~ 2 to the energy quantum ~ : ( 0) ~ 3.

1 2~

=

2 1 2

20%

(34)

Discussion

The example of § 2-b shows that it is easy to obtain the ground state energy of a system, without significant error, starting with arbitrarily chosen trial kets. This is one of the 1 Our choice here is dictated by the fact that we want the necessary integrals to be analytically calculable. Of course, in most real cases, one resorts to numerical integration.

1174



THE VARIATIONAL METHOD

principal advantages of the variational method. Since the exact eigenvalue is a minimum of the average value , it is not surprising that does not vary very much near this minimum. On the other hand, as the same reasoning shows, the “approximate” state can be rather different from the true eigenstate. Thus, in the example of § 2-b, the wave function 1 ( 2 + 0 ) [where 0 is given by formula (32)] decreases too rapidly for small values of and much too slowly when becomes large. Table I gives quantitative support for this qualitative assertion. It gives, for various values of 2 , the values of the exact normalized eigenfunction: 0(

) = (2

[where 2

0

)1 4 e

0

2

0

was defined in (22)] and of the approximate normalized eigenfunction: 3 4

3 4

( 0)

0

2 ( 0) 2+

( )=

0

2

=

1 4

2 2

0

0

2 1 4

e

0

2

2

1 1+2 2

(2

0

2

(35)

1 4

2)

1+2 2

0

0.893

1.034

1/2

0.696

0.605

1

0.329

0.270

3/2

0.094

0.140

2

0.016

0.083

5/2

0.002

0.055

3

0.000 1

0.039

0

2

Table I It is therefore necessary to be very careful when physical properties other than the energy of the system are calculated using the approximate state obtained from the variational method. The validity of the result obtained varies enormously depending on the physical quantity under consideration. In the particular problem which we are studying here, we find, for example, that the approximate average value2 of the operator 2 is not very different from the exact value: 2 0

0 0

0

=

1 ~ 2

(36)

which is to be compared with ~ 2 . On the other hand, the average value of 4 is infinite for the wave function (35), while it is, of course, finite for the real wave function. 2 The

average value of

is automatically zero, as is correct since we have chosen even trial functions.

1175

COMPLEMENT EXI



More generally, Table I shows that the approximation will be very poor for all properties that depend strongly on the behavior of the wave function for & 2 0. The drawback we have just mentioned is all the more serious as it is very difficult, if not impossible, to evaluate the error in a variational calculation if we do not know the exact solution of the problem (and, of course, if we use the variational method, it is because we do not know this exact solution). The variational method is therefore a very flexible approximation method, which can be adapted to very diverse situations and which gives great scope to physical intuition in the choice of trial kets. It gives good values for the energy rather easily, but the approximate state vectors may present certain completely unpredictable erroneous features, and we cannot check these errors. This method is particularly valuable when physical arguments give us an idea of the qualitative or semi-quantitative form of the solutions. References and suggestions for further reading:

The Hartree-Fock method, often used in physics, is an application of the variational method. See references of Chapter XI and Complement EXV of Volume III. The variational method is of fundamental importance in molecular physics. See references of Complement GXI . For a simple presentation of the use of variational principles in physics, see Feynman II (7.2), Chap. 19. J.K.L. MacDonald, Physical Review vol. 143, pages 830 à 833 (1933).

1176



ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

Complement FXI Energy bands of electrons in solids: a simple model

1 2

A first approach to the problem: qualitative discussion A more precise study using a simple model . . . . . . . . 2-a Calculation of the energies and stationary states . . . . . 2-b Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . .

. . . .

1178 1181 1181 1187

A crystal is composed of atoms evenly distributed in space so as to form a threedimensional periodic lattice. The theoretical study of the properties of a crystal, which brings into play an extremely large number of particles (nuclei and electrons), poses a problem which is so complicated that it is out of the question to treat it rigorously. We must therefore resort to approximations. The first of these is of the same type as the Born-Oppenheimer approximation (which we encountered in § 1 of Complement AV ). It consists of considering, first of all, the positions of the nuclei as fixed, which enables us to study the stationary states of the electrons subjected to the potential created by the nuclei. The motion of the nuclei is not treated until later, using the knowledge of the electronic energies1 . In this complement, we shall concern ourselves only with the first step of this calculation, and we shall assume the nuclei to be motionless at the nodes of the crystalline lattice. This problem still remains extremely complicated. It is necessary to calculate the energies of a system of electrons subjected to a periodic potential and interacting with each other. We then make a second approximation: we assume that each electron, at a position r , is subjected to the influence of a potential (r ) which takes into account the attraction exerted by the nuclei and the average effect of the repulsion of all the other electrons2 . The problem is thus reduced to one involving independent particles, moving in a potential that has the periodicity of the crystalline lattice. The physical characteristics of a crystal therefore depend, in a first approximation, on the behavior of independent electrons subjected to a periodic potential. We could be led to think that each electron remains bound to a given nucleus, as happens in isolated atoms. We shall see that, in reality, the situation is completely different. Even if an electron is initially in the neighborhood of a particular nucleus, it can move into the zone of attraction of an adjacent nucleus by the tunnel effect, then into another, and so on. Actually, the stationary states of the electrons are not localized in the neighborhood of any nucleus, but are completely delocalized: the probability density associated with them is uniformly distributed over all the nuclei3 . Thus, the properties of an electron 1 Recall that the study of the motion of the nuclei leads to the introduction of the normal vibrational modes of the crystal: the phonons (cf. Complement JV ). 2 This approximation is of the same type as the “central field” approximation for isolated atoms (cf. Complement AXIV , § 1). 3 This phenomenon is analogous to the one we encountered in the study of the ammonia molecule (cf. Complement GIV ). There, since the nitrogen atom can move from one side of the plane of the hydrogen atoms to the other, by the tunnel effect, the stationary states give an equal probability of finding it in each of the two corresponding positions.

1177

COMPLEMENT FXI



placed in a periodic potential resemble those of an electron free to move throughout the crystal more than they do those of an electron bound to a particular atom. Such a phenomenon could not exist in classical mechanics: the direction of a particle traveling through a crystal would change constantly under the influence of the potential variations (for example, upon skirting an ion). In quantum mechanics, the interference of the waves scattered by the different nuclei permit the propagation of an electron inside the crystal. In § 1, we shall study very qualitatively how the energy levels of isolated atoms are modified when they are brought gradually closer together to form a linear chain. Then, in § 2, still confining ourselves, for simplicity, to the case of a linear chain, we shall calculate the energies and wave functions of stationary states a little more precisely. We shall perform the calculation in the “tight bonding approximation”: when the electron is in one site, it can move to one of two neighboring sites via the tunnel effect. The tight bonding approximation is equivalent to assuming that the probability of its tunneling is small. We shall, in this way, establish a certain number of results (the delocalization of stationary states, the appearance of allowed and forbidden energy bands, the form of Bloch functions) which remain valid in more realistic models (three-dimensional crystals, bonds of arbitrary strength). The “perturbation” approach that we shall adopt here constructs the stationary states of the electrons from atomic wave functions localized about the various ions. It has the advantage of showing how atomic levels change gradually to energy bands in a solid. Note, however, that the existence of energy bands can be directly established from the periodic nature of the structure in which the electron is placed (see, for example, Complement OIII , in which we study quantization of the energy levels in a one-dimensional periodic potential). Finally, we stress the fact that we are concerned here only with the properties of the individual stationary states of the electrons. To construct the stationary state of a system of electrons from these individual states, it is necessary to apply the symmetrization postulate (cf. Chap. XIV), since we are dealing with a system of identical particles. We shall treat this problem again in Complement CXIV , when we shall describe the spectacular consequences of Pauli’s exclusion principle on the physical behavior of the electrons in a solid. Many other examples of the effects of the symmetrization will be discussed in Chapters XV to XVII. 1.

A first approach to the problem: qualitative discussion

Let us go back to the example of the ionized H+ 2 molecule, studied in §§ C-2-c and C-3-d of Chapter IV. Consider, therefore, two protons 1 and 2 whose positions are fixed, and an electron which is subject to their electrostatic attraction. This electron sees a potential (r), which has the form indicated in Figure 1. In terms of the distance between 1 and 2 (considered as a parameter) what are the possible energies and the corresponding stationary states? We shall begin by considering the limiting case in which 0 (where 0 is the Bohr radius of the hydrogen atom). The ground state is then two-fold degenerate: the electron can form a hydrogen atom either with 1 or with 2 ; it is practically unaffected by the attraction of the other proton, which is very far away. In other words, the coupling between the states 1 and 2 considered in Chapter IV (localized states in the neighborhood of 1 or 2 ; cf. Fig. 13 of Chapter IV) is then negligible, so that 1 1178



ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

V(x)



R 2

0

+

R 2

x

Figure 1: The potential seen by the electron as it moves along the axis defined by the two protons in the ionized 2+ molecule. We obtain two wells separated by a barrier. If, at any instant, the electron is localized in one of the two wells, it can move into the other well via the tunnel effect.

and

2 are practically stationary states. If we now choose a value of comparable to 0 it is no longer possible to neglect the attraction of one or the other of the protons. If, at = 0, the electron is localized in the neighborhood of one of them, and even if its energy is lower than the height of the potential barrier situated between 1 and 2 (cf. Fig. 1), it can move to the other proton by the tunnel effect. In Chapter IV we studied the effect of coupling of the states 1 and 2 , and we showed that it produces an oscillation of the system between these two states (the dynamical aspect). We have also seen (the static aspect) that this coupling removes the degeneracy of the ground state and that the corresponding stationary states are “delocalized” (for these states, the probability of finding the electron in the neighborhood of 1 or 2 is the same). Figure 2 shows the form of the variation with respect to of the possible energies of the system4 . Two effects appear when we decrease the distance between 1 and 2 . On the one hand, an = energy value gives rise to two distinct energies when decreases (when the distance is fixed at a given value 0 , the stronger the coupling between the states 1 and 2 , the greater the difference between these two energies). On the other hand, the stationary states are delocalized. It is easy to imagine what will happen if the electron is subject to the influence, not of two, but of three identical attractive particles (protons or positive ions), arranged, for example, in a straight line at intervals of When is very large, the energy levels are triply degenerate, and the stationary states of the electron can be chosen to be localized in the neighborhood of any one of the fixed particles. If is decreased, each energy gives rise to three generally distinct energies and, in a 4A

detailed study of the

+ 2

ion is presented in Complement GXI .

1179

COMPLEMENT FXI

• E

0

R0

R

EI Δ

Figure 2: Variation of the energy of stationary states of the electron in terms of the distance between the two protons of the H+ is large, there are two prac2 ion. When tically degenerate states, of energy . When decreases, this degeneracy is removed. The smaller , the greater the splitting.

stationary state, the probabilities of finding the electron in the three wells are comparable. Moreover, if, at the initial instant, the electron is localized in the right-hand well, for example, it moves into the other wells during its subsequent evolution5 . The same ideas remain valid for a chain composed of an arbitrary number of ions which attract an electron. The potential seen by the electron is then composed of regularly spaced identical wells (in the limit in which , it is a periodic potential). When the distance between the ions is large, the energy levels are -fold degenerate. This degeneracy disappears if the ions are moved closer together: each level gives rise to distinct levels, which are distributed, as shown in Figure 3, in an energy interval of width ∆. What now happens if the value of is very large? In each of the intervals ∆, the possible energies are so close that they practically form a continuum: “allowed energy bands” are thus obtained, separated by “forbidden bands”. Each allowed band contains levels (actually 2 if the electron spin is taken into account). The stronger the coupling causing the electron to pass from one potential well to the next one, the greater the band width. (Consequently, we expect the lowest energy bands to be the narrowest since the tunnel effect which is responsible for this passage is less probable when the energy is smaller). The stationary states of the electron are all delocalized. The analogue here of Figure 3 of Complement MIII is Figure 4, which represents the energy levels and gives an idea of the spatial extension of the associated wave functions. Finally, note that if, at = 0, the electron is localized at one end of the chain, it propagates along the chain during its subsequent evolution. 5 See

1180

exercise 8 of Complement JIV .



ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

E

0

R0

R

E′

E

Δ′

Δ

Figure 3: Energy levels of an electron subject to the action of regularly spaced identical ions. When is very large, the wave functions are localized about the various ions, and the energy levels are the atomic levels, -fold degenerate (the electron can form an atom with any one of the ions). In the figure, two of these levels are shown, of energies and . When decreases, the electron can pass from one ion to another by the tunnel effect, and the degeneracy of the levels is removed. The smaller , the greater the splitting. For the value 0 of found in a crystal, each of the two original atomic levels is therefore broken down into very close levels. If is very large, these levels are so close that they yield energy bands, of widths ∆ and ∆ , separated by a forbidden band.

2. 2-a.

A more precise study using a simple model Calculation of the energies and stationary states

To complete the qualitative considerations of the preceding section, we shall discuss the problem more precisely, using a simple model. We shall perform calculations analogous to those of § C of Chapter IV, but adapted to the case in which the system under consideration contains an infinite number of ions (instead of two), regularly spaced 1181

COMPLEMENT FXI



V(x)

x



Δ

Figure 4: Energy levels for a potential composed of several regularly spaced wells. Two bands are shown in this figure, one of width ∆ and the other of width ∆ . The deeper the band, the more narrow it is, since crossing the barrier by the tunnel effect is then more difficult.

in a linear chain. .

Description of the model; simplifying hypotheses

Consider, therefore, an infinite linear chain of regularly spaced positive ions. As in Chapter IV, we shall assume that the electron, when it is bound to a given ion, has only one possible state: we shall denote by the state of the electron when it forms an atom with the th ion of the chain. For the sake of simplicity, we shall neglect the mutual overlap of the wave functions ( ) associated with neighboring atoms, and we shall assume the basis to be orthonormal: =

(1)

Moreover, we shall confine ourselves to the subspace of the state space spanned by the kets . It is obvious that by restricting the state space accessible to the electron in this way, we are making an approximation. This can be justified by using the variational method (cf. Complement EXI ): by diagonalizing the Hamiltonian , not in the total space, but in the one spanned by the , it can be shown that we obtain a good approximation for the true energies of the electron. We shall now write the matrix representing the Hamiltonian in the basis. Since the ions all play equivalent roles, the matrix elements are necessarily all equal to the same energy 0 . In addition to these diagonal elements, also has nondiagonal elements (coupling between the various states , which expresses the possibility for an electron to move from one ion to another). This coupling is obviously very weak for distant ions; this is why we shall take into account only the matrix elements . Under these conditions, 1 , which we shall choose equal to a real constant 1182



ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

the (infinite) matrix that represents ..

can be written:

. 0

0

0 0

0

( )=

0 0

(2)

0

0

0

..

.

To find the possible energies and the corresponding stationary states, we must diagonalize this matrix. .

Possible energies; the concept of an energy band Let

be an eigenvector of

; we shall write it in the form:

+

=

(3) =

Using (2), the eigenvalue equation: =

(4)

projected onto 0

, yields:

+1

1

=

(5)

When takes on all positive or negative integral values, we obtain an infinite system of coupled linear equations which, in certain ways, recall the coupled equations (5) of Complement JV . As in that complement, we shall look for simple solutions of the form: =e

(6)

where is the distance between two adjacent ions, and is a constant whose dimensions are those of an inverse length. We require to belong to the “first Brillouin zone”, that is, to satisfy: 6

+

(7)

This is always possible, because two values of differing by 2 the same value. Substituting (6) into (5), we obtain: 0

e

( +1)

e

that is, dividing by e =

( )=

0

+e

(

1)

=

e

give all the coefficients (8)

:

2 cos

(9)

If this condition is satisfied, the ket given by (3) and (6) is an eigenket of ; its energy depends on the parameter , as is indicated by (9). Figure 5 represents the variation of with respect to . It shows that the possible energies are situated in the interval [ 0 2 0 + 2 ]. We therefore obtain an allowed energy band, whose width 4 is proportional to the strength of the coupling. 1183

COMPLEMENT FXI

• E(k)

E0 + 2A

Figure 5: Possible energies of the electron in terms of the parameter ( varies within the first Brillouin zone). An energy band therefore appears, with a width 4 which is proportional to the coupling between neighboring atoms.

E0 – 2A

k – π/l

. state

0

+ π/l

Stationary states; Bloch functions Let us calculate the wave function ( )= associated with the stationary of energy ( ). Relations (3) and (6) lead to: +

=

e

(10a)

=

that is: +

( )=

e

( )

(10b)

=

where: ( )=

(11)

is the wave function associated with the state . Since the state from the state 0 by a translation of we have: ( )=

0

(

)

can be obtained

(12)

so that (10b) can be written: +

( )=

e =

1184

0(

)

(13)

• We now calculate

ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

( + ):

+

( + )=

e

0

[

(

1) ]

= +

=e

(

e

1)

0

[

(

1) ]

=

=e

( )

(14)

To express this remarkable property simply, we set: ( )=e

( )

The function

(15)

( ) so defined then satisfies:

( + )=

( )

(16)

Therefore, the wave function ( ) is the product of e and a periodic function which has the period of the lattice. A function of type (15) is called a Bloch function. Note that, if is any integer: ( +

)

2

=

( )

2

(17)

a result which demonstrates the delocalization of the electron: the probability density of finding the electron at any point on the -axis is a periodic function of Comment: Expressions (15) and (16) have been proven here for a simple model. Actually, this result is more general and can be proven directly from the symmetries of the Hamiltonian (Bloch’s theorem). To show this, let us call ( ) the unitary operator associated with a translation along (cf. Complement EII , § 3). Since the system is invariant under any translation that leaves the ion chain unchanged, we must have: [

( )] = 0

(18a)

We can therefore construct a basis of eigenvectors common to the operator ( ) and . Now, equation (14) is simply the one that defines the eigenfunctions of ( ) [since this operator is unitary, its eigenvalues can always be written in the form e , where satisfies condition (7); cf. Complement CII , § 1-c]. It is then simple to get, as before, (15) and (16) from (14). Note that, for any , we have, in general: [

( )] = 0

(18b)

unlike the situation of a free particle (or one subject to the influence of a constant potential). For a free particle, since commutes with all operators ( ) (that is, with the momentum ; cf. Complement EII , § 3), the stationary wave functions are of the form: ( )

e

(19)

This means that the function ( ) appearing in (15) is necessarily a constant, which is a more restrictive condition. In the problem studied in this complement, the commutator of relation (18b) vanishes only for certain values of , which implies less restrictive conditions for the wave function.

1185

COMPLEMENT FXI

.



Periodic boundary conditions

To each value of in the interval [ + ] corresponds an eigenstate of , with the coefficients appearing in expansion (3) of given by equation (6). We thus obtain an infinite continuum of stationary states. This is due to the fact that we have considered a linear chain containing an infinite number of ions. What happens when we consider a finite linear chain, of length , composed of a large number of ions? The qualitative considerations of § 1 show that there must then be levels in the band (2 if spin is taken into account). The exact determination of the corresponding stationary states is a difficult problem, since it is necessary to take into account the boundary conditions at the ends of the chain. It is clear, however, that the behavior of electrons sufficiently far from the ends are little affected by the “edge effects”6 . This is why one generally prefers, in solid state physics, to substitute new boundary conditions for the real boundary conditions; despite their artificial character, these new conditions lead to much simpler calculations, while conserving the most important properties necessary for the comprehension of the physical effects (other than the edge effects). These new boundary conditions, called periodic boundary conditions, or “BornVon Karman conditions” (B.V.K. conditions), require the wave function to take on the same value at both ends of the chain. We can also imagine that we are placing an infinite number of identical chains, all of length end to end. We then require the wave function of the electron to be periodic, with a period Equations (5) remain valid, as does their solution (6), but the periodicity of the wave function now implies: e

=1

(20)

Consequently, the only possible values of 2 =

are of the form: (21)

where is a positive or negative integer or zero. Let us now verify that the B.V.K. conditions give the correct result for the number of stationary states contained in the band. To do so, we must calculate the number of allowed values included in the first Brillouin zone defined in (7). We obtain this number by dividing the width 2 of this zone by the interval 2 between two adjacent values of which indeed gives us: 2

2

=

=

1

(22)

We should also show that the stationary states obtained with the B.V.K. conditions are distributed in the allowed band with the same density7 ( ) as the true stationary states (associated with the real boundary conditions). As the density of states ( ) plays a very important role in the comprehension of the physical properties of a solid (we shall discuss this point in Complement CXIV ), it is important for the new boundary conditions to leave it unchanged. That the B.V.K. conditions give the correct density of states will be proven in Complement CXIV (§ 1-c) for the simple example of a free electron gas enclosed in a “rigid box”. In this case, the true stationary states can be calculated and compared with those obtained by using the periodic boundary conditions on the walls of the box (see also § 3-a of Complement OIII ). 6 For a three-dimensional crystal, this amounts to establishing a distinction between “bulk effects” and “surface effects”. 7 ( )d is the number of distinct stationary states with energies included between and + d .

1186

• 2-b.

ENERGY BANDS OF ELECTRONS IN SOLIDS: A SIMPLE MODEL

Discussion

Starting with a discrete non-degenerate level for an isolated atom (for example, the ground level) we have obtained a series of possible energies, grouped in an allowed band of width 4 for the chain of ions being considered. If we had started with another level of the atom (for example, the first excited level), we would have obtained another energy band, and so on. Each atomic level yields one energy band, as Figure 6 shows, and there appears a series of allowed bands, separated by forbidden bands. Relation (6) shows that, for a stationary state, the probability amplitude of finding the electron in the state is an oscillating function of , whose modulus does not depend on This recalls the properties of phonons, the normal vibrational modes of an infinite number of coupled oscillators for which all the oscillators participate in the collective vibration with the same amplitude, but with a certain phase shift (cf. Complement JV ). Allowed bands

E

Figure 6: Allowed bands and forbidden bands on the energy axis.

How can we obtain states in which the electron is not completely delocalized? For a free electron, we saw in Chapter I that we must superpose plane waves so as to form a free “wave packet ”: ˆ(

1 2

)=

d ˆ( )e [

( )

~]

(23)

The maximum of this wave packet propagates at the group velocity (cf. Chap. I, § C): ˆ = 1 d ~ d

= =

~

0

(24)

0

[where 0 is the value of for which the function ˆ ( ) presents a peak]. Here, we must superpose wave functions of type (15), and the corresponding ket can be written: () =

1 2

d

( )e

( )

~

(25)

where ( ) is a function of which has the form of a peak about = 0 . We shall calculate the probability amplitude of finding the electron in the state . Using (10a) and (1), we can write: () =

1 2

d

( )e [

( )

~]

(26) 1187

COMPLEMENT FXI

Replacing

by

(

1 2

)=

• in this relation, we obtain a function of : ( )e [

d

( )

~]

(27)

Only the values at the points = 0, , etc... of this function are really significant and yield the desired probability amplitudes. Relation (27) is entirely analogous to (23). By applying (24), it can be shown that ( ) takes on significant values only in a limited domain of the -axis whose center moves at the velocity: =

1 d ( ) ~ d

(28) =

0

It follows that the probability amplitude ( ) is large only for certain values of : therefore, the electron is no longer delocalized, but moves in the crystal at the velocity given by (28). Equation (9) enables us to calculate this velocity explicitly: =

2

sin 0 ~ This function is shown in Figure 7. It is zero when

(29) 0

= 0, that is, when the energy is

VG(k) +

2Al h

k – π/l

+ π/l

0



Figure 7: Group velocity of the electron as a function of the parameter This velocity goes to zero, not only for = 0 (as for the free electron), but also for = (the edges of the first Brillouin zone).

2Al h

minimal; this is also a property of the free electron. However, when 0 takes on non-zero values, important departures from the behavior of a free electron occur. For example, as soon as 0 2 , the group velocity is no longer an increasing function of the energy. It even goes to zero when 0 = (at the borders of the first Brillouin zone). This indicates that an electron cannot move in the crystal if its energy is too close to the maximum value 0 + 2 appearing in Figure 5. The optical analogy of this situation is Bragg reflection. X rays whose wavelength is equal to the unit edge of the crystalline lattice cannot propagate in it: interference of the waves scattered by each of the ions lead to total reflection. References and suggestions for further reading:

Feynman III (1.2), Chap. 13; Mott and Jones (13.7), Chap. II, § 4; references of section 13 of the bibliography. 1188



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

Complement GXI A simple example of the chemical bond: the H+ 2 ion

1

2

3

4

5

1.

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-a General method . . . . . . . . . . . . . . . . . . . . . . . . . 1-b Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-c Principle of the exact calculation . . . . . . . . . . . . . . . . The variational calculation of the energies . . . . . . . . . . 2-a Choice of the trial kets . . . . . . . . . . . . . . . . . . . . . . 2-b The eigenvalue equation of the Hamiltonian in the trial ket vector subspace . . . . . . . . . . . . . . . . . . . . . . . . 2-c Overlap, Coulomb and resonance integrals . . . . . . . . . . . 2-d Bonding and antibonding states . . . . . . . . . . . . . . . . . Critique of the preceding model. Possible improvements . 3-a Results for small . . . . . . . . . . . . . . . . . . . . . . . . 3-b Results for large . . . . . . . . . . . . . . . . . . . . . . . . Other molecular orbitals of the H+ 2 ion . . . . . . . . . . . . 4-a Symmetries and quantum numbers. Spectroscopic notation . 4-b Molecular orbitals constructed from the 2 atomic orbitals . . The origin of the chemical bond; the virial theorem . . . . 5-a Statement of the problem . . . . . . . . . . . . . . . . . . . . 5-b Some useful theorems . . . . . . . . . . . . . . . . . . . . . . 5-c The virial theorem applied to molecules . . . . . . . . . . . . 5-d Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1189 1189 1190 1192 1192 1192 1194 1195 1197 1201 1201 1202 1204 1205 1206 1210 1210 1211 1212 1215

Introduction

In this complement, we intend to show how quantum mechanics enables us to understand the existence and properties of the chemical bond, which is responsible for the formation of various molecules from isolated atoms. Our aim is to explain the basic nature of these phenomena and not, of course, to enter into details which could only be covered in a specialized book on molecular physics. This is why we shall study the simplest existing molecule, the H+ 2 ion, which is composed of two protons and a single electron. We have already discussed certain aspects of this problem, in Chapter IV (§ C-2-c) and in exercise 5 of Complement KI ; we shall consider it here in a more realistic and systematic fashion. 1-a.

General method

When the two protons are very far from each other, the electron forms a hydrogen atom with one of them, and the other one remains isolated, in the form of an H+ ion. If the two protons are brought closer together, the electron will be able to “jump” from 1189

COMPLEMENT GXI

• M

r1

r2

R P1

P2

Figure 1: We call 1 the distance between the electron ( ) and proton between the electron and proton 2 and the internuclear distance

1,

2

the distance

1 2.

one to the other. This radically modifies the situation (cf. Chap. IV, § C-2). We shall therefore study the variation of the energies of the stationary states of the system with respect to the distance between the two protons. We shall see that the energy of the ground state reaches a minimum for a certain value of this distance, which explains the stability of the H+ 2 molecule. In order to treat the problem exactly, it would be necessary to write the Hamiltonian of the three-particle system and solve its eigenvalue equation. However, it is possible to simplify this problem considerably by using the Born-Oppenheimer approximation (cf. Complement AV , § 1-a). Since the motion of the electron in the molecule is considerably more rapid than that of the protons, the latter can be neglected in a first approximation. The problem is then reduced to the resolution of the eigenvalue equation of the Hamiltonian of the electron subject to the attraction of two protons which are assumed to be fixed. In other words, the distance between the two protons is treated, not like a quantum mechanical variable, but like a parameter, on which the electronic Hamiltonian and total energy of the system depend. In the case of the H+ 2 ion, it so happens that the equation simplified in this way is exactly soluble for all values of However, this is not true for other, more complex, molecules. The variational method, described in Complement EXI , must then be used. Although we are confining ourselves here to the study of the H+ 2 ion, we shall use the variational method, since it can be generalized to the case of other molecules. 1-b.

Notation

We shall call the distance between the two protons, situated at 1 and 2 , and and 2 the distances of the electron to each of the two protons (Fig. 1). We shall relate these distances to a natural atomic unit, the Bohr radius 0 (cf. Chap. VII, § C-2), by setting: 1

= 1

=

0 1

0

2

=

2

0

(1)

The normalized wave function associated with the ground state 1 of the hydrogen 1190

• atom formed around proton 1 e 1 1 = 3

1

A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

can be written: (2)

0

Similarly, we express the energies in terms of the natural unit = 2 2 0; is the ionization energy of the hydrogen atom. It will sometimes be convenient in what follows to use a system of elliptic coordinates, in which a point of space (here, the electron) is defined by: 1+ 2 1+ 2 = = 1

=

2

1

=

2

(3)

and the angle which fixes the orientation of the 1 2 plane about the 1 2 axis (this angle also enters into the system of polar coordinates whose axis coincides with and , and if varies between 0 and 2 , the point describes a circle 1 2 ). If we fix about the 1 2 axis. If (or ) and are fixed, describes an ellipse (or a hyperbola) of foci 1 and 2 when (or ) varies. It can easily be shown that the volume element in this coordinate system is: 3

d3 =

(

8

2

2

)d d d

(4)

To do so, we simply calculate the Jacobian

of the transformation:

=

(5)

We see immediately that, if 1 2: 2 1 2 2

tg

=

2

=

2

+

2

+

2

1

2

is chosen as the

in the middle of

2

+

2 2

+

+

2

=

(6)

We can then find: 1 1 = +

2

=

1

+ 1

=

axis, with the origin

1

1

2

= 2

1 2

= 1 2

= 1 2

= 1 2

=

/2

1

+

+

1

= =

/2

1

2

+

/2

+

=

2

+

/2

1

2

2

=

/2

1 2

+

=

/2

1 2

2

+

2

=0

(7)

1191



COMPLEMENT GXI

The Jacobian

+

1

=

=

can therefore be written:

(

2

2 1 2)

2

1 (

1 2)

2

2

2

(

2

2

+

2

2

+

2

0

)

(8)

Since: 2

2

4

=

1 2 2

(9)

we get, finally: =

1-c.

8 3( 2

(10)

2)

Principle of the exact calculation

In the Born-Oppenheimer approximation, the equation to be solved in order to find the energy levels of the electron in the Coulomb field of the two fixed protons can be written: ~2 ∆ 2

2

2

2

+ 1

(r) =

(r)

(11)

2

If we go into the elliptical coordinates defined in (3), we can separate the variables , and . Solving the equations so obtained, we find a discrete spectrum of possible energies for each value of . We shall not perform this calculation here, but shall merely represent (the solid-line curve in Figure 2) the variation of the ground state energy with respect to This will enable us to compare the results we shall obtain by the variational method with the values given by the exact solution of equation (11). 2.

The variational calculation of the energies

2-a.

Choice of the trial kets

Assume to be much larger than order of 0 , we have, practically: 2

0.

If we are concerned with values of

1

of the

2

for

2

0

(12)

2

The Hamiltonian: =

P2 2

2

2

2

+ 1

(13)

2

is then very close to that of a hydrogen atom centered at proton 1 . Analogous conclusions are, of course, obtained for much larger than 0 , and 2 of the order of 0 . 1192



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

E

0

ρ = – EI

1

2

3

R a0

4

Figure 2: Variation of the energy of the molecular ion H+ 2 with respect to the distance between the two protons. . solid line: the exact total energy of the ground state (the stability of the H+ 2 ion is due to the existence of a minimum in this curve). . dotted line: the diagonal matrix element 11 = 22 of the Hamiltonian (the variation of this matrix element cannot explain the chemical bond). . dashed line: the results of the simple variational calculation of § 2 for the bonding and antibonding states (though approximate, this calculation explains the stability of the H+ 2 ion). . triangles: the results of the more elaborate variational calculation of § 3-a (taking atomic orbitals of adjustable radius considerably improves the accuracy, especially at small distances).

Therefore, when the two protons are very far apart, the eigenfunctions of the Hamiltonian (13) are practically the stationary wave functions of hydrogen atoms. This is, of course, no longer true when 0 is not negligible compared to . We see, however, that it is convenient, for all , to choose a family of trial kets constructed from atomic states centered at each of the two protons. This choice constitutes the application to the special case of the H+ 2 ion of a general method known as the method of linear combination of atomic orbitals. More precisely, we shall call 1 and 2 the 1193



COMPLEMENT GXI

kets which describe the 1 states of the two hydrogen atoms: r

1

=

r

2

=

1 3 0

1 3 0

e

1

e

2

(14)

We shall choose as trial kets all the kets belonging to the vector subspace these two kets, that is, the set of kets such that: =

1

1

+

2

spanned by (15)

2

The variational method (Complement EXI ) consists of finding the stationary values of: =

(16)

within this subspace. Since this is a vector subspace, the average value is minimal or maximal when is an eigenvector of inside this subspace , and the corresponding eigenvalue constitutes an approximation of a true eigenvalue of in the total state space. 2-b.

The eigenvalue equation of the Hamiltonian

in the trial ket vector subspace

The resolution of the eigenvalue equation of within the subspace complicated by the fact that 1 and 2 are not orthogonal. Any vector of is of the form (15). For it to be an eigenvector of the eigenvalue , it is necessary and sufficient that: =

is slightly in

=1 2

with (17)

that is: 2

2

=

(18)

=1

=1

We set: = =

(19)

We must solve a system of two linear homogeneous equations: ( (

11

11 ) 1

21

21 )

+(

1+(

12

12 ) 2

=0

22

22 )

=0

2

(20)

This system has a non-zero solution only if: 11

11

12

12

21

21

22

22

The possible eigenvalues 1194

=0

are therefore the roots of a second-degree equation.

(21)

• 2-c.

A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

Overlap, Coulomb and resonance integrals

and

1

=

11

2

are normalized; consequently:

=1

22

(22)

On the other hand, 1 and 2 are not orthogonal. Since the wave-functions (14) associated with these two kets are real, we have: =

12

=

21

(23)

with: =

1

d3

=

2

1

(r)

2

(r)

(24)

is called an overlap integral, since it receives contributions only from points of space at which the atomic wave functions 1 and 2 are both different from zero (such points exist if the two atomic orbitals partially “overlap”). A simple calculation gives: =e

1+ +

1 3

2

(25)

To find this result, we can use elliptic coordinates (3), since: 1

=

2

=

+ 2 (26)

2

According to expression (14) for the wave functions and the one for the volume element, (4), we must calculate: =

+

1 3 0

2

2

d

1 +

3

=

+1

d 1

2

2

e

8

0

1 e 3

2

d

3 3 0

d

1

(27)

which easily yields (25).

By symmetry: 11

=

(28)

22

According to expression (13) for the Hamiltonian 11

=

1

P2 2

2

2 1

1

1

, we obtain:

2 1

+

1

1

(29)

2

2 P2 . The first term of (29) is therefore equal 2 1 to the energy of the ground state of the hydrogen atom, and the third term is equal to 2 ; we thus have:

Now,

1

is a normalized eigenket of

2 11

=

+

(30) 1195

COMPLEMENT GXI



with: 2

=

1

2

d3

=

1

[

2

2 1 (r)]

(31)

2

is called a Coulomb integral. It describes (to within a change of sign) the electrostatic interaction between the proton 2 and the charge distribution associated with the electron when it is in the 1 atomic state around the proton 1 . We find: 2

=

1

e

2

(1 + )

(32)

To find this result, we use elliptic coordinates again: 2

3 3 0

1

=

3 0

0

2

2

+

e

( + )

d ( + )e

( + )

+1

2

=

2

d d d

8 d 1

(33)

1

Elementary integrations then lead to result (32).

In formula (30), can be considered to be a modification of the repulsive energy of the two protons: when the electron is in the state 1 , the corresponding charge 2 distribution “screens” the proton 1 . Since 1 (r) is spherically symmetric about 1 , if the proton 2 was far enough from it this charge distribution would appear to 2 like a negative point charge situated at its center 1 , (so that the charge of the proton 1 would be totally cancelled). This does not actually happen unless is much larger than 0: 2

2

lim

=0

For finite

(34)

, the screening effect can only be partial, and we must have:

2

0

(35) 2

The variation of the energy with respect to is shown in Figure 2 by the dotted line. It is clear that the variation of 11 (or 22 ) with respect to cannot explain the chemical bond, since this curve has no minimum. Finally, let us calculate 12 and 21 . Since the wave functions 1 (r) and 2 (r) are real, we have: 12

=

(36)

21

Expression (13) for the Hamiltonian gives: 12

=

1

P2 2

2

2 2

+

2 1

2

2

that is, according to definition (24) of

1

2

(37)

1

:

2 12

1196

=

+

(38)



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

with: 2

=

2

1

d3

=

2

1 (r)

2 (r)

1

the resonance integral 1 . It is equal to:

We shall call =

2e

(1 + )

(40)

The use of elliptic coordinates enables us to write 2

= =

3 3 0

1 3 0

0

(39)

1

2

2

8

d d d

2e ( + )

+

2

in the form:

(41)

d 2 e 1

The fact that 12 is different from zero expresses the possibility of the electron “jumping” from the neighborhood of one of the protons to that of the other one. If, at some time, the electron is in the state 1 (or 2 ), it oscillates in time between the two sites, under the influence of 12 . This non-diagonal matrix element is therefore responsible for the phenomenon of quantum resonance, which we described qualitatively in § C-2-c of Chapter IV (hence the name of integral ). To sum up, the parameters which are functions of and are involved in equation (21) for the approximate energies are: 11

=

22

=1

12

=

21

=

2 11

=

22

=

12

=

21

=

+ 2

+

(42)

where , and are given by (25), (32) and (40), and are plotted in Figure 3. Note that the non-diagonal elements of determinant (21) take on significant values only if the orbitals 1 (r) and 2 (r) partially overlap, since the product 1 (r) 2 (r) appears in definition (39) of as well as in that of 2-d.

.

Bonding and antibonding states

Calculation of the approximate energies We set: = = =

(43)

1 Certain authors call an “exchange integral”. We prefer to restrict the use of this term to another type of integral which is encountered in many-particle systems (Complement BXIV , § 2-c- ).

1197

COMPLEMENT GXI



S

C, A

1 A

e2 a0

S

C

e2 = EI 2a0

0.5

e2 – C R C S A 0

1

2

3

4

5

6

ρ =

R a0

Figure 3: Variation of (the overlap integral), (the Coulomb integral) and (the resonance integral) with respect to = , and approach zero 0 . When 2 exponentially, while decreases only with 2 (the “screened” interaction of the proton 1 with the atom centered at 2 also decreases exponentially, however).

Equation (21) can then be written: 1+

2

1+

1+ 2 1+ 2

2

=0

(44)

or: 2

+ +1

2

=

+

+1

2

2

(45)

This gives the following two values for : +

=

1+

=

1+

2

2

+

1 + 1+

both approach 1 when + and approximate energies approach 1198

(46a)

(46b) approaches infinity. This means that the two , the ground state energy of an isolated hydrogen



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

atom, as expected (§ 2-a). Furthermore, it is convenient to choose this value as the energy origin, that is, to set: ∆

=

( )

( )=

+

(47)

Using (25), (32) and (40), the approximate energies ∆



2

=

2e

2

(1 + ) 1

e

e

2

(1 + )

(1 + +

2

3)

1

+

and ∆

can be written:

(48)

The variation of ∆ with respect to is shown in dashed lines in Figure 2. We see that ∆ has a negative minimum for a certain value of the distance between the two protons. Although this is an approximation (cf. Fig. 2), it explains the existence of the chemical bond. As we have already pointed out, the variation with respect to of the diagonal elements 11 and 22 of determinant (21) has no minimum (dotted-line curve of Figure 2). The minimum of ∆ therefore is due to the non-diagonal elements 12 and . This shows that the phenomenon of the chemical bond appears only if the electronic 12 orbitals of the two atoms participating in the bond overlap sufficiently. .

Eigenstates of

inside the subspace

The eigenstate corresponding to is called a bonding state, and the one corresponding to + , an antibonding state, since + always remains greater than the energy of the system formed by a hydrogen atom in the ground state and an infinitely distant proton. According to (45): 2

+ +1

=

+

+1

2

(49)

System (20) then gives: 1

2

=0

(50)

The bonding and antibonding states are therefore symmetric and antisymmetric linear combinations of the kets 1 and 2 . To normalize them, it must be recalled that 1 and 2 are not orthogonal (their scalar product is equal to ). We therefore obtain: +

=

=

1 2(1

)

1 2(1 + )

[

1

[

1

+

2

]

(51a)

2

]

(51b)

Note that the bonding state , associated with , is symmetric under exchange of 1 and 2 , while the antibonding state is antisymmetric. 1199

COMPLEMENT GXI



Comment:

It could have been expected that the eigenstates of inside the subspace would be symmetric and antisymmetric combinations of 1 and 2 : for given positions of the two protons, there is symmetry with respect to the bisecting plane of 1 2 , and remains unchanged if the roles of the two protons are exchanged. The bonding and antibonding states are approximate stationary states of the system under study. We pointed out in Complement EXI that the variational method can give a valid approximation for the energies but gives a more debatable result for the eigenfunctions. It is instructive, however, to have an idea of the mechanism of the chemical bond, to represent graphically the wave functions associated with the bonding and antibonding states, which are often called bonding and antibonding molecular orbitals. To do so, we can, for example, trace the surfaces of equal (the locus of points in space for which the modulus of the wave function has a given value). If is real, we indicate by a + (or ) sign the regions in which it is positive (or negative). This is what is done in Figure 4 for + and (the surfaces of equal are surfaces of revolution about the 1 2 axis, and Figure 4 only shows their cross sections in a plane containing 1 2 ). The difference between the bonding orbital and the antibonding orbital is striking. In the first one, the electronic cloud “streches out” to include both protons, while in the second one, the position probability of the electron is zero in the bisecting plane of 1 2 . Comment:

We can calculate the average value of the potential energy in the state if we use (51b), (31) et (39), is equal to: 2

2

2

1

2

, which,

= 2

2

1 1+

=

1

2

1

2 1

2

1

1

+

1

2

=

+

1

2

+

2 1

2 2

1 1+

(2 + 2 + )

(52)

Subtracting this from (46b), we obtain the kinetic energy: = =

P2 2 1 1+

= (1

+ )

(53)

We shall discuss later (§ 5) to what extent (52) and (53) give good approximations for the kinetic and potential energies.

1200



P1

P2

+

A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

P1

P2

+



a

b

Figure 4: Schematic drawings of the bonding molecular orbital (fig. a) and the antibonding molecular orbital (fig. b) for the H+ 2 ion. We have shown the cross section in a plane containing 1 2 of a family of surfaces for which the modulus of the wave function has a constant given value. These are surfaces of revolution about 1 2 (we have shown 4 surfaces, corresponding to 4 different values of ). The + and signs indicated in the figure are those of the wave function (which is real) in the corresponding regions. The dashed line is the trace of the bisecting plane of 1 2 , which is a nodal plane for the antibonding orbit.

3.

Critique of the preceding model. Possible improvements

3-a.

Results for small

What happens to the energy of the bonding state and the corresponding wave function 0? We see from Figure 3 that , and approach, respectively, 1, 2 and 2 when 0. If we subtract the repulsion term 2 of the two protons, to obtain the electronic energy, we find: when

2

3

(54)

0

In addition, since 1 approaches 2 , reduces to 1 (the ground state 1 of the hydrogen atom). This result is obviously incorrect. When = 0, we have the equivalent2 of a helium ion He+ . The electronic energy of the ground state of H+ = 0, with that 2 must coincide, for of the ground state of He+ . Since the helium nucleus is a = 2 nucleus, this energy is (cf. Complement AVII ): 2

=

4

(55)

and not 3 . Furthermore, the wave function (r) should not approach 1 (r) = ( 03 ) 1 2 e 1 , 1 but rather ( 03 3 ) 1 2 e with = 2 (the Bohr orbit is twice as small). This enables us to understand why the disagreement between the exact result and that of § 2 above becomes 2 In

addition to the two protons, the helium nucleus of course contains one or two neutrons.

1201

COMPLEMENT GXI



important for small values of (Fig. 2): this calculation uses atomic orbitals which are too spread out when the two protons are too close to each other. A possible improvement therefore consists of enlarging the family of trial kets because of these physical arguments and using kets of the form: =

1

1(

) +

2(

2

)

(56)

where 1 ( ) and 2 ( ) are associated with 1 atomic orbitals of radius 0 centered at 1 and 2 . The ground state still corresponds, for reasons of symmetry, to 1 = 2 . We consider like a variational parameter in seeking, for each value of the value of which minimizes the energy. The calculation can be performed completely in elliptic coordinates. We find (cf. Fig. 5) that the optimal value of decreases from = 2 for = 0 to = 1 for , as it should. The curve obtained for ∆ is much closer to the exact curve (cf. Fig. 2). Table I gives the values of the abscissa and ordinate of the minimum of ∆ obtained from the various models considered in this complement. It can be seen from this table that the energies found by the variational method are always greater than the exact energy of the ground state; in addition, we see that enlarging the family of trial kets improves the results for the energy. 3-b.

Results for large

When , we see from (48) that + and exponentially approach the same value . Actually, this limit should not be obtained so rapidly. To see this, we shall use a perturbation approach, as in Complement CXI , (Van der Waals forces) or EXII (the Stark effect of the hydrogen atom). Let us evaluate the perturbation of the energy of a hydrogen atom (in the 1 state), situated at 2 , produced by the presence of a proton 1 situated at a distance much greater than 0 ( 1). In the neighborhood of 2 , the proton 1 creates an electric field E, which varies like 1 2 . This field polarizes the hydrogen atom and causes an electric dipole moment D, proportional to E, to appear. The electronic wave function is distorted, and the barycenter of the electronic charge distribution moves closer to 1 (Fig. 6). E and D are both proportional to 1/ 2 and have the same sign. The electrostatic interaction between the proton 3 1 and the atom situated at 2 must therefore lower . Consequently, the asymptotic behavior 4 of ∆ + and ∆ must vary, not exponentially, but as (where is a positive constant) the energy by an amount which, like E D, varies as 1/ 4 . It is actually possible to find this result by the variational method. Instead of linearly superposing 1 orbitals centered at 1 and 2 , we shall superpose hybrid orbitals 1 , and 2 , which are not spherically symmetric about 1 and 2 . The hybrid orbital 2 is obtained, for example, by linearly superposing a 1 orbital and a 2 orbital, both centered4 at 2 : 2 (r)

=

2 1

(r) +

2 1

(r)

(57)

and has a form analogous to the one shown in Figure 6. Now, consider determinant (21). The non-diagonal elements 12 = 1 2 and 12 = 1 2 still approach zero exponentially when . This is because the product 1 (r) 2 (r) appears in the corresponding integrals; even though distorted, the orbitals 1 (r) and 2 (r) still remain localized in the neighborhoods of 1 and 2 respectively, and their overlap goes to zero exponentially when . The two eigenvalues + and therefore both approach 11 = 22 when , since determinant (21) becomes diagonal. Now, what does 22 represent? As we have seen (cf. § 2-c), it is the energy of a hydrogen atom placed at 2 and perturbed by the proton 1 . The calculation of § 2 neglected precisely, the energy is lowered by 12 E D (cf. Complement EXII , § 1). symmetry axis of the 2 orbital is chosen along the straight line joining the two protons.

3 More 4 The

1202



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

Z 2

ρ = R/a0

1 1

0

2

3

4

5

Figure 5: For each value of the internuclear distance, we have calculated the value of which minimizes the energy. For = 0, we have the equivalent of a He+ ion, and we indeed find = 2. For 0 , we have essentially an isolated hydrogen atom, which gives = 1. Between these two extremes, is a decreasing function of . The corresponding optimal energies are represented by triangles in Figure 2.

any polarization of the 1 electronic orbital due to the effect of the electric field created by increases. 1 , and this is why we found an energy correction decreasing exponentially when However, if, as we are doing here, we take into account the polarization of the electronic orbital, 4 we find a correction in . The fact that, in (57), we consider only the mixing with the 2 orbital causes the value of given by the variational calculation to be approximate (whereas the perturbation calculation of the polarization involves all the excited states, cf. Complement EXII ). The two curves ∆ + and ∆ therefore do approach each other exponentially, since the difference between + and involves only the non-diagonal elements 12 and 12 , and their common value for large approaches zero proportionally to 1 4 (Fig. 7). The preceding discussion also suggests using polarized orbitals like the one in (57), not only for large but also for all other values of as well. We would thus enlarge the family

E P1

P2 D

Figure 6: Under the effect of the electric field E created by the proton 1 , the electronic cloud of the hydrogen atom centered at 2 becomes distorted, and this atom acquires an electric dipole moment D. An interaction energy results which decreases with 1/ 4 when increases.

1203

COMPLEMENT GXI



ΔE– ΔE+ R

Figure 7: When , the energies of the bonding and antibonding states approach each other exponentially. However, they approach their limiting value less rapidly (like 1 4 ).

of trial kets and consequently improve the accuracy. In expression (57), we then consider as a variation parameter, like the parameter that defines the Bohr radius 0 associated with the 1 and 2 orbitals. To make the method even more flexible, we choose different parameters and for 1 and 2 . For each value of we then minimize the average value of in the state 1 + 2 (which, for reasons of symmetry, is still the ground state), and we determine the optimal values of , , . The agreement with the exact solution then becomes excellent (cf. Table I).

4.

Other molecular orbitals of the H+ 2 ion

In the preceding sections, we obtained by the variational method a bonding and an antibonding molecular orbitals. They were obtained from the ground state 1 of each of Distance d’équilibre des deux protons (abscisse du minimum de ∆ ) Méthode variationnelle du § 2 (orbitales 1 avec = 1) Méthode variationnelle du § ?? (orbitales 1 avec variable) Méthode variationnelle du § 3-b (orbitales hybrides avec variables) Valeurs exactes

2,50

0

1,76 eV

2,00

0

2,35 eV

2,00

0

2,73 eV

2,00

0

Tableau I 1204

Profondeur du puits de potentiel (valeur du minimum de ∆ )

2,79 eV



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

the two hydrogen atoms, formed about the two protons. Of course, we chose the 1 state because it was clear that this would be the best choice for obtaining an approximation of the ground state of a system of two protons and one electron. We can obviously envisage, with the method of linear combination of atomic orbitals (§ 2-a), using excited states of the hydrogen atom to obtain other molecular orbitals of higher energies. The main interest of these excited orbitals will be to give us an idea of the phenomena which can come into play in molecules which are more complex than the H+ 2 ion. For example, to understand the properties of a diatomic molecule containing several electrons, we can, in a first approximation, treat these electrons individually, as if they did not interact with each other. We thus determine the various possible stationary states for an isolated electron placed in the Coulomb field of the nuclei, and then place the electrons of the molecule in these states, taking the Pauli principle into account (Chap. XIV, § D-1) and filling the lowest energy states first (this procedure is analogous to the one described for many-electron atoms in Complement AXIV ). In this section, we shall indicate the principal properties of the excited molecular orbitals of the H+ 2 ion, while keeping in mind the possibilities of generalization to more complex molecules. 4-a.

Symmetries and quantum numbers. Spectroscopic notation

( ) The potential created by the two protons is symmetric with respect to revolution about the 1 2 axis, which we shall choose as the axis. This means that and, consequently, the Hamiltonian of the electron, do not depend on the angular variable which fixes the orientation about of the 1 2 plane containing the axis and the point . It follows that commutes with the component of the orbital angular momentum of the electron [in the r representation, ~ , which commutes with any -independent becomes the differential operator operator]. We can then find a system of eigenstates of that are also eigenstates of , and class them according to the eigenvalues ~ of . ( ) The potential is also invariant under reflection through any plane containing , that is, the axis. Under such a reflection, an eigenstate of of eigenvalue 1 2 ~ is transformed into an eigenstate of of eigenvalue ~ (the reflection changes the sense of revolution of the electron about ). Because of the invariance of the energy of a stationary state depends only on . In spectroscopic notation, we label each molecular orbital with a Greek letter indicating the value of , as follows: =0 =1

(58)

=2 (note the analogy with atomic spectroscopic notation: , , recall , , ). For example, since the ground state 1 of the hydrogen atom has a zero orbital angular momentum, the two orbitals studied in the preceding sections are orbitals (it can be shown that this is also true for the exact stationary wave functions, and not only for the approximate states obtained by the variational method). 1205

COMPLEMENT GXI



This notation does not use the fact that the two protons of the H+ 2 ion have equal charges. The , , classification of molecular orbitals therefore remains valid for a heteropolar diatomic molecule. ) In the H+ 2 ion (and, more generally, in homopolar diatomic molecules), the potential is invariant under reflection through the middle of 1 2 . We can therefore choose eigenfunctions of the Hamiltonian in such a way that they have a definite parity with respect to the point . For an even orbital, we add to the Greek letter which characterizes , an index (from the German “gerade”); this index is (“ungerade”) for odd orbitals. Thus, the bonding orbital obtained above from the 1 atomic states is a orbital, while the corresponding antibonding orbital is .

(

( ) Finally, we can use the invariance of under reflection through the bisecting plane of 1 2 to choose stationary wave functions which have a definite parity in this operation, that is, a parity defined with respect to the change in sign of the variable . Functions which are odd under this reflection are labeled with an asterisk. They are necessarily zero at all points of the bisecting plane of 1 2 , like the orbital shown in Figure 4b; these are antibonding orbitals.

Comment:

Reflection through the bisecting plane of 1 2 can be obtained by performing a reflection through followed by a rotation of about . The parity ( ) is therefore not independent of the preceding symmetries (the “ ” states will have an asterisk for odd and none for even ; the situation is reversed for the “ ” states). However, it is convenient to consider this parity, since it enables us to determine the antibonding orbitals immediately. 4-b.

Molecular orbitals constructed from the 2 atomic orbitals

If we start with the excited state 2 of the hydrogen atom, arguments analogous to those of the preceding sections will give a bonding (2 ) orbital and an antibonding (2 ) orbital, with forms similar to those in Figure 4. We shall therefore concern ourselves instead with molecular orbitals obtained from the excited atomic states 2 . .

Orbitals constructed from 2

states

1 2

We shall denote by and 22 the atomic states 2 (cf. Complement EVII , § 2-b), centered at 1 and 2 respectively. The form of the corresponding orbitals is shown in Figure 8 (note the choice of signs, indicated in the figure). By a variational calculation analogous to the one in § 2, we can construct, starting with these two atomic states, two approximate eigenstates of the Hamiltonian (13). The symmetries recalled in § 4-a imply that, to within a normalization factor, these molecular states can be written: 1 2 1 2

1206

+

2 2 2 2

(59a) (59b)



P1



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

+

P2

+

z



Figure 8: Schematic representation of the 2 atomic orbitals centered at 1 and 2 (the axis is chosen along 1 2 ) and used as a basis for constructing the excited molecular orbitals (2 ) and (2 ) shown in Figure 9 (note the sign convention chosen).

The shape of the two molecular orbitals so obtained can easily be deduced from Figure 8; they are shown in Figure 9. The two atomic states 2 are eigenstates of with the eigenvalue zero; the same is therefore also true of the two states (59). The molecular orbital associated with (59a) is even and is written (2 ); the one corresponding to (59b) is odd under a reflection through as well as under a reflection through the bisecting plane of 1 2 , and we shall therefore denote it by (2 ).

σu*(2pz)

σg(2pz)



P1

+

a

P2





P1

+



P2

+

b

Figure 9: Schematic representation of the excited molecular orbitals: the bonding orbital (2 ) (fig. a) and the antibonding orbital (2 ) (fig. b). As in Figure 8, we have drawn the cross section in a plane containing 1 2 of a constant modulus surface. This is a surface of revolution about 1 2 . The sign shown is that of the (real) wave function. The dashed-line curves are the cross sections in the plane of the figure of the nodal surfaces ( = 0).

1207

COMPLEMENT GXI



Remark: As we mentioned in the introduction of this complement, the major interest of the excited orbitals we study here is their application to molecules more complex than H+ 2 . For such molecules, the orbitals 2 and 2 have different energies, so that it is legitimate to consider them separately. If, however, we specifically study the hydrogen molecular H+ 2 ion, the sates 2 and 2 are then degenerate, so that they are immediately mixed by the electric field of a nearby proton. In this case, there is no reason to study the orbitals 2 et 2 separately; it is more appropriate to introduce hybrid orbitals similar to those discussed in § 3 of Complement EVII .

.

Orbitals constructed from 2

or 2

states

We shall now start with the atomic states 12 and 22 , with which are associated the real wave functions (cf. Complement EVII , § 2-b) shown in Figure 10 (note that the surfaces of equal whose cross sections in the plane are given in Figure 10 are surfaces of revolution, not about , but about axes parallel to and passing through 1 and 2 ). Recall that the atomic orbital 2 is obtained by the linear combination of eigenstates of corresponding to = 1 and = 1. The molecular orbitals constructed from these atomic orbitals therefore have = 1; they are orbitals.

x + P1 O

+ P2

z –



Figure 10: Schematic representation of the atomic orbitals 2 centered at 1 and 2 (the axis is chosen along 1 2 ) and used as a basis for constructing the excited molecular orbitals (2 ) and (2 ) shown in Figure 11. For each orbital, the surface of equal , whose cross section in the plane is shown, is a surface of revolution, no longer about , but about a straight line parallel to and passing either through 1 or 2 .

Here again, the approximate molecular states produced from the atomic states 2 are the symmetric and antisymmetric linear combinations: 1 2 1 2

+

2 2 2 2

(60a) (60b)

The form of these molecular orbitals can easily be qualitatively deduced from Figure 10. The surfaces of equal are not surfaces of revolution about , but are simply symmetric with respect to the plane. Their cross sections in this plane are shown in 1208



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

πg*(2px)

πu(2px) +



+

P1

P2

P1

P2



+

– a

b

Figure 11: Schematic representation of the excited molecular orbitals: the bonding orbital (2 ) (fig. a) and the antibonding orbital (2 ) (fig. b). For each of these two orbitals, we have shown the cross section in the plane of a surface on which has a given constant value. This surface is no longer a surface of revolution but is simply symmetric with respect to the plane. The meaning of the signs and the dashed lines is the same as in Figures 4, 8, 9, 10.

Figure 11. We see immediately in this figure that the orbital associated with state (60a) is odd with respect to the middle of 1 2 but even with respect to the bisecting plane of 1 2 ; it will therefore be denoted by (2 ). On the other hand, the orbital corresponding to (60b) is even with respect to point and odd with respect to the bisecting plane of 1 2 : it is an antibonding orbital, denoted by (2 ). We stress the fact that these orbitals have planes of symmetry, not axes of revolution like the orbitals. Of course, the molecular orbitals produced by the atomic states 2 can be deduced from the preceding ones by a rotation of 2 about 1 2 . orbitals analogous to the preceding ones are involved in the double or triple bonds of atoms such as carbon (cf. Complement EVII , §§ 3-c and 4-c).

Comment:

We saw earlier (§ 2-d) that the energy separation of the bonding and antibonding levels is due to the overlap of the atomic wave functions. Now, for the same distance the overlap of the 12 and 22 orbitals, which point towards each other, is larger than that of 12 and 22 , whose axes are parallel (Fig. 8 and 10). We see that the energy difference between (2 ) and (2 ) is larger than that between (2 ) and (2 ) [or (2 ) and (2 )]. The hierarchy of the corresponding levels is indicated in Figure 12.

1209

COMPLEMENT GXI

• σu* 2pz

πg* 2px

πg* 2py

πu 2px

πu 2py

2pz, 2px, 2py

σg 2pz Figure 12: The energies of the various excited molecular orbitals constructed from the atomic orbitals 2 , 2 and 2 centered at 1 and 2 (the axis is chosen along ). By symmetry, the molecular orbitals produced by the 2 atomic orbitals are 1 2 degenerate with those produced by the 2 atomic orbitals. The difference between the bonding and antibonding molecular orbitals (2 ) and (2 ) is, however, smaller than the corresponding difference between the (2 ) and (2 ) molecular orbitals. This is due to the larger overlap of the two 2 atomic orbitals.

5. 5-a.

The origin of the chemical bond; the virial theorem Statement of the problem

When the distance between the protons decreases, their electrostatic repulsion increases. The fact that the total energy ( ) of the bonding state decreases (when decreases from a very large value) and then passes through a minimum therefore means that the electronic energy begins by decreasing faster than 2 increases (of course, since this term diverges when 0, it is the repulsion between the protons which counts at short distances). We can then ask the following question: does the lowering of the electronic energy, which makes the chemical bond possible, arise from a lowering of the electronic potential energy or from a lowering of the kinetic energy or from both? We have already calculated, in (52) and (53), approximate expressions for the (total) potential and kinetic energies. We might then consider studying the variation of these expressions with respect to Such a method, however, would have to be used with caution, since, as we have already pointed out, the eigenfunctions supplied by a variational calculation are much less precise than the energies. We shall discuss this point in greater detail in § 5-d- below. Actually, it is possible to answer this question rigorously, thanks to the “virial theorem”, which provides exact relations between ( ) and the average kinetic and potential energies. Therefore, in this section, we shall prove this theorem and discuss its physical consequences. The results obtained, furthermore, are completely general and 2

1210



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

can be applied, not only to the molecular ion H+ 2 , but also to all other molecules. Before considering the virial theorem itself, we shall begin by establishing some results which we shall need later. 5-b.

Some useful theorems

.

Euler’s theorem

Recall that a function ( 1 , 2 , ..., ) of several variables 1 , 2 , ... is said to be homogeneous of degree if it is multiplied by when all the variables are multiplied by : (

1

)=

2

(

1

)

2

(61)

For example, the potential of a three-dimensional harmonic oscillator: (

)=

1 2

2

2

(

+

2

+

2

)

(62)

is homogeneous of degree 2. Similarly, the electrostatic interaction energy of two particles:

=

)2 + (

(

)2 + (

(63)

)2

is homogeneous of degree 1. Euler’s theorem indicates that any function satisfies the identity: =

(

which is homogeneous of degree

)

1

(64)

=1

To prove this, we calculate the derivatives with respect to left-hand side yields: (

)

1

(

)=

(

1

of both sides of (61). The

)

(65)

and the right-hand side yields: 1

(

1

)

If we set (65) equal to (66), with

(66) = 1, we obtain (64).

Euler’s theorem can very easily be verified in examples (62) and (63). .

The Hellman-Feynman theorem

Let ( ) be a Hermitian operator which depends on a real parameter , and a normalized eigenvector of ( ) of eigenvalue ( ): ( ) ( ) = ( ) ( ) =1

( ) ( )

( )

(67) (68) 1211



COMPLEMENT GXI

The Hellmann-Feynman theorem indicates that: d d

( )=

( )

d d

( ) ( )

(69)

This relation can be proven as follows. According to (67) and (68), we have: ( )=

( )

( ) ( )

(70)

If we differentiate this relation with respect to , we obtain: d d

( )=

( )

d d

d d

+

( ) ( ) ( )

( ) ( ) +

( )

( )

d d

( )

that is, using (67) and the adjoint relation [ ( ) is Hermitian, hence d d

( )=

( ) +

d d

(71) ( ) is real]:

( ) ( ) d d

( )

( )

( ) +

( )

d d

( )

(72)

On the right-hand side, the expression inside curly brackets is the derivative of which is zero since ( ) is normalized; we therefore find (69). .

Average value of the commutator [ ,

] in an eigenstate of

Let be a normalized eigenvector of the Hermitian operator For any operator : [ since, as ( 5-c.

.

]

=0 =

of eigenvalue

.

(73) and

)

( ) ( ),

=

:

=

=0

(74)

The virial theorem applied to molecules

The potential energy of the system

Consider an arbitrary molecule composed of nuclei and electrons. We shall denote by r ( = 1 2 ) the classical positions of the nuclei, and by r and p ( = 1 2 ) the classical positions and momenta of the electrons. The components of these vectors will be written , , , etc. We shall use the Born-Oppenheimer approximation, considering the r as given classical parameters. In the quantum mechanical calculation, only the r and p become operators, R and P . We must therefore solve the eigenvalue equation: (r1 1212

r ) (r1

r ) =

(r1

r ) (r1

r )

(75)



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

of a Hamiltonian which depends on the parameters r1 r and which acts in the state space of the electrons. The expression for can be written: =

+

where

(r1

r )

(76)

is the kinetic energy operator of the electrons:

= =1

1 (P )2 2

(77)

and (r1 r ) is the operator obtained by replacing the r by the operators R in the expression for the classical potential energy. The latter is the sum of the repulsion energy between the electrons, the attraction energy between the electrons and the nuclei, and the repulsion energy between the nuclei, so that: (r1

r )=

+

(r1

r )+

(r1

r )

(78)

Actually, since depends only on the r and does not involve the R , is a number and not an operator acting in the state space of the electrons. The only effect of is therefore to shift all the energies equally, since equation (75) is equivalent to: (r1

r ) (r1

r ) =

(r1

r ) (r1

r )

(79)

where: (r1

r )=

+

+

(r1

and where the electronic energy (r1

r )=

(r1

r )=

(r1

is related to the total energy

r )

(r1

r )

r )

(80)

by: (81)

We can apply Euler’s theorem to the classical potential energy, since it is a homogeneous function of degree 1 of the set of electronic and nuclear coordinates. Since the operators R all commute with each other, we get the relation between the quantum mechanical operators: r



+

R



=

(82)

=1

=1

where ∇ and ∇ denote the operators obtained by substitution of the R for the r in the gradients with respect to r and r in the classical expression for the potential energy. Relation (82) will serve as the foundation of our proof of the virial theorem. .

Proof of the virial theorem We apply (73) to the special case in which: =

R

P

(83)

=1

1213



COMPLEMENT GXI

To do so, we find the commutator of R

=

P

=1

[

with ]

:

+

[

]

=1

(P )2

= ~

+R



(84)

=1

(we have used the commutation relations of a function of the momentum with the position, or vice versa; cf. Complement BII , § 4-c). The first term inside the curly brackets is proportional to the kinetic energy . According to (82), the second term is equal to: ∇

r

(85)

=1

Consequently, relation (73) yields: 2

+

+



r

=0

(86)

=1

that is, since the Hamiltonian 2

+

=

depends on the parameters r only through



r

: (87)

=1

The components r here play a role analogous to that of the parameter in (69). Application of the Hellmann-Feynman theorem to the right-hand side of equation (87) then gives: 2

+

=



r

(r1

r

r )

(88)

=1

Furthermore, we obviously have: +

=

(r1

r )

(89)

We can then easily find from (88) and (89): =

r



=1

=2

+

r

(90) ∇

=1

Thus, we obtain a very simple result: the virial theorem applied to molecules. It enables us to calculate the average kinetic and potential energies if we know the variation of the total energy with respect to the positions of the nuclei. 1214



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

Comment:

The total electronic energy related by: =2

+

r

and the electronic potential energy



are also

(91)

=1

This relation can be proven by substituting (81) and the explicit expression for in terms of the r into the second relation of (90). However, it is simpler to note that the electronic potential energy = + , like the total potential energy , is a homogeneous function of degree 1 of the coordinates of the system of particles. Consequently, the preceding arguments apply to as well as to and we can simultaneously replace by and by in both relations (90).

.

A special case: the diatomic molecule

When the number of nuclei is equal to two, the energies depend only on the internuclear distance This further simplifies the expression for the virial theorem, which becomes: d d d d

= =2 Since =

+

(92)

depends on the nuclear coordinates only through

we have:

d d

(93)

and, consequently: = =1 2

d d

(94) =1 2

Now, the distance between the nuclei is a homogeneous function of degree 1 of the coordinates of the nuclei. Application of Euler’s theorem to this function enables us to replace the double summation appearing on the right-hand side of (94) by , and we finally obtain: r



=

=1 2

d d

(95)

When this result is substituted into (90), it gives relations (92).

In (92) as in (90), we can replace 5-d.

.

by

and

by

Discussion

The chemical bond is due to a lowering of the electronic potential energy

Let be the value of the total energy of the system when the various nuclei are infinitely far apart. If it is possible to form a stable molecule by moving the nuclei 1215

COMPLEMENT GXI



closer together, there must exist a certain relative arrangement of these nuclei for which the total energy passes through a minimum 0 . For the corresponding values of r , we then have: ∇

=0

(96)

Relations (90) then indicate that, for this equilibrium position, the kinetic and potential energies are equal to: =

0 0

=2

0

(97)

0

Furthermore, when the nuclei are infinitely far from each other, the system is composed of a certain number of atoms or ions without mutual interactions (the energy no longer depends on the r ). For each of these subsystems, the virial theorem indicates that = , = 2 and, for the system as a whole, we must therefore also have: = =2

(98)

Subtracting (98) from (97) then gives: =

0 0

(

0

)

0

= 2(

0

)

0

(99)

The formation of a stable molecule is therefore always accompanied by an increase in the kinetic energy of the electrons and a decrease in the total potential energy. The electronic potential energy must, furthermore, decrease even more since the average value (the repulsion between the nuclei), which is zero at infinity, is always positive. It is therefore a lowering of the potential energy of the electrons + that is responsible for the chemical bond. At equilibrium, this lowering must outweigh the increase in and . The special case of the H+ 2 ion

.

( ) Application of the virial theorem to the approximate variational energy. We return to the study of the variation of and for the H+ 2 ion. We shall begin by examining the predictions of the variational model of § 2, which led to the approximate expressions (52) and (53). From the second of these relations, we deduce that: ∆

=

=

1 1+

(

2

)

(100)

Since is always greater than 2 (cf. Fig. 3), this calculation would tend to indicate that ∆ is always negative. This appears, moreover, in Figure 13, where the dashed lines represent the variations of the approximate expressions (52) and (53). In particular, we see that, according to the variational calculation, ∆ is negative at equilibrium ( 2 5) and ∆ is positive. These results are both incorrect, according to (99). We see here the limits of a variational calculation, which gives an acceptable value for the total energy + , but not for and separately. These latter average values depend too strongly on the wave function.

1216



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

e2 a0 EI Te R a0

0 1

2

3

4

5

– EI E e2 – a0

V

Figure 13: The electronic kinetic energy and the potential energy of the H+ 2 ion as functions of = 0 (for purposes of comparison, we have also shown the total energy = + ). . solid lines: the exact values (the chemical bond is due to the fact that decreases a little faster than increases). . long dashes: the average values calculated from the bonding wave function given by the simple variational method of § 2. . short dashes: the values obtained by the application of the virial theorem to the energy given by the same variational calculation.

The virial theorem enables us, without having to resort to the rigorous calculation mentioned in § 1-c, to obtain a much better approximation for and . All we need to do is apply the exact relations (92) to the energy calculated by the variational method. We should expect an acceptable result, since the variational approximation is now used only to supply the total energy The values thus obtained for and are represented by short dashed lines in Figure 13. For purposes of comparison, we have shown in solid lines the exact values of and (obtained by application of the virial theorem to the solid-line curve of Figure 2). First of all, we see that for = 2 5, the curve in short dashed lines indicates, as expected, that ∆ is positive and ∆ is negative. In addition, the general shape of these curves reproduces rather well that of the solid-line curves. As long as & 1 5, the virial theorem applied to the variational energy does give values which are very close to reality. This represents a considerable improvement over the direct calculation of the average values in the approximate states. ( ) Behavior of and The solid-line curves of Figure 13 (the exact curves) show that 4 and + when 0. Indeed, when = 0, we have the equivalent of a He+ ion for which the

1217

COMPLEMENT GXI



electronic kinetic energy is 4 . The divergence of is due to the term = 2 , which 2 becomes infinite when 0 (the electronic potential energy = remains finite and approaches 8 , which is indeed its value in the He+ ion). The behavior for large deserves a more detailed discussion. We have seen above (§ 3-b) that the energy of the ground state behaves, for 0 , like: (101)

4

where is a constant which is proportional to the polarizability of the hydrogen atom. By substituting this result into formulas (92), we obtain: 3 4

2

+

2

(102)

4

When decreases from a very large value, begins by decreasing with 1 4 from its asymptotic value , and begins by increasing from 2 . These variations then change sign (this must be so since and ): as continues 0 is larger than 0 is smaller than to decrease (cf. Fig. 13), passes through a minimum and then increases until it reaches its value 4 for = 0. As for the potential energy , it passes through a maximum, then decreases, passes through a minimum, and then approaches infinity when 0. How can we interpret these variations?

P1 – a0

P2 – EI

z a0

Figure 14: Variation of the potential energy of the electron subjected to the simultaneous attraction of the two protons 1 and 2 as one moves along the line 1 2 . In the bonding state, the wave function is concentrated in the region between 1 and 2 , and the electron benefits simultaneously from the attraction of both protons.

As we have noted several times, the non-diagonal elements 12 and 21 of determinant (21) approach zero exponentially when . We can therefore argue only in terms of 11 or 22 in discussing the variation of the energy of the H+ 2 ion at large internuclear distances. The problem is then reduced to the study of the perturbation of a hydrogen atom centered at 2 by the electric field of the proton 1 . This field tends to distort the electronic orbital by

1218



A SIMPLE EXAMPLE OF THE CHEMICAL BOND: THE H+ 2 ION

stretching it in the 1 direction (cf. Fig. 6). Consequently, the wave function extends into a larger volume. According to Heisenberg’s uncertainty relations, this allows the kinetic energy to decrease; this can explain the behavior of for large Arguing in terms of 22 , we can also explain the asymptotic behavior of . The discussion of § 3-b showed that, for 0 , the polarization of the hydrogen atom situated at 2 2

makes its interaction energy

2

+

with

1

slightly negative (proportional to

1

4

).

1 2

If

is positive, it is because the potential energy

of the atom at

2

increases more

2 2

rapidly, when

1

is brought closer to

2,

than

2

+

2

decreases. This increase in

1

is due to the fact that the attraction of 1 moves the electron slightly away from it into regions of space in which the potential created by 2 is less negative.

2 2

and carries

+ For 0 (the equilibrium position of the H2 ion), the wave function of the bonding state is highly localized in the region between the two protons. The decrease in (despite the increase in 2 ) is due to the fact that the electron is in a region of space in which it benefits simultaneously from the attraction of both protons. This lowers its potential energy (cf. Fig. 14). This combined attraction of the two protons also leads to a decrease in the spatial extension of the electronic wave function, which is concentrated in the intermediate region. This is why, for close to 0 , increases when decreases.

References and suggestions for further reading (H+ 2 ion, H2 molecule, nature of the chemical bond, etc.):

Pauling (12.2); Pauling and Wilson (1.9), Chaps. XII and XIII; Levine (12.3), Chaps. 13 and 14; Karplus and Porter (12.1), Chap. 5, § 6; Slater (1.6), Chaps. 8 and 9; Eyring et al (12.5), Chaps. XI and XII; Coulson (12.6), Chap. IV; Wahl (12.13).

1219



EXERCISES

Complement HXI Exercises 1. A particle of mass

is placed in an infinite one-dimensional well of width :

( ) = 0 for 0 ( )=+

everywhere else

It is subject to a perturbation ( )= where

0

of the form:

2

is a real constant with the dimensions of an energy.

0

. Calculate, to first order in levels of the particle.

0,

the modifications induced by

( ) to the energy

. Actually, the problem is exactly soluble. Setting = 2 ~2 , show that the possible values of the energy are given by one of the two equations sin ( 2) = 0 or tan ( 2) = ~2 (as in exercise 2 of Complement K , watch out for 0 I the discontinuity of the derivative of the wave function at = 2). Discuss the results obtained with respect to the sign and size of show that one obtains the results of the preceding question.

0.

In the limit

0,

0

2. Consider a particle of mass placed in an infinite two-dimensional potential well of width (cf. Complement GII ) : (

) = 0 if 0

(

)=+

and

0

everywhere else

This particle is also subject to a perturbation (

)=

for 0

0

2 ) = 0 everywhere else.

(

. Calculate, to first order in

and

0,

0

described by the potential: 2

the perturbed energy of the ground state.

. Same question for the first excited state. Give the corresponding wave functions to zeroth order in 0 . 3. A particle of mass 2

2 0

=

2

+

+

2

, constrained to move in the 1 2

2

(

2

+

2

plane, has a Hamiltonian:

)

(a two-dimensional harmonic oscillator, of angular frequency effect on this particle of a perturbation given by: =

1

1

+

2

). We want to study the

2

1221



COMPLEMENT HXI

where

1

and

2

are constants, and the expressions for

1

and

2

are:

2

1

=

2

=~

2

~2

2

(

is the component along of the orbital angular momentum of the particle). In the perturbation calculations, consider only the corrections to first order for the energies and to zeroth order for the state vectors. . Indicate without calculations the eigenvalues of and the associated eigenvectors.

0,

their degrees of degeneracy

In what follows, consider only the second excited state of which is three-fold degenerate. . Calculate the matrices representing the restrictions of space of the eigenvalue 3~ of 0 . . Assume

2

= 0 and

1

1

0

of energy 3~

and

2

and

to the eigensub-

1.

Calculate, using perturbation theory, the effect of the term excited state of 0 .

1

1

on the second

. Compare the results obtained in with the limited expansion of the exact solution, to be found with the help of the methods described in Complement HV (normal vibrational modes of two coupled harmonic oscillators). . Assume 2 1. Considering the results of question 1 turbed situation, calculate the effect of the term 2 2 . . Now assume that

1

= 0 and

2

to be a new unper-

1.

Using perturbation theory, find the effect of the term state of 0 .

2

2

on the second excited

. Compare the results obtained in with the exact solution, which can be found from the discussions of Complement DVI . . Finally, assume that 1 1. Considering the results of question 2 new unperturbed situation, calculate the effect of the term 1 1 .

to be a

4. Consider a particle of mass constrained to move in the plane in a circle centered at with fixed radius (a two-dimensional rotator). The only variable of the system is the angle = ( ), and the quantum state of the particle is defined by the wave function ( ) (which represents the probability amplitude of finding the particle at the point of the circle fixed by the angle ). At each point of the circle, ( ) can take on only one value, so that: ( +2 )= ( ) ( ) is normalized if: 2

( )2d =1 0

1222



EXERCISES

~ d . Consider the operator = . Is Hermitian? Calculate the eigenvalues and d normalized eigenfunctions of What is the physical meaning of ? . The kinetic energy of the particle can be written: 2

=

0

2

2

Calculate the eigenvalues and eigenfunctions of

0.

Are the energies degenerate?

. At = 0, the wave function of the particle is cos2 (where is a normalization coefficient). Discuss the localization of the particle on the circle at a subsequent time . Assume that the particle has a charge and that it interacts with a uniform electric field parallel to . We must therefore add to the Hamiltonian 0 the perturbation: =

cos

Calculate the new wave function of the ground state to first order in . Determine the proportionality coefficient (the linear suceptibility) between the electric dipole parallel to acquired by the particle and the field . . Consider, for the ethane molecule CH3 – CH3 , a rotation of one CH3 group relative to the other about the straight line joining the two carbon atoms. To a first approximation, this rotation is free, and the Hamiltonian 0 introduced in describes the rotational kinetic energy of one of the CH3 groups relative to the other (2 2 must, however, be replaced by , where is the moment of inertia of the CH3 group with respect to the rotational axis and is a constant). To take account of the electrostatic interaction energy between the two CH3 groups, we add to 0 a term of the form: = cos 3 where

is a real constant.

Give a physical justification for the -dependence of Calculate the energy and wave function of the new ground state (to first order in for the wave function and to second order for the energy). Give a physical interpretation of the result. 5. Consider a system of angular momentum J. We confine ourselves in this exercise to a three-dimensional subspace, spanned by the three kets + 1 , 0 1 , common eigenstates of J2 (eigenvalue 2~2 ) and (eigenvalues +~ 0 ~). The Hamiltonian 0 of the system is: 0

=

2

+

where and frequency.

~ are two positive constants, which have the dimensions of an angular

1223



COMPLEMENT HXI

. What are the energy levels of the system? For what value of the ratio degeneracy?

is there

. A static field B0 is applied in a direction u with polar angles interaction with B0 of the magnetic moment of the system:

. The

and

M= J ( : the gyromagnetic ratio, assumed to be negative) is described by the Hamiltonian: =

0

where 0 = B0 is the Larmor angular frequency in the field B0 , and component of J in the u direction: =

cos +

sin cos

+

Write the matrix that represents . Assume that =

sin sin in the basis of the three eigenstates of

and that the u direction is parallel to

. We also have

Calculate the energies and eigenstates of the system, to first order in energies and to zeroth order for the eigenstates. . Assume that arbitrary.

is the

= 2 and that we again have

0

0

0.

.

0

for the

, the direction of u now being

In the + 1 0 1 basis, what is the expansion of the ground state + , to first order in 0 ? 0

0

of

Calculate the average value M of the magnetic moment M of the system in the state 0 . Are M and B0 parallel? Show that one can write: = with = bility tensor).

. Calculate the coefficients

(the components of the suscepti-

6. Consider a system formed by an electron spin S and two nuclear spins I1 and I2 (S is, for example, the spin of the unpaired electron of a paramagnetic diatomic molecule, and I1 and I2 are the spins of the two nuclei of this molecule). Assume that S, I1 , I2 are all spin 1/2’s. The state space of the three-spin system is spanned by the eight orthonormal kets , 1 , 1 2 , common eigenvectors of ~ 2, 1 ~ 2, 2 ~ 2 (with = ). For 2 , with respective eigenvalues 1 = 2 = example, the ket + + corresponds to the eigenvalues +~ 2 for , ~ 2 for 1 , and +~ 2 for 2 . 1224



EXERCISES

We begin by neglecting any coupling of the three spins. We assume, however, that they are placed in a uniform magnetic field B parallel to . Since the gyromagnetic ratios of I1 and I2 are equal, the Hamiltonian 0 of the system can be written: =Ω

0

+

where Ω and

1

+

2

are real, positive constants, proportional to B . Assume Ω

2 .

What are the possible energies of the three-spin system and their degrees of degeneracy? Draw the energy diagram. We now take coupling of the spins into account by adding the Hamiltonian: = S I1 + S I2 where

is a real, positive constant (the direct coupling of I1 and I2 is negligible).

What conditions must be satisfied by non-zero matrix element between S I2 .

, 1

1, 2

2, and

,

1,

2

1

2

for S I1 to have a ? Same question for

Assume that: ~2

~Ω ~

so that can be treated like a perturbation with respect to 0 . To first order in , what are the eigenvalues of the total Hamiltonian = 0 + ? To zeroth order in , what are the eigenstates of ? Draw the energy diagram. Using the approximation of the preceding question, determine the Bohr frequencies which can appear in the evolution of when the coupling of the spins is taken into account. In an E.P.R. (Electronic Paramagnetic Resonance) experiment, the frequencies of the observed resonance lines are equal to the preceding Bohr frequencies. What is the shape of the E.P.R. spectrum observed for the three-spin system? How can the coupling constant be determined from this spectrum? Now assume that the magnetic field B is zero, so that Ω = then reduces to

= 0. The Hamiltonian

Let I = I1 + I2 be the total nuclear spin. What are the eigenvalues of I2 and their degrees of degeneracy? Show that has no matrix elements between eigenstates of I2 of different eigenvalues. Let J = S + I be the total spin. What are the eigenvalues of J2 and their degrees of degeneracy? Determine the energy eigenvalues of the three-spin system and their degrees of degeneracy. Does the set J2 form a C.S.C.O.? Same question for I2 J2 . 7. Consider a nucleus of spin = 3 2, whose state space is spanned by the four vectors ( = +3 2, +1 2, 1 2, 3 2), common eigenvectors of I2 (eigenvalue 15~2 4) and (eigenvalue ~). 1225

COMPLEMENT HXI



This nucleus is placed at the coordinate origin in a non-uniform electric field derived from a potential ( ). The directions of the axes are chosen such that, at the origin: 2

2

= Recall that ∆

2

=

=0

satisfies Laplace’s equation:

=0

We shall assume that the interaction Hamiltonian between the electric field gradient at the origin and the electric quadrupole moment of the nucleus can be written: 0

=

2 (2

1 1) ~2

2

2

+

2

+

where is the electron charge, is a constant with the dimensions of a surface and proportional to the quadrupole moment of the nucleus, and: 2

=

2 2

;

=

2

;

2

0

=

0

2

0

(the index 0 indicates that the derivatives are evaluated at the origin). Show that, if 0

=

[3

is symmetric with respect to revolution about 2

,

Show that, in the general case,

where

=

[3 and

2

has the form:

0,

their degrees

( + 1)]

where is a constant to be specified. What are the eigenvalues of of degeneracy and the corresponding eigenstates?

0

0

( + 1)] +

(

0 2 +

can be written:

+

2

)

are constants, to be expressed in terms of

and

What is the matrix which represents 0 in the basis? Show that it can be broken down into two 2 2 submatrices. Determine the eigenvalues of 0 and their degrees of degeneracy, as well as the corresponding eigenstates. In addition to its quadrupole moment, the nucleus has a magnetic moment M = I ( : the gyromagnetic ratio). Onto the electrostatic field is superposed a magnetic field B0 , of arbitrary direction u. We set 0 = B0 . What term must be added to 0 in order to take into account the coupling between M and B0 ? Calculate the energies of the system to first order in 0 Assume B0 to be parallel to and weak enough for the eigenstates found in and the energies to first order in 0 found in to be good approximations. What are the Bohr frequencies which can appear in the evolution of ? Deduce from them the shape of the nuclear magnetic resonance spectrum which can be observed with a radiofrequency field oscillating along . 1226

• 8. A particle of mass :

EXERCISES

is placed in an infinite one-dimensional potential well of width

( ) = 0 for 0 ( )=+

elsewhere

Assume that this particle, of charge , is subject to a uniform electric field , with the corresponding perturbation being: =

2

. Let 1 and energy.

2

be the corrections to first- and second-order in

for the ground state

Show that 1 is zero. Give the expression for 2 in the form of a series, whose terms are to be calculated in terms of ~ (the integrals given at the end of the exercise can be used). By finding upper bounds for the terms of the series for 2 , give an upper bound for 2 (cf. § B-2-c of Chapter XI). Similarly, give a lower bound for 2 , obtained by retaining only the principal term of the series. With what accuracy do the two preceding bounds enable us to bracket the exact value of the shift ∆ in the ground state to second order in ? We now want to calculate the shift ∆ as a trial function: ( )= where

2

sin

1+

by using the variational method. Choose

2

is the variational parameter. Explain this choice of trial functions.

Calculate the average energy ( ) of the ground state to second order in [assuming the expansion of ( ) to second order in to be sufficient]. Determine the optimal value of . Find the result ∆ var given by the variational method for the shift in the ground state to second order in . By comparing ∆ var with the results of , evaluate the accuracy of the variational method applied to this example. We give the integrals: 2 0

2

sin

sin

2

16

d =

2

(1

1 4

2 )2

=1 2 3 2

2 0

2

0

2

2

2

sin2

sin

d = cos

2

1 6

1 2

d =

For all the numerical calculations, take

2

2 = 9 87. 1227

COMPLEMENT HXI



9. We want to calculate the ground state energy of the hydrogen atom by the variational method, choosing as trial functions the spherically symmetrical functions (r) whose -dependcnce is given by: ( )=

1

for

( ) = 0 for is a normalization constant and

is the variational parameter.

. Calculate the average value of the kinetic and potential energies of the electron in the state . Express the average value of the kinetic energy in terms of ∇ , so as to avoid the “delta functions” which appear in ∆ (since ∇ is discontinuous). . Find the optimal value

0

of . Compare

0

with the Bohr radius

0.

. Compare the approximate value obtained for the ground state energy with the exact value 1.

10. We intend to apply the variational melhod to the determination of the energies of a particle of mass in an infinite potential well: ( )=0 ( )=

everywhere else

We begin by approximating, in the interval [ + ], the wave function of the ground state by the simplest even polynomial which goes to zero at = : ( )=

2

2

( )=0

for

everywhere else

(a variational family reduced to a single trial function). Calculate the average value of the Hamiltonian obtained with the true value.

in this state. Compare the result

Enlarge the family of trial functions by choosing an even fourth-degree polynomial which goes to zero at = : 2

( )= ( )=0

2

2

2

for

everywhere else

(a variational family depending on the real parameter ). ( ) Show that the average value of ( )= 1228

~2 33 2 2 2

2 2

in the state

42 + 105 12 + 42

( ) is:

• ( ) Show that the values of which minimize or maximize the roots of the equation: 13

2

EXERCISES

( ) are given by

98 + 21 = 0

( ) Show that one of the roots of this equation gives, when substituted into ( ), a value of the ground state energy that is much more precise than the one obtained in . ( ) What other eigenvalue is approximated when the second root of the equation obtained in b- is used? Could this have been expected? Evaluate the precision of this determination. . Explain why the simplest polynomial which permits the approximation of the first 2 excited state wave function is ( 2 ). What approximate value is then obtained for the energy of this state?

1229

Chapter XII

An application of perturbation theory: the fine and hyperfine structure of hydrogen

A  Introduction
B  Additional terms in the Hamiltonian
   B-1  The fine-structure Hamiltonian
   B-2  Magnetic interactions related to proton spin: the hyperfine Hamiltonian
C  The fine structure of the n = 2 level
   C-1  Statement of the problem
   C-2  Matrix representation of the fine-structure Hamiltonian inside the n = 2 level
   C-3  Results: the fine structure of the n = 2 level
D  The hyperfine structure of the n = 1 level
   D-1  Statement of the problem
   D-2  Matrix representation of W_hf in the 1s level
   D-3  The hyperfine structure of the 1s level
E  The Zeeman effect of the 1s ground state hyperfine structure
   E-1  Statement of the problem
   E-2  The weak-field Zeeman effect
   E-3  The strong-field Zeeman effect
   E-4  The intermediate-field Zeeman effect


A. Introduction

The most important forces inside atoms are Coulomb electrostatic forces. We took them into account in Chapter VII by choosing as the hydrogen atom Hamiltonian:

H₀ = P²/2μ + V(R)    (A-1)

The first term represents the kinetic energy of the atom in the center of mass frame (μ is the reduced mass). The second term:

V(r) = −(1/4πε₀)(q²/r) = −e²/r    (A-2)

represents the electrostatic interaction energy between the electron and the proton (q is the electron charge). In § C of Chapter VII, we calculated in detail the eigenstates and eigenvalues of H₀. Actually, expression (A-1) is only approximate: it does not take any relativistic effects into account. In particular, all the magnetic effects related to the electron spin are ignored. Moreover, we have not introduced the proton spin and the corresponding magnetic interactions. The error is, in reality, very small, since the hydrogen atom is a weakly relativistic system (recall that, in the Bohr model, the velocity v in the first orbit n = 1 satisfies v/c = e²/ħc = α ≃ 1/137 ≪ 1). In addition, the magnetic moment of the proton is very small. However, the considerable accuracy of spectroscopic experiments makes it possible to observe effects that cannot be explained in terms of the Hamiltonian (A-1). Therefore, we shall take into account the corrections we have just mentioned by writing the complete hydrogen atom Hamiltonian in the form:

H = H₀ + W    (A-3)

where H₀ is given by (A-1) and where W represents all the terms neglected thus far. Since W is much smaller than H₀, it is possible to calculate its effects by using the perturbation theory presented in Chapter XI. This is what we propose to do in this chapter. We shall show that W is responsible for a “fine structure”, as well as for a “hyperfine structure” of the various energy levels calculated in Chapter VII. Furthermore, these structures can be measured experimentally with very great accuracy (the hyperfine structure of the 1s ground state of the hydrogen atom is currently known with 12 significant figures; the ratio between certain atomic frequencies has been measured with 18 digits!). We shall also consider, in this chapter and its complements, the influence of an external static magnetic or electric field on the various levels of the hydrogen atom (the Zeeman effect and the Stark effect). This chapter actually has two goals. On the one hand, we want to use a concrete and realistic case to illustrate the general stationary perturbation theory discussed in the preceding chapter. On the other hand, this study, which bears on one of the most fundamental systems of physics (the hydrogen atom), brings out certain concepts which are basic to atomic physics. For example, § B is devoted to a thorough discussion of various relativistic and magnetic corrections. This chapter, while not indispensable for the study of the last two chapters, presents concepts fundamental to atomic physics.

B. Additional terms in the Hamiltonian

The first problem to be solved obviously consists of finding the expression for W.

B-1. The fine-structure Hamiltonian

B-1-a. The Dirac equation in the weakly relativistic domain

In Chapter IX, we mentioned that the spin appears naturally when we try to establish an equation for the electron which satisfies both the postulates of special relativity and those of quantum mechanics. Such an equation exists: it is the Dirac equation, which makes it possible to account for numerous phenomena (electron spin, the fine structure of hydrogen, etc.) and to predict the existence of positrons. The most rigorous way of obtaining the expression for the relativistic corrections [appearing in the term W of (A-3)] therefore consists of first writing the Dirac equation for an electron placed in the potential V(r) created by the proton (considered to be infinitely heavy and motionless at the coordinate origin). One then looks for its limiting form when the system is weakly relativistic, as is the case for the hydrogen atom. We then recognize that the description of the electron state must include a two-component spinor (cf. Chap. IX, § C-1). The spin operators S_x, S_y, S_z introduced in Chapter IX then appear naturally. Finally, we obtain an expression such as (A-3) for the Hamiltonian H, in which W appears in the form of a power series expansion in v/c which we can evaluate. It is out of the question here to study the Dirac equation, or to establish its form in the weakly relativistic domain. We shall confine ourselves to giving the first terms of the power series expansion in v/c of W and their interpretation:

H = m_e c² + P²/2m_e + V(R) − P⁴/(8 m_e³ c²) + (1/2 m_e² c²)(1/R)(dV(R)/dR) L·S + (ħ²/8 m_e² c²) ∆V(R) + …    (B-1)

We recognize in (B-1) the rest-mass energy m_e c² of the electron (the first term) and the non-relativistic Hamiltonian H₀ (the second and third terms)¹. The following terms are called fine-structure terms.

Comment: Note that it is possible to solve the Dirac equation exactly for an electron placed in a Coulomb potential. We thus obtain the energy levels of the hydrogen atom without having to make a limited power series expansion in v/c of the eigenstates and eigenvalues of H. The “perturbation” point of view we are adopting here is, however, very useful in bringing out the form and physical meaning of the various interactions which exist inside an atom. This will later permit a generalization to the case of many-electron atoms (for which we do not know how to write the equivalent of the Dirac equation).

¹ Expression (B-1) was obtained by assuming the proton to be infinitely heavy. This is why it is the mass m_e of the electron that appears, and not, as in (A-1), the reduced mass μ of the atom. As far as H₀ is concerned, the proton finite mass effect is taken into account by replacing m_e by μ. However, we shall neglect this effect in the subsequent terms of (B-1), which are already corrections. It would, moreover, be difficult to evaluate, since the relativistic description of a system of two interacting particles poses serious problems [it is not sufficient to replace m_e by μ in the last terms of (B-1)].


B-1-b. Interpretation of the various terms of the fine-structure Hamiltonian

α. Variation of the mass with the velocity (W_mv term)

(i) The physical origin

The physical origin of the W_mv term is very simple. If we start with the relativistic expression for the energy of a classical particle of rest-mass m_e and momentum p:

E = c √(p² + m_e² c²)    (B-2)

and perform a limited expansion of E in powers of p/m_e c, we obtain:

E = m_e c² + p²/2m_e − p⁴/(8 m_e³ c²) + …    (B-3)

In addition to the rest-mass energy (m_e c²) and the non-relativistic kinetic energy (p²/2m_e), we find the term −p⁴/8m_e³c², which appears in (B-1). This term represents the first energy correction, due to the relativistic variation of the mass with the velocity.

(ii) Order of magnitude

To evaluate the size of this correction, we shall calculate the order of magnitude of the ratio W_mv/H₀:

W_mv/H₀ ≃ (p⁴/8m_e³c²)/(p²/2m_e) = p²/4m_e²c² = (1/4)(v/c)² ≃ (1/4)(1/137)²    (B-4)

since we have already mentioned that, for the hydrogen atom, v/c ≃ α ≃ 1/137. Since H₀ ≃ 10 eV, we see that W_mv ≃ 10⁻³ eV.

β. Spin-orbit coupling (W_SO term)

(i) The physical origin

The electron moves at a velocity v = p/m_e in the electrostatic field E created by the proton. Special relativity indicates that there then appears, in the electron frame, a magnetic field B′ given by:

B′ = −(1/c²) v × E    (B-5)

to first order in v/c. Since the electron possesses an intrinsic magnetic moment M_S = q S/m_e, it interacts with this field B′. The corresponding interaction energy can be written:

W′ = −M_S · B′    (B-6)

Let us express W′ more explicitly. The electrostatic field E appearing in (B-5) is equal to −(1/q)(dV(r)/dr)(r/r), where V(r) = −e²/r is the electrostatic energy of the electron. From this, we get:

B′ = (1/m_e q c²)(1/r)(dV(r)/dr) p × r    (B-7)

In the corresponding quantum mechanical operator, there appears:

P × R = −L    (B-8)

Finally, we obtain:

W′ = (1/m_e² c²)(1/r)(dV(r)/dr) L·S = (e²/m_e² c²)(1/r³) L·S    (B-9)

Thus we find, to within the factor² 1/2, the spin-orbit term W_SO which appears in (B-1). This term then represents the interaction of the magnetic moment of the electron spin with the magnetic field “seen” by the electron because of its motion in the electrostatic field of the proton.

(ii) Order of magnitude

Since L and S are of the order of ħ, we have:

W_SO ≃ (e² ħ²)/(m_e² c² r³)    (B-10)

Let us compare W_SO with H₀, which is of the order of e²/r:

W_SO/H₀ ≃ (e² ħ²)/(m_e² c² r³) × (r/e²) = ħ²/(m_e² c² r²)    (B-11)

r is of the order of the Bohr radius a₀ = ħ²/m_e e². Consequently:

W_SO/H₀ ≃ (ħ²/m_e² c²)(m_e e²/ħ²)² = (e²/ħc)² = α² ≃ (1/137)²    (B-12)

γ. The Darwin term

(i) The physical origin

In the Dirac equation, the interaction between the electron and the Coulomb field of the nucleus is “local”; it only depends on the value of the field at the electron position r. However, the non-relativistic approximation (the series expansion in v/c) leads, for the two-component spinor which describes the electron state, to an equation in which the interaction between the electron and the field has become non-local. The electron is then affected by all the values taken on by the field in a domain centered at the point r, and whose size is of the order of the Compton wavelength ħ/m_e c of the electron. This is the origin of the correction represented by the Darwin term. To understand this more precisely, assume that the potential energy of the electron, instead of being equal to V(r), is given by an expression of the form:

∫ d³ρ f(ρ) V(r + ρ)    (B-13)

2 It can be shown that the factor 1/2 is due to the fact that the motion of the electron about the proton is not rectilinear. The electron spin therefore rotates with respect to the laboratory reference frame (Thomas precession: see Jackson (7.5) section 11-8, Omnes (16.13) chap. 4 § 2, or Bacry (10.31) Chap. 7 § 5-d).


where f(ρ) is a function whose integral is equal to 1, which only depends on |ρ|, and which takes on significant values only inside a volume of the order of (ħ/m_e c)³, centered at ρ = 0. If we neglect the variation of V(r) over a distance of the order of ħ/m_e c, we can replace V(r + ρ) by V(r) in (B-13) and take V(r) outside the integral, which is then equal to 1. (B-13) reduces, in this case, to V(r). A better approximation consists of replacing, in (B-13), V(r + ρ) by its Taylor series expansion in the neighborhood of ρ = 0. The zeroth-order term gives V(r). The first-order term is zero because of the spherical symmetry of f(ρ). The second-order term involves the second derivatives of the potential energy V(r) at the point r and quadratic functions of the components of ρ, weighted by f(ρ) and integrated over d³ρ. This leads to a result of the order of (ħ/m_e c)² ∆V(r). It is therefore easy to accept the idea that this second-order term should be the Darwin term.

(ii) Order of magnitude

Replacing V(r) by −e²/r, we can write the Darwin term in the form:

W_D = (ħ²/8 m_e² c²) ∆V(R) = (π e² ħ²/2 m_e² c²) δ(R)    (B-14)

(we have used the expression for the Laplacian of 1/r given by formula (61) of Appendix II). When we take the average value of (B-14) in an atomic state, we find a contribution equal to:

(π e² ħ²/2 m_e² c²) |ψ(0)|²

where ψ(0) is the value of the wave function at the origin. The Darwin term therefore affects only the s electrons, which are the only ones for which ψ(0) ≠ 0 (cf. Chap. VII, § C-4-c). The order of magnitude of |ψ(0)|² can be obtained by taking the integral of the square of the modulus of the wave function over a volume of the order of a₀³ (where a₀ is the Bohr radius) to be equal to 1. Thus we obtain:

|ψ(0)|² ≃ 1/a₀³ = m_e³ e⁶/ħ⁶    (B-15)

which gives the order of magnitude of the Darwin term:

W_D ≃ (e² ħ²/m_e² c²) |ψ(0)|² ≃ m_e e⁸/ħ⁴ c² = m_e c² α⁴    (B-16)

Since H₀ ≃ m_e c² α², we again see that:

W_D/H₀ ≃ α² ≃ (1/137)²    (B-17)

Thus, all the fine structure terms are about 10⁴ times smaller than the non-relativistic Hamiltonian of Chapter VII.
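As a rough numerical illustration (ours, not from the text), the estimates (B-4), (B-12) and (B-17) can be checked directly: every fine-structure correction is of order m_e c² α⁴, i.e. smaller than H₀ ≃ m_e c² α²/2 by a factor of order α².

```python
# Order-of-magnitude check of the fine-structure corrections (illustrative sketch).
alpha = 1 / 137.036           # fine-structure constant
me_c2_eV = 0.511e6            # electron rest-mass energy in eV

H0 = 0.5 * me_c2_eV * alpha**2          # binding-energy scale (~ 13.6 eV)
W_fine = me_c2_eV * alpha**4            # scale of W_mv, W_SO, W_D (~ 1.4e-3 eV)

print(f"H0     ~ {H0:.1f} eV")
print(f"W_fine ~ {W_fine:.2e} eV   (W_fine / H0 ~ {W_fine / H0:.1e}, i.e. of order alpha^2)")
```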

B-2. Magnetic interactions related to proton spin: the hyperfine Hamiltonian

B-2-a. Proton spin and magnetic moment

Thus far, we have considered the proton to be a physical point of mass M_p and charge q_p = −q. Actually, the proton, like the electron, is a spin 1/2 particle. We shall denote by I the corresponding spin observable. With the spin I of the proton is associated a magnetic moment M_I. However, the gyromagnetic ratio is different from that of the electron:

M_I = g_p μ_n I/ħ    (B-18)

where μ_n is the nuclear Bohr magneton:

μ_n = qħ/2M_p    (B-19)

and the factor g_p, for the proton, is equal to g_p ≃ 5.585. Because of the presence of M_p (the proton mass) in the denominator of (B-19), μ_n is close to 2 000 times smaller than the Bohr magneton μ_B (recall that μ_B = qħ/2m_e). Although the angular momenta of the proton and the electron are the same, nuclear magnetism, because of the mass difference, is much less important than electronic magnetism. The magnetic interactions due to the proton spin I are therefore very weak.

B-2-b. The magnetic hyperfine Hamiltonian

The electron moves, therefore, not only in the electrostatic field of the proton, but also in the magnetic field created by M_I. When we introduce the corresponding vector potential into the Schrödinger equation³, we find that we must add to the Hamiltonian (B-1) an additional series of terms for which the expression is (cf. Complement AXII):

W_hf = −(μ₀/4π) { (q/m_e R³) L·M_I + (1/R³) [3(M_I·n)(M_S·n) − M_I·M_S] + (8π/3) M_I·M_S δ(R) }    (B-20)

M_S = qS/m_e is the spin magnetic moment of the electron, and n is the unit vector of the straight line joining the proton to the electron (Fig. 1). We shall see that W_hf introduces energy shifts which are small compared to those created by the fine-structure terms W_f. This is why W_hf is called the “hyperfine structure Hamiltonian”.

B-2-c. Interpretation of the various terms of W_hf

The first term of represents the interaction of the nuclear magnetic moment 3 M with the magnetic field ( 0 4 ) L created at the proton by the rotation of the electronic charge. The second term represents the dipole-dipole interaction between the electronic and nuclear magnetic moments: the interaction of the magnetic moment of the electron spin with the magnetic field created by M (cf. Complement BXI ) or vice versa. 3 Since the hyperfine interactions are very small corrective terms, they can be found using the nonrelativistic Schrödinger equation.







Figure 1: Relative disposition of the magnetic moments M_I and M_S of the proton and the electron; n is the unit vector on the line joining the two particles.


Finally, the last term, also called Fermi’s “contact term”, arises from the singularity at = 0 of the field created by the magnetic moment of the proton. In reality, the proton is not a point. It can be shown (cf. Complement AXII ) that the magnetic field inside the proton does not have the same form as the one created outside by M (and which enters into the dipole-dipole interaction). The contact term describes the interaction of the magnetic moment of the electron spin with the magnetic field inside the proton (the “delta” function expresses the fact that this contact term exists, as its name indicates, only when the wave functions of the electron and proton overlap). B-2-d.

Orders of magnitude

It can easily be shown that the order of magnitude of the first two terms of is: 2 2

~

0 3

4

2 2

=

1

~

2

(B-21)

3

By using (B-10), we see that these terms are about 2 000 times smaller than . As for the last term of (B-20), it is also 2 000 times smaller than the Darwin term, which also contains a (R) function. C.

The fine structure of the

C-1.

= 2 level

Statement of the problem

C-1-a.

Degeneracy of the

= 2 level

We saw in Chapter VII that the energy of the hydrogen atom depends only on the quantum number . The 2 ( = 2 = 0) and 2 ( = 2 = 1) states therefore have the same energy, equal to: 4

=

1 8

2 2

If the spins are ignored, the 2 subshell is composed of a single state, and the 2 subshell of three distinct states which differ by their eigenvalue ~ of the component of the orbital angular momentum L ( = 1 0 1). Because of the existence of electron and proton spins, the degeneracy of the = 2 level is higher than the value calculated in 1238

C. THE FINE STRUCTURE OF THE

= 2 LEVEL

Chapter VII. The components and of the two spins can each take on two values: = 1 2, = 1 2. One possible orthonormal basis in the = 2 level is given by the kets: =2;

=0;

=0;

=

1 ; 2

=

1 2

(C-1)

(2 subshell, of dimension 4) and: =2;

=1;

=

1 0 +1 ;

=

1 ; 2

=

1 2

(C-2)

(2 subshell, of dimension 12). The = 2 shell then has a total degeneracy equal to 16. According to the results of Chapter XI (§ C), in order to calculate the effect of a perturbation on the = 2 level, it is necessary to diagonalize the 16 16 matrix representing the restriction of to this level. The eigenvalues of this matrix are the first order corrections to the energy, and the corresponding eigenstates are the eigenstates of the Hamiltonian to zeroth order. C-1-b.

The perturbation Hamiltonian

In all of this section, we shall assume that no external field is applied to the atom. The difference between the exact Hamiltonian and the Hamiltonian 0 of Chapter VII (§ C) contains fine structure terms, indicated in § B-1 above: =

+

+

and hyperfine structure terms =

(C-3) , introduced in § B-2. We thus have:

+

(C-4)

Since is close to 2 000 times larger than (cf. § B-2-d), we must obviously begin by studying the effect of , before considering that of , on the = 2 level. We shall see that the = 16 degeneracy of this level is partially removed by . The structure which appears in this way is called the “fine structure”. may then remove the remaining degeneracy of the fine structure levels and cause a “hyperfine structure” to appear inside each of these levels. In this section (§ C), we shall confine ourselves to the study of the fine structure of the = 2 level. The calculations can easily be generalized to other levels. C-2. C-2-a.

Matrix representation of the fine-structure Hamiltonian

inside the

= 2 level

General properties

The properties of , as we shall see, enable us to show that the 16 16 matrix which represents it in the = 2 level can be broken down into a series of square submatrices of smaller dimensions. This will considerably simplify the determination of the eigenvalues and eigenvectors of this matrix. 1239

CHAPTER XII

.

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

does not act on the spin variables of the proton

We see from (B-1) that the fine structure terms do not depend on I. It follows that the proton spin can be ignored in the study of the fine structure (afterwards, we multiply by 2 all the degrees of degeneracy obtained). The dimension of the matrix to be diagonalized therefore falls from 16 to 8. .

does not connect the 2 and 2 subshells

Let us first prove that L2 commutes with . The operator L2 commutes with 2 the various components of L, with (L acts only on the angular variables), with P2 [cf. formula (A-16) of Chapter VII], and with S (L2 does not act on the spin variables). L2 therefore commutes with (which is proportional to P4 ), with (which depends only on , L, S), and with (which depends only on ). The 2 and 2 states are eigenstates of L2 with different eigenvalues (0 and 2~2 ). Therefore, , which commutes with L2 , has no matrix elements between a 2 state and a 2 state. The 8 8 matrix representing inside the = 2 level can be broken down, consequently, into a 2 2 matrix relative to the 2 state and a 6 6 matrix relative to the 2 state: 2

2 0

2 (

)

=2

= 2

0

Comment:

The preceding property can also be considered to be a consequence of the fact that is even. Under a reflection, R changes to R ( = R remains unchanged), P to P, L to L, and S to S. It is then easy to see that remains invariant. therefore has no matrix elements between the 2 and 2 states, which are of opposite parity (cf. Complement FII ). C-2-b.

Matrix representation of

in the 2 subshell

The dimension 2 of the 2 subspace is the result of the two possible values = 1 2 of (since we are ignoring for the moment). and do not depend on S. The matrices which represent these two operators in the 2 subspace are therefore multiples of the unit matrix, with proportionality coefficients equal, respectively, to the purely orbital matrix elements: =2; 1240

=0;

=0

P4 8

3 2

=2;

=0;

=0

C. THE FINE STRUCTURE OF THE

= 2 LEVEL

and: =2;

=0;

~2

=0

8

3 2

∆ ( )

=2;

=0;

=0

Since we know the eigenfunctions of 0 , the calculation of these matrix elements presents no theoretical difficulty. We find (cf. Complement BXII ): 2

=

2

=

13 128 1 16

2 4

(C-5)

2 4

(C-6)

Finally, calculation of the matrix elements of involves “angular” matrix elements of the form = 0 =0 =0 = 0 , which are zero because of the value = 0 of the quantum number . Therefore: 2

=0

(C-7)

Thus, under the effect of the fine structure terms, the 2 subshell is shifted as a whole with respect to the position calculated in Chapter VII by an amount equal to 2 4 5 128. C-2-c.

Matrix representation of

.

and

in the 2 subshell

terms

The and terms commute with the various components of L, since L acts only on the angular variables and commutes with and P2 (which depends on these 2 variables only through L ; cf. chapter VII). L therefore commutes with and . Consequently, and are scalar operators with respect to the orbital variables (cf. Complement BVI , § 5-b). Since and do not act on the spin variables, it follows that the matrices which represent and inside the 2 subspace are multiples of the unit matrix. The calculation of the proportionality coefficient is given in Complement BXII and leads to: 2

=

2

=0

7 384

2 4

(C-8) (C-9)

The result (C-9) is due to the fact that have a non-zero average value only in an the origin). .

is proportional to (R) and can therefore state (for 1, the wave function is zero at

term We must calculate the various matrix elements: =2;

=1;

=

1 ; 2

;

( )L S

=2;

=1;

=

1 ; 2

; (C-10) 1241

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

with: 2

( )=

1

2 2

2

(C-11)

3

If we use the r representation, we can separate the radial part of matrix element (C-10) from the angular and spin parts. Thus we obtain: =1;

2

where

=

1 ; 2

=

L S

2

2 2

1 0

3

21 (

)

2

Since we know the radial function (cf. Complement BXII ): 2

=

=1;

=

1 ; 2

;

(C-12)

is a number, equal to the radial integral:

2

2 2

;

1 48~2

2

d 21 (

(C-13) ) of the 2 state, we can calculate

2 4

2

. We find

(C-14)

The radial variables have therefore disappeared. According to (C-12), the problem is reduced to the diagonalization of the operator 2 L S, which acts only on the angular and spin variables. To represent the operator 2 L S by a matrix, several different bases can be chosen: first of all, the basis: = 1; =

1 ; 2

;

(C-15)

which we have used thus far and which is constructed from common eigenstates of L2 , S2 , , ; or, introducing the total angular momentum: J=L+S

(C-16)

the basis: = 1; =

1 ; ; 2

(C-17)

constructed from the eigenstates common to L2 , S2 , J2 , . According to the results of chapter X, since = 1 and = 1 2, can take on two values: = 1 + 1 2 = 3 2 and = 1 1 2 = 1 2. Furthermore, we know how to go from one basis to the other, thanks to the Clebsch-Gordan coefficients [formulas (36) of Complement AX ]. We shall now show that the second basis (C-17) is better adapted than the first one to the problem which interests us here, since 2 L S is diagonal in the basis (C-17). To see this, we square both sides of (C-16). We find (L and S commute): J2 = (L + S)2 = L2 + S2 + 2 L S 1242

(C-18)

C. THE FINE STRUCTURE OF THE

= 2 LEVEL

which gives: 2

L S=

1 2

J2

2

L2

S2

(C-19)

Each of the basis vectors (C-17) is an eigenstate of L2 , S2 , J2 ; we thus have: 2

L S

= 1; =

1 ; ; 2

=

1 2

2

~2

( + 1)

2

3 4

= 1; =

1 ; ; 2 (C-20)

We see from (C-20) that the eigenvalues of ; they are equal to: 1 2 for

2

3 4

2

3 2 ~ = 4

2

~2 =

1 48

2

L S depend only on

2 4

and not on

(C-21)

= 1 2, and: 1 2

2

15 4

2

3 2 1 ~ =+ 4 2

2

~2 =

1 96

2 4

(C-22)

for

= 3 2. The six-fold degeneracy of the 2 level is therefore partially removed by . We obtain a four-fold degenerate level corresponding to = 3 2, and a two-fold degenerate level corresponding to = 1 2. The (2 + 1)-fold degeneracy of each state is an essential degeneracy related to the rotation invariance of .
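The diagonalization leading to (C-21) and (C-22) is easy to reproduce numerically. The sketch below (ours, for illustration only) builds the operator 2 L·S in the uncoupled basis {|m_l; m_s⟩} for l = 1, s = 1/2 and checks that its eigenvalues are ħ² (four times, j = 3/2) and −2ħ² (twice, j = 1/2).

```python
import numpy as np

hbar = 1.0

def angular_momentum(j):
    """Return (Jx, Jy, Jz) matrices in the basis |j, m>, m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1)
    jp = np.zeros((len(m), len(m)), dtype=complex)          # raising operator J+
    for k in range(1, len(m)):
        jp[k - 1, k] = hbar * np.sqrt(j*(j + 1) - m[k]*(m[k] + 1))
    jm = jp.conj().T
    return (jp + jm) / 2, (jp - jm) / (2j), hbar * np.diag(m)

Lx, Ly, Lz = angular_momentum(1)      # l = 1 (2p orbital part)
Sx, Sy, Sz = angular_momentum(0.5)    # s = 1/2

# 2 L.S acting in the 6-dimensional product space |m_l> x |m_s>
two_LS = 2 * sum(np.kron(L, S) for L, S in ((Lx, Sx), (Ly, Sy), (Lz, Sz)))
print(np.round(np.linalg.eigvalsh(two_LS), 6))
# -> [-2. -2.  1.  1.  1.  1.] in units of hbar^2: j = 1/2 (twice) and j = 3/2 (four times)
```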

Comments:

( ) In the 2 subspace ( = 0 1 2.

= 1 2),

can take on a single value,

= 0+1 2 =

( ) In the 2 subspace, and are represented by multiples of the unit matrix. This property remains valid in any basis since the unit matrix is invariant under a change of basis. The choice of basis (C-17), required by the term, is therefore also adapted to the and terms. C-3. C-3-a.

Results: the fine structure of the

= 2 level

Spectroscopic notation

In addition to the quantum numbers , (and ), the preceding discussion introduced the quantum number on which the energy correction due to the spin-orbit coupling term depends. For the 2 level, = 1 2; for the 2 level, = 1 2 or = 3 2. The level associated with a set of values, , , is generally denoted by adding an index to the symbol representing the ( ) subshell in spectroscopic notation (cf. Chap. VII, § C-4-b): (C-23) where stands for the letter for = 0, for = 1, = 2 level of the hydrogen atom gives rise to the 2

for = 2, for = 3... Thus, the 2 1 2 and 2 3 2 levels.

1 2,

1243

CHAPTER XII

C-3-b.

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Positions of the 2

1 2,

2

1 2

and 2

3 2

levels

By regrouping the results of § 2, we can now calculate the positions of the 2 1 2 , 2 1 2 and 2 3 2 levels with respect to the “unperturbed” energy of the = 2 level 2 2 calculated in Chapter VII and equal to 8. According to the results of § 2.b, the 2 1 2 level is lowered by a quantity equal to: 5 2 4 128 According to the results of § 2.c, the 2 7 384

1 48

2 4

5 128

=

(C-24) 1 2

level is lowered by a quantity equal to:

2 4

(C-25)

Thus we see that the 2 1 2 and 2 1 2 levels have the same energy. According to the theory presented here, this degeneracy must be considered to be accidental, as opposed to the essential (2 + 1)-fold degeneracy of each level. Finally, the 2 3 2 level is lowered by a quantity: 7 1 + 384 96

2 4

1 128

=

2 4

(C-26)

The preceding results are shown in Figure 2. Comments:

( ) Only the spin-orbit coupling is responsible for the separation between the 2 and 2 3 2 levels, since and shift the entire 2 level as a whole.

1 2

( ) The hydrogen atom can go from the 2 state to the 1 state by emitting a Lyman photon ( = 1 216 ˚ A). The material presented in this chapter shows that, because of the spin-orbit coupling, the Lyman line actually contains two neighboring lines4 , 2 1 2 1 1 2 and 2 3 2 1 1 2 , separated by an energy difference equal to: 4 1 2 4 2 4 = 128 32 When they are observed with a sufficient resolution, the lines of the hydrogen spectrum therefore present a “fine structure”. ( ) We see in Figure 2 that the two levels with the same have the same energy. This result is not merely true to first order in : it remains valid to all orders. The exact solution of the Dirac equation gives, for the energy of a level characterized by the quantum numbers , the value: =

2

1+

2

1 + 2

2

( +1

2)2

2

1 2

(C-27)

4 In the ground state, = 0 and = 1 2, so can take on a single value = 1 2. therefore does not remove the degeneracy of the 1 state, and there is only one fine structure level, the 1 1 2 level. This is a special case, since the ground state is the only one for which is necessarily zero. This is why we have chosen here to study the excited = 2 level.

1244

C. THE FINE STRUCTURE OF THE

= 2 LEVEL

n=2 1 128

mec2α4 2p3/2

5 128

mec2α4

2s1/2

2p1/2

Figure 2: Fine structure of the = 2 level of the hydrogen atom. Under the effect of the fine structure Hamiltonian , the = 2 level splits into three fine structure levels, written 2 1 2 , 2 1 2 and 2 3 2 . We have indicated the algebraic values of the shifts, calculated to first order in . The shifts are the same for the 2 1 2 and 2 1 2 levels (a result which remains valid, moreover, to all orders in ). When we take into account the quantum mechanical nature of the electromagnetic field, we find that the degeneracy between the 2 1 2 and 2 1 2 levels is removed (the Lamb shift; see Figure 4).

We see that the energy depends only on

and , and not on .

If we make a limited expansion of formula (C-27) in powers of , we obtain:

=

2

1 2

2 2

2

1 2

2

4

+ 1 2)

3 4

4

+

(C-28)

The first term is the rest-mass-energy of the electron. The second term follows from the theory of Chapter VII. The third term gives the correction to first order in calculated in this chapter. ( ) Even in the absence of an external field and incident photons, a fluctuating electromagnetic field must be considered to exist in space (cf. Complement KV , § 3-d- ). This phenomenon is related to the quantum mechanical nature of the electromagnetic field, which we have not taken into consideration here. The coupling of the atom with these fluctuations of the electromagnetic field removes the degeneracy between the 2 1 2 and 2 1 2 levels. The 2 1 2 level is raised with respect to the 2 1 2 level by a quantity called the “Lamb shift”, which is of the order of 1 060 MHz (Fig. 4, page 1250). The theoretical and experimental study of this phenomenon, which was discovered in 1949, has been the object of a great deal of research, leading to the development of modern quantum electrodynamics. 1245
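The comparison between the exact Dirac result (C-27) and its expansion (C-28) is easy to carry out numerically. The short sketch below (ours) evaluates (C-27) for the n = 2 levels; it shows that the 2s1/2 and 2p1/2 energies coincide while 2p3/2 lies higher, the splitting agreeing with m_e c² α⁴/32 to leading order.

```python
import numpy as np

alpha = 1 / 137.036
me_c2 = 0.510999e6          # electron rest-mass energy, eV

def dirac_energy(n, j):
    """Exact Dirac energy (C-27), rest mass included, for quantum numbers n and j."""
    delta = n - (j + 0.5) + np.sqrt((j + 0.5)**2 - alpha**2)
    return me_c2 / np.sqrt(1 + (alpha / delta)**2)

E2s12 = dirac_energy(2, 0.5)   # 2s1/2
E2p12 = dirac_energy(2, 0.5)   # 2p1/2 (same j, hence the same energy)
E2p32 = dirac_energy(2, 1.5)   # 2p3/2

print("2p3/2 - 2p1/2     :", E2p32 - E2p12, "eV")
print("m_e c^2 alpha^4/32:", me_c2 * alpha**4 / 32, "eV")
```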

CHAPTER XII

D.

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

The hyperfine structure of the

= 1 level

It would now seem logical to study the effect of inside the fine structure levels 2 1 2 , 2 1 2 and 2 3 2 , in order to see if the interactions related to the proton spin I cause a hyperfine structure to appear in each of these levels. However, since does not remove the degeneracy of the ground state 1 , it is simpler to study the effect of on this state. The results obtained in this special case can easily be generalized to the 2 1 2 , 2 1 2 and 2 3 2 levels. D-1.

Statement of the problem

D-1-a.

The degeneracy of the 1 level

For the 1 level, there is no orbital degeneracy ( = 0). On the other hand, the and components of S and I can still take on two values: = 1 2 and = 1 2. The degeneracy of the 1 level is therefore equal to 4, and a possible basis in this level is given by the vectors: = 1; = 0;

D-1-b.

= 0;

=

1 ; 2

=

1 2

(D-1)

The 1 level has no fine structure

We shall show that the term does not remove the degeneracy of the 1 level. The and terms do not act on and , and are represented in the 1 subspace by multiples of the unit matrix. We find (cf. Complement BXII ): 1

=

1

=

5 8 1 2

2 4

(D-2)

2 4

(D-3)

Finally, calculation of the matrix elements of the term involves the “angular” matrix elements = 0 =0 =0 = 0 , which are obviously zero ( = 0); therefore: 1

=0

(D-4)

In conclusion, 5 1 + 8 2

2 4

merely shifts the 1 level as a whole by a quantity equal to: =

1 8

2 4

(D-5)

without splitting the level. This result could have been foreseen: since = 0 and = 1 2, can take on a single value, = 1 2, and the 1 level therefore gives rise to only one fine structure level, 2 1 2 . Since the Hamiltonian does not split the 1 level, we can now consider the effect of the term. To do so, we must first calculate the matrix which represents in the 1 level. 1246

D. THE HYPERFINE STRUCTURE OF THE

D-2.

Matrix representation of

D-2-a.

= 1 LEVEL

in the 1 level

Terms other than the contact term

Let us show that the first two terms of [formula (B-20)] make no contribution. Calculation of the contribution from the first term, 4 0 3 L M , leads to the “angular” matrix elements = 0; =0 L =0 = 0 , which are obviously zero ( = 0). Similarly, it can be shown (cf. Complement BXI , § 3) that the matrix elements of the second term (the dipole-dipole interaction) are zero because of the spherical symmetry of the 1 state. D-2-b.

The contact term

The matrix elements of the last term of (B-20), that is, of the contact term, are of the form: = 1; = 0;

= 0;

2

;

0

3

M

M

(R)

= 1; = 0;

= 0;

; (D-6)

If we go into the r representation, we can separate the orbital and spin parts of this matrix element and put it in the form: ; where

;

I S

(D-7)

is a number given by: 2

= = =

3

0 2

3

0

= 1; = 0;

2

2

1 4

10 (0)

=0

2 4

1+

= 1; = 0;

=0

2 3

4 3

(R)

1 ~2

(D-8)

We have used the expressions relating M and M to S and I [cf. B-18], as well as the expression for the radial function 10 ( ) given in § C-4-c of Chapter VII5 . The orbital variables have therefore completely disappeared, and we are left with a problem of two spin 1/2’s, I and S, coupled by an interaction of the form: I S where

(D-9)

is a constant.

5 The factor (1 + ) 3 in (D-8) arises from the fact that it is the reduced mass which enters into 10 (0). It so happens that, for the contact term, it is correct to take the nuclear finite mass effect into account in this way.

1247

CHAPTER XII

D-2-c.

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Eigenstates and eigenvalues of the contact term

To represent the operator =

1 1 ; = ; 2 2

I S, we have thus far considered only the basis:

;

(D-10)

formed by the eigenvectors common to S2 , I2 , total angular momentum6 :

,

. We can also, by introducing the

F=S+I

(D-11)

use the basis: =

1 1 ; = ; ; 2 2

(D-12)

formed by the eigenstates common to S2 , I2 , F2 , . Since = = 1 2, can take on only the two values = 0 and = 1. We can easily pass from one basis to the other by means of (B-22) and (B-23) of Chapter X. The basis is better adapted than the basis to the study of the operator I S, as this operator is represented in the basis by a diagonal matrix (for the sake of simplicity, we do not explicitly write = 1 2 and = 1 2). This is true, since we obtain, from (D-11): I S=

2

F2

I2

S2

(D-13)

It follows that the states =

I S

are eigenstates of

~2 [ ( 2

+ 1)

( + 1)

I S:

( + 1)]

We see from (D-14) that the eigenvalues depend only on equal to: ~2 2 2 for

3 4

(D-14) , and not on

. They are

3 = 4

~2 4

(D-15)

3 = 4

3 ~2 4

(D-16)

= 1, and: ~2 0 2

for

3 4

= 0.

6 The total angular momentum is actually F = L + S + I, that is, F = J + I. However, for the ground state, the orbital angular momentum is zero, so F reduces to (D-11).

1248

D. THE HYPERFINE STRUCTURE OF THE

= 1 LEVEL

The four-fold degeneracy of the 1 level is therefore partially removed by . We obtain a three-fold degenerate = 1 level and a non-degenerate = 0 level. The (2 + 1)-fold degeneracy of the = 1 level is essential and is related to the invariance of under a rotation of the total system. D-3. D-3-a.

The hyperfine structure of the 1 level Positions of the levels

2 4 Under the effect of , the energy of the 1 level is lowered by a quantity 8 2 2 with respect to the value 2 calculated in Chapter VII. then splits the 1 1 2 level into two hyperfine levels, separated by an energy ~2 (Fig. 3). ~2 is often called the “hyperfine structure of the ground state”.

1s

1 8

mec2α4

F=1 +

1

𝒜ħ2

4

1s1/2 𝒜ħ2 –

3

𝒜ħ2

4

F=0

Figure 3: The hyperfine structure of the = 1 level of the hydrogen atom. Under the 2 4 effect of , the = 1 level undergoes a global shift equal to 8; can take on a single value, = 1 2. When the hyperfine coupling is taken into account, the 1 1 2 level splits into two hyperfine levels, = 1 and = 0. The hyperfine transition = 1 = 0 (the 21 cm line studied in radioastronomy) has a frequency which is known experimentally to twelve significant figures (thanks to the hydrogen maser). 1249

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

F=2 2p3/2

F=1

ΔE

F=1 2s1/2 F=0 F=1 2p1/2

F=0

Figure 4: The hyperfine structure of the = 2 level of the hydrogen atom. The separation S between the two levels 2 1 2 and 2 1 2 is the Lamb shift, which is about ten times smaller than the fine structure splitting ∆ separating the two levels 2 1 2 and 2 3 2 (S 1 057 8 MHz: ∆ 10 969 1 MHz). When the hyperfine coupling is taken into account, each level splits into two hyperfine sublevels (the corresponding value of the quantum number is indicated on the right-hand side of the figure). The hyperfine splittings are equal to 23.7 MHz for the 2 3 2 level, 177.56 MHz for the 2 1 2 level and 59.19 MHz for the 2 1 2 level (for the sake of clarity, the figure is not drawn to scale).

Comment:

It could be found, similarly, that splits each of the fine structure levels 2 1 2 , 2 1 2 and 2 3 2 into a series of hyperfine levels, corresponding to all the values of separated by one unit and included between + and . For the 2 1 2 and 2 1 2 levels, we have = 1 2. Therefore, takes on the two values =1 and = 0. For the 2 3 2 level, = 3 2, and, consequently, we have = 2 and = 1 (cf. Fig. 4).

D-3-b.

Importance of the hyperfine structure of the 1 level

The hyperfine structure of the ground state of the hydrogen atom is currently the physical quantity which is known experimentally to the highest number of significant 1250

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

figures. Expressed in Hz, it is equal to7 : ~ 2

= 1 420 405 751 767

0 001 Hz

(D-17)

Such a high degree of experimental accuracy was made possible by the development of the “hydrogen maser” in 1963. The principle of such a device is, very schematically, the following: hydrogen atoms, previously sorted (by a magnetic selection of the SternGerlach type) so as to choose those in the upper hyperfine level = 1, are stored in a glass cell (the arrangement is similar to the one shown in Figure 6 of Complement FIV ). This constitutes an amplifying medium for the hyperfine frequency [ ( = 1) ( = 0)] . If the cell is placed in a cavity tuned to the hyperfine frequency, and if the losses of the cavity are small enough for the gain to be greater than the losses, the system becomes unstable and can oscillate: we obtain an “atomic oscillator” (a maser). The frequency of the oscillator is very stable and of great spectral purity. Its measurement gives directly the value of the hyperfine splitting, expressed in Hz. Note, finally, that hydrogen atoms in interstellar space are detected in radioastronomy by the radiation they emit spontaneously when they fall from the = 1 hyperfine level to the = 0 hyperfine level of the ground state (this transition corresponds to a wave length of 21 cm). Most of the information we possess about interstellar hydrogen clouds is supplied by the study of this 21 cm line. E.

The Zeeman effect of the 1 ground state hyperfine structure

E-1.

Statement of the problem

E-1-a.

The Zeeman Hamiltonian

We now assume the atom to be placed in a static uniform magnetic field B0 parallel to . This field interacts with the various magnetic moments present in the atom: the orbital and spin magnetic moments of the electron, M = L and M = S, and 2 the magnetic moment of the nucleus, M = I [cf. expression (B-18)]. 2 The Zeeman Hamiltonian which describes the interaction energy of the atom with the field B0 can then be written: =

B0 (M + M + M )

=

0

where

0

(

+2

)+

(E-1)

(the Larmor angular frequency in the field B0 ) and 0

= =

Since 0

2 2

0 0

are defined by: (E-2) (E-3)

, we clearly have: (E-4)

7 The

calculations presented in this chapter are obviously completely incapable of predicting all these significant figures. Moreover, even the most advanced theories cannot, at the present time, explain more than the first five or six figures of (D-17).

1251

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Comment:

Rigorously, contains another term, which is quadratic in 0 (the diamagnetic term). This term does not act on the electronic and nuclear spin variables and merely shifts the 1 level as a whole, without modifying its Zeeman diagram, which we shall study later. Moreover, it is much smaller than (E-1). Recall that a detailed study of the effect of the diamagnetic term is presented in Complement DVII . E-1-b.

The perturbation “seen” by the 1 level

In this section, we propose to study the effect of on the 1 ground state of the hydrogen atom (the case of the = 2 level is slightly more complicated since, in a zero magnetic field, this level possesses both a fine and a hyperfine structure, while the = 1 level has only a hyperfine structure; the principle of the calculation is nevertheless the same). Even with the strongest magnetic fields that can be produced in the laboratory, is much smaller than the distance between the 1 level and the other levels; consequently, its effect can be treated by perturbation theory. The effect of a magnetic field on an atomic energy level is called the “Zeeman effect”. When 0 is plotted on the -axis and the energies of the various sublevels it creates are plotted on the -axis, a Zeeman diagram is obtained. If 0 is sufficiently strong, the Zeeman Hamiltonian can be of the same order of magnitude as the hyperfine Hamiltonian8 , or even larger. On the other hand, if . Therefore, in general it is not possible to establish the 0 is very weak, relative importance of and . To obtain the energies of the various sublevels, ( + ) must be diagonalized inside the = 1 level. We showed in § D-2 that the restriction of to the = 1 level could be put in the form I S. Using expression (E-1) for , we see that we must also calculate matrix elements of the form: = 1; = 0;

= 0;

;

0(

+2

)+

= 1; = 0;

= 0;

; (E-5)

The contribution of 0 is zero, since and are zero. Since 2 0 + acts only on the spin variables, we can, for this term, separate the orbital part of the matrix element: = 1; = 0;

=0

= 1; = 0;

=0 =1

from the spin part. In conclusion, therefore, we must, ignoring the quantum numbers nalize the operator: I S+2

0

+

(E-6)

, diago-

(E-7)

which acts only on the spin degrees of freedom. To do so, we can use either the basis or the basis. According to (E-4), the last term of (E-7) is much smaller than the second one. To simplify the discussion, we shall neglect the term from now on (it would be possible, 8 Recall

1252

that

shifts the 1 level as a whole: it therefore also shifts the Zeeman diagram as a whole.

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

however, to take it into account9 ). The perturbation “seen” by the 1 level can therefore be written, finally: I S+2 E-1-c.

(E-8)

0

Different domains of field strength

By varying 0 , we can continuously modify the magnitude of the Zeeman term 2 0 . We shall consider three different field strengths, determined by the respective orders of magnitude of the hyperfine term and the Zeeman term: () ~

0

~2 : weak fields

( ) ~

0

~2 : strong fields

(

) ~

~2 : intermediate fields

0

We shall later see that it is possible to diagonalize operator (E-8) exactly. However, in order to give a particularly simple example of perturbation theory, we shall use a slightly different method in cases ( ) and ( ). In case ( ), we shall treat 2 0 like a perturbation with respect to I S. On the other hand, in case ( ), we shall treat I S like a perturbation with respect to 2 0 . The exact diagonalization of the set of two operators, indispensable in case ( ), will allow us to check the preceding results. E-2.

The weak-field Zeeman effect

The eigenstates and eigenvalues of I S have already been determined (§ D-2). We therefore obtain two different levels: the three-fold degenerate level, = 1;

=

1 0 +1

of energy ~2 4, and the non-degenerate level, = 0; = 0 , of energy 3 ~2 4. Since we are treating 2 0 like a perturbation with respect to I S, we must now separately diagonalize the two matrices representing 2 0 in the two levels, = 1 and = 0, corresponding to two distinct eigenvalues of I S. E-2-a.

Matrix representation of

in the

basis

Since we shall need it later, we shall begin by writing the matrix which represents in the basis (for the problem which concerns us here, it would suffice to write the two submatrices corresponding to the = 1 and = 0 subspaces). By using formulas (B-22) and (B-23) of Chapter X, we easily obtain: = 1;

=1

= 1;

=0

= 1;

=

= 0;

=0

~ 2 ~ = 2 =

~ 2

1 = =

~ 2

= 1;

=1

= 0;

=0 (E-9)

= 1; = 1;

=

1

=0

9 This is what we do in Complement C XII , in which we study the hydrogen-like systems (muonium, positronium) for which it is not possible to neglect the magnetic moment of one of the two particles.

1253

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

which gives the following expression for the matrix representing in the (the basis vectors are arranged in the order 1 1 , 1 0 , 1 1 , 0 0 ):

(

1 0 0 0

} )= 2

0 0 0 1

0 0 -1 0

basis

0 1 0 0

(E-10)

Comment: It is instructive to compare the preceding matrix with the one which represents same basis:

(

1 0 0 0

)=}

0 0 0 0

0 0 -1 0

0 0 0 0

in the

(E-11)

We see, first of all, that the two matrices are not proportional: the ( while the ( ) one is not.

) matrix is diagonal,

However, if we confine ourselves to the restrictions of the two matrices in the = 1 subspace [limited by the darker line in expressions (E-10) and (E-11)], we see that they are proportional. Denoting by 1 the projector onto the = 1 subspace (cf. Complement BII ), we have: 1

1

=

1 2

1

(E-12)

1

It would be simple to show that the same relation exists between hand, and and , on the other.

and

on the one

We have thus found a special case of the Wigner-Eckart theorem (Complement DX ), according to which, in a given eigensubspace of the total angular momentum, all the matrices which represent vector operators are proportional. It is clear from this example that this proportionality exists only for the restrictions of operators to a given eigensubspace of the total angular momentum, and not for the operators themselves. Moreover, the proportionality coefficient 1/2 which appears in (E-12) can be obtained immediately from the projection theorem. According to formula (30) of Complement EX , this coefficient is equal to: S F F2 Since E-2-b.

=

=1

=

=1

(

+ 1) + ( + 1) 2 ( + 1)

( + 1)

= 1 2, (E-13) is indeed equal to 1/2.

Weak-field eigenstates and eigenvalues

According to the results of § a, the matrix which represents 2 level can be written: ~

0

0 0 1254

(E-13)

0 0 0

0 0 ~

0

in the

=1

(E-14) 0

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

In the = 0 level, this matrix reduces to a number, equal to 0. Since these two matrices are diagonal, we can immediately find the weak-field eigenstates (to zeroth order in 0 ) and the eigenvalues (to first order in 0 ): Eigenstates

Eigenvalues

= 1;

=1

= 1;

=0

= 1;

=

= 0;

1

=0

~2 +~ 0 4 2 ~ +0 4 2 ~ ~ 0 4 2 ~ 3 +0 4

(E-15)

In Figure 5, we have plotted ~ 0 on the -axis and the energies of the four Zeeman sublevels on the -axis (Zeeman diagram). In a zero field, we have the two hyperfine levels, = 1 and = 0. When the field 0 is turned on, the =0 = 0 sublevel, which is not degenerate, starts horizontally; as for the = 1 level, its three-fold degeneracy is completely removed: three equidistant sublevels are obtained, varying linearly with ~ 0 with slopes of +1, 0, 1 respectively. E

mF +1 0

F= 1

Figure 5: The weak-field Zeeman diagram of the 1 ground state of the hydrogen atom. The hyperfine = 1 level splits into three equidistant levels, each of which corresponds to a well-defined value of the quantum number . The = 0 level does not undergo any shift to first order in 0 .

𝒜ħ2

–1

0

F= 0

0

ħω0

The preceding treatment is valid as long as the difference ~ 0 between two adjacent Zeeman sublevels of the = 1 level remains much smaller than the zero-field difference between the = 1 and = 0 levels (the hyperfine structure). 1255

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Comment: The Wigner-Eckart theorem, mentioned above, makes it possible to show that, in a given level of the total angular momentum, the Zeeman Hamiltonian 0 ( + 2 ) is represented by a matrix proportional to . Thus, we can write, denoting the projector onto the level by : [

0(

+2

)]

=

is called the Landé factor of the E-2-c.

(E-16)

0

state. In the case which concerns us here,

The Bohr frequencies involved in the evolution of model of the atom

and

=1

= 1.

; comparison with the vector

In this section, we shall determine the different Bohr frequencies which appear in the evolution of F and S , and show that certain aspects of the results obtained recall those found by using the vector model of the atom (cf. Complement FX ). First of all, we shall briefly review the predictions of the vector model of the atom (in which the various angular momenta are treated like classical vectors) as far as the hyperfine coupling between I and S is concerned. In a zero field, F = I + S is a constant of the motion. I and S precess about their resultant F with an angular velocity proportional to the coupling constant between I and S. If the system is, in addition, placed in a weak static field B0 parallel to , onto the rapid precessional motion of I and S about F is superposed a slow precessional motion of F about (Larmor precession; Fig. 6). is therefore a constant of the motion, while has a static part (the projection onto of the component of S parallel to F), and a part which is modulated by the hyperfine precession frequency (the projection onto of the component of S perpendicular to F, which precesses about F). Let us compare these semi-classical results with those of the quantum theory presented earlier in this section. To do so, we must consider the time evolution of the average values et . According to the discussion of § D-2-d of Chapter III, the average value () of a physical quantity contains a series of components which oscillate at the various Bohr frequencies ( ) of the system. Also, a given Bohr frequency appears in ( ) only if the matrix element of between the states corresponding to the two energies is different from zero. In the problem which concerns us here, the eigenstates of the weak-field Hamiltonian are the states. Now consider the two matrices (E-10) and (E-11) which represent and in this basis. Since has only diagonal matrix elements, no Bohr frequency different from zero can appear in ( ): is therefore static. On the other hand, has, not only diagonal matrix elements (with which is associated a static component of ), but also a non-diagonal element between the = 1; = 0 and = 0; = 0 states, whose energy difference is ~2 , according to Table (E-15) (or Figure 5). It follows that has, in addition to a static component, a component modulated at the angular frequency ~. This result recalls the one obtained using the vector model of the atom10 .

10 A parallel could also be established between the evolution of , , , , and that of the projections of the vectors F and S of Figure 6 onto and . However, the motion of F and S does not coincide perfectly with that of the classical angular momenta. In particular, the modulus of S is not necessarily constant (in quantum mechanics, S2 = S 2 ); see discussion of Complement FX .

1256

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

z

B0

F I

S

Figure 6: The motion of S, I and F in the vector model of the atom. S, I precess rapidly about F under the effect of the hyperfine coupling. In a weak field, F slowly precesses about B0 (Larmor precession).

Comment: A relation can be established between perturbation theory and the vector model of the atom. The influence of a weak field 0 on the = 1 and = 0 levels can be obtained by retaining in the Zeeman Hamiltonian 2 0 only the matrix elements in the =1 and = 0 levels, “forgetting” the matrix element of between = 1; = 0 and = 0; = 0 . Proceeding in this way, we also “forget” the modulated component of , which is proportional to this matrix element. We therefore keep only the component of S parallel to F . Now, this is precisely what we do in the vector model of the atom when we want to evaluate the interaction energy with the field B0 . In a weak field, F does precess about B0 much more slowly than S does about F. The interaction of B0 with the component of S perpendicular to F therefore has no effect, on the average; only the projection of S onto F counts. This is how, for example, the Landé factor is calculated. E-3.

The strong-field Zeeman effect

We must now start by diagonalizing the Zeeman term. 1257

CHAPTER XII

E-3-a.

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Eigenstates and eigenvalues of the Zeeman term

This term is diagonal in the 2

=2

0

~

basis: (E-17)

0

Since = 1 2, the eigenvalues are equal to ~ degenerate, because of the two possible values of 2

0

2

0

+

E-3-b.

= +~

0

=

0

~

0.

Each of them is therefore two-fold . We therefore have11 :

+ (E-18)

Effects of the hyperfine term considered as a perturbation

The corrections to first order in can be obtained by diagonalizing the restrictions of the operator I S to the two subspaces + and corresponding to the two different eigenvalues of 2 0 . First of all, notice that, in each of these two subspaces, the two basis vectors + + and + (or + and ) are also eigenvectors of , but do not correspond to 2 2 2 the same value of = + . Since the operator I S = 2 commutes with , it has no matrix elements between the two states + + and + , or + and . The two matrices representing I S in the two subspaces + and are then diagonal, and their eigenvalues are simply the diagonal elements ; I S ; which can also be written, using the relation: I S=

+

1 ( 2

+

+

+)

(E-19)

in the form: I S =

=

~2

(E-20)

Finally, in a strong field, the eigenstates (to zeroth order in (to first order in ) are: Eigenstates + + + +

Eigenvalues ~2 4 ~2 ~ 0 4 ~2 ~ 0 4 ~2 ~ 0+ 4 ~

0

+

11 To simplify the notation, we shall often write equal to + or , depending on the signs of and

1258

) and the eigenvalues

(E-21)

instead of .

, where

and

are

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

In Figure 7, the solid-line curves on the right-hand side (for ~ 0 ~2 ) represent the strong-field energy levels: we obtain two parallel straight lines of slope + 1, separated by an energy ~2 2, and two parallel straight lines of slope 1, also separated by ~2 2. The perturbation treatments presented in this section and the preceding one therefore give the strong-field asymptotes and the tangents at the origin of the energy levels. Comment:

The strong-field splitting ~2 2 of the two states, + + and + or + and , can be interpreted in the following way. We have seen that only the term εS εl

E

+

+

+









+

𝒜ħ2

F= 1

F= 0

0

ħω0

Figure 7: The strong-field Zeeman diagram of the 1 ground state of the hydrogen atom. For each orientation of the electronic spin ( = + or = ), we obtain two parallel straight lines separated by an energy ~2 2, each one corresponding to a different orientation of the proton spin ( = + or = ). 1259

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

of expression (E-19) for I S is involved in a strong field, when the hyperfine coupling is treated like a perturbation of the Zeeman term. The total strong-field Hamiltonian (E-8) can therefore be written: 2

0

+

=2

0

+

(E-22)

2

It is as if the electronic spin “saw”, in addition to the external field B0 , a smaller “internal field”, arising from the hyperfine coupling between I and S and having two possible values, depending on whether the nuclear spin points up or down. This field adds to or substracts from B0 and is responsible for the energy difference between + + and + or between + and . E-3-c.

The Bohr frequencies involved in the evolution of

In a strong field, the Zeeman coupling of S with B0 is more important than the hyperfine coupling of S with I. If we start by neglecting this hyperfine coupling, the vector model of the atom predicts that S will precess (very rapidly since B0 is large) about the direction of B0 (I remains motionless, since we have assumed to be negligible).

z

B0

Figure 8: The motion of S in the vector model of the atom. In a strong field, S precesses rapidly about B0 (here we are neglecting both the Zeeman coupling between I and B0 and the hyperfine coupling between I and S, so that I remains motionless). S

I

1260

E. THE ZEEMAN EFFECT OF THE 1

GROUND STATE HYPERFINE STRUCTURE

Expression (E-19) for the hyperfine coupling remains valid for classical vectors. Because of the very rapid precession of S, the terms + and oscillate very fast and have, on the average, no effect, so that only the term counts. The effect of the hyperfine coupling is therefore to add a small field parallel to and proportional to (cf. comment of the preceding section), which accelerates or slows down the precession of S about , depending on the sign of . The vector model of the atom thus predicts that will be static in a strong field. We shall show that quantum theory gives an analogous result for the average value of the observable In a strong field, the well-defined energy states are, as we have seen, the states . Now, in this basis, the operator has only diagonal matrix elements. No non-zero Bohr frequency can therefore appear in , which, consequently, is a static quantity12 , unlike its weak-field counterpart (cf. § E-2-c). E-4.

The intermediate-field Zeeman effect

E-4-a.

The matrix which represents the total perturbation in the

basis

The states are eigenstates of the operator I S. The matrix which represents this operator in the basis is therefore diagonal. The diagonal elements corresponding to = 1 are equal to ~2 4, and those corresponding to = 0, to 3 ~2 4. Furthermore, we have already written, in (E-10), the matrix representation of in the same basis. It is now very simple to write the matrix which represents the total perturbation (E-8). Arranging the basis vectors in the order 1 1 , 1 1 , 1 0 , 0 0 , we thus obtain: ~2 4

+~

0

0

0

~2 4

~

0

0

0

0

0

0

0

0

0 (E-23)

~2 4

~

0

~

0

3 ~2 4

Comment:

and commute; 2 0 can therefore have non-zero matrix elements only between two states with the same . Thus, we could have predicted all the zeros of matrix (E-23).

12 The

study of and presents no difficulty. We find two Bohr angular frequencies: one, + ~ 2, slightly larger than 0 , and the other one, 0 ~ 2, slightly smaller. They correspond to the two possible orientations of the “internal field”, produced by , which adds to the external field 0 . Similarly, we find that I precesses about the “internal field” produced by . 0

1261

CHAPTER XII

E-4-b.

two 1

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

Energy values in an arbitrary field

Matrix (E-23) can be broken into two 1 1 matrices and one 2 1 matrices immediately yield two eigenvalues: 1

=

2

=

~2 +~ 4 2 ~ ~ 4

2 matrix. The

0

(E-24) 0

corresponding respectively to the state 1 1 (that is, the state + + ) and to the state 1 1 (that is, the state ). In Figure 9, the two straight lines of slopes ,+1 and 1 passing through the point whose ordinate is + ~2 4 for a zero field (for which the perturbation theory treatment gave only the initial and asymptotic behavior) therefore represent, for any 0 , two of the Zeeman sublevels. The eigenvalue equation of the remaining 2 2 matrix can be written: ~2 4

3 ~2 4

~2

2 0

=0

(E-25)

The two roots of this equation can easily be found: =

~2 + 4

~2 2

2

3

=

~2 4

~2 2

2

4

+ ~2

2 0

(E-26)

+ ~2

2 0

(E-27)

When ~ 0 varies, the two points of abscissas ~ 0 and ordinates 3 and 4 follow the two branches of a hyperbola (Fig. 9). The asymptotes of this hyperbola are the two straight lines whose equation is = ( ~2 4) ~ 0 , obtained in § 3 above. The two turning points of the hyperbola have abscissas of 0 = 0 and ordinates of ( ~2 4) ~2 2, 2 2 that is, ~ 4 and 3 ~ 4. The tangents at both these points are horizontal. This is in agreement with the results of § 2 for the states = 1; = 0 and = 0; =0 . The preceding results are summarized in Figure 9, which is the Zeeman diagram of the 1 ground state. E-4-c.

Partial hyperfine decoupling

In a weak field, the well-defined energy states are the states ; in a strong field, the states ; in an intermediate field, the eigenstates of matrix (E-23), which are intermediate between the states and the states . One thus moves continuously from a strong coupling between I and S (coupled bases) to a total decoupling (uncoupled bases) via a partial coupling.

Comment:

An analogous phenomenon exists for the Zeeman fine structure effect. If, for simplicity, we neglect , we know (§ C) that, in a zero field, the eigenstates of the Hamiltonian are the states corresponding to a strong coupling 1262

E. THE ZEEMAN EFFECT OF THE 1

E

GROUND STATE HYPERFINE STRUCTURE

mF

+1

0

𝒜ħ2/4 (F = 1)

–1 –3𝒜ħ2/4 (F = 0)

0

0

ħω0

Figure 9: The Zeeman diagram (for an arbitrary field) of the 1 ground state of the hydrogen atom: remains a good quantum number for any value of the field. We obtain two straight lines, of opposite slopes, corresponding to the values, +1 and 1, of , as well as a hyperbola whose two branches are associated with the two = 0 levels. Figures 5 and 7 give, respectively, the tangents at the origin and the asymptotes of the levels shown in this diagram.

between L and S (the spin-orbit coupling). This property remains valid as long as . If, on the other hand, 0 is strong enough to make , we find that the eigenstates of are the states corresponding to a total decoupling of L and S. The intermediate zone ( ) corresponds to a partial 1263

CHAPTER XII

THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM

coupling of L and S. See, for example, Complement DXII , in which we study the Zeeman effect of the 2 level (without taking into account). References and suggestions for further reading:

The hydrogen atom spectrum: Series (11.7), Bethe and Salpeter (11.10). The Dirac equation: the subsection “Relativistic quantum mechanics” of section 2 of the bibliography and Messiah (1.17), Chap. XX, especially §§ V and IV-27. The fine structure of the = 2 level and the Lamb shift: Lamb and Retherford (3.11); Frisch (3.13); Series (11.7), Chaps. VI, VII and VIII. The hyperfine structure of the ground state: Crampton et al. (3.12). The Zeeman effect and the vector model of the atom: Cagnac and Pebay-Peyroula (11.2), Chap. XVII, §§ 3E and 4C; Born (11.4), Chap. 6, § 2. Interstellar hydrogen: Roberts (11.17); Encrenaz (12.11), Chap. IV.

1264

COMPLEMENTS OF CHAPTER XII, READER’S GUIDE

AXII : THE MAGNETIC HYPERFINE HAMILTONIAN

Derivation of the expression for the hyperfine Hamiltonian used in Chapter XII. Gives the physical interpretation of the various terms appearing in the Hamiltonian – in particular the contact term. Rather difficult.

BXII : CALCULATION OF THE AVERAGE VALUES OF THE FINE-STRUCTURE HAMILTONIAN IN THE 1 , 2 AND 2 STATES

The detailed calculations of certain radial integrals appearing in the expression obtained in Chapter XII for the energy shifts. Not conceptually difficult.

CXII : THE HYPERFINE STRUCTURE AND THE ZEEMAN EFFECT FOR MUONIUM AND POSITRONIUM

Extension of the study of §§ D and E of Chapter XII to two important hydrogen-like systems, muonium and positronium, already presented in Complement AVII . Brief description of experimental methods for studying these two systems. Simple if the calculations of §§ D and E of Chapter XII have been well understood.

DXII : THE INFLUENCE OF THE ELECTRON SPIN ON THE ZEEMAN EFFECT OF THE HYDROGEN RESONANCE LINE

Study of the effect of the electronic spin on the frequencies and polarizations of the Zeeman components of the resonance line of hydrogen. Improves the results obtained in Complement DVII , in which the electron spin was ignored (uses certain results of that complement). Moderately difficult.

EXII : THE STARK EFFECT OF THE HYDROGEN

Study of the effect of a static electric field on the ground state ( = 1) and of the first excited state ( = 2) of the hydrogen atom (Stark effect). Shows the importance for the Stark effect of the existence of a degeneracy between two states of different parities. Rather simple.

ATOM

1265



THE MAGNETIC HYPERFINE HAMILTONIAN

Complement AXII The magnetic hyperfine Hamiltonian

1

Interaction of the electron with the scalar and vector potentials created by the proton . . . . . . . . . . . . . . . . . . 1267

2

The detailed form of the hyperfine Hamiltonian . . . . . . . 1268

3

2-a

Coupling of the magnetic moment of the proton with the orbital angular momentum of the electron . . . . . . . . . . . . 1268

2-b

Coupling with the electron spin . . . . . . . . . . . . . . . . . 1270

Conclusion: the hyperfine-structure Hamiltonian . . . . . . 1274

The aim of this complement is to justify the expression for the hyperfine Hamiltonian given in Chapter XII [relation (B-20)]. As in that chapter, we shall confine our reasoning to the hydrogen atom, which is composed of a single electron and a proton, although most of the ideas remain valid for any atom. We have already said that the origin of the hyperfine Hamiltonian is the coupling between the electron and the electromagnetic field created by the proton. We shall therefore call A (r) and (r) the vector and scalar potentials associated with this electromagnetic field. We shall begin by considering the Hamiltonian of an electron subjected to these potentials. 1.

Interaction of the electron with the scalar and vector potentials created by the proton

Let R and P be the position and momentum of the electron, S, its spin; , its mass; and , its charge; = ~ 2 is the Bohr magneton. The Hamiltonian of the electron in the field of the proton can be written: =

1 2

[P

2

A (R)] +

(R)

2

S ~



A (R)

(1)

This operator is obtained from expression (B-46) of Chapter III (the Hamiltonian of a spinless particle) by adding to it the coupling energy between the magnetic moment 2 S ~ associated with the spin and the magnetic field ∇ A (r). We shall begin by studying the terms which, in (1), arise from the scalar potential (r). According to Complement EX , this potential results from the superposition of several contributions, each of them associated with one of the electric multipole moments of the nucleus. For an arbitrary nucleus, we must consider: ( ) The total charge a potential energy:

of the nucleus (the moment of order

= 0), which yields

2 0 (r)

=

0 (r)

=

4

(2) 0

1267

COMPLEMENT AXII



(with, for the proton, = 1). Now, the Hamiltonian which we chose in Chapter VII for the study of the hydrogen atom is precisely: =

0

P2 + 2

0 (R)

(3)

0 (R)

has therefore already been taken into account in the Hamiltonian 0 . ( ) The electric quadrupole moment ( = 2) of the nucleus. The corresponding potential adds to the potential 0 and yields a term of the hyperfine Hamiltonian, called the electric quadrupole term. The results of Complement EX enable us to write this term without difficulty. In the case of the hydrogen atom, it is zero, since the proton, which is a spin 1/2 particle, has no electric quadrupole moment (cf. § 2-c- of Complement EX ). ( ) The electric multipole moments of order = 4, 6, etc... which are theoretically involved as long as 2 ; for the proton, they are all zero. Thus, for the hydrogen atom, potential (2) is really the potential seen by the electron1 . There is no need to add any corrections to it (by hydrogen atom, we mean the electron-proton system, excluding isotopes such as deuterium: since the deuterium nucleus has a spin = 1, we would have to take into account an electric quadrupole hyperfine Hamiltonian – see comment ( ) at the end of this complement). Now let us consider the terms arising from the vector potential A (r) in (1). We denote by M the magnetic dipole moment of the proton (which, for the same reason as above, cannot have magnetic multipole moments of order 1). We have: A (r) =

0

4

M

r

(4)

3

The hyperfine Hamiltonian are linear in A : =

2

can now be obtained by retaining in (1) the terms which

[P A (R) + A (R) P]

S ~

2



A (R)

(5)

and by replacing A by expression (4) (since already makes a very small correction to the energy levels of 0 , it is perfectly legitimate to ignore the second-order term, in A2 ). This is what we shall do in the following section. 2.

The detailed form of the hyperfine Hamiltonian

2-a.

Coupling of the magnetic moment of the proton with the orbital angular momentum of the electron

First of all, we shall calculate the first term of (5). Using (4), we have: P A (R) + A (R) P =

0

4

P (M

R)

1 3

+

1 3

(M

R) P

(6)

1 We are concerned here only with the potential outside the nucleus, where the multipole moment expansion is possible. Inside the nucleus, we know that the potential does not have form (2). This causes a shift in the atomic levels called the “volume effect”. This effect was studied in Complement DXI , and we shall not take it into account here.

1268



THE MAGNETIC HYPERFINE HAMILTONIAN

We can apply the rules for a mixed vector product to vector operators as long as we do not change the order of two non-commuting operators. The components of M commute with R and P, so we have: (M

R) P = (R

P) M = L M

(7)

where: L=R

(8)

P

is the orbital angular momentum of the electron. It can easily be shown that: L

1

=0

3

(9)

(any function of R is a scalar operator), so that: 1 3

(M

R) P =

L M

(10)

3

Similarly: P (M

1

R)

3

=

(P

M

R)

1 3

=

M

L

(11)

3

since: R=L

P

(12)

Thus, the first term of (5) makes a contribution 0

=

4 2

2

M

L 3

0

=

4

2

M

to

which is equal to:

(L ~) 3

(13)

This term corresponds to the coupling between the nuclear magnetic moment M and the magnetic field: B =

0

4

L 3

created by the current loop associated with the rotation of the electron (cf. Fig. 1).

Comment:

The presence of the 1/ 3 term in (13) might lead us to believe that there is a singularity at the origin, and that certain matrix elements of are infinite. Actually, this is not the case. Consider the matrix element where and in Chapter VII. In the r

=

r

(r) =

,

are the stationary states of the hydrogen atom found representation, we have: ( )

(

)

(14) 1269

COMPLEMENT AXII

• MI

L

v q

Figure 1: Relative disposition of the magnetic moment M of the proton and the field B created by the current loop associated with the motion of the electron of charge and velocity v (B is antiparallel to the orbital angular momentum L of the electron).

BL

with [cf. Chap. VII, relation (A-28)]: ( )

0

(15)

With the presence of the 2 d term in the integration volume element taken into account, the function to be integrated over behaves at the origin like + +2 3 = + 1 . Furthermore, the presence of the Hermitian operator L in (13) means that the matrix element is zero when or is zero. We then + 1 have + 2, and remains finite at the origin.

2-b.

Coupling with the electron spin

We shall see that, for the last term of (5), the problems related to the singularity at the origin of the vector potential (4) are important. This is why, in studying this term, we shall choose a proton of finite size, letting its radius approach zero at the end of the calculations. Furthermore, from a physical point of view, we now know that the proton does possess a certain spatial extension and that its “magnetism” is spread over a certain volume. However, the dimensions of the proton are much smaller than the Bohr radius 0 . This justifies our treating the proton as a point particle in the final stage of the calculation.

.

The magnetic field associated with the proton

Consider the proton to be a particle of radius 0 (Fig. 2), placed at the origin. The distribution of magnetism inside the proton creates, at a distant point, a field B which can be calculated by attributing to the proton a magnetic moment M which we shall choose parallel to . For 0 , we obtain the components of B from the curl of the 1270



THE MAGNETIC HYPERFINE HAMILTONIAN

z MI

B

Bi y

x

Figure 2: The magnetic field created by the proton. Outside the proton, the field is that of a dipole. Inside, the field depends on the exact distribution of the magnetism of the proton, but we can, in a first approximation, consider it to be uniform. The contact term corresponds to the interaction between the spin magnetic moment of the electron and this uniform field B inside the proton.

vector potential written in (4): = = =

0

4 0

4 0

4

3

5

3

(16)

5

3

2

2 5

Expressions (16), moreover, remain valid even if is not very large compared to 0 . We have already emphasized that the proton, since it is a spin 1/2 particle, has no magnetic multipole moments of order 1. The field outside the proton is therefore a pure dipole field. Inside the proton, the magnetic field depends on the exact magnetic distribution. We shall assume this field B to be uniform (by symmetry, it must then be parallel to M and, therefore, to )2 . To calculate the field B inside the proton, we shall write the equation stating that the flux of the magnetic field through a closed surface, bounded by the plane and the upper hemisphere centered at and of infinite radius, is zero. Since, as , B decreases as 1 3 , the flux through this hemisphere is zero. Therefore, if Φ ( 0 ) denotes the flux through a disk centered at of radius 0 in the plane, and Φ ( 0 ), the flux 2 The following argument can be generalized to cases where B varies in a more complicated fashion (see comment ( ) at the end of this complement).

1271



COMPLEMENT AXII

through the rest of the

plane, we have:

Φ ( 0) + Φ ( 0) = 0

(17)

Relations (16) enable us to calculate Φ ( 0 ) easily, and we get: +

Φ ( 0) = 2

3

4

0

2

0

=

1

0

d

4

(18) 0

As for the flux Φ ( 0 ) of B , it is equal to: 2 0

Φ ( 0) =

(19)

so that (17) and (18) yield: =

2

0

(20)

3 0

4

Thus, we know the values of the field created by the proton at all points in space. We can now calculate the part of related to the electron spin S. .

The magnetic dipole term If we substitute (16) into the term dip

0

=

2

4

3

+

2

S ~

A , we obtain the operator:

+ 5

~



that is, taking into account the fact that M is, by hypothesis, parallel to dip

=

0

4

2

1 ~

3

S M

3

(21)

3

(S R)(M 2

R)

: (22)

This is the expression for the Hamiltonian of the dipole-dipole interaction between two magnetic moments M and M = 2 S ~ (cf. Complement BXI , § 1). Actually, expression (16) for the magnetic field created by the proton is valid only for 0 , and (22) should be applied only to the part of the wave functions which satisfies this condition. However, when we let 0 approach zero, expression (22) gives no singularity at the origin; it is therefore valid in all space. Consider the matrix element: dip

(we are adding here the indices and to the states considered above in order to label the eigenvalues ~ 2 and ~ 2 of ) and, in particular, the radial integral which corresponds to it. At the origin, the function of to be integrated behaves like + +2 3 = + 1 . Now, according to condition (8-c) of Complement BXI , the non-zero matrix elements are obtained for + 2. There is therefore no divergence at the origin. In the limit where 0 0, the integral over becomes an integral from 0 to infinity, and expression (22) is valid in all space.

1272

• .

THE MAGNETIC HYPERFINE HAMILTONIAN

The contact term

We shall now substitute (20) into the last term of (5), so as to obtain the contribution of the internal field of the proton to . We then obtain an operator , which we shall call the “contact term”, and whose matrix elements in the representation are:

0

=

2

2

4

Let 0 approach zero. The integration volume, 4 right-hand side of (23) becomes: 8 0 2 (r = 0) 4 ~ 3 The contact term is therefore given by: 8 M 4 3 0

=

2

d3

3 0

~

S

(r)

(r)

(23)

0

3 0

3, also approaches zero, and the

(r = 0)

(24)

(R)

(25)

~

Although the volume containing an internal magnetic field (20) approaches zero when 0, the value of remains finite, since this internal field approaches infinity as 0 1/ 30 . Comments:

( ) In (25), the function (R) of the operator R is simply the projector: (R) = r = 0 r = 0

(26)

( ) The matrix element written in (23) is different from zero only if = = 0. This is a necessary condition for (r = 0) and (r = 0) to be non-zero (cf. Chap. VII, § C-4-c- ). The contact term therefore exists only for the states. ( ) In order to study, in § 2-a, the coupling between M and the orbital angular momentum of the electron, we assumed expression (4) for A (r) to be valid in all space. This amounts to ignoring the fact that the field B actually has the form (20) inside the proton. We might wonder if this procedure is correct, or if there is not also an orbital contact term in . Actually, this is not the case. The term in P A + A the field B , to an operator proportional to: B

L=

0

4

M

2 3 0

P would lead, for

(27)

Let us calculate the matrix element of such an operator in the representation. The presence of the operator requires, as above, 1. The radial function to be integrated between 0 and 0 then behaves at the origin like + +2 and therefore goes to zero at least as rapidly as 4 . Despite the presence of the 1/ 30 term in (27), the integral between = 0 and = 0 therefore goes to zero in the limit where 0 0. 1273



COMPLEMENT AXII

3.

Conclusion: the hyperfine-structure Hamiltonian

Now, let us take the sum of the operators , dip and . We use the fact that the magnetic dipole moment M of the proton is proportional to its angular momentum I: I ~

M =

(28)

(cf. § B-2-a of Chapter XII). We obtain: =

0

2

4

I L ~2

3

+3

(I R)(S R) 5

I S 3

+

8 I S (R) 3

(29)

This operator acts both in the state space of the electron and in the state space of the proton. It can be seen that this is indeed the operator introduced in Chapter XII [cf. (B-20)].

Comments:

( ) We will now discuss the generalization of formula (29) to the case of an atom having a nuclear spin 1 2. First of all, if = 1, we have already seen that the nucleus can have an electric quadrupole moment which adds a contribution to the potential 0 (r) given by (2). An electric quadrupole hyperfine term is therefore present in the hyperfine Hamiltonian, in addition to the magnetic dipole term (29). Since an electrical interaction does not directly affect the electron spin, this quadrupole term only acts on the orbital variables of the electrons. If now 1, other nuclear electric or magnetic multipole moments can exist, increasing in number as increases. The electric moments give rise to hyperfine terms acting only on the orbital electron variables, while the magnetic terms act on both the orbital and the spin variables. For elevated values of , the hyperfine Hamiltonian has therefore a complex structure. In practice however, for the great majority of cases, one can limit the hyperfine Hamiltonian to magnetic dipole and electric quadrupole terms. This is due to the fact that the multipole nuclear moments of an order superior to 2 make extremely small contributions to the hyperfine atomic structures. These contributions are therefore difficult to observe experimentally. This arises essentially from the extremely small size of the nuclei compared to the spatial extent 0 of the electronic wave functions. ( ) The simplifying hypothesis which we have made concerning the field B(r) created by the proton (a perfectly uniform field within a sphere, a dipole field outside) is not essential. The form (25) of the magnetic dipole Hamiltonian remains valid whenever the nuclear magnetism has an arbitrary repartition, giving rise to more complicated internal fields B (r) (assuming however that the spatial extent of the nucleus is negligible compared to 0 ; cf. the following comment). The argument is actually a direct generalization of that given in this complement. Consider a sphere centered at the origin, containing the nucleus and having a radius 0.

1274



THE MAGNETIC HYPERFINE HAMILTONIAN

If = 12 , the field outside has the form (16) and, since is very small compared to 0 , its contribution leads to the terms (13) and (22). As for the contribution of the field B(r) inside , it depends only on the value at the origin of the electronic wave functions and on the integral of B(r) inside . Since the flux of B(r) across all closed surfaces is zero, the integral in of each component of B(r) can be transformed into an integral outside of , where B(r) has the form (16). A simple calculation will again give exactly expression (25) which is therefore independent of the simplifying hypothesis that we have made. 1 If , the nuclear contribution to the electromagnetic field outside of 2 gives rise to the multipole hyperfine Hamiltonian which we have discussed in comment ( ) above. On the other hand, one can easily show that the contribution of the field inside does not give rise to any new term: only the magnetic dipole possesses a contact term. (

) In all of the above, we have totally neglected the dimensions of the nucleus compared to those of the electronic wave functions (we have taken the limit 0 0 0). This is obviously not always realistic, in particular for heavy atoms whose nuclei have a relatively large spatial extension. If one studies these “volume effects” (keeping for example several of the lower order terms in 0 0 ), a series of new terms appears in the electron-nucleus interaction Hamiltonian. We have already encountered this type of effect in Complement DXI where we studied the effects of the radial distribution of the nuclear charge (nuclear multipole moments of order = 0). Analogous phenomena occur concerning the spatial distribution of nuclear magnetism and lead to modifications of different terms of the hyperfine Hamiltonian (29). In particular, a new term must be added to the contact term (25) when the electronic wave functions vary significantly within the nucleus. This new term is neither simply proportional to (R), nor to the total magnetic moment of the nucleus. It depends on the spatial distribution of the nuclear magnetism. From a practical point of view, such a term is interesting since, using precise measurements of the hyperfine structure of heavy atoms, it permits obtaining information concerning the variation of the magnetism within the corresponding nuclei.

References and suggestions for further reading: The hyperfine Hamiltonian including the electric quadrupole interaction term: Abragam (14.1), Chap. VI; Kuhn (11.1), Chap. VI, § B; Sobel’man (11.12), Chap. 6.

1275

COMPLEMENT BXII



Complement BXII Calculation of the average values of the fine-structure Hamiltonian in the 1 , 2 and 2 states

Calculation of 1 , 1 2 and 1 3 . . . . . . . The average values . . . . . . . . . . . . . . The average values . . . . . . . . . . . . . . . Calculation of the coefficient 2 associated with the 2 level . . . . . . . . . . . . . . . . . . . . . . .

1 2 3 4

For the hydrogen atom, the fine-structure Hamiltonian terms: =

+

. . . . . . . . . . . . . . . in . . . . .

. 1276 . 1278 . 1279 . 1279

is the sum of three

+

(1)

studied in § B-1 of Chapter XII. The aim of this complement is to give the calculation of the average values of these three operators for the 1 , 2 and 2 states of the hydrogen atom, a calculation which was omitted in Chapter XII for the sake of simplicity. We shall begin by calculating the average values of 1 , 1 2 and 1 3 in these states. 1.

Calculation of 1

2

, 1

3

and 1

The wave function associated with a stationary state of the hydrogen atom is (cf. Chap. VII, § C): (r) =

( )

(

)

(2)

( ) is a spherical harmonic. The expressions for the radial functions sponding to the 1 , 2 , 2 states are: 1 0(

) = 2( 0 )

2 0(

) = 2(2 0 )

2 1(

) = (2 0 )

3 2

e

3 2 3 2

( ) corre-

0

1

(3)

2

e

2

0

1 2

e

2

0

(3)

0

0

where 0

0

=4

is the Bohr radius: 0

~2 2

=

~2 2

(4)

The are normalized with respect to and , so that the average value of the th power (where is a positive or negative integer) of the operator associated with 1276

• AVERAGE VALUES OF THE FINE-STRUCTURE HAMILTONIAN IN THE 1

,2

AND 2

STATES

can be written1 :

= r in the state +2

=

( )

2

d

(5)

0

It therefore does not depend on of the form: (

)=

r

e

. If (3) is substituted into (5), there appear integrals

d

0

(6)

0

where and are integers. We shall assume here that integration by parts then yields directly: (

0

)=

r

e

0

+

0

1

0

(

r

0

2. An

d

0

0

=

e

0, that is,

1 )

(7)

Since, furthermore: (0 ) =

r

e

0

0

d =

(8)

0

we obtain, by recurrence: +1

(

0

)= !

(9)

Now, let us apply this result to the average values to be determined. We obtain: 1

1

= =

1

2

4 3 0

4 3 0

4 8 30 1 = 3 2 0 1 = 4 0

e

2r

0

d

0

1

(1 2) =

(10a)

0 2

=

1 0

(1 1)

2 1

e

r

0

d

0

(2 1) +

0

1 (3 1) 4 20 (10b)

2

1

1 Of

2

1 1 e 8 30 3 0 0 1 1 = (3 1) = 24 50 4 0 =

r

0

d

course, this average value exists only for values of

(10c)

which make integral (5) convergent.

1277



COMPLEMENT BXII

Similarly: 1

2

1

2

1

2

1

=

2

=

4 3 0

(0 2) =

2

(11a)

2 0

1 1 1 1 (0 1) (1 1) + 2 (2 1) = 2 2 30 4 4 0 0 0 1 1 = (2 1) = 24 50 12 20

2

(11b) (11c)

It is clear that the expression for the average value of 1 3 is meaningless for the 1 and 2 states [since integral (5) is divergent]. For the 2 state, it is equal to: 3

1 2.

=

2

1 1 (1 1) = 5 24 0 24 30

(12)

The average values

Let: 0

P2 + 2

=

(13)

be the Hamiltonian of the electron subjected to the Coulomb potential. We have: P4 = 4

2

[

]

0

2

(14a)

with: 2

=

(14b)

so that: =

Since

1 2

2

[

2

]

0

(15)

We shall take the average values of both sides of this expression in a state are Hermitian operators, we obtain: 0 and =

1 2

2

)2 + 2

(

2

1

+

4

1

2

.

(16)

In this expression, we have set: =

2

=

1 2

2

2

2

(17)

where: 2

=

(18) ~

is the fine-structure constant. 1278

• AVERAGE VALUES OF THE FINE-STRUCTURE HAMILTONIAN IN THE 1

,2

AND 2

STATES

If we apply relation (16) to the case of the 1 state, we obtain, using (10a) and (11a): 1

=

1 2

2

1 4

2

4

2 4

2

2

2

=

1 2

4

1 4

2

0

5 8

1+2 =

=

4

(19)

2 0

0

that is, since, according to (4) and (18), 1

4

+2 2

2

:

2

(20)

The same type of calculation, for the 2 state, leads to:

2

=

1 2

4

2

1 8

2

1 8

2

2

11 1 + = 84 4

2

11 1 + = 8 4 12

13 128

4

2

(21)

and, for the 2 state, to:

2

3.

=

1 2

4

2

7 384

4

2

(22)

The average values

With (14b) and the fact that ∆(1 ) = 4 (r) taken into account, the average value of in the state can be written [see also formula (B-14) of Chapter XII]: =

~2 8

2 2

4

2

(r = 0)

This expression goes to zero if 2

2

(23)

(r = 0) = 0, that is, if = 0. Therefore:

=0

(24a)

For the 1 and 2 levels, we obtain, using (2), (23) and the fact that 1

=

~2 8

2

1 0 (0)

2

=

1 2

2

2 0 (0)

2

=

1 16

2 2

4

0 0

=1

2

4 : (24b)

as well as: 2

4.

=

~2 8

2 2

Calculation of the coefficient

4

2

2

associated with

(24c) in the 2 level

In § C-2-c- of Chapter XII, we defined the coefficient: 2 2

=

2

2 2

2 1(

)

2

d

(25)

0

1279



COMPLEMENT BXII

According to (3): 2 2

=

2

2 2

1 (1 1) 24 50

(26)

Relation (9) then yields: 2 2

=

2

2 2

1 1 = 24 30 48~2

4

2

(27)

References: Several radial integrals for hydrogen-like atoms are given in Bethe and Salpeter (11.10).

1280



THE HYPERFINE STRUCTURE AND THE ZEEMAN EFFECT FOR MUONIUM AND POSITRONIUM

Complement CXII The hyperfine structure and the Zeeman effect for muonium and positronium

1 2

The The 2-a 2-b 2-c 2-d

hyperfine structure of the 1 ground state . Zeeman effect in the 1 ground state . . . . The Zeeman Hamiltonian . . . . . . . . . . . . . Stationary state energies . . . . . . . . . . . . . . The Zeeman diagram for muonium . . . . . . . . The Zeeman diagram for positronium . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

1281 1282 1282 1283 1284 1286

In Complement AVII , we studied some hydrogen-like systems, composed, like the hydrogen atom, of two oppositely charged particles electrostatically attracted to each other. Of all these systems, two are particularly interesting: muonium (composed of an electron, , and a positive muon, + ) and positronium (composed of an electron, , and a positron, + ). Their importance lies in the fact that the various particles which come into play (the electron, the positron and the muon) are not directly affected by strong interactions (while the proton is). The theoretical and experimental study of muonium and positronium therefore permits a very direct test of the validity of quantum electrodynamics. Actually, a very precise information we now possess about these two systems comes from the study of the hyperfine structure of their 1 ground state [the optical lines joining the 1 ground state to the various excited states have been observed for positronium; cf. Ref. (11.25)]. This hyperfine structure is the result, as in the case of the hydrogen atom, of magnetic interactions between the spins of the two particles. We shall describe some interesting features of the hyperfine structure and the Zeeman effect for muonium and positronium in this complement. 1.

The hyperfine structure of the 1 ground state

Let S1 be the electron spin and S2 , the spin of the other particle (the muon or the positron, which are both spin l/2 particles). The degeneracy of the 1 ground state is then, as for hydrogen, four-fold. We can use stationary perturbation theory to study the effect on the 1 ground state of the magnetic interactions between S1 and S2 . The calculation is analogous to the one in § D of Chapter XII. We are left with a problem of two spin 1/2’s coupled by an interaction of the form: S1 S2

(1)

where is a constant which depends on the system under study. We shall denote by , , the three values of which correspond respectively to hydrogen, muonium and positronium. 1281

COMPLEMENT CXII



It is easy to see that: (2) since the smaller the mass of particle (2), the larger its magnetic moment. Now the positron is about 200 times lighter than the muon, which is close to 10 times lighter than the proton.

Comment: The theory of Chapter XII is insufficient for the extremely precise study of the hyperfine structure of hydrogen, muonium and positronium. In particular, the hyperfine Hamiltonian given in § B-2 of this chapter describes only part of the interactions between particles (1) and (2). For example, the fact that the electron and the positron are antiparticles of each other (they can annihilate to produce photons) is responsible for an additional coupling between them which has no equivalent for hydrogen and muonium. In addition, a series of corrections (relativistic, radiative, recoil effects, etc.) must be taken into account. These are complicated to calculate and must be treated by quantum electrodynamics. Finally, for hydrogen, nuclear corrections are also involved which are related to the structure and polarizability of the proton. However, it can be shown that the form (1) of the coupling between S1 and S2 remains valid, the constant being given by an expression which is much more complicated than formula (D-8) of Chapter XII. The hydrogen-like systems studied in this complement are important precisely because they enable us to compare the theoretical value of with experimental results.

The eigenstates of S1 S2 are the states, where quantum numbers related to the total angular momentum: F = S1 + S2

and

are the (3)

As in the case of the hydrogen atom, can take on two values, = 1 and = 0. The two levels, = 1 and = 0, have energies equal to ~2 4 and 3 ~2 4, respectively. Their separation ~2 gives the hyperfine structure of the 1 ground state. Expressed in MHz, this interval is equal to: ~ 2

= 4 463 317

0 021 MHz

(4)

for muonium, and: ~ 2

= 203 403

12 MHz

(5)

for positronium. 2. 2-a.

The Zeeman effect in the 1 ground state The Zeeman Hamiltonian

If we apply a static field B0 parallel to , we must add, to the hyperfine Hamiltonian (1), the Zeeman Hamiltonian which describes the coupling of B0 to the magnetic 1282



THE HYPERFINE STRUCTURE AND THE ZEEMAN EFFECT FOR MUONIUM AND POSITRONIUM

moments: M1 =

1 S1

(6)

2 S2

(7)

and: M2 =

of the two spins, with gyromagnetic ratios

1

and

2.

If we set:

1

=

1

0

(8)

2

=

2

0

(9)

this Zeeman Hamiltonian can be written: 1 1

+

(10)

2 2

In the case of hydrogen, the magnetic moment of the proton is much smaller than that of the electron. We used this property in § E-1 of Chapter XII to neglect the Zeeman coupling of the proton, compared to that of the electron1 . Such an approximation is less justified for muonium, since the magnetic moment of the muon is larger than that of the proton. We shall therefore take both terms of (10) into account. For positronium, furthermore, they are equally important: the electron and positron have equal masses and opposite charges, so that: 1

=

2

(positronium)

(11)

1

=

2

(positronium)

(12)

or:

2-b.

Stationary state energies

When 0 is not zero, it is necessary, in order to find the stationary state energies, to diagonalize the matrix representing the total Hamiltonian: S1 S2 +

1 1

+

(13)

2 2

in an arbitrary orthonormal basis, for example, the basis. A calculation which is analogous to the one in § E-4 of Chapter XII then leads to the following matrix (the four basis vectors are arranged in the order 1 1 1 1 1 0 0 0 ): ~2 4

+ ~2 ( 0 0 0

1

+

2)

0 ~ 4

2

~ 2( 1

0 0

+

0 0

2)

0 0

~2 4 ~ 2( 1

2)

~ 2) 2( 1 3 ~2 4

(14) 1 Recall that the gyromagnetic ratio of the electron spin is ~ ( : the Bohr magneton). 1 = 2 Thus, if we set 0 = 0 ~ (the Larmor angular frequency), the constant 1 defined by (8) is equal to 2 0 (this is, furthermore, the notation used in § E of Chapter XII; to obtain the results of that section, it therefore suffices, in this complement, to replace 1 by 2 0 and 2 by 0).

1283

COMPLEMENT CXII



Matrix (14) can be broken down into two 1 Two eigenvalues are therefore obvious: 1

=

2

=

~2 ~ + ( 4 2 ~2 ~ ( 4 2

1 submatrices and a 2

2 submatrix.

1

+

2)

(15)

1

+

2)

(16)

They correspond, respectively, to the states 1 1 and 1 1 , which, moreover, coincide with the states + + and of the basis of common eigenstates of 1 and 1 2 2 2 . The other two eigenvalues can be obtained by diagonalizing the remaining 2 submatrix. They are equal to: =

~2 + 4

~2 2

2

3

4

~2 4

~2 2

2

=

+

~2 ( 4

1

2)

2

(17)

+

~2 ( 4

1

2)

2

(18)

In a weak field, they correspond to the states 1 0 and 0 0 , respectively, and, in a strong field, to the states + and +. 2-c.

The Zeeman diagram for muonium

The only differences with the results of § E-4 of Chapter XII arise from the fact that here, we are taking the Zeeman coupling of particle (2) into account. These differences appear only in a sufficiently strong field. Let us therefore consider the form taken on by the energies 3 and 4 when ~( 1 ~2 . In this case: 1 2) 2 ~2 ~ + ( 4 2 ~2 ~ ( 4 2

3

4

1

2)

(19)

1

2)

(20)

Now, compare (19) with (15) and (20) with (16). We see that, in a strong field, the energy levels are no longer represented by pairs of parallel lines, as was the case in § E-3 of Chapter XII. The slopes of the asymptotes of the 1 and 3 levels are, respectively, ~ ~ ~ ~ 2 ); those of the 2 and 4 levels, 2 ( 1 + 2 ) and 2 ( 1 2 ). 2 ( 1 + 2 ) and 2( 1 Since the two particles (1) and (2) have opposite charges, 1 and 2 have opposite signs. Consequently, in a sufficiently strong field, the 3 level (which then corresponds to the + state) moves above the 1 level (the + + state), since its slope, ~2 ( 1 2 ) is greater than ~2 ( 1 + 2 ). The distance between the 1 and 3 levels therefore varies in the following way with respect to 0 (cf. Fig. 1): starting from 0, it increases to a maximum for the value of 0 which makes the derivative of: 1

1284

3

=

~2 + ( 2

0)

(21)



THE HYPERFINE STRUCTURE AND THE ZEEMAN EFFECT FOR MUONIUM AND POSITRONIUM

E

E1 | +, + E3

| +, – F=1

| –, –

E2 F=0

E4 | –, +

B0 0

Figure 1: The Zeeman diagram for the 1 ground state of muonium. Since we are not neglecting here the Zeeman coupling between the magnetic moment of the muon and the static field B0 , the two straight lines (which correspond, in a strong field, to the same electron spin orientation but different muon spin orientations) are no longer parallel, as was the case for hydrogen (in the Zeeman diagram of Figure 9 of Chapter XII, the Larmor angular frequency of the proton was neglected). For the same value of the static field 0 , the splitting between the 1 and 3 levels is maximal and that between the 2 and 4 levels is minimal. The arrows represent the transitions studied experimentally for this value of the field B0 .

1285



COMPLEMENT CXII

equal to zero, with: (

0)

~ ( 2

=

1

+

2)

~2 2

0

2

+

~2 4

2 0

(

1

2)

2

(22)

The distance then goes to zero again, and finally increases without bound. As for the distance between the 2 and 4 levels, it starts with the value ~2 , decreases to a minimum for the value of 0 which makes the derivative of: 2

4

=

~2 2

(

0)

(23)

equal to zero and then increases without bound. Since it is the same function ( 0 ) that appears in (21) and (23), we can show that, for the same value of 0 [the one which makes the derivative of ( 0 ) go to zero], the distances between the 1 and 3 levels and between the 2 and 4 levels are either maximal or minimal. This property was recently used to improve the accuracy of experimental determinations of the hyperfine structure of muonium. By stopping polarized muons (for example, in the + state) in a rare gas target, one can prepare, in a strong field, muonium atoms which will be found preferentially in the + + and + states. If we then apply simultaneously two radio frequency fields whose frequencies are close to ( 1 3 ) ~ and ( 2 4 ) ~, we induce resonant transitions from + + to + and from + to (arrows in Figure 1). It is these transitions which are detected experimentally, since they correspond to a flip of the muon spin which is revealed by a change in the anisotropy of the positrons emitted during the -decay of the muons. If we are operating in a field 0 such that the derivative of ( 0 ) is zero, the inhomogeneities of the static field, which may exist from one point to another of the cell containing the rare gas, are not troublesome, since the resonant frequencies of muonium, ( 1 and ( 2 , are not affected, to 3) 4) first order, by a variation of 0 [ref. (11.24)].

Comment:

For the ground state of the hydrogen atom, we obtain a Zeeman diagram analogous to the one in Figure 1 when we take into account the Zeeman coupling between the proton spin and the field B0 . 2-d.

The Zeeman diagram for positronium

If we set 1 = 2 (this property is a direct consequence of the fact that the positron is the antiparticle of the electron) in (15) and (16), we see that the 1 and 2 levels are independent of 0 : 1

=

2

=

~2 4

(24)

On the other hand, we obtain from (17) and (18):

3

~2 + 4

~2 2

2

=

=

~2 4

~2 2

2

4

1286

+ ~2

2 1

2 0

(25)

+ ~2

2 1

2 0

(26)



THE HYPERFINE STRUCTURE AND THE ZEEMAN EFFECT FOR MUONIUM AND POSITRONIUM

E

E3

F=1

E1 E2

F=0

E4 0

B0

Figure 2: The Zeeman diagram for the 1 ground state of positronium. As in the cases of hydrogen and muonium, this diagram is composed of one hyperbola and two straight lines. However, since the gyromagnetic ratios of the electron and positron are equal and opposite, the two straight lines have a zero slope and, consequently, are superposed (in the two corresponding states, with energy 1 and 2 , the total magnetic moment is zero, since the electron and positron spins are parallel). The arrow represents the experimentally studied transition.

1287

COMPLEMENT CXII



The Zeeman diagram for positronium therefore has the form shown in Figure 2. It is composed of two superposed straight lines parallel to the 0 axis and one hyperbola. Actually, positronium is not stable. It decays by emitting photons. In a zero field, it can be shown by symmetry considerations that the = 0 state (the singlet spin state, or “parapositronium”) decays by emitting two photons. Its half-life is of the order of 0 1 25 10 10 s. On the other hand, the = 1 state (the triplet spin state, or “orthopositronium”) can decay only by emitting three photons (since the two-photon transition is forbidden). This process is much less probable, and the half-life of the triplet is much longer, on the order of 1 4 10 7 s. 1 When a static field is applied, the 1 and 2 levels retain the same lifetimes since the corresponding eigenstates do not depend on 0 . On the other hand, the 1 0 state is “mixed” with the 0 0 state, and vice versa. Calculations analogous to those of Complement HIV show that the lifetime of the 3 level is reduced relative to its zero-field value 1 (that of the 4 level is increased relative to the value 0 ). The positronium atoms in the 3 state then have a certain probability of decaying by emission of two photons. This inequality of the lifetimes of the three states of energies 1 , 2 , 3 when 0 is non-zero is the basis of the methods for determining the hyperfine structure of positronium. Formation of positronium atoms by positron capture by an electron generally populates the four states of energies 1 , 2 , 3 , 4 equally. In a non-zero field 0 , the two states 1 and 2 decay less rapidly than the 3 state, so that in the stationary state, they are more populated. If we then apply a radiofrequency field oscillating at the frequency ( 3 =( 3 , we 1) 2) induce resonant transitions from the 1 and 2 states to the 3 state (the arrow of Figure 2). This increases the decay rate via two-photon emission, which permits the detection of resonance when (with fixed 0 ) we vary the frequency of the oscillating field. Determination of 3 1 for a given value of 0 then allows us to find the constant by using (24) and (25). In a zero field, resonant transitions could also be induced between the unequally populated = 1 and = 0 levels. However, the corresponding resonant frequency, given by (5), is high and not easily produced experimentally. This is why one generally prefers to use the “low frequency” transition represented by the arrow of Figure 2.

References and suggestions for further reading: See the subsection “Exotic atoms” of section 11 of the bibliography. The annihilation of positronium is discussed in Feynman III (1.2), § 18-3.

1288

• ZEEMAN EFFECT OF THE HYDROGEN LYMAN

LINE

Complement DXII The influence of the electronic spin on the Zeeman effect of the hydrogen resonance line

1 2 3 4

1.

Introduction . . . . . . . . . . . . . . . . . . . The Zeeman diagrams of the 1 and 2 levels The Zeeman diagram of the 2 level . . . . . The Zeeman effect of the resonance line . . . 4-a Statement of the problem . . . . . . . . . . . 4-b The weak-field Zeeman components . . . . . 4-c The strong-field Zeeman components . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

1289 1290 1290 1293 1293 1295 1295

Introduction

The conclusions of Complement DVII relative to the Zeeman effect for the resonance line of the hydrogen atom spectrum (the 1 2 transition) must be modified to take into account the electron spin and the associated magnetic interactions. This is what we shall do in this complement, using the results obtained in Chapter XII. To simplify the discussion, we shall neglect effects related to nuclear spin (which are much smaller than those related to the electron spin). Therefore, we shall not take the hyperfine coupling (chap. XII, § B-2) into account, choosing the Hamiltonian in the form: =

0

+

+

(1)

0 is the electrostatic Hamiltonian studied in Chapter VII (§ C), fine structure terms (chap. XII, § B-1):

=

+

+

, the sum of the (2)

and , the Zeeman Hamiltonian (chap. XII, § E-1) describing the interaction of the atom with a magnetic field B0 parallel to : =

0(

+2

)

where the Larmor angular frequency 0

=

2

0

(3) 0

is given by: (4)

[we shall neglect relative to 0 ; see formula (E-4) of Chapter XII]. We shall determine the eigenvalues and eigenvectors of by using a method analogous to that of § E of Chapter XII: we shall treat and like perturbations of 0 . Although they have the same unperturbed energy, the 2 and 2 levels can be studied 1289

COMPLEMENT DXII



separately since they are connected neither by (chap. XII, § C-2-a- ) nor by . In this complement, the magnetic field B0 will be called weak or strong, depending on whether is small or large compared to . Note that the magnetic fields considered here to be “weak” are those for which is small compared to but large compared to which we have neglected. These “weak fields” are therefore much stronger than those treated in § E of Chapter XII. Once the eigenstates and eigenvalues of have been obtained, it is possible to study the evolution of the average values of the three components of the electric dipole moment of the atom. Since an analogous calculation was performed in detail in Complement DVII , we shall not repeat it. We shall merely indicate, for weak fields and for strong fields, the frequencies and polarization states of the various Zeeman components of the resonance line of hydrogen (the Lyman line). 2.

The Zeeman diagrams of the 1 and 2 levels

We saw in § D-1-b of Chapter XII that shifts the 1 level as a whole and gives rise to only one fine-structure level, 1 1 2 . The same is true for the 2 level, which becomes 2 1 2 . In each of these two levels, we can choose a basis: ; = 0;

= 0;

=

1 ; 2

=

1 2

(5)

of eigenvectors common to 0 , L2 , , , (the notation is identical to that of Chapter XII; since does not act on the proton spin, we shall ignore in all that follows). The vectors (5) are obviously eigenvectors of with eigenvalues 2 ~ 0 . Thus, each 1 1 2 or 2 1 2 level splits, in a field 0 , into two Zeeman sublevels of energies: ( ; = 0;

= 0;

)=

(

1 2)

+2

~

0

(6)

where ( 1 2 ) is the zero-field energy of the 1 2 level, calculated in §§ C-2-b and D-1-b of Chapter XII. The Zeeman diagram of the 1 1 2 level (as well as the one for the 2 1 2 level) is therefore composed of two straight lines of slopes + 1 and – 1 (Fig. 1), corresponding, respectively, to the two possible orientations of the spin relative to B0 ( = +1 2 and = 1 2). Comparison of Figure 1 and Figure 9 of Chapter XII shows that to neglect, as we are doing here, the effects related to nuclear spin amounts to considering fields B0 which are so large that . We are then in the asymptotic region of the diagram of Figure 9 of Chapter XII, where we can ignore the splitting of the energy levels due to the proton spin and hyperfine coupling. 3.

The Zeeman diagram of the 2 level

In the six-dimensional 2 subspace, we can choose one of the two bases: = 2; = 1;

;

(7)

or: = 2; = 1; ; 1290

(8)

• ZEEMAN EFFECT OF THE HYDROGEN LYMAN

LINE

E

mS = +

1 2

Figure 1: The Zeeman diagram of the 1 1 2 level when the hyperfine coupling is neglected. The ordinate of the point at which the two levels = 1 2 cross is the energy of the 1 1 2 level (i.e., the eigenvalue of 0 , corrected for the global shift produced by the fine-structure Hamiltonian ). Figure 9 of Chapter XII gives an idea of the modifications of this diagram produced by .

E(l s 1/2)

mS = –

1 2

ħω0

0

adapted, respectively, to the individual angular momenta L and S and to the total angular momentum J = L + S [cf. (36a) and (36b) of Complement AX ]. The terms and which appear in expression (2) for shift the 2 level as a whole. Therefore, to study the Zeeman diagram of the 2 level, we simply diagonalize the 6 6 matrix which represents + in either one of the two bases, (7) or (8). Actually, since and = 2 L S both commute with = + , this 6 6 matrix can be broken down into as many submatrices as there are distinct values of . Thus, there appear two one-dimensional submatrices (corresponding respectively to = +3 2 and = 3 2) and two two-dimensional submatrices (corresponding respectively to = +1 2 and = 1 2). The calculation of the eigenvalues and associated eigenvectors (which is very much like that of § E-4 of Chapter XII) presents no difficulties and leads to the Zeeman diagram shown in Figure 2. This diagram is composed of two straight lines and four hyperbolic branches. In a zero field, the energies depend only on . We obtain the two fine-structure levels, 2 3 2 and 2 1 2 , already studied in § C of Chapter XII, whose energies are equal to: 1 (2 3 2 ) = ˜ (2 ) + 2 ~2 (9) 2 (2

1 2)

= ˜ (2 )

2

~2

(10)

˜ (2 ) is the 2 level energy (2 ) corrected for the global shift due to and [cf. expressions (C-8) and (C-9) of Chapter XII]. 2 is the constant which appears in the restriction 2 L S of to the 2 level [cf. expression (C-13) of Chapter XII]. In weak magnetic fields ( ), the slope of the energy levels can also be obtained by treating like a perturbation of . It is then necessary to diagonalize the 1291

COMPLEMENT DXII

• E

E(2p3/2)

E(2p)

E(2p1/2)

0

ħω0

Figure 2: The Zeeman diagram of the 2 level when the hyperfine coupling is neglected. In a zero field, we find the fine-structure levels, 2 1 2 and 2 3 2 . The Zeeman diagram is composed of two straight lines and two hyperbolas (for which the asymptotes are shown in dashed lines). The hyperfine coupling would significantly modify this diagram only in the neighborhood of 0 = 0. ˜ (2 ) is the 2 level energy (the eigenvalue 4 of 0 ) corrected for the global shift produced by + .

4 4 and 2 2 matrices representing in the 2 3 2 and 2 1 2 levels. Calculations analogous to those of § E-2 of chapter XII show that these two submatrices are respectively proportional to those which represent 0 in the same subspaces. The proportionality coefficients, called “Landé factors” (cf. Complement DX , § 3), are equal, respectively, 1292

• ZEEMAN EFFECT OF THE HYDROGEN LYMAN

LINE

to1 : (2

3 2)

=

4 3

(11)

(2

1 2)

=

2 3

(12)

In weak fields, each fine-structure level therefore splits into 2 + 1 equidistant Zeeman sublevels. The eigenstates are the states of the “coupled” basis, (8), corresponding to the eigenvalues: (

)=

(2

)+

(2

)~

(13)

0

where the (2 ) are given by expressions (9) and (10). In strong fields ( ), we can, on the other hand, treat = 2 L S like a perturbation of , which is diagonal in basis (7). As in § E-3-b of chapter XII, it can easily be shown that only the diagonal elements of 2 L S are involved when the corrections are calculated to first order in . Thus, we find that in strong fields, the eigenstates are the states of the “decoupled” basis, (7), and the corresponding eigenvalues are: ) = ˜ (2 ) + (

(

+2

)~

0

+

~2

2

(14)

Formula (14) gives the asymptotes of the diagram of Figure 2. As the magnetic field 0 increases, we pass continuously from basis (8) to basis (7). The magnetic field gradually decouples the orbital angular momentum and the spin. This situation is the analogue of the one studied in § E of Chapter XII, in which the angular momenta S and I were coupled or decoupled, depending on the relative importance of the hyperfine and Zeeman terms. 4.

The Zeeman effect of the resonance line

4-a.

Statement of the problem

Arguments of the same type as those of § 2-c of Complement DVII (see, in particular, the comment at the end of that complement) show that the optical transition between a 2 Zeeman sublevel and a 1 Zeeman sublevel is possible only if the matrix element of the electric dipole operator R between these two states is different from zero2 . In addition, depending on whether it is the ( + ), ( ) or operator which has a non-zero matrix element between the two Zeeman sublevels under consideration, the polarization state of the emitted light is + , or . Therefore, we use the previously determined eigenvectors and eigenvalues of in order to obtain the frequencies of the various Zeeman components of the hydrogen resonance line and their polarization states.

Comment: 1 These

Landé factors can be calculated directly from formula (43) of Complement DX . electric dipole, since it is an odd operator, has no matrix elements between the 1 and 2 states, which are both even. This is why we are ignoring the 2 states here. 2 The

1293

COMPLEMENT DXII




Figure 3: The disposition, in a weak field, of the Zeeman sublevels arising from the fine-structure levels 1s_{1/2}, 2p_{1/2}, 2p_{3/2} (whose zero-field energies are marked on the vertical energy scale). On the right-hand side of the figure are indicated the splittings between adjacent Zeeman sublevels (for greater clarity, these splittings have been exaggerated with respect to the fine-structure splitting which separates the 2p_{1/2} and 2p_{3/2} levels), as well as the values of the quantum numbers J and m_J associated with each sublevel. The arrows indicate the Zeeman components of the resonance line, each of which has a well-defined polarization, σ+, π or σ−.


The (X + iY), (X − iY) and Z operators act only on the orbital part of the wave function and cause m_L to vary, respectively, by +1, −1 and 0 (cf. Complement D_VII, § 2-c); m_S is not affected. Since m_J = m_L + m_S is a good quantum number (whatever the strength of the field B_0), Δm_J = +1 transitions have a σ+ polarization; Δm_J = −1 transitions, a σ− polarization; and Δm_J = 0 transitions, a π polarization.

4-b. The weak-field Zeeman components

Figure 3 shows the weak-field positions of the various Zeeman sublevels arising from the 1s_{1/2}, 2p_{1/2} and 2p_{3/2} levels, obtained from expressions (6), (13), (11) and (12). The vertical arrows indicate the various Zeeman components of the resonance line. The polarization is σ+, σ− or π, depending on whether Δm_J = +1, −1 or 0. Figure 4 shows the positions of these various components on a frequency scale, relative to their zero-field positions. The result differs notably from that of Complement D_VII (see Figure 2 of that complement), where, observing in a direction perpendicular to B_0, we had three equidistant components of polarization σ+, π, σ−, separated by a frequency difference ω_0/2π.


Figure 4: Frequencies of the various Zeeman components of the hydrogen resonance line. (a) In a zero field: two lines are observed, separated by the fine-structure splitting 3ξ_2p ħ/4π (ξ_2p is the spin-orbit coupling constant of the 2p level) and corresponding respectively to the transitions 2p_{3/2} → 1s_{1/2} (the line on the right-hand side of the figure) and 2p_{1/2} → 1s_{1/2} (the line on the left-hand side). (b) In a weak field B_0: each line splits into a series of Zeeman components whose polarizations are indicated; ω_0/2π is the Larmor frequency in the field B_0.

4-c. The strong-field Zeeman components

Figure 5 shows the strong-field positions of the Zeeman sublevels arising from the 1s and 2p levels [see expressions (6) and (14)]. To first order in W_f, the degeneracy between the states |m_L = 1, m_S = −1/2⟩ and |m_L = −1, m_S = 1/2⟩ is not removed. The vertical arrows indicate the Zeeman components of the resonance line. The polarization is σ+, σ− or π, depending on whether Δm_L = +1, −1 or 0 (recall that in an electric dipole transition, the quantum number m_S is not affected).




Figure 5: The disposition, in a strong field (decoupled fine structure), of the Zeeman sublevels arising from the 1s and 2p levels. On the right-hand side of the figure are indicated the values of the quantum numbers m_L and m_S associated with each Zeeman sublevel, as well as the corresponding energy, given relative to E(1s_{1/2}) or Ẽ(2p). The vertical arrows indicate the Zeeman components of the resonance line.

The corresponding optical spectrum is shown in Figure 6. The two π transitions have the same frequency (cf. Fig. 5). On the other hand, there is a small splitting, ħξ_2p/2π, between the frequencies of the two σ+ transitions and between those of the two σ− transitions. The mean distance between the σ+ doublet and the π line (or between the π line and the σ− doublet) is equal to ω_0/2π.


Figure 6: The strong-field positions of the Zeeman components of the hydrogen resonance line. Aside from the splitting of the σ+ and σ− lines, this spectrum is identical to the one obtained in Complement D_VII, where the effects related to electron spin were ignored.

The spectrum of Figure 6 is therefore similar to that of Figure 2 of Complement D_VII. Furthermore, the splitting of the σ+ and σ− lines, due to the existence of the electron spin, is easy to understand. In strong fields, L and S are decoupled. Since the 1s ↔ 2p transition is an electric dipole transition, only the orbital angular momentum L of the electron is affected by the optical transition. An argument analogous to the one in § E-3-b of Chapter XII shows that the magnetic interactions related to the spin can be described by an "internal field" which adds to the external field B_0 and whose sign changes, depending on whether the spin points up or down. It is this internal field that causes the splitting of the σ+ and σ− lines (the π line is not affected, since the m_L quantum number of the 2p sublevel involved is zero).

References and suggestions for further reading: Cagnac and Pebay-Peyroula (11.2), Chaps. XI and XVII (especially § 5-A of that chapter); White (11.5), Chap. X; Kuhn (11.1), Chap. III, § F; Sobel'man (11.12), Chap. 8, § 29.




Complement E_XII
The Stark effect for the hydrogen atom

1. The Stark effect on the n = 1 level
   1-a. The shift of the 1s state is quadratic in 𝓔
   1-b. Polarizability of the 1s state
2. The Stark effect on the n = 2 level

Consider a hydrogen atom placed in a uniform static electric field 𝓔 parallel to Oz. To the Hamiltonian studied in Chapter XII must be added the Stark Hamiltonian W_S, which describes the interaction energy of the electric dipole moment qR of the atom with the field 𝓔. W_S can be written:

W_S = −q 𝓔·R = −q 𝓔 Z        (1)

Even for the strongest electric fields that can be produced in the laboratory, we always have W_S ≪ H_0. On the other hand, if 𝓔 is strong enough, W_S can have the same order of magnitude as W_f and W_hf, or even be larger. To simplify the discussion, we shall assume throughout this complement that 𝓔 is strong enough for the effect of W_S to be much larger than that of W_f or W_hf. We shall therefore calculate directly, using perturbation theory, the effect of W_S on the eigenstates of H_0 found in Chapter VII (the next step, which we shall not consider here, would consist of evaluating the effect of W_f, and then of W_hf, on the eigenstates of H_0 + W_S). Since neither H_0 nor W_S acts on the spin variables, we shall ignore the quantum numbers m_S and m_I.

1. The Stark effect on the n = 1 level

1-a. The shift of the 1s state is quadratic in 𝓔

According to perturbation theory, the effect of the electric field 𝓔 can be obtained to first order by calculating the matrix element:

⟨n = 1, l = 0, m = 0| W_S |n = 1, l = 0, m = 0⟩

Since the operator Z is odd, and since the ground state has a well-defined parity (it is even), the preceding matrix element is zero. There is therefore no effect which is linear in 𝓔, and we must go on to the next term of the perturbation series:

ε_2 = q²𝓔² Σ_{n≠1, l, m} |⟨n, l, m| Z |1, 0, 0⟩|² / (E_1 − E_n)        (2)

where E_n is the eigenvalue of H_0 associated with the eigenstate |n, l, m⟩ (cf. Chap. VII, § C). The preceding sum is certainly not zero, since there exist states |n, l, m⟩ whose parity is opposite to that of |1, 0, 0⟩. We conclude that, to lowest order in 𝓔, the Stark shift of the 1s ground state is quadratic. Since E_1 − E_n is always negative, the ground state is lowered.

1-b. Polarizability of the 1s state

We have already mentioned that, for reasons of parity, the average values of the components of the operator R are zero in the state |1, 0, 0⟩ (the unperturbed ground state). In the presence of an electric field parallel to Oz, the ground state is no longer |1, 0, 0⟩, but rather (according to the results of § B-1-b of Chapter XI):

|ψ_0⟩ = |1, 0, 0⟩ + Σ_{n≠1, l, m} [⟨n, l, m| W_S |1, 0, 0⟩ / (E_1 − E_n)] |n, l, m⟩        (3)

This shows that the average value of the electric dipole moment qR in the perturbed ground state is, to first order in 𝓔, ⟨ψ_0| qR |ψ_0⟩. Using expression (3) for |ψ_0⟩, we then obtain:

⟨ψ_0| qR |ψ_0⟩ = q Σ_{n≠1, l, m} [⟨1, 0, 0| R |n, l, m⟩⟨n, l, m| W_S |1, 0, 0⟩ + ⟨1, 0, 0| W_S |n, l, m⟩⟨n, l, m| R |1, 0, 0⟩] / (E_1 − E_n)        (4)

Thus, we see that the electric field causes an "induced" dipole moment to appear, proportional to 𝓔. It can easily be shown, by using the spherical harmonic orthogonality relation¹, that ⟨ψ_0| qX |ψ_0⟩ and ⟨ψ_0| qY |ψ_0⟩ are zero, and that the only non-zero average value is:

⟨ψ_0| qZ |ψ_0⟩ = 2q²𝓔 Σ_{n≠1} |⟨n, 1, 0| Z |1, 0, 0⟩|² / (E_n − E_1)        (5)

In other words, the induced dipole moment is parallel to the applied field 𝓔. This is not surprising, given the spherical symmetry of the 1s state. The coefficient of proportionality χ between the induced dipole moment and the field is called the linear electric susceptibility. We see that quantum mechanics permits the calculation of this susceptibility for the 1s state:

χ_1s = 2q² Σ_{n≠1} |⟨n, 1, 0| Z |1, 0, 0⟩|² / (E_n − E_1)        (6)

¹ This relation implies that ⟨n, l, m| Z |1, 0, 0⟩ is different from zero only if l = 1, m = 0 (the argument is the same as the one given for ⟨2, 1, m| Z |2, 0, 0⟩ at the beginning of § 2 below). Consequently, in (2), (3), (4), (5), (6), the summation is actually carried out only over n (it includes, furthermore, the states of the positive-energy continuum).


2. The Stark effect on the n = 2 level

The effect of W_S on the n = 2 level can be obtained to first order by diagonalizing the restriction of W_S to the subspace spanned by the four states of the {|2, 0, 0⟩; |2, 1, m⟩, m = −1, 0, +1} basis. The |2, 0, 0⟩ state is even; the three |2, 1, m⟩ states are odd. Since W_S is odd, the matrix element ⟨2, 0, 0| W_S |2, 0, 0⟩ and the nine matrix elements ⟨2, 1, m| W_S |2, 1, m'⟩ are zero (cf. Complement F_II). On the other hand, since the |2, 0, 0⟩ and |2, 1, m⟩ states have opposite parities, ⟨2, 1, m| W_S |2, 0, 0⟩ can be different from zero. Let us show that actually only ⟨2, 1, 0| W_S |2, 0, 0⟩ is non-zero. W_S is proportional to Z = r cos θ and, therefore, to Y_1^0(θ). The angular integral which enters into the matrix elements ⟨2, 1, m| W_S |2, 0, 0⟩ is therefore of the form:

∫ [Y_1^m(Ω)]* Y_1^0(Ω) Y_0^0(Ω) dΩ

Since Y_0^0 is a constant, this integral is proportional to the scalar product of Y_1^m and Y_1^0 and is therefore different from zero only if m = 0. Moreover, since Y_1^0, R_{21}(r) and R_{20}(r) are real, the corresponding matrix element of W_S is real. We shall set:

⟨2, 1, 0| W_S |2, 0, 0⟩ = γ𝓔        (7)

without concerning ourselves with the exact value of γ [which could be calculated without difficulty since we know the wave functions φ_{2,1,0}(r) and φ_{2,0,0}(r)]. The matrix which represents W_S in the n = 2 level therefore has the following form (the basis vectors are arranged in the order |2, 1, 1⟩, |2, 1, −1⟩, |2, 1, 0⟩, |2, 0, 0⟩):

    ( 0    0    0     0  )
    ( 0    0    0     0  )
    ( 0    0    0    γ𝓔  )        (8)
    ( 0    0   γ𝓔     0  )

We can immediately deduce the corrections to first order in 𝓔 and the eigenstates to zeroth order:

    Eigenstates                                 Corrections
    |2, 1, 1⟩                                   0
    |2, 1, −1⟩                                  0
    (1/√2)(|2, 1, 0⟩ + |2, 0, 0⟩)               +γ𝓔
    (1/√2)(|2, 1, 0⟩ − |2, 0, 0⟩)               −γ𝓔        (9)

Thus, we see that the degeneracy of the n = 2 level is partially removed and that the energy shifts are linear, and not quadratic, in 𝓔. The appearance of a linear Stark effect is a typical result of the existence of two levels of opposite parities and the same energy, here the 2s and 2p levels. This situation exists only in the case of hydrogen (because of the degeneracy, for n > 1, of states of different l within the same shell).
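The structure of matrix (8) and of table (9) is easy to verify numerically. The sketch below (not from the text; atomic units, with scipy assumed available) also evaluates the matrix element ⟨2, 1, 0| Z |2, 0, 0⟩ = −3a_0 that fixes the constant γ through (1) and (7); the parameter gammaE simply stands for the product γ𝓔.

```python
import numpy as np
from scipy.integrate import quad

a0 = 1.0  # Bohr radius (atomic units)

# Hydrogen radial functions for n = 2 (standard textbook expressions)
R20 = lambda r: (1 / np.sqrt(2)) * a0**-1.5 * (1 - r / (2 * a0)) * np.exp(-r / (2 * a0))
R21 = lambda r: (1 / np.sqrt(24)) * a0**-1.5 * (r / a0) * np.exp(-r / (2 * a0))

# <2,1,0| Z |2,0,0> = (radial integral) x (angular integral = 1/sqrt(3))
radial, _ = quad(lambda r: R21(r) * R20(r) * r**3, 0, np.inf)
print("<2,1,0|Z|2,0,0> =", radial / np.sqrt(3), "a0   (analytic value: -3 a0)")

# Restriction of W_S to the n = 2 subspace, basis |2,1,1>, |2,1,-1>, |2,1,0>, |2,0,0>
gammaE = 1.0                                  # stands for gamma * E of equation (7)
W = np.zeros((4, 4))
W[2, 3] = W[3, 2] = gammaE
vals, vecs = np.linalg.eigh(W)
print("first-order corrections:", vals)                      # -gammaE, 0, 0, +gammaE
print("eigenvector for +gammaE:", vecs[:, np.argmax(vals)])  # (|2,1,0> + |2,0,0>)/sqrt(2)
```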




Comment:

The states of the n = 2 level are not stable. Nevertheless, the lifetime of the 2s state is considerably longer than that of the 2p states, since the atom passes easily from 2p to 1s by spontaneous emission of a Lyman α photon (lifetime of the order of 10⁻⁹ s), while decay from the 2s state requires the emission of two photons (lifetime of the order of a second). For this reason, the 2p states are said to be unstable and the 2s state, metastable. Since the Stark Hamiltonian W_S has a non-zero matrix element between 2s and 2p, any electric field (static or oscillating) "mixes" the metastable 2s state with the unstable 2p state, greatly reducing the 2s state's lifetime. This phenomenon is called "metastability quenching" (see also Complement H_IV, in which we study the effect of a coupling between two states of different lifetimes).

References and suggestions for further reading: The Stark effect in atoms: Kuhn (11.1), Chap. III, §§ A-6 and G; Ruark and Urey (11.9), Chap. V, §§ 12 and 13; Sobel'man (11.12), Chap. 8, § 28. The summation over the intermediate states which appears in (2) and (6) can be calculated exactly by the method of Dalgarno and Lewis; see Borowitz (1.7), § 14-5; Schiff (1.18), § 33. Original references: (2.34), (2.35), (2.36). Quenching of metastability: see Lamb and Retherford (3.11), App. II; Sobel'man (11.12), Chap. 8, § 28.5.

Chapter XIII

Approximation methods for time-dependent problems

A. Statement of the problem
B. Approximate solution of the Schrödinger equation
   B-1. The Schrödinger equation in the {|φ_n⟩} representation
   B-2. Perturbation equations
   B-3. Solution to first order in λ
C. An important special case: a sinusoidal or constant perturbation
   C-1. Application of the general equations
   C-2. Sinusoidal perturbation coupling two discrete states: the resonance phenomenon
   C-3. Coupling with the states of the continuous spectrum
D. Random perturbation
   D-1. Statistical properties of the perturbation
   D-2. Perturbative computation of the transition probability
   D-3. Validity of the perturbation treatment
E. Long-time behavior for a two-level atom
   E-1. Sinusoidal perturbation
   E-2. Random perturbation
   E-3. Broadband optical excitation of an atom

A. Statement of the problem

Consider a physical system with Hamiltonian H_0. The eigenvalues and eigenvectors of H_0 will be denoted by E_n and |φ_n⟩:

H_0 |φ_n⟩ = E_n |φ_n⟩        (A-1)

For the sake of simplicity, we shall consider the spectrum of H_0 to be discrete and non-degenerate; the formulas obtained can easily be generalized (see, for example, § C-3). We assume that H_0 is not explicitly time-dependent, so that its eigenstates are stationary states. At t = 0, a perturbation is applied to the system. Its Hamiltonian then becomes:

H(t) = H_0 + W(t)        (A-2)

with:

W(t) = λ Ŵ(t)        (A-3)

where λ is a real dimensionless parameter much smaller than 1 and Ŵ(t) is an observable (which can be explicitly time-dependent) of the same order of magnitude as H_0, and zero for t < 0.

The system is assumed to be initially in the stationary state |φ_i⟩, an eigenstate of H_0 of eigenvalue E_i. Starting at t = 0 when the perturbation is applied, the system evolves: the state |φ_i⟩ is no longer, in general, an eigenstate of the perturbed Hamiltonian. We propose, in this chapter, to calculate the probability P_if(t) of finding the system in another eigenstate |φ_f⟩ of H_0 at time t. In other words, we want to study the transitions that can be induced by the perturbation W(t) between the stationary states of the unperturbed system.

The treatment is very simple. Between the times 0 and t, the system evolves in accordance with the Schrödinger equation:

iħ (d/dt) |ψ(t)⟩ = [H_0 + λ Ŵ(t)] |ψ(t)⟩        (A-4)

The solution |ψ(t)⟩ of this first-order differential equation corresponding to the initial condition:

|ψ(t = 0)⟩ = |φ_i⟩        (A-5)

is unique. The desired probability P_if(t) can be written:

P_if(t) = |⟨φ_f|ψ(t)⟩|²        (A-6)

The whole problem therefore consists of finding the solution |ψ(t)⟩ of (A-4) that corresponds to the initial condition (A-5). However, such a problem is not generally rigorously soluble. This is why we resort to approximation methods. We shall show in this chapter how, if λ is sufficiently small, the solution |ψ(t)⟩ can be found in the form of a power series expansion in λ. Thus, we shall calculate |ψ(t)⟩ explicitly to first order in λ, as well as the corresponding probability (§ B). The general formulas obtained will then be applied (§ C) to the study of an important special case, the one in which the perturbation is a sinusoidal function of time or a constant (the interaction of an atom with an electromagnetic wave, which falls into this category, is treated in detail in Complement A_XIII). This is an example of the resonance phenomenon. Two situations will be considered: the one in which the spectrum of H_0 is discrete, and the one in which the initial state is coupled to a continuum of final states. In the latter case, we shall prove an important formula known as "Fermi's golden rule". In § D we will consider another important case, in which the perturbation fluctuates randomly; it is then characterized by its time-dependent correlation function, and will be treated with a perturbative calculation that is valid for short times. We will then show in § E how to extend the validity of this calculation to long times, within a general approximation called the "motional narrowing approximation".

Comment:

The situation treated in § C-3 of Chapter IV can be considered to be a special case of the general problem discussed in this chapter. Recall that, in Chapter IV, we discussed a two-level system (the states |φ_1⟩ and |φ_2⟩), initially in the state |φ_1⟩, subjected, beginning at time t = 0, to a constant perturbation W. The probability P_12(t) can then be calculated exactly, leading to Rabi's formula. The problem we are taking up here is much more general. We shall consider a system with an arbitrary number of levels (sometimes, as in § C-3, with a continuum of states) and a perturbation W(t) which is an arbitrary function of the time. This explains why, in general, we can obtain only an approximate solution.

B. Approximate solution of the Schrödinger equation

B-1. The Schrödinger equation in the {|φ_n⟩} representation

The probability P_if(t) explicitly involves the eigenstates |φ_i⟩ and |φ_f⟩ of H_0. It is therefore reasonable to choose the {|φ_n⟩} representation.

B-1-a. The system of differential equations for the components of the state vector

Let c_n(t) be the components of the ket |ψ(t)⟩ in the {|φ_n⟩} basis:

|ψ(t)⟩ = Σ_n c_n(t) |φ_n⟩        (B-1)

with:

c_n(t) = ⟨φ_n|ψ(t)⟩        (B-2)

Ŵ_nk(t) denotes the matrix elements of the observable Ŵ(t) in the same basis:

⟨φ_n| Ŵ(t) |φ_k⟩ = Ŵ_nk(t)        (B-3)

Recall that H_0 is represented in the {|φ_n⟩} basis by a diagonal matrix:

⟨φ_n| H_0 |φ_k⟩ = E_n δ_nk        (B-4)

We shall project both sides of Schrödinger equation (A-4) onto |φ_n⟩. To do so, we insert the closure relation:

Σ_k |φ_k⟩⟨φ_k| = 1        (B-5)

and use relations (B-2), (B-3) and (B-4). We obtain:

iħ (d/dt) c_n(t) = E_n c_n(t) + Σ_k λ Ŵ_nk(t) c_k(t)        (B-6)

The set of equations (B-6), written for the various values of n, constitutes a system of coupled linear differential equations of first order in t, which enables us, in theory, to determine the components c_n(t) of |ψ(t)⟩. The coupling between these equations arises solely from the existence of the perturbation λŴ(t), which, by its non-diagonal matrix elements Ŵ_nk(t), relates the evolution of c_n(t) to that of all the other coefficients c_k(t).

B-1-b. Changing functions

When λŴ(t) is zero, equations (B-6) are no longer coupled, and their solution is very simple. It can be written:

c_n(t) = b_n e^{−iE_n t/ħ}        (B-7)

where b_n is a constant which depends on the initial conditions. Now, if λŴ(t) is not zero, while remaining much smaller than H_0 because of the condition λ ≪ 1, we expect the solution c_n(t) of equations (B-6) to be very close to solution (B-7). In other words, if we perform the change of functions:

c_n(t) = b_n(t) e^{−iE_n t/ħ}        (B-8)

we can predict that the b_n(t) will be slowly varying functions of time. We substitute (B-8) into equation (B-6); we obtain:

iħ e^{−iE_n t/ħ} (d/dt) b_n(t) + E_n b_n(t) e^{−iE_n t/ħ} = E_n b_n(t) e^{−iE_n t/ħ} + λ Σ_k e^{−iE_k t/ħ} Ŵ_nk(t) b_k(t)        (B-9)

We now multiply both sides of this relation by e^{+iE_n t/ħ}, and introduce the Bohr angular frequency:

ω_nk = (E_n − E_k)/ħ        (B-10)

related to the pair of states |φ_n⟩ and |φ_k⟩. We obtain:

iħ (d/dt) b_n(t) = λ Σ_k e^{iω_nk t} Ŵ_nk(t) b_k(t)        (B-11)

B-2. Perturbation equations

The system of equations (B-11) is rigorously equivalent to Schrödinger equation (A-4). In general, we do not know how to find its exact solution. This is why we shall use the fact that λ is much smaller than 1 to try to determine this solution in the form of a power series expansion in λ (which we can hope to be rapidly convergent if λ is sufficiently small):

b_n(t) = b_n^(0)(t) + λ b_n^(1)(t) + λ² b_n^(2)(t) + ...        (B-12)

If we substitute this expansion into (B-11), and if we set equal the coefficients of λ^r on both sides of the equation, we find:

(i) for r = 0:

iħ (d/dt) b_n^(0)(t) = 0        (B-13)

since the right-hand side of (B-11) has a common factor λ. Relation (B-13) expresses the fact that b_n^(0) does not depend on t. Thus, if λ is zero, b_n(t) reduces to a constant [cf. (B-7)].

(ii) for r ≠ 0:

iħ (d/dt) b_n^(r)(t) = Σ_k e^{iω_nk t} Ŵ_nk(t) b_k^(r−1)(t)        (B-14)

We see that, with the zeroth-order solution determined by (B-13) and the initial conditions, recurrence relation (B-14) enables us to obtain the first-order solution (r = 1). It then furnishes the second-order solution (r = 2) in terms of the first-order one and, by recurrence, the solution to any order in terms of the one to order r − 1.

B-3. Solution to first order in λ

B-3-a. The state of the system at time t

For t < 0, the system is assumed to be in the state |φ_i⟩. Of all the coefficients b_n(t), only b_i(t) is different from zero (and, furthermore, independent of t since Ŵ is then zero). At time t = 0, Ŵ(t) may become discontinuous in passing from a zero value to the value Ŵ(0). However, since Ŵ(t) remains finite, the solution of the Schrödinger equation is continuous at t = 0. It follows that:

b_n(t = 0) = δ_ni        (B-15)

and this relation is valid for all λ. Consequently, the coefficients of expansion (B-12) must satisfy:

b_n^(0)(t = 0) = δ_ni        (B-16)

b_n^(r)(t = 0) = 0    if r ≥ 1        (B-17)

Equation (B-13) then immediately yields, for all positive t:

b_n^(0)(t) = δ_ni        (B-18)

which completely determines the zeroth-order solution.

This result now permits us to write (B-14), for r = 1, in the form:

iħ (d/dt) b_n^(1)(t) = Σ_k e^{iω_nk t} Ŵ_nk(t) δ_ki = e^{iω_ni t} Ŵ_ni(t)        (B-19)

an equation which can be integrated without difficulty. Taking into account initial condition (B-17), we find:

b_n^(1)(t) = (1/iħ) ∫_0^t e^{iω_ni t'} Ŵ_ni(t') dt'        (B-20)

If we now substitute (B-18) and (B-20) into (B-8) and then into (B-1), we obtain the state |ψ(t)⟩ of the system at time t, calculated to first order in λ.

B-3-b. The transition probability P_if(t)

According to (A-6) and definition (B-2) of c_n(t), the transition probability P_if(t) is equal to |c_f(t)|², that is, since b_f(t) and c_f(t) have the same modulus [cf. (B-8)]:

P_if(t) = |b_f(t)|²        (B-21)

where:

b_f(t) = b_f^(0)(t) + λ b_f^(1)(t) + ...        (B-22)

can be calculated from the formulas established in the preceding section. From now on, we shall assume that the states |φ_i⟩ and |φ_f⟩ are different. We shall therefore be concerned only with the transitions induced by λŴ(t) between two distinct stationary states of H_0. We then have b_f^(0)(t) = 0, and, consequently:

P_if(t) = λ² |b_f^(1)(t)|²        (B-23)

Using (B-20) and replacing λŴ_fi(t) by W_fi(t) [cf. (A-3)], we finally obtain:

P_if(t) = (1/ħ²) | ∫_0^t e^{iω_fi t'} W_fi(t') dt' |²        (B-24)

Consider the function W̃_fi(t'), which is zero for t' < 0 and t' > t, and equal to W_fi(t') for 0 ≤ t' ≤ t (cf. Fig. 1). W̃_fi(t') is the matrix element of the perturbation "seen" by the system between the time 0 and the measurement time t, when we try to determine if the system is in the state |φ_f⟩. Result (B-24) shows that P_if(t) is proportional to the square of the modulus of the Fourier transform of the perturbation actually "seen", W̃_fi(t'). This Fourier transform is evaluated at an angular frequency equal to the Bohr angular frequency associated with the transition under consideration. Note also that the transition probability P_if(t) is zero to first order if the matrix element W_fi(t) is zero for all t.
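Formula (B-24) is easy to check numerically. The sketch below (not from the text; hypothetical numbers, ħ = 1) compares the first-order probability with a direct integration of the two-level interaction-picture equations for a weak Gaussian pulse W_fi(t); the two results agree as long as the perturbation remains small.

```python
import numpy as np

hbar = 1.0
w_fi = 5.0                         # Bohr angular frequency (E_f - E_i)/hbar
W0, tau = 0.05, 1.0                # pulse amplitude and duration (weak perturbation)
W = lambda t: W0 * np.exp(-((t - 4 * tau) / tau) ** 2)   # W_fi(t), essentially zero at t = 0

t_grid = np.linspace(0.0, 8 * tau, 4000)
dt = t_grid[1] - t_grid[0]

# First-order formula (B-24): P_if(t) = |(1/i hbar) Int_0^t e^{i w_fi t'} W_fi(t') dt'|^2
b1 = np.cumsum(np.exp(1j * w_fi * t_grid) * W(t_grid)) * dt / (1j * hbar)
P_first_order = np.abs(b1) ** 2

# "Exact" evolution: Euler integration of the coupled equations (B-11) for the two amplitudes
b = np.array([1.0 + 0j, 0.0 + 0j])            # (b_i, b_f)
P_exact = []
for t in t_grid:
    coupling = W(t) / (1j * hbar)
    db_i = coupling * np.exp(-1j * w_fi * t) * b[1]
    db_f = coupling * np.exp(+1j * w_fi * t) * b[0]
    b += np.array([db_i, db_f]) * dt
    P_exact.append(abs(b[1]) ** 2)

print("final P_if, first order :", P_first_order[-1])
print("final P_if, exact       :", P_exact[-1])
```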


Comment:

We have not discussed the validity conditions of the approximation to first order in λ. Comparison of (B-11) with (B-19) shows that this approximation simply amounts to replacing, on the right-hand side of (B-11), the coefficients b_k(t) by their values b_k(0) at time t = 0. It is therefore clear that, so long as t remains small enough for b_k(t) not to differ very much from b_k(0), the approximation remains valid. On the other hand, when t becomes large, there is no reason why the corrections of order 2, 3, etc. in λ should be negligible.

C. An important special case: a sinusoidal or constant perturbation

C-1. Application of the general equations

Now assume that Ŵ(t) has one of the two simple forms:

Ŵ(t) = Ŵ sin ωt        (C-1a)
Ŵ(t) = Ŵ cos ωt        (C-1b)

where Ŵ is a time-independent observable and ω a constant angular frequency. Such a situation is often encountered in physics. For example, in Complements A_XIII and B_XIII, we consider the perturbation of a physical system by an electromagnetic wave of angular frequency ω; P_if(t) then represents the probability, induced by the incident monochromatic radiation, of a transition between the initial state |φ_i⟩ and the final state |φ_f⟩. With the particular form (C-1a) of Ŵ(t), the matrix elements Ŵ_fi(t) take on the form:

Ŵ_fi(t) = Ŵ_fi sin ωt = (Ŵ_fi/2i) (e^{iωt} − e^{−iωt})        (C-2)

Figure 1: The variation of the function W̃_fi(t') with respect to t'. This function coincides with W_fi(t') in the interval 0 ≤ t' ≤ t, and goes to zero outside this interval. It is the Fourier transform of W̃_fi(t') that enters into the transition probability P_if(t) to lowest order.

where Ŵ_fi is a time-independent complex number. Let us now calculate the state vector of the system to first order in λ. If we substitute (C-2) into general formula (B-20), we obtain:

b_f^(1)(t) = −(Ŵ_fi/2ħ) ∫_0^t [e^{i(ω_fi+ω)t'} − e^{i(ω_fi−ω)t'}] dt'        (C-3)

The integral which appears on the right-hand side of this relation can easily be calculated and yields:

b_f^(1)(t) = (Ŵ_fi/2iħ) [ (1 − e^{i(ω_fi+ω)t})/(ω_fi + ω) − (1 − e^{i(ω_fi−ω)t})/(ω_fi − ω) ]        (C-4)

Therefore, in the special case we are treating, general equation (B-24) becomes:

P_if(t; ω) = λ² |b_f^(1)(t)|² = (|W_fi|²/4ħ²) | (1 − e^{i(ω_fi+ω)t})/(ω_fi + ω) − (1 − e^{i(ω_fi−ω)t})/(ω_fi − ω) |²        (C-5a)

(we have added the variable ω in the probability P_if, since the latter depends on the angular frequency of the perturbation). If we choose the special form (C-1b) for Ŵ(t) instead of (C-1a), a calculation analogous to the preceding one yields:

P_if(t; ω) = (|W_fi|²/4ħ²) | (1 − e^{i(ω_fi+ω)t})/(ω_fi + ω) + (1 − e^{i(ω_fi−ω)t})/(ω_fi − ω) |²        (C-5b)

The operator Ŵ cos ωt becomes time-independent if we choose ω = 0. The transition probability P_if(t) induced by a constant perturbation W can therefore be obtained by replacing ω by 0 in (C-5b):

P_if(t) = (|W_fi|²/ħ²) |1 − e^{iω_fi t}|² / ω_fi² = (|W_fi|²/ħ²) F(t, ω_fi)        (C-6)

with:

F(t, ω_fi) = [ sin(ω_fi t/2) / (ω_fi/2) ]²        (C-7)

In order to study the physical content of equations (C-5b) and (C-6), we shall first consider the case in which |φ_i⟩ and |φ_f⟩ are two discrete levels (§ C-2), and then that in which |φ_f⟩ belongs to a continuum of final states (§ C-3). In the first case, P_if(t; ω) [or P_if(t)] really represents a transition probability which can be measured, while, in the second case, we are actually dealing with a probability density (the truly measurable quantities then involve a summation over a set of final states). From a physical point of view, there is a distinct difference between these two cases. We shall see in Complements C_XIII and D_XIII that, over a sufficiently long time interval, the system oscillates between the states |φ_i⟩ and |φ_f⟩ in the first case, while it leaves the state |φ_i⟩ irreversibly in the second case.

Figure 2: The relative disposition of the energies E_i and E_f associated with the states |φ_i⟩ and |φ_f⟩. If E_f > E_i (fig. a), the transition occurs through absorption of an energy quantum ħω. If, on the other hand, E_f < E_i (fig. b), the transition occurs through induced emission of an energy quantum ħω.

In § C-2, in order to concentrate on the resonance phenomenon, we shall choose a sinusoidal perturbation, but the results obtained can easily be transposed to the case of a constant perturbation. On the other hand, we shall use this latter case for the discussion of § C-3.

C-2. Sinusoidal perturbation coupling two discrete states: the resonance phenomenon

C-2-a. Resonant nature of the transition probability

When the time t is fixed, the transition probability P_if(t; ω) is a function only of the variable ω. We shall see that this function has a maximum for:

ω ≈ ω_fi        (C-8a)

or:

ω ≈ −ω_fi        (C-8b)

A resonance phenomenon therefore occurs when the angular frequency of the perturbation coincides with the Bohr angular frequency associated with the pair of states |φ_i⟩ and |φ_f⟩. If we agree to choose ω ≥ 0, relations (C-8) give the resonance conditions corresponding respectively to the cases ω_fi > 0 and ω_fi < 0. In the first case (cf. Fig. 2-a), the system goes from the lower energy level E_i to the higher level E_f by the resonant absorption of an energy quantum ħω. In the second case (cf. Fig. 2-b), the resonant perturbation stimulates the passage of the system from the higher level E_i to the lower level E_f (accompanied by the induced emission of an energy quantum ħω). Throughout this section, we shall assume that ω_fi is positive (the situation of Figure 2-a). The case in which ω_fi is negative could be treated analogously.
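This resonant behavior is simple to observe numerically. The following sketch (not from the text; illustrative numbers with ħ = 1) evaluates the full expression (C-5b), keeping both terms, over a range of ω at fixed t; the maximum appears at ω ≈ ω_fi with height |W_fi|²t²/4ħ² and width of order 4π/t, as shown below.

```python
import numpy as np

hbar = 1.0
Wfi = 0.01          # matrix element <phi_f|W|phi_i> (weak)
w_fi = 10.0         # Bohr angular frequency of the transition
t = 20.0            # fixed observation time

def A(delta):
    """(1 - exp(i delta t))/delta written in the regular form of (C-9): -i e^{i delta t/2} sin(delta t/2)/(delta/2)."""
    return -1j * np.exp(1j * delta * t / 2) * t * np.sinc(delta * t / (2 * np.pi))

def P_if(omega):
    # Equation (C-5b): anti-resonant term A(w_fi + omega) plus resonant term A(w_fi - omega)
    return (Wfi ** 2 / (4 * hbar ** 2)) * np.abs(A(w_fi + omega) + A(w_fi - omega)) ** 2

omegas = np.linspace(5.0, 15.0, 20001)
P = P_if(omegas)
print("resonance found at omega =", omegas[np.argmax(P)], " (omega_fi =", w_fi, ")")
print("P at resonance =", P.max(), "   expected |Wfi|^2 t^2 / 4 hbar^2 =", Wfi**2 * t**2 / 4)
print("expected width 4*pi/t =", 4 * np.pi / t)
```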


To reveal the resonant nature of the transition probability, we note that both expressions (C-5a) and (C-5b) for P_if(t; ω) involve the square of the modulus of a sum of two complex terms. The first of these terms is proportional to:

A_+ = [1 − e^{i(ω_fi+ω)t}] / (ω_fi + ω) = −i e^{i(ω_fi+ω)t/2} sin[(ω_fi + ω)t/2] / [(ω_fi + ω)/2]        (C-9a)

and the second one, to:

A_− = [1 − e^{i(ω_fi−ω)t}] / (ω_fi − ω) = −i e^{i(ω_fi−ω)t/2} sin[(ω_fi − ω)t/2] / [(ω_fi − ω)/2]        (C-9b)

The denominator of the A_− term goes to zero for ω = ω_fi, and that of the A_+ term, for ω = −ω_fi. Consequently, for ω close to ω_fi, we expect only the A_− term to be important; this is why it is called the "resonant term", while A_+ is called the "anti-resonant term" (A_+ would become resonant if, for negative ω_fi, ω were close to −ω_fi). Let us then consider the case in which:

ω ≈ ω_fi        (C-10)

neglecting the anti-resonant term A_+ (the validity of this approximation will be discussed in § C-2-c below). Taking (C-9b) into account, we then obtain:

P_if(t; ω) = (|W_fi|²/4ħ²) F(t, ω − ω_fi)        (C-11)

with:

F(t, ω − ω_fi) = { sin[(ω_fi − ω)t/2] / [(ω_fi − ω)/2] }²        (C-12)

Figure 3 represents the variation of P_if(t; ω) with respect to ω, for a given time t. It clearly shows the resonant nature of the transition probability. This probability presents a maximum for ω = ω_fi, when it is equal to |W_fi|²t²/4ħ². As we move away from ω_fi, it decreases, going to zero for |ω − ω_fi| = 2π/t. When |ω − ω_fi| continues to increase, it oscillates between the value |W_fi|²/ħ²(ω − ω_fi)² and zero ("diffraction pattern").

C-2-b. The resonance width and the time-energy uncertainty relation

4

(C-13)

The larger the time , the smaller this width. Result (C-13) presents a certain analogy with the time-energy uncertainty relation (cf. Chap. III, § D-2-e). Assume that we want to measure the energy difference = 1312

C. AN IMPORTANT SPECIAL CASE: A SINUSOIDAL OR CONSTANT PERTURBATION

𝒫if (t;ω) Wfi 2t2 4ħ2

𝛥ω =

0

4π t

ωfi

ω

Figure 3: Variation, with respect to , of the first-order transition probability P ( ; ) associated with a sinusoidal perturbation of angular frequency ; is fixed. When , a resonance appears whose intensity is proportional to 2 and whose width is inversely proportional to .

~ by applying a sinusoidal perturbation of angular frequency to the system and varying so as to detect the resonance. If the perturbation acts during a time , the uncertainty ∆ on the value will be, according to (C-13), of the order of: ∆

= ~∆

~

(C-14)

Therefore, the product ∆ cannot be smaller than ~. This recalls the time-energy uncertainty relation, although here is not a time interval characteristic of the free evolution of the system, but is externally imposed. C-2-c.

Validity of the perturbation treatment

Now let us examine the limits of validity of the calculations leading to result (C-11). We shall first discuss the resonant approximation, which consists of neglecting the antiresonant term + , and then the first-order approximation in the perturbation expansion of the state vector. .

Discussion of the resonant approximation

Using the hypothesis therefore compare the moduli of

+

, we have neglected and .

+

relative to

. We shall

1313

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

2 The shape of the function ( ) 2 is shown in Figure 3. Since = +( ) 2 2 ( ) , ( ) can be obtained by plotting the curve which is symmetric with + respect to the preceding one relative to the vertical axis = 0. If these two curves, of width ∆ , are centered at points whose separation is much larger than ∆ , it is clear that, in the neighborhood of = , the modulus of + is negligible compared to that of . The resonant approximation is therefore justified on the condition1 that:

2



(C-15)

that is, using (C-13): 1

1

(C-16)

Result (C-11) is therefore valid only if the sinusoidal perturbation acts during a time which is large compared to 1 . The physical meaning of such a condition is clear: during the interval [0 ], the perturbation must perform numerous oscillations to appear to the system as a sinusoidal perturbation. If, on the other hand, were small compared to 1 , the perturbation would not have enough time to oscillate and would be equivalent to a perturbation varying linearly in time [in the case (C-1a)] or constant [in the case (C-1b)].

Comment:

For a constant perturbation, condition (C-16) can never be satisfied, since is zero. However, it is not difficult to adapt the calculations of § C-2-b above to this case. We have already obtained [in (C-6)] the transition probability P ( ) for a constant perturbation by directly setting = 0 in (C-5b). Note that the two terms + and are then equal, which shows that if (C-16) is not satisfied, the anti-resonant term is not negligible. The variation of the probability P ( ) with respect to the energy difference ~ (with the time fixed) is shown in Figure 4. This probability is maximal when = 0, which corresponds to what we found in § C-2-b above: if its angular frequency is zero, the perturbation is resonant when = 0 (degenerate levels). More generally, the considerations of § C-2-b concerning the features of the resonance can be transposed to this case.

.

Limits of the first-order calculation

We have already noted (cf. comment at the end of § B-3-b) that the first-order approximation can cease to be valid when the time becomes too large. This can indeed be seen from expression (C-11), which, at resonance, can be written: P (;

2

=

)=

2

4~2

(C-17)

1 Note that if condition (C-15) is not satisfied, the resonant and anti-resonant terms interfere: it is 2. not correct to simply add + 2 and

1314

C. AN IMPORTANT SPECIAL CASE: A SINUSOIDAL OR CONSTANT PERTURBATION

This function becomes infinite when , which is absurd, since a probability can never be greater than 1. In practice, for the first-order approximation to be valid at resonance, the probability in (C-17) must be much smaller than 1, that is2 : ~

(C-18)

To show precisely why this inequality is related to the validity of the first-order approximation, it would be necessary to calculate the higher-order corrections from (B-14) and to examine under what conditions they are negligible. We would then see that, although inequality (C-18) is necessary, it is not rigorously sufficient. For example, in the terms of second or higher order,

𝒫if (t) Wfi 2t2 ħ2

𝛥ω ≃

4π t

0

ωfi

Figure 4: Variation of the transition probability P ( ) associated with a constant perturbation with respect to =( ) ~, for fixed . A resonance appears, centered about = 0 (conservation of energy), with the same width as the resonance of Figure 3, but an intensity four times greater (because of the constructive interference of the resonant and anti-resonant terms, which, for a constant perturbation, are equal). 2 For this theory to be meaningful, it is obviously necessary for conditions (C-16) and (C-18) to be compatible. That is, we must have:

1

~

This inequality means that the energy difference element of ( ) between and .

= ~

is much larger than the matrix

1315

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

there appear matrix elements ˆ of ˆ other than ˆ , on which certain conditions must be imposed for the corresponding corrections to be small.

Note that the problem of calculating the transition probability when does not satisfy (C-18) is taken up in Complement CXIII , in which an approximation of a different type is used (the secular approximation). C-3.

Coupling with the states of the continuous spectrum

If the energy belongs to a continuous part of the spectrum of 0 , that is, if the final states are labeled by continuous indices, we cannot measure the probability of finding the system in a well-defined state at time . The postulates of Chapter III indicate that in this case the quantity ( ) 2 which we determined above (approximately) is a probability density. The physical predictions for a given measurement then involve an integration of this probability density over a certain group of final states (which depends on the measurement to be made). We shall consider what happens to the results of the preceding sections in this case. C-3-a.

.

Integration over a continuum of final states: density of states

Example

To understand how this integration is performed over the final states, we shall first consider a concrete example. We shall discuss the problem of the scattering of a spinless particle of mass by a potential (r) (cf. Chap. VIII). The state ( ) of the particle at time can be expanded on the states p of well-defined momenta p and energies: =

p2 2

(C-19)

The corresponding wave functions are the plane waves: rp =

3 2

1

epr

2 ~

~

(C-20)

The probability density associated with a measurement of the momentum is p ( ) 2 [ ( ) is assumed to be normalized]. The detector used in the experiment (see, for example, Figure 2 of Chapter VIII) gives a signal when the particle is scattered with the momentum p . Of course, this detector always has a finite angular aperture, and its energy selectivity is not perfect: it emits a signal whenever the momentum p of the particle points within a solid angle Ω about p and its energy is included in the interval centered at = p2 2 . If denotes the domain of p-space defined by these conditions, the probability of obtaining a signal from the detector is therefore: P(p

d3

)=

p ()

2

(C-21)

p

To use the results of the preceding sections, we shall have to perform a change of variables which results in an integral over the energies. This does not present any difficulties, since 1316

C. AN IMPORTANT SPECIAL CASE: A SINUSOIDAL OR CONSTANT PERTURBATION

we can write: d3 =

2

d dΩ

(C-22)

and replace the variable obtain:

by the energy

, to which it is related by (C-19). We thus

d3 = ( )d dΩ

(C-23)

where the function ( ), called the density of final states, can be written, according to (C-19), (C-22) and (C-23): ( )=

2

d d

=

2

=

2

(C-24)

(C-21) then becomes: P(p

)=

dΩ d Ω

.

( ) p ()

2

(C-25)

Ω ;

The general case

Assume that, in a particular problem, certain eigenstates of 0 are labeled by a continuous set of indices, symbolized by , such that the orthonormalization relation can be written: = (

)

(C-26)

The system is described at time by the normalized ket ( ) . We want to calculate the probability P( ) of finding the system, in a measurement, in a given group of final states. We characterize this group of states by a domain of values of the parameters , centered at , and we assume that their energies form a continuum. The postulates of quantum mechanics then yield: P(

)=

d

()

2

(C-27)

As in the example of § above, we shall change variables, and introduce the density of final states. Instead of characterizing these states by the parameters , we shall use the energy and a set of other parameters (which are necessary when 0 alone does not constitute a C.S.C.O.). We can then express d in terms of d and d : d = (

)d d

(C-28)

in which the density of final states3 ( range of values of the parameters and P(

)=

d d

(

) appears. If we denote by defined by , we obtain: )

()

and

2

the

(C-29)

;

where the notation has been replaced by -dependence of the probability density ()

in order to point up the 2

- and

.

3 In the general case, the density of states depends on both (cf. example of § above) that depends only on .

and

. However, it often happens

1317

CHAPTER XIII

C-3-b.

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

Fermi’s golden rule

In expression (C-29), ( ) is the normalized state vector of the system at time . As in § A of this chapter, we shall consider a system which is initially in an eigenstate of 0 [ therefore belongs to the discrete spectrum of 0 , since the initial state of the system must, like ( ) , be normalizable]. In (C-29), we shall replace the notation P( ) by P( ) in order to remember that the system starts from the state . The calculations of § B and their application to the case of a sinusoidal or constant perturbation (§§ C-1 and C-2) remain valid when the final state of the system belongs to the continuous spectrum of 0 . If we assume to be constant, we can therefore use (C-6) to find the probability density ( ) 2 to first order in . We then get: ()

2

=

1 ~2

2

(C-30) ~

where and are the energies of the states function defined by (C-7). We get for P( P(

)=

1 ~2

d d

(

and ), finally:

respectively, and

2

)

is the

(C-31) ~

;

The function varies rapidly about = (cf. Fig. 4). If is sufficiently ~ large, this function can be approximated, to within a constant factor, by the function ( ), since, according to (11) and (20) of Appendix II, we have: lim

=

2~

~

=2 ~

(

)

(C-32)

2 On the other hand, the function ( ) generally varies much more slowly with . We shall assume here that is sufficiently large for the variation of this function over an energy interval of width 4 ~ centered at = to be negligible4 . We can then in (C-31) replace by its limit (C-32). This enables us to perform the integral ~ over immediately. If, in addition, is very small, integration over is unnecessary, and we finally get:

– when the energy P(

)=

– when the energy P(

)=0

belongs to the domain 2 ~

=

: 2

(

=

)

(C-33a)

does not belong to this domain: (C-33b)

As we saw in the comment of § C-2-c- , a constant perturbation can induce transitions only between states of equal energies. The system must have the same energy (to within 2 ~ ) in the initial and final states. This is why, if the domain excludes the energy , the transition probability is zero. 4 ( 2 must vary slowly enough to enable the finding of values of that satisfy the ) stated condition but remain small enough for the perturbation treatment of to be valid. Here, we also assume that 4 ~ .

1318

C. AN IMPORTANT SPECIAL CASE: A SINUSOIDAL OR CONSTANT PERTURBATION

The probability (C-33a) increases linearly with time. Consequently, the transition probability per unit time, ( ), defined by: (

)=

d P( d

)

(C-34)

is time-independent. We introduce the transition probability density per unit time and per unit interval of the variable : (

(

)=

)

(C-35)

It is equal to: (

2 ~

)=

2

=

(

=

)

(C-36)

This important result is known as Fermi’s golden rule.

Comments:

( ) Assume that is a sinusoidal perturbation of the form (C-1a) or (C-1b), which couples a state ( to a continuum of states with energies close to + ~ . Starting with (C-11), we can carry out the same procedure as above, which yields: (

)=

=

2~

2

+~

(

=

+~ )

( ) Let us return to the problem of the scattering of a particle by a potential matrix elements in the r

r =

(r) (r

(C-37) whose

representation are given by:

r r)

(C-38)

Now assume that the initial state of the system is a well-defined momentum state: ( = 0) = p

(C-39)

and let us calculate the scattering probability of an incident particle of momentum p into the states of momentum p grouped about a given value p (with p = p ). (C-36) gives the scattering probability (p p ) per unit time and per unit solid angle about p = p : (p p ) =

2 ~

p

p

2

(

=

)

(C-40)

Taking into account (C-20), (C-38) and expression (C-24) for ( ), we then get: (p p ) =

2 ~

2

1 2 ~

6

2

d3 e (p

p )r ~

(r)

(C-41)

On the right-hand side of this relation, we recognize the Fourier transform of the potential (r), evaluated for the value of p equal to p p .

1319

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

Note that the initial state p chosen here is not normalizable, and it cannot represent the physical state of a particle. However, although the norm of p is infinite, the right-hand side of (C-41) maintains a finite value. Intuitively, we can therefore expect to obtain a correct physical result from this relation. If we divide the probability obtained by the probability current: 1 2 ~

=

3

~

=

1 2 ~

3

2

(C-42)

associated, according to (C-20), with the state p , we obtain: (p p )

2

2

=

4

2 ~4

d3 e (p

p )r ~

(r)

(C-43)

which is the expression for the scattering cross section in the Born approximation (§ B-4 of Chap. VIII). Although it is not rigorous, the preceding treatment enables us to show that the scattering cross sections of the Born approximation can also be obtained by a time-dependent approach, using Fermi’s golden rule.
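As a concrete check of this connection, the sketch below (not from the text; hypothetical parameters, ħ = m = 1, scipy assumed available) evaluates the Born cross section (C-43) for a screened Coulomb (Yukawa) potential from the Fourier transform of V(r) and compares it with the corresponding closed-form result.

```python
import numpy as np
from scipy.integrate import quad

hbar = m = 1.0
V0, a = 0.3, 1.5
p = 2.0                                   # |p_i| = |p_f| (elastic scattering)

def dsigma_dOmega(theta):
    q = 2 * p * np.sin(theta / 2)         # momentum transfer |p_f - p_i|
    # Fourier transform of a central potential: (4 pi hbar / q) * Int_0^inf r V(r) sin(q r / hbar) dr
    radial, _ = quad(lambda r: r * V0 * a * np.exp(-r / a) * np.sin(q * r / hbar), 0, 50 * a)
    Vq = 4 * np.pi * hbar / q * radial
    return m**2 / (4 * np.pi**2 * hbar**4) * Vq**2

theta = np.pi / 3
qa = 2 * p * np.sin(theta / 2) * a / hbar
closed_form = 4 * m**2 * V0**2 * a**6 / (hbar**4 * (1 + qa**2) ** 2)
print("numerical   :", dsigma_dOmega(theta))
print("closed form :", closed_form)
```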

D. Random perturbation

Another interesting case occurs when the perturbation applied to the system fluctuates in a random fashion. Consider for example an atom (a) having a spin magnetic moment, and moving in a gas of particles (b) which also have magnetic moments. As atom (a) undergoes a series of random collisions with particles (b), it is subjected to a magnetic field that varies randomly from one collision to another. The resulting interactions can change the orientation of the atom's magnetic moment. This type of situation is treated here (§ D). We shall go back to the calculation of § B, assuming that the matrix element Ŵ_fi(t) of the perturbation is a random function of time. Our aim is to study the transition probability P_if(t) for going from the state |φ_i⟩ to the state |φ_f⟩ after a time t, and to determine how it differs from the result found in the previous section.
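The following Monte Carlo sketch (not from the text; illustrative numbers, ħ = 1) anticipates the result derived below: for a randomly fluctuating matrix element, modeled here as a random telegraph signal of amplitude W_1 and correlation time τ_c, the averaged first-order transition probability grows linearly in time, at a rate given by the Fourier transform of the correlation function g(τ) = W_1² e^{−|τ|/τ_c} taken at the transition frequency.

```python
import numpy as np

rng = np.random.default_rng(0)
hbar = 1.0
w_fi = 5.0                  # Bohr angular frequency of the transition
W1, tau_c = 0.2, 0.1        # amplitude and correlation time (motional narrowing: W1*tau_c/hbar << 1)

dt, n_steps, n_samples = 0.01, 2000, 400
t = np.arange(1, n_steps + 1) * dt

P_avg = np.zeros(n_steps)
for _ in range(n_samples):
    flips = rng.random(n_steps) < dt / (2 * tau_c)      # flip rate 1/(2 tau_c) -> g(tau) = W1^2 e^{-|tau|/tau_c}
    W = W1 * np.where(rng.random() < 0.5, 1, -1) * np.cumprod(np.where(flips, -1, 1))
    b1 = np.cumsum(np.exp(1j * w_fi * t) * W) * dt / (1j * hbar)   # first-order amplitude, as in (B-20)
    P_avg += np.abs(b1) ** 2
P_avg /= n_samples

Gamma = (W1**2 / hbar**2) * 2 * tau_c / (1 + (w_fi * tau_c) ** 2)  # Fourier transform of g at omega_fi
print("slope from simulation           :", (P_avg[-1] - P_avg[n_steps // 2]) / (t[-1] - t[n_steps // 2]))
print("rate from correlation function  :", Gamma)
```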

D-1. Statistical properties of the perturbation

Here we consider the evolution of a single quantum system, atom (a) in the example described above, and study its evolution averaged over time. We thus need to consider the properties of statistical averages over time5 of the perturbation ( ). We note ( ) the average value of the matrix element ( ), and assume it is equal to zero: ()=0

(D-1)

This means that ( ) fluctuates between values that can be opposite. Since the matrix elements and are two complex conjugate numbers, their product is necessarily positive or zero, hence having in general a non-zero average value: ()

()

0

(D-2)

5 The point of view of Complement E XIII is more directly related to most experimental situations: we study an ensemble of individual quantum systems described by their density operator. The two approaches are nevertheless equivalent since, in statistical mechanics, averaging over “a Gibbs ensemble” is equivalent to averaging a single system over a long time.

1320

D. RANDOM PERTURBATION

It will be useful in what follows to also consider the average value of such a product taken at different instants and + , called the correlation function ( ): ( + )

()=

( )=0

(D-3)

The dependence of ( ) characterizes the time during which the perturbation keeps a “memory” of its value: ( ) is non-zero as long as ( + ) remains correlated with ( ). The correlation function ( ) goes to 0 when the time difference is longer than a characteristic time called the “correlation time” : ()

(

)=

( )

0

if

(D-4)

We shall consider the case where is very short compared to all the other evolution times of the system. For instance, in the example mentioned above of an atom (a) diffusing in a gas of particles (b), the correlation time is of the order of the duration of a single collision, generally (much) shorter than 10 10 s. We assume the random perturbation to be stationary, meaning that the correlation functions depend only on the difference between the two instants + and . Consequently, we can also write: ()

(

)=

( )

(D-5)

Using complex conjugation, relation (D-3) can be written: ( + )

()=

()

( + )=

( )

(D-6)

Comparing with (D-5) yields: ( )=

(

)

(D-7)

Changing the sign of the variable transforms the function ( ) into its complex conjugate; in particular, (0) is real. In the following computations, il will be useful to introduce the Fourier transform ˜ ( ) of the function ( ): ˜ ( )=

1 2

+

d e

( )

(D-8)

˜ ( )

(D-9)

leading to its inverse relation: ( )=

1 2

+

d e+

Relation (D-7) implies that ˜ ( ) is a real function. D-2.

Perturbative computation of the transition probability

As in (B-8), we perform the change of functions that transform ( ) into ( ), which eliminates the variations of the coefficients due to 0 alone (this amounts to using the interaction picture, cf. exercise 15 of Complement LIII ). We assume that (0) = 1 1321

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

and (0) = 0. We then look for the probability amplitude ( ) for the system, starting at time = 0 from the state , to be found at time in the state . Equation (B-20) is now written: (1)

()=

1 ~

e

( )d

(D-10)

0

The probability of finding the system in the state (D-10) by its complex conjugate. Since = [

(1)

( )]

[

(1)

1 ~2

( )] =

e

at time is obtained by multiplying and = , we get:

( )d

e

0

( )d

(D-11)

0

The transition probability P ( ) is the average of (D-11) over the various values of the random perturbation. This leads to: P ( )=[

(1)

( )]

[

(1)

( )] =

1 ~2

d 0

d e

(

)

( )

( )

(D-12)

0

Setting: =

(D-13)

and using (D-5) enable us to write (D-12) as: P ()=

1 ~2

d

d e

( )

(D-14)

0

(the change of sign coming from d = d gration limits). We assume in what follows that:

is accounted for by interchanging the inte-

(D-15) The integral over d is taken over a time interval from 0 to , very large compared to . Its value will not be significantly modified if we shorten that interval at both end by a few . If is of the order of a few units ( = 2 or = 3 for instance), we can write: P ()

1 ~2

d

d e

( )

(D-16)

In the integral over d , the upper limit is ; this upper limit may be extended to infinity since ( ) goes to zero when , and hence the additional contribution to the integral is zero. In the same way, the negative lower limit can be replaced by , since the condition ensures that the function to be integrated is zero in the additional integration domain. The integral over d becomes independent of , so that the integral over d is easily computed and leads to: d =( 1322

2

)

(D-17)

D. RANDOM PERTURBATION

We then get: P ()

Γ

(D-18)

where the constant Γ is defined by: Γ=

1 ~2

+

( )e

d =

2 ˜ ( ~2

)

(D-19)

This result involves the Fourier transform ˜ ( ) of the correlation function ( ) defined by relation (D-8), taken at the (angular) frequency = of the transition between the initial state and the final state . As already mentioned, relation (D-7) shows that the constant Γ is real. The transition probability from to after a time is thus proportional to that time. This means that when the perturbation is random, one can define (at least for short times6 where the perturbative treatment to lowest order is valid) a transition probability per unit time from to . It is proportional to the Fourier transform of the correlation function of the perturbation, computed at the angular frequency . This is a very different result from the one obtained in (C-11) and (C-12) for a sinusoidal perturbation. In that case, the transition probability increased as 2 for short times, and then oscillated as a function of time. D-3.

Validity of the perturbation treatment

Result (D-18), obtained by a perturbative treatment, is valid as long as the transition probability remains small, that is if: 1 Γ

(D-20)

On the other hand, to establish (D-18) we assumed in (D-15) that was much larger than . The two conditions (D-20) and (D-15) are compatible only if: 1 Γ

(D-21)

The calculation we just presented implies the existence of two very different time scales: the evolution time of the system, of the order of 1 Γ, often called the “relaxation time”; the correlation time, , which is much shorter and characterizes the memory of the fluctuations of the random perturbation. Equation (D-21) simply expresses the fact that, during this correlation time, the system barely evolves. Using for Γ relation (D-19), this inequality can be written solely with parameters concerning the perturbation. This inequality is often called the “motional narrowing condition”, for reasons that will be explained in § 2-c- of Complement EXIII .

6 We

shall see in the next section under which conditions this result remains valid for long times.

1323

CHAPTER XIII

E.

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

Long-time behavior for a two-level atom

Until now, we have limited ourselves in this chapter to perturbative calculations of the transition probability after a time . We found that it increases as 2 for a sinusoidal perturbation, but linearly as for a random perturbation with a short memory. As a probability cannot become larger than one, such approximations are only valid for small values of . In the last part of this chapter, we shall present treatments that permit us to study and compare the long-time behaviors of a system subjected to these two types of perturbations. For the sake of simplicity, we shall limit our study to the case of a two-level system. E-1.

Sinusoidal perturbation

We already studied in Complement FIV the long-time behavior of a two-level system subjected to a sinusoidal perturbation. We computed the exact evolution of a spin 1 2 in the special case where the Hamiltonian obeys relation (14) of FIV : ()=

} 2

1e

0 1e

(E-1) 0

(this matrix is written using the basis + of the eigenvectors of the spin component). The diagonal of this matrix yields the matrix elements of the Hamiltonian 0 ; this Hamiltonian comes from the coupling of the spin with a static magnetic field B0 , parallel to the axis. The perturbation Hamiltonian ( ) corresponds to the non-diagonal parts of the matrix; it comes from the coupling of the spin with a radiofrequency field rotating around the axis at the angular frequency . We showed in Complement FIV that the quantum evolution of a spin 1 2 was identical to the classical evolution of a magnetic dipole with a proportional angular momentum. This led to a useful image for the evolution of a spin in a magnetic field, composed of a constant and a rotating field. Now we saw in Complement CIV that any two-level system is perfectly isomorphic to a spin 1 2. The states and are associated with the spin states + and , and the Hamiltonian 0 leads to two non-perturbed energies = } 0 2 and = } 0 2; this means that 0 = . We assume that the perturbation that couples the two states and is the analog of the action of a magnetic field B1 rotating in the plane at the frequency ; it is thus responsible for the non-diagonal matrix elements of (E-1), with: ()=

}

()=

}

1

e

1

e

2 2

(E-2)

(the number 1 is supposed to be real; if this is not the case, a change of the relative phase of and allows this condition to be fulfilled). We can then directly transpose the results of Complement FIV , with no additional computations. Relation (27) of Complement FIV shows that the transition probability is given by “Rabi’s formula”: P ()= 1324

( (

2

2 1)

1) + (

)

2

sin2

(

1)

2

+(

2

)

2

(E-3)

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

Since the quantum and classical evolution coincide in the present situation, they can be simply interpreted in terms of the classical precession of a magnetic moment around an “effective field”. At resonance, the effective field is located in the plane, for instance along the axis. The spin, initially parallel to , precesses around , hence tracing large circles in the plane . Relation (E-3) shows that the probability for the spin to reverse its initial orientation is written: P ( ) = sin2

1

(E-4)

2

This probability oscillates between 0 and 1 with a precession angular frequency 1 = 2 }, called “Rabi’s frequency”. This type of long lasting oscillations could not have been obtained7 by a perturbation treatment. For a non-resonant perturbation, the effective field has a component along the axis. As it precesses, the magnetic moment now follows a cone; the larger the discrepancy between and the resonant frequency, the smaller the cone’s aperture becomes (Figure 2 of Complement FIV ). We must now use the complete relation (E-3) which, also, predicts a sinusoidal oscillation. It should be noted that, if 2}, we find again the result of equations (C-11) and (C-12) of Chapter XIII, which thus provide a good approximation in this case.

Comment:

The previous results assume that the perturbation can be reduced to a single rotating field. Replacing in (E-1) the exponentials by the sinusoidal functions sin or cos , it introduces two rotating fields with opposite frequencies ; they both act simultaneously on the system, leading to a more complex situation. The results remain, however, valid as long as the perturbation is weak enough (meaning not too far from one of the two resonances ( 1 0 ) and 0 or 0 ). E-2.

Random perturbation

As for the spin 1 2 case considered above, we assume here that the perturbation does not have diagonal elements: =

=0

(E-5)

(Complement EXIII presents a more general calculation, where this hypothesis is no longer necessary). In § E-1, we assumed that the system was initially in the state , hence only ( = 0) was different from zero. This rules out the possibility of any superposition of states at the initial moment. To remove this restriction, we now assume that the system is, at instant , in any superposition of the states and : Ψ( ) =

( )e

}

+

( )e

}

(E-6)

We then consider a later instant, +∆ , and compute the evolution of the system between the times and + ∆ , to second order in . 7 One

must sum an infinite number of perturbative terms to reconstruct a sine squared.

1325

CHAPTER XIII

E-2-a.

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

State vector of the system at time + ∆ to second order in

To zeroth-order of the perturbation, relation (B-13) shows that neither ( + ∆ ) depend on ∆ :

nor (0)

( +∆ )=

()

(0)

;

( +∆ )=

()

(E-7)

To first order, we use relation (B-14) with = 1; hypothesis (E-5) means that coupled to , and vice versa. Integrating over time, we get: (1)

(1)

1 ~

()

=

1 ~

()

( +∆ )=

is only

+∆

( +∆ )=

where we have set interchanged:

( +∆ )

e

( )d

∆ ( +

e

)

( +

)d

(E-8)

0

= 1 ~

; we also get a similar relation where the indices and

are



()

( +

e

)

( +

)d

(E-9)

0

The term (E-8) describes an atom that was at time in the state and is found at time + in the state ; the term (E-9) describes the inverse process. To second order, we again use relation (B-14), this time with = 2; after integration, it leads to: (2)

( +∆ )=

1 ~

=

1 ~

+∆

e

( )

( +∆ )= =

( )d

∆ ( +

e

)

( +

)

(1)

( +

)d

(E-10)

0

We now change the integration variable ∆ by ; this leads to: (2)

(1)

to , and insert relation (E-9) after replacing



1 ~2

()

1 ~2

()

e

( + )

( + )d

0

e

( +

)

( +

)d

0 ∆

e

( + )d

0

e

( +

)d

(E-11)

0

This perturbative term describes an atom that was at time in the state , then made a transition to the state at time + (included between and + ), then came back to the state at time + (included between and + ∆ ). Here also we can interchange the indices and to get the probability amplitude of the inverse process. E-2-b.

Average occupation probabilities to second order

For given values of the random variables and , the probability of finding 2 the system in the state is ( + ∆ ) . To second order in , this squared modulus contains the following terms: 1326

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

(0)

– the squared modulus of ( + ∆ ), which is of zeroth order. (0) – a first-order term containing the product of ( +∆ ) and the complex conjugate (1) of ( + ∆ ), or the opposite. Due to condition (E-5), this term, linear in , averages out to zero over all the possible values of and . It will not be taken into account. (1)

– the squared modulus of ( + ∆ ), which is of order 2. (0) – finally, twice the real part of the product of ( +∆ ) and the complex conjugate (2) of ( + ∆ ), which is also of order 2. We thus get: ( +∆ ) (0)

2

( +∆ )

2

(1)

+

( +∆ )

2

+2Re [

(0)

( + ∆ )]

[

(2)

( + ∆ )]

(E-12)

This expression is rewritten below in a slightly different form. The first and third term are regrouped in a first line, while the second term is rewritten in the second line. Note that the first term is rewritten using (E-7), while for the third term we use the complex conjugate of the second line of (E-11), obtained by replacing by (and vice versa) as well as by (and vice versa). This leads to: ( +∆ ) () +

2

2

()

1 2

1 ~2

2 Re ~2



e

( + )d

e

0

( +

)d

0





e

( + )d

0

e

( +

)d

(E-13)

0

We now average this probability over the various values of the random variables and . We get in (E-13) the product of two matrix elements of and of the amplitude squared of ( ) and ( ). Rigorously speaking, these quantities are not mutually independent, since the system’s state at time is determined by the values of the perturbation at an earlier time. This correlation actually lasts over a time much shorter than ∆ , of the order of the correlation time of the functions and – cf. relation (D-4); therefore, a very short time after the instant , and are no 2 longer correlated with the values of ( ) . It is then justified to compute separately the two averages: ()

2

Double integral

()

2

Double integral

(E-14)

( ) 2 and ( ) 2 of the populations of the states and The averages are simply the diagonal elements of the density operator (Complement EXIII ). Noting ˜( ) this operator, we get: ˜ ()=

()

2

˜ ()=

()

2

(E-15) 1327

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

The variation rate of ˜ ( ) between the times 2

∆˜ ( ) = ∆

( +∆ ) ∆

()

and + ∆ is:

2

(E-16)

Relation (E-13) then yields: ∆˜ ( ) = ∆ where 1

2

and

1

= =

˜ ( )[ 2

1

+

1]

+˜ ()

(E-17)

2

are the averages of double integrals: ∆

1 1 ~2 ∆

d

d

0 ∆

1 1 ~2 ∆

(

e

)

( + )

( +

)

0 ∆

d

d

0

(

e

)

( + )

( +

)

(E-18)

0

The computation of the average values of these two double integrals is similar to the one performed in § D-2. It is carried out in detail below and leads to: 2



1

=

(E-19)

Γ + 2

(E-20)

In these relations, Γ and are expressed in terms of the Fourier transform ˜ ( ) of ( ), which was introduced in (D-8). The constant Γ was given in (D-19): 2 ˜ ( ~2

Γ= and

1 ~2

+

( )e

d

(E-21)

is defined by: =

where

)=

+

1 2 ~2

1

d

˜ ( )

(E-22)

means the principal part (Appendix II, §1-b). Computation of

1

and

2:

The double integral appearing in 2 has already been encountered in (D-12), while computing the value of P ( ); we must simply replace by ∆ . Its value is given in (D-18), which becomes here Γ∆ . According to the definition (E-18) of 2 , we must then divide this result by ∆ , which leads to relation (E-19). The computation of 1 is similar to that of 2 , except that the upper limit of the integral over d is ∆ instead of . In the first line of (E-18), we can make the change of variables = to transform the integrations according to: ∆



d 0

1328

d 0

0

d 0



d =

d 0

d 0

(E-23)

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

As we did in (D-16), when ∆ we can replace the lower limit of the integral over d by a few , without any significant change of the result. The upper limit of the integral over d is then longer than a few and can be extended to + . The integral over d then reduces to a simple factor ∆ , which cancels that same factor in the denominator of (E-18). This leads to: 1 ~2

1

+

d e

( )

(

)

(E-24)

0

or else, using relation (D-9) to introduce the Fourier transform of +

1 2 ~2

1

( ):

+

d

˜ ( )

d e(

)

(E-25)

0

The integral over

leads to:

+

d e(

)

(E-26)

0

To make this integral convergent, we introduce an infinitesimal (positive) factor write: +

d e(

+

)

=

0

In the limit

1 (

+

)

=

(E-27)

+

0 we get, taking into account relation (12) of Appendix II:

+

d e(

and

+

)

=

(

1

)+

(E-28)

0

Inserting this result in (E-25) we find (E-20), where Γ and (E-22). E-2-c.

are given by (E-21) and

Time evolution of the populations

According to (E-20), we can write and using (E-19), we get: ∆˜ ( ) = Γ˜ ( ) + Γ˜ ( ) ∆ ∆˜ ( ) = +Γ˜ ( ) Γ˜ ( ) ∆

1

+

1

= Γ. Inserting this result into (E-17)

(E-29)

The interpretation of these two equations is straightforward: at any time the system goes from to , and from to , with a probability per unit time that is constant and equal to Γ. If, at time , the two populations of and are different, they will both tend exponentially towards the same value 1 2, without ever coming back to their initial values. This long-time irreversibility is clearly very different from what we obtained for a two-level system subjected to a sinusoidal perturbation. We no longer observe the oscillating and reversible behavior, similar to the Rabi precession of the spin 1 2 associated with the two-level system (§ E-1). One may wonder how a prediction valid for long times can be obtained while using perturbation calculations limited to second order in : expressions such as (E-11) and 1329

CHAPTER XIII

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

(E-13) are certainly no longer valid for large values of ∆ . This is due to the random character of the perturbation, which has a correlation time much shorter than the evolution time Γ 1 . At any time (even very distant from the initial time = 0) the system has little memory of its past evolution. Between the instant and + ∆ , its evolution only depends on what occurred before during the time interval [ ]. When is very short compared to ∆ , the system barely evolves during the time to + ∆ , and a perturbation treatment can be applied. This amounts to dividing the time into time intervals of width ∆ , very long compared to , but nevertheless very short compared to the characteristic evolution time Γ 1 . E-2-d.

Time evolution of the coherences

In addition to the populations (E-15) of the two levels, we must also consider the “coherences” existing between them. It is the non-diagonal matrix elements of the density operator: ˜ ( + ∆ ) = [ ( + ∆ )] [ ( + ∆ )]

(E-30)

that characterize the existence of coherent linear superpositions of the two levels. Up to second order in , such a non-diagonal element includes zeroth order, first order and second order terms. The zeroth order term is a constant, since we defined in (E-6) the coefficients [ ( )] and [ ( )] in the interaction representation (in the usual representation, this term would correspond to the free evolution of the coherence at the Bohr frequency). The first order terms cancel out since we assumed that the average values of the perturbation matrix elements are zero. To second order in , these coherences are obtained by first replacing, in (E-30), [ ( + ∆ )] and [ ( + ∆ )] by their series expansion in power of . One must then take the average over the various values of the random perturbation. This leads to: ˜ ( +∆ )

˜ ( )=[

(0)

( + ∆ )] [

(2)

+[

(2)

( + ∆ )] [

(0)

( + ∆ )]

+[

(1)

( + ∆ )] [

(1)

( + ∆ )] +

( + ∆ )]

(E-31)

(i) Let us consider the first two lines of this relation; we will show in (ii) below (0) why the third line can be left out. The zeroth order coefficients, [ ( + ∆ )] and (0) [ ( + ∆ )] , remain equal to their initial values, written [ ( )] and [ ( )] . Using the second line of (E-11), we can write: [

(2)

=

( + ∆ )] [ 1 () ~2

(0)

( + ∆ )] ∆

()

e 0

( + )d

e

whose average value yields the second line of (E-31). The first line of (E-31) is obtained by interchanging 1330

( +

)d

(E-32)

and

in the second line of

0

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

(E-11), taking its complex conjugate, and multiplying the result by [

(0)

( + ∆ )] [ 1 () ~2

=

(2)

( ). This leads to:

( + ∆ )] ∆

()

e

( + )d

0

e

( +

)d

(E-33)

0

This expression is identical to (E-32) since = and = . We now average over the various values of and . As we did before, we can average independently ( ) ( ) and the double integral of (E-32). The computation of the average value of this double integral has already been performed in § E-2-b and yields ∆ 1 , where 1 is given in (E-20). This result must be doubled since the two terms (E-32) and (E-33) are equal and add up. This finally leads to: ˜ ( + ∆ ) = ˜ ( )[1

2

1

∆ ] = ˜ ( )[1

∆ (Γ

2

)]

(E-34)

or else: ∆˜ ( ) ˜ ( +∆ ) = ∆ ∆

˜ ()

=



2

) ˜ ()

(E-35)

Let us go back to the initial components ( ) and ( ) of the state vector. Relation (B-8) shows that the elements of the density matrix constructed with these components are: (

( ) = [ ( )] [ ( )] = e

)

}

[ ( )] [ ( )] = e

˜ ()

(E-36)

which leads to: d d

()=

d ˜ () d

( ) +e

(E-37)

Now for short enough values of ∆ , the derivative of ˜ ( ) is given by (E-35). Using this relation in (E-37), we get: d d

()=

[Γ + (

2

)]

()

(E-38)

This means that the coherence between and is damped at a rate Γ, and that its evolution frequency is shifted by 2 . (ii) The third line of (E-31) is proportional to ( + ∆ ) ( + ∆ ), and is therefore responsible for a coupling between ˜ ( + ∆ ) and ˜ ( + ∆ ): the rate of variation of ˜ ( + ∆ ) is a priori dependent on ˜ ( + ∆ ). However, if the energy difference ~ is sufficiently large, the unperturbed evolution frequencies of these nondiagonal elements are very different, so that the effect of this coupling by the perturbation remains negligible (it actually disappears within the secular approximation). Moreover, if the statistical distribution of the random perturbation has a rotational symmetry around the axis8 , this third line is equal to zero; this is demonstrated in a more general case9 in Complement EXIII . 8 The

axis is defined for the spin 1 2 associated with the two-level system. diagonal elements of are generally not equal to zero. This means that the coherences can also be coupled to the populations. In Complement EXIII , is supposed to be invariant with respect to a rotation around any axis. 9 The

1331

CHAPTER XIII

E-2-e.

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

Energy shifts

The previous computation shows that the two states are shifted by the perturbation, but it does not give the value of the shift of each state; relation E-38 only simply predicts that the difference of the shifts ~( ) of the two states must be equal to 2~ . We will now prove that the shifts are opposite, = and = + . The most convenient demonstration uses the theory of the “dressed atom”, which will be presented in Complement CXX . A more elementary demonstration is given below. Imagine that there exists a third, so called “spectator”, state , which is not coupled to the perturbation, so that its energy is not shifted by the perturbation. We assume that there is a coherence ˜ ( ) between and , and study how it is modified by the perturbation acting on . The computation of ˜ ( + ∆ ) is quite similar to that of ˜ ( + ∆ ), except that ( + ∆ ) remains equal to ( ), up to any order in since the state is not coupled to the perturbation . Replacing by in (E-31), the only non-zero term is on the second line, equal to

(2)

( + ∆ ) ( ). The computation proceeds

(2)

as for ( + ∆ ) ( ) and leads to the same result as (E-32) where is replaced by . Averaging over yields the same result as (E-34) where is again replaced by ; the factor 2 in front of 1 is no longer there, since the term of the first line of (E-31) no longer comes into play to double the value of the second line. We finally get: ∆˜ ( ) ˜ ( +∆ ) ˜ ( ) Γ = = ( )˜ () (E-39) ∆ ∆ 2 The coherence between and is damped at a rate Γ 2 and its evolution frequency, equal to = in the absence of the perturbation, is changed by . Since the state is not coupled to the perturbation, the state must be shifted by = ~ . As for the state , it is shifted by = +~ , since relation (E-38) indicates that the difference between the two shifts of and must be equal to 2~ . Let us focus on the sign of , given by relation (E-22). We saw in § D-1 that ˜ ( ) is real; we assume this function to be positive in an angular frequency domain centered around = , and zero everywhere else. Relation (E-22) shows that: 0

;

0

(E-40)

For , we have 0; when , relation (E-38) shows that the shift decreases the energy difference between the two states and . For , we have 0; it is now when 0 that the shift decreases the energy difference. In both cases, and if has the same sign as , the energy difference is decreased when ; in the opposite case, the energy levels get further from each other. E-3.

Broadband optical excitation of an atom

We now apply the previous results to the excitation of a two-level atom by broadband radiation. The radiation is described by an incoherent superposition of monochromatic fields with frequencies spreading over an interval of width ∆ , and with random phases. Consequently, the coupling between the atom and the radiation is a random perturbation. The larger ∆ , the shorter the perturbation correlation time, as we shall see below. We can directly use the results of §§ D and E, to obtain the absorption rate of the radiation by the atom, as well as the energy shifts due to the coupling between the atom and the radiation10 . 10 This problem is also studied in § 3-b of Complement A XIII using a different method. In that complement, we shall sum the transition probabilities associated with each of the monochromatic waves

1332

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

E-3-a.

Correlation functions of the interaction Hamiltonian

The matrix element ( ) of the perturbation radiation coupling can be written as: ()=

()

=

associated with the atom-

()

(E-41)

where is the electric dipole moment of the atom and incident radiation11 ; we have set:

( ) the electric field of the

=

(E-42)

and will assume, for the sake of simplicity, that is real (this can be obtained by a change of the relative phase of and ). The correlation function of the perturbation is then proportional to that of the electric field: ()

(

2

)=

()

(

)

(E-43)

This field can be expanded on its Fourier components: ()=

+

1 2

˜( )

d e

(E-44)

with, since the field is real: ˜(

)= ˜ ( )

(E-45)

We assume that the phases of the completely random. This leads to: ˜( ) ˜ ( ) = ( )

(

components are independent of each other, and

)

(E-46)

where ( )describes the spectral distribution of the incident radiation; the function ( ) is supposed to have a width ∆ . Taking (E-45) and (E-46) into account, we can now write the correlation function (E-43) as: 2

()

(

)=

d

2

d

e

e

(

)

˜( ) ˜ ( )

2

=

2

d

( ) e

(E-47)

We first note that this function only depends on the difference of times: the perturbation is a stationary random function. Secondly, if the spectral distribution ( ) of the incident radiation is a bell shaped curve of width ∆ , the time correlation function of the perturbation decreases with a characteristic time inversely proportional to this width: 1 (E-48) ∆ This means that, if we assume that the atom is excited by a radiation with a broad enough spectrum, the correlation time will be short enough to fulfill the conditions necessary for applying the results of §§ D and E. present in the incident radiation. 11 To simplify the equations, we shall ignore the vectorial nature of

( ) et

.

1333

CHAPTER XIII

E-3-b.

APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS

Absorption rates and light shifts

Relation (E-47) shows that: 2

( )=

( )

2

(E-49)

Using (D-19) for the value of Γ, we get: 2

Γ=

}2

(

)

(E-50)

As we saw above, Γ yields the transition rate per unit time between the states and . This rate is proportional to ( ), i.e. to the Fourier transform of the time correlation function of the perturbation, calculated at the frequency of the transition. This result is entirely different from the result obtained with a monochromatic radiation. In this latter case, and at resonance, one expects a Rabi oscillation between the two states and . When considering the excitation probability of an atom in its ground state, the states and are, respectively, the lower and upper states of the transition. This means that: =~

0

(E-51)

The angular frequency appearing in (E-50) is then negative: in an absorption process, it is the Fourier components with negative frequencies that come into play (note however that for an electric field in cos or sin , the positive and negative frequency components have the same intensity, and the distinction is no longer essential). Furthermore, the previous calculations are still valid when 0, a case that corresponds to “stimulated emission”, or induced emission (see § C-2 of Chapter XX). During this process, the radiation stimulates the transition of the atom from an excited state to its ground state. This treatment justifies the introduction by Einstein (§ C-4 of Chapter XX) of the coefficients and describing the absorption and stimulated emission in the presence of black body radiation (which is broadband). We can also evaluate the atomic energy shifts due to the presence of the radiation. The results of § E-2-c show that the excitation of the atom by broadband radiation shifts the states and by the respective values ~ and +~ . The “light shift” is proportional (with a positive proportionality constant) to the following integral over : 1 d ( ) (E-52) These light shifts are proportional to the light intensity since they depend linearly on ( ). Their sign depends on the detuning between the central frequency of the incident radiation and the frequency of the atomic transition. As we have seen in § E-2-e, if is larger that , meaning that the incident radiation is detuned towards the blue, the energies of the two levels get closer under the effect of the radiation. We get the opposite conclusion if the incident radiation is detuned towards the red. These light shifts will be further discussed using the “dressed atom approach” in Complement CXX . We will show that they also exist for an atom excited by monochromatic radiation, and that they are useful tools for manipulating atoms and their motion. 1334

E. LONG-TIME BEHAVIOR FOR A TWO-LEVEL ATOM

References and suggestions for further reading:

Perturbation series expansion of the evolution operator: Messiah (1.17), Chap. XVII, §§ 1 and 2. Sudden or adiabatic modification of the Hamiltonian: Messiah (1.17), Chap. XVII, § II ; Schiff (1.18), Chap. 8, § 35. Diagramatic representation of a perturbation series (Feynman diagrams) : Ziman (2.26), Chap. 3 ; Mandl (2.9), Chaps. 12 to 14 ; Bjorkén and Drell (2.10), Chaps. 16 and 17.

1335

COMPLEMENTS OF CHAPTER XIII, READER’S GUIDE

AXIII : INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

Illustration of the general considerations of § C-2 of Chapter XIII, using the important example of an atom interacting with a sinusoidal electromagnetic wave. Introduces fundamental concepts such as: spectral line selection rules, absorption and induced emission of radiation, oscillator strength... Although moderately difficult, can be recommended for a first reading, because of the importance of the concepts introduced.

BXIII : LINEAR AND NON-LINEAR RESPONSE OF A TWO-LEVEL SYSTEM SUBJECTED TO A SINUSOIDAL PERTURBATION

A simple model for the study of some non-linear effects that appear in the interaction of an electromagnetic wave with an atomic system (saturation effects, multiple-quanta transitions, etc.). More difficult than AXIII (graduate level); should therefore be reserved for a subsequent study.

CXIII : OSCILLATIONS OF A SYSTEM BETWEEN TWO DISCRETE STATES UNDER THE EFFECT OF A RESONANT PERTURBATION

Study of the behavior, over a long time interval, of a system that has discrete energy levels, subjected to a resonant perturbation. Completes, in greater detail, the results of § C-2 of Chapter XIII, which are valid only for short times. Relatively simple.

DXIII : DECAY OF A DISCRETE STATE RESONANTLY COUPLED TO A CONTINUUM OF FINAL STATES

Study of the behavior, over a long time interval, of a discrete state resonantly coupled to a continuum of final states. Completes the results obtained in § C-3 of Chapter XIII (Fermi golden rule), which were established only for short time intervals. Proves that the probability of finding the particle in the discrete level decreases exponentially, and justifies the concept of lifetimes introduced phenomenologically in Complement KIII . Important for its numerous physical applications; graduate level.

1337

EXIII : TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

This complement provides a more detailed and precise view of the study of §§ D and E-2 on the effects of a random perturbation. The “motional narrowing condition” is assumed to be valid, which means that the memory time of the perturbation is much shorter than the time it takes for the perturbation to have a significant effect. This complement first part, this complement provides de general equations of evolution of the density matrix. In a second part, the theory is applied to an ensemble of spins 1 2 coupled to a random isotropic perturbation. This complement is important because of its numerous applications: magnetic resonance, optics, etc.

FXIII : EXERCISES

Exercise 10 can be done at the end of Complement AXIII ; it is a step by step study of the effects of the external degrees of freedom of a quantum mechanical system on the frequencies of the electromagnetic radiation it can absorb (Doppler effect, recoil energy, Mössbauer effect). Some exercises (especially 8 and 9) are more difficult than others, but treat important phenomena.

1338



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

Complement AXIII Interaction of an atom with an electromagnetic wave

1

2

3

The interaction Hamiltonian. Selection rules . . . . . . . . . 1340 1-a Fields and potentials associated with a plane electromagnetic wave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1340 1-b The interaction Hamiltonian at the low-intensity limit . . . . 1341 1-c The electric dipole Hamiltonian . . . . . . . . . . . . . . . . . 1342 1-d The magnetic dipole and electric quadrupole Hamiltonians . 1347 Non-resonant excitation. Comparison with the elastically bound electron model . . . . . . . . . . . . . . . . . . . . . . . 1350 2-a Classical model of the elastically bound electron . . . . . . . 1350 2-b Quantum mechanical calculation of the induced dipole moment1351 2-c Discussion. Oscillator strength . . . . . . . . . . . . . . . . . 1352 Resonant excitation. Absorption and induced emission . . 1353 3-a Transition probability associated with a monochromatic wave 1353 3-b Broad-line excitation. Transition probability per unit time . . 1354

In § C of Chapter XIII, we studied the special case of a sinusoidally time-dependent perturbation: ()= sin . We encountered the resonance phenomenon which occurs when is close to one of the Bohr angular frequencies =( ) ~ of the physical system under consideration. A particularly important application of this theory is the treatment of an atom interacting with a monochromatic wave. In this complement, we will use this example to illustrate the general considerations of Chapter XIII and to introduce certain fundamental concepts of atomic physics such as spectral line selection rules, absorption and induced emission of radiation, oscillator strength, etc... As in Chapter XIII, we shall confine ourselves to first-order perturbation calculations. Some higher-order effects in the interaction of an atom with an electromagnetic wave (“non-linear” effects) will be taken up in Complement BXIII . We shall begin (§ 1) by analyzing the structure of the interaction Hamiltonian between an atom and the electromagnetic field. This will permit us to isolate the electric dipole, magnetic dipole and electric quadrupole terms, and to study the corresponding selection rules. Then we shall calculate the electric dipole moment induced by a nonresonant incident wave (§ 2) and compare the results obtained with those of the model of the elastically bound electron. Finally, we shall study (§ 3) the processes of absorption and induced emission of radiation which appear in the resonant excitation of an atom.

Comment: In all complements of Chapter XIII the atom is treated quantum mechanically, but the electromagnetic field is treated classically, as a time-dependent perturbation acting on the atom. In Chapter XX and its complements, a more elaborate study will be given with a full quantum treatment of both the electromagnetic field and the atom; the interaction hamiltonian is

1339

COMPLEMENT AXIII



then time-independent. This permits the description of physical effects such as the spontaneous emission of photons by atoms in excited states, which does not appear when the field is treated classically.

1.

The interaction Hamiltonian. Selection rules

1-a.

Fields and potentials associated with a plane electromagnetic wave

Consider a plane electromagnetic wave1 , of wave vector k (parallel to ) and angular frequency = . The electric field of the wave is parallel to and the magnetic field, to (Fig. 1).

z

Figure 1: The electric field E and magnetic field B of a plane wave of wave vector k.

E

y

k

O

B

x

For such a wave, it is always possible, with a suitable choice of gauge (cf. Appendix III, § 4-b- ), to make the scalar potential (r ) zero. The vector potential A(r ) is then given by the real expression: A(r ) =

0e

e(

)

+

0e

e

(

)

(1)

where 0 is a complex constant whose argument depends on the choice of the time origin. We then have: E(r ) =

A(r ) =

0e

e(

)

0e

e

(

)

(2)

B(r ) = ∇

A(r ) =

0e

e(

)

0e

e

(

)

(3)

We shall choose the time origin such that the constant

0

=

0

=

2 2

0

is pure imaginary, and we set:

(4a) (4b)

1 For the sake of simplicity, we shall confine ourselves here to the case of a plane wave. The results obtained in this complement, however, can be generalized to an any electromagnetic field.

1340

• where

and

=

INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

are two real quantities such that:

=

(5)

We then obtain: E(r ) = e cos(

)

(6)

B(r ) = e cos(

)

(7)

and are therefore the amplitudes of the electric and magnetic fields of the plane wave considered. Finally, we shall calculate the Poynting vector2 G associated with this plane wave: G=

2

0

E

(8)

B

Replacing E and B in (8) by their expressions (6) and (7), and taking the time-average value over a large number of periods, we obtain, using (5): 2

G= 1-b.

0

2

(9)

e

The interaction Hamiltonian at the low-intensity limit

The preceding wave interacts with an atomic electron (of mass and charge ) situated at a distance from and bound to this point by a central potential ( ) (created by a nucleus assumed to be motionless at ). The quantum mechanical Hamiltonian of this electron can be written: =

1 [P 2

A(R )]2 +

( )

S B(R )

(10)

The last term of (10) represents the interaction of the spin magnetic moment of the electron with the oscillating magnetic field of the plane wave. A(R ) and B(R ) are the operators obtained by replacing, in the classical expressions (1) and (3), , , by the observables , , . In expanding the square that appears on the right-hand side of (10), we should, in theory, remember that P does not generally commute with a function of R. Such a precaution is, however, unnecessary in the present case, since, as A is parallel to [formula (1)], only the component enters into the double product; now commutes with the component of R, which is the only one to appear in expression (1) for A(R ). We can then take: =

0

+

()

(11)

where: 0

=

P2 + 2

( )

2 Recall that the energy flux across a surface element d to G n d .

(12) perpendicular to the unit vector n is equal

1341



COMPLEMENT AXIII

is the atomic Hamiltonian, and: 2

()=

P A(R )

S B(R ) +

2

[A(R )]2

(13)

is the interaction Hamiltonian with the incident plane wave [the matrix elements of ( ) approach zero when 0 approaches zero]. The first two terms on the right-hand side of (13) depend linearly on 0 , and the third one depends on it quadratically. With ordinary light sources, the intensity is sufficiently low that the effect of the 20 term can be neglected compared to that of the 0 term. We shall therefore write: ()

( )+

()

(14)

with: ()=

P A(R )

(15)

()=

S B(R )

(16)

We shall evaluate the relative orders of magnitude of the matrix elements of () and ( ) between two bound states of the electron. Those of S are of the order of ~, and B is of the order of 0 [cf. formula (3)]. Thus: ~

() ()

0

=

~

(17)

0

According to the uncertainty relations, ~ is, at most, of the order of atomic dimensions (characterized by the Bohr radius, 0 05 ˚ A). is equal to 2 , where is the wavelength associated with the incident wave. In the spectral domains used in atomic physics (the optical or Hertzian domains), is much greater than 0 , so that: () () 1-c.

.

0

1

(18)

The electric dipole Hamiltonian

The electric dipole approximation. Interpretation Using expression (1) for A(R ), we can put ()=

[

0e

e

+

We now expand the exponential e e Since

2

2

e

]

in powers of

(19) :

+

(20)

is of the order of atomic dimensions, we have, as above: 0

1342

1 2

=1

0e

( ) in the form:

1

(21)



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

We therefore obtain a good approximation for by retaining only the first term of expansion (20). Let be the operator obtained by replacing e by 1 on the right-hand side of (19). Using (4-a), we get: ()=

sin

(22)

( ) is called the “electric dipole Hamiltonian”. The electric dipole approximation, which is based on conditions (18) and (21), therefore consists of neglecting ( ) relative to ( ) and identifying ( ) with ( ): ()

()

(23)

Let us show that, if we replace ( ) by ( ), the electron oscillates as if it were subjected to a uniform sinusoidal electric field e cos , whose amplitude is that of the electric field of the incident plane wave evaluated at the point . Physically, this means that the wave function of the bound electron is too localized about for the electron to “feel” the spatial variation of the electric field of the incident plane wave. We shall therefore calculate the evolution of R ( ). Ehrenfest’s theorem (cf. Chap. III, § D-1-d) leads to: d 1 R = [R d ~ d 1 P = [P d ~

0

+

] =

0

+

] =

P

+

e sin (24)

∇ ( )

Eliminating P from these two equations, we obtain, after a simple calculation: d2 R = d2

∇ ( ) +

e cos

(25)

which is indeed the predicted result: the center of the wave packet associated with the electron moves like a particle of mass and charge , subjected to both the central force of the atomic bond [the first term on the right-hand side of (25)] and the influence of a uniform electric field [the second term of (25)].

Comment: Expression (22) for the electric dipole interaction Hamiltonian seems rather unusual for a particle of charge interacting with a uniform electric field E = e cos . We would tend to write the interaction Hamiltonian in the form: ( )=

D E=

cos

(26)

where D = R is the electric dipole moment associated with the electron. Actually, expressions (22) and (26) are equivalent. We shall show that we can go from one to the other by a gauge transformation (which does not modify the physical content of quantum mechanics; cf. Complement HIII ). The gauge used to obtain (22) is:

A(r ) = (r ) = 0

e sin(

)

(27a) (27b)

1343



COMPLEMENT AXIII

[to write (27a), we have replaced 0 by 2 in (1); cf. formula (4a)]. Now consider the gauge transformation associated with the function: (r ) =

sin

(28)

It leads to a new gauge A

characterized by:

A =A+∇ =e =

=

[sin(

) + sin

cos

The electric dipole approximation amounts to replacing see that in this approximation: A

[sin(

e

) + sin

]

(29a) (29b)

by 0 everywhere. We then

]=0

(30)

If, in addition, we neglect, as we did above, the magnetic interaction terms related to the spin, we obtain, for the system’s Hamiltonian: 1 (P A )2 + ( ) + 2 P2 = + ( )+ (R ) 2 = 0+ () =

where

0

(R )

(31)

is the atomic Hamiltonian given by (12), and:

( )=

(R ) =

cos

=

()

(32)

is the usual form (26) of the electric dipole interaction Hamiltonian. Recall that the state of the system is no longer described by the same ket when we go from gauge (27) to gauge (29) (cf. Complement HIII ). The replacement of ( ) by ( ) must therefore be accompanied by a change of state vector, the physical content, of course, remaining the same. In the rest of this complement, we shall continue to use gauge (27).

.

The matrix elements of the electric dipole Hamiltonian

Later, we shall need the expressions for the matrix elements of between and , eigenstates of 0 of eigenvalues and . According to (22), these matrix elements can be written: ()

=

sin

(33)

It is simple to replace the matrix elements of by those of on the right-hand side of (33). Insofar as we are neglecting all magnetic effects in the atomic Hamiltonian [cf. expression (12) for 0 ], we can write: [ 1344

0]

= ~

0

= ~

(34)



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

0

0

which yields: [

0]

= =

(

)

~

=

Introducing the Bohr angular frequency

=(

(35) ) ~, we then get:

=

(36)

and, consequently: ()

=

sin

The matrix elements of

(37)

( ) are therefore proportional to those of

.

Comment:

It is the matrix element of which appears in (37) because we have chosen the electric field E(r ) parallel to . In practice, we may be led to choose a frame related, not to the light polarization, but to the symmetry of the states and . For example, if the atoms are placed in a uniform magnetic field B0 , the most convenient quantization axis for the study of their stationary states is obviously parallel to B0 . The polarization of the electric field E(r ) can then be arbitrary relative to . In this case, we must replace the matrix element of in (37) by that of a linear combination of , and . .

Electric dipole transition selection rules

If the matrix element of between the states and is different from zero, that is, if is non-zero3 , the transition is said to be an electric dipole transition. To study the transitions induced between and by the incident wave, we can then replace ( ) by ( ). If, on the other hand, the matrix element of ( ) between and is zero, we must pursue the expansion of ( ) further, and the corresponding transition is either a magnetic dipole transition or an electric quadrupole transition, etc...4 (see following sections). Since ( ) is much larger than the subsequent terms of the power series expansion of ( ) in 0 , electric dipole transitions will be, by far, the most intense. In fact, most optical lines emitted by atoms correspond to electric dipole transitions. Let: (r) = (r) =

( ) ( )

(

) (

)

(38)

3 Actually,

it suffices for one of the three numbers , or to be different from zero (cf. comment of § above). 4 It may happen that all the terms of the expansion have zero matrix elements. The transition is then said to be forbidden to all orders (it can be shown that this is always the case if and both have zero angular momenta).

1345

COMPLEMENT AXIII



be the wave functions associated with 4 3

= cos =

0 1(

the matrix element of dΩ

(

)

. Since:

)

(39)

between 0 1(

and

)

(

and

is proportional to the angular integral:

)

(40)

According to the results of Complement CX , this integral is different from zero only if: =

1

(41)

and: =

(42)

Actually, it would suffice to choose another polarization of the electric field (for example, parallel to or ; see comment of § ) to have: =

1

(43)

From (41), (42) and (43), we obtain the electric dipole transition selection rules: ∆ = ∆

=

=

1 =

(44a) 1 0 +1

(44b)

Comments:

()

is an odd operator. It can connect only two states of different parities. Since the parities of and are those of and , ∆ = must be odd, as is compatible with (44a).

( ) If there exists a spin-orbit coupling ( )L S between L and S (cf. Chap. XII, § B1-b- ), the stationary states of the electron are labeled by the quantum numbers , , , (with J = L + S). The electric dipole transition selection rules can be obtained by looking for the non-zero matrix elements of R in the basis. By using the expansions of these basis vectors on the kets (cf. Complement AX , § 2), we find, starting with (44a) and (44b), the selection rules: ∆ =0 ∆

1

(44a)

∆ =

1

(44b)

=0

1

(44c)

Note that a ∆ = 0 transition is not forbidden [unless = = 0; cf. note on the preceding page]. This is due to the fact that is not related to the parity of the level. Finally, we point out that selection rules (44a, 44b, 44c) can be generalized to many-electron atoms.

1346

• 1-d.

INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

The magnetic dipole and electric quadrupole Hamiltonians

.

Higher-order terms in the interaction Hamiltonian The interaction Hamiltonian given by (14) can be written in the form: ()=

( )+

()=

( )+[

()

( )] +

()

Thus far, we have studied ( ). As we have seen, the ratio of ( ) to ( ) is of the order of 0 . To calculate () ( ), we simply replace e by e in (19), which yields: ()

()=

[

0e

0e

]

(45) ()

( ) and 1

+

+

(46)

or, using (4b): ()

()=

If we write =

cos

+

(47)

in the form:

1 ( 2

1 )+ ( 2

+

)=

1 2

1 + ( 2

+

)

(48)

]+

(49)

we obtain, finally: ()

()=

cos

2

2

cos

[

+

In the expression for ( ) [formulas (16) and (3)], it is entirely justified to replace e by 1. We thus obtain a term of order 0 relative to ( ), that is, of the same order of magnitude as () ( ): ()=

cos

+

(50)

Substituting (49) and (50) into (45) and grouping the terms differently, we obtain:

()=

( )+

( )+

( )+

(51)

with: = =

2

(

2 (

+2

) cos

(52)

+

) cos

(53)

[we have replaced by in (53)]. and (which are, a priori, of the same order of magnitude) are, respectively, the magnetic dipole and electric quadrupole Hamiltonians. 1347

COMPLEMENT AXIII

.



Magnetic dipole transitions

The transitions induced by are called magnetic dipole transitions. represents the interaction of the total magnetic moment of the electron with the oscillating magnetic field of the incident wave. The magnetic dipole transition selection rules can be obtained by considering the conditions which must be met by and in order for to have a non-zero matrix element between these two states. Since neither nor changes the quantum number , we must have, first of all, ∆ = 0. changes the eigenvalue of by 1, which gives ∆ = 1. changes the eigenvalues of by 1, so that ∆ = 1. Note, furthermore, that if the magnetic field of the incident wave were parallel to , we would have ∆ = 0 and ∆ = 0. Grouping these results, we obtain the magnetic dipole transition selection rules: ∆ =0 ∆ = ∆ =

1 0 1 0

(54)

Comment:

In the presence of a spin-orbit coupling, the eigenstates of 0 are labeled by the quantum numbers and . Since and do not commute with J2 , can connect states with the same but different . By using the addition formulas for an angular momentum and an angular momentum 1/2 (cf. Complement AX , § 2), it can easily be shown that selection rules (54) become: ∆ =0 ∆ = 1 0 ∆ = 1 0

(55)

Note that the hyperfine transition = 0 = 1 of the ground state of the hydrogen atom (cf. Chap. XII, § D) is a magnetic dipole transition, since the components of S have non-zero matrix elements between the = 1 states and the =0 = 0 state. .

Electric quadrupole transitions Using (34), we can write: +

=

+

=

[ ~

=

(

0

~

0

0]

+[

)

0]

(56)

from which we obtain, as in (36): ()

=

2

cos

(57)

The matrix element of ( ) is therefore proportional to that of , which is a component of the electric quadrupole moment of the atom (cf. Complement EX ). In 1348



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

addition, the following quantity appears in (57): =

=

(58)

which, according to (2), is of the order of . The operator ( ) can therefore be interpreted as the interaction of the electric quadrupole moment of the atom with the gradient5 of the electric field of the plane wave. To obtain the electric quadrupole transition selection rules, we simply note that, in the r representation, is a linear superposition of 2 21 ( ) and 2 2 1 ( ). Therefore, in the matrix element there appear angular integrals: dΩ

(

)

1 2

(

)

(

)

(59)

which, according to the results of Complement CX , are different from zero only if ∆ = 0, 2 and ∆ = 1. This last relation becomes ∆ = 2 1, 0 when we consider an arbitrary polarization of the incident wave (cf. comment of § 1-c- ), and the electric quadrupole transition selection rules can be written, finally: ∆ =0 2 ∆ =0 1

2

(60)

Comments:

()

and are even operators and can therefore connect only states of the same parity, which is compatible with (54) and (60). For a given transition, and are never in competition with . This facilitates the observation of magnetic dipole and electric quadrupole transitions. Most of the transitions that occur in the microwave or radio-frequency domain – in particular, magnetic resonance transitions (cf. Complement FIV ) – are magnetic dipole transitions. ( ) For a ∆ = 0, ∆ = 0 1 transition, the two operators and simultaneously have non-zero matrix elements. However, it is possible to find experimental conditions under which only magnetic dipole transitions are induced. All we need to do is place the atom, not in the path of a plane wave, but inside a cavity or radiofrequency loops, at a point where B is large but the gradient of E is negligible. ( ) For a ∆ = 2 transition, cannot be in competition with , and we have a pure quadrupole transition. As an example of a quadrupole transition, we can mention the green line of atomic oxygen (5577 ˚ A), which appears in the aurora borealis spectrum. ( ) If we pursued the expansion of further, we would find electric octupole and magnetic quadrupole terms, etc. In the rest of this complement, we shall confine ourselves to electric dipole transitions. In the next Complement, BXIII , on the other hand, we shall consider a magnetic dipole transition. 5 It is normal for the electric field gradient to appear, since potentials in a Taylor series in the neighborhood of

( ) was obtained by expanding the

1349

COMPLEMENT AXIII

2.



Non-resonant excitation. Comparison with the elastically bound electron model

In this section, we shall assume that the atom, initially in the ground state 0 , is excited by a non-resonant plane wave: coincides with none of the Bohr angular frequencies associated with transitions from 0 . Under the effect of this excitation, the atom acquires an electric dipole moment D ( ) which oscillates at the angular frequency (forced oscillation) and is proportional to when is small (linear response). We shall use perturbation theory to calculate this induced dipole moment, and we shall show that the results obtained are very close to those found with the classical model of the elastically bound electron. This model has played a very important role in the study of the optical properties of materials. It enables us to calculate the polarization induced by the incident wave in a material. This polarization, which depends linearly on the field , behaves like a source term in Maxwell’s equations. When we solve these equations, we find plane waves propagating in the material at a velocity different from . This allows us to calculate the refractive index of the material in terms of various characteristics of elastically bound electrons (natural frequencies, number per unit volume, etc.). Thus, we see that it is very important to compare the predictions of this model (which we shall review in § a) with those of quantum mechanics. 2-a.

Classical model of the elastically bound electron

.

Equation of motion

Consider an electron subjected to a restoring force directed towards the point and proportional to the displacement. In the classical Hamiltonian corresponding to (12), we then have: ( )=

1 2

2 2 0

(61)

where

0 is the electron’s natural angular frequency. If we make the same approximations, using the classical interaction Hamiltonian, as those which enabled us to obtain expression (22) for ( ) (the electric dipole approximation) in quantum mechanics, a calculation similar to that of § 1-c- [cf. equation (25)] yields the equation of motion:

d2 + d2

2 0

=

cos

(62)

This is the equation of a harmonic oscillator subject to a sinusoidal force. .

General solution The general solution of (62) can be written: =

cos(

0

)+

(

2 0

2)

cos

(63)

where and are real constants which depend on the initial conditions. The first term of (63), cos( 0 ), represents the general solution of the homogeneous equation (the 1350



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

electron’s free motion). The second term is a particular solution of the equation (forced motion of the electron). We have not, thus far, taken damping into account. Without going into detailed calculations, we shall cite the effects of weak damping: after a certain time , it causes the natural motion to disappear and very slightly modifies the forced motion (provided that we are sufficiently far from resonance: 1 ). We shall therefore retain 0 only the second term of (63): cos

=

(

2 0

(64)

2)

Comment: Far from resonance, the exact damping mechanism is of little importance, provided that it is weak. We shall not, therefore, take up the problem of the exact description of this damping, either in quantum or in classical mechanics. We shall merely use the fact that it exists to ignore the free motion of the electron. It would be different for a resonant excitation: the induced dipole moment would then depend critically on the exact damping mechanism (spontaneous emission, thermal relaxation, etc.). This is why we shall not try to calculate D ( ) in § 3 (the case of a resonant excitation). We shall be concerned only with calculating the transition probabilities. In Complement BXIII , we shall study a specific model of a system placed in an electromagnetic wave and at the same time subject to dissipative processes (Bloch equations of a system of spins). We shall then be able to calculate the induced dipole moment for any exciting frequency.

.

Susceptibility Let

=

be the electric dipole moment of the system. According to (64), we

have: 2

=

=

(

2 0

2)

where the “susceptibility”

cos

=

cos

(65)

is given by:

2

=

2-b.

(

2 0

2)

(66)

Quantum mechanical calculation of the induced dipole moment

We shall begin by calculating, to first order in , the state vector ( ) of the atom at time . We shall choose for the interaction Hamiltonian, the electric dipole Hamiltonian given by (22). In addition, we shall assume that: ( = 0) =

0

(67) 1351



COMPLEMENT AXIII

We apply the results of § C-1 of Chapter XIII, replacing by 0 . This leads to6 : () =e

~

0

0

(1)

+

by

and

~

()e

(68)

=0

or, using (C-4) of Chapter XIII and multiplying which has no physical importance: () =

0

e

+ =0

2

( ) by the global phase factor e

~

e

0

0 0

+

e

e

0

0

~

,

(69)

0

From this, we find ( ) and ()= () ( ) . In the calculation of this average value, we retain only the terms linear in , and we neglect all those that oscillate at angular frequencies 0 (the natural motion, which would disappear if we took weak damping into account). Finally, replacing 0 by its expression in terms of 0 [cf. equation (36)], we find: ()=

2-c.

2 2 cos ~

0

0 2

2

(70)

2

0

Discussion. Oscillator strength

.

The concept of oscillator strength We set: 0

2

=

0

0

2

(71)

~

transition and called 0 is a real dimensionless number, characteristic of the 0 the oscillator strength7 of this transition. If 0 is the ground state, 0 is positive, like 0. Oscillator strengths satisfy the following sum rule (the Thomas-Reiche-Kuhn sum rule): 0

=1

(72)

This can be shown as follows. Using (36), we can write: 0

1 ~

=

0

The summation over basis, and we get: 0

6 Since

=

1 ~

1 ~

0

0

(

0

0

(73)

can be performed by using the closure relation relative to the

)

0

=

0

0

=1

(1)

(74)

is odd, 0 ( ) 0 is zero, so 0 ( ) = 0. operator enters into (71) because the incident wave is linearly polarized along . It would, however, be possible to give a general definition of the oscillator strength, independent of the polarization of the incident wave. 7 The

1352

• .

INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

The quantum mechanical justification for the elastically bound electron model

We substitute definition (71) into (70) and multiply the expression so obtained by the number of atoms contained in a volume whose linear dimensions are much smaller than the wavelength of the radiation. The total electric dipole moment induced in this volume can then be written: 2

()=

0

(

2

0

cos

2)

(75)

Comparing (75) and (65), we see that it is like having classical oscillators [since according to (72)] whose natural angular frequencies are not all the same 0 = since they are equal to the various Bohr angular frequencies of the atom associated with the transitions from 0 . According to (75), the proportion of oscillators with the angular frequency 0 is 0 . Thus, for a non-resonant wave, we have justified the classical model of the elastically bound electron. Quantum mechanics gives the frequencies of the various oscillators and the proportion of oscillators that have a given frequency. This result shows the importance of the concept of oscillator strength and enables us to understand a posteriori why the elastically bound electron model was so useful in the study of the optical properties of materials. 3.

Resonant excitation. Absorption and induced emission

3-a.

Transition probability associated with a monochromatic wave

Consider an atom initially in the state placed in an electromagnetic wave whose angular frequency is close to a Bohr angular frequency . The results of § C-1 of Chapter XIII (sinusoidal excitation) are directly applicable to the calculation of the transition probability P ( ; ). We find, using expression (37) (thus making the electric dipole approximation): P (; )=

2

2

2

2

4~2

(

)

(76)

where: (

)=

sin[( (

) 2] ) 2

2

(77)

We have already discussed the resonant nature of P ( ; ) in Chapter XIII. At resonance, P ( ; ) is proportional to 2 , that is, to the incident flux of electromagnetic energy [cf. formula (9)]. Comments:

( ) If instead of using the gauge (27), leading to the matrix element (37), we had used the gauge (29) leading to the Hamiltonian (32), the factor ( )2 in (76) would be missing. The fact that the results are different is not at all surprising. The states and , and consequently P ( ; ), do not have the same physical meaning in the two gauges. 1353



COMPLEMENT AXIII

( ) However, as , the diffraction function ( ) tends towards ( ), and the factor ( )2 approaches unity. This leads to the same probability density P ( ; ) in the two gauges. This result can be easily understood if we consider the incident electromagnetic wave to be a quasimonochromatic wave packet of very large but finite spatial extent, rather than a plane wave extending to infinity. When the E field “seen” by the atom tends towards zero and the gauge transformation associated with the function defined in (28) tends towards unity. Consequently and each represent the same physical states in the two gauges. (

3-b.

) Obviously, it is also possible to consider the transition probability between two well-defined energy states of the atomic system for a finite time interval. In this case, the eigenstates and of the atomic Hamiltonian 0 written in (12) only represent states of well-defined atomic energy (kinetic plus potential) in the gauge (29) where A is zero [see (30)] and p2 2 represents the kinetic energy. The same physical states would be represented in gauge (27) by the states exp[ (r ) ~] and exp[ (r ) ~] respectively. For finite , calculations are therefore simpler in the gauge (29). Since in the rest of this complement we replace ( ) by ( ) [see (79)], we will be considering the limit for which the above difficulties disappear. Broad-line excitation. Transition probability per unit time

In practice, the radiation which strikes the atom is very often non-monochromatic. We shall denote by ( ) d the incident flux of electromagnetic energy per unit surface within the interval [ + d ]. The variation of ( ) with respect to is shown in Figure 2. ∆ is the excitation line width. If ∆ is infinite, we say that we are dealing with a “white spectrum”. The different monochromatic waves which constitute the incident radiation are generally incoherent: they have no well-defined phase relation. The total transition probability P can therefore be obtained by summing the transition probabilities associated with each of these monochromatic waves. We must, consequently, replace 2 by 2 ( )d 0 in (76) [formula (9)] and integrate over . This gives: P ()=

2

2

2

2

0

d

~2

( ) (

)

(78)

We can then proceed as in § C-3 of Chapter XIII to evaluate the integral that appears in (78). Compared to a function of whose width is much larger than 4 , the function ( ) (see Figure 3 of Chapter XIII) behaves like ( ). If is large enough to make 4 ∆ (∆: excitation line width) while remaining small enough for the perturbation treatment to be valid, we can, in (78), assume that: (

)

2

(

)

(79)

which yields: 2

P ()=

2 0

1354

~2

(

)

(80)



INTERACTION OF AN ATOM WITH AN ELECTROMAGNETIC WAVE

ℐ(ω)



ωfi

ω

Figure 2: The spectral distribution of the incident flux of electromagnetic energy per unit surface. ∆ is the width of this spectral distribution.

We can write (80) in the form: P ()=

(

)

(81)

where: =

2

4

2

(82)

~ and

is the fine-structure constant: 2

=

4

2 1 = ~ 0 ~

1 137

(83)

This result shows that P ( ) increases linearly with time. The transition probability per unit time is therefore equal to: =

(

)

(84)

is proportional to the value of the incident intensity for the resonance frequency , to the fine-structure constant , and to the square of the modulus of the matrix element of , which is related [by (71)] to the oscillator strength of the transition. In this complement, we have considered the case of radiation propagating along a given direction with a well-defined polarization state. By averaging the coefficients over all propagation directions and over all possible polarization states, we could introduce coefficients , analogous to the coefficients , defining the transition probabilities per unit time for an atom placed in isotropic radiation. The coefficients (and ) are none other than the coefficients introduced by Einstein to describe the absorption (and induced emission). Thus, we see how quantum mechanics enables us to calculate these coefficients. 1355

COMPLEMENT AXIII



Comment: A third coefficient, , was introduced by Einstein to describe the spontaneous emission of a photon, which occurs when the atom falls back from the upper state to the lower state . The theory presented in this complement does not explain spontaneous emission. In the absence of incident radiation, the interaction Hamiltonian is zero, and the eigenstates of 0 (which is then the total Hamiltonian) are stationary states. Actually, the preceding model is insufficient, since it treats asymmetrically the atomic system (which is quantized) and the electromagnetic field (which is considered classically). When we quantize both systems, we find, even in the absence of incident photons, that the coupling between the atom and the electromagnetic field continues to have observable effects (a simple interpretation of these effects is given in Complement KV ). The eigenstates of 0 are no longer stationary states, since 0 is no longer the Hamiltonian of the total system, and we can indeed calculate the probability per unit time of spontaneous emission of a photon (cf. Chap. XX, § C-3) . Quantum mechanics therefore also enables us to obtain the Einstein coefficient . References and suggestions for further reading:

See, for example: Schiff (1.18), Chap. 11; Bethe and Jackiw (1.21), Part II, Chaps. 10 and 11; Bohm (5.1), Chap. 18, §§ 12 to 44. For the elastically bound electron model: Berkeley 3 (7.1), supplementary topic 9; Feynman I (6.3), Chap. 31 and Feynman II (7.2), Chap. 32. For Einstein coefficients: the original article (1.31), Cagnac and Pebay-Peyroula (11.2), Chap. III and Chap. XIX, § 4. For the exact definition of oscillator strength: Sobel’man (11.12), Chap. 9, § 31. For atomic multipole radiation and its selection rules: Sobel’man (11.12), Chap. 9, § 32.

1356



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

Complement BXIII Linear and non-linear responses of a two-level system subject to a sinusoidal perturbation

1

2

3

4

Description of the model . . . . . . . . . . . . . . . . . . . . . 1-a Bloch equations for a system of spin 1/2’s interacting with a radiofrequency field . . . . . . . . . . . . . . . . . . . . . . . 1-b Some exactly and approximately soluble cases . . . . . . . . . 1-c Response of the atomic system . . . . . . . . . . . . . . . . . The approximate solution of the Bloch equations of the system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-a Perturbation equations . . . . . . . . . . . . . . . . . . . . . . 2-b The Fourier series expansion of the solution . . . . . . . . . . 2-c The general structure of the solution . . . . . . . . . . . . . . Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-a Zeroth-order solution: competition between pumping and relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-b First-order solution: the linear response . . . . . . . . . . . . 3-c Second-order solution: absorption and induced emission . . . 3-d Third-order solution: saturation effects and multiple-quanta transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Exercises: applications of this complement . . . . . . . . . .

1358 1358 1359 1359 1360 1360 1362 1363 1364 1364 1364 1366 1368 1372

In the preceding complement, we applied first-order time-dependent perturbation theory to the treatment of some effects produced by the interaction of an atomic system and an electromagnetic wave: appearance of an induced dipole moment, induced emission and absorption processes, etc. We shall now consider a simple example, in which it is possible to pursue the perturbation calculations to higher orders without too many complications. This will allow us to demonstrate some interesting “non-linear” effects: saturation effects, non-linear susceptibility, the absorption and induced emission of several photons, etc. In addition, the model we shall describe takes into account (phenomenologically) the dissipative coupling of the atomic system with its surroundings (the relaxation process). This will enable us to complete the results related to the “linear response” obtained in the preceding complement. For example, we shall calculate the atom’s induced dipole moment, not only far from resonance, but also at resonance. Some of the effects we are going to describe are objects of a great deal of research. Their study necessitates very strong electromagnetic fields, which cab be obtained only with lasers. New branches of research have thus appeared with lasers: quantum electronics, non-linear optics, etc. The calculation methods described in this complement (for a very simple model) are applicable to these problems.

1357



COMPLEMENT BXIII

Comment: The Comment at the end of the introduction of Complement AXIII applies to the present complement as well: here, we limit ourselves to a semi-classical treatment, where the atomic system is treated quantum mechanically but the electromagnetic field classically. A full quantum treatment of both systems will be given in Chapter XX; see in particular Complement CXX , which describes the “dressed atom” method and non-perturbative calculations.

1.

Description of the model

1-a.

Bloch equations for a system of spin 1/2’s interacting with a radiofrequency field

We shall return to the system described in § 4-a of Complement FIV : a system of spin 1/2’s placed in a static field B0 parallel to , interacting with an oscillating radiofrequency field and subject to “pumping” and “relaxation” processes. If ( ) is the total magnetization of the spin system contained in the cell (Fig. 6 of Complement FIV ), we showed in Complement FIV that: d d

()=

1 0

( )+

()

B( )

(1)

The first term on the right-hand side describes the preparation, or the “pumping” of the system: spins enter the cell per unit time, each one with an elementary magnetization . The second term arises from relaxation processes, characterized by the 0 parallel to average time required for a spin either to leave the cell or have its direction changed by collision with the walls. Finally, the last term of (1) corresponds to the precession of the spins about the total magnetic field: B( ) =

0e

+ B1 ( )

B( ) is the sum of a static field angular frequency .

(2) 0e

parallel to

and a radiofrequency field B1 ( ) of

Comments:

( ) The transitions which we shall study in this complement (which connect the two states + and of each spin 1/2) are magnetic dipole transitions. ( ) One could question our using expression (1) relative to average values rather than the Schrödinger equation. We do so because we are studying a statistical ensemble of spins coupled to a thermal reservoir (via collisions with the cell walls). We cannot describe this ensemble in terms of a state vector: we must use a density operator (see Complement EIII ). The equation of motion of this operator is called a “master equation” and we can show that it is exactly equivalent to (1) (see Complement FIV , § 3 and 4, and Complement EIV , where we show that the average value of the magnetization determines the density matrix of an ensemble of spin 1/2’s). It turns out that the master equation satisfied by the density operator and the Schrödinger equation studied in § C-1 of Chapter XIII have the same structure as (1): a linear differential equation, with constant or sinusoidally varying coefficients. The approximation methods we describe in this chapter are, therefore, applicable to any of these equations. 1358

• 1-b.

LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

Some exactly and approximately soluble cases

If the radiofrequency field B1 ( ) is rotating, that is, if: B1 ( ) =

1 (e

cos

+ e sin

)

(3)

equation (1) can be solved exactly [changing to the frame which is rotating with B1 transforms (1) into a time-independent linear differential system]. The exact solution of (1) corresponding to such a situation is given in § 4-b of Complement FIV . Here, we shall assume B1 to be linearly polarized along : B1 ( ) =

1e

cos

(4)

In this case, it is not possible1 to find an exact analytic solution of equation (1) (there is no transformation equivalent to changing to the rotating frame). We shall see, however, that a solution can be found in the form of a power series expansion in B1 .

Comment:

The calculations we shall present here for spin 1/2’s can also be applied to other situations in which we can confine ourselves to two levels of the system and ignore all others. We know (cf. Complement CIV ) that we can associate a fictitious spin 1/2 with any two-level system. The problem considered here is therefore that of an arbitrary two-level system subject to a sinusoidal perturbation. 1-c.

Response of the atomic system

The set of terms which, in , , , depend on 1 constitute the “response” of the atom to the electromagnetic perturbation. They represent the magnetic dipole moment induced in the spin system by the radiofrequency field. We shall see that such a dipole moment is not necessarily proportional to 1 ; the terms in 1 represent the linear response, and the others (terms in 12 , 13 , ...), the “non-linear response”. In addition, we shall see that the induced dipole moment does not oscillate only at the angular frequency , but also at its various harmonics ( = 0, 2, 3, 4, ...). It is easy to see why we should be interested in calculating the response of the atomic system. Such a calculation is useful for the theory of the propagation of an electromagnetic wave in a material, or for the theory of atomic oscillators, “masers” or “lasers”. Consider an electromagnetic field. Because of the coupling between this field and the atomic system, a polarization appears in the material, due to the atomic dipole moments (arrow directed towards the right in Figure 1). This polarization acts like a source term in Maxwell’s equations and contributes to the creation of the electromagnetic field (arrow directed towards the left in Figure 1). When we “close the loop”, that is, when we take the field so created to be equal to the one with which we started, we obtain the wave propagation equations in the material (refractive index) or the oscillator equations (in the absence of external fields, an 1 A linearly polarized field can be obtained as a superposition of a left and a right circular components. It would be possible to find an exact solution for each of these components taken separately. However, equation (1) is not linear, in the sense that a solution corresponding to (4) cannot be obtained by superposing two exact solutions, one of which corresponds to (3) and the other one to the field rotating in the opposite direction [in the term B that appears on the right-hand side of (1), depends on B1 ].

1359

COMPLEMENT BXIII



electromagnetic field may appear in the material, if there is sufficient amplification: the system becomes unstable and can oscillate spontaneously). In this complement, we shall be concerned only with the first step of the calculation (the atomic response).

Response of the atomic system

Atomic dipole moments

Electromagnetic field

Maxwell’s equations

Figure 1: Schematic representation of the calculations to be performed in studying the propagation of an electromagnetic wave in a material (or the operation of an atomic oscillator, a laser or a maser). We begin by calculating the dipole moments induced in the material by a given electromagnetic field (the response of the atomic system). The corresponding polarization acts like a source term in Maxwell’s equations and contributes to the creation of the electromagnetic field. We then take the field obtained to be equal to the one with which we started.

2.

The approximate solution of the Bloch equations of the system

2-a.

Perturbation equations

As in Complement FIV , we set: 0

=

0

(5)

1

=

1

(6)

~ 0 represents the energy difference of the spin states + and (Fig. 2). Substituting (4) into (2), and (2) into (1), we obtain, after a simple calculation: d d

=

d d

=

+

0

+

0

1

2

cos

(

1

cos

+)

(7a) (7b)

with: = 1360

(8)



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

+

ħω



Figure 2: Energy levels of a spin 1/2 in a static magnetic field B0 ; angular frequency in the field B0 .

0

is the Larmor

Note that the source term , since 0 exists only in the equation of motion of µ0 is parallel to , and the pumping is said to be longitudinal2 . We also point out that the relaxation time can be different for the longitudinal components ( ) and the transverse components ( ) of the magnetization. For the sake of simplicity, we shall choose a single relaxation time here. Equations (7a) and (7b), called the “Bloch equations”, cannot be solved exactly. We shall therefore determine their solution in the form of a power series expansion in 1 : = (0)

+

= (0)

1

+

(1)

1

2 (2) 1

+

(1)

+

2 (2) 1

+

( )

+ +

+

1

+

( )

(9a) +

1

(9b)

Substituting (9a) and (9b) into (7a) and (7b), and setting equal the coefficients of terms in 1 , we obtain the following perturbation equations: n=0 : d (0) 1 (0) = 0 (10a) d d d

(0)

=

1

(0)

0

(0)

(10b)

=0 : d d

( )

d d

( )

= =

1 1

( )

( )

+

2

cos

0

( )

[(

1)

(

cos

1)

(

1)

+]

(11a) (11b)

2 In certain experiments, the pumping is “transverse” (µ is perpendicular to B ). See exercise 1 at 0 0 the end of this complement.

1361



COMPLEMENT BXIII

2-b.

The Fourier series expansion of the solution

Since the only time-dependent terms on the right-hand side of (10) and (11) are sinusoidal, the steady-state solution of (10) and (11) is periodic, of period 2 . We can expand it in a Fourier series: + ( )

( )

=

e

(12a)

= + ( )

( )

=

e

(12b)

= ( )

( )

and represent the Fourier components of the th-order solution. By taking ( ) real and ( ) + and ( ) as complex conjugates of each other, we obtain the following reality conditions: ( )

( )

=

( )

(13a)

( )

=

(13b)

Substituting (12a) and (12b) into (10) and (11), and setting equal to zero the coefficient of each exponential e , we find: =0 : (0) 0

=

(0)

=0

if

=0

for any

(0)

0

(14)

=0

=0 : +

1

( )

=

(

1)

(

+

+1

4

1) 1

(

1)

( +

+1

1) 1

+

(15a) (

0)

+

1

( )

=

(

1) +1

2

+

(

1) 1

(15b)

These algebraic equations can be solved immediately: ( )

(

= 4

+

1) +1

1

+

(

1) 1

(

1) +1

( +

1) 1

+

(16a) ( )

(

= 2 (

1362

0)

+

1

1) +1

+

(

1) 1

(16b)



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

Thus, expressions (16) give the th-order solution explicitly in terms of the ( 1)thorder solution. Since the zeroth-order solution is known [cf. equations (14)], the problem is, in theory, entirely solved. 2-c.

The general structure of the solution

It is possible to arrange the various terms of the expansion of the solution in a double-entry table in which the perturbation order labels the columns and the degree (0) of the harmonic being considered labels the rows. To zeroth-order, only 0 is different from zero. By iteration, using (16), we can deduce the other non-zero higherorder terms (table I), thus obtaining a “tree-like structure”. The following properties can be found directly by recurrence, using (16): ( ) At even perturbation orders, only the longitudinal magnetization is modified; at odd orders, only the transverse magnetization. ( ) At even perturbation orders, only the even harmonics are involved; at odd orders, only the odd harmonics. (

) For each value of , the values of

to be retained are ,

2, ...

+ 2,

Table I: Double-entry table indicating the Fourier components of the magnetization that are non-zero to the th perturbation order in 1 .

1363



COMPLEMENT BXIII

Comment:

This structure is valid only for a particular polarization of the radiofrequency field B1 ( ) (perpendicular to B0 ). Analogous tables could be constructed for other radiofrequency polarizations. 3.

Discussion

We shall now interpret the results of this calculation, through third order. 3-a.

Zeroth-order solution: competition between pumping and relaxation

According to (14), the only non-zero zeroth-order component is: (0) 0

=

(17)

0

In the absence of radiofrequency fields, there is only a static longitudinal magnetization ( = 0). Since , is proportional to the population difference of the states + and shown in Figure 2 (cf. Complement EIV ), it can also be said that the pumping populates these two states unequally. The larger the number of particles entering the cell (the more efficient the pumping) (0) and the longer (the slower the relaxation), the larger 0 . The zeroth-order solution (17) therefore describes the dynamic equilibrium resulting from competition between the pumping and relaxation processes. From now on, in order to simplify the notation, we shall set: (0)

0

=0

Γ =

3-b.

+

.

(18a)

1

(18b)

First-order solution: the linear response

To first order, only the transverse magnetization = , it suffices to study +.

is different from zero. Since

Motion of the transverse magnetization

According to Table I, for (16b), using (18), we get: (1) 1

+

(1) 1

+

= =

1. Setting

= 1 and

1364

=

e

0 1

2

0

+ Γ

(19b)

+ Γ

+

e + + Γ 0

1 in

(19a)

1 0+

Substituting these expressions into (12b) and then into (9b), we obtain order in 1 : +

=

+ Γ

0 0

2

=

1

0

2

= 1, we have

+

to first

(20)



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

The point representing + describes the same motion in the complex plane as the projection of in the plane perpendicular to B0 . According to (20), this motion results from the superposition of two circular motions with the same angular velocity, one of them right circular (the e term) and the other left circular (the e term). The resulting motion, in the general case, is therefore elliptical. .

Existence of two resonances

The right circular motion has a maximum amplitude when 0 = , and the left circular motion, when 0 = . therefore presents two resonances (while for a rotating field, there was a single resonance; see Complement FIV ). The interpretation of this phenomenon is as follows: the linear radiofrequency field can be broken down into a left and a right circular field, each of which induces a resonance; since the rotation directions are opposed, the static fields B0 for which these resonances appear are opposed. .

Linear susceptibility

Near a resonance ( (20). We then get: e

0 +

1

2

0

, for example), we can neglect the non-resonant term in

0

+ Γ

0

(21)

+ is therefore proportional to the rotating radiofrequency field component in the direction corresponding to the resonance, 1 e 2 in this case. The ratio of + to this component is called the linear susceptibility ( ):

( )=

1 0 0

+ Γ

(22)

( ) is a complex susceptibility because of the existence of a phase difference between and the rotating component of the radiofrequency field responsible for the resonance. The square of the modulus of ( ) has the classical resonant form in the neighborhood of = 0 (Fig. 3), over an interval of width: ∆ = 2Γ =

2

(23)

The longer the relaxation time , the sharper the resonance curve. From now on, we shall assume that the two resonances 0 = and 0 = are completely separated, i.e. that: Γ =

1

(24)

The phase difference varies from 0 to when we pass through resonance. It is equal to 2 at resonance: it is when and the rotating component are out of phase by 2 that the work of the couple exerted by the field on is maximal. The sign of this work depends on the sign of 0 , that is, on that of 0 : it depends on whether the spin states of the entering particles are + or . In one case (spins entering in the lower level), the work is furnished by the field, and energy is transferred from the field to the spins (absorption). In the opposite case (particles entering in the higher level), the work is negative, and energy is transferred from the spins to the field (induced emission). The latter situation occurs in atomic amplifiers and oscillators (masers and lasers). 1365



COMPLEMENT BXIII

(ω) 2

2 TR

0

ω0

ω

Figure 3: Variation of the square of the modulus ( ) 2 of the linear susceptibilite of the spin system, with respect to . A resonance appears, of width 2 , centered at = 0 .

3-c.

Second-order solution: absorption and induced emission (2)

(2)

To second order, according to Table I, only 0 and 2 are non-zero. First, (2) we shall study 0 , that is, the static population difference of the states + and (2) to second order. We shall then consider 2 , that is, the generation of the second harmonic. .

Variation of the population difference of the two states of the system (2) 0

0

corrects the zeroth-order result obtained for

(0) 0

0.

According to (16a) and

(13b): (2) 0

= =

(1) 1



(1) 1



+

+

(1) 1

+

(1) 1

+

(1) 1

+

(1) 1

+

(1) 1

+

(1) 1

+

(25)

which, according to first-order solutions (19a) and (19b), yields: (2) 0

=

1 2 + ( + 2 0) + Γ

0

4

(

1 2 2 0) + Γ

(26)

Grouping the static terms ( = 0) through second order in (9a), we get: (static) =

0

1

2 1

4

(

1 2 + ( + 2 0) + Γ

1 2 2 0) + Γ

+

Figure 4 represents this static longitudinal magnetization as a function of 1366

(27) 0.



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

ℳz (static)

ℳ0ω12 T R2

2

2

4

TR

TR

ℳ0

–ω

0

ω

ω0

Figure 4: Variation of the static longitudinal magnetization with respect to 0 . To second order in the perturbation treatment, there appear two resonances of width 2 , centered at 0 = and 0 = . The calculation is valid only if the relative intensity of the resonances is small, that is, if 1 1.

The population difference is therefore always decreased, to second order, relative to its value in the absence of radiofrequency, and the decrease is proportional to the intensity of the radiofrequency field. This is simple to understand: under the effect of the incident field, transitions are induced from + to (induced emission) or from to + (absorption); whatever the sign of the initial population difference, the transitions from the more populated state are the more numerous, so that they decrease the population difference.

Comment: (2)

2 2 2 2 The maximum value of 12 0 is = 4 (the resonance 0 1 4Γ 0 1 amplitude which appears as a dip in Figure 4). For the perturbation expansion to make sense, it is therefore necessary that:

1

1

(28) 1367

COMPLEMENT BXIII

.



Generation of the second harmonic According to (16a), (13b), (19a) and (19b): (2) 2

= =

1 4(2

(1) 1

Γ )

+

+

1

0

8(2

(1) 1

Γ )

0+

1 Γ

(29)

+ Γ

0

(2) 2

describes a vibration of the magnetic dipole along at the angular frequency 2 . The system can therefore radiate a wave of angular frequency 2 , polarized (as far as the magnetic field is concerned) linearly along . Thus, we see that an atomic system is not generally a linear system. It can double the excitation frequency, triple it (as we shall see later), etc. The same type of phenomenon exists in optics for very high intensities (“non-linear optics”): a red laser beam (produced, for example, by a ruby laser) falling on a material such as a quartz crystal can give rise to an ultraviolet light beam (doubled frequency).

Comment: (2)

It will prove useful to compare 0 and According to (29), for 0 , we have: (2) 2

(2) 2

in the neighborhood of

0

16

0

= .

(30)



Similarly, (26) indicates that: (2) 0

0

(31)

4Γ2

Therefore, for (2) 2 (2) 0

Γ 4

0:

= 0

1 4

1

(32)

0

according to (24). 3-d.

Third-order solution: saturation effects and multiple-quanta transitions (3)

(3)

To third order, Table I shows that only 1 and 3 are non-zero; it suffices to study (3) + . (3) , found to first order + corrects to third order the right circular motion of 1 (3) and analyzed in § 3-b above. We shall see that 1 + corresponds to a saturation effect in the susceptibility of the system. (3) , of angular frequency + represents a new component of the motion of 3 3 of the motion of (generation of the third harmonic). Moreover, the resonant (3) nature of 3 can be interpreted as resulting from + in the neighborhood of 0 = 3 the simultaneous absorption of three radiofrequency photons, a process which conserves both the total energy and the total angular momentum. 1368

• .

LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

Saturation of the susceptibility of the system According to (16b): (3) 1

+

=

1 2

1 + Γ

0

(2) 2

(2)

+0

(33)

Since we are interested in the correction to the right circular motion discussed in § 3.b, which is resonant at = 0 , we shall place ourselves in the neighborhood of 0 = . We can then, according to the comment in the preceding section [cf. formula (32], neglect (2) (2) (2) compared to 0 . Thus we obtain, using expression (26) for 0 (neglecting 2 the term whose resonance peak is at 0 = ): (3) 1

1

0 +

8

+ Γ

0

1 2 2 0) + Γ

(

(34)

If we regroup results (34) and (19a), we find the expression for the right circular motion of 2 , to third order in 1 : + at the frequency + (right

circular) = e

0 1

2

+ Γ

0

2 1

1

1 2 2 0) + Γ

4 (

(35)

Comparing (35) and (21), we see that the susceptibility of the system goes from value (22) to the value: ( )=

1 0 0

+ Γ

1

2 1

4 (

0

1 )2 + Γ2

(36)

It is therefore multiplied by a factor smaller than one; the greater the intensity of the radiofrequency field and the nearer we are to resonance, the smaller the factor. The system is then said to be “saturated”. The 12 term of (36) is called the “non-linear susceptibility”. The physical meaning of this saturation is very clear. A weak electromagnetic field induces in the atomic system a dipole moment which is proportional to it. If the field amplitude is increased, the dipole cannot continue to increase proportionally to the field. The absorption and emission transitions induced by the field decrease the population difference of the atomic states involved. Consequently, the atomic system responds less and less to the field. Furthermore, we see that the term in brackets in (36) is none other than the term that expresses the decrease in the population difference to second order [cf. formula (27), in which the term resonant at 0 = was neglected].

Comment: The saturation terms play a very important role in all maser or laser theories. Consider Figure 1 again. If we keep only the linear response term in the first step of the calculation (arrow directed to the right), the induced dipole moment is proportional to the field. If the material amplifies (and if the losses of the electromagnetic cavity are sufficiently small), the reaction of the dipole on the field (arrow directed to the left) tends to increase the

1369

COMPLEMENT BXIII



field by a quantity proportional to it. Thus, we obtain for the field a linear differential equation which leads to a solution which increases linearly with time. It is the saturation terms that prevent this unlimited increase. They lead to an equation whose solution remains bounded and approaches a limit which is the steadystate laser field in the cavity. Physically, these saturation terms express the fact that the atomic system cannot furnish the field with an energy greater than that corresponding to the population difference initially introduced by the pumping.

.

Three-photon transitions According to (16b) and (29): (3) 3

+

= =

1 2

0

1 3 + Γ

0

16

0

(2) 2

1 3 + Γ

1 2 (3)

1 Γ

0+

1 Γ

0

(37)

+ Γ (2)

With respect to the term 3 : + , we could make the same comment as for 2 the atomic system produces harmonics of the excitation frequency (here, the third harmonic). (2) The difference with the discussion of the preceding section relative to 2 is the appearance of a resonance centered at 0 = 3 [due to the first resonant denominator of (37)]. We can give a particle interpretation of the 0 = resonance discussed in the preceding sections: the spin goes from the state to the state + by absorbing a photon (or emitting it, depending on the relative positions of the + and states). There is resonance when the energy ~ of the photon is equal to the energy ~ 0 of the atomic transition. We could give an analogous particle interpretation of the 0 = 3 resonance. Since ~ 0 = 3~ , the transition necessarily involves three photons, since the total energy must be conserved. We may wonder why no resonance has appeared to second order for ~ 0 = 2~ (two-photon transition). The reason is that the total angular momentum must also be conserved during the transition. The linear radiofrequency field is, as we have already said, a superposition of two fields rotating in opposite directions. With each of these rotating fields are associated photons of a different type. For the right circular field, it is + photons, transporting an angular momentum +~ relative to . For the left circular field, it is photons, transporting an angular momentum ~. To go from the state to the + state, the spin must absorb an angular momentum +~ relative to (the difference between the two eigenvalues of ). It can do so by absorbing a + photon; if 0 = , there is also conservation of the total energy, which explains the appearance of the 0 = resonance. The system can also acquire an angular momentum +~ by absorbing three photons (Fig. 5): two + photons and one photon. Therefore, if 0 = 3 , both energy and total angular momentum can be conserved, which explains the 0 = 3 resonance. On the other hand, two photons can never give the atom an angular momentum +~: either both photons are + and they carry 2~, or they are both and they carry 2~, or one is + and one is and they carry no total angular momentum. 1370



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

+

ħω0

σ–

ħω

σ+

ħω

σ+

ħω



Figure 5: The spin can go from the state to the + state by absorbing three photons of energy ~ . The total energy is conserved if ~ 0 = 3~ . The angular momentum is conserved if two photons have a + polarization (each carries an angular momentum +~ relative to ) and the third has a polarization (it carries an angular momentum ~).

These arguments can easily be generalized and enable us to show that resonances appear when 0 = , 3 , 5 , 7 , ..., (2 + 1) , ..., corresponding to the absorption of an (2 +1) odd number of photons. Furthermore, we see from formula (16b) that 2 +1 + gives rise to a resonance peak for 0 = (2 + 1) . Nothing analogous occurs at even orders since, according to Table I, we must then use equation (16a).

Comments:

( ) If the field B1 is rotating, there is only one type of photon, + or . The same argument shows that a single resonance can then occur, at 0 = if the photons are + and at 0 = if they are . This enables us to understand why the calculations are much simpler for a rotating field and lead to an exact solution. It is instructive to apply the method of this complement to the case of a rotating field and to show that the perturbation series can be summed to give the solution found directly in Complement FIV . ( ) Consider a system having two levels of different parities, subject to the influence of an oscillating electric field. The interaction Hamiltonian then has the same structure as the one we are studying in this complement: has only non-diagonal elements. Similarly, the electric dipole Hamiltonian, since it is odd, can have no diagonal elements. In the second case, the calculations 1371

COMPLEMENT BXIII

(



are very similar to the preceding ones and lead to analogous conclusions: resonances are found for 0 = , 3 , 5 , ... The interpretation of the “odd” nature of the spectrum is then as follows: the electric dipole photons have a negative parity, and the system must absorb an odd number of them in order to move from one level to another of different parity. ) For the spin 1/2 case, assume that the linear radiofrequency field is neither parallel nor perpendicular to B0 (Fig. 6). B1 can then be broken down into a component parallel to B0 , B1 , with which are associated photons (with zero angular momentum relative to ), and a component B1 , with which, as we have seen, + and photons are associated. In this case, the atom can increase its angular momentum relative to by +~, and move from to + , by absorbing two photons, one + and the other . It can be shown, by applying the method of this complement, that for this polarization of the radiofrequency, a complete (even and odd) spectrum of resonances appears: 0 = , 2 , 3 , 4 , ...

B0

B1 // B1

Figure 6: The static magnetic field B0 and the radiofrequency field B1 , in the case in which B1 is neither parallel nor perpendicular to B0 . B1 and B1 are the components of B1 parallel and perpendicular to B0 .

B1 ⊥

4.

Exercises: applications of this complement

EXERCISE 1 In equations (1), set 1 = 0 (no radiofrequency) and choose µ0 parallel to (transverse pumping). Calculate the steady-state values of , and . Show that and undergo resonant variations when the static field is swept about zero (the Hanle effect). Give a physical interpretation of these resonances (pumping in competition with Larmor precession) and show that they permit the measurement of the product . 1372



LINEAR AND NON-LINEAR RESPONSES OF A TWO-LEVEL SYSTEM

EXERCISE 2 Consider a spin system subjected to the same static field B0 and to the same pumping and relaxation processes as in this complement. These spins are also subjected to two linear radiofrequency fields, the first one of angular frequency and amplitude , and the second one of angular frequency and amplitude 1 , parallel 1 , parallel to to . Using the general methods described in this complement, calculate the magnetization of the spin system to second order in 1 = 1 and 1 = 1 (terms in 2 2 , , ). We fix = and . Assume , and let vary. Show that, 1 1 0 0 1 0 1 1 to this perturbation order, two resonances appear, one at = 0 and the other at = 0+ . Give a physical interpretation of these two resonances (the first one corresponds to a two-photon absorption, and the second, to a Raman effect). References and suggestions for further reading:

See section 15 of the bibliography. Semiclassical theories of masers and lasers: Lamb (15.4) and (15.2), Sargent et al. (15.5), Chap. VIII, IX and X. Non-linear optics: Baldwin (15.19), Bloembergen (15.21), Giordmaine (15.22). Iterative solution of the master equation: Bloembergen (15.21), Chap. 2, §§ 3, 4 and 5 and Appendix III. Multiphoton processes in R. F. range, Hanle effect: Brossel’s lectures in (15.2).

1373



COMPLEMENT CXIII

Complement CXIII Oscillations of a system between two discrete states under the effect of a sinusoidal resonant perturbation

1 2 3

The method: secular approximation . . . . . . . . . . . . . . 1374 Solution of the system of equations . . . . . . . . . . . . . . 1375 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1376

The approximation method used to calculate the effect of a resonant perturbation in Chapter XIII is not valid over long periods of time. We have seen [cf. condition (C-18) of this chapter] that must satisfy: ~

(1)

Suppose we want to study the behavior of a system subjected to a resonant perturbation over a considerable time [for which condition (1) is not satisfied]. Since the first-order solution is then insufficient, we could try to calculate a certain number of higher-order terms to obtain a better expression for P ( ; ): (1)

P (; )=

( )+

2 (2)

( )+

3 (3)

2

( )+

(2)

Such a method would lead to unnecessarily long calculations. We shall see here that it is possible to solve the problem more elegantly and rapidly by fitting the approximation method to the resonant nature of the perturbation. The resonance condition implies that only the two discrete states and are effectively coupled by ( ). Since the system, at the initial instant, is in the state [ (0) = 1], the probability amplitude ( ) of finding it in the state at time can be appreciable. On the other hand, all the coefficients ( ) (with = , ) remain much smaller than 1 since they do not satisfy the resonance condition. This is the basis of the method we shall use. 1.

The method: secular approximation

In Chapter XIII, we replaced all the components ( ) on the right-hand side of (B-11) by their values (0) at time = 0. Here, we shall do the same thing for the components for which = , . However, we shall explicitly keep ( ) and ( ). Thus, in order to determine ( ) and ( ), we are led to the system of equations [the perturbation having the form (C-1a) of Chap. XIII]: ~ ~ 1374

d d

d d

()= ()=

1 2 1 2

e e(

( )+ e(

e +

)

e

(

)

)

( )+ e

e

( +

e

)

() ()

(3)



OSCILLATIONS OF A SYSTEM BETWEEN TWO DISCRETE STATES

On the right-hand side of these equations, certain coefficients of ( ) and ( ) are ) proportional to e ( , so they oscillate slowly in time when . On the other hand, the coefficients proportional either to e or to e ( + ) oscillate much more rapidly. Here, we shall use the secular approximation, which consists in neglecting the second type of terms. The remaining ones, called “secular terms”, are then those whose coefficients reduce to constants for = . When integrated over time, they make significant contributions to the variations of the components ( ) and ( ). On the other hand, the contribution of the other terms is negligible, since their variation is too rapid (the integration of e Ω causes a factor 1 Ω to appear, and the average value of e Ω over a large number of periods is practically zero).

Comment:

For the preceding argument to be valid, it is necessary for the temporal variation of a term e ( ) to be due principally to the exponential, and not to the component ( ). Since is very close to , this means that ( ) must not significantly vary over a time interval of the order of 1 . This is indeed true with the assumptions we have made, that is, with 0 . The variations of ( ) and ( ) (which are constants if = 0) are due to the presence of the perturbation , and are appreciable for times of the order of ~ [this can be verified directly from formulas (8), obtained below]. Since by hypothesis ~ , this time is much greater than 1 . In conclusion, the secular approximation leads to the system of equations: d d

()=

d d

()=

1 ( e 2~ 1 e 2~

)

()

(4a)

)

()

(4b)

(

whose solution, very close to that of system (3), is easier to calculate, as we shall see in the next section. 2.

Solution of the system of equations

We shall begin by considering the case for which substituting (4b) into the result, we obtain: d2 ()= d2

1 4~2

2

=

. Differentiating (4a) and

()

Since the system is in the state

(5) at time = 0, the initial conditions are:

(0) = 1

(6a)

(0) = 0

(6b) 1375

COMPLEMENT CXIII



which, according to (4), gives: d (0) = 0 d

(7a)

d (0) = d 2~

(7b)

The solution of (5) that satisfies (6a) and (7a) can be written: ( ) = cos

(8a)

2~

We can then calculate ()=e

sin

from (4a): (8b)

2~

where is the argument of . The probability P ( ; = in the state at time is therefore, in this case, equal to: P (;

) = sin2

=

) of finding the system

(9)

2~

When is different from (while remaining close to the resonance value), the differential system (4) is still exactly soluble. In fact, it is completely analogous to the one we obtained in Complement FIV [cf. equation (15)] in studying the magnetic resonance of a spin 1/2. The same type of calculation as in that complement leads to the analogue of relation (27) (Rabi’s formula), which can be written here: 2

P (; )= [when 3.

=

2

+ ~2 (

)2

sin2

2

~2

+(

)2

2

(10)

, this expression does reduce to (9)].

Discussion

The discussion of the result obtained in (10) is the same as that of the magnetic resonance of a spin 1/2 (cf. Complement FIV , § 2-c). The probability P ( ; ) is an oscillating function of time; for certain values of , P ( ; ) = 0, and the system has gone back into the initial state . Furthermore, equation (10) measures the magnitude of the resonance phenomenon. When = , however small the perturbation is, it can cause the system to move 1 completely from the state to the state . On the other hand, if the perturbation is not resonant, the probability P ( ; ) always remains less than 1. Finally, it is interesting to compare the result obtained in this complement with the one obtained using the first-order theory in Chapter XIII. First of all, note that, 1 The magnitude of the perturbation, characterized by taken by the system to move from to . The smaller

1376

, enters, at resonance, only into the time , the longer the time.



OSCILLATIONS OF A SYSTEM BETWEEN TWO DISCRETE STATES

for all values of , the probability P ( ; ) obtained in (10) is included between 0 and 1. The approximation method used here therefore enables us to avoid the difficulties encountered in Chapter XIII (cf. § C-2-c- ). When we let approach zero in (9), we get (C-17) of this chapter. Thus, first-order perturbation theory is indeed valid for sufficiently small (cf. comment of § B-3-b). It amounts to replacing the sinusoid which represents P ( ; ) as a function of time by a parabola.

1377

COMPLEMENT DXIII



Complement DXIII Decay of a discrete state resonantly coupled to a continuum of final states

1 2

3 4 5

1.

Statement of the problem . . . . . . . . . . . . . . . . . . . . 1378 Description of the model . . . . . . . . . . . . . . . . . . . . . 1379 2-a Assumptions about the unperturbed Hamiltonian 0 . . . . . 1379 2-b Assumptions about the coupling . . . . . . . . . . . . . . 1380 2-c Results of first-order perturbation theory . . . . . . . . . . . 1380 2-d Integrodifferential equation equivalent to the Schrödinger equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1382 Short-time approximation. Relation to first-order perturbation theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1383 Another approximate method for solving the Schrödinger equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1385 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1386 5-a Lifetime of the discrete state . . . . . . . . . . . . . . . . . . 1386 5-b Shift of the discrete state due to the coupling with the continuum1387 5-c Energy distribution of the final states . . . . . . . . . . . . . 1388

Statement of the problem

In § C-3 of Chapter XIII, we showed that the coupling induced by a constant perturbation between an initial discrete state of energy and a continuum of final states (some of which have an energy equal to ) causes the system to go from the initial state to this continuum of final states. More precisely, the probability of finding the system in a well-defined group of states of the continuum at time increases linearly with time. Consequently, the probability P ( ) of finding the system in the initial state at time must decrease linearly over time from the value P (0) = 1. It is clear that this result is valid only over short times, since extrapolation of the linear decrease of P ( ) to long times would lead to negative values of P ( ), which would be absurd for a probability. This raises the problem of determining the long-time behavior of the system. We encountered an analogous problem when we studied the resonant transitions induced by a sinusoidal perturbation between two discrete states and . Firstorder perturbation theory predicts a decrease proportional to 2 of P ( ) from the initial value P (0) = 1. The method presented in Complement CXIII shows that the system actually oscillates between the states and . The decrease with 2 found in § C of Chapter XIII merely represents the “beginning” of the corresponding sinusoid. We might expect an analogous result in the problem with which we are concerned here (oscillations of the system between the discrete state and the continuum). We shall show that this is not the case: the physical system leaves the state irreversibly. We find an exponential decrease e Γ for P ( ) (for which the perturbation treatment gives 1378



DECAY OF A DISCRETE STATE RESONANTLY COUPLED TO A CONTINUUM OF FINAL STATES

only the short-time behavior 1 Γ ). Thus, the continuous nature of the set of final states causes the reversibility found in Complement CXIII to disappear; it is responsible for a decay of the initial state, which thus acquires a finite lifetime (unstable state; cf. Complement KIII ). The situation envisaged in the present complement is very frequently encountered in physics. For example, a system, initially in a discrete state, can split, under the effect of an internal coupling (described, consequently, by a time-independent Hamiltonian ), into two distinct parts whose energies (kinetic in the case of material particles and electromagnetic in the case of photons) can have, theoretically, any value; this gives the set of final states a continuous nature. Thus, in -decay, a nucleus initially in a discrete state is transformed (via the tunnel effect) into a system composed of an particle and another nucleus. A many-electron atom initially in a configuration (cf. Complements AXIV and BXIV ) in which several electrons are excited can, under the effect of electrostatic interactions between electrons, give rise to a system formed of an ion + and a free electron (the energy of the initial configuration must, of course, be greater than the simple ionization limit of ): this is the “autoionization” phenomenon. We can also cite the spontaneous emission of a photon by an excited atomic (or nuclear) state: the interaction of the atom with the quantized electromagnetic field couples the discrete initial state (the excited atom in the absence of photons) with a continuum of final states (the atom in a lower state in the presence of a photon of arbitrary direction, polarization and energy). Finally, we can mention the photoelectric effect, in which a perturbation, now sinusoidal, couples a discrete state of an atom to a continuum of final states (the ion + and the photoelectron ). These few examples of unstable states taken from various domains of physics are sufficient to underline the importance of the problem we are treating in this complement. 2.

Description of the model

2-a.

Assumptions about the unperturbed Hamiltonian

0

To simplify the calculations as much as possible, we shall make the following assumptions about the spectrum of the unperturbed Hamiltonian 0 . This spectrum includes: ( ) a discrete state 0

=

( ) a set of states 0

of (non-degenerate) energy

=

: (1)

forming a continuum: (2)

can take on a continuous infinity of values, distributed over a portion of the real axis including . We shall assume, for example, that varies from 0 to + : 0 Each state we shall denote by

(3) is characterized by its energy and a set of other parameters which (as in § C-3-a- of Chapter XIII). can therefore also be written 1379

COMPLEMENT DXIII



in the form

. We have [cf. formula (C-28) of Chap. XIII]:

d = (

)d d

(4)

where ( ) is the density of final states. The eigenstates of 0 satisfy the following relations (orthogonality and closure relations): =1

(5a)

=0

(5b)

= (

+ 2-b.

)

d

(5c)

=1

(6)

Assumptions about the coupling

We shall assume that elements:

is not explicitly time-dependent and has no diagonal

=0 =0

(7)

(if these diagonal elements were not zero, we could always add them to those of 0 , which would simply amount to changing the unperturbed energies). Similarly, we shall assume that cannot couple two states of the continuum: =0

(8)

The only non-zero matrix elements of are then those connecting the discrete state with the states of the continuum. It is these matrix elements, , that are responsible for the decay of the state . The preceding assumptions are not too restrictive. In particular, condition (8) is very often satisfied in the physical problems alluded to at the end of § 1. The advantage of this model is that it enables us to investigate the physics of the decay phenomenon without too many complicated calculations. The essential physical conclusions would not be modified by using a more elaborate model.

Before taking up the new method for solving the Schrödinger equation which we are describing in this complement, we shall indicate the results of the first-order perturbation theory of Chapter XIII as they apply to this model. 2-c.

Results of first-order perturbation theory

The discussion of § C-3 of Chapter XIII enables us to calculate [using, in particular, formula (C-36)] the probability of finding the physical system at time (initially in the state ) in a final state of arbitrary energy belonging to a group of final states characterized by the interval around the value . 1380



DECAY OF A DISCRETE STATE RESONANTLY COUPLED TO A CONTINUUM OF FINAL STATES

Here, we shall concern ourselves with the probability of finding the system in any of the final states : neither nor is specified. We must therefore integrate expression (C-36) of Chapter XIII with respect to , which gives the probability density [the integration over the energy was already performed in (C-36)]. Thus, we introduce the constant: Γ=

2 ~

d

2

=

(

=

)

(9)

The desired probability is then equal to Γ . With the assumptions of § 2-a, it represents the probability of the system having left the state at time . If we call P ( ) the probability that the system is still in this state at time , we have: P ()=1

Γ

(10)

In the discussion of the following sections, it is important to recall the validity conditions for (10): ( ) Expression (10) results from a first-order perturbation theory which is valid only if P ( ) differs only slightly from its initial value P (0) = 1. We then must have: 1 Γ

(11)

( ) Furthermore, (10) is valid only for sufficiently long times . To state the second condition more precisely, and to see, in particular, if it is compatible with (11), we return to expression (C-31) of Chapter XIII ( and are no longer constrained to vary only inside the intervals and ). Instead of proceeding as we did in Chapter XIII, we shall integrate the probability density appearing in (C-31), first over and then over . The following integral then appears: 1 ~2

d

( )

where

(12)

~

0

( ), which results from the first integration over , is given by:

( )=

d

2

(

)

(13)

is the diffraction function defined by (C-7) of Chapter XIII, centered at and of width 4 ~ . Let ~∆ be the “width” of ( ): ~∆ represents the order of magnitude of the variation needed for ( ) to change significantly (cf. Fig. 1). As soon as is sufficiently large that: =

~

1 ∆ ~

(14) behaves like a “delta function” with respect to

( ). Using relation (C-32) 1381



COMPLEMENT DXIII

F

t,

E – Ei ћ

K(E)

ћ𝛥 4πћ t 0

Ei

E

Figure 1: Variation of the functions ( ) and with respect to . The ~ respective “widths” of the two curves are of the order of ~∆ and 4 ~ . For sufficiently large , behaves like a “delta function” with respect to ( ). ~

of Chapter XIII, we can then write (12) in the form: 2 ~

d

(

)

( )=

2 ~

(

=

)=Γ

(15)

since by comparing (9) and (13), it can easily be seen that: 2 ~

(

=

)=Γ

(16)

Again we find that the linear decrease appearing in (10) is valid only if enough to satisfy (14). Conditions (11) and (14), obviously, are compatible only if: ∆

is large

Γ

(17)

We have thus given a quantitative form to the condition stated in the note of Chap. XIII on page 1318. In the rest of this complement, we shall assume that inequality (17) is satisfied. 2-d.

Integrodifferential equation equivalent to the Schrödinger equation

It is easy to adapt expressions (B-11) of Chapter XIII to the case we are studying here. The state of the system at time () = 1382

( )e

~

+

d

(

can be expanded on the )e

~

basis: (18)



DECAY OF A DISCRETE STATE RESONANTLY COUPLED TO A CONTINUUM OF FINAL STATES

When we substitute state vector (18) into the Schrödinger equation, using the assumptions stated in §§ 2-a and 2-b, we obtain, after a calculation that is analogous to the one in § B-1 of Chapter XIII, the following equations of motion: d d d ~ d

d e(

()=

~

)= e(

(

)

)

~

(

~

)

(19)

()

(20)

The problem consists of using these rigorous equations to predict the behavior of the system after a long time, taking into account the initial conditions: (0) = 1 (

(21a)

0) = 0

(21b)

d The simplifying assumptions which we made for imply that ( ) depends d d only on ( ), and ( ), only on ( ). Consequently, we can integrate equation d (20), taking initial condition (21b) into account. Substituting the value obtained in this way for ( ) into (19), we obtain the following equation describing the evolution of ( ): d d

()=

1 ~2

d e(

d

)(

) ~

2

( )

(22)

0

By using (4) and performing the integration over , we obtain, according to (13): d d

()=

1 ~2

d 0

d

( )e(

)(

) ~

( )

(23)

0

Thus, we have been able to obtain an equation involving only . However, it must be noted that this equation is no longer a differential equation, but an integrodifferential d equation: the time derivative ( ) depends on the entire “history of the system” d between the times 0 and . Equation (23) is rigorously equivalent to the Schrödinger equation, but we do not know how to solve it exactly. In the following sections, we shall describe two approximate methods for solving this equation. One of them (§ 3) is equivalent to the first-order theory of Chapter XIII; the other one (§ 4) enables us to study the long-time behavior of the system more satisfactorily. 3.

Short-time approximation. Relation to first-order perturbation theory

If

is not too large, that is, if ( ) is not too different from (0) = 1, we can replace ( ) by (0) = 1 on the right-hand side of (23). This right-hand side then reduces to a double integral, over and , whose integration presents no difficulties: 1 ~2

d 0

d

( )e (

)(

) ~

(24)

0

1383



COMPLEMENT DXIII

We shall perform this calculation explicitly, since it allows us to introduce two constants [one of which is Γ, defined by (9)] which play an important role in the more elaborate method described in § 4. We shall begin by integrating over ’ in (24). According to relation (47) of Appendix II, the limit of this integral for is the Fourier transform of the Heaviside step function. More precisely: e(

Lim

)

~

d =~

(

)+

1

(25)

0

(we have set = ). Actually, it is not necessary to let approach infinity in order to use (25) in the calculation of (24). It suffices for ~ to be much smaller than the “width” ~∆ of ( ), that is, for to be much greater than 1 ∆. We again find the validity condition (14). If this condition is satisfied, we can use (25) to write (24) in the form: (

=

( )

)

~

~

d

(26)

0

The first term of (26) is, according to (16), simply ( )

=

Γ 2. We shall set:

d

(27)

0

Therefore, the double integral (24) is equal to: Γ 2

(28)

When $b_i(t')$ is replaced by $b_i(0) = 1$ in (23), this equation then becomes [as soon as (14) is satisfied]:

$$\frac{\mathrm{d}}{\mathrm{d}t}\, b_i(t) = -\frac{\Gamma}{2} - i\,\frac{\delta E}{\hbar} \tag{29}$$

The solution of (29), using the initial condition (21a), is very simple:

$$b_i(t) = 1 - \left(\frac{\Gamma}{2} + i\,\frac{\delta E}{\hbar}\right) t \tag{30}$$

Obviously, this result is valid only if $b_i(t)$ differs slightly from 1, that is, if:

$$t \ll \frac{1}{\Gamma}\,,\ \frac{\hbar}{\delta E} \tag{31}$$

This is the other validity condition, (11), for first-order perturbation theory. Using (30), we can easily calculate the probability $\mathscr{P}_{ii}(t) = |b_i(t)|^2$ that the system is still in the state $|\varphi_i\rangle$ at time $t$. If we neglect terms in $\Gamma^2 t^2$ and $\delta E^2 t^2/\hbar^2$, we obtain:

$$\mathscr{P}_{ii}(t) = 1 - \Gamma t \tag{32}$$

All the results obtained in Chapter XIII can then be deduced from equation (23) when $b_i(t')$ is replaced by $b_i(0)$. This equation has also enabled us to introduce the parameter $\delta E$, whose physical significance will be discussed later [note that $\delta E$ does not appear in the treatment of Chapter XIII because we were concerned there only with the calculation of the probability $|b_i(t)|^2$, and not with that of the probability amplitude $b_i(t)$].
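As a quick numerical illustration of the validity condition (31) (a sketch, not from the text, with time measured in the arbitrary unit $1/\Gamma$), one can compare the first-order result (32) with the exponential law obtained in § 4 below:

```python
import numpy as np

# Sketch (not from the text): compare the first-order survival probability
# 1 - Gamma*t (equation (32)) with the full exponential exp(-Gamma*t) of Sec. 4.
for x in (0.01, 0.1, 0.5, 1.0):          # x = Gamma * t
    print(f"Gamma*t = {x:4.2f}   1 - Gamma*t = {1 - x:6.3f}   "
          f"exp(-Gamma*t) = {np.exp(-x):6.3f}")
```

The two expressions agree only for $\Gamma t \ll 1$, which is precisely condition (31).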

4. Another approximate method for solving the Schrödinger equation

A better approximation consists of replacing $b_i(t')$ by $b_i(t)$ rather than by $b_i(0)$ in (23). To see this, we shall begin by performing the integral over $E$ which appears on the right-hand side of the rigorous equation, (23). We obtain a function of $t$ and $t'$:

$$\mathcal{K}(t - t') = -\frac{1}{\hbar^2} \int_0^{\infty} \mathrm{d}E\; K(E)\, e^{i(E_i - E)(t - t')/\hbar} \tag{33}$$

which is clearly different from zero only if $t - t'$ is very small. In (33), we are integrating over $E$ the product of $K(E)$, which varies slowly with $E$ (cf. Fig. 1), and an exponential whose period with respect to the variable $E$ is $2\pi\hbar/(t - t')$. If we choose values of $t$ and $t'$ such that this period is much smaller than the width $\hbar\Delta$ of $K(E)$, the product of these two functions undergoes numerous oscillations when $E$ is varied, and its integral over $E$ is negligible. Consequently, the modulus of $\mathcal{K}(t - t')$ is large for $t - t' \approx 0$ and becomes negligible as soon as $t - t' \gg 1/\Delta$. This property means that, for all $t$, the only values of $b_i(t')$ to enter significantly into the right-hand side of (23) are those which correspond to $t'$ very close to $t$ (more precisely, $t - t' \lesssim 1/\Delta$). Indeed, once the integration over $E$ has been performed, this right-hand side becomes:

$$\int_0^t \mathcal{K}(t - t')\, b_i(t')\, \mathrm{d}t' \tag{34}$$

and we see that the presence of $\mathcal{K}(t - t')$ practically eliminates the contribution of $b_i(t')$ as soon as $t - t' \gg 1/\Delta$.

Thus, the derivative $\frac{\mathrm{d}}{\mathrm{d}t} b_i(t)$ has only a very short memory of the previous values of $b_i(t')$ between 0 and $t$. Actually, it depends only on the values of $b_i$ at times immediately before $t$, and this is true for all $t$. This property enables us to transform the integrodifferential equation (23) into a differential equation. If $b_i(t)$ varies very little over a time interval of the order of $1/\Delta$, we make only a small error by replacing $b_i(t')$ by $b_i(t)$ in (34). This yields:

$$b_i(t) \int_0^t \mathcal{K}(t - t')\, \mathrm{d}t' = -\left(\frac{\Gamma}{2} + i\,\frac{\delta E}{\hbar}\right) b_i(t) \tag{35}$$

[to write the right-hand side of (35), we used the fact that the integral over $t'$ of $\mathcal{K}(t - t')$ is simply, according to (33), the double integral (24) evaluated in § 3 above]. Now, according to the results of § 3 (and as we shall see later), the time scale characteristic of the evolution of $b_i(t)$ is of the order of $1/\Gamma$ or $\hbar/\delta E$. The validity conditions for (35) are then:

$$\Gamma \ll \Delta\,, \qquad \frac{\delta E}{\hbar} \ll \Delta \tag{36}$$

which we have already assumed to be fulfilled [cf. (17)]. To a good approximation, and for all $t$, equation (23) can therefore be written:

$$\frac{\mathrm{d}}{\mathrm{d}t}\, b_i(t) = -\left(\frac{\Gamma}{2} + i\,\frac{\delta E}{\hbar}\right) b_i(t) \tag{37}$$





whose solution, using (21a), is obvious:

$$b_i(t) = e^{-\Gamma t/2}\, e^{-i\,\delta E\, t/\hbar} \tag{38}$$

It can easily be shown that the limited expansion of (38) gives (30) to first order in $\Gamma t$ and $\delta E\, t/\hbar$.
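This expansion can be verified in one line with a computer algebra system (a sketch with $\hbar = 1$; the symbol names are arbitrary):

```python
import sympy as sp

# Sketch (hbar = 1): expand the exact amplitude (38) to first order in t and
# recover the perturbative result (30), b_i(t) ~ 1 - (Gamma/2 + I*deltaE)*t.
t = sp.symbols('t', positive=True)
Gamma, deltaE = sp.symbols('Gamma deltaE', positive=True)
b_exact = sp.exp(-Gamma * t / 2) * sp.exp(-sp.I * deltaE * t)
print(sp.series(b_exact, t, 0, 2))
```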

Comment:

No upper bound has been imposed on the time $t$. On the other hand, the integral $\int_0^t \mathcal{K}(t - t')\,\mathrm{d}t'$ which appears in (35) is equal to $-(\Gamma/2 + i\,\delta E/\hbar)$ only if $t \gg 1/\Delta$, as we saw in § 3 above. For very short times, the theory presented here suffers from the same limitations as perturbation theory; however, it has the great advantage of being valid for long times.

If we now substitute expression (38) for $b_i(t)$ into equation (20), we obtain a very simple equation which enables us to determine the probability amplitude $b(\beta, E, t)$ associated with the state $|\beta, E\rangle$:

$$b(\beta, E, t) = \frac{1}{i\hbar}\, \langle \beta, E | W | \varphi_i \rangle \int_0^t \mathrm{d}t'\; e^{-\Gamma t'/2}\, e^{i(E - E_i - \delta E)t'/\hbar} \tag{39}$$

that is, carrying out the elementary integration over $t'$:

$$b(\beta, E, t) = \langle \beta, E | W | \varphi_i \rangle\; \frac{1 - e^{-\Gamma t/2}\, e^{i(E - E_i - \delta E)t/\hbar}}{E - E_i - \delta E + i\hbar\Gamma/2} \tag{40}$$

Equations (38) and (40), respectively, describe the decay of the initial state $|\varphi_i\rangle$ and the "filling" of the final states $|\beta, E\rangle$. Now let us study in greater detail the physical content of these two equations.

5. Discussion

5-a. Lifetime of the discrete state

According to (38), we have:

$$\mathscr{P}_{ii}(t) = |b_i(t)|^2 = e^{-\Gamma t} \tag{41}$$

$\mathscr{P}_{ii}(t)$ therefore decreases irreversibly from $\mathscr{P}_{ii}(0) = 1$ and approaches zero as $t \to \infty$ (Fig. 2). The discrete initial state is said to have a finite lifetime $\tau$, where $\tau$ is the time constant of the exponential of Figure 2:

$$\tau = \frac{1}{\Gamma} \tag{42}$$

This irreversible behavior contrasts sharply with the oscillations of the system (Rabi's formula) between two discrete states when it is subject to a resonant perturbation coupling these two states.




[Figure 2: plot of 𝒫ii(t) versus t, decreasing from 1 and reaching 1/e at t = τ = 1/Γ.]

Figure 2: Variation with respect to time of the probability $\mathscr{P}_{ii}(t)$ of finding the system in the discrete state $|\varphi_i\rangle$ at time $t$. We obtain an exponential decrease, $e^{-\Gamma t}$, for which Fermi's golden rule gives the tangent at the origin (this tangent is represented by a dashed line).
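To make the contrast mentioned above concrete, the following purely illustrative sketch (not from the text; both rates are set to one in arbitrary units) prints the probability of remaining in the initial state for the two situations: decay into a continuum, and resonant Rabi coupling between two discrete states.

```python
import numpy as np

# Sketch (arbitrary units): irreversible decay into a continuum,
# P(t) = exp(-Gamma t), versus the reversible resonant Rabi oscillation
# between two discrete states, P(t) = cos^2(omega1 t / 2).
Gamma, omega1 = 1.0, 1.0
for t in np.linspace(0.0, 6.0, 7):
    print(f"t = {t:3.1f}   continuum: {np.exp(-Gamma * t):5.3f}   "
          f"two discrete levels: {np.cos(omega1 * t / 2) ** 2:5.3f}")
```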

5-b. Shift of the discrete state due to the coupling with the continuum

If we go from $b_i(t)$ to $c_i(t)$ [cf. formula (B-8) of Chapter XIII], we obtain, from (38):

$$c_i(t) = e^{-\Gamma t/2}\, e^{-i(E_i + \delta E)t/\hbar} \tag{43}$$

Recall that, in the absence of the coupling $W$, we would have:

$$c_i(t) = e^{-iE_i t/\hbar} \tag{44}$$

In addition to the exponential decrease, $e^{-\Gamma t/2}$, the coupling with the continuum is therefore responsible for a shift in the discrete state energy, which goes from $E_i$ to $E_i + \delta E$. This is the interpretation of the quantity $\delta E$ introduced in § 3. Let us analyze expression (27) for $\delta E$ more closely. Substituting definition (13) of $K(E)$ into (27), we get:

$$\delta E = \mathscr{P} \int_0^{\infty} \mathrm{d}E \int \mathrm{d}\beta\;\frac{\rho(\beta, E)\,\big|\langle \beta, E | W | \varphi_i \rangle\big|^2}{E_i - E} \tag{45}$$

or, if we use (4) and replace the integral over $\beta$ and $E$ weighted by the density of states $\rho(\beta, E)$ by a discrete sum over the states of the continuum:

$$\delta E = \mathscr{P} \sum_{\beta, E} \frac{\big|\langle \beta, E | W | \varphi_i \rangle\big|^2}{E_i - E} \tag{46}$$

The contribution to this sum of a particular state $|\beta, E\rangle$ of the continuum, for which $E \neq E_i$, is:

$$\frac{\big|\langle \beta, E | W | \varphi_i \rangle\big|^2}{E_i - E} \tag{47}$$




We recognize (47) as a familiar expression in stationary perturbation theory [cf. formula (B-14) of Chapter XI]: (47) represents the energy shift of the state $|\varphi_i\rangle$ due to the coupling with the state $|\beta, E\rangle$, to second order in $W$. $\delta E$ is simply the sum of the shifts due to the various states of the continuum. We might imagine that a problem would appear for the states for which $E = E_i$. Actually, the presence in (46) of the principal part implies that the contribution of the states situated immediately above $E_i$ compensates that of the states situated immediately below. Summing up:

(i) The coupling of $|\varphi_i\rangle$ with the states of the same energy is responsible for the finite lifetime of $|\varphi_i\rangle$ [the $\delta$ function of formula (25) enters into the expression for $\Gamma$].

(ii) The coupling of $|\varphi_i\rangle$ with the states of different energies is responsible for an energy shift $\delta E$ of the state $|\varphi_i\rangle$. This shift can be calculated by stationary perturbation theory (this was not obvious in advance).
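Point (ii) can also be checked numerically. The sketch below is not from the text: it takes $\hbar = 1$ and an assumed Gaussian profile $K(E)$ (extended over the whole real axis for simplicity) whose maximum does not coincide with $E_i$, and evaluates the principal-value integral (27) by pairing the states just above and just below $E_i$, whose contributions largely cancel, as stated above. All parameter values are arbitrary illustrations.

```python
import numpy as np

# Sketch (hbar = 1): principal-value evaluation of the shift
#   deltaE = P \int dE K(E) / (E_i - E)
# for an assumed profile K(E) = K0 exp(-(E - E0)^2 / 2 D^2).  Writing the
# integral as \int_0^inf dx [K(E_i - x) - K(E_i + x)] / x pairs the states
# just below and just above E_i, so the 1/(E_i - E) singularity drops out.

K0, D, E0, E_i = 0.01, 1.0, 0.0, 0.3       # arbitrary illustrative values
K = lambda E: K0 * np.exp(-0.5 * ((E - E0) / D) ** 2)

x = np.linspace(1e-6, 20.0 * D, 200_000)   # x = |E - E_i|
dx = x[1] - x[0]
deltaE = np.sum((K(E_i - x) - K(E_i + x)) / x) * dx

print("deltaE          =", deltaE)
print("hbar*Gamma / 2  =", np.pi * K(E_i))  # half the decay rate, for scale
```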

Comment: In the particular case of the spontaneous emission of a photon by an atom, $\delta E$ represents the shift of the atomic level under study due to the coupling with the continuum of final states (an atom in another discrete state, in the presence of a photon). The difference between the shifts of the $2s_{1/2}$ and $2p_{1/2}$ states of the hydrogen atom is the "Lamb shift" [cf. Complement K_V, § 3-d, and Chapter XII, § C-3-b, comment (iv)].

5-c. Energy distribution of the final states

Once the discrete state has decayed, that is, when $t \gg 1/\Gamma$, the final state of the system belongs to the continuum of states $|\beta, E\rangle$. It is interesting to study the energy distribution of the possible final states. For example, in the spontaneous emission of a photon by an atom, this energy distribution is that of the photon emitted when the atom falls back from the excited level to a lower level (the natural width of spectral lines). When $t \gg 1/\Gamma$, the exponential which appears in the numerator of (40) is practically zero. We then have:

$$\big|b(\beta, E, t)\big|^2 \simeq \frac{\big|\langle \beta, E | W | \varphi_i \rangle\big|^2}{(E - E_i - \delta E)^2 + \hbar^2\Gamma^2/4}\,, \qquad t \gg \frac{1}{\Gamma} \tag{48}$$

$|b(\beta, E, t)|^2$ actually represents a probability density. The probability of finding the system, after the decay, in a group of final states characterized by the intervals $\mathrm{d}\beta_f$ and $\mathrm{d}E_f$ about $\beta_f$ and $E_f$ can be calculated directly from (48):

$$\mathrm{d}\mathscr{P}(\beta_f, E_f) = \big|\langle \beta_f, E_f | W | \varphi_i \rangle\big|^2\, \rho(\beta_f, E_f)\;\frac{1}{(E_f - E_i - \delta E)^2 + \hbar^2\Gamma^2/4}\;\mathrm{d}\beta_f\,\mathrm{d}E_f \tag{49}$$

Let us examine the $E_f$-dependence of the probability density $\mathrm{d}\mathscr{P}(\beta_f, E_f)/\mathrm{d}\beta_f\,\mathrm{d}E_f$.




[Figure 3: plot of d𝒫(βf, Ef, t)/dβf dEf versus Ef: a Lorentzian curve of width ħΓ = ħ/τ centered at Ef = Ei + δE.]

Figure 3: Form of the energy distribution of the final states attained by the system after the decay of the discrete state. We obtain a Lorentzian distribution centered at $E_i + \delta E$ (the energy of the discrete state, corrected by the shift due to the coupling with the continuum). The shorter the lifetime $\tau$ of the discrete state, the wider the distribution (time-energy uncertainty relation).

Since $\big|\langle \beta_f, E_f | W | \varphi_i \rangle\big|^2$ remains practically constant when $E_f$ varies over an interval of the order of $\hbar\Gamma$, the variation of the probability density with respect to $E_f$ is essentially determined by the function:

$$\frac{1}{(E_f - E_i - \delta E)^2 + \hbar^2\Gamma^2/4} \tag{50}$$

and has, consequently, the form shown in Figure 3. The energy distribution of the final states has a maximum for $E_f = E_i + \delta E$, that is, when the final state energy is equal to that of the initial state, corrected by the shift $\delta E$. The form of the distribution is that of a Lorentz curve of width $\hbar\Gamma$, called the "natural width" of the state $|\varphi_i\rangle$. An energy dispersion of the final states therefore appears. The larger $\hbar\Gamma$ (that is, the shorter the lifetime $\tau = 1/\Gamma$ of the discrete state), the greater the dispersion. More precisely:

$$\Delta E = \hbar\Gamma = \frac{\hbar}{\tau} \tag{51}$$

Note again the analogy between (51) and the time-energy uncertainty relation. In the presence of the coupling $W$, the state $|\varphi_i\rangle$ can be observed only during a finite time, of the order of its lifetime $\tau$. When we want to determine its energy by measuring that of the final state of the system, the uncertainty $\Delta E$ of the result cannot be much less than $\hbar/\tau$.
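As an order of magnitude, the following sketch (assumed, purely illustrative numbers: a lifetime τ = 16 ns, of the order of that of an allowed atomic transition) evaluates relation (51):

```python
import numpy as np

# Sketch with assumed numbers: natural width associated with a lifetime tau
# through Delta E = hbar * Gamma = hbar / tau (equation (51)).
hbar = 1.054_571_8e-34        # J.s
e = 1.602_176_6e-19           # J per eV
tau = 16e-9                   # assumed lifetime, 16 ns (illustrative)

Gamma = 1.0 / tau                        # decay rate, s^-1
dE_J = hbar * Gamma                      # natural width in joules
print("Gamma    =", Gamma, "s^-1")
print("Delta E  =", dE_J / e, "eV")             # about 4e-8 eV
print("Delta nu =", Gamma / (2 * np.pi), "Hz")  # Lorentzian width ~ 10 MHz
```

The shorter the lifetime, the larger $\Delta E$, in agreement with the time-energy uncertainty relation recalled above.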

References:

The original article: Weisskopf and Wigner (2.33).




Complement EXIII Time-dependent random perturbation, relaxation

1 Evolution of the density operator . . . . . . . . . . . . . . . . 1391
  1-a Coupling Hamiltonian, correlation times . . . . . . . . . . 1392
  1-b Evolution of a single system . . . . . . . . . . . . . . . . 1393
  1-c Evolution of the ensemble of systems . . . . . . . . . . . . 1396
  1-d General equations for the relaxation . . . . . . . . . . . . 1397
2 Relaxation of an ensemble of spin 1/2's . . . . . . . . . . . . . 1398
  2-a Characterization of the operators, isotropy of the perturbation 1399
  2-b Longitudinal relaxation . . . . . . . . . . . . . . . . . . . 1400
  2-c Transverse relaxation . . . . . . . . . . . . . . . . . . . . 1403
3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1408

This complement examines the problem studied in § D of Chapter XIII, in a more precise and more general way. Rather than studying a single system, we shall study an ensemble of individual quantum systems subjected to an external random perturbation. This type of situation often occurs in magnetic resonance experiments where one measures the global magnetization of an ensemble of spins each carrying a small magnetic moment, as for example the nuclear spins of atoms in a gas. As the atoms move, they undergo collisions with impurities contained in the gas or on the walls of the container. As mentioned in Chapter XIII, if these impurities carry a magnetic moment, such collisions may change the directions of the nuclear spins of the colliding atoms. The corresponding perturbation lasts for a very short time (the collision time), and is of a random nature since the magnetic moment of the impurities can have any direction. The gas of atomic spins is thus subjected to a sum of random perturbations that rapidly change their values (and signs), hence having a very short correlation time.

Another classic example of random perturbation is an experiment where an ensemble of atoms is illuminated by a light source. Several reasons give the interaction between the atoms and the incident electromagnetic field a random character. First of all, most light sources produce fields that have rapid frequency and phase fluctuations. This means that the field itself must be characterized in a stochastic manner, with a short coherence time. Furthermore, even if the light source is an almost perfectly monochromatic laser, the atoms' motion is random. Because of the Doppler effect, the atoms will be coupled, in their own reference frame, to a field having a random frequency. Studying the propagation of a light beam in an atomic gas thus involves the study of a large number of individual atoms, each subjected to a different and random perturbation.

Many examples exist of similar situations involving rapidly fluctuating perturbations. This complement examines how the effect of such a random perturbation on an ensemble of individual systems must be treated using quantum mechanics. In a more general framework than the one used in § D of Chapter XIII, we will show that the coupling with the random perturbation produces a so-called "relaxation" phenomenon in the global system, very different from the evolution in the absence of the random perturbation.




We saw that the coupling with a constant or perfectly sinusoidal perturbation produces oscillations in the physical system, at Bohr frequencies that are modified by the interaction. This complement will describe a totally different behavior: an exponential evolution with real exponents, leading to irreversible evolutions. An example of such behavior is the relaxation of a system towards thermal equilibrium, very different from an oscillation.

We first study (§ 1) the evolution of the density operator characterizing the ensemble of systems. This leads to a general relaxation equation valid whenever the correlation times are very short. In the following section (§ 2) we apply this general equation to an important specific case, an ensemble of spin 1/2's coupled to statistically isotropic perturbations. This will enable us to explain the important concepts of "longitudinal relaxation" and "transverse relaxation", which play a central role in many magnetic resonance experiments.

1. Evolution of the density operator

Consider an ensemble of individual systems labeled by the index = 1, 2, ..., . Each system is described by a density operator (Complements EIII and EIV ) noted ( ). Statistically, the ensemble of the systems is described by the following density operator ( ): ()=

1

()

(1)

=1

Each individual system evolves under the effect of an operator Hamiltonians: ()=

0

+

( ), the sum of two

()

(2)

The first, 0 , is the Hamiltonian common to all the individual systems, corresponding for example to the coupling of their spins with an external static magnetic field. We assume that this Hamiltonian does not depend on time. The second, ( ), is the coupling Hamiltonian with the random perturbation. It depends not only on the time but also on the index of the individual system. We note the eigenvectors of 0 :

=

0

having the energies

(3) =}

; we set:

=

(4)

The matrix elements of the coupling Hamiltonian are written, in this basis: ( )

()=

()

(5) 1391

COMPLEMENT EXIII

1-a.



Coupling Hamiltonian, correlation times ( )

Consider the ensemble of the functions ( ) obtained when varies from 1 to : they are different realizations of the same matrix element. Choosing randomly1 introduces a random function of time, ( ). Changing the values of and , and calling the dimension of the state space, we can define 2 random functions. These random functions can be considered as the matrix elements of an operator () that is also a random function of time. In other words, ( ) is the operator obtained by choosing randomly the value of labeling the operators ( ). The statistical correlation properties of this random operator (or of its matrix elements) play an essential role in what follows. The ensemble formed by the systems can be considered as an ensemble of different possible realizations of the same individual system, called in statistical mechanics the “Gibbs ensemble”. It is equivalent to take an average over this ensemble at a given time or over a single system taken at a large number of different times (ergodic hypothesis). This average will be symbolized by placing a horizontal line over the letter . We first assume that the average value of the perturbation is zero: ()=0

that is:

()=

( ) = 0 for any ,

(6)

This hypothesis is not restrictive since, if the average value of ( ) is any constant operator , that operator can be added to the Hamiltonian 0 , hence keeping the average value of the perturbation equal to zero. As ( ) and ( ) are complex conjugates, their product is always a positive number whose average is, a priori, not zero. The amplitude of the perturbation is then defined by the average value of the products: ()

()=

0

(7)

As we assume the random function to be stationary, this mean square is independent of time. In a more general way, one can define a series of cross-correlation coefficients, also time-independent: ()

()=

0

(8)

To characterize a function random in time, we also need to consider averages of products taken at different times. As we did in § D of Chap. XIII, we introduce the correlation functions between time and + : ( + )

()=

( )

Since the random functions are stationary by hypothesis, the function on the difference between the times + and , and we can also write: ( )

(0) =

( )

(9) only depends

(10)

We know that, by definition, the function ( ) starts from a positive value for = 0. As the delay starts increasing, the correlation between ( ) and (0) 1 We assume the number to be very large. As an example, a millimeter cube of gas, at standard temperature and pressure, contains roughly 1016 atoms.

1392



TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

rapidly decreases. The function ( ) tends towards zero, with a characteristic time called the “correlation time” and noted : ( + )

()

0

if

(11)

For instance, if the perturbation is induced by the collision between an atom and an impurity, it will clearly lose any memory of its value between one collision and the next, or even right after a single collision. As collision times are often very short, there are many examples where is a very short time. This analysis can be generalized to crosscorrelation coefficients like the one described in (8). One often uses a model where ( ) is a decreasing exponential of the delay : ( + )

0

()=

(12)

To simplify the notation, we only took into account a single correlation time , independent of and ; the generalization to several correlation times is straightforward. Similar relations as the ones we just wrote can be obtained for cross-correlation coefficients like the one described in (8). This leads to a whole series of correlation times depending on numerous indices. In the collision example discussed above, all these times are of the same order of magnitude as the very short collision time; a natural approximation is to assume that they are comparable, and to call the longest amongst all these times. 1-b.

Evolution of a single system

We now study the evolution of a single quantum system (the value of is fixed). It will be treated in the interaction picture (exercise 15 of Complement LIII ), which we now briefly review. .

Interaction picture

The evolution of each density operator ( ) obeys the usual von Neumann equation, with a commutator on the right-hand side: }

d d

()=[

0

+

()

( )]

(13)

In this right-hand side, the term containing 0 may lead to a rapid evolution. The term containing ( ) is assumed to be smaller, hence leading to a slower evolution that can be treated using approximations. ( ) To start with, let us assume that the coupling Hamiltonian ( ) is zero. The evolution of ( ) is only due to 0 . It is useful to express it as a function of the evolution operator 0 ( ) between the times and (Complement FIII ) associated with the non-perturbed Hamiltonian 0 : 0

(

0(

)=

) }

(14)

This is a unitary operator: 0

(

)

0

(

)=1

for any

or

(15) 1393



COMPLEMENT EXIII

When

( ) is zero, we simply have:

()=

0

(

)

( )

0

(

)

(16)

as we now show. Taking the derivative of this relation with respect to , the derivation of 0 ( ) introduces a term ( ) }; the derivation of 0 ( ) introduces a term 0 + ( ) 0 }. Both terms together reconstruct the 0 term of the commutator on the right-hand side of (13), which shows that equation (13) is verified by solution (16). As for the initial condition for = , it is also verified since the unitary operator 0 ( ) then becomes the unit operator. We can also use 0 ( ) to perform an inverse unitary transformation on () and define the modified density operator ( ) as: ()=

0

(

)

()

0

(

)

(17)

Inserting relation (16) in this definition, we obtain relation (15) twice, both on the left and on the right of ( ). All the evolution operators thus disappear, and we get: ()=

( )

(18)

This shows that ( ) does not depend on time as long as the coupling ( ) remains equal to zero. ( ) Even when ( ) is no longer zero, it is still useful to apply the unitary transformation (17) to the density operator. This operation is generally referred to as the “passage to the interaction picture”. In this picture, the evolution of the density operator ( ) is only due to the presence of the interaction ( ). According to our assumptions, this evolution is much slower than the evolution ( ), which is also governed by 0 . This property considerably facilitates the use of approximations and will be used in this complement. Let us take the time derivative of (17), starting with the derivative of the two unitary operators on the left and on the right of ( ), followed by the derivative of the operator itself. This yields: }

d d

()=

( )+

0

=

[

()

( )] +

0

Now for any operator , the fact that following commutator according to: 0

(

)[

( )]

0

(

)=

0

(

0

+

(

d d ()

) }

0

(

)

( )]

0

(

)

(

0

is a unitary operator allows transforming the

0

(

0

)

+

()

0

)

)[

0

0

(

)

()

0

(

)

(19)

(20)

(to check this, one can simply expand both commutators and use the relation 0 0 = 1). The right-hand side now contains ( ). If = 0 + ( ), since 0 commutes with 0 , we get: 0

1394

(

)[

0

+

=[

0

() ( )] +

( )] 0

0

(

(

) )

()

0

(

)

()

(21)



TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

The right-hand side contains the unitary transform ( ) of same unitary transformation that led from ( ) to ( ): ()=

0

(

)

()

0

(

( ), obtained by the

)

(22)

Inserting this expression in the right-hand side of (19), the commutators containing on both sides cancel out. We finally get the simple relation: }

d d

()=

()

()

0

(23)

This evolution equation only contains operators in the interaction picture. The hamiltonian 0 is no longer explicitly present (but is implicitly contained in the unitary transformation that leads to the interaction picture). It is easy to verify that ( ) does not evolve in the absence of perturbation. .

Approximate calculation of the evolution Integrating over time equation (23) yields: ()=

( )+

1 }

d

( )

( )

(24)

which, inserted in the same equation, leads to: }

d d

()=

()

( ) +

1 }

d

()

( )

( )

(25)

This evolution equation for ( ), which now contains a double commutator, is exact. As ( ) appears in the integral, it is an integro-differential equation. This equation can be transformed into a simple differential equation, using the following approximation. If the effect of the perturbation remains limited during the time interval from to , a good approximation of the evolution of ( ) in that interval is to replace ( ) by its value for any time chosen in that interval. Choosing for example the time , yields the differential equation: }

d d

()=

The delay

()

( ) +

1 }

d

()

( )

( )

(26)

can be introduced explicitly by performing the change of integration variable:

=

(27)

Dividing both sides by }, we obtain: d d

()=

1 }

()

( )

1 }2

d

()

(

)

( )

(28)

0

1395

COMPLEMENT EXIII

1-c.



Evolution of the ensemble of systems

To obtain the evolution equation for the density operator ( ) describing the ensemble of the systems, we first transform that operator as in (17) to use the interaction picture: ()=

0

(

)

()

0

(

)

(29)

The initial density operator can easily be retrieved using the inverse unitary transformation. Definition (1) of ( ) shows that its evolution is obtained by summing relation (28) over the index , and dividing the result by , the total number of systems. In other words, it means that we have to take the ensemble average of both sides of (28). This operation is difficult to carry out without making some hypotheses about the characteristics of the random functions that come into play. We shall assume that the evolution of ( ) occurs with time constants that are much longer than the correlation time . We shall explain below (§ 1-d- ) what this implies in terms of the parameters defining the interactions, hence verifying that the computation is consistent. With each time we can associate a previous time such that is very large compared to the correlation time , while remaining small compared to the characteristic evolution time of the density operator in the interaction picture. We shall then use relations (9) and (11) that characterize the random perturbation. The first commutator on the right-hand side of the evolution equation (28) involves two operators at different times and which are therefore not correlated: ( ) depends on values of the perturbation at times earlier than , whereas ( ) is the value of the perturbation at a time later than by a time larger than . Taking the average over all the values of then shows that this first term cancels out since we assumed in (6) that the average values of the matrix elements of the perturbation are zero. As for the following integral, it contains the average value over of the product () ( ) ( ), in that order or any other order. Contributions to this integral only come from values of the delay of the order of the correlation time ; if , ( ) depends neither on ( ) nor on ( ), which allows factoring an average value that is equal to zero. This has two consequences. First ( ) is not correlated with the two terms in , so that we can compute its average separately and replace ( ) by ( ). The second consequence is that we can replace the integral upper bound by infinity without changing significantly its value. This leads to: d d

()=

1 }2

d

() [

(

)

( )]

(30)

0

where, as before, the bar on top of the operators stands for the ensemble average (this average only concerns the perturbation , not the density operator). An additional simplification comes from the fact that we assumed to be short compared to the evolution time of ( ). It is therefore a good approximation to replace ( ) by ( ) in the right-hand side of this equation. This finally leads to the relaxation equation of the density operator in the interaction representation: d d

()=

1 }2

d

() [

(

)

( )]

0

Using (29), we obtain the corresponding equation in the usual representation. 1396

(31)

• 1-d.

TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

General equations for the relaxation

We now use the previous results to compute the evolution of the density matrix. .

Evolution of the matrix elements of the density operator

Relation (31) can be written in the basis of the eigenvectors of the Hamiltonian , in order to directly obtain the coupled evolution equations of the different matrix 0 elements of the density operator. Since: () where

(

=e

)

()

(32)

is defined in (4), we get:

d d

() (

e

1 }2

= )

d 0

e

()

(

e(

)(

)

e

()

(

)(

)

e

(

e

+ e

(

)

e

(

)

()

( )

)

)

()

()

()

()

()

(33)

The random functions associated with are stationary, as seen from relation (9). This allows adding an arbitrary time to the two variables they contain, in the right-hand side of the previous equation. We can thus replace by and by 0. We now leave the interaction picture and come back to the usual picture (laboratory picture) using the unitary transformation (17), written in the basis: ()

=e

(

)

()

(34)

This relation leads to: d () = ( ) () +e d The general relaxation equations are then written: d d

() 1 }2

= ( d

e

)

(

)

d d

()

(35)

() ( )

(0)

()

0

e

( )

(0)

()

e

(0)

( )

()

+e

()

(0)

( )

(36)

Noting the dimension of the state space, the previous relations (33) or (36) yield differential equations that govern the time evolution of the matrix elements of () or of ( ). These differential equations are coupled with each other; their coefficients are time integrals of correlation functions of the perturbation, which are supposedly known for a given physical problem. 2

1397

COMPLEMENT EXIII

.



Short memory approximation

In view of the approximations we used, let us find under which conditions our calculations are consistent. The general validity condition is that there exists, for each time , a previous time such that the interval obeys two conditions: it must be simultaneously very long compared to the correlation time and very short compared to the evolution time in the interaction picture. We can evaluate this evolution time by using an approximate expression of relation (33). We introduced in (7) the mean square of the matrix element ( ). Let us call 2 the order of magnitude of such a mean square for the various values of and . The coefficients that multiply ( ) on the right-hand side of (33) can be replaced by this factor 2 , integrated over d . Taking (12) into account, this integral introduces a factor . The coefficients can thus be approximated by: 2

(37)

}2

With this approximation, the evolution equation (33) yields an evolution time of the order of }2 / 2 . Our computations are consistent if this time is much larger than , that is if: 2

( )

2

}2

(38)

In other words, our computations are valid if the correlation (memory) time of the perturbation is short compared to the characteristic time of its intensity, } . This means that the perturbation will frequently change its value (and sign) before it can significantly change the system. This validity condition is often called the “motional narrowing condition”, for a reason explained in § 2-c- . Relations (33) or (36) are sets of first order differential equations. They describe the exponential relaxation of all the populations towards a situation where they all become equal. Carrying out calculations with these equations is not particularly difficult. However, it leads to the writing of complicated equations, in particular due to the large number of indices involved. In the general case, the populations () are not only coupled to each other, but also to non-diagonal elements () with = . All the matrix elements of the density operator can a priori be coupled to each other. One must then use additional approximations to select the terms essential for determining the relaxation properties. In this complement, we shall only consider a simple particular case, that still allows us to develop a large number of physical concepts: the study of an ensemble of spin 1/2’s undergoing an statistically isotropic perturbation. 2.

Relaxation of an ensemble of spin 1/2’s

Consider an ensemble of spin 1/2’s, contained for example in a sample measured in a magnetic resonance experiment, such as the one mentioned in the introduction. The evolution equations then take on a simple form, easy to interpret. There are only two levels , which will be noted + and . Their energy difference is: }( 1398

+

)=}

0

(39)



TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

It is useful to characterize the density operator of the spins by the average value of their angular momentum, which amounts to expanding this density operator on Pauli matrices (§ 5 of Complement EIV ). 2-a.

Characterization of the operators, isotropy of the perturbation

All the operators appearing in the previous equations now act in a 2-dimensional space. They are represented by matrices that can be expanded on the three Pauli matrices , and , as well as on the identity matrix, as shown in relation (22) of Complement AIV . .

Transformation of the operators We set: ()=

1 [1 + M ( ) 2

]

(40)

where stands for the vector operator whose components are the three Pauli matrices. The components of the vector M ( ) are three real numbers that play the role of parameters defining ( ). As we now show, the vector M ( ) is simply the mean value of over the whole sample, whose total magnetization is thus proportional to M ( ). Relation (11) of Complement AIV indicates that: =

+

(41)

where is equal to zero if two indices are equal, equal to +1 if the series of indices is an even permutation of the three axes , and , and equal to 1 if the permutation is odd. It follows that the trace of a product of Pauli matrices is zero, unless two of the matrices are identical (in which case the trace equals 2). Consequently, the average value of the operator is given by: ( ) = Tr =

() = ()

1 2 Tr [ ] 2

() (42)

The operator 0 is written in a form2 similar to that of relation (12) in Complement FIV , which studies a magnetic resonance experiment: 0

=

}

0

2

(43)

This operator corresponds to the effect of a magnetic field parallel to the axis, which induces a rotation of the spins around that axis at the angular frequency 0 . The unitary operator 0 ( ) defined in (14) is now a rotation operator of the spins (Complement AIV ) through an angle 0 ( ); as for the adjoint operator 0 ( ), it is also a rotation operator, but through an opposite angle. 2 We did not give the operators ( ) a component on the identity operator, since it would 0 or not alter the commutators where these operators come into play.

1399

COMPLEMENT EXIII



The interaction operator can be written in a similar way: ()=

1 h( ) 2

(44)

where h ( ) is a random vector function that characterizes the perturbation acting on the spins. In the same way as ( ) was defined (§ 1-a) as a random choice amongst the possible outcomes of the individual system labeled by the index , h ( ) characterizes the statistical properties of the three components of the local field acting on each spin at time . Contrary to the field associated with 0 , this local field is random and can point in any direction, not necessarily parallel to the axis. The three components = 1, 2, 3 of this vector are noted ( ). For ensemble averages, we assume that ( ) = 0 and, as in (9) and (12), we shall write the correlation functions in the following way: ( + ) where the coefficients ( ) for .

()=

( )

(45)

( ) are rapidly decreasing functions of over times of the order of . The ( ) are auto-correlation functions of the various components of h ( ), the = are cross-correlation functions pertaining to two different components.

Isotropy

We introduce an additional hypothesis, and assume that the perturbation affecting the spins is statistically isotropic: the correlation functions of the components of h ( ) have no preferred direction. This means that the correlation functions ( ) are identical for the three axes , and (corresponding to = 1, 2 and 3 respectively), whatever the value of . In other words, the ensemble of the ( ) form a 3 3 matrix, which is rotation invariant, hence necessarily proportional to the unit matrix. Consequently, not only the auto-correlation coefficients are equal to each other, but also the cross-correlation coefficients for = must be equal to zero. Added to the stationarity of the perturbation, this hypothesis leads to: ( + )

()=

( )

(46)

One frequently models the decrease of constant . This leads to: ( + ) 2-b.

()=

[

(0)]

2

( ) with

by a simple exponential, with a time

(47)

Longitudinal relaxation

When = = + , the first term on the right-hand side of equation (36) in ( ) cancels out (no evolution of the populations due to 0 ). In the following terms, we must replace ( ) and (0) by their expression given by (44). This leads to products of matrix elements of two Pauli matrices and , multiplied by the statistical average (46) of the perturbation. Because of the factor we must choose the same Pauli matrix in both operators ( ) and (0). 1400

• .

TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

Calculation of the relaxation time

Let us start by choosing twice the matrix . Since this matrix does not couple the states + and , we must have = = + in relation (36). As a result, the sum of the last 4 terms on the right-hand side is zero. We then choose twice the matrix , which only has non-diagonal elements, all equal to 1. In the second line of the right-hand side of (36), when = = +, we must necessarily have = and = + , whereas in the fifth line, the opposite is true ( = + and = ); for the third and fourth lines we have = = . This yields the following term: 1 4}2

d

()

( + )

+

0

() +

0

()

0

()

0

+

+

0

() +

(48)

or: 1 2}2

d

()

( + ) cos

0

[

()

+

() +]

(49)

0

We finally choose twice the matrix , which has the same structure, with two matrix elements equal to + and , so that their product is also equal to unity. This term is the same as the term, except that we must replace by . We finally obtain: d + d with: 1 1

=

() + =

1 }2

1 2

d

()

+

() +

(50)

1

()

( + )+

()

( + )

cos

0

(51)

0

The time 1 is called the “longitudinal relaxation time”. Its properties are discussed below. The calculation of the evolution of the other diagonal element () is practically the same and yields: d d

()

=

1 2

+

() +

()

Now using (42) we can write the evolution of the d d d ()= () = + () + d d d Taking the difference between (50) and (52) leads to: d d

()=

1

(52)

1

()

component of the magnetization: ()

(53)

(54)

1

This equation shows that the longitudinal (parallel to the static magnetic field) component of M ( ) decreases exponentially with a time constant 1 , and tends toward zero when . The relaxation rate 1 1 depends on the sum of correlation functions of both transverse (perpendicular to ) components of the perturbation. This was to be expected since it is the operators and that can induce transitions of the spins between their levels + and . 1401

COMPLEMENT EXIII

.



Role of the spectral density

The dependence in 0 of the relaxation probability 1 1 can be interpreted in view of the results of Chapter XIII (§ C-2), where we studied a sinusoidal perturbation coupling two levels + and . We showed that the closer the perturbation frequency is to the Bohr frequency 0 associated with the energy difference between the two levels, the more effective the perturbation. In our present case, the perturbation is not a sinusoid but a random function. We thus expect the probability amplitude of the transition to involve the 0 Fourier component of the perturbation that acts between the instants 0 and . To further examine this idea, we introduce the Fourier transform ( ) of the correlation function, called the “spectral density”: ( )= ()

(

)=

+

1 2 1 2

d

()

d

( )

(

)

+

(55)

As the random function is stationary, we have: ()

(

)=

( + )

()=

()

( + )

(56)

This shows that the correlation function is an even function. As the Fourier transform of an even and real function is also even and real (Appendix I), we can write: (

)=

( )

(57)

Relation (51) can be rewritten as: 1 1

=

1 2 2 }2

+

d

d

[

( )+

( )]

0

+

(58)

0

0

Taking (57) into account, changing simultaneously the signs of the two integration variables and does not change the function to be integrated. It merely transforms the first integral over d to an integral between 0 and , while the limits of the second become + and . Since two sign changes cancel each other, we can reverse the limits in each of the two integrals. This leads to: 1 1

1 2 2 }2 1 = 4 2 }2

0

=

+

d

d

[

( )+

( )]

0

+

0

d

[

( )+

( )]

0

+

0

+

d

(59)

Note that the final form (second line) of this relation was obtained by adding the first line of this relation to relation (58), and dividing by two. The integral over d leads to: d 1402

( +

0)

+

(

0)

=2 [ ( +

0)

+ (

0 )]

(60)



TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

Taking again (57) into account, the two terms in ( + contribution, and we obtain: 1

(

=

0)

2

1

+ }2

(

0)

and (

0)

yield the same

0)

(61)

The transition probability is proportional to the sum of the spectral densities of the two perturbations responsible for the transitions. As expected, it is the resonant components of the perturbation that induce the transitions between the states + and . .

Exponential correlation function

The correlation functions of these components are often modeled by a simple exponential, as in (47). In that case, the integral over d of (51) is easy to compute: d

()

( + )

0

+

= [

0

(0)]

2

d

0

0

=

1

2

[

(0)]

0

=[

=

1

1 [ }2

2

(0)] + [

2

(0)]

Adding the term corresponding to the effect of the 1

+

0

0

1

1

+ 0

2 1+(

2

0

)

1+(

2

0

(62)

component, we get:

2

(0)]

1

)

(63)

The longitudinal relaxation rate varies as a Lorentzian function of 0 , plotted in Fig. 1. The relaxation rate is maximum when 0 = 0 (zero static field), and is equal to: 1 1(

0

= 0)

=

1 }2

[

2

(0)] + [

(0)]

2

(64)

With our present notation, the motional narrowing condition (38) is written: 2

[

(0)] ( )

2

}2

(65)

This leads to: 1

1

(66)

1

We thus verify that 1 , the characteristic evolution time of ( ), is very long compared to the correlation time , hence proving the consistency of the approximations we have used. 2-c.

Transverse relaxation

We now study the evolution of the non-diagonal elements of the density matrix between the states + and . Let us first show that the matrix element + () , or its complex conjugate ( ) + , characterizes the transverse components () 1403

COMPLEMENT EXIII



Figure 1: Plot of the longitudinal relaxation rate 1 1 as a function of the energy difference } 0 between the energy levels (left-hand side of the figure), or as a function of the correlation time (right-hand side of the figure). This rate is proportional to the power spectrum of the perturbation at the frequency 0 , and hence follows a Lorentzian function when plotted as a function of 0 – cf. relation (63). In the regime where 0 1 , the power spectrum of the perturbation decreases as 1 02 : the relaxation rate can be greatly reduced by increasing 0 . If we now keep 0 fixed and increase the correlation time , we first get a linear variation of the relaxation probability, proportional to the time during which the perturbation acts in a coherent way. The probability then reaches a maximum for 0 = 1 , followed by a decrease in 1 as the Fourier components of the perturbation at the frequency 0 become weaker and weaker. and ( ) of M ( ). Using the expressions of the Pauli matrices – cf. for example relations (2) of Complement AIV – we can compute the difference : 00 20

=

(67)

which leads to: =

(

)

() =2 +

()

(68)

We saw in § 2-a that the three components of M ( ) were equal to the average values of the corresponding Pauli matrices. It thus follows: ()

() = + () (69) 2 The real part of the non-diagonal matrix element + () directly yields ( ) 2, whereas its imaginary part yields the opposite of ( ) 2. To avoid taking into account the evolution of this matrix element due to 0 , which introduces the first term in ( ) on the right-hand side of (36), we shall use the interaction picture. The evolution of + () is then given by (33). In this relation, we replace the interaction operators ( ) and (0) by their expression (44). As before, the statistical isotropy of the perturbation leads us to only keep the terms where the same component of h ( ) appears in the two operators. We shall first examine the case where this component is either ( ), or ( ); the case where this component is () will be examined later. 1404

• .

TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

Effect of the transverse components of the perturbation

We successively take into account the components on the and axes. ( ) The component of the perturbation introduces in (33) the matrix elements of , which are all non-diagonal and equal to one. Each matrix element of changes a ket + into a bra , or vice versa. The Pauli matrix simply introduces a coefficient equal to 1. In the first term on the right-hand side of (33), when = +, we have = and hence = +; this term couples + () to itself. As for the exponential, it 0 introduces . The same result is obtained for the fourth term: since = , we have 0 = + and = , and the exponential again introduces the term . The sum of these two terms yields: 1 2}2

d

()

0

(

)

+

()

(70)

0

The second and third terms on the right-hand side of (33) are different, since if = + we have = , whereas if = we have = +, which introduces on the righthand side the matrix element ( ) + . This means there is a term that couples two complex conjugate matrix elements: 2

0(

)

d

2}2

()

0

(

)

() +

(71)

0

( ) The component of the perturbation introduces in (33) the matrix elements of , also non-diagonal but now equal to . For the first and fourth term on the righthand side of (33), this introduces in the previous calculation a factor ( ) ( ) = 1, which 2 does not change anything. As for the second and third term, the factor equals ( ) = 1, which changes the sign of the result. Since the isotropy requires the correlation functions of ( ) and of ( ) to be equal, we obtain the opposite of (71), and both terms cancel out. ( ) We finally get: 1 2}2

d

Expanding 1 2

()

0

(

)+

()

(

)

+

()

(72)

0 0

into cos (

0

+

()

+ ∆

) + sin (

0

), and taking (51) into account, we get: (73)

1

where the coefficient ∆ is defined by: ∆=

1 2}2

d

sin (

0

)

()

(

)+

()

(

)

(74)

0

The physical significance of this coefficient is discussed below. 1405

COMPLEMENT EXIII

.



Effect of the longitudinal component of the perturbation

The component of the perturbation does not change the spin state. As a result, the first line in (33) contains = = = + and = ; the exponentials and the matrix elements of are all equal to unity. In the fourth line, = + and = = = , the exponentials are again equal to unity, as is the product of two matrix elements of (each equal to 1), and the final result is the same. As for the second and third lines, we have = = + and = = , so that the exponentials are equal to 1, whereas the product of the matrix elements of is now equal to 1; these two terms double the two preceding terms. Taking into account the factor 1 2 of (44), we finally obtain the contribution: 1 }2

d

()

(

)

+

()

(75)

0

that leads to the coefficient of transverse relaxation: 1

=

2

.

1 }2

d

()

(

)

(76)

0

Discussion, role of the spectral density

Grouping together both contributions (73) and (75), we can write the complete evolution of the non-diagonal element as: d + d

()

=

1 2

+ 1

1

+ ∆

+

()

(77)

2

Leaving the interaction picture to go back to the density operator ( ) in the usual laboratory picture, we must add to this evolution the first term appearing on the righthand side of (36); this yields: d + d

()

=

1 2

+ 1

1

+ (

0

+ ∆)

+

()

(78)

2

( ) Damping In either picture, the non-diagonal element of the density matrix is damped with a time constant 2 given by: 1 2

=

1 2

+ 1

1

(79)

2

which is the sum of two contributions. – The first is directly related to the longitudinal relaxation process. This process changes the distribution of the populations between the two levels + and , hence destroying the coherence between these two levels. This rate of destruction of the coherence is the same as the one affecting the populations in (50), but half the one affecting () in (54). Note that it is only the transverse components of the perturbation that play a role in this contribution, since they are the ones that can induce transitions between the two levels. 1406



TIME-DEPENDENT RANDOM PERTURBATION, RELAXATION

– The second contribution comes only from the longitudinal component of the perturbation. This fluctuating component directly modifies the energy difference } 0 between the two levels + and , and hence the precession velocity of the spins’ transverse component. When the different spins have different precession velocities around the axis, their transverse components spread out and their vector sum diminishes. This leads to a decrease of the transverse component of the global spin of the system. ( ) Frequency shift Relation (78) shows that the term in ∆ is equivalent to a change in the precession frequency 0 of the spins. In addition to the damping associated with the relaxation, the perturbation introduces a shift in the evolution frequency of the non-diagonal elements. The same calculation as the one leading to (58) yields for the frequency shift ∆: ∆=

1 2 }2

4

+

d

d

[

( )+

( )]

0

0

(80)

0

This expression contains an integral over d : 1

( +

d

0)

(

0)

0

1 +

=

1

[ ( +

0

0)

(

0 )]

(81)

0

As the functions ( ) and ( ) are even, the terms containing the delta functions cancel out, whereas the terms containing the principal parts double each other. This leads to: 1 2 }2

∆=

+

1

d

[

( )+

( )]

(82)

0

Note that, contrary to the longitudinal relaxation characterized by the time 1 , it is not the power spectrum of the perturbation at the resonant frequency = 0 that plays a role. Only non-resonant frequencies contribute to the shift. .

Exponential correlation functions

When the correlation function is modeled by an exponential as in relation (47), equalities (74) and (76) become: 1 2

=

1 [ }2

2

(0)]

(83)

and: ∆=

1 [ 2}2

2

(0)] + [

(0)]

2

0

1+(

2

0

)

(84)

It is interesting to discuss the effect of the fluctuations of the perturbation on the relaxation time 2 . Imagine first that the ensemble of spins is placed in a magnetic field along , which does not change in time but has a different value for each spin. We 1407

COMPLEMENT EXIII



note the root mean square of the corresponding fluctuation of the Hamiltonian. If the spins are initially oriented in the same direction, their transverse orientations will start spreading around since each spin has a different precession velocity. The average value of the global transverse orientation of the sample will go to zero over a time of the order } . This means that the transverse orientation diminishes at a rate of the order of }, which depends linearly on the perturbation amplitude . Now in the presence of time dependent fluctuations of the perturbation, result (83) predicts a totally different behavior. The relaxation rate 1 2 is of the order of 2 }2 and hence varies as the square of the perturbation amplitude. This evolution rate in the presence of fluctuations is thus } times the evolution rate in the static case. As the factor } 1 – see the motional narrowing condition (38) – the quadratic relaxation is much slower than the relaxation in the absence of fluctuation. This effect is even stronger when is shorter, which shows that it is the rapidly changing fluctuations that are responsible for the decrease of the relaxation rate. Now the width of the magnetic resonance lines is an increasing function of the transverse relaxation rate3 . Consequently, the shorter the correlation time, the narrower the lines. It often happens that the perturbation fluctuations come from the spins’ motion in the sample, in which case the more rapid the motion, the narrower the magnetic resonance lines. This explains the origin of the expression “motional narrowing”. On the other hand, comparing relations (63) and (84) shows that the relaxation probability and the frequency shift have a very different dependence on 0 . The relaxation probability follows a Lorentzian function, with a maximum for 0 = 0, whereas the frequency shift is maximum for 0 = 1. This difference comes from the fact that, as we discussed at the end of § 2-c- , it is not the resonant but rather the non-resonant frequencies of the spectral density that determine the frequency shift. 3.

Conclusion

As mentioned in the introduction, there are many situations where an ensemble of individual quantum systems is subjected to a random perturbation with a correlation time very short compared to the other characteristic times of the problem. In a more general way than in §§ D and E of Chapter XIII, we examined in this complement how, in the limit where is too short for the perturbation to have an effect during that time, the perturbation no longer induces a Rabi type oscillation. This led us to introduce a transition probability between the levels, leading to an exponential (and not oscillating) evolution of the populations. Note that in Complement DXIII , we also obtained, with the Fermi golden rule, a transition probability. In that case, it was the summation over the energies of all the final states that transformed the oscillation into a real and damped exponential. In the present complement, it is the random character of the perturbation that has a similar effect, even though the final state is unique and has a perfectly well defined energy. Another result we obtained concerns the existence of a frequency shift induced by the random perturbation. In the case of an optical excitation such as the one considered in § E-3-b of Chapter XIII, they are called “light shifts”, and have numerous applications in atomic physics (Complement CXX ). 3 Figure 7 of Complement F IV shows the variation of these lines, assuming there exists only one longitudinal and transverse relaxation rate 1 .

1408



EXERCISES

Complement FXIII Exercises 1. Consider a one-dimensional harmonic oscillator of mass , angular frequency and charge . Let and = ( + 1 2)~ 0 be the eigenstates and eigenvalues of its Hamiltonian 0 . For 0, the oscillator is in the ground state 0 . At = 0, it is subjected to an electric field “pulse” of duration . The corresponding perturbation can be written: 0

()=

0

for 0 for

0 and

is the field amplitude and is the position observable. Let P0 be the probability of finding the oscillator in the state after the pulse. . Calculate P01 by using first-order time-dependent perturbation theory. How does P01 vary with , for fixed 0 ? . Show that, to obtain P02 , the time-dependent perturbation theory calculation must be pursued at least to second order. Calculate P02 to this perturbation order. . Give the exact expressions for P01 and P02 in which the translation operator used in Complement FV appears explicitly. By making a limited power series expansion in of these expressions, find the results of the preceding questions. 2. Consider two spin 1/2’s, S1 and S2 , coupled by an interaction of the form ( )S1 S2 ; ( ) is a function of time which approaches zero when approaches infinity, and takes on non-negligible values (on the order of 0 ) only inside an interval, whose width is of the order of , about = 0. . At = , the system is in the state + (an eigenstate of 1 and 2 with the eigenvalues +~ 2 and ~ 2). Calculate, without approximations, the state of the system at = + . Show that the probability P(+ +) of finding, at = + , + the system in the state + depends only on the integral ( )d . . Calculate P(+ +) by using first-order time-dependent perturbation theory. Discuss the validity conditions for such an approximation by comparing the results obtained with those of the preceding question. . Now assume that the two spins are also interacting with a static magnetic field B0 parallel to . The corresponding Zeeman Hamiltonian can be written: 0

=

0( 1 1

+

2 2

)

where

1 and 2 are the gyromagnetic ratios of the two spins, assumed to be different. 2 2 Assume that ( ) = 0 e . Calculate P(+ +) by first-order timedependent perturbation theory. Considering 0 and as fixed, discuss the variation of P(+ +) with respect to 0 .

3. Two-photon transitions between non-equidistant levels

Consider an atomic level of angular momentum J = 1, subject to static electric and magnetic fields, both parallel to Oz. It can be shown that three non-equidistant energy levels are then obtained. The eigenstates |m⟩ of J_z (m = −1, 0, +1), of energies E_m, correspond to them. We set E_0 − E_{−1} = ℏω_0 and E_{+1} − E_0 = ℏω_0′ (with ω_0 ≠ ω_0′). The atom is also subjected to a radiofrequency field rotating at the angular frequency ω in the xOy plane. The corresponding perturbation W(t) can be written:

W(t) = (ω_1/2) (J_+ e^{−iωt} + J_− e^{iωt})

where ω_1 is a constant proportional to the amplitude of the rotating field.

a. We set (notation identical to that of Chapter XIII):

|ψ(t)⟩ = Σ_{m=−1}^{+1} b_m(t) e^{−iE_m t/ℏ} |m⟩

Write the system of differential equations satisfied by the b_m(t).

b. Assume that, at time t = 0, the system is in the state |−1⟩. Show that if we want to calculate b_{+1}(t) by time-dependent perturbation theory, the calculation must be pursued to second order. Calculate b_{+1}(t) to this perturbation order.

c. For fixed t, how does the probability 𝒫_{−1→+1}(t) = |b_{+1}(t)|² of finding the system in the state |+1⟩ at time t vary with respect to ω? Show that a resonance appears, not only for ω = ω_0 and ω = ω_0′, but also for ω = (ω_0 + ω_0′)/2. Give a particle interpretation of this resonance.

4. Returning to exercise 5 of Complement HXI and using its notation, assume that the field B_0 is oscillating at angular frequency ω, and can be written B_0(t) = B_0 cos ωt. Assume that ω is not equal to any Bohr angular frequency of the system (non-resonant excitation). Introduce the susceptibility tensor χ, of components χ_{ij}(ω), defined by:

⟨M_i⟩(t) = Re [ Σ_j χ_{ij}(ω) (B_0)_j e^{−iωt} ]

with i, j = x, y, z. Using a method analogous to the one in § 2 of Complement AXIII, calculate χ_{ij}(ω). Setting ω = 0, find the results of exercise 5 of Complement HXI.

5. The Autler-Townes effect

Consider a three-level system: |φ_1⟩, |φ_2⟩ and |φ_3⟩, of energies E_1, E_2 and E_3. Assume E_3 > E_2 > E_1 and E_3 − E_2 ≪ E_2 − E_1. This system interacts with a magnetic field oscillating at the angular frequency ω. The states |φ_2⟩ and |φ_3⟩ are assumed to have the same parity, which is the opposite of that of |φ_1⟩, so that the interaction Hamiltonian W(t) with the oscillating magnetic field can connect |φ_2⟩ and |φ_3⟩ to each other, but not to |φ_1⟩. Assume that, in the basis of the three states |φ_1⟩, |φ_2⟩, |φ_3⟩, arranged in that order, W(t) is represented by the matrix:

[ 0        0              0           ]
[ 0        0              ℏω_1 sin ωt ]
[ 0        ℏω_1 sin ωt    0           ]

where ω_1 is a constant proportional to the amplitude of the oscillating field.

a. Set (notation identical to that of Chapter XIII):

|ψ(t)⟩ = Σ_{k=1}^{3} b_k(t) e^{−iE_k t/ℏ} |φ_k⟩

Write the system of differential equations satisfied by the b_k(t).

b. Assume that ω is very close to ω_{32} = (E_3 − E_2)/ℏ. Making approximations analogous to those used in Complement CXIII, integrate the preceding system, with the initial conditions:

b_1(0) = b_2(0) = 1/√2 ;  b_3(0) = 0

(neglect, on the right-hand side of the differential equations, the terms whose coefficients, e^{±i(ω + ω_{32})t}, vary very rapidly, and keep only those whose coefficients are constant or vary very slowly, as e^{±i(ω − ω_{32})t}).

c. The component D_z along Oz of the electric dipole moment of the system is represented, in the basis of the three states |φ_1⟩, |φ_2⟩, |φ_3⟩, arranged in that order, by the matrix:

[ 0  d  0 ]
[ d  0  0 ]
[ 0  0  0 ]

where d is a real constant (D_z is an odd operator and can connect only states of different parities). Calculate ⟨D_z⟩(t) = ⟨ψ(t)|D_z|ψ(t)⟩, using the vector |ψ(t)⟩ calculated in b. Show that the time evolution of ⟨D_z⟩(t) is given by a superposition of sinusoidal terms. Determine the frequencies and relative intensities of these terms. These are the frequencies that can be absorbed by the atom when it is placed in an oscillating electric field parallel to Oz. Describe the modifications of this absorption spectrum when, for ω fixed and equal to ω_{32}, ω_1 is increased from zero. Show that the presence of the magnetic field oscillating at the frequency ω_{32}/2π splits the electric dipole absorption line at the frequency ω_{21}/2π, and that the separation of the two components of the doublet is proportional to the oscillating magnetic field amplitude (the Autler-Townes doublet). What happens when, for ω_1 fixed, ω − ω_{32} is varied?

6. Elastic scattering by a particle in a bound state. Form factor

Consider a particle (a) in a bound state |χ_0⟩ described by the wave function χ_0(r_a) localized about a point O. Towards this particle (a) is directed a beam of particles (b), of mass m, momentum ℏk, energy E = ℏ²k²/2m and wave function (2π)^{−3/2} e^{ik·r}. Each particle (b) of the beam interacts with particle (a). The corresponding potential energy, W, depends only on the relative position r_a − r_b of the two particles.

a. Calculate the matrix element:

⟨a: χ_0 ; b: k′ | W(R_a − R_b) | a: χ_0 ; b: k⟩

of W(R_a − R_b) between two states in which particle (a) is in the same state |χ_0⟩ and particle (b) goes from the state |k⟩ to the state |k′⟩. The expression for this matrix element should include the Fourier transform 𝒲(k) of the potential W(r):

W(r_a − r_b) = (2π)^{−3/2} ∫ 𝒲(k) e^{ik·(r_a − r_b)} d³k

b. Consider the scattering processes in which, under the effect of the interaction W, particle (b) is scattered in a certain direction, with particle (a) remaining in the same quantum state |χ_0⟩ after the scattering process (elastic scattering). Using a method analogous to the one in Chapter XIII [cf. the comment of § C-3-b], calculate, in the Born approximation, the elastic scattering cross section of particle (b) by particle (a) in the state |χ_0⟩. Show that this cross section can be obtained by multiplying the cross section for scattering by the potential W(r) (in the Born approximation) by a factor which characterizes the state |χ_0⟩, called the "form factor".

7. A simple model of the photoelectric effect

Consider, in a one-dimensional problem, a particle of mass m, placed in a potential of the form V(x) = −αδ(x), where α is a real positive constant. Recall (cf. exercises 2 and 3 of Complement KI) that, in such a potential, there is a single bound state, of negative energy E_0 = −mα²/2ℏ², associated with a normalized wave function φ_0(x) = (√(mα)/ℏ) e^{−mα|x|/ℏ²}. For each positive value of the energy E = ℏ²k²/2m, on the other hand, there are two stationary wave functions, corresponding, respectively, to an incident particle coming from the left or from the right. The expression for the first eigenfunction, for example, is:

χ_k(x) = (2π)^{−1/2} [ e^{ikx} − (1/(1 + iℏ²k/mα)) e^{−ikx} ]        for x < 0
χ_k(x) = (2π)^{−1/2} [ (iℏ²k/mα)/(1 + iℏ²k/mα) ] e^{ikx}             for x > 0

a. Show that the χ_k(x) satisfy the orthonormalization relation (in the extended sense):

⟨χ_k | χ_{k′}⟩ = δ(k − k′)

The following relation [cf. formula (47) of Appendix II] can be used:

∫_0^{+∞} e^{iux} dx = lim_{ε→0⁺} ∫_0^{+∞} e^{iux − εx} dx = πδ(u) + i𝒫(1/u)

Calculate the density of states ρ(E) for a positive energy E.

b. Calculate the matrix element ⟨χ_k|X|φ_0⟩ of the position observable X between the bound state |φ_0⟩ and the positive energy state |χ_k⟩ whose wave function was given above.

c. The particle, assumed to be charged (charge q), interacts with an electric field oscillating at the angular frequency ω. The corresponding perturbation is:

W(t) = −qℰX sin ωt

where ℰ is a constant. The particle is initially in the bound state |φ_0⟩. Assume that ℏω > −E_0. Calculate, using the results of § C of Chapter XIII [see, in particular, formula (C-37)], the transition probability w per unit time to an arbitrary positive energy state (the photoelectric or photoionization effect). How does w vary with ω and ℰ?

8. Disorientation of an atomic level due to collisions with rare gas atoms

Consider a motionless atom A at the origin O of a coordinate frame Oxyz (Fig. 1). This atom is in a level of angular momentum J = 1, to which correspond the three orthonormal kets |m⟩ (m = −1, 0, +1), eigenstates of J_z with eigenvalues mℏ. A second atom B, in a level of zero angular momentum, is in uniform rectilinear motion in the xOz plane: it is travelling at the velocity v along a straight line parallel to Ox and situated at a distance b from this axis (b is the "impact parameter"). The time origin is chosen at the time when B arrives at the point H of the Oz axis (OH = b). At time t, atom B is therefore at the point M, where HM = vt. Call θ the angle between Oz and OM.

The preceding model, which treats the external degrees of freedom of the two atoms classically, permits the simple calculation of the effect on the internal degrees of freedom of atom A (which are treated quantum mechanically) of a collision with atom B (which is, for example, a rare gas atom in the ground state). It can be shown that, because of the Van der Waals forces (cf. Complement CXI) between the two atoms, atom A is subject to a perturbation W acting on its internal degrees of freedom, and given by:

W = (C/r⁶) J_n²

where C is a constant, r = OM is the distance between the two atoms, and J_n is the component of the angular momentum J of atom A on the axis OM joining the two atoms.

Figure 1: atom A is fixed at the origin O of the frame Oxyz; atom B moves in the xOz plane along a line parallel to Ox passing through the point H of the Oz axis; at time t it is at the point M, and θ is the angle between Oz and OM.

a. Express W in terms of J_z, J_± = J_x ± iJ_y and the angle θ; express r and θ in terms of b, v and t, introducing the dimensionless parameter u = vt/b.

b. Assume that there is no external magnetic field, so that the three states |+1⟩, |0⟩, |−1⟩ of atom A have the same energy. Before the collision, that is, at t = −∞, atom A is in the state |−1⟩. Using first-order time-dependent perturbation theory, calculate the probability 𝒫_{−1→+1} of finding, after the collision (that is, at t = +∞), atom A in the state |+1⟩. Discuss the variation of 𝒫_{−1→+1} with respect to b and v. Similarly, calculate 𝒫_{−1→0}.

c. Now assume that there is a static field B_0 parallel to Oz, so that the three states |m⟩ acquire an additional energy −mℏω_0 (the Zeeman effect), where ω_0 is the Larmor angular frequency in the field B_0.

α. With ordinary magnetic fields (B_0 ≲ 10² gauss), ω_0 ≲ 10⁹ rad·s⁻¹; b is of the order of 5 Å, and v of the order of 5 × 10² m·s⁻¹. Show that, under these conditions, the results of question b remain valid.

β. Without going into detailed calculations, explain what happens for much higher values of B_0. Starting with what value of ω_0 (where b and v have the values indicated in α) will the results of b no longer be valid?

d. Without going into detailed calculations, explain how to calculate the disorientation probabilities 𝒫_{−1→+1} and 𝒫_{−1→0} for an atom A placed in a gas of atoms B in thermodynamic equilibrium at the temperature T, containing a number n of atoms per unit volume sufficiently small that only binary collisions need be considered.

N.B. We give: ∫_{−∞}^{+∞} du/(1 + u²)⁴ = 5π/16

9. Transition probability per unit time under the effect of a random perturbation. Simple relaxation model

This exercise uses the results of Complement EXIII. We consider a system of spin-1/2 particles, with gyromagnetic ratio γ, placed in a static field B_0 (we set ω_0 = −γB_0). These particles are enclosed in a spherical cell of radius R. Each of them bounces constantly back and forth between the walls. The mean time between two collisions of the same particle with the wall is called the "flight time" τ_v. During this time, the particle "sees" only the field B_0. In a collision with the wall, each particle remains adsorbed on the surface during a mean time τ_a (τ_a ≪ τ_v), during which it "sees", in addition to B_0, a constant microscopic magnetic field b, due to the paramagnetic impurities contained in the wall. The direction of b varies randomly from one collision to another; the mean amplitude of b is denoted by b_0.

a. What is the correlation time of the perturbation seen by the spins? Give the physical justification for the following form, to be chosen for the correlation function of the components of the microscopic field b:

⟨b_x(t) b_x(t − τ)⟩ = (1/3) b_0² e^{−|τ|/τ_a}

with analogous expressions for the components along Oy and Oz, all the cross terms ⟨b_x(t) b_y(t − τ)⟩, ... being zero.

b. Let ⟨M_z⟩ be the component, along the Oz axis defined by the field B_0, of the macroscopic magnetization of the particles. Show that, under the effect of the collisions with the wall, ⟨M_z⟩ "relaxes" with a time constant T_1:

d⟨M_z⟩/dt = −⟨M_z⟩/T_1

(T_1 is called the longitudinal relaxation time). Calculate T_1 in terms of γ, b_0, τ_v, τ_a, ω_0.

c. Show that studying the variation of T_1 with ω_0 permits the experimental determination of the mean adsorption time τ_a.

d. We have at our disposition several cells, of different radii R, constructed from the same material. By measuring T_1, how can we determine experimentally the mean amplitude b_0 of the microscopic field at the wall?
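As a small aside (not part of the original exercise), the ω_0 dependence invoked in questions b and c of exercise 9 enters through the Fourier transform of the correlation function chosen in question a. The sketch below only checks numerically that this transform is the Lorentzian (2/3) b_0² τ_a/(1 + ω²τ_a²); the symbols and numerical values are assumptions made for the illustration.

```python
import numpy as np

b0, tau_a, omega = 1.0, 2.0, 3.0

tau = np.linspace(-60 * tau_a, 60 * tau_a, 400001)
g = (b0**2 / 3) * np.exp(-np.abs(tau) / tau_a)            # assumed correlation function
J_num = np.sum(g * np.cos(omega * tau)) * (tau[1] - tau[0])   # its Fourier transform at omega
J_exact = (2 * b0**2 / 3) * tau_a / (1 + (omega * tau_a)**2)
print(J_num, J_exact)                                      # the two values agree
```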

10. Absorption of radiation by a many-particle system forming a bound state. The Doppler effect. Recoil energy. The Mössbauer effect

In Complement AXIII, we considered the absorption of radiation by a charged particle attracted by a fixed center (the hydrogen atom model for which the nucleus is infinitely heavy). In this exercise, we treat a more realistic situation, in which the incident radiation is absorbed by a system of several particles of finite masses interacting with each other and forming a bound state. Thus, we are studying the effect on the absorption phenomenon of the degrees of freedom of the center of mass of the system.

I. Absorption of radiation by a free hydrogen atom. The Doppler effect. Recoil energy

Let R_1 and P_1, R_2 and P_2 be the position and momentum observables of two particles, (1) and (2), of masses m_1 and m_2 and opposite charges q_1 and q_2 (a hydrogen atom). Let R and P, R_G and P_G be the position and momentum observables of the relative particle and of the center of mass (cf. Chap. VII, § B). M = m_1 + m_2 is the total mass, and μ = m_1 m_2/(m_1 + m_2) is the reduced mass. The Hamiltonian H_0 of the system can be written:

H_0 = H_ext + H_int   (1)

where:

H_ext = P_G²/2M   (2)

is the translational kinetic energy of the atom, assumed to be free ("external" degrees of freedom), and where H_int (which depends only on R and P) describes the internal energy of the atom ("internal" degrees of freedom). We denote by |K⟩ the eigenstates of H_ext, with eigenvalues ℏ²K²/2M. We concern ourselves with only two eigenstates of H_int, |χ_a⟩ and |χ_b⟩, of energies E_a and E_b (E_b > E_a). We set:

E_b − E_a = ℏω_0   (3)

a. What energy must be furnished to the atom to move it from the state |K; χ_a⟩ (the atom in the state χ_a with a total momentum ℏK) to the state |K′; χ_b⟩?

b. This atom interacts with a plane electromagnetic wave of wave vector k and angular frequency ω = ck, polarized along the unit vector e perpendicular to k. The corresponding vector potential A(r, t) is:

A(r, t) = A_0 e e^{i(k·r − ωt)} + c.c.   (4)

where A_0 is a constant. The principal term of the interaction Hamiltonian between this plane wave and the two-particle system can be written (cf. Complement AXIII, § 1-b):

W(t) = − Σ_{i=1}^{2} (q_i/m_i) P_i · A(R_i, t)   (5)

Express W(t) in terms of R, P, R_G, P_G, μ, M and q (setting q_1 = −q_2 = q), and show that, in the electric dipole approximation, which consists of neglecting k·R (but not k·R_G) compared to 1, we have:

W(t) = W e^{−iωt} + W† e^{iωt}   (6)

where:

W = −(q A_0/μ) (e·P) e^{ik·R_G}   (7)

c. Show that the matrix element of W between the state |K; χ_a⟩ and the state |K′; χ_b⟩ is different from zero only if there exists a certain relation between K, k and K′ (to be specified). Interpret this relation in terms of total momentum conservation during the absorption of an incident photon by the atom.

d. Show from this that if the atom in the state |K; χ_a⟩ is placed in the plane wave (4), resonance occurs when the energy ℏω of the photons associated with the incident wave differs from the energy ℏω_0 of the atomic transition by a quantity δ which is to be expressed in terms of ℏ, ω_0, K, k and M (since δ is a corrective term, we can replace ω by ω_0 in the expression for δ). Show that δ is the sum of two terms: one of which, δ_1, depends on K and on the angle between K and k (the Doppler effect); the other, δ_2, is independent of K. Give a physical interpretation of δ_1 and δ_2 (showing that δ_2 is the recoil kinetic energy of the atom when, having been initially motionless, it absorbs a resonant photon). Show that δ_2 is negligible compared to δ_1 when ℏω_0 is of the order of 10 eV (the domain of atomic physics). Choose, for M, a mass of the order of that of the proton (Mc² ≃ 10⁹ eV), and, for K, a value corresponding to a thermal velocity at T = 300 K. Would this still be true if ℏω_0 were of the order of 10⁵ eV (the domain of nuclear physics)?

II. Recoilless absorption of radiation by a nucleus vibrating about its equilibrium position in a crystal. The Mössbauer effect

The system under consideration is now a nucleus of mass M vibrating at the angular frequency Ω about its equilibrium position in a crystalline lattice (the Einstein model; cf. Complement AV, § 2). We again denote by R_G and P_G the position and momentum observables of the center of mass of this nucleus. The vibrational energy of the nucleus is described by the Hamiltonian:

H_ext = (1/2M) P_G² + (1/2) M Ω² (X_G² + Y_G² + Z_G²)   (8)

which is that of a three-dimensional isotropic harmonic oscillator. Denote by |n_x, n_y, n_z⟩ the eigenstate of H_ext of eigenvalue (n_x + n_y + n_z + 3/2)ℏΩ. In addition to these external degrees of freedom, the nucleus possesses internal degrees of freedom, with which are associated observables that all commute with R_G and P_G. Let H_int be the Hamiltonian that describes the internal energy of the nucleus. As above, we concern ourselves with two eigenstates of H_int, |χ_a⟩ and |χ_b⟩, of energies E_a and E_b, and we set ℏω_0 = E_b − E_a. Since ℏω_0 falls into the γ-ray domain, we have, of course:

ω_0 ≫ Ω   (9)

a. What energy must be furnished to the nucleus to allow it to go from the state |n_x = 0, n_y = 0, n_z = 0; χ_a⟩ (the nucleus in the vibrational ground state and the internal state χ_a) to the state |n_x, n_y, n_z; χ_b⟩?

b. This nucleus is placed in an electromagnetic wave of the type defined by (4), whose wave vector k is parallel to Oz. It can be shown that, in the electric dipole approximation, the interaction Hamiltonian of the nucleus with this plane wave (responsible for the absorption of the γ-rays) can be written as in (6), with:

W = A_0 G e^{ik Z_G}   (10)

where G is an operator which acts on the internal degrees of freedom and consequently commutes with R_G and P_G. Set g = ⟨χ_b|G|χ_a⟩. The nucleus is initially in the state |n_x = 0, n_y = 0, n_z = 0; χ_a⟩. Show that, under the influence of the incident plane wave, a resonance appears whenever ℏω coincides with one of the energies calculated in a, with the intensity of the corresponding resonance proportional to |g|² |⟨n_x, n_y, n_z| e^{ikZ_G} |0, 0, 0⟩|², where the value of k is to be specified. Show, furthermore, that condition (9) allows us to replace k by k_0 = ω_0/c in the expression for the intensity of the resonance.

c. We set:

π_n(k_0) = |⟨φ_n| e^{ik_0 Z_G} |φ_0⟩|²   (11)

where the states |φ_n⟩ are the eigenstates of a one-dimensional harmonic oscillator of position Z_G, mass M and angular frequency Ω.

α. Calculate π_n(k_0) in terms of ℏ, M, Ω, k_0 and n (see also exercise 7 of Complement MV). Set ξ = (ℏ²k_0²/2M)/ℏΩ = ℏk_0²/2MΩ. Hint: establish a recurrence relation between ⟨φ_n|e^{ik_0 Z_G}|φ_0⟩ and ⟨φ_{n−1}|e^{ik_0 Z_G}|φ_0⟩, and express all the π_n(k_0) in terms of π_0(k_0), which is to be calculated directly from the wave function of the harmonic oscillator ground state. Show that the π_n(k_0) are given by a Poisson distribution.

β. Verify that Σ_{n=0}^{∞} π_n(k_0) = 1.

γ. Show that Σ_{n=0}^{∞} n ℏΩ π_n(k_0) = ℏ²k_0²/2M.

d. Assume that ℏΩ ≫ ℏ²k_0²/2M, i.e. that the vibrational energy of the nucleus is much greater than the recoil energy (very rigid crystalline bonds). Show that the absorption spectrum of the nucleus is then essentially composed of a single line of angular frequency ω_0. This line is called the recoilless absorption line. Justify this name. Why does the Doppler effect disappear?

e. Now assume that ℏΩ ≪ ℏ²k_0²/2M (very weak crystalline bonds). Show that the absorption spectrum of the nucleus is composed of a very large number of equidistant lines whose barycenter (obtained by weighting the abscissa of each line by its relative intensity) coincides with the position of the absorption line of the free and initially motionless nucleus. What is the order of magnitude of the width of this spectrum (the dispersion of the lines about their barycenter)? Show that one obtains the results of the first part in the limit Ω → 0.
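The following minimal sketch is not part of the original text; it simply checks the two sum rules of questions β and γ of exercise 10 numerically, assuming the Poisson form π_n(k_0) = e^{−ξ} ξ^n/n! with ξ = ℏk_0²/2MΩ that question α asks to establish. The parameter values are arbitrary.

```python
import numpy as np
from math import factorial, exp

hbar, M, Omega, k0 = 1.0, 5.0, 2.0, 3.0
xi = hbar * k0**2 / (2 * M * Omega)            # recoil energy in units of hbar*Omega

n = np.arange(200)
pi_n = np.array([exp(-xi) * xi**k / factorial(k) for k in n])   # Poisson distribution

print(pi_n.sum())                              # -> 1            (question beta)
print((n * hbar * Omega * pi_n).sum())         # -> recoil energy (question gamma)
print(hbar**2 * k0**2 / (2 * M))
```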

REFERENCES

Exercise 3: see Brossel's lectures in (15.2).
Exercise 5: see Townes and Schawlow (12.10), Chap. 10, § 9.
Exercise 6: see Wilson (16.34).
Exercise 9: see Abragam (14.1), Chap. VIII; Slichter (14.2), Chap. 5.
Exercise 10: see De Benedetti (16.23); Valentin (16.1), annex XV.

Chapter XIV

Systems of identical particles

A. Statement of the problem (1420)
  A-1. Identical particles: definition (1420)
  A-2. Identical particles in classical mechanics (1420)
  A-3. Identical particles in quantum mechanics: the difficulties of applying the general postulates (1421)
B. Permutation operators (1425)
  B-1. Two-particle systems (1426)
  B-2. Systems containing an arbitrary number of particles (1430)
C. The symmetrization postulate (1434)
  C-1. Statement of the postulate (1434)
  C-2. Removal of exchange degeneracy (1435)
  C-3. Construction of physical kets (1436)
  C-4. Application of the other postulates (1440)
D. Discussion (1443)
  D-1. Differences between bosons and fermions. Pauli's exclusion principle (1444)
  D-2. The consequences of particle indistinguishability on the calculation of physical predictions (1446)

In Chapter III, we stated the postulates of non-relativistic quantum mechanics, and in Chapter IX, we concentrated on those which concern spin degrees of freedom. Here, we shall see (§ A) that, in reality, these postulates are not sufficient when we are dealing with systems containing several identical particles since, in this case, their application leads to ambiguities in the physical predictions. To eliminate these ambiguities, it is necessary to introduce a new postulate, concerning the quantum mechanical description of systems of identical particles. We shall state this postulate in § C and discuss its physical implications in § D. Before we do so, however, we shall (in § B) define and study permutation operators, which considerably facilitate the reasoning and the calculations.

A. Statement of the problem

A-1. Identical particles: definition

Two particles are said to be identical if all their intrinsic properties (mass, spin, charge, etc.) are exactly the same: no experiment can distinguish one from the other. Thus, all the electrons in the universe are identical, as are all the protons and all the hydrogen atoms. On the other hand, an electron and a positron are not identical, since, although they have the same mass and the same spin, they have different electrical charges. An important consequence can be deduced from this definition: when a physical system contains two identical particles, there is no change in its properties or its evolution if the roles of these two particles are exchanged.

Comment:

Note that this definition is independent of the experimental conditions. Even if, in a given experiment, the charges of the particles are not measured, an electron and a positron can never be treated like identical particles.

A-2. Identical particles in classical mechanics

In classical mechanics, the presence of identical particles in a system poses no particular problems. This special case is treated just like the general case. Each particle moves along a well-defined trajectory, which enables us to distinguish it from the others and "follow" it throughout the evolution of the system.

To treat this point in greater detail, we shall consider a system of two identical particles. At the initial time t_0, the physical state of the system is defined by specifying the position and velocity of each of the two particles; we denote these initial data by {r_0, v_0} and {r_0′, v_0′}. To describe this physical state and calculate its evolution, we number the two particles: r_1(t) and v_1(t) denote the position and velocity of particle (1) at time t, and r_2(t) and v_2(t), those of particle (2). This numbering has no physical foundation, as it would if we were dealing with two particles having different natures. It follows that the initial physical state which we have just defined may, in theory, be described by two different "mathematical states", as we can set, either:

r_1(t_0) = r_0 ; r_2(t_0) = r_0′ ; v_1(t_0) = v_0 ; v_2(t_0) = v_0′   (A-1)

or:

r_1(t_0) = r_0′ ; r_2(t_0) = r_0 ; v_1(t_0) = v_0′ ; v_2(t_0) = v_0   (A-2)

Now, let us consider the evolution of the system. Suppose that the solution of the equations of motion defined by initial conditions (A-1) can be written:

r_1(t) = r(t) ; r_2(t) = r′(t)   (A-3)

where r(t) and r′(t) are two vector functions. The fact that the two particles are identical implies that the system is not changed if they exchange roles. Consequently, the Lagrangian L(r_1, v_1; r_2, v_2) and the classical Hamiltonian H(r_1, p_1; r_2, p_2) are invariant under exchange of indices 1 and 2. It follows that the solution of the equations of motion corresponding to the initial state (A-2) is:

r_1(t) = r′(t) ; r_2(t) = r(t)   (A-4)

where the functions r(t) and r′(t) are the same as in (A-3). The two possible mathematical descriptions of the physical state under consideration are therefore perfectly equivalent, since they lead to the same physical predictions. The particle which started from {r_0, v_0} at t_0 is at r(t) with the velocity v(t) = dr/dt at time t, and the one which started from {r_0′, v_0′} is at r′(t) with the velocity v′(t) = dr′/dt (Fig. 1). Under these conditions, all we need to do is choose, at the initial time, either one of the two possible "mathematical states" and ignore the existence of the other one. Thus, we treat the system as if the two particles were actually of different natures. The numbers (1) and (2), with which we label them arbitrarily at t_0, then act like intrinsic properties to distinguish the two particles. Since we can follow each particle step-by-step along its trajectory (arrows in Figure 1), we can determine the locations of the particle numbered (1) and the one numbered (2) at any time.

Figure 1: Position and velocity of each of the two particles at the initial time t_0 (initial state) and at time t (state at the instant t).

A-3. Identical particles in quantum mechanics: the difficulties of applying the general postulates

A-3-a. Qualitative discussion of a first simple example

It is immediately apparent that the situation is radically different in quantum mechanics, since the particles no longer have definite trajectories. Even if, at 0 , the wave packets associated with two identical particles are completely separated in space, their subsequent evolution may mix them. We then “lose track” of the particles; when we detect one particle in a region of space in which both of them have a non-zero position probability, we have no way of knowing if the particle detected is the one numbered (1) or the one numbered (2). Except in special cases – for example, when the two wave packets never overlap – the numbering of the two particles becomes ambiguous when their positions are measured, since, as we shall see, there exist several distinct “paths” taking the system from its initial state to the state found in the measurement. 1421


To investigate this point in greater detail, consider a concrete example; a collision between two identical particles in their center of mass frame (Fig. 2). Before the collision, we have two completely separate wave packets, directed towards each other (Fig. 2a). We can agree, for example, to denote by (1) the particle on the left and by (2), the one on the right. During the collision (Fig. 2b), the two wave packets overlap. After the collision, the region of space in which the probability density of the two particles is non-zero1 looks like a spherical shell whose radius increases over time (Fig. 2c). Suppose that a detector placed in the direction which makes an angle with the initial velocity of wave packet (l) detects a particle. It is then certain (because momentum is conserved in the collision) that the other particle is moving away in the opposite direction. However, it is impossible to know if the particle detected at is the one initially numbered (1) or the one numbered (2). Thus, there are two different “paths” that could have led the system from the initial state shown in Figure 2a to the final state found in the measurement. These two paths are represented schematically in Figures 3a and 3b. Nothing enables us to determine which one was actually followed.

Figure 2: Collision between two identical particles in the center of mass frame: schematic representation of the probability density of the two particles. Before the collision (fig. a), the two wave packets are clearly separated and can be labeled. During the collision (fig. b), the two wave packets overlap. After the collision (fig. c), the probability density is nonzero in a region shaped like a spherical shell whose radius increases over time. Because the two particles are identical, it is impossible, when a particle is detected at , to know with which wave packet, (1) or (2), it was associated before the collision.

A fundamental difficulty then arises in quantum mechanics when using the postulates of Chapter III. In order to calculate the probability of a given measurement result, it is necessary to know the final state vectors associated with this result. Here, there are two, which correspond respectively to Figures 3a and 3b. These two kets are distinct (and, furthermore, orthogonal). Nevertheless, they are associated with a single physical state, since it is impossible to imagine a more complete measurement that would permit distinguishing between them. Under these conditions, should one calculate the probability using path 3a, path 3b or both? In the latter case, should one take the sum of the probabilities associated with each path, or the sum of their probability amplitudes (and in this case, with what sign)? These different possibilities lead, as we shall verify later, to different predictions. The answer to the preceding questions will be given in § D after we have stated the symmetrization postulate. Before going on, we shall study another example that will aid us in understanding the difficulties related to the indistinguishability of two particles.

¹ The two-particle wave function depends on six variables (the components of the two particle coordinates r and r′) and is not easily represented in 3 dimensions. Figure 2 is therefore very schematic: the grey regions are those to which both r and r′ must belong for the wave function to take on significant values.

Figure 3: Schematic representation of two types of "paths" which the system could have followed in going from the initial state to the state found in the measurement. Because the two particles are identical, we cannot determine the path that was actually followed.

A-3-b. Origin of the difficulties: exchange degeneracy

In the preceding example, we considered two wave packets which, initially, did not overlap; this enabled us to label each of them arbitrarily with a number, (1) or (2). Ambiguities appeared, however, when we tried to determine the mathematical state (or ket) associated with a given result of a position measurement. Actually, the same difficulty arises in the choice of the mathematical ket used to describe the initial physical state. This type of difficulty is related to the concept of "exchange degeneracy", which we shall introduce in this section. To simplify the reasoning, we shall first consider a different example, so as to confine ourselves to a finite-dimensional space. Then, we shall generalize the concept of exchange degeneracy, showing that it arises in all quantum mechanical systems containing identical particles.

α. Exchange degeneracy for a system of two spin 1/2 particles

Let us consider a system composed of two identical spin 1/2 particles, confining ourselves to the study of its spin degrees of freedom. As in § A-2, we shall distinguish between the physical state of the system and its mathematical description (a ket in state space). It would seem natural to suppose that, if we made a complete measurement of each of the two spins, we would then know the physical state of the total system perfectly. Here, we shall assume that the component along of one of them is equal to +~/2 and that of the other one, – ~/2 (this is the equivalent for the two spins of the specification 1423


of {r_0, v_0} and {r_0′, v_0′} in § A-2). To describe the system mathematically, we number the particles: S_1 and S_2 denote the two spin observables, and |ε_1, ε_2⟩ (where ε_1 and ε_2 can be equal to + or −) is the orthonormal basis of the state space formed by the common eigenkets of S_{1z} (eigenvalue ε_1 ℏ/2) and S_{2z} (eigenvalue ε_2 ℏ/2). Just as in classical mechanics, two different "mathematical states" could be associated with the same physical state. Either one of the two orthogonal kets:

|ε_1 = + , ε_2 = −⟩   (A-5a)
|ε_1 = − , ε_2 = +⟩   (A-5b)

can, a priori, describe the physical state considered here. These two kets span a two-dimensional subspace whose normalized vectors are of the form:

|ψ⟩ = α|+, −⟩ + β|−, +⟩   (A-6)

with:

|α|² + |β|² = 1   (A-7)

By the superposition principle, all mathematical kets (A-6) can represent the same physical state as (A-5a) or (A-5b) (one spin pointing up and the other one pointing down). This is called "exchange degeneracy". Exchange degeneracy creates fundamental difficulties, since application of the postulates of Chapter III to the various kets (A-6) can lead to physical predictions that depend on the ket chosen. Let us determine, for example, the probability of finding the components of the two spins along Ox both equal to +ℏ/2. With this measurement result is associated a single ket of the state space. According to formula (A-20) of Chapter IV, this ket can be written:

(1/√2)[|ε_1 = +⟩ + |ε_1 = −⟩] ⊗ (1/√2)[|ε_2 = +⟩ + |ε_2 = −⟩] = (1/2)[|+, +⟩ + |+, −⟩ + |−, +⟩ + |−, −⟩]   (A-8)

Consequently, the desired probability, for the vector (A-6), is equal to:

(1/4)|α + β|²   (A-9)

This probability does depend on the coefficients α and β. It is not possible, therefore, to describe the physical state under consideration by the set of kets (A-6) or by any one of them chosen arbitrarily. The exchange degeneracy must be removed. That is, we must indicate unambiguously which of the kets (A-6) is to be used.

Comment:

In this example, exchange degeneracy appears only in the initial state, since we chose the same value for the components of the two spins in the final state. In the general case (for example, if the measurement result corresponds to two different eigenvalues of S_x), exchange degeneracy appears in both the initial and the final state.


β. Generalization

The difficulties related to exchange degeneracy arise in the study of all systems containing an arbitrary number N of identical particles (N > 1). Consider, for example, a three-particle system. With each of the three particles, taken separately, are associated a state space and observables acting in this space. Thus, we are led to number the particles: ℰ(1), ℰ(2) and ℰ(3) will denote the three one-particle state spaces, and the corresponding observables will be labeled by the same indices. The state space of the three-particle system is the tensor product:

ℰ = ℰ(1) ⊗ ℰ(2) ⊗ ℰ(3)   (A-10)

Now, consider an observable B(1), initially defined in ℰ(1). We shall assume that B(1) alone constitutes a C.S.C.O. in ℰ(1) [or that B(1) actually denotes several observables which form a C.S.C.O.]. The fact that the three particles are identical implies that the observables B(2) and B(3) exist and that they constitute C.S.C.O.'s in ℰ(2) and ℰ(3) respectively. B(1), B(2) and B(3) have the same spectrum, {b_i ; i = 1, 2, ...}. Using the bases that define these three observables in ℰ(1), ℰ(2) and ℰ(3), we can construct, by taking the tensor product, an orthonormal basis of ℰ, which we shall denote by:

{ |1: b_i ; 2: b_j ; 3: b_k⟩ ;  i, j, k = 1, 2, ... }   (A-11)

The kets |1: b_i ; 2: b_j ; 3: b_k⟩ are common eigenvectors of the extensions of B(1), B(2) and B(3) in ℰ, with respective eigenvalues b_i, b_j and b_k. Since the three particles are identical, we cannot measure B(1) or B(2) or B(3), since the numbering has no physical significance. However, we can measure the physical quantity B for each of the three particles. Suppose that such a measurement has resulted in three different eigenvalues, b_n, b_p and b_q. Exchange degeneracy then appears, since the state of the system after this measurement can, a priori, be represented by any one of the kets of the subspace of ℰ spanned by the six basis vectors:

|1: b_n ; 2: b_p ; 3: b_q⟩ ,  |1: b_q ; 2: b_n ; 3: b_p⟩ ,  |1: b_p ; 2: b_q ; 3: b_n⟩ ,
|1: b_n ; 2: b_q ; 3: b_p⟩ ,  |1: b_p ; 2: b_n ; 3: b_q⟩ ,  |1: b_q ; 2: b_p ; 3: b_n⟩   (A-12)

Therefore, a complete measurement on each of the particles does not permit the determination of a unique ket of the state space of the system.

Comment:

The indeterminacy due to exchange degeneracy is, of course, less important if two of the eigenvalues found in the measurement are equal. This indeterminacy disappears in the special case in which the three results are identical.

B. Permutation operators

Before stating the additional postulate that enables us to remove the indeterminacy related to exchange degeneracy, we shall study certain operators, defined in the total state space of the system under consideration, which actually permute the various particles of the system. The use of these permutation operators will simplify the calculations and reasoning in §§ C and D.

B-1. Two-particle systems

B-1-a. Definition of the permutation operator P_21

Consider a system composed of two particles with the same spin s. Here it is not necessary for these two particles to be identical; it is sufficient that their individual state spaces be isomorphic. Therefore, to avoid the problems that arise when the two particles are identical, we shall assume that they are not: the numbers (1) and (2) with which they are labeled indicate their natures. For example, (1) will denote a proton and (2), an electron. We choose a basis, {|u_i⟩}, in the state space ℰ(1) of particle (1). Since the two particles have the same spin, ℰ(2) is isomorphic to ℰ(1), and it can be spanned by the same basis. By taking the tensor product, we construct, in the state space ℰ of the system, the basis:

{ |1: u_i ; 2: u_j⟩ }   (B-1)

Since the order of the vectors is of no importance in a tensor product, we have:

|2: u_j ; 1: u_i⟩ = |1: u_i ; 2: u_j⟩   (B-2)

However, note that:

|1: u_j ; 2: u_i⟩ ≠ |1: u_i ; 2: u_j⟩   if i ≠ j   (B-3)

The permutation operator P_21 is then defined as the linear operator whose action on the basis vectors is given by:

P_21 |1: u_i ; 2: u_j⟩ = |2: u_i ; 1: u_j⟩ = |1: u_j ; 2: u_i⟩   (B-4)

Its action on any ket of ℰ can easily be obtained by expanding this ket² on the basis (B-1).

Comment:

If we choose the basis {|r, ε⟩} formed by the common eigenstates of the position observable R and the spin component S_z, (B-4) can be written:

P_21 |1: r, ε ; 2: r′, ε′⟩ = |1: r′, ε′ ; 2: r, ε⟩   (B-5)

Any ket |ψ⟩ of the state space ℰ can be represented by a set of (2s + 1)² functions of six variables:

|ψ⟩ = Σ_{ε, ε′} ∫ d³r d³r′ ψ_{ε, ε′}(r, r′) |1: r, ε ; 2: r′, ε′⟩   (B-6)

with:

ψ_{ε, ε′}(r, r′) = ⟨1: r, ε ; 2: r′, ε′ | ψ⟩   (B-7)

We then have:

P_21 |ψ⟩ = Σ_{ε, ε′} ∫ d³r d³r′ ψ_{ε, ε′}(r, r′) |1: r′, ε′ ; 2: r, ε⟩   (B-8)

By changing the names of the dummy variables:

r ⇄ r′ ;  ε ⇄ ε′   (B-9)

we transform formula (B-8) into:

P_21 |ψ⟩ = Σ_{ε, ε′} ∫ d³r d³r′ ψ_{ε′, ε}(r′, r) |1: r, ε ; 2: r′, ε′⟩   (B-10)

Consequently, the functions:

ψ′_{ε, ε′}(r, r′) = ⟨1: r, ε ; 2: r′, ε′ | P_21 | ψ⟩   (B-11)

which represent the ket |ψ′⟩ = P_21|ψ⟩ can be obtained from the functions (B-7) which represent the ket |ψ⟩ by inverting (r, ε) and (r′, ε′):

ψ′_{ε, ε′}(r, r′) = ψ_{ε′, ε}(r′, r)   (B-12)

² It can easily be shown that the operator P_21 so defined does not depend on the {|u_i⟩} basis chosen.

B-1-b. Properties of P_21

We see directly from definition (B-4) that:

(P_21)² = 1   (B-13)

The operator P_21 is its own inverse. It can easily be shown that P_21 is Hermitian:

P_21† = P_21   (B-14)

The matrix elements of P_21 in the {|1: u_i ; 2: u_j⟩} basis are:

⟨1: u_i ; 2: u_j | P_21 | 1: u_k ; 2: u_l⟩ = ⟨1: u_i ; 2: u_j | 1: u_l ; 2: u_k⟩ = δ_il δ_jk   (B-15)

Those of P_21† are, by definition:

⟨1: u_i ; 2: u_j | P_21† | 1: u_k ; 2: u_l⟩ = (⟨1: u_k ; 2: u_l | P_21 | 1: u_i ; 2: u_j⟩)* = (⟨1: u_k ; 2: u_l | 1: u_j ; 2: u_i⟩)* = δ_kj δ_li   (B-16)

Each of the matrix elements of P_21† is therefore equal to the corresponding matrix element of P_21. This leads to relation (B-14). It follows from (B-13) and (B-14) that P_21 is also unitary:

P_21† P_21 = P_21 P_21† = 1   (B-17)

B-1-c.

SYSTEMS OF IDENTICAL PARTICLES

Symmetric and antisymmetric kets. Symmetrizer and antisymmetrizer

According to relation (B-14), the eigenvalues of 21 must be real. Since, according to (B-13), their squares are equal to 1, these eigenvalues are simply +1 and 1. The eigenvectors of 21 associated with the eigenvalue +1 are called symmetric, those corresponding to the eigenvalue 1, antisymmetric: 21

=

=

symmetric

21

=

=

antisymmetric

(B-18)

Now consider the two operators: 1 (1 + 2 1 = (1 2 =

21 )

(B-19a)

21 )

(B-19b)

These operators are projectors, since (B-13) implies that: 2

=

(B-20a)

2

=

(B-20b)

and, in addition, (B-14) enables us to show that: =

(B-21a)

=

(B-21b)

and

are projectors onto orthogonal subspaces, since, according to (B-13): =

=0

(B-22)

These subspaces are supplementary, since definitions (B-19) yield: +

=1

(B-23)

If is an arbitrary ket of the state space , is a symmetric ket and an antisymmetric ket, as it is easy to see, using (B-13) again, that: 21

=

21

=

For this reason,

,

(B-24) and

are called, respectively, a symmetrizer and an antisymmetrizer.

Comment:

The same symmetric ket is obtained by applying 21

=

to

21

or to

itself: (B-25)

For the antisymmetrizer, we have, similarly: 21

1428

=

(B-26)

B. PERMUTATION OPERATORS

B-1-d.

Transformation of observables by permutation

Consider an observable (1), initially defined in (1) and then extended into . It is always possible to construct the basis in (1) from eigenvectors of (1) (the corresponding eigenvalues will be written ). Let us now calculate the action of the operator 21 (1) 21 on an arbitrary basis ket of : (1)

21

21

1:

; 2:

=

(1) 1 :

21

=

21

=

1:

1:

; 2: ; 2:

; 2:

(B-27)

We would obtain the same result by applying the observable ket chosen. Consequently: (1)

21

21

=

(2)

(2) directly to the basis

(B-28)

The same reasoning shows that: (2)

21

21

=

(1)

(B-29)

In , there are also observables, such as (1) + indices simultaneously. We obviously have: 21 [

(1) +

(2)]

21

=

(2) +

(2) or

(1) (2), which involve both

(1)

(B-30)

Similarly, using (B-17), we find: (1) (2)

21

21

=

21

=

(2) (1)

(1)

21 21

(2)

21

(B-31)

These results can be generalized to all observables in which can be expressed in terms of observables of the type of (1) and (2), to be denoted by (1 2): (1 2)

21

21

=

(2 1)

(B-32)

(2 1) is the observable obtained from (1 2) by exchanging indices 1 and 2 throughout. An observable (1 2) is said to be symmetric if: (2 1) =

(1 2)

(B-33)

According to (B-32), all symmetric observables satisfy: (1 2) =

21

(1 2)

21

(B-34)

that is: [

(1 2)

21 ]

=0

(B-35)

Symmetric observables commute with the permutation operator. 1429

CHAPTER XIV

B-2.

SYSTEMS OF IDENTICAL PARTICLES

Systems containing an arbitrary number of particles

In the state space of a system composed of particles with the same spin (temporarily assumed to be of different natures), ! permutation operators can be defined (one of which is the identity operator). If is greater than 2, the properties of these operators are more complex than those of 21 . To have an idea of the changes involved when is greater than 2, we shall briefly study the case in which = 3. B-2-a.

Definition of the permutation operators

Consider, therefore, a system of three particles that are not necessarily identical, but have the same spin. As in § B-1-a, we construct a basis of the state space of the system by taking a tensor product: 1:

; 2:

; 3:

(B-36)

In this case, there exist six permutation operators, which we shall denote by: 123

312

231

132

213

(B-37)

321

By definition, the operator (where , , is an arbitrary permutation of the numbers 1, 2, 3) is the linear operator whose action on the basis vectors obeys: 1:

; 2:

; 3:

=

:

;

:

;

:

; 2:

; 3:

= 2:

; 3:

; 1:

= 1:

:2:

; 3:

(B-38)

For example: 231

1:

(B-39)

on any ket of the 123 therefore coincides with the identity operator. The action of state space can easily be obtained by expanding this ket on the basis (B-36). The ! permutation operators associated with a system of particles with the same spin could be defined analogously. B-2-b.

.

Properties

The set of permutation operators constitutes a group This can easily be shown for the operators (B-37): ()

123

is the identity operator.

( ) The product of two permutation operators is also a permutation operator. We can show, for example, that: 312

132

=

(B-40)

321

To do so, we apply the left-hand side to an arbitrary basis ket: 312

1430

132

1:

; 2:

; 3: =

312

1:

; 3:

; 2:

=

312

1:

; 2:

; 3:

= 3:

; 1:

; 2:

= 1:

; 2:

; 3:

(B-41)

B. PERMUTATION OPERATORS

The action of 321

(

1:

321

; 2:

effectively leads to the same result: ; 3:

= 3:

; 2:

; 1:

= 1:

; 2:

; 3:

(B-42)

) Each permutation operator has an inverse, which is also a permutation operator. Reasoning as in ( ), we can easily show that: 1 123 1 132

= =

123 132

;

1 312

;

1 213

= =

231 213

;

1 231

=

312

;

1 321

=

321

(B-43)

Note that the permutation operators do not commute with each other. For example: 132

312

=

(B-44)

213

which, compared to (B-40), shows that the commutator of

.

132

and

312

is not zero.

Transpositions. Parity of a permutation operator

A transposition is a permutation which simply exchanges the roles of two of the particles, without touching the others. Of the operators (B-37), the last three are transposition operators3 . Transposition operators are Hermitian, and each of them is the same as its inverse, so that they are also unitary [the proofs of these properties are identical to those for (B-14), (B-13) and (B-17)]. Any permutation operator can be broken down into a product of transposition operators. For example, the second operator (B-37) can be written: 312

=

132 213

=

321 132

=

213 321

=

132 213 ( 132 )

2

=

(B-45)

This decomposition is not unique. However, for a given permutation, it can be shown that the parity of the number of transpositions into which it can be broken down is always the same: it is called the parity of the permutation. Thus, the first three operators (B-37) are even, and the last three, odd. For any , there are always as many even permutations as odd ones. .

Permutation operators are unitary

Permutation operators, which are products of transposition operators, all of which are unitary, are therefore also unitary. However, they are not necessarily Hermitian, since transposition operators do not generally commute with each other. Finally, note that the adjoint of a given permutation operator has the same parity as that of the operator, since it is equal to the product of the same transposition operators, taken in the opposite order. B-2-c.

Completely symmetric or antisymmetric kets. Symmetrizer and antisymmetrizer

Since the permutation operators do not commute for 2, it is not possible to construct a basis formed by common eigenvectors of these operators. Nevertheless, we shall see that there exist certain kets which are simultaneously eigenvectors of all the permutation operators. 3 Of

course, for

= 2, the only possible permutation is a transposition.

1431

CHAPTER XIV

SYSTEMS OF IDENTICAL PARTICLES

We shall denote by an arbitrary permutation operator associated with a system of particles with the same spin; represents an arbitrary permutation of the first integers. A ket such that: =

(B-46)

for any permutation antisymmetric ket

, is said to be completely symmetric. Similarly, a completely satisfies, by definition4 :

=

(B-47)

where: = +1 if

is an even permutation

=

is an odd permutation

1 if

The set of completely symmetric kets constitutes a vector subspace ; the set of completely antisymmetric kets, a subspace . Now consider the two operators: = =

1

(B-48) of the state space

(B-49)

! 1

(B-50)

!

where the summations are performed over the ! permutations of the first integers, and is defined by (B-48). We shall show that and are the projectors onto and respectively. For this reason, they are called a symmetrizer and an antisymmetrizer. and are Hermitian operators: =

(B-51)

=

(B-52)

The adjoint of a given permutation operator is, as we saw above (cf. § B-2-b- ), another 1 permutation operator, of the same parity (which coincides, furthermore, with ). Taking the adjoints of the right-hand sides of the definitions of and therefore amounts simply 1 to changing the order of the terms in the summations (since the set of the is again the permutation group).

Also, if

0

is an arbitrary permutation operator, we have:

0

=

0

=

0

=

0

=

(B-53a)

This is due to the fact that 0

=

(B-53b)

0

0

is also a permutation operator: (B-54)

4 According to the property stated in § B-2-b- , this definition can also be based solely on the transposition operators: any transposition operator leaves a completely symmetric ket invariant and transforms a completely antisymmetric ket into its opposite.

1432

B. PERMUTATION OPERATORS

such that: =

(B-55)

0

If, for all the permutations of the group, we see that 0 fixed, we choose successively for the are each identical to one and only one of these permutations (in, of course, a different order). Consequently: 0

=

0

=

1 1

=

! =

0

!

1

=

0

!

1

=

0

!

(B-56a) (B-56b)

0

Similarly, we could prove analogous relations in which right.

and

are multiplied by

0

from the

From (B-53), we see that: 2

=

2

=

(B-57)

and, moreover: =

=0

(B-58)

This is because: 2

2

1

=

=

! 1

=

1

=

!

=

! 1

2

!

as each summation includes =

1

=

!

since half the

=

(B-59)

! terms; furthermore:

1 !

=0

are equal to +1 and half equal to

(B-60) 1 (cf. § B-2-b- ).

and are therefore projectors. They project respectively onto and since, according to (B-53), their action on any ket of the state space yields a completely symmetric or completely antisymmetric ket: 0

=

0

=

(B-61a) (B-61b)

0

Comments:

( ) The completely symmetric ket constructed by the action of on is an arbitrary permutation, is the same as that obtained from expressions (B-53) indicate that: =

, where , since (B-62) 1433

CHAPTER XIV

SYSTEMS OF IDENTICAL PARTICLES

As for the corresponding completely antisymmetric kets, they differ at most by their signs: =

(B-63)

( ) For 2, the symmetrizer and antisymmetrizer are not projectors onto supplementary subspaces. For example, when = 3, it is easy to obtain [by using the fact that the first three permutations (B-37) are even and the others odd] the relation: 1 ( 123 + 231 + 312 ) = 1 (B-64) 3 In other words, the state space is not the direct sum of the subspace of completely symmetric kets and the subspace of completely antisymmetric kets. +

B-2-d.

=

Transformation of observables by permutation

We have indicated (§ B-2-b- ) that any permutation operator of an -particle system can be broken down into a product of transposition operators analogous to the operator 21 studied in § B-1. For these transposition operators, we can use the arguments of § B-1-d to determine the behavior of the various observables of the system when they are multiplied from the left by an arbitrary permutation operator and from the right by . In particular, the observables (1 2 ) which are completely symmetric under exchange of the indices 1, 2, . . . , , commute with all the transposition operators, and, therefore, with all the permutation operators: [ C.

(1 2

)

]=0

(B-65)

The symmetrization postulate

C-1.

Statement of the postulate

When a system includes several identical particles, only certain kets of its state space can describe its physical states. Physical kets are, depending on the nature of the identical particles, either completely symmetric or completely antisymmetric with respect to permutation of these particles. Those particles for which the physical kets are symmetric are called bosons, and those for which they are antisymmetric, fermions. The symmetrization postulate thus limits the state space for a system of identical particles. This space is no longer, as it was in the case of particles of different natures, the tensor product of the individual state spaces of the particles constituting the system. It is only a subspace of , namely or , depending on whether the particles are bosons or fermions. From the point of view of this postulate, particles existing in nature are divided into two categories. All currently known particles obey the following empirical rule 5 : particles of half-integral spin (electrons, positrons, protons, neutrons, muons, etc.) are fermions, and particles of integral spin (photons, mesons, etc.) are bosons. 5 The

1434

“spin-statistics theorem”, proven in quantum field theory, makes it possible to consider this

C. THE SYMMETRIZATION POSTULATE

Comment: Once this rule has been verified for the particles which are called “elementary”, it holds for all other particles as well, inasmuch as they are composed of these elementary particles. Consider a system of many identical composite particles. Permuting two of them is equivalent to simultaneously permuting all the particles composing the first one with the corresponding particles (necessarily identical to the aforementioned ones) of the second one. This permutation must leave the ket describing the state of the system unchanged if the composite particles being studied are formed only of elementary bosons or if each of them contains an even number of fermions (no sign change, or an even number of sign changes); in this case, the particles are bosons. On the other hand, composite particles containing an odd number of fermions are themselves fermions (an odd number of sign changes in the permutation). Now, the spin of these composite particles is necessarily integral in the first case and half-integral in the second one (Chap. X, § C-3-c). They therefore obey the rule just stated. For example, atomic nuclei are known to be composed of neutrons and protons, which are fermions (spin 1/2). Consequently, nuclei whose mass number (the total number of nucleons) is even are bosons, and those whose mass number is odd are fermions. Thus, the nucleus of the 3 He isotope of helium is a fermion, and that of the 4 He isotope, a boson. C-2.

Removal of exchange degeneracy

We shall begin by examining how this new postulate removes the exchange degeneracy and the corresponding difficulties. The discussion of § A can be summarized in the following way. Let be a ket which can mathematically describe a well-defined physical state of a system containing identical particles. For any permutation operator , can describe this physical state as well as . The same is true for any ket belonging to the subspace spanned by and all its permutations . Depending on the ket chosen, the dimension of can vary between 1 and !. If this dimension is greater than 1, several mathematical kets correspond to the same physical state: there is then an exchange degeneracy. The new postulate which we have introduced considerably restricts the class of mathematical kets able to describe a physical state: these kets must belong to for bosons, or to for fermions. We shall be able to say that the difficulties related to exchange degeneracy are eliminated if we can show that contains a single ket of or a single ket of . To do so, we shall use the relations = or = , proven in (B-53). We obtain: =

(C-1a)

=

(C-1b)

These relations express the fact that the projections onto and of the various kets which span and, consequently, of all the kets of , are collinear. The symmetrization postulate thus unambiguously indicates (to within a constant factor) the ket of which rule to be a consequence of very general hypotheses. However, these hypotheses may not all be correct, and discovery of a boson of half-integral spin or a fermion of integral spin remains possible. It is not inconceivable that, for certain particles, the physical kets might have more complex symmetry properties than those envisaged here.

1435

CHAPTER XIV

SYSTEMS OF IDENTICAL PARTICLES

must be associated with the physical state considered: fermions. This ket will be called the physical ket.

for bosons and

for

Comment:

It is possible for all the kets of to have a zero projection onto (or ). In this case, the symmetrization postulate excludes the corresponding physical state. Later (§§ C-3-b and C-3-c), we shall see examples of such a situation when dealing with fermions. C-3. C-3-a.

Construction of physical kets The construction rule

The discussion of the preceding section leads directly to the following rule for the construction of the unique ket (the physical ket) corresponding to a given physical state of a system of identical particles: ( ) Number the particles arbitrarily, and construct the ket corresponding to the physical state considered and to the numbers given to the particles. ( ) Apply or fermions. (

to

, depending on whether the identical particles are bosons or

) Normalize the ket so obtained.

We shall describe some simple examples to illustrate this rule. C-3-b.

Application to systems of two identical particles

Consider a system composed of two identical particles. Suppose that one of them is known to be in the individual state characterized by the normalized ket $|\varphi\rangle$, and the other one, in the individual state characterized by the normalized ket $|\chi\rangle$. First of all, we shall envisage the case in which the two kets, $|\varphi\rangle$ and $|\chi\rangle$, are distinct. The preceding rule is applied in the following way:

(α) We label with the number 1, for example, the particle in the state $|\varphi\rangle$, and with the number 2, the one in the state $|\chi\rangle$. This yields:

$|u\rangle = |1:\varphi;\ 2:\chi\rangle$    (C-2)

(β) We symmetrize $|u\rangle$ if the particles are bosons:

$S|u\rangle = \frac{1}{2}\,\bigl[\,|1:\varphi;\ 2:\chi\rangle + |1:\chi;\ 2:\varphi\rangle\,\bigr]$    (C-3a)

We antisymmetrize $|u\rangle$ if the particles are fermions:

$A|u\rangle = \frac{1}{2}\,\bigl[\,|1:\varphi;\ 2:\chi\rangle - |1:\chi;\ 2:\varphi\rangle\,\bigr]$    (C-3b)

(γ) The kets (C-3a) and (C-3b), in general, are not normalized. If we assume $|\varphi\rangle$ and $|\chi\rangle$ to be orthogonal, the normalization constant is very simple to calculate: all we have to do to normalize $S|u\rangle$ or $A|u\rangle$ is replace the factor $1/2$ appearing in formulas (C-3) by $1/\sqrt{2}$. The normalized physical ket, in this case, can therefore be written:

$|\varphi;\chi\rangle = \frac{1}{\sqrt{2}}\,\bigl[\,|1:\varphi;\ 2:\chi\rangle + \varepsilon\,|1:\chi;\ 2:\varphi\rangle\,\bigr]$    (C-4)

with $\varepsilon = +1$ for bosons and $\varepsilon = -1$ for fermions.

We shall now assume that the two individual states, $|\varphi\rangle$ and $|\chi\rangle$, are identical:

$|\varphi\rangle = |\chi\rangle$    (C-5)

(C-2) then becomes:

$|u\rangle = |1:\varphi;\ 2:\varphi\rangle$    (C-6)

$|u\rangle$ is already symmetric. If the two particles are bosons, (C-6) is then the physical ket associated with the state in which the two bosons are in the same individual state $|\varphi\rangle$. If, on the other hand, the two particles are fermions, we see that:

$A\,|1:\varphi;\ 2:\varphi\rangle = \frac{1}{2}\,\bigl[\,|1:\varphi;\ 2:\varphi\rangle - |1:\varphi;\ 2:\varphi\rangle\,\bigr] = 0$    (C-7)

Consequently, there exists no ket of $E_A$ able to describe the physical state in which two fermions are in the same individual state $|\varphi\rangle$. Such a physical state is therefore excluded by the symmetrization postulate. We have thus established, for a special case, a fundamental result known as "Pauli's exclusion principle": two identical fermions cannot be in the same individual state. This result has some very important physical consequences which we shall discuss in § D-1.
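As a concrete numerical illustration of formulas (C-3), (C-4) and (C-7) (an added sketch, not part of the original text; it assumes a finite-dimensional single-particle space and uses NumPy), the following code builds the two-particle physical kets and checks that the antisymmetrized ket vanishes when the two individual states coincide:

```python
import numpy as np

def two_particle_ket(phi, chi, epsilon):
    """Normalized (anti)symmetrized two-particle ket of (C-4):
    (1/sqrt(2)) [ |1:phi; 2:chi> + epsilon |1:chi; 2:phi> ],
    with epsilon = +1 for bosons and -1 for fermions."""
    ket = np.kron(phi, chi) + epsilon * np.kron(chi, phi)
    norm = np.linalg.norm(ket)
    return ket / norm if norm > 1e-12 else ket   # zero ket if Pauli-excluded

# Two orthogonal single-particle states in a 2-dimensional space
phi = np.array([1.0, 0.0])
chi = np.array([0.0, 1.0])

boson = two_particle_ket(phi, chi, +1)     # symmetric combination
fermion = two_particle_ket(phi, chi, -1)   # antisymmetric combination
print(np.linalg.norm(boson), np.linalg.norm(fermion))   # 1.0 1.0

# Pauli's exclusion principle (C-7): antisymmetrizing two identical states gives zero
print(np.linalg.norm(two_particle_ket(phi, phi, -1)))   # 0.0
```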

C-3-c. Generalization to an arbitrary number of particles

These ideas can be generalized to an arbitrary number $N$ of particles. To see how this can be done, we shall first treat the case $N = 3$. Consider a physical state of the system defined by specifying the three individual normalized states $|\varphi\rangle$, $|\chi\rangle$ and $|\omega\rangle$. The ket $|u\rangle$ which enters into the rule of § C-3-a can be chosen in the form:

$|u\rangle = |1:\varphi;\ 2:\chi;\ 3:\omega\rangle$    (C-8)

We shall discuss the cases of bosons and fermions separately.

α. The case of bosons

The application of $S$ to $|u\rangle$ gives:

$S|u\rangle = \frac{1}{3!}\sum_\alpha P_\alpha\,|1:\varphi;\ 2:\chi;\ 3:\omega\rangle
= \frac{1}{6}\,\bigl[\,|1:\varphi;\ 2:\chi;\ 3:\omega\rangle + |1:\chi;\ 2:\omega;\ 3:\varphi\rangle + |1:\omega;\ 2:\varphi;\ 3:\chi\rangle + |1:\chi;\ 2:\varphi;\ 3:\omega\rangle + |1:\varphi;\ 2:\omega;\ 3:\chi\rangle + |1:\omega;\ 2:\chi;\ 3:\varphi\rangle\,\bigr]$    (C-9)

It then suffices to normalize the ket (C-9). First of all, let us assume that the three kets $|\varphi\rangle$, $|\chi\rangle$ and $|\omega\rangle$ are orthogonal. The six kets appearing on the right-hand side of (C-9) are then also orthogonal. To normalize (C-9), all we must do is replace the factor $1/6$ by $1/\sqrt{6}$.

If the two states $|\varphi\rangle$ and $|\chi\rangle$ coincide, while remaining orthogonal to $|\omega\rangle$, only three distinct kets now appear on the right-hand side of (C-9). It can easily be shown that the normalized physical ket can then be written:

$|\varphi;\varphi;\omega\rangle = \frac{1}{\sqrt{3}}\,\bigl[\,|1:\varphi;\ 2:\varphi;\ 3:\omega\rangle + |1:\varphi;\ 2:\omega;\ 3:\varphi\rangle + |1:\omega;\ 2:\varphi;\ 3:\varphi\rangle\,\bigr]$    (C-10)

Finally, if the three states $|\varphi\rangle$, $|\chi\rangle$, $|\omega\rangle$ are the same, the ket:

$|u\rangle = |1:\varphi;\ 2:\varphi;\ 3:\varphi\rangle$    (C-11)

is already symmetric and normalized.

β. The case of fermions

The application of $A$ to $|u\rangle$ leads to:

$A|u\rangle = \frac{1}{3!}\sum_\alpha \varepsilon_\alpha\,P_\alpha\,|1:\varphi;\ 2:\chi;\ 3:\omega\rangle$    (C-12)

The signs of the various terms of the sum (C-12) are determined by the same rule as those of a $3\times 3$ determinant. This is why it is convenient to write $A|u\rangle$ in the form of a Slater determinant:

$A|u\rangle = \frac{1}{3!}\,\begin{vmatrix} |1:\varphi\rangle & |1:\chi\rangle & |1:\omega\rangle \\ |2:\varphi\rangle & |2:\chi\rangle & |2:\omega\rangle \\ |3:\varphi\rangle & |3:\chi\rangle & |3:\omega\rangle \end{vmatrix}$    (C-13)

$A|u\rangle$ is zero if two of the individual states $|\varphi\rangle$, $|\chi\rangle$ or $|\omega\rangle$ coincide, since the determinant (C-13) then has two identical columns. We obtain Pauli's exclusion principle, already mentioned in § C-3-b: the same quantum mechanical state cannot be simultaneously occupied by several identical fermions. Finally, note that if the three states $|\varphi\rangle$, $|\chi\rangle$, $|\omega\rangle$ are orthogonal, the six kets appearing on the right-hand side of (C-12) are orthogonal. All we must then do to normalize $A|u\rangle$ is replace the factor $1/3!$ appearing in (C-12) or (C-13) by $1/\sqrt{3!}$.

If, now, the system being considered contains more than three identical particles, the situation actually remains similar to the one just described. It can be shown that, for $N$ identical bosons, it is always possible to construct the physical state from arbitrary individual states $|\varphi\rangle$, $|\chi\rangle$, ... On the other hand, for fermions, the physical ket can be written in the form of an $N\times N$ Slater determinant; this excludes the case in which two individual states coincide (the ket is then zero). This shows, and we shall return to this in detail in § D, how different the consequences of the new postulate can be for fermion and boson systems.
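A minimal numerical sketch of the antisymmetrizer used in (C-12) and (C-13) (an added illustration, not from the original text): it builds $A|u\rangle$ for three orthogonal single-particle states as a sum over the $3!$ permutations weighted by their signatures, and verifies that the result vanishes when two states coincide.

```python
import numpy as np
from itertools import permutations
from math import factorial

def antisymmetrize(states):
    """Apply A = (1/N!) * sum_alpha eps_alpha P_alpha to the product ket
    |1:states[0]; 2:states[1]; ...; N:states[N-1]>  (cf. (C-12))."""
    n = len(states)
    result = np.zeros(len(states[0]) ** n)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])        # parity gives eps_alpha
        sign = -1.0 if inversions % 2 else 1.0
        ket = np.array([1.0])
        for p in perm:                                # permuted tensor product
            ket = np.kron(ket, states[p])
        result += sign * ket
    return result / factorial(n)

# Three orthogonal single-particle states in a 3-dimensional space
phi, chi, omega = np.eye(3)
a_ket = antisymmetrize([phi, chi, omega])
print(np.linalg.norm(a_ket))                               # 1/sqrt(3!) ~ 0.408
print(np.linalg.norm(antisymmetrize([phi, phi, omega])))   # 0.0: Pauli exclusion
# Normalizing amounts to replacing the factor 1/3! by 1/sqrt(3!):
print(np.linalg.norm(np.sqrt(factorial(3)) * a_ket))       # 1.0
```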


C-3-d. Construction of a basis in the physical state space

Consider a system of $N$ identical particles. Starting with a basis $\{|u_i\rangle\}$ in the state space of a single particle, we can construct the basis:

$\{\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle\,\}$

in the tensor product space of the $N$ particles. However, since the physical state space of the system is not this tensor product space, but rather one of the subspaces $E_S$ or $E_A$, the problem arises of how to determine a basis in this physical state space.

By application of $S$ (or $A$) to the various kets of the basis $\{\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle\,\}$, we can obtain a set of vectors spanning $E_S$ (or $E_A$). Let $|\psi\rangle$ be an arbitrary ket of $E_S$, for example (the case in which $|\psi\rangle$ belongs to $E_A$ can be treated in the same way). $|\psi\rangle$, which belongs to the tensor product space, can be expanded in the form:

$|\psi\rangle = \sum_{i,j,\dots,p} a_{i,j,\dots,p}\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$    (C-14)

Since $|\psi\rangle$, by hypothesis, belongs to $E_S$, we have $S|\psi\rangle = |\psi\rangle$, and we simply apply the operator $S$ to both sides of (C-14) to show that $|\psi\rangle$ can be expressed in the form of a linear combination of the various kets $S\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$.

However, it must be noted that the various kets $S\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$ are not independent. Let us permute the roles of the various particles in one of the kets $|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$ of the initial basis (before symmetrization). On this new ket, application of $S$ or $A$ leads, according to (B-62) and (B-63), to the same ket of $E_S$ or $E_A$ (possibly with a change of sign). Thus, we are led to introduce the concept of an occupation number: by definition, for the ket $|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$, the occupation number $n_k$ of the individual state $|u_k\rangle$ is equal to the number of times the state $|u_k\rangle$ appears in the sequence $\{|u_i\rangle, |u_j\rangle, \dots, |u_p\rangle\}$, that is, the number of particles in the state $|u_k\rangle$ (we have, obviously, $\sum_k n_k = N$). Two different kets $|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$ for which the occupation numbers are equal can be obtained from each other by the action of a permutation operator. Consequently, after the action of the symmetrizer $S$ (or the antisymmetrizer $A$), they give the same physical state, which we shall denote by $|n_1, n_2, \dots, n_k, \dots\rangle$:

$|n_1, n_2, \dots, n_k, \dots\rangle = c\,S\,|\,1:u_1;\ \dots;\ n_1:u_1;\ n_1{+}1:u_2;\ \dots;\ n_1{+}n_2:u_2;\ \dots\rangle$    (C-15)

(the first $n_1$ particles being in the state $|u_1\rangle$, the next $n_2$ particles in the state $|u_2\rangle$, and so on). For fermions, $S$ would be replaced by $A$ in (C-15) ($c$ is a factor which permits the normalization of the state obtained in this way⁶). We shall not study the states $|n_1, n_2, \dots\rangle$ in detail here; we shall confine ourselves to giving some of their important properties:

⁶ A simple calculation yields $c = \sqrt{N!/(n_1!\,n_2!\,\dots)}$ for bosons and $c = \sqrt{N!}$ for fermions.

(i) The scalar product of two kets $|n_1, n_2, \dots, n_k, \dots\rangle$ and $|n'_1, n'_2, \dots, n'_k, \dots\rangle$ is different from zero only if all the occupation numbers are equal ($n_k = n'_k$ for all $k$).

By using (C-15) and definitions (B-49) and (B-50) of $S$ and $A$, we can obtain the expansion of the two kets under consideration on the orthonormal basis $\{\,|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle\,\}$. It is then easy to see that, if the occupation numbers are not all equal, these two kets cannot simultaneously have non-zero components on the same basis vector.

(ii) If the particles under study are bosons, the kets $|n_1, n_2, \dots, n_k, \dots\rangle$, in which the various occupation numbers are arbitrary (with, of course, $\sum_k n_k = N$), form an orthonormal basis of the physical state space.

Let us show that, for bosons, the kets $|n_1, n_2, \dots\rangle$ defined by (C-15) are never zero. To do so, we replace $S$ by its definition (B-49). There then appear, on the right-hand side of (C-15), various orthogonal kets $|1:u_i;\ 2:u_j;\ \dots;\ N:u_p\rangle$, all with positive coefficients. The ket $|n_1, n_2, \dots\rangle$ cannot, therefore, be zero. Since these kets span $E_S$, are all non-zero, and are orthogonal to each other, they form a basis of $E_S$.

(iii) If the particles under study are fermions, a basis of the physical state space is obtained by choosing the set of kets $|n_1, n_2, \dots, n_k, \dots\rangle$ in which all the occupation numbers are equal either to 1 or to 0 (again with $\sum_k n_k = N$).

The preceding proof is not applicable to fermions because of the minus signs which appear before the odd permutations in definition (B-50) of $A$. Furthermore, we saw in § C-3-c that two identical fermions cannot occupy the same individual quantum state: if any one of the occupation numbers is greater than 1, the vector defined by (C-15) is zero. On the other hand, it is never zero if all the occupation numbers are equal to one or zero; this is because two particles are then never in the same individual quantum state, so that the kets appearing in the expansion of (C-15) are always distinct and orthogonal. Relation (C-15) therefore defines a non-zero physical ket in this case. The rest of the proof is the same as for bosons.
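The counting implied by these occupation-number bases can be made explicit with a short enumeration (an added sketch, assuming a finite number $M$ of accessible individual states; it is not part of the original text):

```python
from itertools import combinations_with_replacement, combinations

def boson_basis(n_particles, n_states):
    """Occupation-number kets for bosons: any n_k >= 0 with sum n_k = N."""
    basis = []
    for occ in combinations_with_replacement(range(n_states), n_particles):
        basis.append(tuple(occ.count(k) for k in range(n_states)))
    return basis

def fermion_basis(n_particles, n_states):
    """Occupation-number kets for fermions: every n_k equal to 0 or 1."""
    basis = []
    for occupied in combinations(range(n_states), n_particles):
        basis.append(tuple(1 if k in occupied else 0 for k in range(n_states)))
    return basis

N, M = 3, 4   # 3 identical particles, 4 accessible individual states
print(len(boson_basis(N, M)))     # 20 = C(M+N-1, N): dimension of the boson space
print(len(fermion_basis(N, M)))   # 4  = C(M, N):     dimension of the fermion space
print(M ** N)                     # 64: dimension of the full tensor product space
```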

C-4. Application of the other postulates

It remains for us to show how the general postulates of Chapter III can be applied in light of the symmetrization postulate introduced in § C-1, and to verify that no contradictions arise. More precisely, we shall see how measurement processes can be described with kets belonging only to either $E_S$ or $E_A$, and we shall show that the time evolution process does not take the ket $|\psi(t)\rangle$ associated with the state of the system out of this subspace. Thus, all the quantum mechanical formalism can be applied inside either $E_S$ or $E_A$.

C-4-a. Measurement postulates

α. Probability of finding the system in a given physical state

Consider a measurement performed on a system of identical particles. The ket $|\psi(t)\rangle$ describing the quantum state of the system before the measurement must, according to the symmetrization postulate, belong to $E_S$ or to $E_A$, depending on whether the system is formed of bosons or fermions. To apply the postulates of Chapter III concerning measurements, we must take the scalar product of $|\psi(t)\rangle$ with the ket $|u\rangle$ corresponding to the physical state of the system after the measurement. This ket $|u\rangle$ is to be constructed by applying the rule given in § C-3-a. The probability amplitude $\langle u|\psi(t)\rangle$ can therefore be expressed in terms of two vectors, both belonging either to $E_S$ or to $E_A$. In § D-2, we shall discuss a certain number of examples of such calculations.

If the measurement envisaged is a "complete" measurement (yielding, for example, the positions and spin components of all the particles), the physical ket $|u\rangle$ is unique (to within a constant factor). On the other hand, if the measurement is "incomplete" (for example, a measurement of the spins only, or a measurement bearing on a single particle), several orthogonal physical kets are obtained, and the corresponding probabilities must then be summed.

β. Physical observables: invariance of $E_S$ and $E_A$

In certain cases, it is possible to specify the measurement performed on the system of identical particles by giving the explicit expression of the corresponding observable in terms of $\mathbf R_1$, $\mathbf P_1$, $\mathbf S_1$, $\mathbf R_2$, $\mathbf P_2$, $\mathbf S_2$, etc. We shall give some concrete examples of observables which can be measured in a three-particle system:

Position of the center of mass $\mathbf R_G$, total momentum $\mathbf P$ and total orbital angular momentum $\mathbf L$:

$\mathbf R_G = \frac{1}{3}\,(\mathbf R_1 + \mathbf R_2 + \mathbf R_3)$    (C-16)

$\mathbf P = \mathbf P_1 + \mathbf P_2 + \mathbf P_3$    (C-17)

$\mathbf L = \mathbf L_1 + \mathbf L_2 + \mathbf L_3$    (C-18)

Electrostatic repulsion energy:

$W = \frac{q^2}{4\pi\varepsilon_0}\left[\frac{1}{|\mathbf R_1 - \mathbf R_2|} + \frac{1}{|\mathbf R_2 - \mathbf R_3|} + \frac{1}{|\mathbf R_3 - \mathbf R_1|}\right]$    (C-19)

Total spin:

$\mathbf S = \mathbf S_1 + \mathbf S_2 + \mathbf S_3$    (C-20)

etc.

It is clear from these expressions that the observables associated with the physical quantities considered involve the various particles symmetrically. This important property follows directly from the fact that the particles are identical. In (C-16), for example, $\mathbf R_1$, $\mathbf R_2$ and $\mathbf R_3$ have the same coefficient, since the three particles have the same mass. It is the equality of the charges which is at the basis of the symmetric form of (C-19). In general, since no physical properties are modified when the roles of the identical particles are permuted, these particles must play a symmetric role⁷ in any actually measurable observable. Mathematically, the corresponding observable $G$, which we shall call a physical observable, must be invariant under all permutations of the identical particles. It must therefore commute with all the permutation operators $P_\alpha$ of the $N$ particles (cf. § B-2-d):

$[G,\ P_\alpha] = 0$  for all $P_\alpha$    (C-21)

⁷ Note that this reasoning is valid for fermions as well as for bosons.

For a system of two identical particles, for example, the observable $\mathbf R_1 - \mathbf R_2$ (the vector difference of the positions of the two particles), which is not invariant under the effect of the permutation $P_{21}$ ($\mathbf R_1 - \mathbf R_2$ changes sign), is not a physical observable; indeed, a measurement of $\mathbf R_1 - \mathbf R_2$ assumes that particle (1) can be distinguished from particle (2). On the other hand, we can measure the distance between the two particles, that is, $\sqrt{(\mathbf R_1 - \mathbf R_2)^2}$, which is symmetric.

Relation (C-21) implies that $E_S$ and $E_A$ are both invariant under the action of a physical observable $G$. Let us show that, if $|\psi\rangle$ belongs to $E_A$, $G|\psi\rangle$ also belongs to $E_A$ (the same proof also applies, of course, to $E_S$). The fact that $|\psi\rangle$ belongs to $E_A$ means that:

$P_\alpha|\psi\rangle = \varepsilon_\alpha\,|\psi\rangle$    (C-22)

Now let us calculate $P_\alpha\,G|\psi\rangle$. According to (C-21) and (C-22), we have:

$P_\alpha\,G|\psi\rangle = G\,P_\alpha|\psi\rangle = \varepsilon_\alpha\,G|\psi\rangle$    (C-23)

Since the permutation $P_\alpha$ is arbitrary, (C-23) expresses the fact that $G|\psi\rangle$ is completely antisymmetric and therefore belongs to $E_A$. All operations normally performed on an observable, in particular the determination of eigenvalues and eigenvectors, can therefore be applied to $G$ entirely within one of the subspaces, $E_S$ or $E_A$. Only the eigenkets of $G$ belonging to the physical subspace, and the corresponding eigenvalues, are retained.
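Relation (C-21) is easy to check numerically for a two-particle system (an added sketch, not from the text; the one-particle "position" operator used is an arbitrary diagonal matrix): a symmetric combination such as $(X_1 - X_2)^2$ commutes with the exchange operator $P_{21}$, while $X_1 - X_2$ does not.

```python
import numpy as np

d = 3                                   # dimension of each single-particle space
X = np.diag([0.0, 1.0, 2.0])            # a one-particle "position" observable
I = np.eye(d)
X1 = np.kron(X, I)                      # X(1): acts on particle 1
X2 = np.kron(I, X)                      # X(2): acts on particle 2

# Exchange operator P21 on the two-particle space: P21 |i, j> = |j, i>
P21 = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P21[j * d + i, i * d + j] = 1.0

G_sym = (X1 - X2) @ (X1 - X2)           # (X1 - X2)^2: symmetric, hence physical
G_asym = X1 - X2                        # X1 - X2: changes sign under P21

print(np.allclose(P21 @ G_sym, G_sym @ P21))     # True:  [G, P21] = 0, cf. (C-21)
print(np.allclose(P21 @ G_asym, G_asym @ P21))   # False: not a physical observable
```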

Comments:

(i) Not all the eigenvalues of $G$ which exist in the total space are necessarily found if we restrict ourselves to the subspace $E_S$ (or $E_A$). The effect of the symmetrization postulate on the spectrum of a symmetric observable may therefore be to suppress certain eigenvalues. On the other hand, it adds no new eigenvalues to this spectrum, since, because of the global invariance of $E_S$ (or $E_A$) under the action of $G$, any eigenvector of $G$ in $E_S$ (or $E_A$) is also an eigenvector of $G$ in the total space, with the same eigenvalue.

(ii) Consider the problem of writing mathematically, in terms of the observables $\mathbf R_1$, $\mathbf P_1$, $\mathbf S_1$, etc., the observables corresponding to the different types of measurement envisaged in § α. This problem is not always simple. For example, for a system of three identical particles, we shall try to write the observables corresponding to the simultaneous measurement of the three positions in terms of $\mathbf R_1$, $\mathbf R_2$ and $\mathbf R_3$. We can resolve this problem by considering several physical observables chosen such that we can, using the results obtained by measuring them, unambiguously deduce the position of each particle (without, of course, being able to associate a numbered particle with each position). For example, we can choose the set of symmetric combinations:

$X_1 + X_2 + X_3,\qquad X_1 X_2 + X_2 X_3 + X_3 X_1,\qquad X_1 X_2 X_3$

(and the corresponding observables for the $y$ and $z$ coordinates). However, this point of view is essentially formal. Rather than trying to write the expressions for the observables in all cases, it is simpler to follow the method used in § α, in which we confined ourselves to using the physical eigenkets of the measurement.

C-4-b. Time-evolution postulates

The Hamiltonian of a system of identical particles must be a physical observable. We shall write, for example, the Hamiltonian describing the motion of the two electrons of the helium atom about the nucleus, assumed to be motionless⁸:

$H(1,2) = \frac{\mathbf P_1^2}{2m_e} + \frac{\mathbf P_2^2}{2m_e} - \frac{2e^2}{R_1} - \frac{2e^2}{R_2} + \frac{e^2}{|\mathbf R_1 - \mathbf R_2|}$    (C-24)

The first two terms represent the kinetic energy of the system; they are symmetric because the two masses are equal. The next two terms are due to the attraction of the nucleus (whose charge is twice that of the proton). The electrons are obviously equally affected by this attraction. Finally, the last term describes the mutual interaction of the electrons. It is also symmetric, since neither of the two electrons is in a privileged position.

It is clear that this argument can be generalized to any system of identical particles. Consequently, all the permutation operators commute with the Hamiltonian of the system:

$[H,\ P_\alpha] = 0$    (C-25)

Under these conditions, if the ket $|\psi(t_0)\rangle$ describing the state of the system at a given time $t_0$ is a physical ket, the same must be true of the ket $|\psi(t)\rangle$ obtained from $|\psi(t_0)\rangle$ by solving the Schrödinger equation. According to this equation:

$|\psi(t + \mathrm dt)\rangle = \left[1 + \frac{\mathrm dt}{i\hbar}\,H\right]|\psi(t)\rangle$    (C-26)

Now, applying $P_\alpha$ and using relation (C-25):

$P_\alpha\,|\psi(t + \mathrm dt)\rangle = \left[1 + \frac{\mathrm dt}{i\hbar}\,H\right]P_\alpha\,|\psi(t)\rangle$    (C-27)

If $|\psi(t)\rangle$ is an eigenvector of $P_\alpha$, $|\psi(t + \mathrm dt)\rangle$ is also an eigenvector of $P_\alpha$, with the same eigenvalue. Since $|\psi(t_0)\rangle$, by hypothesis, is a completely symmetric or completely antisymmetric ket, this property is conserved over time. The symmetrization postulate is therefore also compatible with the postulate that gives the time evolution of physical systems: the Schrödinger equation does not remove the ket $|\psi(t)\rangle$ from $E_S$ or $E_A$.

⁸ Here, we shall consider only the most important terms of this Hamiltonian. See Complement BXIV for a more detailed study of the helium atom.

D. Discussion

In this final section, we shall examine the consequences of the symmetrization postulate on the physical properties of systems of identical particles. First of all, we shall indicate the fundamental differences introduced by Pauli's exclusion principle between systems of identical fermions and systems of identical bosons. Then, we shall discuss the implications of the symmetrization postulate concerning the calculation of the probabilities associated with the various physical processes.

D-1. Differences between bosons and fermions. Pauli's exclusion principle

In the statement of the symmetrization postulate, the difference between bosons and fermions may appear insignificant. Actually, this simple sign difference in the symmetry of the physical ket has extremely important consequences. As we saw in § C-3, the symmetrization postulate does not restrict the individual states accessible to a system of identical bosons. On the other hand, it requires fermions to obey Pauli's exclusion principle: two identical fermions cannot occupy the same individual quantum state. The exclusion principle was formulated initially in order to explain the properties of many-electron atoms (§ D-1-a below and Complement AXIV). It can now be seen to be more than a principle applicable only to electrons: it is a consequence of the symmetrization postulate, valid for all systems of identical fermions. Predictions based on this principle, which are often spectacular, have always been confirmed experimentally. We shall give some examples of them.

D-1-a. Ground state of a system of independent identical particles

The Hamiltonian of a system of identical particles (bosons or fermions) is always symmetric with respect to permutations of these particles (§ C-4). Consider such a system in which the various particles are independent, that is, do not interact with each other (at least in a first approximation). The corresponding Hamiltonian is then a sum of one-particle operators of the form:

$H(1, 2, \dots, N) = h(1) + h(2) + \dots + h(N)$    (D-1)

$h(1)$ is a function only of the observables associated with the particle numbered (1); the fact that the $N$ particles are identical [which implies a symmetric Hamiltonian $H(1, 2, \dots, N)$] requires this function $h$ to be the same in the $N$ terms of expression (D-1). In order to determine the eigenstates and eigenvalues of the total Hamiltonian $H(1, 2, \dots, N)$, we simply calculate those of the individual Hamiltonian $h(j)$ in the state space of one of the particles:

$h(j)\,|j:\varphi_n\rangle = e_n\,|j:\varphi_n\rangle$    (D-2)

For the sake of simplicity, we shall assume that the spectrum of $h(j)$ is discrete and non-degenerate.

If we are considering a system of identical bosons, the physical eigenvectors of the Hamiltonian $H(1, 2, \dots, N)$ can be obtained by symmetrizing the tensor products of $N$ arbitrary individual states $|\varphi_n\rangle$:

$|\varphi_{n_1}; \varphi_{n_2}; \dots; \varphi_{n_N}\rangle = S\,|1:\varphi_{n_1};\ 2:\varphi_{n_2};\ \dots;\ N:\varphi_{n_N}\rangle$    (D-3)

where the corresponding energy is the sum of the $N$ individual energies:

$E_{n_1, n_2, \dots, n_N} = e_{n_1} + e_{n_2} + \dots + e_{n_N}$    (D-4)

[it can easily be shown that each of the kets appearing on the right-hand side of (D-3) is an eigenket of $H$ with the eigenvalue (D-4); this is also true of their sum]. In particular, if $e_1$ is the smallest eigenvalue of $h(j)$, and $|\varphi_1\rangle$ is the associated eigenstate, the ground state of the system is obtained when the $N$ identical bosons are all in the state $|\varphi_1\rangle$. The energy of this ground state is therefore:

$E_{1,1,\dots,1} = N\,e_1$    (D-5)

and its state vector is:

$|\varphi_{1,1,\dots,1}\rangle = |1:\varphi_1;\ 2:\varphi_1;\ \dots;\ N:\varphi_1\rangle$    (D-6)

Now, suppose that the identical particles considered are fermions. It is no longer possible for these particles all to be in the individual state $|\varphi_1\rangle$. To obtain the ground state of the system, Pauli's exclusion principle must be taken into account. If the individual energies are arranged in increasing order:

$e_1 < e_2 < \dots < e_N < e_{N+1} < \dots$    (D-7)

the ground state of the system of $N$ identical fermions has an energy of:

$E_{1,2,\dots,N} = e_1 + e_2 + \dots + e_N$    (D-8)

and it is described by the normalized physical ket:

$|\varphi_{1,2,\dots,N}\rangle = \frac{1}{\sqrt{N!}}\,\begin{vmatrix} |1:\varphi_1\rangle & |1:\varphi_2\rangle & \cdots & |1:\varphi_N\rangle \\ |2:\varphi_1\rangle & |2:\varphi_2\rangle & \cdots & |2:\varphi_N\rangle \\ \vdots & \vdots & & \vdots \\ |N:\varphi_1\rangle & |N:\varphi_2\rangle & \cdots & |N:\varphi_N\rangle \end{vmatrix}$    (D-9)

The highest individual energy $e_N$ found in the ground state is called the Fermi energy of the system. Pauli's exclusion principle thus plays a role of primary importance in all domains of physics in which many-electron systems are involved, such as atomic and molecular physics (cf. Complements AXIV and BXIV) and solid state physics (cf. Complement CXIV), and in all those in which many-proton and many-neutron systems are involved, such as nuclear physics⁹.

Comment:

In most cases, the individual energies are actually degenerate. Each of them can then enter into a sum such as (D-8) a number of times equal to its degree of degeneracy.

⁹ The ket representing the state of a nucleus must be antisymmetric both with respect to the set of protons and with respect to the set of neutrons.
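A short sketch of the content of (D-5) and (D-8) (an added illustration; the individual energies and degeneracies are arbitrary sample values, not taken from the text): for $N$ independent identical particles, the bosonic ground-state energy is $N e_1$, while the fermionic one is obtained by filling the lowest levels while respecting their degeneracies.

```python
def ground_state_energy(individual_energies, n_particles, statistics,
                        degeneracies=None):
    """Ground-state energy of N independent identical particles, cf. (D-5) and (D-8).
    degeneracies defaults to non-degenerate levels."""
    if degeneracies is None:
        degeneracies = [1] * len(individual_energies)
    levels = sorted(zip(individual_energies, degeneracies))
    if statistics == "bosons":
        return n_particles * levels[0][0]          # all N particles in |phi_1>
    energy, remaining = 0.0, n_particles           # fermions: fill levels upward
    for e, g in levels:
        occupied = min(g, remaining)               # at most g fermions per level
        energy += occupied * e
        remaining -= occupied
        if remaining == 0:
            return energy
    raise ValueError("not enough individual states for the fermions")

energies = [1.0, 2.0, 3.0, 4.0]   # sample individual energies e_1 < e_2 < ...
print(ground_state_energy(energies, 3, "bosons"))     # 3.0  = N * e_1
print(ground_state_energy(energies, 3, "fermions"))   # 6.0  = e_1 + e_2 + e_3
print(ground_state_energy(energies, 3, "fermions", degeneracies=[2, 2, 2, 2]))  # 4.0
```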

D-1-b. Quantum statistics

The object of statistical mechanics is to study systems composed of a very large number of particles (in numerous cases, the mutual interactions between these particles are weak enough to be neglected in a first approximation). Since we do not know the microscopic state of the system exactly, we content ourselves with describing it globally by its macroscopic properties (pressure, temperature, density, etc.). A particular macroscopic state corresponds to a whole set of microscopic states. We then use probabilities: the statistical weight of a macroscopic state is proportional to the number of distinct microscopic states that correspond to it, and the system, at thermodynamic equilibrium, is in its most probable macroscopic state (with any constraints that may be imposed taken into account). To study the macroscopic properties of the system, it is therefore essential to determine how many different microscopic states possess certain characteristics and, in particular, a given energy.

In classical statistical mechanics (Maxwell-Boltzmann statistics), the particles of the system are treated as if they were of different natures, even if they are actually identical. A microscopic state is then defined by specifying the individual state of each of the particles, and two microscopic states are considered to be distinct even when they differ only by a permutation of the particles among the same individual states. In quantum statistical mechanics, the symmetrization postulate must be taken into account. A microscopic state of a system of identical particles is characterized by the enumeration of the individual states which form it, the order of these states being of no importance since their tensor product must be symmetrized or antisymmetrized. The counting of the microscopic states therefore does not lead to the same result as in classical statistical mechanics. In addition, Pauli's principle radically differentiates systems of identical bosons and systems of identical fermions: the number of particles occupying a given individual state cannot exceed one for fermions, while it can take on any value for bosons (cf. § C-3). Different statistical properties result: bosons obey Bose-Einstein statistics and fermions, Fermi-Dirac statistics. This is the origin of the terms "bosons" and "fermions".

The physical properties of systems of identical fermions and systems of identical bosons are very different; this subject will be discussed in more detail in the first three chapters of Volume III. The differences can be observed, for example, at low temperatures, when the particles tend to accumulate in the individual states of lowest energy. Identical bosons may then exhibit a phenomenon called Bose-Einstein condensation (Complements BXV and CXV); by contrast, identical fermions, subject to the restrictions of Pauli's principle, build up a Fermi sphere (Complement CXIV) and can undergo only a pair condensation (Chapter XVII). Bose-Einstein condensation is at the origin of the remarkable superfluid properties of liquid ⁴He at low temperatures, a few kelvin (Complement DXV). The ³He isotope, which is a fermion (cf. Comment of § C-1), has very different properties and becomes superfluid only at much lower temperatures, through pair condensation.
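To make the difference between the three ways of counting microscopic states concrete, here is a small enumeration sketch (an added illustration with arbitrary small numbers of particles and states, not taken from the text): Maxwell-Boltzmann counting treats the particles as distinguishable, Bose-Einstein counting keeps one state per list of occupation numbers, and Fermi-Dirac counting further forbids occupation numbers larger than 1.

```python
from math import comb

def count_microstates(n_particles, n_states):
    """Number of distinct microscopic states of N particles over M individual states."""
    maxwell_boltzmann = n_states ** n_particles                     # distinguishable
    bose_einstein = comb(n_states + n_particles - 1, n_particles)   # any n_k
    fermi_dirac = comb(n_states, n_particles)                       # n_k = 0 or 1
    return maxwell_boltzmann, bose_einstein, fermi_dirac

print(count_microstates(2, 3))    # (9, 6, 3)
print(count_microstates(3, 10))   # (1000, 220, 120)
```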

D-2. The consequences of particle indistinguishability on the calculation of physical predictions

In quantum mechanics, all the predictions concerning the properties of a system are expressed in terms of probability amplitudes (scalar products of two state vectors) or matrix elements of an operator. It is then not surprising that the symmetrization or antisymmetrization of state vectors causes special interference effects to appear in systems of identical particles. First, we shall specify these effects, and then we shall see how they disappear under certain conditions (the particles of the system, although identical, then behave as if they were of different natures). To simplify the discussion, we shall confine ourselves to systems containing only two identical particles.

D-2-a. Interferences between direct and exchange processes

α. Predictions concerning a measurement on a system of identical particles: the direct term and the exchange term

Consider a system of two identical particles, one of which is known to be in the individual state $|\varphi\rangle$ and the other, in the individual state $|\chi\rangle$. We shall assume $|\varphi\rangle$ and $|\chi\rangle$ to be orthogonal, so that the state of the system is described by the normalized physical ket [cf. formula (C-4)]:

$|\varphi;\chi\rangle = \frac{1}{\sqrt{2}}\,[1 + \varepsilon P_{21}]\,|1:\varphi;\ 2:\chi\rangle$    (D-10)

where:

$\varepsilon = +1$ if the particles are bosons, $\varepsilon = -1$ if the particles are fermions    (D-11)

With the system in this state, suppose that we want to measure on each of the two particles the same physical quantity $B$ with which the observables $B(1)$ and $B(2)$ are associated. For the sake of simplicity, we shall assume that the spectrum of $B$ is entirely discrete and non-degenerate:

$B\,|u_n\rangle = b_n\,|u_n\rangle$    (D-12)

What is the probability of finding certain given values in this measurement ($b_n$ for one of the particles and $b_{n'}$ for the other one)? We shall begin by assuming $b_n$ and $b_{n'}$ to be different, so that the corresponding eigenvectors $|u_n\rangle$ and $|u_{n'}\rangle$ are orthogonal. Under these conditions, the normalized physical ket defined by the result of this measurement can be written:

$|u_n; u_{n'}\rangle = \frac{1}{\sqrt{2}}\,[1 + \varepsilon P_{21}]\,|1:u_n;\ 2:u_{n'}\rangle$    (D-13)

which gives the probability amplitude associated with this result:

$\langle u_n; u_{n'}|\varphi;\chi\rangle = \frac{1}{2}\,\langle 1:u_n;\ 2:u_{n'}|\,(1 + \varepsilon P_{21})(1 + \varepsilon P_{21})\,|1:\varphi;\ 2:\chi\rangle$    (D-14)

Using properties (B-13) and (B-14) of the operator $P_{21}$, we can write:

$\frac{1}{2}\,(1 + \varepsilon P_{21})(1 + \varepsilon P_{21}) = 1 + \varepsilon P_{21}$    (D-15)

(D-14) then becomes:

$\langle u_n; u_{n'}|\varphi;\chi\rangle = \langle 1:u_n;\ 2:u_{n'}|\,(1 + \varepsilon P_{21})\,|1:\varphi;\ 2:\chi\rangle$    (D-16)

Letting $1 + \varepsilon P_{21}$ act on the bra, we obtain:

$\langle u_n; u_{n'}|\varphi;\chi\rangle = \bigl[\,\langle 1:u_n;\ 2:u_{n'}| + \varepsilon\,\langle 1:u_{n'};\ 2:u_n|\,\bigr]\,|1:\varphi;\ 2:\chi\rangle = \langle u_n|\varphi\rangle\,\langle u_{n'}|\chi\rangle + \varepsilon\,\langle u_{n'}|\varphi\rangle\,\langle u_n|\chi\rangle$    (D-17)

The numbering has disappeared from the probability amplitude, which is now expressed directly in terms of the scalar products $\langle u_n|\varphi\rangle$, $\langle u_{n'}|\chi\rangle$, etc. Also, the probability amplitude appears either as a sum (for bosons) or a difference (for fermions) of two terms, with which we can associate the diagrams of Figures 4a and 4b.

Figure 4: Schematic representation of the direct term and the exchange term associated with a measurement performed on a system of two identical particles. Before the measurement, one of the particles is known to be in the state $|\varphi\rangle$ and the other one, in the state $|\chi\rangle$. The measurement result obtained corresponds to a situation in which one particle is in the state $|u_n\rangle$ and the other one, in the state $|u_{n'}\rangle$. Two probability amplitudes are associated with such a measurement; they are represented schematically by figures a and b. These amplitudes interfere with a + sign for bosons and with a – sign for fermions.

We can interpret result (D-17) in the following way. The two kets $|\varphi\rangle$ and $|\chi\rangle$ associated with the initial state can be connected to the two bras $\langle u_n|$ and $\langle u_{n'}|$ associated with the final state by two different "paths", represented schematically by Figures 4a and 4b. With each of these paths is associated a probability amplitude, $\langle u_n|\varphi\rangle\,\langle u_{n'}|\chi\rangle$ or $\langle u_{n'}|\varphi\rangle\,\langle u_n|\chi\rangle$, and these two amplitudes interfere with a + sign for bosons and a – sign for fermions. Thus, we obtain the answer to the question posed in § A-3-a above: the desired probability $\mathscr P(b_n; b_{n'})$ is equal to the square of the modulus of (D-17):

$\mathscr P(b_n; b_{n'}) = \bigl|\,\langle u_n|\varphi\rangle\,\langle u_{n'}|\chi\rangle + \varepsilon\,\langle u_{n'}|\varphi\rangle\,\langle u_n|\chi\rangle\,\bigr|^2$    (D-18)

One of the two terms on the right-hand side of (D-17), the one which corresponds, for example, to path 4a, is often called the direct term. The other term is called the exchange term.
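The interference of the direct and exchange terms in (D-17) and (D-18) can be made explicit with a few lines of code (an added sketch; the scalar products $\langle u_n|\varphi\rangle$, etc. are arbitrary illustrative complex numbers). For comparison, the same function also evaluates the result for two non-identical particles observed with a detector that cannot tell them apart, a case discussed in the Comment that follows.

```python
def detection_probability(a_n_phi, a_np_chi, a_np_phi, a_n_chi, epsilon):
    """P(b_n; b_n') of (D-18): |<u_n|phi><u_n'|chi> + eps <u_n'|phi><u_n|chi>|^2,
    with eps = +1 (bosons) or -1 (fermions)."""
    amplitude = a_n_phi * a_np_chi + epsilon * a_np_phi * a_n_chi
    return abs(amplitude) ** 2

def detection_probability_distinguishable(a_n_phi, a_np_chi, a_np_phi, a_n_chi):
    """Same measurement for two non-identical particles: probabilities are added."""
    return abs(a_n_phi * a_np_chi) ** 2 + abs(a_np_phi * a_n_chi) ** 2

# Arbitrary illustrative scalar products <u_n|phi>, <u_n'|chi>, <u_n'|phi>, <u_n|chi>
amps = (0.6, 0.7, 0.3 + 0.2j, 0.5)
print(detection_probability(*amps, epsilon=+1))        # bosons: constructive term
print(detection_probability(*amps, epsilon=-1))        # fermions: destructive term
print(detection_probability_distinguishable(*amps))    # no interference term
```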


Comment:

Let us examine what happens if the two particles, instead of being identical, are of different natures. We shall then choose as the initial state of the system the tensor product ket:

$|\psi\rangle = |1:\varphi;\ 2:\chi\rangle$    (D-19)

Now, consider a measurement instrument which, although the two particles, (1) and (2), are not identical, is not able to distinguish between them. If it yields the results $b_n$ and $b_{n'}$, we do not know if $b_n$ is associated with particle (1) or particle (2) (for example, for a system composed of a muon and an electron, the measurement device may be sensitive only to the charge of the particles, giving no information about their masses). The two eigenstates $|1:u_n;\ 2:u_{n'}\rangle$ and $|1:u_{n'};\ 2:u_n\rangle$ (which, in this case, represent different physical states) then correspond to the same measurement result. Since they are orthogonal, we must add the corresponding probabilities, which gives:

$\mathscr P'(b_n; b_{n'}) = \bigl|\langle 1:u_n;\ 2:u_{n'}|1:\varphi;\ 2:\chi\rangle\bigr|^2 + \bigl|\langle 1:u_{n'};\ 2:u_n|1:\varphi;\ 2:\chi\rangle\bigr|^2 = \bigl|\langle u_n|\varphi\rangle\,\langle u_{n'}|\chi\rangle\bigr|^2 + \bigl|\langle u_{n'}|\varphi\rangle\,\langle u_n|\chi\rangle\bigr|^2$    (D-20)

Comparison of (D-18) with (D-20) clearly reveals the significant difference in the physical predictions of quantum mechanics depending on whether the particles under consideration are identical or not.

Now consider the case in which the two states $|u_n\rangle$ and $|u_{n'}\rangle$ are the same. When the two particles are fermions, the corresponding physical state is excluded by Pauli's principle, and the probability $\mathscr P(b_n; b_n)$ is zero. On the other hand, if the two particles are bosons, we have:

$|u_n; u_n\rangle = |1:u_n;\ 2:u_n\rangle$    (D-21)

and, consequently:

$\langle u_n; u_n|\varphi;\chi\rangle = \frac{1}{\sqrt{2}}\,\langle 1:u_n;\ 2:u_n|\,(1 + P_{21})\,|1:\varphi;\ 2:\chi\rangle = \sqrt{2}\,\langle u_n|\varphi\rangle\,\langle u_n|\chi\rangle$    (D-22)

which gives:

$\mathscr P(b_n; b_n) = 2\,\bigl|\langle u_n|\varphi\rangle\,\langle u_n|\chi\rangle\bigr|^2$    (D-23)

Comments:

(i) Let us compare this result with the one which would be obtained in the case, already considered above, in which the two particles are different. We must then replace $|\varphi;\chi\rangle$ by $|1:\varphi;\ 2:\chi\rangle$ and $|u_n; u_n\rangle$ by $|1:u_n;\ 2:u_n\rangle$, which gives the value:

$\langle u_n|\varphi\rangle\,\langle u_n|\chi\rangle$    (D-24)

for the probability amplitude and, consequently:

$\mathscr P'(b_n; b_n) = \bigl|\langle u_n|\varphi\rangle\,\langle u_n|\chi\rangle\bigr|^2$    (D-25)

The probability found for two identical bosons, (D-23), is therefore twice as large as for two distinguishable particles.

(ii) For a system containing $N$ identical particles, there are, in general, $N!$ distinct exchange terms which add (or subtract) in the probability amplitude. For example, consider a system of three identical particles in the individual states $|\varphi\rangle$, $|\chi\rangle$ and $|\omega\rangle$, and the probability of finding, in a measurement, the results $b_n$, $b_{n'}$ and $b_{n''}$. The possible "paths" are then shown in Figure 5. There are six such paths (all different if the three eigenvalues $b_n$, $b_{n'}$ and $b_{n''}$ are different). Some always contribute to the probability amplitude with a + sign, others with an $\varepsilon$ sign (+ for bosons and – for fermions).

Figure 5: Schematic representation of the six probability amplitudes associated with a system of three identical particles. Before the measurement, one particle is known to be in the state $|\varphi\rangle$, another, in the state $|\chi\rangle$, and the last one, in the state $|\omega\rangle$. The result obtained corresponds to a situation in which one particle is in the state $|u_n\rangle$, another, in the state $|u_{n'}\rangle$, and the last one, in the state $|u_{n''}\rangle$. The six amplitudes interfere with a sign which is shown beneath each one ($\varepsilon = +1$ for bosons, $\varepsilon = -1$ for fermions).
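For $N$ identical particles, the $N!$ "paths" of Figure 5 can be summed compactly: the probability amplitude is, up to normalization, the permanent (bosons) or the determinant (fermions) of the matrix of single-particle scalar products $\langle u_i|\varphi_j\rangle$. The following sketch (an added illustration with arbitrary matrix entries, not from the text) performs this sum explicitly over the permutations.

```python
import numpy as np
from itertools import permutations

def exchange_sum(M, epsilon):
    """Sum over the N! paths: sum_sigma eps^(parity) * prod_i M[i, sigma(i)],
    i.e. the permanent of M for eps = +1 and the determinant for eps = -1."""
    n = M.shape[0]
    total = 0.0 + 0.0j
    for sigma in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        sign = epsilon ** inversions
        term = 1.0
        for i in range(n):
            term *= M[i, sigma[i]]
        total += sign * term
    return total

# M[i, j] = <u_i | phi_j>: arbitrary illustrative scalar products for N = 3
M = np.array([[0.8, 0.1, 0.3],
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.9]], dtype=complex)

print(exchange_sum(M, +1))     # bosons: permanent of M
print(exchange_sum(M, -1))     # fermions: determinant of M
print(np.linalg.det(M))        # cross-check of the fermionic sum
```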

β. Example: elastic collision of two identical particles

To understand the physical meaning of the exchange term, let us examine a concrete example (already alluded to in § A-3-a): that of the elastic collision of two identical particles in their center of mass frame¹⁰. Unlike the situation in § α above, here we must take into account the evolution of the system between the initial time when it is in the state $|\psi_i\rangle$ and the time when the measurement is performed. However, as we shall see, this evolution does not change the problem radically, and the exchange term enters the problem as before.

In the initial state of the system (Fig. 6a), the two particles are moving towards each other with opposite momenta. We choose the $Oz$ axis along the direction of these momenta, and we denote their modulus by $p$. One of the particles thus possesses the momentum $p\,\mathbf e_z$, and the other one, the momentum $-p\,\mathbf e_z$ (where $\mathbf e_z$ is the unit vector of the $Oz$ axis). We shall write the physical ket $|\psi_i\rangle$ representing this initial state in the form:

$|\psi_i\rangle = \frac{1}{\sqrt{2}}\,(1 + \varepsilon P_{21})\,|1:p\,\mathbf e_z;\ 2:-p\,\mathbf e_z\rangle$    (D-26)

$|\psi_i\rangle$ describes the state of the system at $t_0$, before the collision.

Figure 6: Collision between two identical particles in the center of mass frame: the momenta of the two particles in the initial state (fig. a) and in the final state found in the measurement (fig. b) are represented. For the sake of simplicity, we ignore the spin of the particles.

The Schrödinger equation which governs the time evolution of the system is linear. Consequently, there exists a linear operator $U(t, t_0)$, which is a function of the Hamiltonian $H$, such that the state vector at time $t$ is given by:

$|\psi(t)\rangle = U(t, t_0)\,|\psi_i\rangle$    (D-27)

(cf. Complement FIII). In particular, after the collision, the state of the system at time $t_1$ is represented by the physical ket:

$|\psi(t_1)\rangle = U(t_1, t_0)\,|\psi_i\rangle$    (D-28)

¹⁰ We shall give a simplified treatment of this problem, intended only to illustrate the relation between the direct term and the exchange term. In particular, we ignore the spin of the two particles. However, the calculations of this section remain valid in the case in which the interactions are not spin-dependent and the two particles are initially in the same spin state.


Note that, since the Hamiltonian $H$ is symmetric, the evolution operator $U$ commutes with the permutation operator:

$[U(t, t_0),\ P_{21}] = 0$    (D-29)

Now, let us calculate the probability amplitude of the result envisaged in § A-3-a, in which the particles are detected in the two opposite directions of the $On$ axis, of unit vector $\mathbf n$ (Fig. 6b). We denote the physical ket associated with this final state by:

$|\psi_f\rangle = \frac{1}{\sqrt{2}}\,(1 + \varepsilon P_{21})\,|1:p\,\mathbf n;\ 2:-p\,\mathbf n\rangle$    (D-30)

The desired probability amplitude can therefore be written:

$\langle\psi_f|\psi(t_1)\rangle = \langle\psi_f|\,U(t_1, t_0)\,|\psi_i\rangle = \frac{1}{2}\,\langle 1:p\,\mathbf n;\ 2:-p\,\mathbf n|\,(1 + \varepsilon P_{21})\,U(t_1, t_0)\,(1 + \varepsilon P_{21})\,|1:p\,\mathbf e_z;\ 2:-p\,\mathbf e_z\rangle$    (D-31)

According to relation (D-29) and the properties of the operator $P_{21}$ [which give $\frac{1}{2}(1 + \varepsilon P_{21})\,U\,(1 + \varepsilon P_{21}) = U\,(1 + \varepsilon P_{21})$], we finally obtain:

$\langle\psi_f|\psi(t_1)\rangle = \langle 1:p\,\mathbf n;\ 2:-p\,\mathbf n|\,U(t_1, t_0)\,|1:p\,\mathbf e_z;\ 2:-p\,\mathbf e_z\rangle + \varepsilon\,\langle 1:-p\,\mathbf n;\ 2:p\,\mathbf n|\,U(t_1, t_0)\,|1:p\,\mathbf e_z;\ 2:-p\,\mathbf e_z\rangle$    (D-32)

The direct term corresponds, for example, to the process shown in Figure 7a, and the exchange term is then represented by Figure 7b. Again, the probability amplitudes associated with these two processes must be added or subtracted. This causes an interference term to appear when the square of the modulus of expression (D-32) is taken. Note also that this expression is simply multiplied by $\varepsilon$ if $\mathbf n$ is changed to $-\mathbf n$, so that the corresponding probability is invariant under this change.

Figure 7: Collision between two identical particles in the center of mass frame: schematic representation of the physical processes corresponding to the direct term and the exchange term. The scattering amplitudes associated with these two processes interfere with a plus sign for bosons and a minus sign for fermions.

D-2-b. Situations in which the symmetrization postulate can be ignored

If the application of the symmetrization postulate were always indispensable, it would be impossible to study the properties of a system containing a restricted number of particles, because it would be necessary to take into account all the particles in the universe which are identical to those in the system. We shall see in this section that this is not the case. In fact, under certain special conditions, identical particles behave as if they were actually different, and it is not necessary to take the symmetrization postulate into account in order to obtain correct physical predictions. It seems natural to expect, considering the results of § D-2-a, that such a situation arises whenever the exchange terms introduced by the symmetrization postulate are zero. We shall give two examples.

α. Identical particles situated in two distinct regions of space

Consider two identical particles, one of which is in the individual state $|\varphi\rangle$ and the other, in the state $|\chi\rangle$. To simplify the notation, we shall ignore their spin. Suppose that the domains of the wave functions representing the kets $|\varphi\rangle$ and $|\chi\rangle$ are well separated in space:

$\varphi(\mathbf r) = \langle\mathbf r|\varphi\rangle = 0$  if  $\mathbf r \notin D$
$\chi(\mathbf r) = \langle\mathbf r|\chi\rangle = 0$  if  $\mathbf r \notin \Delta$    (D-33)

where the domains $D$ and $\Delta$ do not overlap. The situation is analogous to the classical mechanical one (§ A-2): as long as the domains $D$ and $\Delta$ do not overlap, each of the particles can be "tracked"; we therefore expect application of the symmetrization postulate to be unnecessary.

In this case, we can envisage measuring an observable related to one of the two particles. All we need is a measurement device placed so that it cannot record what happens in the domain $D$, or in the domain $\Delta$. If it is $D$ which is excluded in this way, the measurement will only concern the particle in $\Delta$, and vice versa.

Now, imagine a measurement concerning the two particles simultaneously, but performed with two distinct measurement devices, one of which is not sensitive to phenomena occurring in $\Delta$, and the other, to those in $D$. How can the probability of obtaining a given result be calculated? Let $|u\rangle$ and $|v\rangle$ be the individual states associated respectively with the results of the two measurement devices. Since the two particles are identical, the symmetrization postulate must, in theory, be taken into account. In the probability amplitude associated with the measurement result, the direct term is then $\langle u|\varphi\rangle\,\langle v|\chi\rangle$, and the exchange term is $\langle u|\chi\rangle\,\langle v|\varphi\rangle$. Now, the spatial disposition of the measurement devices implies that:

$u(\mathbf r) = \langle\mathbf r|u\rangle = 0$  if  $\mathbf r \in \Delta$
$v(\mathbf r) = \langle\mathbf r|v\rangle = 0$  if  $\mathbf r \in D$    (D-34)

According to (D-33) and (D-34), the wave functions $u(\mathbf r)$ and $\chi(\mathbf r)$ do not overlap; neither do $v(\mathbf r)$ and $\varphi(\mathbf r)$, so that:

$\langle u|\chi\rangle = \langle v|\varphi\rangle = 0$    (D-35)

The exchange term is therefore zero. Consequently, it is unnecessary, in this situation, to use the symmetrization postulate. We obtain the desired result directly by reasoning as if the particles were of different natures, labeling, for example, the one in the domain $D$ with the number 1, and the one situated in $\Delta$ with the number 2. Before the measurement, the state of the system is then described by the ket $|1:\varphi;\ 2:\chi\rangle$, and with the measurement result envisaged is associated the ket $|1:u;\ 2:v\rangle$. Their scalar product gives the probability amplitude $\langle u|\varphi\rangle\,\langle v|\chi\rangle$.

This argument shows that the existence of identical particles does not prevent the separate study of restricted systems, composed of a small number of particles.

Comment: In the initial state chosen, the two particles are situated in two distinct regions of space. In addition, we have defined the state of the system by specifying two individual states. We might wonder if, after the system has evolved, it is still possible to study one of the two particles and ignore the other one. For this to be the case, it is necessary, not only that the two particles remain in two distinct regions of space, but also that they do not interact. Whether the particles are identical or not, an interaction always introduces correlations between them, and it is no longer possible to describe each of them by a state vector.
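The vanishing of the exchange term (D-35) is easy to check numerically (an added sketch: one-dimensional Gaussian wave functions on a grid, with widths and centers chosen arbitrarily so that the domains $D$ and $\Delta$, and the detection regions, do not overlap appreciably).

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 6001)
dx = x[1] - x[0]

def gaussian(center, width=1.0):
    """Normalized Gaussian wave function centered on `center`."""
    psi = np.exp(-(x - center) ** 2 / (2.0 * width ** 2))
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

def overlap(f, g):
    """Scalar product <f|g> approximated on the grid."""
    return np.sum(np.conj(f) * g) * dx

phi, chi = gaussian(-10.0), gaussian(+10.0)   # particle states in domains D and Delta
u, v = gaussian(-9.0), gaussian(+9.0)         # states selected by the two detectors

direct = overlap(u, phi) * overlap(v, chi)    # direct term  <u|phi><v|chi>
exchange = overlap(u, chi) * overlap(v, phi)  # exchange term <u|chi><v|phi>
print(abs(direct))     # of order 0.6: the detectors do respond to the particles
print(abs(exchange))   # essentially zero (far below 1e-70), as in (D-35)
```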

Figure 8: Collision between two identical spin 1/2 particles in the center of mass frame: a schematic representation of the momenta and spins of the two particles in the initial state (fig. a) and in the final state found in the measurement (fig. b). If the interactions between the two particles are spin-independent, the orientation of the spins does not change during the collision. When the two particles are not in the same spin state before the collision (the case of the figure), it is possible to determine the "path" followed by the system in arriving at a given final state. For example, the only scattering process which leads to the final state of figure b and which has a non-zero amplitude is of the type shown in Figure 7a.

β. Particles which can be identified by the direction of their spins

Consider an elastic collision between two identical spin 1/2 particles (electrons, for example), assuming that spin-dependent interactions can be neglected, so that the spin states of the two particles are conserved during the collision. If these spin states are initially orthogonal, they enable us to distinguish between the two particles at all times, as if they were not identical; consequently, the symmetrization postulate should again have no effect here.

We can show this, using the calculation of § D-2-a-β. The initial physical ket will be, for example (Fig. 8a):

$|\psi_i\rangle = \frac{1}{\sqrt{2}}\,(1 - P_{21})\,|1:p\,\mathbf e_z, +;\ 2:-p\,\mathbf e_z, -\rangle$    (D-36)

(where the symbol + or – added after each momentum indicates the sign of the spin component along a particular axis). The final state we are considering (Fig. 8b) will be described by:

$|\psi_f\rangle = \frac{1}{\sqrt{2}}\,(1 - P_{21})\,|1:p\,\mathbf n, +;\ 2:-p\,\mathbf n, -\rangle$    (D-37)

Under these conditions, only the first term of (D-32) is different from zero, since the second one can be written:

$-\,\langle 1:-p\,\mathbf n, -;\ 2:p\,\mathbf n, +|\,U(t_1, t_0)\,|1:p\,\mathbf e_z, +;\ 2:-p\,\mathbf e_z, -\rangle$    (D-38)

This is the matrix element of a spin-independent operator (by hypothesis) between two kets whose spin states are orthogonal; it is therefore zero. Consequently, we would obtain the same result if we treated the two particles directly as if they were different, that is, if we did not antisymmetrize the initial and final kets and if we associated index 1 with the spin state + and index 2 with the spin state –. Of course, this is no longer possible if the evolution operator $U$, that is, the Hamiltonian $H$ of the system, is spin-dependent.

References and suggestions for further reading:

The importance of interference between direct and exchange terms is stressed in Feynman III (1.2), § 3.4 and Chap. 4. Quantum statistics: Reif (8.4). Kittel (8.2). Permutation groups: Messiah (1.17), app. D, § IV; Wigner (2.23), Chap. 13; Bacry (10.31), §§ 41 and 42. The effect of the symmetrization postulate on molecular spectra: Herzberg (12.4), Vol. I, Chap. III, § 2f. An article giving a popularized version: Gamow (1.27).


COMPLEMENTS OF CHAPTER XIV, READER’S GUIDE

AXIV : MANY-ELECTRON ATOMS; ELECTRONIC CONFIGURATIONS

Simple study of many-electron atoms in the central-field approximation. Discusses the consequences of the Pauli exclusion principle and introduces the concept of an electronic configuration. Remains qualitative.

BXIV : ENERGY LEVELS OF THE HELIUM ATOM: CONFIGURATIONS, TERMS, MULTIPLETS

Study, in the case of the helium atom, of the effect of the electrostatic repulsion between the electrons and of the magnetic interactions. Introduces the concepts of terms and multiplets. Can be reserved for later study.

CXIV : PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

Study of the ground state of a gas of free electrons enclosed in a “box”. Introduces the concepts of Fermi energy and periodic boundary conditions. Generalization to electrons in solids and qualitative discussion of the relation between electrical conductivity and the position of the Fermi level. Moderately difficult. The physical discussions are emphasized. Can be considered to be a sequel of FXI.

DXIV : EXERCISES




Complement AXIV
Many-electron atoms. Electronic configurations

1. The central-field approximation
   1-a. Difficulties related to electron interactions
   1-b. Principle of the method
   1-c. Energy levels of the atom
2. Electron configurations of various elements

The energy levels of the hydrogen atom were studied in detail in Chapter VII. Such a study is considerably simplified by the fact that the hydrogen atom possesses a single electron, so that Pauli’s principle is not relevant. In addition, by using the center of mass frame, we can reduce the problem to the calculation of the energy levels of a single particle (the relative particle) subjected to a central potential. In this complement, we shall consider many-electron atoms, for which these simplifications cannot be made. In the center-of-mass frame, we must solve a problem involving several non-independent particles. This is a complex problem, and we shall give only an approximate solution, using the central-field approximation (which will be outlined without going into the details of the calculations). In addition, Pauli’s principle, as we shall show, plays an important role.

1. The central-field approximation

Consider a $Z$-electron atom. Since the mass of its nucleus is much larger (several thousand times) than that of the electrons, the center of mass of the atom practically coincides with the nucleus, which we shall therefore assume to be motionless at the coordinate origin¹. The Hamiltonian describing the motion of the $Z$ electrons, neglecting relativistic corrections and, in particular, spin-dependent terms, can be written:

$H = \sum_{i=1}^{Z}\left[\frac{\mathbf P_i^2}{2m_e} - \frac{Ze^2}{R_i}\right] + \sum_{i<j}\frac{e^2}{|\mathbf R_i - \mathbf R_j|}$    (1)

We have numbered the electrons arbitrarily from 1 to $Z$, and we have set:

$e^2 = \frac{q^2}{4\pi\varepsilon_0}$    (2)

where $q$ is the electron charge. The first term of the Hamiltonian (1) represents the total kinetic energy of the system of $Z$ electrons. The second one arises from the attraction exerted on each of them by the nucleus, which bears a positive charge equal to $Z$ times the elementary charge. The last one describes the mutual repulsion of the electrons [note that the summation is carried out over the $Z(Z-1)/2$ different ways of pairing the $Z$ electrons]. The Hamiltonian (1) is too complicated for us to solve its eigenvalue equation exactly, even in the simplest case, that of helium ($Z = 2$).

¹ Making this approximation amounts to neglecting the nuclear finite mass effect.



1-a. Difficulties related to electron interactions

In the absence of the mutual interaction term $\sum_{i<j} e^2/|\mathbf R_i - \mathbf R_j|$ in $H$, the electrons would be independent. It would then be easy to determine the energies of the atom: we would simply sum the energies of the $Z$ electrons placed individually in the Coulomb potential $-Ze^2/r$, and the theory presented in Chapter VII would yield the result immediately. As for the eigenstates of the atom, they could be obtained by antisymmetrizing the tensor product of the stationary states of the various electrons.

It is thus the presence of the mutual interaction term that makes it difficult to solve the problem exactly. We might try to treat this term by perturbation theory. However, a rough evaluation of its relative magnitude shows that this would not yield a good approximation. We expect the distance $|\mathbf R_i - \mathbf R_j|$ between two electrons to be, on the average, roughly the distance of an electron from the nucleus. The ratio $\rho$ of the third term of formula (1) to the second one is therefore approximately equal to:

$\rho \simeq \frac{\tfrac{1}{2}Z(Z-1)}{Z^2} = \frac{Z-1}{2Z}$    (3)

$\rho$ varies between 1/4 for $Z = 2$ and 1/2 for $Z$ much larger than 1. Consequently, a perturbation treatment of the mutual interaction term would yield, at most, more or less satisfactory results for helium ($Z = 2$), but it is out of the question to apply it to other atoms ($\rho$ is already equal to 1/3 for $Z = 3$). A more elaborate approximation method must therefore be found.

1-b. Principle of the method

To understand the concept of a central field, we shall use a semi-classical argument. Consider a particular electron $(i)$. In a first approximation, the existence of the $Z-1$ other electrons affects it only because their charge distribution partially compensates the electrostatic attraction of the nucleus. In this approximation, the electron $(i)$ can be considered to move in a potential that depends only on its position $\mathbf r_i$ and takes into account the average effect of the repulsion of the other electrons. We choose a potential $V_c(r_i)$ that depends only on the modulus of $\mathbf r_i$ and call it the “central potential” of the atom under consideration. Of course, this can only be an approximation: since the motion of the electron $(i)$ actually influences that of the $Z-1$ other electrons, it is not possible to ignore the correlations which exist between them. Moreover, when the electron $(i)$ is in the immediate vicinity of another electron $(j)$, the repulsion exerted by the latter becomes preponderant, and the corresponding force is not central. However, the idea of an average potential appears more valid in quantum mechanics, where we consider the delocalization of the electrons as distributing their charges throughout an extended region of space.

These considerations lead us to write the Hamiltonian (1) in the form:

$H = \sum_{i=1}^{Z}\left[\frac{\mathbf P_i^2}{2m_e} + V_c(R_i)\right] + W$    (4)

with:

$W = -\sum_{i=1}^{Z}\left[\frac{Ze^2}{R_i} + V_c(R_i)\right] + \sum_{i<j}\frac{e^2}{|\mathbf R_i - \mathbf R_j|}$    (5)

If the central potential $V_c(r)$ is suitably chosen, $W$ should play the role of a small correction in the Hamiltonian $H$ written in the form (4). The central-field approximation then consists of neglecting this correction, that is, of choosing the approximate Hamiltonian:

$H_0 = \sum_{i=1}^{Z}\left[\frac{\mathbf P_i^2}{2m_e} + V_c(R_i)\right]$    (6)

$W$ will then be treated like a perturbation of $H_0$ (cf. Complement BXIV, § 2). The diagonalization of $H_0$ leads to a problem of independent particles: to obtain the eigenstates of $H_0$, we simply determine those of the one-electron Hamiltonian:

$h = \frac{\mathbf P^2}{2m_e} + V_c(R)$    (7)

Definitions (4) and (5) do not, of course, determine the central potential $V_c(r)$, since we always have $H = H_0 + W$ for any choice of $V_c(r)$. However, in order to treat $W$ like a perturbation, $V_c(r)$ must be wisely chosen. We shall not take up the problem of the existence and determination of such an optimal potential here; it is a complex problem. The potential $V_c(r)$ to which a given electron is subjected depends on the spatial distribution of the $Z-1$ other electrons, and this distribution, in turn, depends on the potential $V_c(r)$, since the wave functions of the $Z-1$ electrons must also be calculated from $V_c(r)$. We must therefore arrive at a coherent solution (one generally says “self-consistent”), for which the wave functions determined from $V_c(r)$ give a charge distribution which reconstitutes this same potential $V_c(r)$.

1-c. Energy levels of the atom

While the exact determination of the potential $V_c(r)$ requires rather long calculations, the short- and long-distance behavior of this potential is simple to predict. We expect, for small $r$, the electron under consideration to be inside the charge distribution created by the other electrons, so that it “sees” only the attractive potential of the nucleus. On the other hand, for large $r$, that is, outside the “cloud” formed by the $Z-1$ other electrons treated globally, it is as if we had a single point charge situated at the coordinate origin and equal to the sum of the charges of the nucleus and the “cloud” [the $Z-1$ electrons screen the field of the nucleus]. Consequently (Fig. 1):

$V_c(r) \simeq -\,\frac{e^2}{r}$  for large $r$
$V_c(r) \simeq -\,\frac{Ze^2}{r}$  for small $r$    (8)

For intermediate values of $r$, the variation of $V_c(r)$ can be more or less complicated, depending on the atom under consideration.

Figure 1: Variation of the central potential $V_c(r)$ with respect to $r$. The dashed-line curves represent the behavior of this potential at short distances ($-Ze^2/r$) and at long distances ($-e^2/r$).

Although these considerations are qualitative, they give an idea of the spectrum of the one-electron Hamiltonian (7). Since $V_c(r)$ is not simply proportional to $1/r$, the accidental degeneracy found for the hydrogen atom (Chap. VII, § C-4-b) is no longer observed. The eigenvalues $E_{n,l}$ of the Hamiltonian (7) depend on the two quantum numbers $n$ and $l$ [however, they remain independent of $m$, since $V_c(r)$ is central]. $l$, of course, characterizes the eigenvalue of the operator $\mathbf L^2$, and $n$ is, by definition (as for the hydrogen atom), the sum of the azimuthal quantum number $l$ and the radial quantum number introduced in solving the radial equation corresponding to $l$; $n$ and $l$ are therefore integers and satisfy:

$0 \leqslant l \leqslant n - 1$    (9)

Obviously, for a given value of $l$, the energies increase with $n$:

$E_{n,l} < E_{n',l}$  if  $n < n'$    (10)

For fixed $n$, the energy is lower when the corresponding eigenstate is more “penetrating”, that is, when the probability density of the electron in the vicinity of the nucleus is larger [according to (8), the screening effect is then smaller]. The energies associated with the same value of $n$ can therefore be arranged in order of increasing angular momenta:

$E_{n,\,l=0} < E_{n,\,l=1} < \dots < E_{n,\,l=n-1}$    (11)
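A simple way to visualize the limiting behavior (8) is to write $V_c(r) = -e^2 Z_{\mathrm{eff}}(r)/r$ with an effective charge interpolating between $Z$ at small $r$ and 1 at large $r$. The sketch below is an added illustration: the exponential screening form and the screening length are arbitrary assumptions chosen only to reproduce the two limits, not the optimal self-consistent potential discussed above.

```python
import numpy as np

def model_central_potential(r, Z, screening_length=1.0, e2=1.0):
    """Illustrative screened potential V_c(r) = -e^2 * Z_eff(r) / r,
    with Z_eff(r) = 1 + (Z - 1) * exp(-r / a):
    Z_eff -> Z for r << a (bare nucleus), Z_eff -> 1 for r >> a (full screening)."""
    z_eff = 1.0 + (Z - 1.0) * np.exp(-r / screening_length)
    return -e2 * z_eff / r

r = np.array([0.01, 0.1, 1.0, 10.0])
Z = 10
print(model_central_potential(r, Z) * r)   # ~ -Z e^2 at small r, ~ -e^2 at large r
```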

It so happens that the hierarchy of the $(n, l)$ states is approximately the same for all atoms, although the absolute values of the corresponding energies obviously vary with $Z$. Figure 2 indicates this hierarchy, as well as the $2(2l+1)$-fold degeneracy of each state (the factor 2 comes from the electron spin). The various states are represented in spectroscopic notation (cf. Chap. VII, § C-4-b). Those shown inside the same bracket are very close to each other, and may even, in certain atoms, practically coincide (we stress the fact that Figure 2 is simply a schematic representation intended to situate the eigenvalues with respect to each other; no attempt is made to establish an even moderately realistic energy scale). Note the great difference between the energy spectrum shown and that of the hydrogen atom (cf. Chap. VII, Fig. 4). As we have already pointed out, the energy depends here on the orbital quantum number $l$, and, in addition, the order of the states is different. For example, Figure 2 indicates that the 4s shell has a slightly lower energy than that of the 3d shell. This is explained, as mentioned above, by the fact that the 4s wave function is more penetrating. Analogous inversions occur for the $n = 4$ and $n = 5$ shells, etc. This demonstrates the importance of inter-electron repulsion.

2. Electron configurations of various elements

In the central-field approximation, the eigenstates of the total Hamiltonian $H_0$ of the atom are Slater determinants, constructed from the individual electron states associated with the energy levels that we have just described. This is therefore the situation envisaged in § D-1-a of Chapter XIV: the ground state of the atom is obtained when the $Z$ electrons occupy the lowest individual states compatible with Pauli’s principle. The maximum number of electrons that can have a given energy $E_{n,l}$ is equal to the $2(2l+1)$-fold degeneracy of this energy level. The set of individual states associated with the same energy $E_{n,l}$ is called a shell. The list of occupied shells, with the number of electrons found in each, is called the electronic configuration. The notation used will be specified below in a certain number of examples.

The concept of a configuration also plays an important role in the chemical properties of atoms. Knowledge of the wave functions of the various electrons and of the corresponding energies makes it possible to interpret the number, stability, and geometry of the chemical bonds which can be formed by this atom (cf. Complement EXII).

To determine the electronic configuration of a given atom in its ground state, we simply “fill” the various shells successively, in the order indicated in Figure 2 (starting, of course, with the 1s level), until the $Z$ electrons are exhausted. This is what we shall do, in a rapid review of Mendeleev’s table.

Figure 2: Schematic representation of the hierarchy of energy levels (electronic shells) in a central potential of the type shown in Figure 1. For each value of $l$, the energy increases with $n$. The degeneracy $2(2l+1)$ of each level is indicated in parentheses. The levels that appear inside the same bracket are very close to each other, and their relative disposition can vary from one atom to another. On the right-hand side of the figure are indicated the chemical symbols of the atoms for which the electronic shell appearing on the same line is the outermost shell occupied in the ground state configuration.

In the ground state of the hydrogen atom, the single electron of this atom occupies the 1s level. The electronic configuration of the next element (helium, $Z = 2$) is:

He : $1s^2$    (12)

which means that the two electrons occupy the two orthogonal states of the 1s shell (same spatial wave function, orthogonal spin states). Then comes lithium ($Z = 3$), whose electronic configuration is:

Li : $1s^2\,2s$    (13)

The 1s shell can accept only two electrons, so the third one must go into the level directly above it, that is, according to Figure 2, into the 2s shell. This shell can accept a second electron, which gives beryllium ($Z = 4$) the electronic configuration:

Be : $1s^2\,2s^2$    (14)

For 4, the 2 shell (cf. Fig. 2) is the first to be gradually filled, and so on. As the number of electrons increases, higher and higher electronic shells are brought in (on the right-hand side of Figure 2, we have shown, opposite each of the lowest shells, the symbols of the atoms for which this shell is the outermost). Thus, we obtain the configurations of the ground state for all the atoms. This explains Mendeleev’s classification. However, it must be noted that levels that are very close to each other (those grouped in brackets in Figure 2) may be filled in a very irregular fashion. For example, although Figure 2 gives the 4 shell a lower energy than that of the 3 shell, chromium ( = 24) has five 3 electrons although the 4 shell is incomplete. Similar irregularities arise for copper ( = 29), niobium ( = 41), etc.

Comments:

( ) The electronic configurations which we have analyzed characterize the ground state of various atoms in the central-field approximation. The lowest excited states of the Hamiltonian 0 are obtained when one of the electrons moves to an individual energy level which is higher than the last shell occupied in the ground state. We shall see, for example, in Complement BXIV , that the first excited configuration of the helium atom is: 1

2

(15)

( ) A single non-zero Slater determinant is associated with an electronic configuration ending with a complete shell, since there are then as many orthogonal individual states as there are electrons. Thus, the ground state of the rare gases (. . . , 2 , 6 ) is non-degenerate, as is that of the alkaline-earths (. . . , 2 ). On the other hand, when the number of external electrons is smaller than the degree of degeneracy of the outermost shell, the ground state of the atom is degenerate. For the alkalines ( ), the degree of degeneracy is equal to 2; for carbon (1 2 , 2 2 , 2 2 ), it is equal to 62 = 15, since two individual states can be chosen arbitrarily from the six orthogonal states constituting the 2 shell. (

) It can be shown that, for a complete shell, the total angular momentum is zero, as are the total orbital angular momentum and the total spin (the sums, respectively, of the orbital angular momenta and the spins of the electrons occupying this shell). Consequently, the angular momentum of an atom2 is due only to its outer electrons. Thus, the total angular momentum of a helium atom in its ground state is zero, and that of an alkali metal is equal to 1/2 (a single external electron of zero orbital angular momentum and spin 1/2).

2 The angular momentum being discussed here is that of the electronic cloud of the atom. The nucleus also possesses an angular momentum which should be added to this one.

1465

COMPLEMENT AXIV



References and suggestions for further reading:

Pauling and Wilson (1.9), Chap. IX; Levine (12.3), Chap. 11, § 1, 2 and 3; Kuhn (11.1), Chap. IV, §§ A and B; Schiff (1.18), § 47; Slater (1.6), Chap. 6; Landau and Lifshitz (1.19), §§ 68, 69 and 70. See also references of Chap. XI (Hartree and HartreeFock methods). The shell model in nuclear physics: Valentin (16.1), Chap. VI; Preston (16.4), Chap. 7; Deshalit and Feshbach (16.6), Chap. IV and V. See also articles by Mayer (16.20); Peierls (16.21) and Baranger (16.22).

1466



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

Complement BXIV Energy levels of the helium atom. Configurations, terms, multiplets

1

The central-field approximation. Configurations . . . . . . . 1467 1-a The electrostatic Hamiltonian . . . . . . . . . . . . . . . . . . 1467 1-b The ground state configuration and first excited configurations 1468 1-c Degeneracy of the configurations . . . . . . . . . . . . . . . . 1468 The effect of the inter-electron electrostatic repulsion: exchange energy, spectral terms . . . . . . . . . . . . . . . . . . 1469 2-a Choice of a basis of ( ; ) adapted to the symmetries of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1470 2-b Spectral terms. Spectroscopic notation . . . . . . . . . . . . . 1472 2-c Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1473 Fine-structure levels; multiplets . . . . . . . . . . . . . . . . . 1478

2

3

In the preceding complement, we studied many-electron atoms in the central-field approximation in which the electrons are independent. This enabled us to introduce the concept of a configuration. We shall evaluate the corrections that must be made to this approximation, taking into account the inter-electron electrostatic repulsion more precisely. In order to simplify the reasoning, we shall confine ourselves to the simplest many-electron atom, the helium atom. We shall show that, under the effect of the interelectron electrostatic repulsion, the configurations of this atom (§ 1) split into spectral terms (§ 2), which give rise to fine-structure multiplets (§ 3) when smaller terms in the atomic Hamiltonian (magnetic interactions) are taken into account. The concepts we shall bring out in this treatment can be generalized to more complex atoms. 1.

The central-field approximation. Configurations

1-a.

The electrostatic Hamiltonian

As in the preceding complement, we shall take into account only the electrostatic forces at first, writing the Hamiltonian of the helium atom [formula (C-24) of Chapter XIV] in the form: =

0

+

(1)

where: 0

=

P21 P2 + 2 + 2 2

(

1)

+

(

2)

(2)

and: =

2

2 1

2

2

2

+ 2

R1

R2

(

1)

(

2)

(3) 1467

COMPLEMENT BXIV

• 1s,2p

1s,2s

Figure 1: The ground state configuration and first excited configurations of the helium atom (the energies are not shown to scale).

1s2

The central potential ( ) is chosen so as to make a small correction of 0 . When is neglected, the electrons can be considered to be independent (although their average electrostatic repulsion is partially taken into account by the potential ). The energy levels of 0 then define the electronic configurations we shall study in this section. We shall then examine the effect of by using stationary perturbation theory in § 2. 1-b.

The ground state configuration and first excited configurations

According to the discussion of Complement AXIV (§ 2), the configurations of the helium atom are specified by the quantum numbers , and , of the two electrons (placed in the central potential ). The corresponding energy can be written: =

+

(4)

Thus (Fig. 1), the ground state configuration, written 1 2 , is obtained when the two electrons are in the 1 shell; the first excited configuration, 1 2 , when one electron is in the 1 shell and the other one is in the 2 shell. Similarly, the second excited configuration is the 1 , 2 configuration. The excited configurations of the helium atom are of the form 1 , . Actually, there also exist “doubly excited” configurations of the type , (with , 1). But, for helium, their energy is greater than the ionization energy of the atom (the limit of the energy of the configuration 1 , when ). Most of the corresponding states, therefore, are very unstable: they tend to dissociate rapidly into an ion and an electron and are called “autoionizing states”. However, there exist levels belonging to doubly excited configurations which are not autoionizing, but which decay by emitting photons. Some of the corresponding spectral lines have been observed experimentally. 1-c.

Degeneracy of the configurations

Since is central and not spin-dependent, the energy of a configuration does not depend on the magnetic quantum numbers and ( 6 6 , 6 6 ) or on the spin quantum numbers and ( = , = ) associated with the two electrons. Most of the configurations, therefore, are degenerate; it is this degeneracy we shall now calculate. A state belonging to a configuration is defined by specifying the four quantum numbers ( ) and ( ) of each electron. Since the electrons are identical 1468



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

particles, the symmetrization postulate must be taken into account. The physical ket associated with this state can, according to the results of § C-3-b of Chapter XIV, be written in the form: 1 (1 ;2 : (5) ; = 21 ) 1 : 2 Pauli’s principle excludes the states of the system for which the two electrons would be in the same individual quantum state ( = , = , = , = ). According to the discussion of § C-3-b of Chapter XIV, the set of physical kets (5) for which , , , are fixed and which are not null (that is, not excluded by Pauli’s principle) constitute an orthonormal basis in the subspace ( ; ) of associated with the configuration , . To evaluate the degeneracy of a configuration , we shall distinguish between two cases: ( ) The two electrons are not in the same shell (we do not have = and = ). The individual states of the two electrons can never coincide, and , , , can independently take on any value. The degeneracy of the configuration, consequently, is equal to: 2(2 + 1)

2(2 + 1) = 4(2 + 1)(2 + 1)

(6)

The 1 , 2 and 1 , 2 configurations enter into this category; their degeneracies are equal to 4 and 12 respectively. ( ) The two electrons are in the same shell ( = and = ). In this case, the states for which = and = must be excluded. Since the number of distinct individual quantum states is equal to 2(2 + 1), the degree of degeneracy of the 2 configuration is equal to the number of pairs that can be formed from these individual states (cf. § C-3-b of Chapter XIV), that is: 2 2(2 +1)

= (2 + 1)(4 + 1)

(7)

Thus, the 1 2 configuration, which enters into this category, is not degenerate. It is useful to expand the Slater determinant corresponding to this configuration. If, in (5), we set = = 1, = = = = 0, = +, = , we obtain, writing the spatial part as a common factor: 1

2

= 1 : 1 0 0; 2 : 1 0 0

1 ( 1 : +; 2 : 2

1:

;2 : +

)

(8)

In the spin part of (8), we recognize the expression for the singlet state = 0, =0 , where and are the quantum numbers related to the total spin S = S1 + S2 (cf. Chap. X, § B-4). Thus, although the Hamiltonian 0 does not depend on the spins, the constraints introduced by the symmetrization postulate require the total spin of the ground state to have the value = 0. 2.

The effect of the inter-electron electrostatic repulsion: exchange energy, spectral terms

We shall now study the effect of by using stationary perturbation theory. To do so, we must diagonalize the restriction of inside the subspace ( ; ) associated 1469

COMPLEMENT BXIV



with the , configuration. The eigenvalues of the corresponding matrix give the corrections of the configuration energy to first order in ; the associated eigenstates are the zero-order eigenstates. To calculate the matrix which represents inside ( ; ), we can choose any basis, in particular, the basis of kets (5). Actually, it is to our advantage to use a basis well adapted to the symmetries of . We shall see that we can choose a basis in which the restriction of is already diagonal. 2-a.

.

Choice of a basis of (

;

) adapted to the symmetries of

Total orbital momentum L and total spin S

does not commute with the individual orbital angular momenta L1 and L2 of each electron. However, we have already shown (cf. Chap. X, § A-2) that, if L denotes the total orbital angular momentum: L = L1 + L2

(9)

we have: 2

[

L] = [

L] = 0

(10)

12

Therefore, L is a constant of the motion1 . Moreover, since state space, this is also true for the total spin S: [

S] = 0

does not act in the spin (11)

Now, consider the set of the four operators, L2 , S2 , , . They commute with each other and with . We shall show that they constitute a C.S.C.O. in the subspace ( ; ) of . This will enable us in § 2-b to find directly the eigenvalues of the restriction of in this subspace. To do this, we shall return to the space , the tensor product of the state spaces (1) and (2) relative to the two electrons, assumed to be numbered arbitrarily. The subspace ( ; ) of associated with the , configuration can be obtained2 by antisymmetrizing the various kets of the subspace (1) (2) of . If we choose the basis 1 : 2: in this subspace, we obtain the basis of physical kets (5) by antisymmetrization. However, we know from the results of Chapter X that we can also choose in (1) (2) another basis composed of common eigenvectors of L2 , , S2 , and entirely defined by the specification of the corresponding eigenvalues. We shall write this basis: 1:

;2 :

;

= + =1 0

+

(12)

with:

1 This

1

(13)

result is related to the fact that, under a rotation involving both electrons, the distance between them, 12 , is invariant. However, it changes if only one of the two electrons is rotated. This is why commutes with neither L1 nor L2 . 2 We could also start with the subspace (1) (2) [cf. comment (i) of § B-2-c of Chapter XIV, p. 1433].

1470



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

Since L2 , , S2 , are all symmetric operators (they commute with 21 ), the vectors (12) remain, after antisymmetrization, eigenvectors of L2 , , S2 , with the same eigenvalues (some of them may, of course, have a zero projection onto , in which case the corresponding physical states are excluded by Pauli’s principle; see § below). The non-zero kets obtained by antisymmetrization of (12) are therefore orthogonal, since they correspond to different eigenvalues of at least one of the four observables under consideration. Since they span ( ; ), they constitute an orthonormal basis of this subspace, which we shall write: ;

;

;

(14)

with: ;

;

; = (1

where ( ;

21 )

1:

;2 :

;

(15)

is a normalization constant. L2 , , S2 , therefore form a C.S.C.O. inside ). ( ) Now, we shall introduce the permutation operator 21 in the spin state space:

( ) 21

1 : ;2 :

= 1:

;2 :

(16)

We showed in § B-4 of Chapter X [cf. comment ( )] that: ( ) 21

= ( 1)

Furthermore, if we have: 21

=

(0) 21

(0) 21

+1

(17)

is the permutation operator in the state space of the orbital variables,

( ) 21

(18)

Using (17) and (18), we can, finally, put (15) in the form: ;

;

; =

.

[1

( 1)

+1

(0) 21 ]

1:

;2 :

;

(19)

Constraints imposed by the symmetrization postulate

We have seen that the dimension of the space ( ; ) is not always equal to 4(2 + 1)(2 + 1), that is, to the dimension of (1) (2). Certain kets of (1) (2) can therefore have a zero projection onto ( ; ). It is interesting to study the consequences for the basis (14) of this constraint imposed by the symmetrization postulate. First of all, assume that the two electrons do not occupy the same shell. It is then easy to see that the orbital part of (19) is a sum or a difference of two orthogonal kets and, consequently, is never zero3 . Since the same is true of , we see that all the 3 The

normalization constant

is then equal to

1 2

1471

COMPLEMENT BXIV



possible values of and [cf. formula (13)] are allowed. For example, for the 1 , 2 configuration, we can have = 0, = 0 and = 1, = 0; for the 1 , 2 configuration, we can have = 0, = 1 and = 1, = 1, etc. If we now assume that the two electrons occupy the same shell, we have = and = , and certain of the kets (19) can be zero. Let us write 1 : ;2 : ; in the form: 1:

;2 :

; =

;

1:

;2 :

(20)

According to relation (25) of Complement BX : ;

= ( 1)

;

(21)

By using (20), we then get: (0) 21

1:

;2 :

;

= ( 1) 1 :

;2 :

;

(22)

Substituting this result into (19), we obtain4 : ;

;

;

=

0 1:

if + is odd ;2 : ;

if

+

is even (23)

Therefore, and cannot be arbitrary: + must be even. In particular, for the 1 2 configuration, we must have = 0, so = 1 is excluded. This is a result found previously. Finally, note that the symmetrization postulate introduces a close correlation between the symmetry of the orbital part and that of the spin part of the physical ket (19). Since the total ket must be antisymmetric, and the spin part, depending on the value of , is symmetric ( = 1) or antisymmetric ( = 0), the orbital part must be antisymmetric when = 1 and symmetric when = 0. We shall see the importance of this point later. 2-b.

Spectral terms. Spectroscopic notation

commutes with the four observables L2 , inside ( ; ). It follows that the restriction of the basis: ; (

; )=

; ;

, S2 , , which form a C.S.C.O. inside ( ; ) is diagonal in

and has eigenvalues of: ;

;

;

;

;

(24)

This energy depends neither on nor on , since relations (10) and (11) imply that commutes not only with and but also with and : is therefore a scalar 4 The

1472

normalization constant is then 1/2.



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

operator in both the orbital state space and the spin state space (cf. Complement BVI , §§ 5-b and 6-c). Inside each configuration, we thus obtain energy levels ( ; )+ ( ), labeled by their values of and . Each of them is (2 + 1)(2 + 1)-fold degenerate. Such levels are called spectral terms and denoted in the following way. With each value of is associated, in spectroscopic notation (Chap. VII, § C-4-b) a letter of the alphabet; we write the corresponding capital letter and add, at the upper left, a number equal to 2 + 1. For example, the 1 2 configuration leads to a single spectral term, written 1 (the 3 , as we have seen, is forbidden by Pauli’s principle). The 1 , 2 configuration produces two terms, 1 (non-degenerate) and 3 (three-fold degenerate); the 1 , 2 configuration, two terms, 1 (degeneracy 3) and 3 (degeneracy 9). For a more complicated configuration such as, for example, 2 2 , we obtain (cf. § 2-a- ) the spectral terms 1 , 1 and 3 ( + must be even), etc. Under the effect of the electrostatic repulsion, the degeneracy of each configuration is therefore partially removed (the 1 2 configuration, which is non-degenerate, is simply shifted). We shall study this effect in greater detail in the simple example of the 1 2 configuration. We shall try to understand why the two terms 1 and 3 resulting from this configuration, and whose total spin values are different, have different energies although the original Hamiltonian is purely electrostatic. 2-c.

Discussion

.

Energies of the spectral terms arising from the 1 , 2 configuration In the 1 , 2 configuration, = 1:

=1

= 0; 2 :

=2

= 1:

= 0;

=1

=

=

= 0. It is then easy to obtain from (20):

=

=0

= 0; 2 :

=2

=

=0

(25)

a vector that we shall write, more simply, 1 : 1 ; 2 : 2 . If 3 3 denote the states corresponding to the two spectral terms and 1 2 configuration, we obtain, substituting (25) into (19): 1 [(1 2 1 = [(1 + 2

3

1

Since

= 0

(0) 21

1 : 1 ;2 : 2

]

=1

(0) 21

1 : 1 ;2 : 2

]

=0

and 1 0 arising from the 1 ,

(26a) =0

(26b)

does not act on the spin variables, the eigenvalues given by (24) can be written: 1 2 1 )= 2

(3 ) =

1 : 1 ; 2 : 2 (1

(0) 21 )

(1

(0) 21 )

1 : 1 ;2 : 2

(27a)

(1

1 : 1 ; 2 : 2 (1 +

(0) 21 )

(1 +

(0) 21 )

1 : 1 ;2 : 2

(27b)

(0)

(we have used the fact that 21 is Hermitian). Moreover, (0) the square of 21 is the identity operator. Therefore: (1

(0) 21 )

(1

(0) 21 )

= (1

(0) 2 21 )

= 2(1

(0) 21 )

(0) 21

commutes with

, and

(28) 1473

COMPLEMENT BXIV



Finally, we obtain: (3 ) = 1

(

)=

(29a) +

(29b)

with: = =

1 : 1 ;2 : 2

1 : 1 ;2 : 2 (0) 21

1 : 1 ;2 : 2

(30)

1 : 1 ;2 : 2

=

1 : 2 ;2 : 1

1 : 1 ;2 : 2

(31)

therefore represents an overall shift of the energy of the two terms and does not contribute to their separation. is more interesting, as it introduces an energy difference between the 3 and 1 terms (cf. Fig. 2). We shall therefore study it in a little more detail.

1S

2J ≃ 0.8 eV 3S

1s

K

2s

Figure 2: The relative position of the spectral terms 1 and 3 arising from the 1 , 2 configuration of the helium atom. represents an overall shift of the configuration. The removal of the degeneracy is proportional to the exchange integral

.

The exchange integral When we substitute expression (3) for 1 : 2 ;2 : 1

(

1)

into (31), there appear terms of the form:

1 : 1 ;2 : 2 =

1:2

(

1)

1:1

2:1 2:2

(32)

Now, the scalar product of the two orthogonal states, 2 : 1 and 2 : 2 is zero. Expression (32) is then equal to zero. The same type of reasoning shows that the terms that arise from the operators ( 2 ), 2 2 1 , 2 2 2 are also zero, since each of these operators acts only in the single-electron spaces while the state of the two electrons is different in the ket and bra of (31). Finally, there remains: 2

= 1474

1 : 2 ;2 : 1

R1

R2

1 : 1 ;2 : 2

(33)



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

therefore involves only the electrostatic repulsion between the electrons. Let (r) be the wave functions associated with the states stationary states of an electron in the central potential ): (r) = In the

(34)

r representation, the calculation of

r

(the

from (33) yields: 2

=

d3

1

d3

2

2 0 0 (r1 )

1 0 0 (r2 )

r1

r2

1 0 0 (r1 )

2 0 0 (r2 )

(35)

This integral is called the “exchange integral”. We shall not calculate it explicitly here; we point out, however, that it is positive. .

The physical origin of the energy difference between the two spectral terms

We see from expressions (26) and (27) that the origin of the energy separation of the 3 and 1 terms lies in the symmetry differences of the orbital parts of these terms. As we emphasized at the end of § 2-a, a triplet term ( = 1) must have an orbital part (0) which is antisymmetric under exchange of the two electrons ; hence the sign before 21 in (26a) and (27a). On the other hand, a singlet term ( = 0) must have a symmetric orbital part [+ sign in (26b) and (27b)]. This explains the relative position of the 3 and 1 terms shown in Figure 2. For the singlet term, the orbital wave function is symmetric with respect to exchange of the two electrons, which then have a non-zero probability of being at the same point in space. This is why the electrostatic repulsion, which gives an energy of 2 12 which is large when the electrons are near each other, significantly increases the singlet state energy. On the other hand, for the triplet state, the orbital function is antisymmetric with respect to exchange of the two electrons, which then have a zero probability of being at the same point in space. The average value of the electrostatic repulsion is then smaller. Therefore, the energy difference between the singlet and triplet states arises from the fact that the correlations between the orbital variables of the two electrons depend, because of the symmetrization postulate, on the value of the total spin. .

Analysis of the role played by the symmetrization postulate

At this point in the discussion, it might be thought that the degeneracy of a configuration is removed by the symmetrization postulate. We now show5 that this is not the case. This postulate merely fixes the value of the total spin of the terms arising from a given configuration (because of the inter-electron electrostatic repulsion). To see this, imagine for a moment that we do not need to apply the symmetrization postulate. Suppose, for example, that the two electrons are replaced by two particles (fictitious, of course) of the same mass, the same charge and the same spin as the electrons but with another intrinsic property that permits us to distinguish between them [without, however, changing the Hamiltonian of the problem, which is still given by formula (1)]. Since is not spin-dependent and we do not have to apply the symmetrization postulate, we can ignore the spins completely until the end of the calculations, and then multiply the degeneracies obtained by 4. The energy level of 0 corresponding to 5 See

also comment (i) of § C-4-a-

of Chapter XIV, p. 1442.

1475

COMPLEMENT BXIV



the 1 , 2 configuration is two-fold degenerate from the orbital point of view, because two orthogonal states 1 : 1 ; 2 : 2 and 1 : 2 ; 2 : 1 correspond to it (they are different physical states since the two particles are of different natures). To study the effect of , we must diagonalize in the two-dimensional space spanned by these two kets. The corresponding matrix can be written: (36) where and are given by (30) and (31) [the two diagonal elements of (36) are equal because is invariant under permutation of the two particles]. Matrix (36) can be diagonalized immediately. The eigenvalues found are + and , associated respectively with the symmetric and antisymmetric linear combinations of the two kets 1 : 1 ;2 : 2 and 1 : 2 ; 2 : 1 . The fact that these orbital eigenstates have well-defined symmetries relative to exchange of the two particles has nothing to do with (0) Pauli’s principle. It arises only from the fact that commutes with 21 (common (0) eigenstates of and 21 can therefore be found). When the two particles are not identical, we obtain the same arrangement of levels and the same orbital symmetry as before. On the other hand, the degeneracy of the levels is obviously different: the lower level, with energy , can have a total spin of either = 0 or = 1, as can the upper level. If we return to the real helium atom, we now see very clearly the role played by Pauli’s principle. It is not responsible for the splitting of the initial level 1 , 2 into the two energy levels + and , since this splitting would also appear for two particles of different natures. Similarly, the symmetric or antisymmetric character of the orbital part of the eigenvectors is related to the invariance of the electrostatic interaction under permutation of the two electrons. Pauli’s principle merely forbids the lower state to have a total spin = 0 and the upper state to have a total spin = 1, since the corresponding states would be globally symmetric, which is unacceptable for fermions. .

The effective spin-dependent Hamiltonian We replace =

by the operator:

+ S1 S2

(37)

where S1 and S2 denote the two electron spins. We also have: =

3 ~2 + S2 4 2

so that the eigenstates of are the triplet states, with the eigenvalue the singlet state, with the eigenvalue 3 ~2 4. Therefore, if we set: = = 1476

2 2 ~2

(38) + ~2 4, and

(39)



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

we obtain, by diagonalizing , the same eigenstates and eigenvalues we found above6 . We can then consider that it is as if the perturbation responsible for the appearance of the terms were (the “effective” Hamiltonian), which is of the same form as the magnetic interaction between two spins. However, one should not conclude that the coupling energy between the electrons, which is responsible for the appearance of the two terms, is of magnetic origin: two magnetic moments equal to that of the electron and placed at a distance of the order of 1 ˚ A from each other would have an interaction energy much smaller than . However, because of the very simple form of , this effective Hamiltonian is often used instead of An analogous situation arises in the study of ferromagnetic materials. In these substances, the electron spins tend to align themselves parallel to each other. Since the spin state is then completely symmetric, Pauli’s principle requires the orbital state to be completely antisymmetric. For the same reasons as for the helium atom, the electronic repulsion energy is then minimal. When we study such phenomena, we often use effective Hamiltonians of the same type as (37). However, it must be noted that the physical interaction which is at the origin of the coupling is again electrostatic and not magnetic.

Comments:

( ) The 1 , 2 configuration can be treated in the same way. We then have = 1, so that = +1, 0 or 1. As for the 1 , 2 configuration, the shells occupied by the two electrons are different, so that the two terms 3 and 1 exist simultaneously. The first one is nine-fold degenerate, and the second, three-fold. It can be shown, as above, that the 3 term has an energy lower than that of the 1 term, and the difference between the two energies is proportional to an exchange integral which is analogous to the one written in (35). We would proceed in the same way for all other configurations of the type 1 , . ( ) We have treated

like a perturbation of 0 . For this approach to be coherent, the energy shifts associated with [for example, the exchange integral written in (35)] must be much smaller than the energy differences between configurations. Actually, this is not the case. For the 1 , 2 and 1 , 2 configurations, for example, while the 3 energy difference ∆ (1 ) in the 1 , 2 configuration is of the order of 0.8 eV, the minimum distance between levels is ∆ [(1 2 )3 (1 2 )1 ] 0 35 eV. We might therefore believe that it is not valid to treat like a perturbation of 0 . However, the approach we have given is correct. This is due to the fact that, for all configurations of the type 1 , , we have = . Therefore , which according to (10) commutes with L, has zero matrix elements between the states of the 1 , 2 configuration and those of the 1 , 2 configuration, since they correspond to different values of . The operator couples a 1 , configuration only to configurations with distinctly higher energies, of the 1 , type with = (only the values of are different) or of the , type, with and different from 1 (the angular momenta and can be added to give ).

6 We

must, obviously, keep only the eigenvectors of

that belong to

.

1477



COMPLEMENT BXIV

3.

Fine-structure levels; multiplets

Thus far, we have taken into account in the Hamiltonian only interactions of purely electrostatic origin; we have neglected all effects of relativistic and magnetic origin. Actually, such effects exist, and we have already studied them in the case of the hydrogen atom (cf. Chap. XII, § B-1), where they arise from the variation of the electron mass with the velocity, from the L S spin-orbit coupling, and from the Darwin term. For helium, the situation is more complicated because of the simultaneous presence of two electrons. For example, there is a spin-spin magnetic coupling term in the Hamiltonian (cf. Complement BXI ) which acts in both the spin state space and the orbital state space of the two electrons7 . Nevertheless, a great simplification arises from the fact that the energy differences associated with these couplings of relativistic and magnetic origin are much weaker than those which exist between two different spectral terms. This enables us to treat the corresponding Hamiltonian (the fine-structure Hamiltonian) like a perturbation. The detailed study of the fine structure levels of helium falls outside the domain of this complement. We shall confine ourselves to describing the symmetries of the problem and indicating how to distinguish between the different energy levels. We shall use the fact that the fine-structure Hamiltonian is invariant under a simultaneous rotation of all the orbital and spin variables. This means (cf. Complement BVI , § 6) that, if J denotes the total angular momentum of the electrons: J=L+S

(40)

we have: [

J] = 0

(41)

On the other hand, the fine-structure Hamiltonian changes if the rotation acts only on the orbital variables or only on the spins: [

L] =

[

S] = 0

(42)

These properties can easily be seen for the operators ( )L S , for example, or for the dipole-dipole magnetic interaction Hamiltonian (cf. Complement BXI ). The state space associated with a term is spanned by the ensemble of states ; ; ; written in (19), where and are fixed, and where: 6

6+

6

6+

(43)

In this subspace, it can be shown that J2 and form a C.S.C.O. which, according to (41), commutes with . The eigenvectors common to J2 [eigenvalue 2 ( + 1)~ ] and (eigenvalue ~) are therefore necessarily eigenvectors of , with an eigenvalue that depends on but not on (this last property arises from the fact 7 See for example § 19.6 in Sobel’man (11.12) for an explicit expression of the different terms of the fine structure Hamiltonian (Breit Hamiltonian).

1478



ENERGY LEVELS OF THE HELIUM ATOM. CONFIGURATIONS, TERMS, MULTIPLETS

1P 1 1P

0.25 eV 1s 2p

3

P0

3P

1.2 10–4 eV 3P 1

1 10–5 eV

3P 2

Figure 3: The relative position of the spectral terms and multiplets arising from the 1 2 configuration of the helium atom (the splitting of the three multiplets 3 0 , 3 1 , 3 2 has been greatly exaggerated in order to make the figure clearer).

that commutes with + and ). According to the general theory of addition of angular momenta, the possible values of are: =

+

+

1

+

2

(44)

The effect of is therefore a partial removal of the degeneracy. For each “term”, there appear as many distinct levels as there are different values of , according to relation (44). Each of these levels is (2 + 1)-fold degenerate and is called a “multiplet”. The usual spectroscopic notation consists of denoting a multiplet by adding a right lower index equal to the value of to the symbol representing the term from which it arises. For example, the ground state of the helium atom gives a single multiplet, 1 0 . Similarly, each of the terms 1 and 3 of the 1 , 2 configuration leads to a single multiplet: 1 0 and 3 1 , respectively. On the other hand, the 3 term arising from 1 , 2 yields three multiplets, 3 2 , 3 1 and 3 0 (cf. Fig. 3), and so on. We point out that the measurement and theoretical calculation of the fine structure of the 3 level of the 1 , 2 configuration is of great fundamental interest, since it can lead to the very precise knowledge of the “fine structure constant”, = 2 ~ .

1479

COMPLEMENT BXIV



Comments:

( ) For many atoms, the fine-structure Hamiltonian is essentially given by: (

)L S

(45)

=1

where R , L and S denote the positions, angular momenta and spins of each of the electrons. It can then be shown, using the Wigner-Eckart theorem (cf. Complement DX ), that the energy of the multiplet is proportional to ( + 1) ( + 1) ( + 1). This result is sometimes called the “Landé interval rule”. For helium, the 3 1 and 3 2 levels arising from the 1 , 2 configuration are much closer than would be predicted by this rule. This arises from the importance of the dipole-dipole magnetic coupling of the spins of the two electrons.

( ) In this complement, we have neglected the “hyperfine effects” related to nuclear spin (cf. Chap. XII, § B-2). Such effects actually exist only for the 3 He isotope, whose nucleus has a spin = 1 2 (the nucleus of the 4 He isotope has a zero spin). Each multiplet of electronic angular momentum splits, in the case of 3 He, into two hyperfine levels of total angular momentum = 1 2, (2 + 1)-fold degenerate (unless, of course, = 0). References and suggestions for further reading:

Kuhn (11.1), Chap. III-B; Slater (11.8), Chap. 18; Bethe and Salpeter (11.10). Multiplet theory and the Pauli principle: Landau and Lifshitz (1.19), §§ 64 and 65; Slater (1.6), Chap. 7 and (11.8), Chap. 13; Kuhn (11.1), Chap. V, § A; Sobel’man (11.12), Chap. 2, § 5.3.

1480



PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

Complement CXIV Physical properties of an electron gas. Application to solids

1

2

Free electrons enclosed in a box . . . . . . . . . . . . 1-a Ground state of an electron gas; Fermi energy . 1-b Importance of the electrons with energies close to 1-c Periodic boundary conditions . . . . . . . . . . . . . Electrons in solids . . . . . . . . . . . . . . . . . . . . 2-a Allowed bands . . . . . . . . . . . . . . . . . . . . . 2-b Position of the Fermi level and electric conductivity

. . . . . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

1481 1481 1484 1489 1491 1491 1492

In Complements AXIV and BXIV , we studied, taking the symmetrization postulate into account, the energy levels of a small number of independent electrons placed in a central potential (the shell model of many-electron atoms). Now, we shall consider systems composed of a much larger number of electrons, and we shall show that Pauli’s exclusion principle has an equally spectacular effect on their behavior. To simplify the discussion, we shall neglect interactions between electrons. Moreover, we shall assume, at first (§ 1), that they are subjected to no external potential other than the one that restricts them to a given volume and which exists only in the immediate vicinity of the boundary (a free-electron gas enclosed in a “box”). We shall introduce the important concept of the Fermi energy , which depends only on the number of electrons per unit volume. We shall also show that the physical properties of the electron gas (specific heat, magnetic susceptibility, ...) are essentially determined by the electrons whose energy is close to . A free-electron model describes the principal properties of certain metals rather well. However, the electrons of a solid are actually subjected to the periodic potential created by the ions of the crystal. We know that the energy levels of each electron are then grouped into allowed energy bands, separated by forbidden bands (cf. Complements FXI and OIII ). We shall show qualitatively in § 2 that the electric conductivity of a solid is essentially determined by the position of the Fermi level of the electron system relative to the allowed energy bands. Depending on this position, the solid is an insulator or a conductor. 1. 1-a.

Free electrons enclosed in a box Ground state of an electron gas; Fermi energy

Consider a system of electrons, whose mutual interactions we shall neglect, and which, furthermore, are subjected to no external potential. These electrons, however, are enclosed in a box, which, for simplicity, we shall choose to be a cube with edges of length If the electrons cannot pass through the walls of the box, it is because the walls constitute practically infinite potential barriers. Since the potential energy of the electrons is zero inside the box, the problem is reduced to that of the three-dimensional 1481

COMPLEMENT CXIV



infinite square well (cf. Complements GII and HI ). The stationary states of a particle in such a well are described by the wave functions: (r) =

3 2

2

sin

sin

=1 2 3

sin

(1a) (1b)

[expression (1a) is valid for 0 6 region]. The energy associated with

6 , since the wave function is zero outside this is equal to:

2 2

=

}

2

2

(

2

+

2

+

2

)

(2)

Of course, the electron spin must be taken into account: each of the wave functions (1) describes the spatial part of two distinct stationary states which differ by their spin orientation; these two states correspond to the same energy, since the Hamiltonian of the problem is spin-independent. The set of these stationary states constitutes a discrete basis, enabling us to construct any state of an electron enclosed in this box (that is, whose wave function goes to zero at the walls). Note that, by increasing the dimensions of the box, we can make the interval between two consecutive individual energies as small as we wish, since this interval is inversely proportional to 2 . If is sufficiently large, therefore, we cannot, in practice, distinguish between the discrete spectrum (2) and a continuous spectrum containing all the positive values of the energy. The ground state of the system of the independent electrons can be obtained by antisymmetrizing the tensor product of the individual states associated with the lowest energies compatible with Pauli’s principle. If is small, it is thus simple to fill the first individual levels (2) and to find the ground state of the system, as well as its degree of degeneracy and the antisymmetrized kets that correspond to it. However, when is much larger than 1 (in a macroscopic solid, is of the order of 1023 ), this method cannot be used in practice, and we must follow a more global reasoning. We shall begin by evaluating the number ( ) of individual stationary states whose energies are lower than a given value . To do so, we shall write expression (2) for the possible energies in the form: =

~2 2 k 2

(3)

with: (k

) =

(k

) =

(k

) =

(4)

According to (1), a vector k corresponds to each function (r). Conversely, to each of these vectors, there corresponds one and only one function . The number of states ( ) can then be obtained by multiplying by 2 the number of vectors k whose modulus is smaller than 2 ~2 (the factor 2 arises, of course, 1482



PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

from the existence of electron spin). The tips of the vectors k divide k-space into elementary cubes of edge (see Figure 1, in which, for simplicity, a two-dimensional rather than a three-dimensional space is shown). Each of these tips is common to eight neighboring cubes, and each cube has eight corners. Consequently, if the elementary cubes are sufficiently small (that is, if is sufficiently large), there can be considered to be one vector k per volume element ( )3 of k-space.

(k)y

π/L (k)x

0 π/L

Figure 1: Tips of the vectors k characterizing the stationary wave functions in a two-dimensional infinite square well.

The value of the energy which we have chosen defines, in k-space, a sphere centered at the origin, of radius 2 ~2 . Only one-eigth of the volume of this sphere is involved, since the components of k are positive [cf. (1b) and (4)]. If we divide it by the volume element ( )3 associated with each stationary state, and if we take into account the factor 2 due to the spin, we obtain: ( )=2

14 83

3 2

2 ~2

3

1 (

)3

=

3

2

2 ~2

3 2

(5)

This result enables us to calculate immediately the maximal individual energy of an electron in the ground state of the system, that is, the Fermi energy of the electron gas. This energy satisfies: (

)=

(6)

which gives: =

~2 2

2 3

3

2 3

(7) 1483



COMPLEMENT CXIV

3 Note that, as might be expected, the Fermi energy depends only on the number of electrons per unit volume. At absolute zero, all the individual states of energy less than are occupied, and all those whose energies are greater than are empty. We shall see in § 1-b what happens at non-zero temperatures. We can also deduce the density of states ( ) from (5); by definition, ( )d is the number of states whose energies are included between and + d . This density of states, as we shall see later, is of considerable physical importance. It can be obtained simply by differentiating ( ) with respect to :

( )=

d ( ) = d 2

3 2

2 ~2

3 2 1 2

(8)

( ) therefore varies like . At absolute zero, the number of electrons with a given energy between and + d (less than , of course) is equal to ( )d . By using the value (7) of the Fermi energy , we can put ( ) in the form: ( )=

3 2

1 2 3 2

(9)

Comment: It can be seen from (5) that the dimensions of the box are involved only through the intermediary of the volume element ( )3 associated, in k-space, with each stationary state. If, instead of choosing a cubic box of edge , we had considered a parallelepiped of edges 1 , 2 , 3 , we would have obtained a volume element of 3 1 2 3 : only the volume 1 2 3 of the box, therefore, enters into the density of states. This result can be shown to remain valid, whatever the exact form of the box, provided it is sufficiently large. 1-b.

Importance of the electrons with energies close to

The results obtained in the preceding section make it possible to understand the physical properties of a free electron gas. We shall give two simple examples here, that of the specific heat and that of the magnetic susceptibility of the system. We shall confine ourselves, however, to semi-quantitative arguments which simply illustrate the fundamental importance of Pauli’s exclusion principle. .

Specific heat

At absolute zero, the electron gas is in its ground state: all the individual levels of energy less than are occupied, and all the others are empty. Taking into account the form (8) of the density of states ( ), we can represent the situation schematically as in Figure 2a: the number ( ) d of electrons with an energy between and + d is ( ) d for and zero for . What happens if the temperature is low but not strictly zero? If the electrons obeyed classical mechanics, each of them, in going from absolute zero to the temperature , would gain an energy of the order of (where is the 1484



PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

v(E)

≃ kT

v(E)

0

EF a

E

0

EF

E

b

Figure 2: Variation of ( ) with respect to [ ( )d is the number of electrons with energy between and + d ]. At absolute zero, all the levels whose energies are less than the Fermi energy are occupied (fig. a). At a slightly higher temperature , the transition between empty and occupied levels occurs over an energy interval of a few (fig. b).

Boltzmann constant). The total energy per unit volume of the electron gas would then be approximately: ( )

(10)

3

This would lead to a specific heat at constant volume that is independent of the temperature. In reality, the physical phenomena are totally different, since Pauli’s principle prevents most of the electrons from gaining energy. For an electron whose initial energy is much less than (more precisely, if ), the states to which it could go if its energy increased by are already occupied and are therefore forbidden to it. Only electrons having an initial energy close to ( ) can “heat up”, as shown by Figure 2b. The number of these electrons is approximately: ∆

(

)

=

3 2

[according to (9)]. Since the energy of each one increases by about per unit volume can be written: ( )

(11) , the total energy

(12)

3

instead of the classical expression (10). Consequently, the constant volume specific heat is proportional to the absolute temperature : =

3

(13) 1485



COMPLEMENT CXIV

For a metal, to which the free-electron model can be applied, is typically on the order of a few eV. Since is about 0.03 eV at ordinary temperatures, we see that in this case the factor introduced by Pauli’s principle is of the order of 1/100.

Comments: ( ) In order to calculate the specific heat of the electron gas quantitatively, we must know the probability ( ) for an individual state of energy to be occupied when the system is at thermodynamic equilibrium at the temperature . The number ( ) d of electrons whose energies are included between and +d is then: ( )d

= (

) ( )d

(14)

It is shown in statistical mechanics (Complement BXV , § 2-a) that, for fermions, the function ( ) can be written: (

)=

1 e(

)

+1

(15)

where is the chemical potential(Appendix VI), also called the Fermi level of the system. This is the Fermi-Dirac distribution. The Fermi level is determined by the condition that the total number of electrons must be equal to : + 0

( )d e(

)

+1

=

(16)

depends on the temperature, but it can be shown that it varies very slowly for small . The shape of the function ( ) is shown in Figure 3. At absolute zero, ( 0) is equal to 1 for and to 0 for (“step” function). At non-zero temperatures, ( ) has the form of a rounded “step” (the energy interval over which it varies is of the order of a few as long as ). For a free electron gas, it is clear that the Fermi level at absolute zero coincides with the Fermi energy calculated in § 1-a. According to (14) and the form that ( ) takes for = 0 (Fig. 3), then characterizes, like , the highest individual energy. On the other hand, for a system with a discrete spectrum of energies ( 1 , 2 , , ), the Fermi level obtained from formula (16) does not coincide with the highest individual energy in the ground state at absolute zero. In this case, the density of states is composed of a series of “delta functions” centered at Consequently, at absolute zero, can take on any value between 1 2 and +1 , since, according to (14), all these possibilities lead to the same value of ( ). We choose to define at absolute zero as the limit of ( ) as approaches zero. Since at non-zero temperatures the level empties a little, and and +1 begins to fill, the limit of ( ) is found to be a value between +1 (halfway between these two values if the two states and +1 have the same degree of degeneracy). Similarly, for a system containing a series of allowed energy bands separated by forbidden bands (electrons of a solid; cf. Complement FXI ), the Fermi level is in a forbidden band when the highest individual energy at absolute zero coincides with the upper limit of an allowed band. On the other hand, the Fermi level is equal to when falls in the middle of an allowed band.

1486



PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

f (E, T) 1

1 2

0

μ

E

Figure 3: Plot of the Fermi-Dirac distribution at absolute zero (dashed line) and at low temperatures (solid line). For an electron gas at absolute zero, the Fermi level coincides with the Fermi energy . The curves in Figure 2 can be obtained by multiplying the density of states ( ) by ( ).

( ) The preceding results explain the behavior of the specific heat of metals at very low temperatures. At ordinary temperatures, the specific heat is essentially due to vibrations of the ionic lattice (cf. Complement LV ), since that of the electron gas is practically negligible. However, the specific heat of the lattice approaches zero as 3 for small . Therefore, that of the electron gas becomes preponderant at low temperatures (around 1 K) where, for metals, a decrease that is linear with respect to is actually observed.

.

Magnetic susceptibility

Now suppose that a free electron gas is placed in a uniform magnetic field B parallel to . The energy of an individual stationary state then depends on the corresponding spin state, since the Hamiltonian contains a paramagnetic spin term (cf. Chap. IX, § A-2): =

2

(17)

~ is the Bohr magneton:

where =

~

(18)

2

and S is the electron spin operator. For the sake of simplicity, we shall treat (17) as the only additional term in the Hamiltonian (the behavior of the spatial wave functions was studied in detail in Complement EVI ). Under these conditions, the stationary states remain the same as in the absence of a magnetic field, and the corresponding energy is increased or decreased by depending on the spin state. The densities of states + ( ) and ( ) corresponding respectively to the spin states + and can therefore be obtained very simply from the density ( ) calculated in § 1-a: ( )=

1 ( 2

)

(19) 1487

COMPLEMENT CXIV

• ρ+(E)

2 μB B E 0

EF

ρ–(E)

Figure 4: The densities of states + ( ) and ( ) corresponding respectively to the spin states + and ( is negative). At absolute zero, only the states whose energies are less than are occupied.

Thus, at absolute zero, we arrive at the situation shown in Figure 4. Since the magnetic energy B is much smaller than , the difference between the number of electrons whose spins are antiparallel to the magnetic field and the number whose spins are parallel to B is practically, at absolute zero: 1 ( 2

+

)2

(20)

The magnetic moment 1

= =

3

(

1

2

3

per unit volume can therefore be written: +)

(

)

(21)

This magnetic moment is proportional to the applied field, so that the magnetic susceptibility per unit volume is equal to: =

=

2

1 3

(

)

(22)

or, using expression (9) for ( ): = 1488

3 2

2 3

(23)



PHYSICAL PROPERTIES OF AN ELECTRON GAS. APPLICATION TO SOLIDS

Comments:

( ) We have assumed the system to be at absolute zero, but result (23) remains valid at low temperatures, since the modifications of the number of occupied states (Fig. 2b) are practically the same for both spin orientations. We therefore find a temperature-independent magnetic susceptibility. This is indeed what is observed for metals. ( ) As in the preceding section, we see that the system behavior in the presence of a magnetic field is essentially determined by the electrons whose energies are close to . This is another manifestation of Pauli’s principle. When the magnetic field is applied, the electrons in the + spin state tend to go into the state, which is energetically more favorable. But most of them are prevented from doing so by the exclusion principle, since all the neighbouring states are already occupied. 1-c.

.

Periodic boundary conditions

Introduction

The functions given by formula (1a) have a completely different structure from that of the plane waves e k r which usually describe the stationary states of free electrons. This difference arises solely from the boundary conditions imposed by the walls of the box, since, inside the box, the plane waves satisfy the same equation as the : ~2 ∆ (r) = 2

(r)

(24)

The functions (1a) are less convenient to handle than plane waves; this is why the latter are preferably used. To do so, we impose on the solutions of equation (24) new, artificial, boundary conditions which do not exclude plane waves. Of course, since these conditions are different from those actually created by the walls of the box, this changes the physical problem. However, we shall show in this section that we can find the most important physical properties of the initial system in this way. For this to be true, it is necessary for the new boundary conditions to lead to a discrete set of possible values of k such that: ( ) The system of plane waves corresponding to these values of k constitutes a basis on which can be expanded any function whose domain is inside the box. ( ) The density of states ( ) associated with this set of values of k is identical to the density of states ( ) calculated in § 1-a from the true stationary states. Of course, the fact that the new boundary conditions are different from the real conditions means that the plane waves cannot correctly describe what happens near the walls (surface effects). However, it is clear that they can, because of condition ( ), lead to a very simple explanation of the volume effects, which, according to what we have seen in § 1-b, depend only on the density of states ( ). Moreover, because of condition (i), the motion of any wave packet far from the walls can be correctly described by superposing plane waves, since, between two collisions with the walls, the wave packet propagates freely. 1489

COMPLEMENT CXIV

.



The Born-von Karman conditions

We shall no longer require the individual wave functions to go to zero at the walls of the box, but, rather, to be periodic with a period : ( +

)= (

)

(25)

with analogous relations in and . Wave functions of the form e k r satisfy these conditions if the components of the vector k satisfy: =

2

=

2

=

2

(26)

where, now, , and are positive or negative integers or zero. We therefore introduce a new system of wave functions: (r) =

1 3 2

e

2

(

)

(27)

which are normalized inside the volume of the box. The corresponding energy, according to (24), can be written: =

~2 2

4

2 2

(

2

+

2

+

2

)

(28)

Any wave function defined inside the box can be extended into a periodic function in of period . Since this periodic function can always be expanded in a Fourier series (cf. Appendix I, § 1-b), the (r) system constitutes a basis for wave functions with a domain inside the box. To each vector k , whose components are given by (26), there corresponds a well-defined value of the energy , given by (28). Note, however, that the vectors k can now have positive, negative or zero components, and that their tips divide space into elementary cubes whose edges are twice that found in § 1-a. In order to show that boundary conditions (25) lead to the same physical results (as far as the volume effects are concerned) as those of § 1-a, it suffices to calculate the number ( ) of stationary states of energy less than , and find the value (5) [the Fermi energy and the density of states ( ) can be derived directly from ( )]. We evaluate ( ) in the same way as in § 1-a, taking into account the new characteristics of the vectors k . Since the components of k can now have arbitrary signs, the ~2 must no longer be divided by 8. However, this volume of the sphere of radius 2 modification is compensated by the fact that the volume element (2 )3 associated with each of the states (27) is eight times larger than the one corresponding to the boundary conditions of § 1-a. Consequently, ( ) is the same as expression (5) for ( ). The periodic boundary conditions (25) therefore permit us to meet conditions ( ) and ( ) of the preceding section. They are usually called the Born-Von Karman conditions (“B.V.K. conditions”).





Comment:

Consider a truly free electron (not enclosed in a box). The eigenfunctions of the three components of the momentum P (and, consequently, those of the Hamiltonian \(H_0 = \mathbf P^2/2m_e\)) form a "continuous basis":
\[
\left(\frac{1}{2\pi\hbar}\right)^{3/2} e^{\,i\,\mathbf p\cdot\mathbf r/\hbar} \tag{29}
\]

We have already indicated several times that the states for which the form (29) is valid in all space are not physical states, but can be used as mathematical intermediaries in studying the physical states, which are wave packets. We sometimes prefer to use the discrete basis (27) rather than the continuous basis (29). To do so, we consider the electron to be enclosed in a fictitious box of edge L, much larger than any dimension involved in the problem, and we impose the B.V.K. conditions. Any wave packet, which will always be inside the box for sufficiently large L, can be expanded just as well on the discrete basis (27) as on the continuous basis (29). The states (27) can therefore, like the states (29), be considered to be intermediaries of the calculation; however, they present the advantage of being normalized inside the box. We must, of course, check, at the end of the calculations, that the various physical quantities obtained (transition probabilities, cross sections, ...) do not depend on L, provided that L is sufficiently large. Obviously, for a truly free electron, L has no physical meaning and can be arbitrary, as long as it is sufficiently large for the states (27) to form a basis on which the wave packets involved in the problem can be expanded [condition (i) of § 1-c]. On the other hand, in the physical problem we are studying here, L³ is the volume inside which the electrons are actually confined, and L consequently has a definite value.

2. Electrons in solids

2-a. Allowed bands

The model of a free electron gas enclosed in a box can be applied rather well to the conduction electrons of a metal. These electrons can be considered to move freely inside the metal, the electrostatic attraction of the crystalline lattice preventing them from escaping when they approach the surface of the metal. However, this model does not explain why some solids are good electrical conductors while others are insulators. This is a remarkable experimental fact: the electric properties of crystals are due to the electrons of the atoms of which they are composed; yet the intrinsic conductivity can vary by a factor of 10^30 between a good insulator and a pure metal. We shall see, in a very qualitative way, how this can be explained by Pauli's principle and by the existence of energy bands arising from the periodic nature of the potential created by the ions (cf. Complements O_III and F_XI). We showed in Complement F_XI that if, in a first approximation, we consider the electrons of a solid to be independent, their possible individual energies are grouped into allowed bands, separated by forbidden bands. Assuming that each electron is subjected to the influence of a linear chain of regularly spaced positive ions, we found, in the strong-bond approximation, a series of bands, each one containing 2N levels, where N is the number of ions (the factor 2 arises from the spin).
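A minimal numerical sketch (not from the original text) of the strong-bond picture just recalled: for a linear chain of N ions with B.V.K. conditions, the band dispersion has the standard tight-binding form E(k) = E₀ − 2A cos(ka); E₀, A and a are arbitrary illustrative values, and the band indeed contains 2N states once spin is included.

```python
import numpy as np

# One-dimensional strong-bond (tight-binding) band, E(k) = E0 - 2A cos(ka).
N  = 8       # number of ions in the chain (illustrative)
a  = 1.0     # lattice spacing (arbitrary units)
E0 = 0.0     # atomic level
A  = 1.0     # coupling (transfer) matrix element

# B.V.K. conditions on the chain allow N values of k in the first Brillouin zone
n = np.arange(-N // 2 + 1, N // 2 + 1)
k = 2.0 * np.pi * n / (N * a)
E = E0 - 2.0 * A * np.cos(k * a)

print(np.sort(E))                      # N levels spread over a band of width 4A
print(2 * len(E), "states in the band")  # factor 2 from spin
```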




The situation, of course, is more complex in a real crystal, in which the positive ions occupy the nodes of a three-dimensional lattice. The theoretical understanding of the properties of a solid requires a detailed study of the energy bands, a study which is based on the spatial characteristics of the crystalline lattice. We shall not treat these specific problems of solid state physics in detail; we shall content ourselves with a qualitative discussion of the phenomena.

2-b. Position of the Fermi level and electric conductivity

Knowing the band structure and the number of states per band, we obtain the ground state of the electron system of a solid by successively "filling" the individual states of the various allowed bands, beginning, of course, with the lowest energies. The electron system is really in the ground state only at absolute zero. However, as we pointed out in § 1-b, the characteristics of this ground state permit a semi-quantitative understanding of the behavior of the system at non-zero temperatures, often up to ordinary temperatures. Like the thermal and magnetic properties (cf. § 1-b), the electrical properties of the system are principally determined by the electrons whose individual energies are very close to the highest value E_F. If we place the solid in an electric field, an electron whose initial energy is much lower than E_F cannot gain energy by being accelerated, since the states it would reach in this way are already occupied. It is therefore essential to know the position of E_F relative to the allowed energy bands. First of all, we shall assume (Fig. 5a) that E_F falls in the middle of an allowed band. The Fermi level μ is then equal to E_F [cf. comment (i) of § 1-b]. The electrons whose energies are close to E_F can easily be accelerated, in this case, since the slightly higher energy states are empty and accessible. Consequently, a solid for which the Fermi level falls in the middle of an allowed band is a conductor. The electrons with the highest energies then behave approximately like free particles. Consider, on the other hand, a solid for which the ground state is composed of entirely occupied allowed bands (Fig. 5b). E_F is then equal to the upper limit of an allowed band, and the Fermi level falls inside the adjacent forbidden band [cf. comment (i) of § 1-b]. In this case, no electrons can be accelerated, since the energy states immediately above theirs are forbidden. Therefore, a solid for which the Fermi level falls inside a forbidden band is an insulator. The larger the interval ∆E between the last occupied band and the first empty allowed band, the better the insulator. We shall return to this point later. The deep allowed bands, completely occupied by electrons and, consequently, inert from an electrical and thermal point of view, are called valence bands. They are generally narrow. In a "strong-bond" model (cf. Complement F_XI, § 2), these bands arise from the atomic levels of lowest energies, which are only slightly affected by the presence of the other atoms in the crystal. On the other hand, the higher bands are wider; a partially occupied band is called a conduction band. For a solid to be a good insulator, the last occupied band must not only be entirely full in the ground state, but also separated from the immediately higher allowed band by a sufficiently wide forbidden band. As we have indicated (§ 1-b), at non-zero temperatures, some states of energy lower than E_F can empty, while some higher energy states fill (Fig. 2b). For the solid to remain an insulator at the temperature T, the width ∆E of the forbidden band, which prevents this excitation of electrons, must be much larger than k_B T. If ∆E is less than or of the order of k_B T, a certain number of electrons leave





Figure 5: Schematic representation of the individual levels occupied by the electrons at absolute zero (in grey). E_F is the highest individual energy. In a conductor (fig. a), E_F (which then coincides with the Fermi level μ) falls inside an allowed band, called the "conduction band". The electrons whose energies are near E_F can then be accelerated easily, since the slightly higher energy states are accessible to them. In an insulator (fig. b), E_F falls on the upper boundary of an allowed band called the "valence band" (the Fermi level μ is then situated in the adjacent forbidden band). The electrons can be excited only by crossing the forbidden band. This requires an energy at least equal to the width ∆E of this band.

the last valence band to occupy states of the immediately higher allowed band (which would be completely empty at absolute zero). The crystal then possesses conduction electrons, but in restricted numbers: it is a semiconductor (such a semiconductor is called intrinsic; see comment below). For example, diamond, for which ∆E is close to 5 eV, remains an insulator at ordinary temperatures, while silicon and germanium, although quite similar to diamond, are semiconductors: their forbidden bands have a width ∆E less than 1 eV. These considerations, while very qualitative, enable us to understand why the electrical conductivity of a semiconductor increases very rapidly with the temperature; with more quantitative arguments, we indeed find a dependence of the form \(e^{-\Delta E/2k_BT}\). The properties of semiconductors also reveal an apparently paradoxical phenomenon. It is as if, in addition to the electrons which have crossed the forbidden band ∆E at a temperature T, there existed in the crystal an equal number of particles with a positive charge. These particles also contribute to the electric current, but their contribution to the Hall effect¹, for example, is opposite in sign to what would be expected for electrons.

¹ Recall what the Hall effect is: in a sample carrying a current and placed in a magnetic field perpendicular to this current, the moving charges are subjected to the Lorentz force. In the steady state, this causes a transverse electric field to appear (perpendicular to the current and to the magnetic field).
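The size of the activation factor \(e^{-\Delta E/2k_BT}\) makes the diamond/silicon contrast quantitative. The short sketch below (ours, not from the text) evaluates this factor at room temperature for the two gap widths quoted above; the prefactor of the conductivity is deliberately ignored, so only orders of magnitude are meaningful.

```python
import numpy as np

k_B = 8.617e-5          # Boltzmann constant in eV/K
T   = 300.0             # room temperature (K)

# Forbidden-band widths (eV); 5 eV and ~1 eV are the orders of magnitude
# quoted in the text for diamond and for silicon/germanium respectively.
gaps = {"diamond": 5.0, "silicon/germanium": 1.0}

for name, dE in gaps.items():
    factor = np.exp(-dE / (2.0 * k_B * T))   # intrinsic activation factor
    print(f"{name:20s}  exp(-dE/2kT) = {factor:.3e}")

# The ratio of the two factors (~1e34) shows why diamond stays an insulator
# at ordinary temperatures while silicon and germanium are semiconductors.
```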





This can be explained very well by band theory, and constitutes a spectacular demonstration of Pauli’s principle. To understand this qualitatively, we must recall that the last valence band, when it is completely full in the vicinity of absolute zero, does not conduct any current (Pauli’s principle forbids the corresponding electrons from being accelerated). When, by thermal excitation, certain electrons move into the conduction band, they free the states they had occupied in the valence band. These empty states in an almost full band are called “holes”. Holes behave like particles of charge opposite to that of the electron. If an electric field is applied to the system, the electrons remaining in the valence band can move, without leaving this band, and occupy the empty states. In this way, they “fill holes” but also “leave new holes behind them”. Holes therefore move in the direction opposite to that of the electrons, that is, as if they had a positive charge. This very rough argument can be made more precise, and it can indeed be shown that holes are in every way equivalent to positive charge carriers.

[Figure 6: on the left (a: type n), a donor level lies an interval ∆E_d below the conduction band; on the right (b: type p), an acceptor level lies an interval ∆E_a above the valence band; the forbidden band separates the valence and conduction bands.]

Figure 6: Extrinsic semiconductors: donor atoms (fig. a) bring in electrons which move easily into the conduction band, since their ground states are separated from it only by an energy interval ∆E_d which is much smaller than the width of the forbidden band. Acceptor atoms (fig. b) easily capture valence band electrons, since, for this to happen, these electrons need only an excitation energy ∆E_a which is much smaller than that needed to reach the conduction band. This process creates, in the valence band, holes which can conduct current.

Comment: We have been speaking only of chemically pure and geometrically perfect crystals. However, in practice, all solids have imperfections and impurities, which often play an important role, particularly in semiconductors. Consider, for example, a quadrivalent silicon or germanium crystal, in which certain atoms are replaced by pentavalent impurity atoms, such as phosphorus, arsenic or antimony (this often happens, without any important





change in the crystal structure). An atom of such an impurity possesses one too many outer electrons relative to the neighboring silicon or germanium atoms: it is called an electron donor. The binding energy ∆E_d of the additional electron is considerably lower in the crystal than in the free atom (it is of the order of a few hundredths of an eV); this is due essentially to the large dielectric constant of the crystal, which reduces the Coulomb force (cf. Complement A_VII, § 1-a). The result is that the excess electrons brought in by the donor atoms move more easily into the conduction band than do the "normal" electrons which occupy the valence band (Fig. 6a). The crystal thus becomes a conductor at a temperature much lower than would pure silicon or germanium. This conductivity due to impurities is called extrinsic. Analogously, a trivalent impurity (like boron, aluminium or gallium) behaves in silicon or germanium like an electron acceptor: it can easily capture a valence band electron (Fig. 6b), leaving a hole which can conduct the current. In a pure (intrinsic) semiconductor, the number of conduction electrons is always equal to the number of holes in the valence band. An extrinsic semiconductor, on the other hand, can, depending on the relative proportion of donor and acceptor atoms, contain more conduction electrons than holes (it is then said to be of the n-type, since the majority of the charge carriers are negative), or more holes than conduction electrons (p-type semiconductors, with a majority of positive charge carriers). These properties serve as the foundation of numerous technological applications (transistors, rectifiers, photoelectric cells, etc.). This is why impurities are often intentionally added to a semiconductor to modify its characteristics: this is called "doping".

References and suggestions for further reading:

See section 8 of the bibliography, especially Kittel (8.2) and Reif (8.4). For the solid state physics part, see Feynman III (1.2), Chap. 14 and section 13 of the bibliography.





Complement D_XIV

Exercises

1. Let H₀ be the Hamiltonian of a particle. Assume that the operator H₀ acts only on the orbital variables and has three equidistant levels of energies 0, ħω₀ and 2ħω₀ (where ω₀ is a real positive constant) which are non-degenerate in the orbital state space (in the total state space, the degeneracy of each of these levels is equal to 2s + 1, where s is the spin of the particle). From the point of view of the orbital variables, we are concerned only with the subspace spanned by the three corresponding eigenstates of H₀.

a. Consider a system of three independent electrons whose Hamiltonian can be written:
\[
H = H_0(1) + H_0(2) + H_0(3)
\]
Find the energy levels of H and their degrees of degeneracy.

b. Same question for a system of three identical bosons of spin 0.

2. Consider a system of two identical bosons of spin s = 1 placed in the same central potential V(r). What are the spectral terms (cf. Complement B_XIV, § 2-b) corresponding to the 1s², 1s2s and 2p² configurations?

3. Consider the state space of an electron, spanned by the two vectors \(|\varphi_{p_x}\rangle\) and \(|\varphi_{p_y}\rangle\) which represent two atomic orbitals, p_x and p_y, of wave functions \(\varphi_{p_x}(\mathbf r)\) and \(\varphi_{p_y}(\mathbf r)\) (cf. Complement E_VII, § 2-b):
\[
\varphi_{p_x}(\mathbf r) = x\, f(r) = r\,\sin\theta\cos\varphi\; f(r)
\]
\[
\varphi_{p_y}(\mathbf r) = y\, f(r) = r\,\sin\theta\sin\varphi\; f(r)
\]

a. Write, in terms of \(|\varphi_{p_x}\rangle\) and \(|\varphi_{p_y}\rangle\), the state \(|\varphi_{p_\alpha}\rangle\) that represents the p_α orbital pointing in the direction of the xOy plane that makes an angle α with Ox.

b. Consider two electrons whose spins are both in the \(|+\rangle\) state, the eigenstate of S_z of eigenvalue +ħ/2. Write the normalized state vector \(|\psi\rangle\) which represents the system of these two electrons, one of which is in the state \(|\varphi_{p_x}\rangle\) and the other, in the state \(|\varphi_{p_y}\rangle\).

c. Same question, with one of the electrons in the state \(|\varphi_{p_\alpha}\rangle\) and the other one in the state \(|\varphi_{p_\beta}\rangle\), where α and β are two arbitrary angles. Show that the state vector obtained is the same.

d. The system is in the state of question b. Calculate the probability density of finding one electron at (r, θ, φ) and the other one at (r′, θ′, φ′). Show that the electronic density ρ(r, θ, φ) [the probability density of finding any electron at (r, θ, φ)] is symmetric with respect to revolution about the Oz axis. Determine the probability density of having φ − φ′ = φ₀, where φ₀ is given. Discuss the variation of this probability density with respect to φ₀.


4. Collision between two identical particles

The notation used is that of § D-2-a of Chapter XIV.

a. Consider two particles (1) and (2), with the same mass m, assumed for the moment to have no spin and to be distinguishable. These two particles interact through a potential V(r) that depends only on the distance r between them. At the initial time t₀, the system is in the state \(|1 : p\mathbf e_z;\ 2 : -p\mathbf e_z\rangle\). Let U(t₁, t₀) be the evolution operator of the system. The probability amplitude of finding it in the state \(|1 : p\mathbf n;\ 2 : -p\mathbf n\rangle\) at time t₁ is:
\[
F(\mathbf n) = \langle 1 : p\mathbf n;\ 2 : -p\mathbf n\,|\,U(t_1, t_0)\,|\,1 : p\mathbf e_z;\ 2 : -p\mathbf e_z\rangle
\]
Let θ and φ be the polar angles of the unit vector n in a system of orthonormal axes Oxyz. Show that F(n) does not depend on φ. Calculate, in terms of F(n), the probability of finding any one of the particles (without specifying which one) with the momentum p n and the other one with the momentum −p n. What happens to this probability if θ is changed to π − θ?

b. Consider the same problem [with the same spin-independent interaction potential V(r)], but now with two identical particles, one of which is initially in the state \(|p\mathbf e_z, \varepsilon\rangle\), and the other, in the state \(|-p\mathbf e_z, \varepsilon'\rangle\) (the quantum numbers ε and ε′ refer to the eigenvalues +ħ/2 and −ħ/2 of the spin component along Oz). Assume that ε ≠ ε′. Express in terms of F(n) the probability of finding, at time t₁, one particle with momentum p n and spin ε and the other one with momentum −p n and spin ε′. If the spins are not measured, what is the probability of finding one particle with momentum p n and the other one with momentum −p n? What happens to these probabilities when θ is changed to π − θ?

c. Treat problem b for the case ε = ε′. In particular, examine the θ = π/2 direction, distinguishing between two possibilities, depending on whether the particles are bosons or fermions. Show that, again, the scattering probability is the same in the θ and π − θ directions.

5. Collision between two identical unpolarized particles

Consider two identical particles, of spin s, which collide. Assume that their initial spin states are not known: each of the two particles has the same probability of being in the 2s + 1 possible orthogonal spin states. Show that, with the notation of the preceding exercise, the probability of observing scattering in the n direction is:
\[
|F(\mathbf n)|^2 + |F(-\mathbf n)|^2 + \frac{2\varepsilon}{2s+1}\,\mathrm{Re}\bigl[F(\mathbf n)\,F^{*}(-\mathbf n)\bigr]
\]
(ε = +1 for bosons, ε = −1 for fermions).

6. Possible values of the relative angular momentum of two identical particles

Consider a system of two identical particles interacting by means of a potential that depends only on their relative distance, so that the Hamiltonian of the system can be written:
\[
H = \frac{\mathbf P_1^2}{2m} + \frac{\mathbf P_2^2}{2m} + V(|\mathbf R_1 - \mathbf R_2|)
\]




As in § B of Chapter VII, we set:
\[
\mathbf R_G = \tfrac12(\mathbf R_1 + \mathbf R_2)\;;\qquad \mathbf R = \mathbf R_1 - \mathbf R_2
\]
\[
\mathbf P_G = \mathbf P_1 + \mathbf P_2\;;\qquad \mathbf P = \tfrac12(\mathbf P_1 - \mathbf P_2)
\]
H then becomes:
\[
H = H_G + H_r
\]
with:
\[
H_G = \frac{\mathbf P_G^2}{4m}\;;\qquad H_r = \frac{\mathbf P^2}{m} + V(r)
\]

. First, we assume that the two particles are identical bosons of zero spin ( mesons, for example). . We use the r r basis of the state space of the system, composed of common eigenvectors of the observables R and R. Show that, if 21 is the permutation operator of the two particles: 21

r

r

= r

r

. We now go to the p ; basis of common eigenvectors of P L2 and (L = R P is the relative angular momentum of the two particles). Show that these new basis vectors are given by expressions of the form: p ;

=

1 (2 ~)3

d3

2

d3

ep

( )

(

r

~

) r

r

Show that: 21

p ;

= ( 1) p ; . What values of are allowed by the symmetrization postulate?

. The two particles under consideration are now identical fermions of spin 1/2 (electrons or protons). . In the state space of the system, we first use the rG r; S M basis of common eigenstates of R R S2 and , where S = S1 + S2 is the total spin of the system (the kets of the spin state space were determined in § B of Chapter X). Show that: 21

P , 1498

r

r;

= ( 1)

+1

r

. We now go to the , L2 , , S2 and .

r; p ;

;

basis of common eigenstates of




As in question - , show that: p ;

21

;

= ( 1)

+1

( 1) p ;

;

. Derive the values of allowed by the symmetrization postulate for each of the values of (triplet and singlet). . (more difficult) Recall that the total scattering cross section in the center of mass system of two distinguishable particles interacting through the potential ( ) can be written: =

4

(2 + 1) sin2

2 =0

where the

are the phase shifts associated with ( ) [cf. Chap. VIII, formula (C-58)]. . What happens if the measurement device is equally sensitive to both particles (the two particles have the same mass)? . Show that, in the case envisaged in question , the expression for becomes: =

16

(2 + 1) sin2

2 even

. For two unpolarized identical fermions of spin 1/2 (the case of question ), prove that: =

4

(2 + 1) sin2

2 even

+3

(2 + 1) sin2 odd

7. Position probability densities for a system of two identical particles Let and be two normalized orthogonal states belonging to the orbital state space r of an electron, and let + and be the two eigenvectors, in the spin state space , of the component of its spin. . Consider a system of two electrons, one in the state + and the other, in the state . Let (r r ) d3 d3 be the probability of finding one of them in a volume d3 r centered at point r, and the other in a volume d3 r centered at r (twoparticle density function). Similarly, let (r) d3 r be the probability of finding one of the electrons in a volume d3 r centered at point r (one-particule density function). Show that: (r r ) = (r) =

(r) 2 (r ) 2 + (r) 2 +

(r ) 2 (r) 2

(r) 2

Show that these expressions remain valid even if in

and

are not orthogonal

r.

Calculate the integrals over all space of (r) and (r r ). Are they equal to 1? Compare these results with those which would be obtained for a system of two distinguishable particles (both spin 1/2), one in the state + and the other in the 1499




state ; the device which measures their positions is assumed to be unable to distinguish between the two particles. . Now assume that one electron is in the state + . Show that we then have:

state

(r r ) =

and the other one, in the

(r ) (r) 2

(r) (r )

(r) 2 +

(r) =

+

(r) 2

Calculate the integrals over all space of (r) and (r r ). What happens to and if and are no longer orthogonal in

?

. Same questions for two identical bosons, either in the same spin state or in two orthogonal spin states. 8. The aim of this exercise is to demonstrate the following point: once the state vector of a system of identical bosons (or fermions) has been suitably symmetrized (or antisymmetrized), it is not indispensable, in order to calculate the probability of any measurement result, to perform another symmetrization (or antisymmetrization) of the kets associated with the measurement. More precisely, provided that the state vector belongs to (or ), the physical predictions can be calculated as if we were confronted with a system of distinguishable particles studied by imperfect measurement devices unable to distinguish between them. Let be the state vector of a system of identical bosons (all of the following reasoning is equally valid for fermions). We have: =

the

(1)

I. . Let be the normalized physical ket associated with a measurement in which bosons are found to be in the different and orthonormal individual states , . Show that: =

! 1:

;2 :

;

;

:

(2)

. Show that, because of the symmetry properties of 1:

;2 :

where

;

;

2

:

=

:

; :

;

: ; :

is an arbitrary permutation of the numbers 1, 2, . . . , . Show that the probability of finding the system in the state 2

= =


2

!

1:

;2 : :

; :

;

; ;

: ; :

can be written:

2 2

(3)




where the summation is performed over all permutations of the numbers 1, 2, . . . , . Now assume that the particles are distinguishable, and that their state is described by the ket . What would be the probability of finding any one of them in the state , another one in the state , . . . , and the last one in the state ? Conclude, by comparison with the results of , that, for identical particles, it is sufficient to apply the symmetrization postulate to the state vector of the system. . How would the preceding argument be modified if several of the individual states constituting the state were identical? (For the sake of simplicity, consider only the case where = 3). II. (more difficult) Now, consider the general case, in which the measurement result being considered is not necessarily defined by the specification of individual states, since the measurement may no longer be complete. According to the postulates of Chapter XIV, we must proceed in the following way in order to calculate the corresponding probability: – first of all, we treat the particles as distinguishable, and we number them: their state space is then . Then let be the subspace of associated with the measurement result envisaged and the measurement being performed with devices incapable of distinguishing between the particles; – with denoting an arbitrary ket of , we construct the set of kets which constitutes a vector space ( is the projection of onto ); if the dimension of is greater than 1, the measurement is not complete; – the desired probability is then equal to the square of the norm of the orthogonal of the ket describing the state of the identical particles. projection onto . If is an arbitrary permutation operator of the construction of :

Show that tersection of

is globally invariant under the action of and .

. We construct an orthonormal basis in 1

2

particles, show that, by

and that

is simply the in-

:

+1

the first vectors of which constitute a basis of . Show that the kets , where + 1 6 6 , must be linear combinations of the first vectors of this basis. Show, 1 2 by taking their scalar products with the bras , , that these kets (with > + 1) are necessarily zero.





. Show from the preceding results that the symmetric nature of 2

implies that:

2

=

=1

=1

that is: = where

and denote respectively the projectors onto and . Conclusion: The probabilities of the measurement results can be calculated from the projection of the ket (belonging to ) onto an eigensubspace whose kets do not all belong to , but in which all the particles play equivalent roles. 9. One- and two-particle density functions in an electron gas at absolute zero I. . Consider a system of particles 1 2 with the same spin . First of all, assume that they are not identical. In the state space ( ) of particle ( ), the ket : r0 represents a state in which particle ( ) is localized at the point r0 in the spin state ( ~: the eigenvalue of ). Consider the operator: (r0 ) =

: r0

: r0

=1

( ) =

where ( ) is the identity operator in the space ( ). Let be the state of the -particle system. Show that (r0 ) d represents the probability of finding any one of the particles in the infinitesimal volume element d centered at r0 , the component of its spin being equal to ~. . Consider the operator: (r0 r0 ) =

: r0 =1

=

; : r0

: r0

; : r0

( ) =

What is the physical meaning of the quantity (r0 r0 ) d d , where d and d are infinitesimal volumes? The average values (r0 ) and (r0 r0 ) will be written, respectively, (r0 ) and (r0 r0 ) and will be called the one- and two-particle density functions of the -particle system. The preceding expressions remain valid when the particles are identical, provided that is the suitably symmetrized or antisymmetrized state vector of the system (cf. preceding exercise).




1


II. Consider a system of particles in the normalized and orthogonal individual states , 2 . The normalized state vector of the system is: =

!

1:

1; 2

:

2;

;

:

where is the symmetrizer for bosons and the antisymmetrizer for fermions. In this part, we want to calculate the average values in the state of symmetric one-particle operators of the type: =

()

( )

=1

=

or of symmetric two-particle operators of the type: =

( =1

)

( )

=

=

. Show that: =

1:

1; 2

:

2;

;

:

1:

1; 2

:

2;

;

:

where = +1 for bosons, and +1 or 1 for fermions, depending on whether the permutation is even or odd. Show that the same expression is valid for the operator . Derive the relations: =

:

() :

=1

= =1

with

:

; :

+

:

(

) :

; :

(

) :

=

= +1 for bosons,

=

; :

; :

1 for fermions.

III. We now want to apply the results of part II to the operators (r0 ) and (r0 r0 ) introduced in part I. The physical system under study is a gas of free electrons enclosed in a cubic box of edge at absolute zero (Complement CXIV , § 1). By applying periodic boundary conditions, we obtain individual states of the form k , where the wave function associated with k is a plane wave 31 2 e k r , and the components of k satisfy relations (26) of Complement CXIV . We shall call = ~2 2 2 the Fermi energy of the system and =2 , the Fermi wavelength. 1503




. Show that the two one-particle density functions equal to: + (r0 )

=

(r0 ) =

k (r0 )

+ (r0 )

and

(r0 ) are both

2

where the summation over k is performed over all values of k of modulus less then , satisfying the periodic boundary conditions. By using § 1 of Complement CXIV , show that + (r0 ) = (r0 ) = 3 6 2 = 2 3 . Could this result have been predicted simply? . Show that the two two-particle density functions are both equal to:

+

(r0 r0 ) and

+ (r0

r0 )

2 k (r0 ) k k

(r0 ) 2 =

k

4

6

where the summations over k and k are defined as above. Give a physical interpretation. . Finally, consider the two two-particle density functions Prove that they are both equal to: k (r0 ) k k

(r0 ) 2

k (r0 ) k

(r0 )

k (r0 ) k

++ (r0

r0 ) and

(r0 r0 ).

(r0 )

=k

Show that the restriction k = k can be omitted, and show that the two two-particle density functions are equal to: 2

2 k (r0 ) k (r0 )

6

4

k

with

= r0

( )= (

2

=

4

6

[1

r0 , where the function

3 3

[sin

2

(

)]

( ) is defined by:

cos ]

can be replaced by an integral over k) How do the two-particle density functions ++ (r0 r0 ) and (r0 r0 ) vary with respect to the distance between r0 and r0 ? Show that it is practically impossible to find two electrons with the same spin separated by a distance much smaller than .



Appendix I

Fourier series and Fourier transforms

1. Fourier series
   1-a. Periodic functions
   1-b. Expansion of a periodic function in a Fourier series
   1-c. The Bessel-Parseval relation
2. Fourier transforms
   2-a. Definitions
   2-b. Simple properties
   2-c. The Parseval-Plancherel formula
   2-d. Examples
   2-e. Fourier transforms in three-dimensional space

In this appendix, we shall review a certain number of definitions, formulas and properties which are useful in quantum mechanics. We do not intend to enter into the details of the derivations, nor shall we give rigorous proofs of the mathematical theorems.

1. Fourier series

1-a. Periodic functions

A function f(x) of a variable x is said to be periodic if there exists a real non-zero number L such that, for all x:
\[
f(x + L) = f(x) \tag{1}
\]
L is called the period of the function f(x). If f(x) is periodic with a period of L, all numbers nL, where n is a positive or negative integer, are also periods of f(x). The fundamental period L₀ of such a function is defined as being its smallest positive period (the term "period" is often used in physics to denote what is actually the fundamental period of a function).

Comment:

We can take a function g(x) defined only on a finite interval [a, b] of the real axis and construct a function f(x) which is equal to g(x) inside [a, b] and is periodic, with a period (b − a). The function f(x) is continuous if g(x) is and if:
\[
g(a) = g(b) \tag{2}
\]
We know that the trigonometric functions are periodic. In particular:
\[
\cos\frac{2\pi x}{L} \quad\text{and}\quad \sin\frac{2\pi x}{L} \tag{3}
\]


have fundamental periods equal to L. Other particularly important examples of periodic functions are the periodic exponentials. For an exponential \(e^{ikx}\) to have a period of L, it is necessary and sufficient, according to definition (1), that:
\[
e^{ikL} = 1 \tag{4}
\]
that is:
\[
k = \frac{2n\pi}{L} \tag{5}
\]
where n is an integer. There are therefore two exponentials of fundamental period L:
\[
e^{\pm 2i\pi x/L} \tag{6}
\]
which are, furthermore, related to the trigonometric functions (3) which have the same period:
\[
e^{\pm 2i\pi x/L} = \cos\frac{2\pi x}{L} \pm i\,\sin\frac{2\pi x}{L} \tag{7}
\]
The exponential \(e^{2in\pi x/L}\) also has a period of L, but its fundamental period is L/n.

1-b. Expansion of a periodic function in a Fourier series

Let f(x) be a periodic function with a fundamental period of L. If it satisfies certain mathematical conditions (as is practically always the case in physics), it can be expanded in a series of imaginary exponentials or trigonometric functions.

α. Series of imaginary exponentials

We can write f(x) in the form:
\[
f(x) = \sum_{n=-\infty}^{+\infty} c_n\, e^{i k_n x} \tag{8}
\]
with:
\[
k_n = \frac{2n\pi}{L} \tag{9}
\]
The coefficients c_n of the Fourier series (8) are given by the formula:
\[
c_n = \frac1L \int_{x_0}^{x_0+L} \mathrm dx\; e^{-i k_n x}\, f(x) \tag{10}
\]
where x₀ is an arbitrary real number.

To prove (10), we multiply (8) by \(e^{-i k_m x}\) and integrate between x₀ and x₀ + L:
\[
\int_{x_0}^{x_0+L} \mathrm dx\; e^{-i k_m x} f(x) = \sum_{n=-\infty}^{+\infty} c_n \int_{x_0}^{x_0+L} \mathrm dx\; e^{i(k_n - k_m)x} \tag{11}
\]
The integral on the right-hand side is zero for n ≠ m and equal to L for n = m. Hence formula (10). It can easily be shown that the value obtained for c_n is independent of the number x₀ chosen.

The set of c_n values is called the Fourier spectrum of f(x). Note that f(x) is real if and only if:
\[
c_{-n} = c_n^{*} \tag{12}
\]

β. Cosine and sine series

If, in the series (8), we group the terms corresponding to opposite values of n, we obtain:
\[
f(x) = c_0 + \sum_{n=1}^{\infty} \bigl[\, c_n\, e^{i k_n x} + c_{-n}\, e^{-i k_n x} \bigr] \tag{13}
\]
that is, according to (7):
\[
f(x) = a_0 + \sum_{n=1}^{\infty} \bigl( a_n \cos k_n x + b_n \sin k_n x \bigr) \tag{14}
\]
with:
\[
a_0 = c_0\;;\qquad a_n = c_n + c_{-n}\;;\qquad b_n = i\,(c_n - c_{-n}) \tag{15}
\]
The formulas giving the coefficients a_n and b_n can therefore be derived from (10):
\[
a_0 = \frac1L \int_{x_0}^{x_0+L} \mathrm dx\, f(x)\;;\qquad
a_n = \frac2L \int_{x_0}^{x_0+L} \mathrm dx\, f(x)\cos k_n x\;;\qquad
b_n = \frac2L \int_{x_0}^{x_0+L} \mathrm dx\, f(x)\sin k_n x \tag{16}
\]
If f(x) has a definite parity, expansion (14) is particularly convenient, since:
\[
b_n = 0 \ \text{if } f(x) \text{ is even}\;;\qquad a_n = 0 \ \text{if } f(x) \text{ is odd} \tag{17}
\]
Moreover, if f(x) is real, the coefficients a_n and b_n are real.

1-c. The Bessel-Parseval relation

Moreover, if ( ) is real, the coefficients 1-c.

and

are real.

The Bessel-Parseval relation

It can easily be shown from the Fourier series (8) that: +

0+

1

d 0

( )2=

2

(18)

=

1507

APPENDIX I

This can be shown using equation (8): 0+

1

0+

1

( )2=

d 0

d e(

)

(19)

0

As in (11), the integral of the right-hand side is equal to

. This proves (18).

When expansion (14) is used, the Bessel-Parseval relation (18) can also be written: 0+

1

( )2=

d

0

2

+

0

1 2

2

+

2

(20)

=1

If we have two functions, ( ) and ( ), with the same period , whose Fourier coefficients are, respectively. and , we can generalize relation (18) to the form: +

0+

1

d

( ) ( )=

0

2.

Fourier transforms

2-a. .

(21) =

Definitions The Fourier integral as the limit of a Fourier series

Now, consider a function ( ) which is not necessarily periodic. We define ( ) to be the periodic function of period which is equal to ( ) inside the interval [ 2 2]. The function ( ) can be expanded in a Fourier series: +

( )=

e

(22)

=

where =

is defined by formula (9), and: 0+

1

d e

( )=

1

0

+2

d e

( )

(23)

2

When approaches infinity, ( ) becomes the same as ( ). We shall therefore let approach infinity in the expressions above. Definition (9) of then yields: +1

=

2

(24)

We shall now replace 1 by its expression in terms of ( this value of into the series (22): + +1

( )= =

1508

2

+1

) in (23), and substitute

+2

e

d e 2

( )

(25)

FOURIER SERIES AND FOURIER TRANSFORMS

When , +1 approaches zero [cf. (24)], so that the sum over is transformed into a definite integral; ( ) approaches ( ). The integral appearing in (25) becomes a function of the continuous variable . If we set: +

1 2

˜( ) =

d e

( )

(26)

relation (25) can be written in the limit of infinite +

1 2

( )=

:

˜( )

d e

(27)

( ) and ˜( ) are called Fourier transforms of each other. .

Fourier transforms in quantum mechanics

In quantum mechanics, we actually use a slightly different convention. If a (one-dimensional) wave function, its Fourier transform ( ) is defined by: ( )=

1 2 ~

( ) is

+

d e

( )

(28)

( )

(29)

and the inverse formula is: ( )=

1 2 ~

+ ~

d e

To go from (26) and (27) to (28) and (29), we set: =~

(30)

( has the dimensions of a momentum if

is a length), and:

1 ˜ 1 ˜ ( )= ~ ~ ~

( )=

(31)

In this appendix, as is usual in quantum mechanics, we shall use definition (28) of the Fourier transform instead of the traditional definition, (26). To return to the latter definition, furthermore, all we need to do is replace ~ by 1 and by in all the following expressions. 2-b.

Simple properties

We shall state (28) and (29) in the condensed notation: ( )=

[ ( )]

(32a)

( )=

[ ( )]

(32b)

The following properties can easily be demonstrated: ()

( e

0) 0

~

=

[e

( )=

0

~

[ (

( )]

(33)

0 )]

1509

APPENDIX I

This follows directly from definition (28). ( )

( )=

[ ( )] =

[ (

)] =

1

(34)

To see this, all we need to do is change the integration variable: =

(35)

In particular: [ (

)] = (

)

(36)

Therefore, if the function parity. (

)

( ) has a definite parity, its Fourier transform has the same

( ) real

[ ( )] = (

( ) pure imaginary

[ ( )] =

) (

The same expressions are valid if the functions

(37a) )

(37b)

and

are inverted.

( ) If ( ) denotes the th derivative of the function , successive differentiations inside the summation yield, according to (28) and (29): [

( )

( )] =

( )

(38a)

~ ( )

( )=

( )

(38b)

~ ( ) The convolution of two functions equal to:

1(

) and

2(

) is, by definition, the function ( )

+

( )=

d

1(

)

2(

)

(39)

Its Fourier transform is proportional to the ordinary product of the transforms of ) and 2 ( ):

1(

( )=

2 ~

1(

)

2(

)

(40)

This can be shown as follows. We take the Fourier transform of expression (39): ( )=

+

1 2 ~

+

d e

~

d

1(

)

2(

)

(41)

and perform the change of integration variables: =

=

(42) ~

If we multiply and divide by e ( )=

1510

1 2 ~

we obtain:

+

+

d e

~

1(

)

d e

~

2(

)

(43)

FOURIER SERIES AND FOURIER TRANSFORMS

which proves (40).

( ) When

( ) is a peaked function of width ∆ , the width ∆ of

( ) satisfies:

∆ &~



(44)

(see § C-2 of Chapter I, where this inequality is analyzed, and Complement CIII ). 2-c.

The Parseval-Plancherel formula

A function and its Fourier transform have the same norm: +

+

( )2=

d

( )2

d

(45)

To prove this, all we need to do is use (28) and (29) in the following way: +

+

( )2=

d

d +

=

d

1 2 ~

( )

+

1 2 ~

( )

~

d e

( )

+ ~

d e

( )

+

=

d

( )

( )

(46)

As in § 1-c, the Parseval-Plancherel formula can be generalized: +

+

d

2-d.

( )

( )=

d

( )

( )

(47)

Examples

We shall confine ourselves to three examples of Fourier transforms, for which the calculations are straightforward.

(i) Square function
\[
\psi(x) = \frac{1}{\sqrt a}\ \text{ for } |x| \le \frac a2\;;\qquad \psi(x) = 0\ \text{ for } |x| > \frac a2
\]
\[
\bar\psi(p) = \sqrt{\frac{a}{2\pi\hbar}}\; \frac{\sin(pa/2\hbar)}{pa/2\hbar} \tag{48}
\]

(ii) Decreasing exponential
\[
\psi(x) = e^{-|x|/a}\;;\qquad
\bar\psi(p) = \sqrt{\frac{2}{\pi\hbar}}\; \frac{1/a}{(p^2/\hbar^2) + (1/a^2)} \tag{49}
\]

(iii) Gaussian function
\[
\psi(x) = e^{-x^2/a^2}\;;\qquad
\bar\psi(p) = \frac{a}{\sqrt{2\hbar}}\; e^{-a^2 p^2/4\hbar^2} \tag{50}
\]
(note the remarkable fact that the Gaussian form is conserved by the Fourier transform).
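These closed forms are easy to spot-check numerically. The sketch below (ours, not from the original text) evaluates the transform (28) of the Gaussian example by direct quadrature and compares it with formula (50); ħ and a are set to arbitrary illustrative values.

```python
import numpy as np

hbar, a = 1.0, 1.3
x = np.linspace(-40.0, 40.0, 20001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / a**2)

for p in (0.0, 0.7, 2.1):
    # psi_bar(p) = (2 pi hbar)^(-1/2) * integral of exp(-i p x / hbar) psi(x) dx
    numeric = np.sum(np.exp(-1j * p * x / hbar) * psi) * dx / np.sqrt(2.0 * np.pi * hbar)
    exact = a / np.sqrt(2.0 * hbar) * np.exp(-a**2 * p**2 / (4.0 * hbar**2))
    print(p, numeric.real, exact)   # the two values coincide
```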

APPENDIX I

Comment:

In each of these three cases, the widths ∆ and ∆ can be defined for ( ) respectively, and they verify inequality (44). 2-e.

( ) and

Fourier transforms in three-dimensional space

For wave functions (r) which depend on the three spatial variables , , , (28) and (29) are replaced by: 1 (2 ~)3 1 (r) = (2 ~)3

(p) =

d3 e

2

pr ~

d3 e p r

2

~

(r)

(51a)

(p)

(51b)

The properties stated above (§§ 2-b and 2-c) can easily be generalized to three dimensions. If depends only on the modulus of the radius-vector r, depends only on the modulus of the momentum p and can be calculated from the expression: 1 2 2 ~

( )=

r d sin

( )

(52)

~

0

Proof: First, we shall find using (51a) the value of arbitrary rotation : p = (p ) =

for a vector p obtained from p by an

(53)

p 1 (2 ~)3

2

d3 e

p r ~

( )

(54)

In this integral, we replace the variable r by r and set: r =

(55)

r

Since the volume element is conserved under rotation, we have: d3

= d3

In addition, the function finally: p

(56) is unchanged, since the modulus of r remains equal to ;

r =p r

(57)

since the scalar product is rotation-invariant. We thus find: (p ) = (p) that is,

1512

depends only on the modulus of p and not on its direction.

(58)

FOURIER SERIES AND FOURIER TRANSFORMS

We can then choose p along ( )=

1 (2 ~)3

=

1 (2 ~)3

1 = (2 ~)3 =

to evaluate

d3 e

2

~

( ):

( ) 2

2 2

d

( )

0

d sin e

0 2

2

1 2 2 ~

d

d

( )2

cos

~

0

2~

sin ~

0

d

( ) sin

(59) ~

0

This proves (52).

Assume for instance that Ψ( ) is given by the following (non normalized) function:

Ψ( ) = where

e

(60)

is positive. Relation (52) then becomes: 1 1 d e 2 ~ 0 2 1 3 2 ~ + 2 ~2

( )= =

e

~

e

~

=

1 1 2 ~

1 +

1 ~

~ (61)

A central potential that varies with as the right-hand side of (60) is called a “Yukawa potential”. When = 0 , it becomes a Coulomb potential, whose gradient gives an electric field. If we take the gradient of (51b), we obtain the Fourier transformation correspondence between two following vector functions (we now use variable k instead of p, and therefore set ~ = 1): 1 e

2

r FT

The limit r 3 FT

2

k +

2

(62)

0 then provides: 2 k 2

(63)

References and suggestions for further reading:

See, for example, Arfken (10.4), Chaps. 14 and 15, or Butkov (10.8), Chaps. 4 and 7; Bass (10.1), vol. I, Chaps. XVIII through XX: section 10 of the bibliography, especially the subsection “Fourier transforms; distributions”.



Appendix II

The Dirac δ-"function"

1. Introduction; principal properties
   1-a. Introduction of the δ-"function"
   1-b. Functions that approach δ
   1-c. Properties of δ
2. The δ-"function" and the Fourier transform
   2-a. The Fourier transform of δ
   2-b. Applications
3. Integral and derivatives of the δ-"function"
   3-a. δ is the derivative of the "unit step-function"
   3-b. Derivatives of δ
4. The δ-"function" in three-dimensional space

The δ-"function" is actually a distribution. However, like most physicists, we shall treat it like an ordinary function. This approach, although not mathematically rigorous, is sufficient for quantum mechanical applications.

1. Introduction; principal properties

1-a. Introduction of the δ-"function"

Consider the function \(\delta^{(\varepsilon)}(x)\) given by (cf. Fig. 1):
\[
\delta^{(\varepsilon)}(x) = \frac1\varepsilon \ \text{ for } -\frac\varepsilon2 \le x \le +\frac\varepsilon2\;;\qquad
\delta^{(\varepsilon)}(x) = 0 \ \text{ for } |x| > \frac\varepsilon2 \tag{1}
\]

Figure 1: The function δ^(ε)(x): a square function of width ε and height 1/ε, centered at x = 0.



where ε is a positive number. We shall evaluate the integral:
\[
\int_{-\infty}^{+\infty} \mathrm dx\; \delta^{(\varepsilon)}(x)\, f(x) \tag{2}
\]
where f(x) is an arbitrary function, well-defined for x = 0. If ε is sufficiently small, the variation of f(x) over the effective integration interval [−ε/2, +ε/2] is negligible, and f(x) remains practically equal to f(0). Therefore:
\[
\int_{-\infty}^{+\infty} \mathrm dx\; \delta^{(\varepsilon)}(x)\, f(x) \simeq f(0) \int_{-\infty}^{+\infty} \mathrm dx\; \delta^{(\varepsilon)}(x) = f(0) \tag{3}
\]
The smaller ε, the better the approximation. We therefore examine the limit ε → 0 and define the δ-"function" by the relation:
\[
\int_{-\infty}^{+\infty} \mathrm dx\; \delta(x)\, f(x) = f(0) \tag{4}
\]
which is valid for any function f(x) defined at the origin. More generally, δ(x − x₀) is defined by:
\[
\int_{-\infty}^{+\infty} \mathrm dx\; \delta(x - x_0)\, f(x) = f(x_0) \tag{5}
\]

Comments:

(i) Actually, the integral notation in (5) is not mathematically justified. δ is defined rigorously not as a function but as a distribution. Physically, this distinction is not an essential one, as it becomes impossible to distinguish between δ^(ε)(x) and δ(x) as soon as ε becomes negligible compared to all the distances involved in a given physical problem¹: any function f(x) which we might have to consider does not vary significantly over an interval of length ε. Whenever a mathematical difficulty might arise, all we need to do is assume that δ(x) is actually δ^(ε)(x) [or an analogous but more regular function, for example one of those given in (7), (8), (9), (10), (11)], with ε extremely small but not strictly zero.

(ii) For arbitrary integration limits a and b, we have:
\[
\int_a^b \mathrm dx\; \delta(x)\, f(x) = f(0) \ \text{ if } 0 \in [a, b]\;;\qquad = 0 \ \text{ if } 0 \notin [a, b] \tag{6}
\]

¹ The accuracy of present-day physical measurements does not, in any case, allow us to investigate phenomena on a scale of less than a fraction of a fermi (1 fermi = 10⁻¹⁵ m).

1 The accuracy of present-day physical measurements does not, in any case, allow us to investigate phenomena on a scale of less than a fraction of a Fermi (1 Fermi = 1015 m).

1516


1-b. Functions that approach δ

It can easily be shown that, in addition to δ^(ε)(x) defined by (1), the following functions approach δ(x), that is, satisfy (5), when the parameter ε approaches zero from the positive side:
\[
\frac{1}{2\varepsilon}\, e^{-|x|/\varepsilon} \tag{7}
\]
\[
\frac1\pi\, \frac{\varepsilon}{x^2 + \varepsilon^2} \tag{8}
\]
\[
\frac{1}{\varepsilon\sqrt\pi}\, e^{-x^2/\varepsilon^2} \tag{9}
\]
\[
\frac{1}{\pi x}\, \sin\frac x\varepsilon \tag{10}
\]
\[
\frac\varepsilon\pi\, \frac{\sin^2(x/\varepsilon)}{x^2} \tag{11}
\]
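As a numerical illustration (not part of the original text), the sketch below checks that the Lorentzian (8) satisfies the defining relation (5): its integral against a test function tends to the value of that function at x₀ as ε → 0⁺. The test function and the point x₀ are arbitrary choices.

```python
import numpy as np

# Lorentzian nascent delta, (1/pi) * eps / ((x - x0)^2 + eps^2), tested against f.
f = lambda x: np.cos(x) * np.exp(-x**2 / 10.0)
x0 = 0.4

x = np.linspace(-60.0, 60.0, 400001)
dx = x[1] - x[0]
for eps in (1.0, 0.3, 0.1, 0.03, 0.01):
    delta_eps = eps / (np.pi * ((x - x0) ** 2 + eps ** 2))
    print(eps, np.sum(delta_eps * f(x)) * dx)   # converges toward f(x0)
print("f(x0) =", f(x0))
```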

We shall also mention an identity which is often useful in quantum mechanics (particularly in collision theory):
\[
\lim_{\varepsilon\to0^+} \frac{1}{x - i\varepsilon} = \mathcal P\,\frac1x + i\pi\,\delta(x) \tag{12}
\]
where \(\mathcal P\) denotes the Cauchy principal part, defined by² [g(x) is a regular function at x = 0]:
\[
\mathcal P \int_{-\infty}^{+\infty} \mathrm dx\; \frac{g(x)}{x}
= \lim_{\eta\to0^+} \left[ \int_{-\infty}^{-\eta} \mathrm dx\; \frac{g(x)}{x} + \int_{+\eta}^{+\infty} \mathrm dx\; \frac{g(x)}{x} \right]\;;\qquad \eta > 0 \tag{13}
\]

To prove (12), we separate the real and imaginary parts of 1 ( 1

=

2

): (14)

2

+

Since the imaginary part is proportional to the function (8), we have: Lim

2

0+

+

2

=

( )

(15)

As for the real part, we shall multiply it by a function integrate over : +

Lim 0+ 2 One +

d 2 +

( ) that is regular at the origin, and

+ 2

( ) = Lim Lim 0+

+

+

+

0+

2 +

d +

2

( )

(16)

often uses one of the following relations: d

+

( )=

d

( )

+

d

+

( )=

d

( )

(0)

+ (0) Log

where ( )=[ ( ) ( )] 2 is the odd part of ( ). These formulas allow us to explicitly eliminate the divergence at the origin.



The second integral is zero: +

Lim

2

0+

d +

( ) = (0) Lim

2

0+

1 Log( 2

2

+

2

)

+

=0

If we now reverse the order of the evaluation of the limits in (16), the difficulties in the two other integrals. Thus: +

+

d 2 +

Lim 0+

( ) = Lim

2

+

0+

d

( )

(17) 0 limit presents no

(18)

+

This establishes identity (12).

1-c. Properties of δ

The properties we shall now state can be demonstrated using (5): multiplying both sides of the equations below by a function f(x) and integrating, we see that the results obtained are indeed equal.

(i)
\[
\delta(-x) = \delta(x) \tag{19}
\]
\[
\delta(cx) = \frac{1}{|c|}\,\delta(x) \tag{20}
\]
and, more generally:
\[
\delta[g(x)] = \sum_j \frac{1}{|g'(x_j)|}\,\delta(x - x_j) \tag{21}
\]
where g′(x) is the derivative of g(x) and the x_j are the simple zeros of the function g(x):
\[
g(x_j) = 0\;;\qquad g'(x_j) \ne 0 \tag{22}
\]
The summation is performed over all the simple zeros of g(x). If g(x) has zeros of multiple order [that is, for which g′(x_j) is zero], the expression δ[g(x)] makes no sense.

(ii)
\[
x\,\delta(x - x_0) = x_0\,\delta(x - x_0) \tag{23}
\]
and, in particular:
\[
x\,\delta(x) = 0 \tag{24}
\]
The converse is also true, and it can be shown that the equation:
\[
x\,f(x) = 0 \tag{25}
\]
has the general solution:
\[
f(x) = c\,\delta(x) \tag{26}
\]
where c is an arbitrary constant. More generally:
\[
g(x)\,\delta(x - x_0) = g(x_0)\,\delta(x - x_0) \tag{27}
\]

(iii)
\[
\int_{-\infty}^{+\infty} \mathrm dx\; \delta(x - y)\,\delta(x - z) = \delta(y - z) \tag{28}
\]

Equation (28) can be understood by examining functions Figure 1. The integral:

( )

( ) like the one shown in

+ ( )

(

)=

d

( )

is zero as long as

(

)

( )

(

)

(29)

, that is, as long as the two square functions do not overlap (Fig. 2).

δ(ε)(x – z)

δ(ε)(x – y) ε

ε

1 ε

y

z

Figure 2: The functions ( ) ( ) and ( ) ( height 1 , centered respectively at = and

x

): two square functions of width = .

and

The maximum value of the integral, obtained for = , is equal to 1 . Between this maximum value and 0, the variation of ( ) ( ) with respect to is linear (Fig. 3). We see immediately that ( ) ( ) approaches ( ) when 0.

Comment:

A sum of regularly spaced -functions: +

(

)

(30)

=

can be considered to be a periodic “function” of period . By applying formulas (8), (9) and (10) of Appendix I, we can write it in the form: +

( =

)=

1

+

e2

(31)

=

1519

APPENDIX II

F(ε)(y, z) 1 ε

–ε



y–z

Figure 3: The variation with respect to of the scalar product ( ) ( ) of the two square functions shown in Figure 2. This scalar product is zero when the two functions ( ) do not overlap ( ), and maximal when they coincide. ( ) approaches ( ) when 0.

2. The δ-"function" and the Fourier transform

2-a. The Fourier transform of δ

Definition (28) of Appendix I and equation (5) enable us to calculate directly the Fourier transform \(\bar\delta_{x_0}(p)\) of δ(x − x₀):
\[
\bar\delta_{x_0}(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \mathrm dx\; e^{-ipx/\hbar}\, \delta(x - x_0)
= \frac{1}{\sqrt{2\pi\hbar}}\; e^{-ipx_0/\hbar} \tag{32}
\]
In particular, that of δ(x) is a constant:
\[
\bar\delta_0(p) = \frac{1}{\sqrt{2\pi\hbar}} \tag{33}
\]
The inverse Fourier transform [formula (29) of Appendix I] then yields:
\[
\delta(x - x_0) = \frac{1}{2\pi\hbar} \int_{-\infty}^{+\infty} \mathrm dp\; e^{ip(x - x_0)/\hbar}
= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \mathrm dk\; e^{ik(x - x_0)} \tag{34}
\]
This result can also be found by using the function δ^(ε)(x) defined by (1) or any of the functions given in § 1-b. For example, (48) of Appendix I enables us to write:
\[
\delta^{(\varepsilon)}(x) = \frac{1}{2\pi\hbar} \int_{-\infty}^{+\infty} \mathrm dp\; e^{ipx/\hbar}\; \frac{\sin(p\varepsilon/2\hbar)}{p\varepsilon/2\hbar} \tag{35}
\]
If we let ε approach zero, we indeed obtain (34).
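A small numerical illustration (ours, not from the original text) of the integral representation (34): truncating the k-integral at |k| ≤ K gives the kernel sin(Kx)/(πx), which acts more and more like δ(x) on a test function as K grows. The test function is an arbitrary choice.

```python
import numpy as np

# Truncated form of (34): (1/2pi) * integral_{-K}^{K} e^{ikx} dk = sin(Kx)/(pi x)
f = lambda x: 1.0 / (1.0 + x**2)       # arbitrary smooth test function, f(0) = 1

x = np.linspace(-200.0, 200.0, 2000001)
dx = x[1] - x[0]
for K in (10.0, 100.0, 1000.0):
    kernel = (K / np.pi) * np.sinc(K * x / np.pi)   # equals sin(Kx)/(pi x)
    print(K, np.sum(kernel * f(x)) * dx)            # tends to f(0) = 1
```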


2-b.

Applications

Expression (34) for the -function is often very convenient. We shall show, for example, how it simplifies finding the inverse Fourier transform and the Parseval-Plancherel relation [formulas (29) and (45) of Appendix I]. Starting with: +

1 2 ~

( )=

~

d e

( )

(36)

we calculate: +

1 2 ~

~

d e

( )=

+

1

+

d

2 ~

In the second integral, we recognize ( +

1 2 ~

( )

d e

(

) ~

(37)

), so that:

+ ~

d e

( )=

d

( ) (

)= ( )

(38)

which is the inversion formula of the Fourier transform. Similarly: ( )2=

+

1

+ ~

d e

2 ~

( )

d

~

e

( )

(39)

If we integrate this expression over , we find: +

( )2=

d

1

+

+

d

2 ~

( )

+

d

( )

d e

(

) ~

(40)

that is, according to (34): +

+

( )2=

d

+

d

( )

+

d

( ) (

)=

d

( )2

(41)

which is none other than the Parseval-Plancherel formula. We can obtain the Fourier transform of a convolution product in an analogous way [cf. formulas (39) and (40) of Appendix I]. 3.

Integral and derivatives of the -“function”

3-a.

is the derivative of the “unit step-function”

We shall evaluate the integral: ( )

( )=

( )

where the function for

2

, to 1 for

( )d

( )

(42)

( ) is defined in (1). It can easily be seen that ( ) ( ) is equal to 0 1 , and to + for . The variation of ( ) ( ) 2 2 2 2 1521

APPENDIX II

θ(ε)(x)

1

ε –

ε +

2

x

2

Figure 4: Variation of the function ( ) ( ), whose derivative ( ) ( ) is shown in Figure 1. When 0, ( ) ( ) approaches the Heaviside step-function ( ).

with respect to is shown in Figure 4. When 0, “step-function” ( ), which, by definition, is equal to: ( )=1

if

0

( )=0

if

0

( )

( ) approaches the Heaviside

(43)

( )

( )

( ) is the derivative of the derivative of ( ):

( ). By considering the limit

d ( )= ( ) d Now, consider a function ( ) which has a discontinuity Lim ( ) 0+

Lim ( ) = 0

0, we see that ( ) is

(44) 0

at

= 0: (45)

0

Such a function can be written in the form ( ) = 1 ( ) ( ) + 2 ( ) ( ), where 1 ( ) and 2 ( ) are continuous functions which satisfy 1 (0) 2 (0) = 0 . If we differentiate this expression, using (44), we obtain: ( )=

1(

) ( )+

2(

) (

)+

1(

=

1(

) ( )+

2(

) (

)+

0

) ( )

2(

) (

)

( )

(46)

according to properties (19) and (27) of . For a discontinuous function, there is then added to the ordinary derivative [the first two terms of (46)] a term proportional to the -function, the proportionality coefficient being the magnitude of the function’s discontinuity3 . 3 Of 2 ( 0 )]

1522

course, if the function is discontinuous at ( 0 ).

=

0,

the additional term is of the form: [

1( 0)

THE DIRAC -“FUNCTION”

Comment:

The Fourier transform of the step-function ( ) can be found simply from (12). We get: +

( )e 3-b.

d = Lim

d e

0+

( + )

= Lim

0+

0

+

=

1

+

( )

(47)

Derivatives of

By analogy with the expression for integration by parts, the derivative -function is defined by the relation4 : +

( ) of the

+

d

( ) ( )=

d

( )

( )=

(0)

(48)

From this definition, we immediately get: (

)=

( )

(49)

( )=

( )

(50)

and:

Conversely it can be shown that the general solution of the equation: ( )= ( )

(51)

can be written: ( )=

( )+

( )

(52)

where the second term arises from the homogeneous equation [cf. formulas (25) and (26)]. Equation (34) allows us to write ( ) in the form: ( )=

1

+

+

d

2 ~

e

~

=

~

The th-order derivative

( )

2

d e

(53)

( ) can be defined in the same way:

+

d

( )

( ) ( ) = ( 1)

( )

(0)

(54)

Relations (49) and (50) can then be generalized to the forms: ( )

(

( )

) = ( 1)

( )

(55)

( )

(56)

and: ( )

( )=

(

1)

4 ( ) can be considered to be the limit, for in § 1-b.

0, of the derivative of one of the functions given

1523

APPENDIX II

4.

The -“function” in three-dimensional space

The -“fonction” in three-dimensional space, which we shall write simply as (r), is defined by an expression analogous to (4): d3

(r) (r) = (0)

(57)

and, more generally: d3

(r

(r (r

r0 ) (r) = (r0 )

(58)

r0 ) can be broken down into a product of three one-dimensional functions:

r0 ) = (

0)

(

0)

(

0)

(59)

or, if we use polar coordinates: (r

r0 ) =

2

1 sin

1

=

(

2

(

0) 0)

(

(cos

0)

cos

( 0)

0)

(

0)

(60)

The properties stated above for δ(x) can therefore easily be generalized to δ(r). We shall mention, in addition, the important relation:
\[
\Delta\,\frac1r = -4\pi\,\delta(\mathbf r) \tag{61}
\]

where ∆ is the Laplacian operator. Equation (61) can easily be understood if it is recalled that in electrostatics, an electrical point charge placed at the origin can be described by a volume density (r) equal to: (r) =

(r)

(62)

We know that the expression for the electrostatic potential produced by this charge is: (r) =

1 4

(63)

0

Equation (61) is thus simply the Poisson equation for this special case: ∆ (r) =

1

(r)

(64)

0

To prove (61) rigorously, it is necessary to use mathematical distribution theory. We shall confine ourselves here to an elementary “proof”. First of all, note that the Laplacian of 1 is everywhere zero, except, perhaps, at the origin, which is a singular point: d2 2 d + d 2 d

1524

1

=0

for

=0

(65)

THE DIRAC -“FUNCTION”

Let (r) be a function equal to 1 when r is outside the sphere , centered at and of a radius , and which takes on values (of the order of 1 ) inside this sphere such that (r) is sufficiently regular (continuous, differentiable, etc.). Let (r) be an arbitrary function of r which is also regular at all points in space. We now find the limit of the integral: d3

( )=

(r) ∆ (r)

(66)

for 0. According to (65), this integral can receive contributions only from inside the sphere , and: d3

( )= We choose

(r) ∆ (r)

(67)

small enough for the variation of (r) inside

( )

to be negligible. Then:

d3 ∆ (r)

(0)

(68)

Transforming the integral so obtained into an integral over the surface S of ( )

∇ (r) dn

(0)

, we obtain: (69)

S

(r) is continuous on the surface S , we get:

Now, since [∇ (r)]

1

=

=

e =

2

=

(where e is the unit vector r ( )

(0)

4

4

(0)

1 2

(70)

e

). This yields:

1

2

2

(71)

that is: Lim

d3 ∆ (r) (r) =

4

(0)

(72)

0

According to definition (57), this is simply (61).

Equation (61) can be used, for example, to derive an expression which is useful in collision theory (cf. Chap. VIII):
\[
(\Delta + k^2)\,\frac{e^{ikr}}{r} = -4\pi\,\delta(\mathbf r) \tag{73}
\]

To do so, it is sufficient to consider e

as a product:

e

1

=

1

∆(e

)+e



1

+ 2∇

∇(e

)

(74)

Now: ∇(e

)=

∆(e

)=

e 2

e

r 2

e

(75)

1525

APPENDIX II

We therefore find, finally: (∆ +

2

)

2

e

2

=

4

2

=

4 e

=

4

(r)

2 2

2

(

)+

e

(r) (r)

(76)

according to (27).

Equation (61) can, furthermore, be generalized: the Laplacian of a function of the form Y_l^m(θ, φ)/r^{l+1} involves l-th-order derivatives of δ(**r**). Consider, for example, cos θ / r². We know that the electrostatic potential created at a distant point by an electric dipole of moment D directed along Oz is D cos θ / (4πε₀ r²). If q is the absolute value of each of the two charges which make up the dipole and d is the distance between them, the modulus of the dipole moment is the product D = qd, and the corresponding charge density can be written:

\[
\rho(\mathbf{r}) = q\left[\delta\!\left(\mathbf{r} - \frac{d}{2}\,\mathbf{e}_z\right)
- \delta\!\left(\mathbf{r} + \frac{d}{2}\,\mathbf{e}_z\right)\right]
\tag{77}
\]

(where **e**_z denotes the unit vector of the Oz axis). If we let d approach zero, while maintaining qd = D finite, this charge density becomes:

\[
\rho(\mathbf{r}) \;\longrightarrow\; -D\,\frac{\partial}{\partial z}\,\delta(\mathbf{r})
\tag{78}
\]

Therefore, in the limit where d → 0, the Poisson equation (64) yields:

\[
\Delta\!\left(\frac{\cos\theta}{r^2}\right) = 4\pi\,\frac{\partial}{\partial z}\,\delta(\mathbf{r})
\tag{79}
\]

Of course, this formula could be justified as (61) was above, or proven by distribution theory. Analogous reasoning can be applied to the functions Y_l^m(θ, φ)/r^{l+1}, which give the potential created by an electric multipole moment located at the origin (cf. Complement E_X).

References and suggestions for further reading:

See Dirac (1.13) § 15, and, for example, Butkov (10.8), Chap. 6, or Bass (10.1), vol. I, §§ 21.7 and 21.8; section 10 of the bibliography, especially the subsection “Fourier transforms; distributions”.


Appendix III Lagrangian and Hamiltonian in classical mechanics

1. Review of Newton's laws
 1-a. Dynamics of a point particle
 1-b. Systems of point particles
 1-c. Fundamental theorems
2. The Lagrangian and Lagrange's equations
3. The classical Hamiltonian and the canonical equations
 3-a. The conjugate momenta of the coordinates
 3-b. The Hamilton-Jacobi canonical equations
4. Applications of the Hamiltonian formalism
 4-a. A particle in a central potential
 4-b. A charged particle placed in an electromagnetic field
5. The principle of least action
 5-a. Geometrical representation of the motion of a system
 5-b. The principle of least action
 5-c. Lagrange's equations as a consequence of the principle of least action

We shall review the definition and principal properties of the Lagrangian and the Hamiltonian in classical mechanics. This appendix is not meant to be a course in analytical mechanics. Its goal is simply to indicate the classical basis for applying the quantization rules (cf. Chap. III) to a physical system. In particular, we shall concern ourselves essentially with systems of point particles.

1. Review of Newton's laws

1-a. Dynamics of a point particle

Non-relativistic classical mechanics is based on the hypothesis that there exists at least one geometrical frame, called a Galilean or inertial frame, in which the following law is valid.

The fundamental law of dynamics: a point particle has, at all times, an acceleration \(\ddot{\mathbf{r}}\) proportional to the resultant **F** of the forces acting on it:

\[
\mathbf{F} = m\,\ddot{\mathbf{r}}
\tag{1}
\]

The constant m is an intrinsic property of the particle, called its inertial mass. It can easily be shown that if a Galilean frame exists, all frames in uniform translational motion with respect to it are also Galilean frames. This leads us to the Galilean relativity principle: there is no absolute frame; no experiment can give one inertial frame a privileged role with respect to all the others.

1-b. Systems of point particles

If we are dealing with a system composed of N point particles, we apply the fundamental law to each of them¹:

\[
m_i\,\ddot{\mathbf{r}}_i = \mathbf{F}_i\;; \qquad i = 1, 2, \ldots, N
\tag{2}
\]

The forces that act on the particles can be classed in two categories: internal forces represent the interactions between the particles of the system, and external forces originate outside the system. The internal forces are postulated to satisfy the principle of action and reaction: the force exerted by particle (i) on particle (j) is equal and opposite to the one exerted by (j) on (i). This principle is true for gravitational forces (Newton's law) and electrostatic forces, but not for magnetic forces (whose origin is relativistic). If all the forces can be derived from a potential, the equations of motion (2) can be written:

\[
m_i\,\ddot{\mathbf{r}}_i = -\boldsymbol{\nabla}_i V
\tag{3}
\]

where ∇_i denotes the gradient with respect to the **r**_i coordinates, and the potential energy V is of the form:

\[
V = \sum_{i=1}^{N} V_i(\mathbf{r}_i) + \sum_{i > j} V_{ij}(\mathbf{r}_i - \mathbf{r}_j)
\tag{4}
\]

(the first term in this expression corresponds to the external forces, and the second one to the internal forces). In cartesian coordinates, the motion of the system is therefore described by the 3N differential equations:

\[
m_i\,\ddot{x}_i = -\frac{\partial V}{\partial x_i}\;, \qquad
m_i\,\ddot{y}_i = -\frac{\partial V}{\partial y_i}\;, \qquad
m_i\,\ddot{z}_i = -\frac{\partial V}{\partial z_i}\;; \qquad i = 1, 2, \ldots, N
\tag{5}
\]

1-c. Fundamental theorems

We shall first review a few definitions. The center of mass or center of gravity of a system is the point G whose coordinates are:

\[
\mathbf{r}_G = \frac{\displaystyle\sum_{i=1}^{N} m_i\,\mathbf{r}_i}{\displaystyle\sum_{i=1}^{N} m_i}
\tag{6}
\]

¹ In mechanics, a simplified notation is generally used for the time-derivatives; by definition, \(\dot{x} = \mathrm{d}x/\mathrm{d}t\), \(\ddot{x} = \mathrm{d}^2 x/\mathrm{d}t^2\), etc.

The total kinetic energy of the system is equal to:

\[
T = \sum_{i=1}^{N} \frac{1}{2}\, m_i\, \dot{\mathbf{r}}_i^{\,2}
\tag{7}
\]

where \(\dot{\mathbf{r}}_i\) is the velocity of particle (i). The angular momentum with respect to the origin is the vector:

\[
\mathbf{L} = \sum_{i=1}^{N} \mathbf{r}_i \times m_i\, \dot{\mathbf{r}}_i
\tag{8}
\]

The following theorems can then be easily proven:

(i) The center of mass of a system moves like a point particle with a mass equal to the total mass of the system, subject to a force equal to the resultant of all the forces involved in the system:

\[
\left(\sum_{i=1}^{N} m_i\right) \ddot{\mathbf{r}}_G = \sum_{i=1}^{N} \mathbf{F}_i
\tag{9}
\]

(ii) The time-derivative of the angular momentum evaluated at a fixed point is equal to the moment of the forces with respect to this point:

\[
\frac{\mathrm{d}}{\mathrm{d}t}\,\mathbf{L} = \sum_{i=1}^{N} \mathbf{r}_i \times \mathbf{F}_i
\tag{10}
\]

(iii) The variation of the kinetic energy between times t₁ and t₂ is equal to the work performed by all the forces during the motion between these two times:

\[
T(t_2) - T(t_1) = \sum_{i=1}^{N} \int_{t_1}^{t_2} \mathbf{F}_i \cdot \dot{\mathbf{r}}_i\; \mathrm{d}t
\tag{11}
\]

If the internal forces satisfy the principle of action and reaction, and if they are directed along the straight lines joining the interacting particles, their contribution to the resultant [equation (9)] and to the moment with respect to the origin [equation (10)] is zero. If, in addition, the system is isolated (that is, if it is not subject to any external forces), the total angular momentum **L** is constant, and the center of mass is in uniform rectilinear motion. This means that the total mechanical momentum:

\[
\mathbf{P} = \sum_{i=1}^{N} m_i\, \dot{\mathbf{r}}_i
\tag{12}
\]

is also a constant of the motion.

2. The Lagrangian and Lagrange's equations

Consider a system of N particles in which the forces are derived from a potential energy [cf. formula (4)], which we shall write simply V(**r**_i). The Lagrangian, or Lagrange's function, of this system is the function of the 6N variables x_i, y_i, z_i; ẋ_i, ẏ_i, ż_i (i = 1, 2, ..., N) given by:

\[
\mathcal{L}(\mathbf{r}_i, \dot{\mathbf{r}}_i) = T - V
= \sum_{i=1}^{N} \frac{1}{2}\, m_i\, \dot{\mathbf{r}}_i^{\,2} - V(\mathbf{r}_i)
\tag{13}
\]

It can immediately be shown that the equations of motion written in (5) are identical to Lagrange's equations:

\[
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{x}_i} - \frac{\partial \mathcal{L}}{\partial x_i} = 0\;, \qquad
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{y}_i} - \frac{\partial \mathcal{L}}{\partial y_i} = 0\;, \qquad
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{z}_i} - \frac{\partial \mathcal{L}}{\partial z_i} = 0
\tag{14}
\]

A very interesting feature of Lagrange's equations is that they always have the same form, independent of the type of coordinates used (whether they are cartesian or not). In addition, they can be applied to systems which are more general than particle systems. Many physical systems (including, for example, one or several solid bodies) can be described at a given time by a set of n independent parameters q_i (i = 1, 2, ..., n), called generalized coordinates. Knowledge of the q_i permits the calculation of the position in space of any point of the system. The motion of this system is therefore characterized by specifying the n functions of time q_i(t). The time-derivatives q̇_i(t) are called the generalized velocities. The state of the system at a given instant t₀ is therefore defined by the set of the q_i(t₀) and q̇_i(t₀). If the forces acting on the system can be derived from a potential energy V(q₁, q₂, ..., q_n), the Lagrangian ℒ(q₁, q₂, ..., q_n; q̇₁, q̇₂, ..., q̇_n) is again the difference T − V between the total kinetic energy and the potential energy. It can be shown that, for any choice of the coordinates q_i, the equations of motion can always be written:

\[
\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{q}_i} - \frac{\partial \mathcal{L}}{\partial q_i} = 0
\tag{15}
\]

where d/dt denotes the total time-derivative:

\[
\frac{\mathrm{d}}{\mathrm{d}t} = \frac{\partial}{\partial t}
+ \sum_{i=1}^{n} \dot{q}_i\, \frac{\partial}{\partial q_i}
+ \sum_{i=1}^{n} \ddot{q}_i\, \frac{\partial}{\partial \dot{q}_i}
\tag{16}
\]
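As an added illustration of (15) that is not taken from this appendix, the following sympy sketch derives the equation of motion of a plane pendulum, whose single generalized coordinate is the angle θ; the symbols m, l, g are illustrative:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
m, l, g = sp.symbols('m l g', positive=True)
theta = sp.Function('theta')

# Lagrangian L = T - V for a pendulum of length l and mass m
L = sp.Rational(1, 2) * m * l**2 * sp.Derivative(theta(t), t)**2 + m * g * l * sp.cos(theta(t))

# Lagrange's equation d/dt(∂L/∂θ') - ∂L/∂θ = 0  gives  θ'' = -(g/l) sin θ
print(euler_equations(L, [theta(t)], t))
```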

Furthermore, it is not really necessary for the forces to be derived from a potential for us to be able to define a Lagrangian and use Lagrange's equations (we shall see an example of this situation in § 4-b). In the general case, the Lagrangian is a function of the coordinates q_i and the velocities q̇_i, and can also be explicitly time-dependent². We shall then write it:

\[
\mathcal{L}(q_i, \dot{q}_i; t)
\tag{17}
\]

Lagrange's equations are important in classical mechanics for several reasons. For one thing, as we have just indicated, they always have the same form, independent of the coordinates which are used. Furthermore, they are more convenient than Newton's equations when the system is complex. Finally, they are of considerable theoretical interest, since they form the foundation of the Hamiltonian formalism (cf. § 3 below), and since they can be derived from a variational principle (§ 5). The first two points are secondary as far as quantum mechanics is concerned, since quantum mechanics treats particle systems almost exclusively and since the quantization rules are stated in cartesian coordinates (cf. Chap. III, § B-5). However, the last point is an essential one, since the Hamiltonian formalism constitutes the point of departure for the quantization of physical systems.

3. The classical Hamiltonian and the canonical equations

For a physical system described by n generalized coordinates, Lagrange's equations (15) constitute a system of n coupled second-order differential equations with n unknown functions, the q_i(t). We shall see that this system can be replaced by a system of 2n first-order equations with 2n unknown functions.

3-a. The conjugate momenta of the coordinates

The conjugate momentum p_i of the generalized coordinate q_i is defined as:

\[
p_i = \frac{\partial \mathcal{L}}{\partial \dot{q}_i}
\tag{18}
\]

p_i is also called the generalized momentum. In the case of a particle system for which the forces are derived from a potential energy, the conjugate momenta of the position variables **r**_i (i = 1, ..., N) are simply [see (13)] the mechanical momenta:

\[
\mathbf{p}_i = m_i\, \dot{\mathbf{r}}_i
\tag{19}
\]

However, we shall see in § 4-b that this is no longer true in the presence of a magnetic field. Instead of defining the state of the system at a given time by the coordinates q_i(t) and the velocities q̇_i(t), we shall henceforth characterize it by the 2n variables:

\[
q_i(t)\;,\quad p_i(t)\;; \qquad i = 1, 2, \ldots, n
\tag{20}
\]

² The Lagrangian is not unique: two functions ℒ(q_i, q̇_i; t) and ℒ′(q_i, q̇_i; t) may lead, using (15), to the same equations of motion. This is true, in particular, if the difference between ℒ′ and ℒ is the total derivative with respect to time of a function F(q_i; t):

\[
\mathcal{L}' = \mathcal{L} + \frac{\mathrm{d}}{\mathrm{d}t}\,F(q_i; t)
= \mathcal{L} + \frac{\partial F}{\partial t} + \sum_i \dot{q}_i\,\frac{\partial F}{\partial q_i}
\]

This amounts to assuming that from the 2n parameters q_i(t) and p_i(t), we can determine the q̇_i(t) uniquely. These variables may be considered as the 2n coordinates of a point defining the state of the system at every time, and moving in a 2n-dimensional space called the phase space.

3-b. The Hamilton-Jacobi canonical equations

The classical Hamiltonian, or Hamilton's function, of the system is, by definition:

\[
\mathcal{H} = \sum_{i=1}^{n} \dot{q}_i\, p_i - \mathcal{L}
\tag{21}
\]

In accordance with convention (20), we eliminate the q̇_i and consider the Hamiltonian to be a function of the coordinates and their conjugate momenta. Like ℒ, ℋ may be explicitly time-dependent:

\[
\mathcal{H}(q_i, p_i; t)
\tag{22}
\]

The total differential of the function ℋ:

\[
\mathrm{d}\mathcal{H} = \sum_i \frac{\partial \mathcal{H}}{\partial q_i}\,\mathrm{d}q_i
+ \sum_i \frac{\partial \mathcal{H}}{\partial p_i}\,\mathrm{d}p_i
+ \frac{\partial \mathcal{H}}{\partial t}\,\mathrm{d}t
\tag{23}
\]

is equal to, using definitions (21) and (18):

\[
\mathrm{d}\mathcal{H}
= \sum_i \left[ p_i\,\mathrm{d}\dot{q}_i + \dot{q}_i\,\mathrm{d}p_i \right]
- \sum_i \frac{\partial \mathcal{L}}{\partial q_i}\,\mathrm{d}q_i
- \sum_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i}\,\mathrm{d}\dot{q}_i
- \frac{\partial \mathcal{L}}{\partial t}\,\mathrm{d}t
= \sum_i \dot{q}_i\,\mathrm{d}p_i
- \sum_i \frac{\partial \mathcal{L}}{\partial q_i}\,\mathrm{d}q_i
- \frac{\partial \mathcal{L}}{\partial t}\,\mathrm{d}t
\tag{24}
\]

Setting (23) and (24) equal, we see that the change from the {q_i, q̇_i} variables to the {q_i, p_i} variables leads to:

\[
\frac{\partial \mathcal{H}}{\partial q_i} = -\frac{\partial \mathcal{L}}{\partial q_i}
\tag{25a}
\]
\[
\frac{\partial \mathcal{H}}{\partial p_i} = \dot{q}_i
\tag{25b}
\]
\[
\frac{\partial \mathcal{H}}{\partial t} = -\frac{\partial \mathcal{L}}{\partial t}
\tag{25c}
\]

Furthermore, using (18) and (25a), we can write Lagrange's equations (15) in the form:

\[
\frac{\mathrm{d}p_i}{\mathrm{d}t} = -\frac{\partial \mathcal{H}}{\partial q_i}
\tag{26}
\]

By grouping terms in (25b) and (26), we obtain the equations of motion:

\[
\frac{\mathrm{d}q_i}{\mathrm{d}t} = \frac{\partial \mathcal{H}}{\partial p_i}\;, \qquad
\frac{\mathrm{d}p_i}{\mathrm{d}t} = -\frac{\partial \mathcal{H}}{\partial q_i}
\tag{27}
\]

which are called the Hamilton-Jacobi canonical equations. As we said, (27) is a system of 2n first-order differential equations for the 2n unknown functions q_i(t) and p_i(t). These equations determine the motion of the point in the phase space.

For an N-particle system whose potential energy is V(**r**_i), we have, according to (13):

\[
\mathcal{H} = \sum_{i=1}^{N} \mathbf{p}_i \cdot \dot{\mathbf{r}}_i - \mathcal{L}
= \sum_{i=1}^{N} \frac{1}{2}\, m_i\, \dot{\mathbf{r}}_i^{\,2} + V(\mathbf{r}_i)
\tag{28}
\]

To express the Hamiltonian in terms of the variables **r**_i and **p**_i, we use (19). This yields:

\[
\mathcal{H}(\mathbf{r}_i, \mathbf{p}_i)
= \sum_{i=1}^{N} \frac{\mathbf{p}_i^{\,2}}{2 m_i} + V(\mathbf{r}_i)
\tag{29}
\]

Note that the Hamiltonian is thus equal to the total energy of the system. The canonical equations:

\[
\frac{\mathrm{d}\mathbf{r}_i}{\mathrm{d}t} = \frac{\mathbf{p}_i}{m_i}\;, \qquad
\frac{\mathrm{d}\mathbf{p}_i}{\mathrm{d}t} = -\boldsymbol{\nabla}_i V
\tag{30}
\]

are equivalent to Newton's equations, (3).
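As a simple numerical illustration of (27), (29) and (30) (an added sketch with arbitrary parameters), Hamilton's equations for a one-dimensional harmonic oscillator can be integrated with a symplectic-Euler step; the Hamiltonian, that is, the total energy, then stays constant to good accuracy along the phase-space trajectory:

```python
import numpy as np

m, omega, dt = 1.0, 1.0, 1e-3                  # illustrative values
q, p = 1.0, 0.0
energy = lambda q, p: p**2 / (2 * m) + 0.5 * m * omega**2 * q**2
H0 = energy(q, p)

for _ in range(10_000):
    p -= dt * m * omega**2 * q                 # dp/dt = -∂H/∂q
    q += dt * p / m                            # dq/dt =  ∂H/∂p

print(H0, energy(q, p))                        # the two values agree closely
```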

4. Applications of the Hamiltonian formalism

4-a. A particle in a central potential

Consider a system composed of a single particle of mass m whose potential energy V(r) depends only on its distance r from the origin. In polar coordinates (r, θ, φ), the components of the particle's velocity on the local axes (Fig. 1) are:

\[
v_r = \dot{r}\;, \qquad v_\theta = r\,\dot{\theta}\;, \qquad v_\varphi = r \sin\theta\;\dot{\varphi}
\tag{31}
\]

so that the Lagrangian, (13), can be written:

\[
\mathcal{L}(r, \theta, \varphi;\, \dot{r}, \dot{\theta}, \dot{\varphi})
= \frac{1}{2}\, m \left[ \dot{r}^2 + r^2\,\dot{\theta}^2 + r^2 \sin^2\theta\;\dot{\varphi}^2 \right] - V(r)
\tag{32}
\]

Figure 1: The unit vectors e_r, e_θ, e_φ of the local axes associated with the point M, where M is defined by its spherical coordinates r, θ, φ.

The conjugate momenta of the three variables r, θ, φ can then be calculated:

\[
p_r = \frac{\partial \mathcal{L}}{\partial \dot{r}} = m\,\dot{r}
\tag{33a}
\]
\[
p_\theta = \frac{\partial \mathcal{L}}{\partial \dot{\theta}} = m\, r^2\, \dot{\theta}
\tag{33b}
\]
\[
p_\varphi = \frac{\partial \mathcal{L}}{\partial \dot{\varphi}} = m\, r^2 \sin^2\theta\;\dot{\varphi}
\tag{33c}
\]

To obtain the Hamiltonian of the particle, we use definition (21). This amounts to adding V(r) to the kinetic energy, expressed in terms of p_r, p_θ and p_φ. We find:

\[
\mathcal{H}(r, \theta, \varphi;\, p_r, p_\theta, p_\varphi)
= \frac{p_r^2}{2m}
+ \frac{1}{2 m r^2}\left[ p_\theta^2 + \frac{p_\varphi^2}{\sin^2\theta} \right]
+ V(r)
\tag{34}
\]

The system of canonical equations [formulas (27)] can be written here:

\[
\frac{\mathrm{d}r}{\mathrm{d}t} = \frac{\partial \mathcal{H}}{\partial p_r} = \frac{p_r}{m}
\tag{35a}
\]
\[
\frac{\mathrm{d}\theta}{\mathrm{d}t} = \frac{\partial \mathcal{H}}{\partial p_\theta} = \frac{p_\theta}{m r^2}
\tag{35b}
\]
\[
\frac{\mathrm{d}\varphi}{\mathrm{d}t} = \frac{\partial \mathcal{H}}{\partial p_\varphi} = \frac{p_\varphi}{m r^2 \sin^2\theta}
\tag{35c}
\]
\[
\frac{\mathrm{d}p_r}{\mathrm{d}t} = -\frac{\partial \mathcal{H}}{\partial r}
= \frac{1}{m r^3}\left[ p_\theta^2 + \frac{p_\varphi^2}{\sin^2\theta} \right] - \frac{\mathrm{d}V}{\mathrm{d}r}
\tag{35d}
\]
\[
\frac{\mathrm{d}p_\theta}{\mathrm{d}t} = -\frac{\partial \mathcal{H}}{\partial \theta}
= \frac{p_\varphi^2}{m r^2}\,\frac{\cos\theta}{\sin^3\theta}
\tag{35e}
\]
\[
\frac{\mathrm{d}p_\varphi}{\mathrm{d}t} = -\frac{\partial \mathcal{H}}{\partial \varphi} = 0
\tag{35f}
\]

The first three of these equations simply give (33); the last three are the real equations of motion. Now, consider the angular momentum of the particle with respect to the origin:

\[
\mathbf{L} = m\,\mathbf{r} \times \mathbf{v}
\tag{36}
\]

Its local components can easily be calculated from (31):

\[
L_r = 0\;, \qquad
L_\theta = -m\, r^2 \sin\theta\;\dot{\varphi} = -\frac{p_\varphi}{\sin\theta}\;, \qquad
L_\varphi = m\, r^2\,\dot{\theta} = p_\theta
\tag{37}
\]

so that:

\[
\mathbf{L}^2 = p_\theta^2 + \frac{p_\varphi^2}{\sin^2\theta}
\tag{38}
\]

From the angular momentum theorem [formula (10)], we know that **L** is a vector which is constant over time, since the force derived from the potential V(r) is central, that is, collinear at each instant³ with the vector **r**. By comparing (34) and (38), we see that the Hamiltonian depends on the angular variables and their conjugate momenta only through the intermediary of **L**²:

\[
\mathcal{H}(r, \theta, \varphi;\, p_r, p_\theta, p_\varphi)
= \frac{p_r^2}{2m} + \frac{1}{2 m r^2}\,\mathbf{L}^2(\theta;\, p_\theta, p_\varphi) + V(r)
\tag{39}
\]

Now, assume that the initial angular momentum of the particle is **L**₀. Since the angular momentum remains constant, the Hamiltonian (39) and the equation of motion (35d) are the same as they would be for a particle of mass m, in a one-dimensional problem, placed in the effective potential:

\[
V_{\mathrm{eff}}(r) = V(r) + \frac{\mathbf{L}_0^{\,2}}{2 m r^2}
\tag{40}
\]

³ This conclusion can also be derived from (35e) and (35f) by calculating the time-derivatives of the components of **L** on the fixed axes Ox, Oy, Oz.
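A small numerical illustration (ours; the attractive Kepler potential V(r) = −k/r and the parameter values are arbitrary) of the effective potential (40): its minimum gives the radius of the circular orbit, which for this potential is r_c = L₀²/(mk):

```python
import numpy as np

m, k, L0 = 1.0, 1.0, 0.9                          # illustrative values

def V_eff(r):
    return -k / r + L0**2 / (2.0 * m * r**2)      # equation (40) with V(r) = -k/r

r = np.linspace(0.2, 5.0, 100_000)
print(r[np.argmin(V_eff(r))], L0**2 / (m * k))    # both ≈ 0.81: the circular-orbit radius
```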

4-b. A charged particle placed in an electromagnetic field

Now, consider a particle of mass m and charge q placed in an electromagnetic field characterized by the electric field vector **E**(**r**, t) and the magnetic field vector **B**(**r**, t).

α. Description of the electromagnetic field. Gauges

**E**(**r**, t) and **B**(**r**, t) satisfy Maxwell's equations:

\[
\boldsymbol{\nabla} \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}
\tag{41a}
\]
\[
\boldsymbol{\nabla} \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}
\tag{41b}
\]
\[
\boldsymbol{\nabla} \cdot \mathbf{B} = 0
\tag{41c}
\]
\[
\boldsymbol{\nabla} \times \mathbf{B} = \mu_0\,\mathbf{j} + \varepsilon_0 \mu_0\,\frac{\partial \mathbf{E}}{\partial t}
\tag{41d}
\]

where ρ(**r**, t) and **j**(**r**, t) are the volume charge density and the current density producing the electromagnetic field. The fields **E** and **B** can be described by a scalar potential U(**r**, t) and a vector potential **A**(**r**, t), since equation (41c) implies that there exists a vector field **A**(**r**, t) such that:

\[
\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}(\mathbf{r}, t)
\tag{42}
\]

(41b) can thus be written:

\[
\boldsymbol{\nabla} \times \left[ \mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} \right] = 0
\tag{43}
\]

Consequently, there exists a scalar function U(**r**, t) such that:

\[
\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} = -\boldsymbol{\nabla} U(\mathbf{r}, t)
\tag{44}
\]

The set of the two potentials **A**(**r**, t) and U(**r**, t) constitutes what is called a gauge for describing the electromagnetic field. The electric and magnetic fields can be calculated from the {**A**, U} gauge by:

\[
\mathbf{B}(\mathbf{r}, t) = \boldsymbol{\nabla} \times \mathbf{A}(\mathbf{r}, t)
\tag{45a}
\]
\[
\mathbf{E}(\mathbf{r}, t) = -\boldsymbol{\nabla} U(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\mathbf{A}(\mathbf{r}, t)
\tag{45b}
\]

A given electromagnetic field, that is, a pair of fields **E**(**r**, t) and **B**(**r**, t), can be described by an infinite number of gauges, which, for this reason, are said to be equivalent.

If we know one gauge, {**A**, U}, which yields the fields **E** and **B**, all the equivalent gauges, {**A**′, U′}, can be found from the gauge transformation formulas:

\[
\mathbf{A}'(\mathbf{r}, t) = \mathbf{A}(\mathbf{r}, t) + \boldsymbol{\nabla}\chi(\mathbf{r}, t)
\tag{46a}
\]
\[
U'(\mathbf{r}, t) = U(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\chi(\mathbf{r}, t)
\tag{46b}
\]

where χ(**r**, t) is any scalar function. First of all, it is easy to show from (46) that:

\[
\boldsymbol{\nabla} \times \mathbf{A}'(\mathbf{r}, t) = \boldsymbol{\nabla} \times \mathbf{A}(\mathbf{r}, t)\;;
\qquad
-\boldsymbol{\nabla} U'(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\mathbf{A}'(\mathbf{r}, t)
= -\boldsymbol{\nabla} U(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\mathbf{A}(\mathbf{r}, t)
\tag{47}
\]

Any gauge {**A**′, U′} which satisfies (46) therefore yields the same electric and magnetic fields as {**A**, U}. Conversely, we shall show that if two gauges, {**A**, U} and {**A**′, U′}, are equivalent, there must exist a function χ(**r**, t) which establishes relations (46) between them. Since, by hypothesis:

\[
\mathbf{B}(\mathbf{r}, t) = \boldsymbol{\nabla} \times \mathbf{A}(\mathbf{r}, t) = \boldsymbol{\nabla} \times \mathbf{A}'(\mathbf{r}, t)
\tag{48}
\]

we have:

\[
\boldsymbol{\nabla} \times (\mathbf{A}' - \mathbf{A}) = 0
\tag{49}
\]

This implies that **A**′ − **A** is the gradient of a scalar function:

\[
\mathbf{A}' - \mathbf{A} = \boldsymbol{\nabla}\chi(\mathbf{r}, t)
\tag{50}
\]

χ(**r**, t) is, for the moment, determined only to within an arbitrary function of t, f(t). Furthermore, the fact that the two gauges are equivalent means that:

\[
\mathbf{E}(\mathbf{r}, t)
= -\boldsymbol{\nabla} U(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\mathbf{A}(\mathbf{r}, t)
= -\boldsymbol{\nabla} U'(\mathbf{r}, t) - \frac{\partial}{\partial t}\,\mathbf{A}'(\mathbf{r}, t)
\tag{51}
\]

that is:

\[
\boldsymbol{\nabla}\,(U' - U) + \frac{\partial}{\partial t}\,(\mathbf{A}' - \mathbf{A}) = 0
\tag{52}
\]

According to (50), we must have:

\[
\boldsymbol{\nabla}\,(U' - U) = -\frac{\partial}{\partial t}\,\boldsymbol{\nabla}\chi(\mathbf{r}, t)
\tag{53}
\]

Consequently, the functions U′ − U and −∂χ/∂t can differ only by a function of t; thus, we can choose f(t) so as to make them equal:

\[
U' - U = -\frac{\partial}{\partial t}\,\chi(\mathbf{r}, t)
\tag{54}
\]

This completes the determination of the function χ(**r**, t) (to within an additive constant). Two equivalent gauges must therefore satisfy relations of the form (46).
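Relations (46) and (47) are easy to check explicitly. The sympy sketch below, with an arbitrarily chosen gauge function χ and arbitrary potentials A and U (purely illustrative), verifies that a gauge transformation leaves both B = ∇ × A and E = −∇U − ∂A/∂t unchanged:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
chi = sp.sin(x * t) + y**2 * z                        # arbitrary gauge function χ(r, t)
A = sp.Matrix([y * t, -x * z, z * sp.cos(t)])         # arbitrary vector potential
U = x * y * sp.exp(-t)                                # arbitrary scalar potential

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])

Ap, Up = A + grad(chi), U - sp.diff(chi, t)           # gauge transformation (46)
print(sp.simplify(curl(Ap) - curl(A)))                                  # zero vector: same B
print(sp.simplify((-grad(Up) - Ap.diff(t)) - (-grad(U) - A.diff(t))))   # zero vector: same E
```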


β. Equations of motion and the Lagrangian

In the electromagnetic field, the charged particle is subject to the Lorentz force:

\[
\mathbf{F} = q\,[\mathbf{E} + \mathbf{v} \times \mathbf{B}]
\tag{55}
\]

(where **v** is the velocity of the particle at the time t). Newton's law therefore gives the equations of motion in the form:

\[
m\,\ddot{\mathbf{r}} = q\,[\mathbf{E}(\mathbf{r}, t) + \dot{\mathbf{r}} \times \mathbf{B}(\mathbf{r}, t)]
\tag{56}
\]

Projecting this equation onto Ox and using (45), we obtain:

\[
m\,\ddot{x}
= q\left[ -\frac{\partial U}{\partial x} - \frac{\partial A_x}{\partial t} \right]
+ q\left[ \dot{y}\left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right)
+ \dot{z}\left( \frac{\partial A_z}{\partial x} - \frac{\partial A_x}{\partial z} \right) \right]
\tag{57}
\]

It can easily be shown that these equations can be derived, by using (15), from the Lagrangian:

\[
\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t)
= \frac{1}{2}\, m\, \dot{\mathbf{r}}^{\,2}
+ q\, \dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t)
- q\, U(\mathbf{r}, t)
\tag{58}
\]

Therefore, although the Lorentz force is not derived from a potential energy, we can find a Lagrangian for the problem. Let us show that Lagrange's equations (15) do yield the equations of motion (56), using the Lagrangian (58). To do so, we shall first calculate:

\[
\frac{\partial \mathcal{L}}{\partial \dot{x}} = m\,\dot{x} + q\, A_x(\mathbf{r}, t)\;; \qquad
\frac{\partial \mathcal{L}}{\partial x}
= q\, \dot{\mathbf{r}} \cdot \frac{\partial \mathbf{A}}{\partial x}(\mathbf{r}, t)
- q\, \frac{\partial U}{\partial x}(\mathbf{r}, t)
\tag{59}
\]

Lagrange's equation for the x-coordinate can therefore be written:

\[
\frac{\mathrm{d}}{\mathrm{d}t}\left[ m\,\dot{x} + q\, A_x(\mathbf{r}, t) \right]
- q\, \dot{\mathbf{r}} \cdot \frac{\partial \mathbf{A}}{\partial x}(\mathbf{r}, t)
+ q\, \frac{\partial U}{\partial x}(\mathbf{r}, t) = 0
\tag{60}
\]

Writing this equation explicitly and using (16), we again get (57):

\[
m\,\ddot{x}
+ q\left[ \frac{\partial A_x}{\partial t} + \dot{x}\,\frac{\partial A_x}{\partial x}
+ \dot{y}\,\frac{\partial A_x}{\partial y} + \dot{z}\,\frac{\partial A_x}{\partial z} \right]
- q\left[ \dot{x}\,\frac{\partial A_x}{\partial x} + \dot{y}\,\frac{\partial A_y}{\partial x}
+ \dot{z}\,\frac{\partial A_z}{\partial x} \right]
+ q\,\frac{\partial U}{\partial x} = 0
\tag{61}
\]

that is:

\[
m\,\ddot{x}
= q\left[ -\frac{\partial U}{\partial x} - \frac{\partial A_x}{\partial t} \right]
+ q\left[ \dot{y}\left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right)
+ \dot{z}\left( \frac{\partial A_z}{\partial x} - \frac{\partial A_x}{\partial z} \right) \right]
\tag{62}
\]

γ. Momentum. The classical Hamiltonian

The Lagrangian (58) enables us to calculate the conjugate momenta of the cartesian coordinates x, y, z of the particle. For example:

\[
p_x = \frac{\partial \mathcal{L}}{\partial \dot{x}} = m\,\dot{x} + q\, A_x(\mathbf{r}, t)
\tag{63}
\]

The momentum **p** of the particle, which is, by definition, the vector whose components are (p_x, p_y, p_z), is no longer equal, as it was in (19), to the mechanical momentum m\(\dot{\mathbf{r}}\):

\[
\mathbf{p} = m\, \dot{\mathbf{r}} + q\, \mathbf{A}(\mathbf{r}, t)
\tag{64}
\]

Finally, we shall write the classical Hamiltonian:

\[
\mathcal{H}(\mathbf{r}, \mathbf{p}; t)
= \mathbf{p} \cdot \dot{\mathbf{r}} - \mathcal{L}
= \mathbf{p} \cdot \frac{1}{m}\,(\mathbf{p} - q\mathbf{A})
- \frac{1}{2m}\,(\mathbf{p} - q\mathbf{A})^2
- \frac{q}{m}\,(\mathbf{p} - q\mathbf{A}) \cdot \mathbf{A}
+ q\, U
\tag{65}
\]

that is:

\[
\mathcal{H}(\mathbf{r}, \mathbf{p}; t)
= \frac{1}{2m}\left[ \mathbf{p} - q\,\mathbf{A}(\mathbf{r}, t) \right]^2 + q\, U(\mathbf{r}, t)
\tag{66}
\]

Comment:

The Hamiltonian formalism therefore uses the potentials **A** and U, and not the fields **E** and **B** directly. The result is that the description of the particle depends on the gauge chosen. It is reasonable to expect, however, since the Lorentz force is expressed in terms of the fields, that predictions concerning the physical behavior of the particle must be the same for two equivalent gauges: the physical consequences of the Hamiltonian formalism are said to be gauge-invariant. The concept of gauge invariance is analyzed in detail in Complement H_III.
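As an added numerical illustration of the Hamiltonian (66) and of the difference (64) between the momentum **p** and the mechanical momentum m\(\dot{\mathbf{r}}\), consider a charge in a uniform field B = B e_z described in the symmetric gauge A = (−By/2, Bx/2, 0), U = 0 (this particular gauge and the numerical values are our own choices). Integrating the canonical equations shows that the particle's speed stays constant, as expected since the magnetic force does no work:

```python
import numpy as np

q_c, m, B, dt = 1.0, 1.0, 1.0, 1e-4                # illustrative values

def A_of(r):                                       # symmetric gauge for B = B e_z
    x, y, _ = r
    return np.array([-B * y / 2.0, B * x / 2.0, 0.0])

r = np.array([1.0, 0.0, 0.0])
v0 = np.array([0.0, 1.0, 0.0])
p = m * v0 + q_c * A_of(r)                         # canonical momentum, equation (64)

dA_dx = np.array([0.0, B / 2.0, 0.0])              # ∂A/∂x (constant in this gauge)
dA_dy = np.array([-B / 2.0, 0.0, 0.0])             # ∂A/∂y

for _ in range(100_000):
    v = (p - q_c * A_of(r)) / m                    # dr/dt = ∂H/∂p = (p - qA)/m
    dp = q_c * np.array([v @ dA_dx, v @ dA_dy, 0.0])   # dp_i/dt = -∂H/∂x_i = q v·∂A/∂x_i
    r = r + dt * v
    p = p + dt * dp

v = (p - q_c * A_of(r)) / m
print(np.linalg.norm(v0), np.linalg.norm(v))       # speed is conserved (to integration accuracy)
```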

5. The principle of least action

Classical mechanics can be based on a variational principle, the principle of least action. In addition to its theoretical importance, the concept of action serves as the foundation of the Lagrangian formulation of quantum mechanics (cf. Complement J_III). This is why we shall now briefly discuss the principle of least action and show how it leads to Lagrange's equations.

5-a. Geometrical representation of the motion of a system

First of all, consider a particle constrained to move along the Ox axis. Its motion can be represented by tracing, in the (x, t) plane, the curve defined by the law of motion which yields x(t). More generally, let us study a physical system described by n generalized coordinates q_i (for an N-particle system in three-dimensional space, n = 3N). It is convenient to interpret the q_i as the coordinates of a point M in an n-dimensional Euclidean space ℛ. There is then a one-to-one correspondence between the positions of the system and the points M of ℛ.

With each motion of the system is associated a motion of the point M in ℛ, characterized by the n-dimensional vector function Q(t) whose components are the q_i(t). As in the simple case of a single particle moving in one dimension, the motion of the point M, that is, the motion of the system, can be represented by the graph of Q(t), which is a curve in an (n + 1)-dimensional space-time (the time axis is added to the n dimensions of ℛ). This curve characterizes the motion being studied.

5-b. The principle of least action

The q_i(t) can be fixed arbitrarily; this gives the point M and the system an arbitrary motion. But their real behavior is defined by the initial conditions and the equations of motion. Suppose that we know that, in the course of the real motion, M is at Q₁ at time t₁ and at Q₂ at a subsequent time t₂ (as is shown schematically by Figure 2):

\[
Q(t_1) = Q_1\;; \qquad Q(t_2) = Q_2
\tag{67}
\]

There is an infinite number of a priori possible motions which satisfy conditions (67). They are represented by all the curves⁴, or paths in space-time, which connect the points (Q₁, t₁) and (Q₂, t₂) (cf. Fig. 2).

Figure 2: The path in space-time which is associated with a given motion of the physical system. The "t-axis" represents the time, and the "Q-axis" symbolizes the set of generalized coordinates q_i.

Consider such a path in space-time, Γ, characterized by the vector function Q_Γ(t) which satisfies (67). If:

\[
\mathcal{L}(q_1, q_2, \ldots, q_n;\, \dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n;\, t)
\equiv \mathcal{L}(q_i, \dot{q}_i; t)
\tag{68}
\]

⁴ Excluding, of course, the curves which "go backward", that is, which would give two distinct positions of M for the same time t.

is the Lagrangian of the system, the action S_Γ which corresponds to the path Γ is, by definition:

\[
S_\Gamma = \int_{t_1}^{t_2} \mathrm{d}t\; \mathcal{L}\bigl(q_{i\Gamma}(t),\, \dot{q}_{i\Gamma}(t);\, t\bigr)
\tag{69}
\]

[the function to be integrated depends only on t; it is obtained by replacing, in the Lagrangian (68), the q_i and q̇_i by the time-dependent coordinates q_{iΓ}(t) and q̇_{iΓ}(t) of Γ]. The principle of least action can then be stated in the following way: of all the paths in space-time connecting (Q₁, t₁) with (Q₂, t₂), the one which is actually followed (that is, the one which characterizes the real motion of the system) is the one for which the action is minimal. In other words, when we go from the path which is actually followed to one infinitely close to it, the action does not vary to first order. Note the analogy with other variational principles, such as Fermat's principle in optics.

5-c. Lagrange's equations as a consequence of the principle of least action

In conclusion, we shall show how Lagrange's equations can be deduced from the principle of least action. Suppose that the real motion of the system under study is characterized by the functions of time q_i(t), that is, by the path in space-time Γ connecting the points (Q₁, t₁) and (Q₂, t₂). Now consider an infinitely close path, Γ′ (Fig. 3), for which the generalized coordinates are equal to:

\[
q_i'(t) = q_i(t) + \delta q_i(t)
\tag{70}
\]

where the δq_i(t) are infinitesimally small and satisfy conditions (67), that is:

\[
\delta q_i(t_1) = \delta q_i(t_2) = 0
\tag{71}
\]

The generalized velocities q̇_i′(t) corresponding to Γ′ can be obtained by differentiating relations (70):

\[
\dot{q}_i'(t) = \dot{q}_i(t) + \frac{\mathrm{d}}{\mathrm{d}t}\,\delta q_i(t)
\tag{72}
\]

Thus, their increments δq̇_i(t) = q̇_i′(t) − q̇_i(t) are simply:

\[
\delta \dot{q}_i(t) = \frac{\mathrm{d}}{\mathrm{d}t}\,\delta q_i(t)
\tag{73}
\]

We now calculate the variation δS of the action in going from the path Γ to the path Γ′:

\[
\delta S = \int_{t_1}^{t_2} \delta\mathcal{L}\;\mathrm{d}t
= \int_{t_1}^{t_2} \mathrm{d}t \sum_i
\left[ \frac{\partial \mathcal{L}}{\partial q_i}\,\delta q_i
+ \frac{\partial \mathcal{L}}{\partial \dot{q}_i}\,\delta \dot{q}_i \right]
= \int_{t_1}^{t_2} \mathrm{d}t \sum_i
\left[ \frac{\partial \mathcal{L}}{\partial q_i}\,\delta q_i
+ \frac{\partial \mathcal{L}}{\partial \dot{q}_i}\,\frac{\mathrm{d}}{\mathrm{d}t}\,\delta q_i \right]
\tag{74}
\]

Figure 3: Two paths in space-time which pass through the points (Q₁, t₁) and (Q₂, t₂): the solid-line curve Γ is the path associated with the real motion of the system, and the dashed-line curve Γ′ is another, infinitely close, path.

according to (73). If we integrate the second term by parts, we obtain:

\[
\delta S = \sum_i \left[ \frac{\partial \mathcal{L}}{\partial \dot{q}_i}\,\delta q_i \right]_{t_1}^{t_2}
+ \int_{t_1}^{t_2} \mathrm{d}t \sum_i
\left[ \frac{\partial \mathcal{L}}{\partial q_i}
- \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{q}_i} \right] \delta q_i
= \int_{t_1}^{t_2} \mathrm{d}t \sum_i
\left[ \frac{\partial \mathcal{L}}{\partial q_i}
- \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{q}_i} \right] \delta q_i
\tag{75}
\]

since the integrated term is zero, because of conditions (71). If Γ is the path in space-time which is actually followed during the real motion of the system, the increment δS of the action is zero, according to the principle of least action. For this to be so, it is necessary and sufficient that:

\[
\frac{\partial \mathcal{L}}{\partial q_i}
- \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial \dot{q}_i} = 0\;;
\qquad i = 1, 2, \ldots, n
\tag{76}
\]

It is obvious that this condition is sufficient. It is also necessary, since, if there existed a time interval during which expression (76) were non-zero for a given value of the index i, the δq_i(t) could be chosen so as to make the corresponding increment δS different from zero (it would suffice, for example, to choose them so as to make the product [∂ℒ/∂q_i − (d/dt) ∂ℒ/∂q̇_i] δq_i always positive or zero). Consequently, the principle of least action is equivalent to Lagrange's equations.
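The variational statement can also be made concrete numerically (an added sketch; the harmonic-oscillator Lagrangian, the endpoints and the perturbation are arbitrary choices): discretizing the action (69) and comparing the true path with perturbed paths that satisfy (71) shows that the action is stationary, and in this case minimal, on the true path:

```python
import numpy as np

m, omega = 1.0, 1.0
t = np.linspace(0.0, 1.0, 2001)                    # t1 = 0, t2 = 1
dt = t[1] - t[0]

def action(q):                                     # discretized S = ∫ dt [½ m q̇² - ½ m ω² q²]
    qdot = np.diff(q) / dt
    q_mid = 0.5 * (q[1:] + q[:-1])
    return np.sum(0.5 * m * qdot**2 - 0.5 * m * omega**2 * q_mid**2) * dt

q_true = np.sin(t)                                 # solves q̈ = -ω² q with these endpoints
for eps in (0.0, 0.05, 0.2):
    q = q_true + eps * np.sin(np.pi * t)           # perturbation vanishing at t1 and t2
    print(eps, action(q))                          # smallest value at eps = 0
```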


References and suggestions for further reading:

See section 6 of the bibliography, in particular Marion (6.4), Goldstein (6.6), and Landau and Lifshitz (6.7). For a simple presentation of the use of variational principles in physics, see Feynman II (7.2), Chap. 19. For the Lagrangian formalism applied to a classical field, see Bogoliubov and Shirkov (2.15), Chap. I.


Bibliography

BIBLIOGRAPHY OF VOLUMES I and II

1. QUANTUM MECHANICS: GENERAL REFERENCES A - INTRODUCTORY TEXTS Quantum Physics (1.1) E. H. WICHMANN, Berkeley Physics Course, Vol. 4: Quantum Physics, McGrawHill, New York (1971). (1.2) R. P. FEYNMAN, R. B. LEIGHTON and M. SANDS, The Feynman Lectures on Physics, Vol. III: Quantum Mechanics, Addison-Wesley, Reading, Mass. (1965). (1.3) R. EISBERG and R. RESNICK, Quantum Physics of Atoms, Molecules, Solids, Nuclei and Particules, Wiley, New York (1974). (1.4) M. ALONSO and E. J. FINN, Fundamental University Physics, Vol. III: Quantum and Statistical Physics, Addison Wesley, Reading, Mass. (1968). (1.5) U. FANO and L. FANO, Basic Physics of Atoms and Molecules, Wiley, New York (1959). (1.6) J. C. SLATER, Quantum Theory of Matter, McGraw-Hill, New York (1968). Quantum mechanics (1.7) S. BOROWITZ, Fundamentals of Quantum Mechanics, Benjamin, New York (1967). (1.8) S. I. TOMONAGA, Quantum Mechanics, Vol. I: Old Quantum Theory, North Holland, Amsterdam (1962). (1.9) L. PAULING and E. B. WILSON JR., Introduction to Quantum Mechanics, McGrawHill, New York (1935). (1.10) Y. AYANT et E. BELORIZKY, Cours de Mécanique Quantique, Dunod, Paris (1969). (1.11) P. T. MATTHEWS, Introduction to Quantum Mechanics, McGraw-Hill, New York (1963). (1.12) J. AVERY, The Quantum Theory of Atoms, Molecules and Photons, McGraw-Hill, London (1972).




B - MORE ADVANCED TEXTS: (1.13) P. A. M. DIRAC, The Principles of Quantum Mechanics, Oxford University Press (1958). (1.14) R. H. DICKE and J. P. WITTKE, Introduction to Quantum Mechanics, AddisonWesley, Reading, Mass. (1966). (1.15) D. I. BLOKHINTSEV, Quantum Mechanics, D. Reidel, Dordrecht (1964). (1.16) E. MERZBACIIER, Quantum Mechanics, Wiley, New York (1970). (1.17) A. MESSIAH, Mécanique Quantique, Vols 1 et 2, Dunod, Paris (1964). (1.18) L. I. SCHIFF, Quantum Mechanics, McGraw-Hill, New York (1968). (1.19) L. D. LANDAU and E. M. LIFSHITZ, Quantum Mechanics, Nonrelativistic Theory, Pergamon Press, Oxford (1965). (1.20) A. S. DAVYDOV, Quantum Mechanics, Translated, edited and with additions by D. Ter HAAR, Pergamon Press, Oxford (1965). (1.21) H. A. BETHE and R. W. JACKIW, Intermediate Quantum Mechanics, Benjamin, New York (1968). (1.22) H. A. KRAMERS, Quantum Mechanics, North Holland, Amsterdam (1958). C - PROBLEMS IN QUANTUM MECHANICS (1.23) Selected Problems in Quantum Mechanics, Collected and edited by D. Ter HAAR, Infosearch, London (1964). (1.24) S. FLÜGGE, Practical Quantum Mechanics, I and II, Springer-Verlag, Berlin (1971). D - ARTICLES (1.25) E. SCHRÖDINGER, “What is Matter?”, Scientific American 189, 52 (Sept. 1953). (1.26) G. GAMOW, “The Principle of Uncertainty”, Scientific American 198, 51 (Jan. 1958). (1.27) G. GAMOW, “The Exclusion Principle”, Scientific American 201, 74 (July 1959). (1.28) M. BORN and W. BIEM, “Dualism in Quantum Theory”, Physics Today 21, p. 51 (Aug. 1968). (1.29) W. E. LAMB JR., “An Operational Interpretation of Nonrelativistic Quantum Mechanics”, Physics Today 22, 23 (April 1969). (1.30) M. O. SCULLY and M. SARGENT III, “The Concept of the Photon”, Physics Today 25, 38 (March 1972). (1.31) A. EINSTEIN, “Zur Quantentheorie der Strahlung”, Physik. Z. 18, 121 (1917). 1546


(1.32) A. GOLDBERG, H. M.. SCHEY and J. L. SCHWARTZ, “Computer-Generated Motion Pictures of One-Dimensional Quantum-Mechanical Transmission and Reflection Phenomena”, Am. J. Phys., 35, 177 (1967). (1.33) R. P. FEYNMAN, F. L. VERNON JR. and R. W. HELLWARTH, “Geometrical Representation of the Schrödinger Equation for Solving Maser Problems”, J. Appl. Phys. 28, 49 (1957). (1.34) A. A. VUYLSTEKE, “Maser States in Ammonia-Inversion”, Am. J. Phys. 27, 554 (195 2. QUANTUM MECHANICS: MORE SPECIALIZED REFERENCES A - COLLISIONS (2.1) T. Y. WU and T. OHMURA, Quantum Theory of Scattering, Prentice Hall, Englewood Cliffs (1962). (2.2) R. G. NEWTON, Scattering Theory of Waves and Particles, McGraw-Hill, New York (1966). (2.3) P. ROMAN, Advanced Quantum Theory, Addison-Wesley, Reading, Mass. (1965). (2.4) M. L. GOLDBERGER and K. M. WATSON, Collision Theory, Wiley, New York (1964). (2.5) N. F. MOTT and H. S. W. MASSEY, The Theory of Atomic Collisions, Oxford University Press (1965). B - RELATIVISTIC QUANTUM MECHANICS (2.6) J. D. BJORKEN and S. D. DRELL, Relativistic Quantum Mechanics, McGrawHill, New York (1964). (2.7) J. J. SAKURAI, Advanced Quantum Mechanics, Addison-Wesley, Reading, Mass. (1967). (2.8) V. B. BERESTETSKII, E. M. LIFSHITZ and L. P. PITAEVSKII, Relativistic Quantum Theory, Pergamon Press, Oxford (1971). C - FIELD THEORY. QUANTUM ELECTRODYNAMICS (2.9) F. MANDL, Introduction to Quantum Field Theory, Wiley Interscience, New York (1959). (2.10) J. D. BJORKEN and S. D. DRELL, Relativistic Quantum Fields, McGraw-Hill, New York (1965). (2.11) E. A. POWER, Introductory Quantum Electrodynamics, Longmans, London (1964). (2.12) R. P. FEYNMAN, Quantum Electrodynamics, Benjamin, New York (1961). 1547


(2.13) W. HEITLER, The Quantum Theory of Radiation, Clarendon Press, Oxford (1954). (2.14) A. I. AKHIEZER and V. B. BERESTETSKII, Quantum Electrodynamics, Wiley Interscience, New York (1965). (2.15) N. N. BOGOLIUBOV and D. V. SHIRKOV, Introduction to the Theory of Quantized Fields, Interscience Publishers, New York (1959). (2.16) S. S. SCHWEBER, An Introduction to Relativistic Quantum Field Theory, Harper and Row, New York (1961). (2.17) M. M. STERNHEIM, “Resource Letter TQE-1 : Tests of Quantum Electrodynamics”, Am. J. Phys. 40, 1363 (1972). D - ROTATIONS AND GROUP THEORY (2.18) P. H. E. MEIJER and E. BAUER, Group Theory, North Holland, Amsterdam (1962). (2.19) M. E. ROSE, Elementary Theory of Angular Momentum, Wiley, New York (1957). (2.20) M. E. ROSE, Multipole Fields, Wiley, New York (1955). (2.21) A. R. EDMONDS, Angular Momentum in Quantum Mechanics, Princeton University Press (1957). (2.22) M. TINKHAM, Group Theory and Quantum Mechanics, McGraw-Hill, New York (1964). (2.23) E. P. WIGNER, Group Theory and its Application to the Quantum Mechanics of Atomic Spectra, Academic Press, New York (1959). (2.24) D. PARK, “Resource Letter SP-I on Symmetry in Physics”, Am. J. Phys. 36, 577 (1968). E - MISCELLANEOUS (2.25) R. P. FEYNMAN and A. R. HIBBS, Quantum Mechanics and Path Integrals, McGraw-Hill, New York (1965). (2.26) J. M. ZIMAN, Elements of Advanced Quantum Theory, Cambridge University Press (1969). (2.27) F. A. KAEMPFFER, Concepts in Quantum Mechanics, Academic Press, New York (1965). F - ARTICLES (2.28) P. MORRISON, “The Overthrow of Parity”, Scientific American 196, 45 (April 1957). (2.29) G. FEINBERG and M. GOLDHABER, “The Conservation Laws of Physics”, Scientific American 209, 36 (Oct. 1963). 1548


(2.30) E. P. WIGNER, “Violations of Symmetry in Physics”, Scientific American 213, 28 (Dec. 1965). (2.31) U. FANO, “Description of States in Quantum Mechanics by Density Matrix and Operator Techniques”, Rev. Mod. Phys. 29, 74 (1957). (2.32) D. Ter HAAR, “Theory and Applications of the Density Matrix”, Rept. Progr. Phys. 24, 304 (1961). (2.33) V. F. WEISSKOPF and E. WIGNER, “Berechnung der Natürlichen Linienbreite auf Grund der Diracschen Lichttheorie”, Z. Physik 63, 54 (1930). (2.34) A. DALGARNO and J. T. LEWIS, “The Exact Calculation of Long-Range Forces between Atoms by Perturbation Theory”, Proc. Roy. Soc. A 233, 70 (1955). (2.35) A. DALGARNO and A. L. STEWART, “On the Perturbation Theory of Small Disturbances”, Proc. Roy. Soc. A 238, 269 (1957). (2.36) C. SCHWARTZ, “Calculations in Schrödinger Perturbation Theory”, Annals of Physics (New York), 6, 156 (1959). (2.37) J. O. HIRSCHFELDER, W. BYERS BROWN and S. T. EPSTEIN, “Recent Developments in Perturbation Theory”, in Advances in Quantum Chemistry, P. O. LOWDIN ed., Vol. I, Academic Press, New York (1964). (2.38) R. P. FEYNMAN, “Space Time Approach to Nonrelativistic Quantum Mechanics”, Rev. Mod. Phys., 20, 367 (1948). (2.39) L. VAN HOVE, “Correlations in Space and Time and Born Approximation Scattering in Systems of Interacting Particles”, Phys. Rev. 95, 249 (1954).

3. QUANTUM MECHANICS: FUNDAMENTAL EXPERIMENTS Interference effects with weak light: (3.1) G. I. TAYLOR, “Interference Fringes with Feeble Light”, Proc. Camb. Phil. Soc. 15, 114 (1909). (3.2) G. T. REYNOLDS, K. SPARTALIAN and D. B. SCARL, “Interference Effects Produced by Single Photons”, Nuovo Cimento 61 B, 355 (1969). Experimental verification of Einstein’s law for the photoelectric effect; measurements of (3.3) A. L. HUGHES, “On the Emission Velocities of Photoelectrons”, Phil. Trans. Roy. Soc. 212, 205 (1912). (3.4) R. A. MILLIKAN, “A Direct Photoelectric Determination of Planck’s h”, Phys. Rev. 7 355 (1916).



The Franck-Hertz experiment: (3.5) J. FRANCK und G. HERTZ, “Über Zusammenstösse zwischen Elecktronen und den Molekullen des Quecksilberdampfes und die Ionisierungsspannung desselben”, Verhandlungen der Deutschen Physikalischen Gesellschaft, 16, 457 (1914). “Über Kinetik von Elektronen und Ionen in Gasen”, Physikalische Zeitschrift 17, 409 (1916). The proportionality between the magnetic moment and the angular momentum: (3.6) A. EINSTEIN und J. W. DE HAAS, “Experimenteller Nachweis der Ampereschen Molekularströme”, Verhandlungen der Deutschen Physikalischen Gesellschaft 17, 152 (1915). (3.7) E. BECK, “Zum Experimentellen Nachweis der Ampereschen Molekularströme”, Annalen der Physik (Leipzig) 60, 109 (1919). The Stern-Gerlach experiment: (3.8) W. GERLACH und O. STERN, “Der Experimentelle Nachweis der Richtungsquantelung im Magnetfeld”, Zeitschrift für Physik 9, 349 (1922). The Compton effect: (3.9) A. H. COMPTON, “A Quantum Theory of the Scattering of X-Rays by Light Elements”, Phys. Rev., 21, 483 (1923). “Wavelength Measurements of Scattered X-Rays”, Phys. Rev., 21, 715 (1923). Electron diffration: (3.10) C. DAVISSON and L. H. GERMER, “Diffraction of Electrons by a Crystal of Nickel”, Phys. Rev. 30, 705 (1927). The Lamb shift: (3.11) W. E. LAMB JR. and R. C. RETHERFORD, “Fine Structure of the Hydrogen Atom”, I - Phys. Rev. 79, 549 (1950), II - Phys. Rev. 81, 222 (1951). Hyperfine structure of the hydrogen atom: (3.12) S. B. CRAMPTON, D. KLEPPNER and N. F. RAMSEY, “Hyperfine Separation of Ground State Atomic Hydrogen”, Phys. Rev. Letters 11, 338 (1963).



Some fundamental experiments are described in: (3.13) O. R. FRISCH, “Molecular Beams”, Scientific American 212, 58 (May 1965).

4. QUANTUM MECHANICS: HISTORY (4.1) L. DE BROGLIE, “Recherches sur la Théorie des Quanta”, Annales de Physique(Paris), 3, 22 (1925). (4.2) N. BOHR, “The Solvay Meetings and the Development of Quantum Mechanics”, Essays 1958-1962 on Atomic Physics and Human Knowledge, Vintage, New York (1966). (4.3) W. HEISENBERG, Physics and Beyond: Encounters and Conversations, Harper and Row, New York (1971). La Partie et le Tout, Albin Michel, Paris (1972). (4.4) Niels Bohr, His life and work as seen by his friends and colleagues, S. ROZENTAL, ed., North Holland, Amsterdam (1967). (4.5) A. EINSTEIN, M. and H. BORN, Correspondance 1916-1955, Editions du Seuil, Paris (1972). See also La Recherche, 3, 137 (fev. 1972). (4.6) Theoretical Physics in the Twentieth Century, M. FIERZ and V. F. WEISSKOPF eds., Wiley Interscience, New York (1960). (4.7) Sources of Quantum Mechanics, B. L. VAN DER WAERDEN ed., North Holland, Amsterdam (1967); Dover, New York (1968). (4.8) M. JAMMER, The Conceptual Development of Quantum Mechanics, McGrawHill, New York (1966). This book traces the historical development of quantum mechanics. Its very numerous footnotes provide a multitude of references. See also (5.13). ARTICLES (4.9) K. K. DARROW, “The Quantum Theory”, Scientific American 186, 47 (March 1952). (4.10) M. J. KLEIN, “Thermodynamics and Quanta in Planck’s work”, Physics Today 19, 23 (Nov. 1966). (4.11) H. A. MEDICUS, “Fifty Years of Matter Waves”, Physics Today 27, 38 (Feb. 1974). Reference (5.12) contains a large number of references to the original articles.



5. QUANTUM MECHANICS: DISCUSSION OF ITS FOUNDATIONS A - GENERAL PROBLEMS (5.1) D. BOHM, Quantum Theory, Constable, London (1954). (5.2) J. M. JAUCH, Foundations of Quantum Mechanics, Addison-Wesley, Reading, Mass. (1968). (5.3) B. D’ESPAGNAT, Conceptual Foundations of Quantum Mechanics, Benjamin, New York (1971); Conceptions de la Physique Contemporaine. Les Interprétations de la Mécanique Quantique et de la Mesure, Hermann, Paris (1965). (5.4) Proceedings of the International School of Physics “Enrico Fermi” (Varenna), Course IL; Foundations of Quantum Mechanics, B. D’ESPAGNAT ed., Academic Press, New York (1971). (5.5) B. S. DEWITT, “Quantum Mechanics and Reality”, Physics Today 23, 30, (Sept. 1970). (5.6) “Quantum Mechanics debate”, Physics Today 24, 36 (April 1971). See also (1.28). (5.7) F. LALOË, Do we really understand quantum mechanics?, Cambridge University Press, (second edition 2019). See also (1.28). B - MISCELLANEOUS INTERPRETATIONS (5.8) N. BOHR, “Discussion with Einstein on Epistemological Problems in Atomic Physics”, in A. Einstein: Philosopher-Scientist, P. A. SCHILPP ed., Harper and Row, New York (1959). (5.9) M. BORN, Natural Philosophy of Cause and Chance, Oxford University Press, London (1951); Clarendon Press, Oxford (1949). (5.10) L. DE BROGLIE, Une Tentative d’Interprétation Causale et Non Linéaire de la Mécanique Ondulatoire: la Théorie de la Double Solution, Gauthier-Villars, Paris (1956); Etude Critique des Bases de l’Interprétation Actuelle de la Mécanique Ondulatoire, Gauthier-Villars, Paris (1963). (5.11) The Many-Worlds Interpretation of Quantum Mechanics, B. S. DEWITT and N. GRAHAM eds., Princeton University Press (1973). A very complete set of references with comments may be found in: (5.12) B. S. DEWITT and R. N. GRAHAM, “Resource Letter IQM-1 on the Interpretation of Quantum Mechanics”, Am. J. Phys. 39, 724 (1971). (5.13) M. JAMMER, The Philosophy of Quantum Mechanics, Wiley-interscience, New York (1974). General presentation of the different interpretations and formalisms of quantum mechanics. Contains many references. 1552


C - MEASUREMENT THEORY (5.14) K. GOTTFRIED, Quantum Mechanics, Vol. I, Benjamin, New York (1966). (5.15) D. I. BLOKIIINTSEV, Principes Essentiels de la Mécanique Quantique, Dunod, Paris (1968). (5.16) A. SHIMONY, “Role of the Observer in Quantum Theory”, Am. J. Phys., 31, 755 (1963). See also (5.13), Chap. 11. D - HIDDEN VARIABLES AND “PARADOXES” (5.17) A. EINSTEIN, B. PODOLSKY and N. ROSEN, “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?”, Phys. Rev. 47, 777 (1935). N. BOHR, “Can Quantum Mechanical Description of Physical Reality Be Considered Complete?”, Phys. Rev. 48, 696 (1935). (5.18) Paradigms and Paradoxes, the Philosophical Challenge of the Quantum Domain, R. G. COLODNY ed., University of Pittsburgh Press (1972). (5.19) J. S. BELL, “On the Problem of Hidden Variables in Quantum Mechanics”, Rev. Mod. Phys. 38, 447 (1966). See also Ref. (4.8), as well as (5.12) and Chap. 7 of (5.13).

6. CLASSICAL MECHANICS A - INTRODUCTORY LEVEL (6.1) M. ALONSO and E. J. FINN, Fundamental University Physics, Vol. I: Mechanics, Addison-Wesley, Reading, Mass. (1967). (6.2) C. KITTEL, W. D. KNIGHT and M. A. RUDERMAN, Berkeley Physics Course, Vol. 1: Mechanics, McGraw-Hill, New York (1962). (6.3) R. P. FEYNMAN, R. B. LEIGHTON and M. SANDS, The Feynman Lectures on Physics, Vol. I: Mechanics, Radiation, and Heat, Addison-Wesley, Reading, Mass. (1966). (6.4) J. B. MARION, Classical Dynamics of Particles and Systems, Academic Press, New York (1965). B - MORE ADVANCED LEVEL (6.5) A. SOMMERFELD, Lectures on Theoretical Physics, Vol. I: Mechanics, Academic Press, New York (1964). (6.6) H. GOLDSTEIN, Classical Mechanics, Addison-Wesley, Reading, Mass. (1959). 1553

Bibliography

(6.7) L. D. LANDAU and E. M. LIFSHITZ, Mechanics, Pergamon Press, Oxford (1960); Mécanique, 3e éd., Ed. Mir, Moscou (1969).

7. ELECTROMAGNETISM AND OPTICS A - INTRODUCTORY LEVEL (7.1) E. M. PURCELL, Berkeley Physics Course, Vol. 2: Electricity and Magnetism, McGraw-Hill, New York (1965). F. S. CRAWFORD JR., Berkeley Physics Course, Vol. 3: Waves, McGraw-Hill, New York (1968). (7.2) R. P. FEYNMAN, R. B. LEIGHTON and M. SANDS, The Feynman Lectures on Physics, Vol. II: Electromagnetism and Matter, Addison-Wesley, Reading, Mass. (1966). (7.3) M. ALONSO and E. J. FINN, Fundamental University Physics, Vol. II: Fields and Waves, Addison-Wesley, Reading, Mass. (1967). (7.4) E. HECHT and A. ZAJAC, Optics, Addison-Wesley, Reading, Mass. (1974). B - MORE ADVANCED LEVEL (7.5) J. D. JACKSON, Classical Electrodynamics, 2e edition Wiley, New York (1975). (7.6) W. X. H. PANOFSKY and M. PHILLIPS, Classical Electricity and Magnetism, Addison-Wesley, Reading, Mass. (1964). (7.7) J. A. STRATTON, Electromagnetic Theory, McGraw-Hill, New York (1941). (7.8) M. BORN and E. WOLF, Principles of Optics, Pergamon Press, London (1964). (7.9) A. SOMMERFELD, Lectures on Theoretical Physics, Vol. IV: Optics, Academic Press, New York (1964). (7.10) G. BRUHAT, Optique, 5e completed by A. KASTLER, Masson, Paris (1954). (7.11) L. LANDAU and E. LIFSHITZ, The Classical Theory of Fields, Addison-Wesley, Reading, Mass. (1951); Pergamon Press, London (1951); Théorie du champ, 2e éd., Ed. Mir, Moscou (1966). (7.12) L. D. LANDAU and E. M. LIFSHITZ, Electrodynamics of Continuous Media, Pergamon Press, Oxford (1960). (7.13) L. BRILLOUIN, Wave Propagation and Group Velocity, Academic Press, New York (1960).

1554


8. THERMODYNAMICS. STATISTICAL MECHANICS A - INTRODUCTORY LEVEL (8.1) F. REIF, Berkeley Physics Course, Vol. 5: Statistical Physics, McGraw-Hill, New York (1967). (8.2) C. KITTEL, Thermal Physics, Wiley, New York (1969). (8.3) G. BRUHAT, Thermodynamique, 5 édition remaniée par A. KASTLER, Masson, Paris (1962). See also (1.4), part 2, and (6.3). B - MORE ADVANCED LEVEL (8.4) F. REIF, Fundamentals of Statistical and Thermal Physics, McGraw-Hill, New York (1965). (8.5) R. CASTAING, Thermodynamique Statistique, Masson, Paris (1970). (8.6) P. M. MORSE, Thermal Physics, Benjamin, New York (1964). (8.7) R. KUBO, Statistical Mechanics, North Holland, Amsterdam and Wiley, New York (1965). (8.8) L. D. LANDAU and E. M. LIFSHITZ, Course of Theoretical Physics, Vol. 5: Statistical Physics, Pergamon Press, London (1963). (8.9) H. B. CALLEN, Thermodynamics, Wiley, New York (1961). (8.10) A. B. PIPPARD, The Elements of Classical Thermodynamics, Cambridge University Press (1957). (8.11) R. C. TOLMAN, The Principles of Statistical Mechanics, Oxford University Press (1950).

9. RELATIVITY A - INTRODUCTORY LEVEL (9.1) J. H. SMITH, Introduction to Special Relativity, Benjamin, New York (1965). See also references (6.2) and (6.3). B - MORE ADVANCED LEVEL (9.2) J. L. SYNGE, Relativity: The Special Theory, North Holland, Amsterdam (1965). (9.3) R. D. SARD, Relativistic Mechanics, Benjamin, New York (1970). (9.4) J. AHARONI, The Special Theory of Relativity, Oxford University Press, London (1959). 1555


(9.5) C. MØLLER, The Theory of Relativity, Oxford University Press, London (1972). (9.6) P. G. BERGMANN, Introduction to the Theory of Relativity, Prentice Hall, Englewood Cliffs (1960). (9.7) C. W. MISNER, K. S. THORNE and J. A. WHEELER, Gravitation, Freeman, San Francisco (1973). See also references on Electromagnetism, in particular (7.5) and (7.11). Other useful references are : (9.8) A. EINSTEIN, Quatre Conférences sur la Théorie de la Relativité, Gauthier-Villars, Paris (1971). (9.9) A. EINSTEIN, La Théorie de la Relativité Restreinte et Générale. La Relativité et le Problème de l’Espace, Gauthier-Villars, Paris (1971). (9.10) A. EINSTEIN, The Meaning of Relativity, Methuen, London (1950). (9.11) A. EINSTEIN, Relativity, the Special and General Theory, a Popular Exposition, Methuen, London (1920); H. Holt, New York (1967). A much more complete set of references can be found in: (9.12) G. HOLTON, Resource Letter SRT-1 on Special Relativity Theory, Am. J. Phys. 30, 462 (1962).

10. MATHEMATICAL METHODS A - ELEMENTARY GENERAL TEXTS (10.1) J. BASS, Cours de Mathématiques, Vols. I, II and III, Masson, Paris (1961). (10.2) A. ANGOT, Compléments de Mathématiques, Revue d’Optique, Paris (1961). (10.3) T. A. BAK and J. LICHTENBERG, Mathematics for Scientists, Benjamin, New York (1966). (10.4) G. ARFKEN, Mathematical Methods for Physicists, Academic Press, New York (1966). (10.5) J. D. JACKSON, Mathematics for Quantum Mechanics, Benjamin, New York (1962). B - MORE ADVANCED GENERAL TEXTS (10.6) J. MATHEWS and R. L. WALKER, Mathematical Methods of Physics, Benjamin, New York (1970). (10.7) L. SCHWARTZ, Méthodes Mathématiques pour les Sciences Physiques, Hermann, Paris (1965). Mathematics for the Physical Sciences, Hermann, Paris (1968). 1556


(10.8) E. BUTKOV, Mathematical Physics, Addison-Wesley, Reading, Mass (1968). (10.9) H. CARTAN, Théorie Elémentaire des Fonctions Analytiques d’une ou Plusieurs Variables Complexes, Hermann, Paris (1961). Elementary Theory of Analytic Functions of One or Several Complex Variables, Addison-Wesley, Reading, Mass. (1966). (10.10) J. VON NEUMANN, Mathematical Foundations of Quantum Mechanics, Princeton University Press (1955). (10.11) R. COURANT and D. HILBERT, Methods of Mathematical Physics, Vols. I and II, Wiley, Interscience, New York (1966). (10.12) E. T. WHITTAKER and G. N. WATSON, A Course of Modern Analysis, Cambridge University Press (1965). (10.13) P. M. MORSE and H. FESHBACH, Methods of Theoretical Physics, McGraw-Hill, New York (1953). C - LINEAR ALGEBRA. HILBERT SPACES (10.14) A. C. AITKEN, Determinants and Matrices, Oliver and Boyd, Edinburgh (1956). (10.15) R. K. EISENSCHITZ, Matrix Algebra for Physicists, Plenum Press, New York (1966). (10.16) M. C. PEASE III, Methods of Matrix Algebra, Academic Press, New York (1965). (10.17) J. L. SOULE, Linear Operators in Hilbert Space, Gordon and Breach, New York (1967). (10.18) W. SCHMEIDLER, Linear Operators in Hilbert Space, Academic Press, New York (1965). (10.19) N. I. AKHIEZER and I. M. GLAZMAN, Theory of Linear Operators in Hilbert Space, Ungar, New York (1961). D - FOURIER TRANSFORMS. DISTRIBUTIONS (10.20) R. STUART, Introduction to Fourier Analysis, Chapman and Hall, London (1969). (10.21) M. J. LIGHTHILL, Introduction to Fourier Analysis and Generalized Functions, Cambridge University Press (1964). (10.22) L. SCHWARTZ, Théorie des Distributions, Hermann, Paris (1967). (10.23) I. M. GEL’FAND and G. E. SHILOV, Generalized Functions, Academic Press, New York (1964). (10.24) F. OBERHETTINGER, Tabellen zur Fourier Transformation, Springer-Verlag, Berlin (1957).

1557


E - PROBABILITY AND STATISTICS (10.25) Elements of Probability Theory, Academic Press, New York (1966). (10.26) P. G. HOEL, S. C. PORT and C. J. STONE, Introduction to Probability Theory, Houghton- Mifflin, Boston (1971). (10.27) H. G. TUCKER, An Introduction to Probability and Mathematical Statistics, Academic Press, New York (1965). (10.28) J. LAMPERTI, Probability, Benjamin, New York (1966). (10.29) W. FELLER, An Introduction to Probability Theory and its Applications, Wiley, New York (1968). (10.30) L. BREIMAN, Probability, Addison-Wesley, Reading, Mass. (1968). F - GROUP THEORY Applied to physics: (10.31) H. BACRY, Lectures on Group Theory, Gordon and Breach, New York (1967). (10.32) M. HAMERMESH, Group Theory and its Application to Physical Problems, AddisonWesley, Reading, Mass. (1962). See also (2.18), (2.22), (2.23) or reference (16.13), which provides a simple introduction to continuous groups in physics. More mathematical: (10.33) G. PAPY, Groupes, Presses Universitaires de Bruxelles, Bruxelles (1961); Groups, Macmillan, New York (1964). (10.34) A. G. KUROSH, The Theory of Groups, Chelsea, New York (1960). (10.35) L. S. PONTRYAGIN, Topological Groups, Gordon and Breach, New York (1966). G - SPECIAL FUNCTIONS AND TABLES (10.36) A. GRAY and G. B. MATHEWS, A Treatise on Bessel Functions and their Applications to Physics, Dover, New York (1966). (10.37) E. D. RAINVILLE, Special Functions, Macmillan, New York (1965). (10.38) W. MAGNUS, F. OBERHETTINGER and R. P. SONI, Formulas and Theorems for the Special Functions of Mathematical Physics, Springer-Verlag, Berlin (1966). (10.39) BATEMAN MANUSCRIPT PROJECT, Higher Transcendental Functions, Vols. I, II and III, A. ERDELYI ed., McGraw-Hill, New York (1953). (10.40) M. ABRAMOWITZ and I. A. STEGUN, Handbook of Mathematical Functions, Dover, New York (1965). 1558


(10.41) L. J. COMRIE, Chambers’s Shorter Six-Figure Mathematical Tables, Chambers, London (1966). (10.42) E. JAHNKE and F. EMDE, Tables of Functions, Dover, New York (1945). (10.43) V. S. AIZENSHTADT, V. I. KRYLOV and A. S. METEL’SKII, Tables of Laguerre Polynomials and Functions, Pergamon Press, Oxford (1966). (10.44) H. B. DWIGHT, Tables of Integrals and Other Mathematical Data, Macmillan, New York (1965). (10.45) D. BIERENS DE HAAN, Nouvelles Tables d’Intégrales Définies, Hafner, New York (1957). (10.46) F. OBERHETTINGER and L. BADII, Tables of Laplace Transforms, SpringerVerlag, Berlin (1973). (10.47) BATEMAN MANUSCRIPT PROJECT, Tables of Integral Transforms, Vols. I and II, A. ERDELYI ed., McGraw-Hill, New York (1954). (10.48) M. ROTENBERG, R. BIVINS, N. METROPOLIS and J. K. WOOTEN JR., The 3-j and 6-j symbols, M.I.T. Technology Press (1959); Crosby Lockwood and Sons, London.

11. ATOMIC PHYSICS A - INTRODUCTORY LEVEL (11.1) H. G. KUHN, Atomic Spectra, Longman, London (1969). (11.2) B. CAGNAC and J. C. PEBAY-PEYROULA, Modern Atomic Physics, Vol. 1 : Fundamental Principles, and 2 : Quantum Theory and its Application, Macmillan, London (1975). (11.3) A. G. MITCHELL and M. W. ZEMANSKY, Resonance Radiation and Excited Atoms, Cambridge University Press, London (1961). (11.4) M. BORN, Atomic Physics, Blackie and Son, London (1951). (11.5) H. E. WHITE, Introduction to Atomic Spectra, McGraw-Hill, New York (1934). (11.6) V. N. KONDRATIEV, La Structure des Atomes et des Molécules, Masson, Paris (1964). See also (1.3) and (12.1). B - MORE ADVANCED LEVEL (11.7) G. W. SERIES, The Spectrum of Atomic Hydrogen, Oxford University Press, London (1957). (11.8) J. C. SLATER, Quantum Theory of Atomic Structure, Vols. I and II, McGraw-Hill, New York (1960). 1559

Bibliography

(11.9) A. E. RUARK and H. C. UREY, Atoms, Molecules and Quanta, Vols. I and II, Dover, New York (1964). (11.10) Handbuch der Physik, Vols. XXXV and XXXVI, Atoms, S. FLÜGGE ed., SpringerVerlag Berlin (1956 and 1957). (11.11) N. F. RAMSEY, Molecular Beams, Oxford University Press, London (1956). (11.12) I. I. SOBEL’MAN, Introduction to the Theory of Atomic Spectra, Pergamon Press, Oxford (1972). (11.13) E. U. CONDON and G. H. SHORTLEY, The Theory of Atomic Spectra, Cambridge University Press (1953). C - ARTICLES Many references to articles and books, with comments, can be found in: (11.14) J. C. ZORN, “Resource Letter MB-1 on Experiments with Molecular Beams”, Am. J. Phys. 32, 721 (1964). See also (3.13). (11.15) V. F. WEISSKOPF, “How Light Interacts with Matter”, Scientific American, 219, 60 (Sept. 1968). (11.16) H. R. CRANE, “The g Factor of the Electron”, Scientific American 218, 72 (Jan. 1968). (11.17) M. S. ROBERTS, “Hydrogen in Galaxies”, Scientific American 208, 94 (June 1963). (11.18) S. A. WERNER, R. COLELLA, A. W. OVERHAUSER and C. F. EAGEN, “Observation of the Phase Shift of a Neutron due to Precession in a Magnetic Field”, Phys. Rev. Letters 35, 1053 (1975). See also: H. RAUCH, A. ZEILINGER, G. BADUREK A. WILFING, W. BAUPIESS and U. BONSE, Physics Letters 54 A, 425 (1975). D - EXOTIC ATOMS (11.19) H. C. CORBEN and S. DE BENEDETTI, “The Ultimate Atom”, Scientific American 191, 88 (Dec. 1954). (11.20) V. W. HUGHES, “The Muonium Atom”, Scientific American 214, 93, (April 1966). “Muonium”, Physics Today 20, 29 (Dec. 1967). (11.21) S. DE BENEDETTI, “Mesonic Atoms”, Scientific American 195, 93 (Oct. 1956). (11.22) C. E. WIEGAND, “Exotic Atoms”, Scientific American 227, 102 (Nov. 1972). (11.23) V. W. HUGHES, “Quantum Electrodynamics: Experiment”, in Atomic Physics, B. Bederson, V. W. Cohen and F. M. Pichanick eds., Plenum Press, New York (1969). 1560

Bibliography

(11.24) R. DE VOE, P. M. Mc INTYRE, A. MAGNON, D. Y. STOWELL, R. A. SWANSON and V. L. TELEGDI, “Measurement of the Muonium Hfs Splitting and of the Muon Moment by Double Resonance, and New Value of ”, Phys. Rev. Letters 25, 1779 (1970). (11.25) K. F. CANTER, A. P. MILLS JR. and S. BERKO, “Observations of Positronium Lyman-Radiation”, Phys. Rev. Letters 34, 177 (1975). “Fine-Structure Measurement in the First Excited State of Positronium” Phys. Rev. Letters 34, 1541 (1975). (11.26) V. MEYER et al., “Measurement of the 1s-2s energy interval in muonium”, Phys. Rev. Letters 84, 1136 (2000).

12. MOLECULAR PHYSICS A - INTRODUCTORY LEVEL (12.1) M. KARPLUS and R. N. PORTER, Atoms and Molecules, Benjamin, New York (1970). (12.2) L. PAULING, The Nature of the Chemical Bond, Cornell University Press (1948). See also (1.3), Chap. 12; (1.5) and (11.6). B - MORE ADVANCED LEVEL (12.3) I. N. LEVINE, Quantum Chemistry, Allyn and Bacon, Boston (1970). (12.4) G. HERZBERG, Molecular Spectra and Molecular Structure, Vol. I: Spectra of Diatomic Molecules and Vol. II: Infrared and Raman Spectra of Polyatomic Molecules, D. Van Nostrand Company, Princeton (1963 and 1964). (12.5) H. EYRING, J. WALTER and G. E. KIMBALL, Quantum Chemistry, Wiley, New York (1963). (12.6) C. A. COULSON, Valence, Oxford at the Clarendon Press (1952). (12.7) J. C. SLATER, Quantum Theory of Molecules and Solids, Vol. 1 : Electronic Structure of Molecules, McGraw-Hill, New York (1963). (12.8) Handbuch der Physik, Vol. XXXVII, 1 and 2, Molecules, S. FLÜGGE, ed., Springer Verlag, Berlin (1961). (12.9) D. LANGBEIN, Theory of Van der Waals Attraction, Springer Tracts in Modern Physics, Vol. 72, Springer Verlag, Berlin (1974). (12.10) C. H. TOWNES and A. L. SCHAWLOW, Microwave Spectroscopy, McGraw-Hill, New York (1955). 1561

Bibliography

(12.11) P. ENCRENAZ, Les Molécules Interstellaires, Delachaux et Niestlé, Neuchâtel (1974). See also (11.9), (11.11) and (11.14). C - ARTICLES (12.12) B. V. DERJAGUIN, “The Force Between Molecules”, Scientific American 203, 47 (July 1960). (12.13) A. C. WAHL, “Chemistry by Computer”, Scientific American 222, 54 (April 1970). (12.14) B. E. TURNER, “Interstellar Molecules”, Scientific American 228,51 (March 1973). (12.15) P. M. SOLOMON, “Interstellar Molecules”, Physics Today 26, 32 (March 1973). See also (16.25).

13. SOLID STATE PHYSICS A - INTRODUCTORY LEVEL (13.1) C. KITTEL, Elementary Solid State Physics, Wiley, New York (1962). (13.2) C. KITTEL, Introduction to Solid State Physics, 3rd ed., Wiley, New York (1966). (13.3) J. M. ZIMAN, Principles of the Theory of Solids, Cambridge University Press, London (1972). (13.4) F. SEITZ, Modern Theory of Solids, McGraw-Hill, New York (1940). B - MORE ADVANCED LEVEL General texts: (13.5) C. KITTEL, Quantum Theory of Solids, Wiley, New York (1963). (13.6) R. E. PEIERLS, Quantum Theory of Solids, Oxford University Press, London (1964). (13.7) N. F. MOTT and H. JONES, The Theory of the Properties of Metals and Alloys, Clarendon Press, Oxford (1936); Dover, New York (1958). More specialized texts: (13.8) M. BORN and K. HUANG, Dynamical Theory of Crystal Lattices, Oxford University Press, London (1954). (13.9) J. M. ZIMAN, Electrons and Phonons, Oxford University Press, London (1960). (13.10) H. JONES, The Theory of Brillouin Zones and Electronic States in Crystals, North Holland, Amsterdam (1962). 1562


(13.11) J. CALLAWAY, Energy Band Theory, Academic Press, New York (1964).
(13.12) R. A. SMITH, Wave Mechanics of Crystalline Solids, Chapman and Hall, London (1967).
(13.13) D. PINES and P. NOZIÈRES, The Theory of Quantum Liquids, Benjamin, New York (1966).
(13.14) D. A. WRIGHT, Semi-Conductors, Associated Book Publishers, London (1966).
(13.15) R. A. SMITH, Semi-Conductors, Cambridge University Press, London (1964).
C - ARTICLES
(13.16) R. L. SPROULL, “The Conduction of Heat in Solids”, Scientific American 207, 92 (Dec. 1962).
(13.17) A. R. MACKINTOSH, “The Fermi Surface of Metals”, Scientific American 209, 110 (July 1963).
(13.18) D. N. LANGENBERG, D. J. SCALAPINO and B. N. TAYLOR, “The Josephson Effects”, Scientific American 214, 30 (May 1966).
(13.19) G. L. POLLACK, “Solid Noble Gases”, Scientific American 215, 64 (Oct. 1966).
(13.20) B. BERTMAN and R. A. GUYER, “Solid Helium”, Scientific American 217, 85 (Aug. 1967).
(13.21) N. MOTT, “The Solid State”, Scientific American 217, 80 (Sept. 1967).
(13.22) M. Ya. AZBEL’, M. I. KAGANOV and I. M. LIFSHITZ, “Conduction Electrons in Metals”, Scientific American 228, 88 (Jan. 1973).
(13.23) W. A. HARRISON, “Electrons in Metals”, Physics Today 22, 23 (Oct. 1969).

14. MAGNETIC RESONANCE
(14.1) A. ABRAGAM, The Principles of Nuclear Magnetism, Clarendon Press, Oxford (1961).
(14.2) C. P. SLICHTER, Principles of Magnetic Resonance, Harper and Row, New York (1963).
(14.3) G. E. PAKE, Paramagnetic Resonance, Benjamin, New York (1962).
See also Ramsey (11.11), Chaps. V, VI and VII.
ARTICLES
(14.4) G. E. PAKE, “Fundamentals of Nuclear Magnetic Resonance Absorption”, I and II, Am. J. Phys. 18, 438 and 473 (1950).


(14.5) E. M. PURCELL, “Nuclear Magnetism”, Am. J. Phys. 22, 1 (1954).
(14.6) G. E. PAKE, “Magnetic Resonance”, Scientific American 199, 58 (Aug. 1958).
(14.7) K. WÜTHRICH and R. C. SHULMAN, “Magnetic Resonance in Biology”, Physics Today 23, 43 (April 1970).
(14.8) F. BLOCH, “Nuclear Induction”, Phys. Rev. 70, 460 (1946).
Numerous other references, in particular to the original articles, can be found in:
(14.9) R. E. NORBERG, “Resource Letter NMR-EPR-1 on Nuclear Magnetic Resonance and Electron Paramagnetic Resonance”, Am. J. Phys. 33, 71 (1965).

15. QUANTUM OPTICS; MASERS AND LASERS
A - OPTICAL PUMPING. MASERS AND LASERS
(15.1) R. A. BERNHEIM, Optical Pumping: An Introduction, Benjamin, New York (1965). Contains numerous references; in addition, several original articles are reproduced in it.
(15.2) Quantum Optics and Electronics, Les Houches Lectures 1964, C. DE WITT, A. BLANDIN and C. COHEN-TANNOUDJI eds., Gordon and Breach, New York (1965).
(15.3) Quantum Optics, Proceedings of the Scottish Universities Summer School 1969, S. M. KAY and A. MAITLAND eds., Academic Press, London (1970). The proceedings of these two summer schools contain several lectures related to optical pumping and quantum electronics.
(15.4) W. E. LAMB JR., Quantum Mechanical Amplifiers, in Lectures in Theoretical Physics, Vol. II, W. BRITTIN and D. DOWNS eds., Interscience Publishers, New York (1960).
(15.5) M. SARGENT III, M. O. SCULLY and W. E. LAMB JR., Laser Physics, Addison-Wesley, New York (1974).
(15.6) A. E. SIEGMAN, An Introduction to Lasers and Masers, McGraw-Hill, New York (1971).
(15.7) L. ALLEN, Essentials of Lasers, Pergamon Press, Oxford (1969). This short book contains the reprints of several original articles related to lasers.
(15.8) L. ALLEN and J. H. EBERLY, Optical Resonance and Two-Level Atoms, Wiley Interscience, New York (1975).
(15.9) A. YARIV, Quantum Electronics, Wiley, New York (1967).


(15.10) H. M. NUSSENZVEIG, Introduction to Quantum Optics, Gordon and Breach, London (1973).
B - ARTICLES
Two “Resource Letters” give and comment on a large number of references:
(15.11) H. W. MOOS, “Resource Letter MOP-1 on Masers (Microwave through Optical) and on Optical Pumping”, Am. J. Phys. 32, 589 (1964).
(15.12) P. CARRUTHERS, “Resource Letter QSL-1 on Quantum and Statistical Aspects of Light”, Am. J. Phys. 31, 321 (1963).
Reprints of many important papers on lasers have been collected in:
(15.13) Laser Theory, F. S. BARNES ed., I.E.E.E. Press, New York (1972).
(15.14) H. LYONS, “Atomic Clocks”, Scientific American 196, 71 (Feb. 1957).
(15.15) J. P. GORDON, “The Maser”, Scientific American 199, 42 (Dec. 1958).
(15.16) A. L. BLOOM, “Optical Pumping”, Scientific American 203, 72 (Oct. 1960).
(15.17) A. L. SCHAWLOW, “Optical Masers”, Scientific American 204, 52 (June 1961). “Advances in Optical Masers”, Scientific American 209, 34 (July 1963). “Laser Light”, Scientific American 219, 120 (Sept. 1968).
(15.18) M. S. FELD and V. S. LETOKHOV, “Laser Spectroscopy”, Scientific American 229, 69 (Dec. 1973).
C - NON-LINEAR OPTICS
(15.19) G. C. BALDWIN, An Introduction to Non-Linear Optics, Plenum Press, New York (1969).
(15.20) F. ZERNIKE and J. E. MIDWINTER, Applied Non-Linear Optics, Wiley Interscience, New York (1973).
(15.21) N. BLOEMBERGEN, Non-Linear Optics, Benjamin, New York (1965). See also the lectures of this author in references (15.2) and (15.3).
D - ARTICLES
(15.22) J. A. GIORDMAINE, “The Interaction of Light with Light”, Scientific American 210, 38 (Apr. 1964). “Non-Linear Optics”, Physics Today 22, 39 (Jan. 1969).



16. NUCLEAR PHYSICS AND PARTICLE PHYSICS
A - INTRODUCTION TO NUCLEAR PHYSICS
(16.1) L. VALENTIN, Physique Subatomique: Noyaux et Particules, Hermann, Paris (1975).
(16.2) D. HALLIDAY, Introductory Nuclear Physics, Wiley, New York (1960).
(16.3) R. D. EVANS, The Atomic Nucleus, McGraw-Hill, New York (1955).
(16.4) M. A. PRESTON, Physics of the Nucleus, Addison-Wesley, Reading, Mass. (1962).
(16.5) E. SEGRE, Nuclei and Particles, Benjamin, New York (1965).
B - MORE ADVANCED NUCLEAR PHYSICS TEXTS
(16.6) A. DESHALIT and H. FESHBACH, Theoretical Nuclear Physics, Vol. 1: Nuclear Structure, Wiley, New York (1974).
(16.7) J. M. BLATT and V. F. WEISSKOPF, Theoretical Nuclear Physics, Wiley, New York (1963).
(16.8) E. FEENBERG, Shell Theory of the Nucleus, Princeton University Press (1955).
(16.9) A. BOHR and B. R. MOTTELSON, Nuclear Structure, Benjamin, New York (1969).
C - INTRODUCTION TO PARTICLE PHYSICS
(16.10) D. H. FRISCH and A. M. THORNDIKE, Elementary Particles, Van Nostrand, Princeton (1964).
(16.11) C. E. SWARTZ, The Fundamental Particles, Addison-Wesley, Reading, Mass. (1965).
(16.12) R. P. FEYNMAN, Theory of Fundamental Processes, Benjamin, New York (1962).
(16.13) R. OMNES, Introduction à l’Etude des Particules Elémentaires, Ediscience, Paris (1970).
(16.14) K. NISHIJIMA, Fundamental Particles, Benjamin, New York (1964).
D - MORE ADVANCED PARTICLE PHYSICS TEXTS
(16.15) B. DIU, Qu’est-ce qu’une Particule Elémentaire? Masson, Paris (1965).
(16.16) J. J. SAKURAI, Invariance Principles and Elementary Particles, Princeton University Press (1964).
(16.17) G. KÄLLEN, Elementary Particle Physics, Addison-Wesley, Reading, Mass. (1964).
(16.18) A. D. MARTIN and T. D. SPEARMAN, Elementary Particle Theory, North Holland, Amsterdam (1970).


(16.19) A. O. WEISSENBERG, Muons, North Holland, Amsterdam (1967).
E - ARTICLES
(16.20) M. G. MAYER, “The Structure of the Nucleus”, Scientific American 184, 22 (March 1951).
(16.21) R. E. PEIERLS, “The Atomic Nucleus”, Scientific American 200, 75 (Jan. 1959).
(16.22) E. U. BARANGER, “The Present Status of the Nuclear Shell Model”, Physics Today 26, 34 (June 1973).
(16.23) S. DE BENEDETTI, “Mesonic Atoms”, Scientific American 195, 93 (Oct. 1956).
(16.24) S. DE BENEDETTI, “The Mössbauer Effect”, Scientific American 202, 72 (April 1960).
(16.25) R. H. HERBER, “Mössbauer Spectroscopy”, Scientific American 225, 86 (Oct. 1971).
(16.26) S. PENMAN, “The Muon”, Scientific American 205, 46 (July 1961).
(16.27) R. E. MARSHAK, “The Nuclear Force”, Scientific American 202, 98 (March 1960).
(16.28) M. GELL-MANN and E. P. ROSENBAUM, “Elementary Particles”, Scientific American 197, 72 (July 1957).
(16.29) G. F. CHEW, M. GELL-MANN and A. H. ROSENFELD, “Strongly Interacting Particles”, Scientific American 210, 74 (Feb. 1964).
(16.30) V. F. WEISSKOPF, “The Three Spectroscopies”, Scientific American 218, 15 (May 1968).
(16.31) U. AMALDI, “Proton Interactions at High Energies”, Scientific American 229, 36 (Nov. 1973).
(16.32) S. WEINBERG, “Unified Theories of Elementary-Particle Interaction”, Scientific American 231, 50 (July 1974).
(16.33) S. D. DRELL, “Electron-Positron Annihilation and the New Particles”, Scientific American 232, 50 (June 1975).
(16.34) R. WILSON, “Form Factors of Elementary Particles”, Physics Today 22, 47 (Jan. 1969).
(16.35) E. S. ABERS and B. W. LEE, “Gauge Theories”, Physics Reports (Amsterdam), 9C, 1 (1973).


Index

[The notation (ex.) refers to an exercise]

Absorption and emission of photons, 2073 collision with, 971 of a quantum, a photon, 1311, 1353 of field, 2149 of several photons, 1368 rates, 1334 Acceptor (electron acceptor), 1495 Acetylene (molecule), 878 Action, 341, 1539, 1980 Addition of angular momenta, 1015, 1043 of spherical harmonics, 1059 of two spins 1/2, 1019 Adiabatic branching of the potential, 932 Adjoint matrix, 123 operator, 112 Algebra (commutators), 165 Allowed energy band, 381, 1481, 1491 Ammonia (molecule), 469, 873 Amplitude scattering amplitude, 929, 953 Angle (quantum), 2258 Angular momentum addition of momenta, 1015, 1043 and rotations, 717 classical, 1529 commutation relations, 669, 725 conservation, 668, 736, 1016 coupling, 1016 electromagnetic field, 1968, 2043 half-integral, 987 of identical particles, 1497(ex.) of photons, 1370 orbital, 667, 669, 685 quantization, 394 quantum, 667 spin, 987, 991 standard representation, 677, 691 two coupled momenta, 1091 Anharmonic oscillator, 502, 1135 Annihilation operator, 504, 513, 514, 1597

Annihilation-creation (pair), 1831, 1878 Anomalous average value, 1828, 1852 dispersion, 2149 Zeeman effect, 987 Anti-normal correlation function, 1782, 1789 Anti-resonant term, 1312 Anti-Stokes (Raman line), 532, 752 Antibunching (photon), 2121 Anticommutation, 1599 field operator, 1754 Anticrossing of levels, 415, 482 Antisymmetric ket, state, 1428, 1431 Antisymmetrizer, 1428, 1431 Applications of the perturbation theory, 1231 Approximation central field approximation, 1459 secular approximation, 1374 Argument (EPR), 2205 Atom(s), see helium, hydrogenoid donor, 837 dressed, 2129, 2133 many-electron atoms, 1459, 1467 mirrors for atoms, 2153 muonic atom, 541 single atom fluorescence, 2121 Atomic beam (deceleration), 2025 orbital, 869, 1496(ex.) parameters, 41 Attractive bosons, 1747 Autler-Townes doublet, 2144 effect, 1410 Autoionization, 1468 Average value (anomalous), 1828 Azimuthal quantum number, 811 Band (energy), 381 Bardeen-Cooper-Schrieffer, 1889 Barrier (potential barrier), 68, 367, 373 Basis 1569




change of bases, 174 characteristic relations, 101, 119 continuous basis in the space of states, 99 mixed basis in the space of states, 99 BCHSH inequalities, 2209, 2210 BCS, 1889 broken pairs and excited pairs, 1920 coherent length, 1909 distribution functions, 1899 elementary excitations, 1923 excited states, 1919 gap, 1894, 1896, 1923 pairs (wave function of), 1901 phase locking, 1893, 1914, 1916 physical mechanism, 1914 two-particle distribution, 1901 Bell’s inequality, 2208 theorem, 2204, 2208 Benzene (molecule), 417, 495 Bessel Bessel-Parseval relation, 1507 spherical Bessel function, 944 spherical equation, 961 spherical function, 966 Biorthonormal decomposition, 2194 Bitter, 2059 Blackbody radiation, 651 Bloch equations, 463, 1358, 1361 theorem, 659 Bogolubov excitations, 1661 Hamiltonian, 1952 operator method, 1950 phonons, spectrum, 1660 transformation, 1950 Bogolubov-Valatin transformation, 1836, 1919 Bohr, 2207 electronic magneton, 856 frequencies, 249 magneton, see front cover pages model, 40, 819 nuclear magneton, 1237 radius, 820 1570

Boltzmann constant, see front cover pages distribution, 1630 Born approximation, 938, 977, 1320 Born-Oppenheimer approximation, 528, 1177, 1190 Born-von Karman conditions, 1490 Bose-Einstein condensation, 1446, 1638, 1940 condensation (repulsive bosons), 1933 condensation of pairs, 1857 distribution, 652, 1630 statistics, 1446 Bosons, 1434 at non-zero temperature, 1745 attractive, 1747 attractive instability, 1745 condensed, 1638 in a Fock state, 1775 paired, 1881 Boundary conditions (periodic), 1489 Bra, 103, 104, 119 Bragg reflection, 382 Brillouin formula, 452 zone, 614 Broadband detector, 2165 optical excitation, 1332 Broadening (radiative), 2138 Broken pairs and excited pairs (BCS), 1920 Brossel, 2059 Bunching of bosons, 1777 C.S.C.O., 133, 137, 153, 236 Canonical commutation relations, 142, 223, 1984 ensemble, 2289 Hamilton-Jacobi canonical equations, 214 Hamilton-Jacobi equations, 1532 Cauchy principal part, 1517 Center of mass, 812, 1528 Center of mass frame, 814 Central



field approximation, 1459 Commutation, 1599 canonical relations, 142, 223 potential, 1533 field operator, 1754 Central potential, 803, 841 of pair field operators, 1861 scattering, 941 relations, 1984 stationary states, 804 Commutation relations Centrifugal potential, 809, 888, 893 angular momentum, 669, 725 Chain (von Neumann), 2201 field, 1989, 1996 Chain of coupled harmonic oscillators, 611 Commutator algebra, 165 Change Commutator(s), 91, 167, 171, 187 of bases, 124, 174, 1601 of functions of operators, 168 of representation, 124 Compatibility of observables, 232 Characteristic equation, 129 Complementarity, 45 Characteristic relation of an orthonormal basis, 116 Complete set of commuting observables (C.S.C.O.), 133, 137, 236 Charged harmonic oscillator in an electric field, 575 Complex variables (Lagrangian), 1982 Charged particle Compton wavelength of the electron, 825, 1235 in an electromagnetic field, 1536 Condensates Charged particle in a magnetic field, 240, 321, 771 relative phase, 2237 Chemical bond, 417, 869, 1189, 1210 with spins, 2254 Chemical potential, 1486, 2287 Condensation Circular quanta, 761, 783 BCS condensation energy, 1917 Classical Bose-Einstein, 1446, 1857, 1933 electrodynamics, 1957 Condensed bosons, 1638 histories, 2272 Conduction band, 1492 Clebsch-Gordan coefficients, 1038, 1051 Conductivity (solid), 1492 Closure relation, 93, 117 Configurations, 1467 Coefficients Conjugate momentum, 214, 323, 1531, 1983, 1987, 1995 Clebsch-Gordan, 1038 Conjugation (Hermitian), 111 Einstein, 1334, 2083 Conservation Coherences (of the density matrix), 307 local conservation of probability, 238 Coherent length (BCS), 1909 of angular momentum, 668, 736, 1016 Coherent state (field), 2008 of energy, 248 Coherent superposition of states, 253, 301, 307 of probability, 237 Collision, 923 Conservative systems, 245, 315 between identical particles, 1454, 1497(ex.) Constants of the motion, 248, 317 between identical particles in classiContact term, 1273 cal mechanics, 1420 Contact term (Fermi), 1238, 1247 between two identical particles, 1450 Contextuality, 2231 cross section, 926 Continuous scattering states, 928 spectrum, 133, 219, 264, 1316 total scattering cross section, 926 variables (in a Lagrangian), 1984 with absorption, 971 Continuum of final states, 1316, 1378, Combination 1380 of atomic orbitals, 1172 Contractions, 1802 1571



Convolution product of two functions, 1510 Cooling Doppler, 2026 down atoms, 2025 evaporative, 2034 Sisyphus, 2034 sub-Doppler, 2155 subrecoil, 2034 Cooper model, 1927 Cooper pairs, 1927 Cooperative effects (BCS), 1916 Correlation functions, 1781, 1804 anti-normal, 1782, 1789 dipole and field, 2113 for one-photon processes, 2084 normal, 1782, 1787 of the field, spatial, 1758 Correlations, 2231 between two dipoles, 1157 between two physical systems, 296 classical and quantum, 2221 introduced by a collision, 1104 Coulomb field, 1962 gauge, 1965 Coulomb potential cross section, 979 Coupling between angular momenta, 1016 between two angular momenta, 1091 between two states, 412 effect on the eigenvalues, 438 spin-orbit coupling, 1234, 1241 Creation and annihilation operators, 504, 513, 514, 1596, 1990 Creation operator (pair of particles), 1813, 1846 Critical velocity, 1671 Cross section and phase shifts, 951 scattering cross section, 926, 933, 953, 972 Current metastable current in superfluid, 1667 of particles, 1758 of probability, 240 probability current in hydrogen atom, 1572

851 Cylindrical symmetry, 899(ex.) Darwin term, 1235, 1279 De Broglie relation, 10 wavelength, see front cover pages, 11, 35 Decay of a discrete state, 1378 Deceleration of an atomic beam, 2025 Decoherence, 2199 Decomposition (Schmidt), 2193 Decoupling (fine or hyperfine structure), 1262, 1291 Degeneracy essential, 811, 825, 845 exchange degeneracy, 1423 exchange degeneracy removal, 1435 lifted by a perturbation, 1125 rotation invariance, 1072 systematic and accidental, 203 Degenerate eigenvalue, 127, 203, 217, 260 Degereracy lifted by a perturbation, 1117 parity, 199 Delta Dirac function, 1515 potential well and barriers, 83–85(ex.) use in quantum mechanics, 97, 106, 280 Density Lagrangian, 1986 of probability, 264 of states, 389, 1316, 1484, 1488 operator, 449, 1391 operator and matrix, 299 particle density operator, 1756 Density functions one and two-particle, 1502(ex.) Depletion (quantum), 1940 Derivative of an operator, 169 Detection probability amplitude (photon), 2166 Detectors (photon), 2165 Determinant Slater determinant, 1438, 1679 Deuterium, 834, 1107(ex.) Diagonalization


of a 2 2 matrix, 429 of an operator, 128 Diagram (dressed-atom), 2133 Diamagnetism, 855 Diatomic molecules rotation, 739 Diffusion (momentum), 2030 Dipole -dipole interaction, 1142, 1153 -dipole magnetic interaction, 1237 electric dipole transition, 863 electric moment, 1080 Hamiltonian, 2011 magnetic dipole moment, 1084 magnetic term, 1272 trap, 2151 Dirac, see Fermi delta function, 97, 106, 280, 1515 equation, 1233 notation, 102 Direct and exchange terms, 1613, 1632, 1634, 1646, 1650 term, 1447, 1453 Discrete bases of the state space, 91 spectrum, 132, 217 Dispersion (anomalous), 2149 Dispersion and absorption (field), 2147 Distribution Boltzmann, 1630 Bose-Einstein, 1630 Fermi-Dirac, 1630 function (bosons), 1629 function (fermions), 1629 functions, 1625, 1733 functions (BCS), 1899 Distribution law Bose-Einstein, 652 Divergence (energy), 2007 Donor atom, 837, 1495 Doppler cooling, 2026 effect, 2022 effect (relativistic), 2022 free spectroscopy, 2105 temperature, 2033


Double condensate, 2237 resonance method, 2059 spin condensate, 2254 Doublet (Autler-Townes), 2144 Down-conversion (parametric), 2181 Dressed states and energies, 2133 Dressed-atom, 2129, 2133 diagram, 2133 strong coupling, 2141 weak coupling, 2137 E.P.R., 1225(ex.) Eckart (Wigner-Eckart theorem), see Wigner Effect Autler-Townes, 2144 Mössbauer, 2040 photoelectric, 2110 Effective Hamiltonian, 2141 Ehrenfest theorem, 242, 319, 522 Eigenresult, 9 Eigenstate, 217, 232 Eigenvalue, 11, 25, 176, 216 degenerate, 217, 260 equation, 126, 429 of an operator, 126 Eigenvector, 176 of an operator, 126 Einstein, 2110 coefficients, 1334, 1356, 2083 EPR argument, 297, 1104 model, 534, 653 Planck-Einstein relations, 3 temperature, 659 Einstein-Podolsky-Rosen, 2204, 2261 Elastic scattering, 925 scattering (photon), 2086 scattering, form factor, 1411(ex.) total cross section, 972 Elastically bound electron model, 1350 Electric conductivity of a solid, 1492 Electric dipole Hamiltonian, 2011 interaction, 1342 1573



matrix elements, 1344 moment, 1080 selection rules, 1345 transition and selection rules, 863 transitions, 2056 Electric field (quantized), 2000, 2005 Electric polarisability NH3 , 484 Electric polarizability of the 1 state in Hydrogen, 1299 Electric quadrupole Hamiltonian, 1347 moment, 1082 transitions, 1348 Electric susceptibility bound electron, 577 of an atom, 1351 Electrical susceptibility, 1223(ex.) Electrodynamics classical, 1957 quantum, 1997 Electromagnetic field and harmonic oscillators, 1968 and potentials, 321 angular momentum, 1968, 2043 energy, 1966 Lagrangian, 1986, 1992 momentum, 1967, 2019 polarization, 1970 quantization, 631, 637 Electromagnetic interaction of an atom with a wave, 1340 Electromagnetism fields and potentials, 1536 Electron spin, 393, 985 Electron(s) configurations, 1463 gas in solids, 1491 in solids, 1177, 1481 mass and charge, see front cover pages Electronic configuration, 1459 paramagnetic resonance, 1225(ex.) shell, 827 Elements of reality, 2205 Emergence of a relative phase, 2248, 2253 1574

Emission of a quantum, 1311 photon, 2080 spontaneous, 2081, 2135 stimulated (or induced), 2081 Energy, see Conservation, Uncertainty and momentum of the transverse electromagnetic field, 1973 band, 381 bands in solids, 1177, 1481 conservation, 248 electromagnetic field, 1966 Fermi energy, 1772 fine structure energy levels, 986 free energy, 2290 levels, 359 levels of harmonic oscillator, 509 levels of hydrogen, 823 of a paired state, 1869 recoil energy, 2023 Ensemble canonical, 2289 grand canonical, 2291 microcanonical, 2285 statistical ensembles, 2295 Entanglement quantum, 2187, 2193, 2203, 2242 swapping, 2232 Entropy, 2286 EPR, 2204, 2261 elements of reality, 2205 EPRB, 2205 paradox/argument, 1104 Equation of state ideal quantum gas, 1640 repulsive bosons, 1745 Equation(s) Bloch, 1361 Hamilton-Jacobi, 1982, 1983, 1988 Lagrange, 1982, 1993 Lorentz, 1959 Maxwell, 1959 Schrödinger, 11, 12, 306 von Neumann, 306 Essential degeneracy, 811, 825 Ethane (molecule), 1223 Ethylene (molecule), 536, 881

INDEX

Evanescent wave, 29, 67, 70, 78, 285 Evaporative cooling, 2034 Even operators, 196 Evolution field operator, 1765 of quantum systems, 223 of the mean value, 241 operator, 313, 2069 operator (expansion), 2070 operator (integral equation), 2069 Exchange, 1611 degeneracy, 1423 degeneracy removal, 1435 energy, 1469 hole, 1774 integral, 1474 term, 1447, 1451, 1453 Excitations BCS, 1923 Bogolubov, 1661 vacuum, 1623 Excited states (BCS), 1919 Exciton, 838 Exclusion principle (Pauli), 1437, 1444, 1463, 1484 Extensive (or intensive) variables, 2292 Fermi contact term, 1238 energy, 1445, 1481, 1486, 1772 gas, 1481 golden rule, 1318 level, 1486, 1621 radius, 1621 surface (modified), 1914 , see Fermi-Dirac Fermi level and electric conductivity, 1492 Fermi-Dirac distribution, 1486, 1630, 1717 statistics, 1446 Fermions, 1434 in a Fock state, 1771 paired, 1874 Ferromagnetism, 1477 Feynman path, 2267

[The notation (ex.) refers to an exercise]

postulates, 341 Fictitious spin, 435, 1359 Field absorption, 2149 commutation relations, 1989, 1996 dispersion and absorption, 2147 intense laser, 2126 interaction energy, 1764 kinetic energy, 1763 normal variables, 1971 operator, 1752 operator (evolution), 1763, 1765 pair field operator, 1861 potential energy, 1764 quantization, 1765, 1999 quasi-classical state, 2008 spatial correlation functions, 1758 Final states continuum, 1378, 1380 Fine and hyperfine structure, 1231 Fine structure constant, see front cover pages, 825 energy levels, 1478 Hamiltonian, 1233, 1276, 1478 Helium atom, 1478 Hydrogen, 1238 of spectral lines, 986 of the states 1 , 2 et 2 , 1276 Fletcher, 2111 Fluctuations boson occupation number, 1633 intensity, 2125 vacuum, 644, 2007 Fluorescence (single atom), 2121 Fluorescence triplet, 2144 Fock space, 1593, 2004 state, 1593, 1614, 1769, 2103 Forbidden, see Band energy band, 381, 390, 1481 transition, 1345 Forces van der Waals, 1151 Form factor elastic scattering, 1411(ex.) Forward scattering (direct and exchange), 1874 Fourier 1575

INDEX

[The notation (ex.) refers to an exercise]

series and transforms, 1505 Fragmentation (condensate), 1654, 1776 Free electrons in a box, 1481 energy, 2290 particle, 14 quantum field (Fock space), 2004 spherical wave, 941, 944, 961 spherical waves and plane waves, 967 Free particle stationary states with well-defined angular momentum, 959 stationary states with well-defined momentum, 19 wave packet, 14, 57, 347 Frequency Bohr, 249 components of the field (positive and negative), 2072 Rabi’s frequency, 1325 Friction (coefficient), 2028 Function of operators, 166 periodic functions, 1505 step functions, 1521 Fundamental state, 41 Gap (BCS), 1894, 1896, 1923 Gauge, 1343, 1536, 1960, 1963 Coulomb, 1965 invariance, 321 Lorenz, 1965 Gaussian wave packet, 57, 292, 2305 Generalized velocities, 214, 1530 Geometric quantization, 2311 Gerlach, see Stern GHZ state, 2222, 2227 Gibbs-Duhem relation, 2296 Golden rule (Fermi), 1318 Good quantum numbers, 248 Grand canonical, 1626, 2291 Grand potential, 1627, 1721, 2292 Green’s function, 337, 936, 1781, 1786, 1789 evolution, 1785 Greenberger-Horne-Zeilinger, 2227 1576

Groenewold’s formula, 2315 Gross-Pitaevskii equation, 1643, 1657 Ground state, 363 harmonic oscillator, 509, 520 Hydrogen atom, 1228(ex.) Group velocity, 55, 60, 614 Gyromagnetic ratio, 396, 455 orbital, 860 spin, 988 H+ 2 molecular ion, 85(ex.), 417, 1189 Hadronic atoms, 840 Hall effect, 1493 Hamilton function, 1532 function and equations, 1531 Hamilton-Jacobi canonical equations, 214, 1532, 1982, 1983, 1988 Hamiltonian, 223, 245, 1527, 1983, 1988, 1995 classical, 1531 effective, 2141 electric dipole, 1342, 2011 electric quadrupole, 1347 fine structure, 1233, 1276 hyperfine, 1237, 1267 magnetic dipolar, 1347 of a charged particle in a vector potential, 1539 of a particle in a central potential, 806, 1533 of a particle in a scalar potential, 225 of a particle in a vector potential, 225, 323, 328 Hanbury Brown and Twiss, 2120 Hanle effect, 1372(ex.) Hard sphere scattering, 980, 981(ex.) Harmonic oscillator, 497 in an electric field, 575 in one dimension, 527, 1131 in three dimensions, 569 in two dimensions, 755 infinite chain of coupled oscillators, 611 quasiclassical states, 583 thermodynamic equilibrium, 647

INDEX

three-dimensional, 841, 899(ex.) two coupled oscillators, 599 Hartree-Fock approximation, 1677, 1701 density operator (one-particle), 1691 equations, 1686, 1731 for electrons, 1695 mean field, 1677, 1693 potential, 1706 thermal equilibrium, 1711, 1733 time-dependent, 1701, 1708 Healing length, 1652 Heaviside step function, 1521 Heisenberg picture, 317, 1763 relations, 19, 39, 41, 45, 55, 232, 290 Helicity (photon), 2051 Helium energy levels, 1467 ion, 838 isotopes, 1480 isotopes 3 He and 4 He, 1435, 1446 solidification, 535 Hermite polynomials, 516, 547, 561 Hermitian conjugation, 111 matrix, 124 operator, 115, 124, 130 Histories (classical), 2272 Hole creation and annihilation, 1622 exchange, 1774 Holes, 1621 Hybridization of atomic orbitals, 869 Hydrogen, 645 atom, 803 atom in a magnetic field, 853, 855, 862 atom, relativistic energies, 1245 Bohr model, 40, 819 energy levels, 823 fine and hyperfine stucture, 1231 ionisation energy, see front cover pages ionization energy, 820 maser, 1251 molecular ion, 85(ex.), 417, 1189 quantum theory, 41

[The notation (ex.) refers to an exercise]

radial equation, 821 Stark effect, 1298 stationary states, 851 stationary wave functions, 830 Hydrogen-like systems in solid state physics, 837 Hydrogenoid systems, 833 Hyperfine decoupling, 1262 Hamiltonian, 1237, 1267 Hyperfine structure, see Hydrogen, muonium, positronium, Zeeman effect, 1231 Muonium, 1281 Ideal gas, 1625, 1787, 1791, 1804 correlations, 1769 Identical particles, 1419, 1591 Induced emission, 1334, 1366, 2081 emission of a quantum, 1311 emission of photons, 1355 Inequality (Bell’s), 2208 Infinite one-dimensional well, 271 Infinite potential well, 74 in two dimensions, 201 Infinitesimal unitary operator, 178 Insulator, 1492 Integral exchange integral, 1474 scattering equation, 935 Intense laser fields, 2126 Intensive (or extensive) variables, 2292 Interaction between magnetic dipoles, 1141 dipole-dipole interaction, 1141, 1153 electromagnetic interaction of an atom with a wave, 1340 field and particles, 2009 field and atom, 2010 magnetic dipole-dipole interaction, 1237 picture, 353, 1393, 2070 tensor interaction, 1141 Interference photons, 2167 two-photon, 2170, 2183 Ion H+ 2 , 1189 1577

INDEX

[The notation (ex.) refers to an exercise]

Ionization photo-ionization, 2109 tunnel ionization, 2126 Isotropic radiation, 2079 Jacobi, see Hamilton Kastler, 2059, 2062 Ket, see state, 103, 119 for identical particles, 1436 Kuhn, see Thomas Lagrange equations, 1530, 1982, 1993 fonction and equations, 214 multipliers, 2281 Lagrangian, 1530, 1980 densities, 1986 electromagnetic field, 1986, 1992 formulation of quantum mechanics, 339 of a charged particle in an electromagnetic field, 1538 particle in an electromagnetic field, 323 Laguerre-Gaussian beams, 2065 Lamb shift, 645, 1245, 1388, 2008 Landau levels, 771 Landé factor, 1072, 1107(ex.), 1256, 1292 Laplacian, 1527 of 1 , 1524 of ( ) +1 , 1526 Larmor angular frequency, 857 precession, 394, 396, 410, 455, 857, 1071 Laser, 1359, 1365 Raman laser, 2093 saturation, 1370 trap, 2151 Lattices (optical), 2153 Least action principle of, 1539 Legendre associated function, 714 polynomial, 713 Length (healing), 1652 Level 1578

anticrossing, 415, 482 Fermi level, 1621 Lifetime, 343, 485, 645 of a discrete state, 1386 radiative, 2081 Lifting of degeneracy by a perturbation, 1125 Light quanta, 3 shifts, 1334, 2138, 2151, 2156 Linear, see operator combination of atomic orbitals, 1172 operators, 90, 108, 163 response, 1350, 1357, 1364 superposition of states, 253 susceptibility, 1365 Local conservation of probability, 238 Local realism, 2209, 2230 Longitudinal fields, 1961 relaxation, 1400 relaxation time, 1401 Lorentz equations, 1959 Lorenz (gauge), 1965 Magnetic dipole term, 1272 dipole-dipole interaction, 1237 effect of a magnetic field on the levels of the Hydrogen atom, 1251 hyperfine Hamiltonian, 1267 interactions, 1232, 1237 quantum number, 811 resonance, 455 susceptibility, 1224, 1487 Magnetic dipole Hamiltonian, 1347 transitions and selection rules, 1084, 1098, 1348 Magnetic dipoles interactions between two dipoles, 1141 Magnetic field and vector potential, 321 charged particle in a, 240, 771 effects on hydrogen atom, 853, 855 harmonic oscillator in a, 899(ex.) Hydrogen atom in a magnetic field, 1263, 1289

INDEX

multiplets, 1074 quantized, 2000, 2005 Magnetism (spontaneous), 1737 Many-electron atoms, 1459 Maser, 477, 1359, 1365 hydrogen, 1251 Mass correction (relativistic), 1234 Master equation, 1358 Matrice(s), 119, 121 diagonalization of a 2 2 matrix, 429 Pauli matrices, 425 unitary matrix, 176 Maxwell’s equations, 1959 Mean field (Hartree-Fock), 1693, 1708, 1725 Mean value of an observable, 228 evolution, 241 Measurement general postulates, 216, 226 ideal von Neumann measurement, 2196 of a spin 1/2, 394 of observables, 216 on a part of a physical system, 293 state after measurement, 221, 227 Mendeleev’s table, 1463 Metastable superfluid flow, 1671 Methane (molecule), 883 Microcanonical ensemble, 2285 Millikan, 2111 Minimal wave packet, 290, 520, 591 Mirrors for atoms, 2153 Mixing of states, 1121, 1137 Model Cooper model, 1927 Einstein model, 534 elastically bound electron, 1350 vector model of atom, 1071 Modes vibrational modes, 599, 611 Modes (radiation), 1974, 1975 Molecular ion, 417 Molecule(s) chemical bond, 417, 869, 873, 878, 883, 1189 rotation, 796 vibration, 527, 1137 vibration-rotation, 885

[The notation (ex.) refers to an exercise]

Mollow, 2144 Moment quadrupole electric moment, 1225(ex.) Momentum, 1539 conjugate, 214, 323, 1983, 1987, 1995 diffusion, 2030 electromagnetic field, 1967, 2019 mechanical momentum, 328 Monogamy (quantum), 2221 Mössbauer effect, 1415, 2040 Motional narrowing, 1323 condition, 1323, 1398, 1408 Multiphoton transition, 1368, 2040, 2097 Multiplets, 1072, 1074, 1467 Multipliers (Lagrange), 2281 Multipolar waves, 2052 Multipole moments, 1077 Multipole operators introduction, 1077, 1083 parity, 1082 Muon, 527, 541, 1281 Muonic atom, 541, 839 Muonium, 835 hyperfine structure, 1281 Zeeman effect, 1281 Narrowing (motional), 1323, 1408 condition, 1398 Natural width, 345, 1388 Need for a quantum treatment, 2118, 2120 Neumann spherical function, 967 Neutron mass, see front cover pages Non-destructive detection of a photon, 2159 Non-diagonal order (BCS), 1912 Non-locality, 2204 Non-resonant excitation, 1350 Non-separability, 2207 Nonlinear response, 1357, 1368 susceptibility, 1369 Norm conservation, 238 of a state vector, 104, 237 of a wave function, 13, 90, 99 1579

INDEX

[The notation (ex.) refers to an exercise]

Normal correlation function, 1782, 1787 variables, 602, 616, 631, 633 variables (field), 1971 Nuclear multipole moments, 1088 Bohr magneton, 1237 Nucleus spin, 1088 volume effect, 1162, 1268 Number occupation number, 1439, 1593 photon number, 2135 total number of particles in an ideal gas, 1635 Observable(s), 130 C.S.C.O., 133, 137 commutation, 232 compatibility, 232 for identical particles, 1429, 1441 mean value, 228 measurement of, 216, 226 quantization rules, 223 symmetric observables, 1441 transformation by permutation, 1434 whose commutator is }, 187, 289 Occupation number, 1439, 1593 operator, 1598 Odd operators, 196 One-particle Hartree-Fock density operator, 1691 operators, 1603, 1605, 1628, 1756 Operator(s) adjoint operator, 112 annihilation operator, 504, 513, 514, 1597 creation and annihilation, 1990 creation operator, 504, 513, 514, 1596 derivative of an operator, 169 diagonalization, 126, 128 even and odd operators, 196 evolution operator, 313, 2069 field, 1752 function of, 166 Hermitian operators, 115 linear operators, 90, 108, 163 1580

occupation number, 1598 one-particle operator, 1603, 1605, 1628, 1756 parity operator, 193 particle density operator, 1756 permutation operators, 1425, 1430 potential, 168 product of, 90 reduced to a single particle, 1607 representation, 121 restriction, 165 restriction of, 1125 rotation operator, 1001 symmetric, 1628, 1755 translation operator, 190 two-particle operator, 1608, 1610, 1631, 1756 unitary operators, 173 Weyl operator, 2300 Oppenheimer, see Born, 1177, 1190 Optical excitation (broadband), 1332 lattices, 2153 pumping, 2062, 2140 Orbital angular momentum (of radiation), 2052 atomic orbital, 1496(ex.) hybridization, 869 linear combination of atomic orbitals, 1172 quantum number, 1463 state space, 988 Order parameter for pairs, 1851 Orthonormal basis, 91, 99, 101, 133 characteristic relation, 116 Orthonormalization and closure relations, 101, 140 relation, 116 Oscillation(s) between two discrete states, 1374 between two quantum states, 418 Rabi, 2134 Oscillator anharmonic, 502 harmonic, 497 strength, 1352 Pair(s)

INDEX

annihilation-creation of pairs, 1831, 1874, 1887 BCS, wave function, 1909 Cooper, 1927 of particles (creation operator), 1813, 1846 pair field (commutation), 1861 pair field operator, 1845 pair wave function, 1851 Paired bosons, 1881 fermions, 1874 state energy, 1869 states, 1811 states (building), 1818 Pairing term, 1878 Paramagnetism, 855 Parametric down-conversion, 2181 Parity, 2106 degeneracy, 199 of a permutation operator, 1431 of multipole operators, 1082 operator, 193 Parseval Parseval-Plancherel equality, 20 Parseval-Plancherel formula, 1511, 1521 Partial reflection, 79 trace of an operator, 309 waves in the potential, 948 waves method, 941 Particle (current), 1758 Particles and holes, 1621 Partition function, 1626, 1627, 1717 Path integral, 2267 space-time path, 339 Pauli exclusion principle, 1437, 1444, 1463, 1481 Hamiltonian, 1009(ex.) matrices, 425, 991 spin theory, 986 spinor, 993 Penetrating orbit, 1463 Penrose-Onsager criterion, 1776, 1860, 1947 Peres, 2212

[The notation (ex.) refers to an exercise]

Periodic boundary conditions, 1489 classification of elements, 1463 functions, 1505 potential (one-dimensional), 375 Permutation operators, 1425, 1430 Perturbation applications of the perturbation theory, 1231 lifting of a degeneracy, 1125 one-dimensional harmonic oscillator, 1131 random perturbation, 1320, 1325, 1390 sinusoidal, 1311 stationary perturbation theory, 1115 Perturbation theory time dependent, 1303 Phase locking (BCS), 1893, 1916 locking (bosons), 1938, 1944 relative phase between condensates, 2237, 2248 velocity, 37 Phase shift (collision), 951, 1497(ex.) with imaginary part, 971 Phase velocity, 21 Phonons, 611, 626 Bogolubov phonons, 1660 Photodetection double, 2172, 2184 single, 2169, 2171 Photoelectric effect, 1412(ex.), 2110 Photoionization, 2109, 2165 rate, 2115, 2124 two-photon, 2123 Photon, 3, 631, 651, 2004, 2005, 2110 absorption and emission, 2067 angular momentum, 1370 antibunching, 2121 detectors, 2165 non-destructive detection, 2159 number, 2135 scattering (elastic), 2086 scattering by an atom, 2085 vacuum, 2007 , see Absorption, Emission Picture 1581

INDEX

[The notation (ex.) refers to an exercise]

Heisenberg, 317, 1763 interaction, 1393, 2070 Pitaevskii (Gross-Pitaevskii equation), 1643, 1657 Plancherel, see Parseval Planck constant, see front cover pages, 3 law , 2083 Planck-Einstein relations, 3, 10 Plane wave, 14, 19, 95, 943 Podolsky (EPR argument), 297, 1104 Pointer states, 2199 Polarizability of the 1 state in Hydrogen, 1299 Polarization electromagnetic field, 1970 of Zeeman components, 1295 space-dependent, 2156 Polynomial method (harmonic oscillator), 555, 842 Polynomials Hermite polynomials, 516, 547, 561 Position and momentum representations, 181 Positive and negative frequency components, 2072 Positron, 1281 Positronium, 836 hyperfine structure, 1281 Zeeman effect, 1281 Postulate (von Neumann projection), 2202 Postulates of quantum mechanics, 215 Potential adiabatic branching, 932 barrier, 26, 68, 367, 373 centrifugal potential, 809, 888, 893 Coulomb potential, cross section, 979 cylindrically symmetric, 899(ex.) Hartree-Fock, 1706 infinite one-dimensional well, 74 operator, 168 scalar and vector potentials, 1536, 1960, 1963 scattering by a, 923 self-consistent potential, 1461 square potential, 63 square well, 29 1582

step, 28, 65, 75, 284 well, 71, 367 well (arbitrary shape), 359 well (infinite one-dimensional), 271 well (infinite two-dimensional, 201 Yukawa potential, 977 Precession Larmor precession, 396, 1071 Thomas precession, 1235 Preparation of a state, 235 Pressure (ideal quantum gas), 1640 Principal part, 1517 Principal quantum number, 827 Principle of least action, 1539, 1980 of spectral decomposition, 11, 216 of superposition, 237 Probability amplitude, 11, 253, 259 conservation, 237 current, 240, 283, 333, 349, 932 current in hydrogen atom, 851 density, 11, 264 fluid, 932 of photon absorption, 2076 of the measurement results, 9, 11 transition probability, 439 Process (pair annihilation-creation), 1878, 1887 Product convolution product of functions, 1510 of matrices, 122 of operators, 90 scalar product, 101, 141, 149, 161 state (tensor product), 311 tensor product, 147 tensor product, applications, 441 Projection theorem, 1070 Projector, 109, 133, 165, 218, 222, 1108(ex.) Propagator for the Schrödinger equation, 335 of a particle, 2267, 2272 Proper result, 9 Proton mass, see front cover pages spin and magnetic moment, 1237, 1274 Pumping, 1358



Pure (state or case), 301

cascade of the dressed atom, 2145 Raman Quadrupolar electric moment, 1082, 1225(ex.) effect, 532, 740, 1373(ex.) Quanta (circular), 761, 783 laser, 2093 Quantization scattering, 2091 electrodynamics, 1997 scattering (stimulated), 2093 electromagnetic field, 631, 637, 1997 Random perturbation, 1320, 1325, 1390 of a field, 1765 Rank (Schmidt), 2196 of angular momentum, 394, 677 Rate (photoionization), 2115, 2124 of energy, 3, 11, 71, 359 Rayleigh of measurement results, 9, 216, 398 line, 752 of the measurement results, 405 scattering, 532, 2089 rules, 11, 223, 226, 2274 Realism (local), 2205, 2209 Quantum Recoil angle, 2258 blocking, 2036 electrodynamics, 1245, 1282, 1997 effect of the nucleus, 834 entanglement, 2187, 2193 energy, 1415, 2023 monogamy, 2221 free atom, 2020 number suppression, 2040 orbital, 1463 Reduced principal quantum number, 827 density operator, 1607 numbers (good), 248 mass, 813 resonance, 417 Reduction of the wave packet, 221, 279 treatment needed, 2118, 2120 Reflection on a potential step, 285 Quasi-classical Refractive index, 2149 field states, 2008 Reiche, see Thomas states, 765, 791, 801 Relation (Gibbs-Duhem), 2296 states of the harmonic oscillator, 583 Relative Quasi-particles, 1736, 1840 motion, 814 Bogolubov phonons, 1954 particle, 814 Quasi-particle vacuum, 1836 phase between condensates, 2248, 2258 phase between spin condensates, 2253 Rabi Relativistic formula, 440, 460, 1324, 1376 corrections, 1233, 1478 formula), 419 Doppler effect, 2022 frequency, 1325 mass correction, 1234 oscillation, 2134 Relaxation, 465, 1358, 1390, 1413, 1414(ex.) Radial general equations, 1397 equation, 842 longitudinal, 1400 equation (Hydrogen), 821 longitudinal relaxation time, 1401 equation in a central potential, 808 transverse, 1403 integral, 1277 transverse relaxation time, 1406 quantum number, 811 Relay state, 2086, 2098, 2106 Radiation Renormalization, 2007 isotropic, 2079 Representation(s) pressure, 2024 change of, 124 Radiative broadening, 2138 in the state space, 116 1583



of operators, 121 position and momentum, 139, 181 Schrödinger equation, 183–185 Repulsion between electrons, 1469 Resonance magnetic resonance, 455 quantum resonance, 417, 1158 scattering resonance, 69, 954, 983(ex.) two resonnaces with a sinusoidal excitation, 1365 width, 1312 with sinusoidal perturbation, 1311 Restriction of an operator, 165, 1125 Rigid rotator, 740, 1222(ex.) Ritz theorem, 1170 Root mean square deviation general definition, 230 Rosen (EPR argument), 297, 1104 Rotating frame, 459 Rotation(s) and angular momentum, 717 invariance and degeneracy, 734 of diatomic molecules, 739 of molecules, 796, 885 operator(s), 720, 1001 rotation invariance, 1478 rotation invariance and degeneracy, 1072 Rotator rigid rotator, 740, 1222(ex.) Rules quantization rules, 2274 selection rules, 197 Rutherford’s formula, 979 Rydberg constant, see front cover pages Saturation of linear response, 1368 of the susceptibility, 1369 Scalar and vector potentials, 321, 1536 interaction between two angular momenta, 1091 observable, operator, 732, 737 potential, 225 product, 89, 92, 101, 141, 149, 161 product of two coherent states, 593 1584

Scattering amplitude, 929, 953 by a central potential, 941 by a hard sphere, 980, 981(ex.) by a potential, 923 cross section, 933, 953, 972 cross section and phase shifts, 951 inelastic, 2091 integral equation, 935 of particles with spin, 1102 of spin 1/2 particles, 1108(ex.) photon, 2086 Raman, 2091 Rayleigh, 532, 2089 resonance, 954, 983(ex.) resonant, 2089 stationary scattering states, 951 stationary states, 928 stimulated Raman, 2093 Schmidt decomposition, 2193 rank, 2196 Schottky anomaly, 654 Schrödinger, 2190 equation, 11, 12, 223, 306 equation in momentum representation, 184 equation in position representation, 183 equation, physical implications, 237 equation, resolution for conservative systems, 245 picture, 317 Schwarz inequality, 161 Second quantization, 1766 harmonic generation, 1368 Secular approximation, 1316, 1374 Selection rules, 197, 863, 2014, 2056 electric quadrupolar, 1348 magnetic dipolar, 1098, 1348 Self-consistent potential, 1461 Semiconductor, 837, 1493 Separability, 2207, 2223 Separable density operator, 2223 Shell (electronic), 827 Shift


light shift, 2138 of a discrete state, 1387 Singlet, 1024, 1474 Sinusoidal perturbation, 1311, 1374 Sisyphus cooling, 2034 effect, 2155 Slater determinant, 1438, 1679 Slowing down atoms, 2025 Solids electronic bands, 1177 energy bands of electrons, 1491 energy bands of electrons in solids, 381 hydrogen-like systems in solid state physics, 837 Space (Fock), 1593 Space-dependent polarization, 2156 Space-time path, 339, 1539 Spatial correlations (ideal gas), 1769 Specific heat of an electron gas, 1484 of metals, 1487 of solids, 653 two level system, 654 Spectral decomposition principle, 7, 11, 216 function, 1795 terms, 1469 Spectroscopy (Doppler free), 2105 Spectrum BCS elementary excitation, 1923 continuous, 219, 264 discrete, 132, 217 of an observable, 126, 216 Spherical Bessel equation, 961 Bessel function, 944, 966 free spherical waves, 961 free wave, 944 Neumann function, 967 wave, 941 waves and plane waves, 967 Spherical harmonics, 689, 705 addition of, 1059 expression for = 0 1 2 , 709 general expression, 707

[The notation (ex.) refers to an exercise]

Spin and magnetic moment of the proton, 1237 angular momentum, 987 electron, 985, 1289 fictitious, 435 gyromagnetic ratio, 396, 455, 988 nuclear, 1088 of the electron, 393 Pauli theory, 986, 988 quantum description, 985, 991 rotation operator, 1001 scattering of particles with spin, 1102 spin 1 and radiation, 2044, 2049, 2050 system of two spins, 441 Spin 1/2 density operator, 449 ensemble of, 1358 fictitious, 1359 interaction between two spins, 1141 preparation and measurement, 401 scattering of spin 1/2 particles, 1108(ex.) Spin-orbit coupling, 1018, 1234, 1241, 1279 Spin-statistics theorem, 1434 Spinor, 993 rotation, 1005 Spontaneous emission, 343, 645, 1301, 2081, 2135 emission of photons, 1356 magnetism of fermions, 1737 Spreading of a wave packet, 59, 348 Square barrier of potential, 26, 68 potential, 26, 63, 75, 283 potential well, 71, 271 spherical well, 982(ex.) Standard representation (angular momentum), 677, 691 Stark effect in Hydrogen atom, 1298 State(s), see Density operator density of, 389, 1316, 1484, 1488 Fock, 1593, 1614, 1769, 2103 ground state, 363 mixing of states by a perturbation, 1121 orbital state space, 988 paired, 1811 1585



pointer states, 2199 quasi-classical states, 583, 765, 791, 801 relay state, 2086, 2098, 2106 stable and unstable states, 485 state after measurement, 221 state preparation, 235 stationary, 63, 359, 375 stationary state, 24, 246 stationary states in a central potential, 804 unstable, 343 vacuum state, 1595 vector, 102, 215 Stationary perturbation theory, 1115 phase condition, 18, 54 scattering states, 928, 951 states, 24, 63, 246, 359 states in a periodic potential, 375 states with well-defined angular momentum, 944, 959 states with well-defined momentum, 943 Statistical entropy, 2217 mechanics (review of), 2285 mixture of states, 253, 299, 304, 450 Statistics Bose-Einstein, 1446 Fermi-Dirac, 1446 Step function, 1521 potential, 28, 65, 75, 284 Stern-Gerlach experiment, 394 Stimulated (or induced) emission, 1334, 1366, 2081 Raman scattering, 2093 Stokes Raman line, 532, 752 Stoner (spontaneous magnetism), 1737 Strong coupling (dressed-atom), 2141 Subrecoil cooling, 2034 Sum rule (Thomas-Reiche-Kuhn), 1352 Superfluidity, 1667, 1674 Superposition of states, 253 1586

principle, 7, 237 principle and physical predictions, 253 Surface (modified Fermi surface), 1914 Susceptibility, see Linear, nonlinear, tensor electric susceptibility of an atom, 1351 electrical susceptibility, 577, e1223 electrical susceptibility of NH3 , 484 magnetic susceptibility, 1224 tensor, 1224, 1410(ex.) Swapping (entanglement), 2232 Symmetric ket, state, 1428, 1431 observables, 1429, 1441 operators, 1603, 1605, 1608, 1610, 1628, 1631, 1755 Symmetrization of observables, 224 postulate, 1434 Symmetrizer, 1428, 1431 System time evolution of a quantum system, 223 two-level system, 435 Systematic and accidental degeneracies, 203 degeneracy, 845 Temperature (Doppler), 2033 Tensor interaction, 1141 product, 147, 441 product of operators, 149 product state, 295, 311 product, applications, 201 susceptibility tensor, 1224 Term direct and exchange terms, 1613, 1632, 1634, 1646, 1650 pairing, 1878 spectral terms, 1467, 1469 Theorem Bell, 2204, 2208 Bloch, 659 projection, 1070 Ritz, 1170 Wick, 1799, 1804


Wigner-Eckart, 1065, 1085, 1254 Thermal wavelength, 1635 Thermodynamic equilibrium, 308 harmonic oscillator, 647 ideal quantum gas, 1625 spin 1/2, 452 Thermodynamic potential (minimization), 1715 Thomas precession, 1235 Thomas-Reiche-Kuhn sum rule, 1352 Three-dimensional harmonic oscillator, 569, 841, 899(ex.) Three-level system, 1409(ex.) Three-photon transition, 1370 Time evolution of quantum systems, 223 Time-correlations (fluorescent photons), 2145 Time-dependent Gross-Pitaevskii equation, 1657 perturbation theory, 1303 Time-energy uncertainty relation, 250, 279, 345, 1312, 1389 Torsional oscillations, 536 Torus (flow in a), 1667 Total elastic scattering cross section, 972 reflection, 67, 75 scattering cross section (collision), 926 Townes Autler-Townes effect, 1410 Trace of an operator, 163 partial trace of an operator, 309 Transform (Wigner), 2297 Transformation Bogolubov, 1950 Bogolubov-Valatin, 1836, 1919 Gauge, 1960 of observables by permutation, 1434 Transition, see Probability, Forbidden, Electric dipole, Magnetic dipole, Quadrupole electric dipole, 2056 magnetic dipole transition, 1098 probability, 439, 1308, 1321, 1355 probability per unit time, 1319 probability, spin 1/2, 460 three-photon transition, 1370


two-photon, 2097 virtual, 2100 Translation operator, 190, 579, 791 Transpositions, 1431 Transverse fields, 1961 relaxation, 1403 relaxation time, 1406 Trap dipolar, 2151 laser, 2151 Triplet, 1024, 1474 fluorescence triplet, 2144 Tunnel effect, 29, 70, 365, 476, 540, 1177 ionization, 2126 Two coupled harmonic oscillators, 599 Two-dimensional harmonic oscillator, 755 infinite potential well, 201 wave packets, 49 Two-level system, 393, 411, 435, 1357 Two-particle operators, 1608, 1610, 1631, 1756 Two-photon absorption, 1373(ex.) interference, 2170, 2183 transition, 1409(ex.), 2097 Uncertainty relation, 19, 39, 41, 45, 232, 290 time-energy uncertainty relation, 1312 Uniqueness of the measurement result, 2201 Unitary matrix, 125, 176 operator, 173, 314 transformation of operators, 177 Unstable states, 343 Vacuum electromagnetism, 644, 2007 excitations, 1623 fluctuations, 2007 photon vacuum, 2007 quasi-particule vacuum, 1836 state, 1595 Valence band, 1493 1587



Van der Waals forces, 1151 Variables intensive or extensive, 2292 normal variables, 602, 616, 631, 633 Variational method, 1169, 1190, 1228(ex.) Vector model, 1091 model of the atom, 1071, 1256 observable, operator, 732 operator, 1065 potential, 225 potential of a magnetic dipole, 1268 Velocity critical, 1671 generalized velocities, 214, 1530 group velocity, 23, 614 phase velocity, 21, 37 Vibration(s) modes, 599, 611 modes of a continuous system, 631 of molecules, 885, 1137 of nuclei in a crystal, 534, 611, 653 of the nuclei in a molecule, 527 Violations of Bell’s inequalities, 2210, 2265 Virial theorem, 350, 1210 Virtual transition, 2100 Volume effect, 544, 840, 1162, 1268 Von Neumann chain, 2201 equation, 306 ideal measurement, 2196 reduction postulate, 2202 statistical entropy, 2217 Vortex in a superfluid, 1667 Water (molecule), 873, 874 Wave (evanescent), 67 Wave function, 88, 140, 226 BCS pairs, 1901, 1909 Hydrogen, 830 norm, 90 pair wave functions, 1851 particle, 11 Wave packet(s) Gaussian, 57, 2305 in a potential step, 75 in three dimensions, 53 1588

minimal, 290, 520, 591 motion in a harmonic potential, 596 one-photon, 2168 particle, 13 photon, 2163 propagation, 20, 57, 242, 398 reduction, 221, 227, 265, 279 spreading, 57, 59, 347, 348(ex.) two-dimension, 49 two-photons, 2181 Wave(s) de Broglie wavelength, 10, 35 evanescent, 29 free spherical waves, 961 multipolar, 2052 partial waves, 948 plane, 14, 19, 943 wave function, 11, 88, 140, 226 Wave-particle duality, 3, 45 Wavelength Compton wavelength, 1235 de Broglie, 10 Weak coupling (dressed-atom), 2137 Well potential square well, 29 potential well, 367 Weyl operator, 2300 quantization, 2311 Which path type of experiments, 2202 Wick’s theorem, 1799, 1804 Wigner transform, 2297 Wigner-Eckart theorem, 1065, 1085, 1254 Young (double slit experiment), 4 Yukawa potential, 977 Zeeman components, polarizations, 865 effect, 855, 862, 987, 1251, 1253, 1257, 1261, 1281 polarization of the components, 1295 slower, 2025 Zeeman effect Hydrogen, 1289 in muonium, 1281 in positronium, 1281 Muonium, 1284



Zone (Brillouin zone), 614


QUANTUM MECHANICS Volume III Fermions, Bosons, Photons, Correlations, and Entanglement

Claude Cohen-Tannoudji, Bernard Diu, and Franck Laloë Translated from the French by Nicole Ostrowsky and Dan Ostrowsky

Authors

First Edition

Prof. Dr. Claude Cohen-Tannoudji Laboratoire Kastler Brossel (ENS) 24 rue Lhomond 75231 Paris Cedex 05 France

All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Prof. Dr. Bernard Diu 4 rue du Docteur Roux 91440 Bures-sur-Yvette France

Library of Congress Card No.: applied for

Prof. Dr. Franck Laloë Laboratoire Kastler Brossel (ENS) 24 rue Lhomond 75231 Paris Cedex 05 France

British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library.

Cover Image © antishock/Getty Images

Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de. © 2020 WILEY-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law. Print ISBN 978-3-527-34555-7 ePDF ISBN 978-3-527-82274-4 ePub ISBN 978-3-527-82275-1 Cover Design Tata Consulting Services Printing and Binding CPI Ebner & Spiegel Printed on acid-free paper.

Directions for Use

This book is composed of chapters and their complements:
– The chapters contain the fundamental concepts. Except for a few additions and variations, they correspond to a course given in the last year of a typical undergraduate physics program (Volume I) or of a graduate program (Volumes II and III). The 21 chapters are complete in themselves and can be studied independently of the complements.
– The complements follow the corresponding chapter. Each is labelled by a letter followed by a subscript, which gives the number of the chapter (for example, the complements of Chapter V are, in order, AV, BV, CV, etc.). They can be recognized immediately by the symbol that appears at the top of each of their pages.
The complements vary in character. Some are intended to expand the treatment of the corresponding chapter or to provide a more detailed discussion of certain points. Others describe concrete examples or introduce various physical concepts. One of the complements (usually the last one) is a collection of exercises.
The difficulty of the complements varies. Some are very simple examples or extensions of the chapter. Others are more difficult and at the graduate level or close to current research. In any case, the reader should have studied the material in the chapter before using the complements.
The complements are generally independent of one another. The student should not try to study all the complements of a chapter at once. In accordance with his/her aims and interests, he/she should choose a small number of them (two or three, for example), plus a few exercises. The other complements can be left for later study. To help with the choice, the complements are listed at the end of each chapter in a “reader’s guide”, which discusses the difficulty and importance of each.
Some passages within the book have been set in small type, and these can be omitted on a first reading.

Foreword


Quantum mechanics is a branch of physics whose importance has continually increased over the last decades. It is essential for understanding the structure and dynamics of microscopic objects such as atoms, molecules and their interactions with electromagnetic radiation. It is also the basis for understanding the functioning of numerous new systems with countless practical applications. This includes lasers (in communications, medicine, milling, etc.), atomic clocks (essential in particular for the GPS), transistors (communications, computers), magnetic resonance imaging, energy production (solar panels, nuclear reactors), etc. Quantum mechanics also permits understanding surprising physical properties such as superfluidity or supraconductivity. There is currently a great interest in entangled quantum states whose non-intuitive properties of nonlocality and nonseparability permit conceiving remarkable applications in the emerging field of quantum information. Our civilization is increasingly impacted by technological applications based on quantum concepts. This why a particular effort should be made in the teaching of quantum mechanics, which is the object of these three volumes. The first contact with quantum mechanics can be disconcerting. Our work grew out of the authors’ experiences while teaching quantum mechanics for many years. It was conceived with the objective of easing a first approach, and then aiding the reader to progress to a more advance level of quantum mechanics. The first two volumes, first published more than forty years ago, have been used throughout the world. They remain however at an intermediate level. They have now been completed with a third volume treating more advanced subjects. Throughout we have used a progressive approach to problems, where no difficulty goes untreated and each aspect of the diverse questions is discussed in detail (often starting with a classical review). This willingness to go further “without cheating or taking shortcuts” is built into the book structure, using two distinct linked texts: chapters and complements. As we just outlined in the “Directions for use”, the chapters present the general ideas and basic concepts, whereas the complements illustrate both the methods and concepts just exposed. Volume I presents a general introduction of the subject, followed by a second chapter describing the basic mathematical tools used in quantum mechanics. While this chapter can appear long and dense, the teaching experience of the authors has shown that such a presentation is the most efficient. In the third chapter the postulates are announced and illustrated in many of the complements. We then go on to certain important applications of quantum mechanics, such as the harmonic oscillator, which lead to numerous applications (molecular vibrations, phonons, etc.). Many of these are the object of specific complements. Volume II pursues this development, while expanding its scope at a slightly higher level. It treats collision theory, spin, addition of angular momenta, and both timedependent and time-independent perturbation theory. It also presents a first approach to the study of identical particles. In this volume as in the previous one, each theoretical concept is immediately illustrated by diverse applications presented in the complements. Both volumes I and II have benefited from several recent corrections, but there have also been additions. 
Chapter XIII now contains two sections, §§ D and E, that treat random perturbations, and a complement concerning relaxation has been added.

Volume III extends the first two volumes at a slightly higher level. It is based on the use of the creation and annihilation operator formalism (second quantization), which is commonly used in quantum field theory. We start with a study of systems of identical particles, fermions or bosons. The properties of ideal gases in thermal equilibrium are presented. For fermions, the Hartree-Fock method is developed in detail. It is the basis of many studies in chemistry, atomic physics, solid state physics, etc. For bosons, the Gross-Pitaevskii equation and the Bogolubov theory are discussed. An original presentation that treats the pairing effect of both fermions and bosons permits obtaining the BCS (Bardeen-Cooper-Schrieffer) and Bogolubov theories in a unified framework. The second part of Volume III treats quantum electrodynamics: its general introduction, the study of interactions between atoms and photons, and various applications (spontaneous emission, multiphoton transitions, optical pumping, etc.). The dressed atom method is presented and illustrated for concrete cases. A final chapter discusses the notion of quantum entanglement and certain fundamental aspects of quantum mechanics, in particular the Bell inequalities and their violations.

Finally, note that we have not treated either the philosophical implications of quantum mechanics, or the diverse interpretations of this theory, despite the great interest of these subjects. We have in fact limited ourselves to presenting what is commonly called the "orthodox point of view". It is only in Chapter XXI that we touch on certain questions concerning the foundations of quantum mechanics (nonlocality, etc.). We have made this choice because we feel that one can address such questions more efficiently after mastering the manipulation of the quantum mechanical formalism as well as its numerous applications. These subjects are addressed in the book Do we really understand quantum mechanics? (F. Laloë, Cambridge University Press, 2019); see also section 5 of the bibliography of Volumes I and II.

Acknowledgments

Volumes I and II: The teaching experiences out of which this text grew were group efforts, pursued over several years. We wish to thank all the members of the various groups, and particularly Jacques Dupont-Roc and Serge Haroche, for their friendly collaboration, for the fruitful discussions we have had in our weekly meetings, and for the ideas for problems and exercises that they have suggested. Without their enthusiasm and valuable help, we would never have been able to undertake and carry out the writing of this book. Nor can we forget what we owe to the physicists who introduced us to research, Alfred Kastler and Jean Brossel for two of us and Maurice Levy for the third. It was in the context of their laboratories that we discovered the beauty and power of quantum mechanics. Neither have we forgotten the importance to us of the modern physics taught at the C.E.A. by Albert Messiah, Claude Bloch and Anatole Abragam, at a time when graduate studies were not yet incorporated into French university programs. We wish to express our gratitude to Ms. Aucher, Baudrit, Boy, Brodschi, Emo, Heywaerts, Lemirre and Touzeau for the preparation of the manuscript.

Volume III: We are very grateful to Nicole and Daniel Ostrowsky, who, as they translated this volume from French into English, proposed numerous improvements and clarifications. More recently, Carsten Henkel also made many useful suggestions during his translation of the text into German; we are very grateful for the improvements of the text that resulted from this exchange. There are actually many colleagues and friends who greatly contributed, each in his own way, to finalizing this book. All their complementary remarks and suggestions have been very helpful, and we are in particular thankful to Pierre-François Cohadon, Jean Dalibard, Sébastien Gleyzes, Markus Holzmann, Thibaut Jacqmin, Philippe Jacquier, Amaury Mouchet, Jean-Michel Raimond and Félix Werner. Some delicate aspects of LaTeX typography have been resolved thanks to Marco Picco, Pierre Cladé and Jean Hare. Roger Balian, Edouard Brézin and William Mullin have offered useful advice and suggestions. Finally, our sincere thanks go to Geneviève Tastevin, Pierre-François Cohadon and Samuel Deléglise for their help with a number of figures.

Table of contents

VOLUME I

I     WAVES AND PARTICLES. INTRODUCTION TO THE BASIC IDEAS OF QUANTUM MECHANICS   1
      Reader's guide for complements   33
      AI    Order of magnitude of the wavelengths associated with material particles   35
      BI    Constraints imposed by the uncertainty relations   39
      CI    Heisenberg relation and atomic parameters   41
      DI    An experiment illustrating the Heisenberg relations   45
      EI    A simple treatment of a two-dimensional wave packet   49
      FI    The relationship between one- and three-dimensional problems   53
      GI    One-dimensional Gaussian wave packet: spreading of the wave packet   57
      HI    Stationary states of a particle in one-dimensional square potentials   63
      JI    Behavior of a wave packet at a potential step   75
      KI    Exercises   83

II    THE MATHEMATICAL TOOLS OF QUANTUM MECHANICS   87
      Reader's guide for complements   159
      AII   The Schwarz inequality   161
      BII   Review of some useful properties of linear operators   163
      CII   Unitary operators   173
      DII   A more detailed study of the r and p representations   181
      EII   Some general properties of two observables, Q and P, whose commutator is equal to iħ   187
      FII   The parity operator   193
      GII   An application of the properties of the tensor product: the two-dimensional infinite well   201
      HII   Exercises   205

III   THE POSTULATES OF QUANTUM MECHANICS   213
      Reader's guide for complements   267
      AIII  Particle in an infinite one-dimensional potential well   271
      BIII  Study of the probability current in some special cases   283
      CIII  Root mean square deviations of two conjugate observables   289
      DIII  Measurements bearing on only one part of a physical system   293
      EIII  The density operator   299
      FIII  The evolution operator   313
      GIII  The Schrödinger and Heisenberg pictures   317
      HIII  Gauge invariance   321
      JIII  Propagator for the Schrödinger equation   335
      KIII  Unstable states. Lifetime   343
      LIII  Exercises   347
      MIII  Bound states in a "potential well" of arbitrary shape   359
      NIII  Unbound states of a particle in the presence of a potential well or barrier   367
      OIII  Quantum properties of a particle in a one-dimensional periodic structure   375

IV    APPLICATIONS OF THE POSTULATES TO SIMPLE CASES: SPIN 1/2 AND TWO-LEVEL SYSTEMS   393
      Reader's guide for complements   423
      AIV   The Pauli matrices   425
      BIV   Diagonalization of a 2 × 2 Hermitian matrix   429
      CIV   Fictitious spin 1/2 associated with a two-level system   435
      DIV   System of two spin 1/2 particles   441
      EIV   Spin 1/2 density matrix   449
      FIV   Spin 1/2 particle in static and rotating magnetic fields: magnetic resonance   455
      GIV   A simple model of the ammonia molecule   469
      HIV   Effects of a coupling between a stable state and an unstable state   485
      JIV   Exercises   491

V     THE ONE-DIMENSIONAL HARMONIC OSCILLATOR   497
      Reader's guide for complements   525
      AV    Some examples of harmonic oscillators   527
      BV    Study of the stationary states in the x representation. Hermite polynomials   547
      CV    Solving the eigenvalue equation of the harmonic oscillator by the polynomial method   555
      DV    Study of the stationary states in the momentum representation   563
      EV    The isotropic three-dimensional harmonic oscillator   569
      FV    A charged harmonic oscillator in a uniform electric field   575
      GV    Coherent "quasi-classical" states of the harmonic oscillator   583
      HV    Normal vibrational modes of two coupled harmonic oscillators   599
      JV    Vibrational modes of an infinite linear chain of coupled harmonic oscillators; phonons   611
      KV    Vibrational modes of a continuous physical system. Photons   631
      LV    One-dimensional harmonic oscillator in thermodynamic equilibrium at a temperature T   647
      MV    Exercises   661

VI    GENERAL PROPERTIES OF ANGULAR MOMENTUM IN QUANTUM MECHANICS   667
      Reader's guide for complements   703
      AVI   Spherical harmonics   705
      BVI   Angular momentum and rotations   717
      CVI   Rotation of diatomic molecules   739
      DVI   Angular momentum of stationary states of a two-dimensional harmonic oscillator   755
      EVI   A charged particle in a magnetic field: Landau levels   771
      FVI   Exercises   795

VII   PARTICLE IN A CENTRAL POTENTIAL, HYDROGEN ATOM   803
      Reader's guide for complements   831
      AVII  Hydrogen-like systems   833
      BVII  A soluble example of a central potential: the isotropic three-dimensional harmonic oscillator   841
      CVII  Probability currents associated with the stationary states of the hydrogen atom   851
      DVII  The hydrogen atom placed in a uniform magnetic field. Paramagnetism and diamagnetism. The Zeeman effect   855
      EVII  Some atomic orbitals. Hybrid orbitals   869
      FVII  Vibrational-rotational levels of diatomic molecules   885
      GVII  Exercises   899

INDEX   901
VOLUME II

VIII  AN ELEMENTARY APPROACH TO THE QUANTUM THEORY OF SCATTERING BY A POTENTIAL   923
      Reader's guide for complements   957
      AVIII The free particle: stationary states with well-defined angular momentum   959
      BVIII Phenomenological description of collisions with absorption   971
      CVIII Some simple applications of scattering theory   977

IX    ELECTRON SPIN   985
      Reader's guide for complements   999
      AIX   Rotation operators for a spin 1/2 particle   1001
      BIX   Exercises   1009

X     ADDITION OF ANGULAR MOMENTA   1015
      Reader's guide for complements   1041
      AX    Examples of addition of angular momenta   1043
      BX    Clebsch-Gordan coefficients   1051
      CX    Addition of spherical harmonics   1059
      DX    Vector operators: the Wigner-Eckart theorem   1065
      EX    Electric multipole moments   1077
      FX    Two angular momenta J1 and J2 coupled by an interaction a J1 · J2   1091
      GX    Exercises   1107

XI    STATIONARY PERTURBATION THEORY   1115
      Reader's guide for complements   1129
      AXI   A one-dimensional harmonic oscillator subjected to a perturbing potential in x, x², x³   1131
      BXI   Interaction between the magnetic dipoles of two spin 1/2 particles   1141
      CXI   Van der Waals forces   1151
      DXI   The volume effect: the influence of the spatial extension of the nucleus on the atomic levels   1162
      EXI   The variational method   1169
      FXI   Energy bands of electrons in solids: a simple model   1177
      GXI   A simple example of the chemical bond: the H2+ ion   1189
      HXI   Exercises   1221

XII   AN APPLICATION OF PERTURBATION THEORY: THE FINE AND HYPERFINE STRUCTURE OF THE HYDROGEN ATOM   1231
      Reader's guide for complements   1265
      AXII  The magnetic hyperfine Hamiltonian   1267
      BXII  Calculation of the average values of the fine-structure Hamiltonian in the 1s, 2s and 2p states   1276
      CXII  The hyperfine structure and the Zeeman effect for muonium and positronium   1281
      DXII  The influence of the electronic spin on the Zeeman effect of the hydrogen resonance line   1289
      EXII  The Stark effect for the hydrogen atom   1298

XIII  APPROXIMATION METHODS FOR TIME-DEPENDENT PROBLEMS   1303
      Reader's guide for complements   1337
      AXIII Interaction of an atom with an electromagnetic wave   1339
      BXIII Linear and non-linear responses of a two-level system subject to a sinusoidal perturbation   1357
      CXIII Oscillations of a system between two discrete states under the effect of a sinusoidal resonant perturbation   1374
      DXIII Decay of a discrete state resonantly coupled to a continuum of final states   1378
      EXIII Time-dependent random perturbation, relaxation   1390
      FXIII Exercises   1409

XIV   SYSTEMS OF IDENTICAL PARTICLES   1419
      Reader's guide for complements   1457
      AXIV  Many-electron atoms. Electronic configurations   1459
      BXIV  Energy levels of the helium atom. Configurations, terms, multiplets   1467
      CXIV  Physical properties of an electron gas. Application to solids   1481
      DXIV  Exercises   1496

APPENDICES   1505
      I     Fourier series and Fourier transforms   1505
      II    The Dirac δ-"function"   1515
      III   Lagrangian and Hamiltonian in classical mechanics   1527

BIBLIOGRAPHY OF VOLUMES I AND II   1545
INDEX   1569
VOLUME III

XV    CREATION AND ANNIHILATION OPERATORS FOR IDENTICAL PARTICLES   1591
      A  General formalism   1592
      B  One-particle symmetric operators   1603
      C  Two-particle operators   1608
      Reader's guide for complements   1617
      AXV   Particles and holes   1621
            1. Ground state of a non-interacting fermion gas (1621); 2. New definition for the creation and annihilation operators (1622); 3. Vacuum excitations (1623)
      BXV   Ideal gas in thermal equilibrium; quantum distribution functions   1625
            1. Grand canonical description of a system without interactions (1626); 2. Average values of symmetric one-particle operators (1628); 3. Two-particle operators (1631); 4. Total number of particles (1635); 5. Equation of state, pressure (1640)
      CXV   Condensed boson system, Gross-Pitaevskii equation   1643
            1. Notation, variational ket (1643); 2. First approach (1645); 3. Generalization, Dirac notation (1648); 4. Physical discussion (1651)
      DXV   Time-dependent Gross-Pitaevskii equation   1657
            1. Time evolution (1657); 2. Hydrodynamic analogy (1664); 3. Metastable currents, superfluidity (1667)
      EXV   Fermion system, Hartree-Fock approximation   1677
            1. Foundation of the method (1678); 2. Generalization: operator method (1688)
      FXV   Fermions, time-dependent Hartree-Fock approximation   1701
            1. Variational ket and notation (1701); 2. Variational method (1702); 3. Computing the optimizer (1705); 4. Equations of motion (1707)
      GXV   Fermions or Bosons: Mean field thermal equilibrium   1711
            1. Variational principle (1712); 2. Approximation for the equilibrium density operator (1716); 3. Temperature dependent mean field equations (1725)
      HXV   Applications of the mean field method for non-zero temperature   1733
            1. Hartree-Fock for non-zero temperature, a brief review (1733); 2. Homogeneous system (1734); 3. Spontaneous magnetism of repulsive fermions (1737); 4. Bosons: equation of state, attractive instability (1745)

XVI   FIELD OPERATOR   1751
      A  Definition of the field operator   1752
      B  Symmetric operators   1755
      C  Time evolution of the field operator (Heisenberg picture)   1763
      D  Relation to field quantization   1765
      Reader's guide for complements   1767
      AXVI  Spatial correlations in an ideal gas of bosons or fermions   1769
            1. System in a Fock state (1769); 2. Fermions in the ground state (1771); 3. Bosons in a Fock state (1775)
      BXVI  Spatio-temporal correlation functions, Green's functions   1781
            1. Green's functions in ordinary space (1781); 2. Fourier transforms (1790); 3. Spectral function, sum rule (1795)
      CXVI  Wick's theorem   1799
            1. Demonstration of the theorem (1799); 2. Applications: correlation functions for an ideal gas (1804)

XVII  PAIRED STATES OF IDENTICAL PARTICLES   1811
      A  Creation and annihilation operators of a pair of particles   1813
      B  Building paired states   1818
      C  Properties of the kets characterizing the paired states   1822
      D  Correlations between particles, pair wave function   1830
      E  Paired states as a quasi-particle vacuum; Bogolubov-Valatin transformations   1836
      Reader's guide for complements   1843
      AXVII Pair field operator for identical particles   1845
            1. Pair creation and annihilation operators (1846); 2. Average values in a paired state (1851); 3. Commutation relations of field operators (1861)
      BXVII Average energy in a paired state   1869
            1. Using states that are not eigenstates of the total particle number (1869); 2. Hamiltonian (1871); 3. Spin 1/2 fermions in a singlet state (1874); 4. Spinless bosons (1881)
      CXVII Fermion pairing, BCS theory   1889
            1. Optimization of the energy (1890); 2. Distribution functions, correlations (1899); 3. Physical discussion (1914); 4. Excited states (1919)
      DXVII Cooper pairs   1927
            1. Cooper model (1927); 2. State vector and Hamiltonian (1927); 3. Solution of the eigenvalue equation (1929); 4. Calculation of the binding energy for a simple case (1929)
      EXVII Condensed repulsive bosons   1933
            1. Variational state, energy (1935); 2. Optimization (1937); 3. Properties of the ground state (1940); 4. Bogolubov operator method (1950)

XVIII REVIEW OF CLASSICAL ELECTRODYNAMICS   1957
      A  Classical electrodynamics   1959
      B  Describing the transverse field as an ensemble of harmonic oscillators   1968
      Reader's guide for complements   1977
      AXVIII Lagrangian formulation of electrodynamics   1979
            1. Lagrangian with several types of variables (1980); 2. Application to the free radiation field (1986); 3. Lagrangian of the global system field + interacting particles (1992)

XIX   QUANTIZATION OF ELECTROMAGNETIC RADIATION   1997
      A  Quantization of the radiation in the Coulomb gauge   1999
      B  Photons, elementary excitations of the free quantum field   2004
      C  Description of the interactions   2009
      Reader's guide for complements   2017
      AXIX  Momentum exchange between atoms and photons   2019
            1. Recoil of a free atom absorbing or emitting a photon (2020); 2. Applications of the radiation pressure force: slowing and cooling atoms (2025); 3. Blocking recoil through spatial confinement (2036); 4. Recoil suppression in certain multi-photon processes (2040)
      BXIX  Angular momentum of radiation   2043
            1. Quantum average value of angular momentum for a spin 1 particle (2044); 2. Angular momentum of free classical radiation as a function of normal variables (2047); 3. Discussion (2050)
      CXIX  Angular momentum exchange between atoms and photons   2055
            1. Transferring spin angular momentum to internal atomic variables (2056); 2. Optical methods (2058); 3. Transferring orbital angular momentum to external atomic variables (2065)

XX    ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS   2067
      A  A basic tool: the evolution operator   2068
      B  Photon absorption between two discrete atomic levels   2073
      C  Stimulated and spontaneous emissions   2080
      D  Role of correlation functions in one-photon processes   2084
      E  Photon scattering by an atom   2085
      Reader's guide for complements   2095
      AXX   A multiphoton process: two-photon absorption   2097
            1. Monochromatic radiation (2097); 2. Non-monochromatic radiation (2101); 3. Discussion (2105)
      BXX   Photoionization   2109
            1. Brief review of the photoelectric effect (2110); 2. Computation of photoionization rates (2112); 3. Is a quantum treatment of radiation necessary to describe photoionization? (2118); 4. Two-photon photoionization (2123); 5. Tunnel ionization by intense laser fields (2126)
      CXX   Two-level atom in a monochromatic field. Dressed-atom method   2129
            1. Brief description of the dressed-atom method (2130); 2. Weak coupling domain (2137); 3. Strong coupling domain (2141); 4. Modifications of the field. Dispersion and absorption (2147)
      DXX   Light shifts: a tool for manipulating atoms and fields   2151
            1. Dipole forces and laser trapping (2151); 2. Mirrors for atoms (2153); 3. Optical lattices (2153); 4. Sub-Doppler cooling. Sisyphus effect (2155); 5. Non-destructive detection of a photon (2159)
      EXX   Detection of one- or two-photon wave packets, interference   2163
            1. One-photon wave packet, photodetection probability (2165); 2. One- or two-photon interference signals (2167); 3. Absorption amplitude of a photon by an atom (2174); 4. Scattering of a wave packet (2176); 5. Example of wave packets with two entangled photons (2181)

XXI   QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL'S INEQUALITIES   2187
      A  Introducing entanglement, goals of this chapter   2188
      B  Entangled states of two spin-1/2 systems   2190
      C  Entanglement between more general systems   2193
      D  Ideal measurement and entangled states   2196
      E  "Which path" experiment: can one determine the path followed by the photon in Young's double slit experiment?   2202
      F  Entanglement, non-locality, Bell's theorem   2204
      Reader's guide for complements   2215
      AXXI  Density operator and correlations; separability   2217
            1. Von Neumann statistical entropy (2217); 2. Differences between classical and quantum correlations (2221); 3. Separability (2223)
      BXXI  GHZ states, entanglement swapping   2227
            1. Sign contradiction in a GHZ state (2227); 2. Entanglement swapping (2232)
      CXXI  Measurement induced relative phase between two condensates   2237
            1. Probabilities of single, double, etc. position measurements (2239); 2. Measurement induced enhancement of entanglement (2242); 3. Detection of a large number of particles (2245)
      DXXI  Emergence of a relative phase with spin condensates; macroscopic non-locality and the EPR argument   2253
            1. Two condensates with spins (2254); 2. Probabilities of the different measurement results (2255); 3. Discussion (2259)

APPENDICES   2267
      IV    Feynman path integral   2267
            1. Quantum propagator of a particle (2267); 2. Interpretation in terms of classical histories (2272); 3. Discussion; a new quantization rule (2274); 4. Operators (2276)
      V     Lagrange multipliers   2281
            1. Function of two variables (2281); 2. Function of N variables (2283)
      VI    Brief review of Quantum Statistical Mechanics   2285
            1. Statistical ensembles (2285); 2. Intensive or extensive physical quantities (2292)
      VII   Wigner transform   2297
            1. Delta function of an operator (2299); 2. Wigner distribution of the density operator (spinless particle) (2299); 3. Wigner transform of an operator (2310); 4. Generalizations (2318); 5. Discussion: Wigner distribution and quantum effects (2319)

BIBLIOGRAPHY OF VOLUME III   2325
INDEX   2333
Chapter XV

Creation and annihilation operators for identical particles

A. General formalism   1592
   A-1  Fock states and Fock space   1593
   A-2  Creation operators   1596
   A-3  Annihilation operators   1597
   A-4  Occupation number operators (bosons and fermions)   1598
   A-5  Commutation and anticommutation relations   1599
   A-6  Change of basis   1601
B. One-particle symmetric operators   1603
   B-1  Definition   1603
   B-2  Expression in terms of the operators a and a†   1604
   B-3  Examples   1606
   B-4  Single particle density operator   1607
C. Two-particle operators   1608
   C-1  Definition   1608
   C-2  A simple case: factorization   1609
   C-3  General case   1610
   C-4  Two-particle reduced density operator   1610
   C-5  Physical discussion; consequences of the exchange   1611

Introduction

For a system composed of identical particles, the particle numbering used in Chapter XIV, the last chapter of Volume II [2], does not really have much physical significance. Furthermore, when the particle number gets larger than a few units, applying the symmetrization postulate to numbered particles often leads to complex calculations. For example, computing the average value of a symmetric operator requires the symmetrization of the bra, the ket, and finally the operator, which introduces a large number of terms¹. They seem different, a priori, but at the end of the computation many are found to be equal, or sometimes cancel each other. Fortunately, these lengthy calculations may be avoided using an equivalent method based on creation and annihilation operators in a "Fock space". The simple commutation (or anticommutation) rules satisfied by these operators are the expression of the symmetrization (or antisymmetrization) postulate. The non-physical particle numbering is replaced by assigning "occupation numbers" to individual states, which is more natural for treating identical particles.

The method described in this chapter and the following is sometimes called "second quantization"². It deals with operators that no longer conserve the particle number, hence acting in a state space larger than those we have previously considered; this new space is called the "Fock space" (§ A). These operators, which change the particle number, appear mainly in the course of calculations, and often regroup at the end, keeping the total particle number constant. Examples will be given (§ B) for one-particle symmetric operators, such as the total linear momentum or angular momentum of a system of identical particles. We shall then study two-particle symmetric operators (§ C), such as the energy of a system of interacting identical particles, their spatial correlation function, etc. In quantum statistical mechanics, the Fock space is well adapted to computations performed in the "grand canonical" ensemble, where the total number of particles may fluctuate since the system is in contact with an external reservoir. Furthermore, as we shall see in the following chapters, the Fock space is very useful for describing physical processes where the particle number changes, as in photon absorption or emission.

A. General formalism

We denote by $\mathscr{E}_N$ the state space of a system of $N$ distinguishable particles, which is the tensor product of $N$ individual state spaces $\mathscr{E}_1$:
$$\mathscr{E}_N = \mathscr{E}_1(1) \otimes \mathscr{E}_1(2) \otimes \cdots \otimes \mathscr{E}_1(N) \quad \text{(A-1)}$$
Two subspaces of $\mathscr{E}_N$ are particularly important for identical particles, as they contain all their accessible physical states: the space $\mathscr{E}_S(N)$ of the completely symmetric states for bosons, and the space $\mathscr{E}_A(N)$ of the completely antisymmetric states for fermions. The projectors onto these two subspaces are given by relations (B-49) and (B-50) of Chapter XIV:
$$S_N = \frac{1}{N!} \sum_{\alpha} P_{\alpha} \quad \text{(A-2)}$$
and:
$$A_N = \frac{1}{N!} \sum_{\alpha} \varepsilon_{\alpha}\, P_{\alpha} \quad \text{(A-3)}$$

¹ For a one-particle symmetric operator, which includes a sum of $N$ terms, both the ket and the bra contain $N!$ terms. The matrix element will therefore involve $(N!)^2$ terms, a very large number once $N$ exceeds a few units.
² A commonly accepted but somewhat illogical expression, since no new quantization comes in addition to that of the usual postulates of quantum mechanics; its essential ingredient is the symmetrization of identical particles.

where the $P_\alpha$ are the $N!$ permutation operators for the $N$ particles, and $\varepsilon_\alpha$ the parity of $P_\alpha$ (in this chapter we have added, for clarity, the index $N$ to the projectors $S$ and $A$ defined in Chapter XIV).

A-1. Fock states and Fock space

Starting from an arbitrary orthonormal basis $\{|u_i\rangle\}$ of the state space for one particle, we constructed in § C-3-d of Chapter XIV a basis of the state space for $N$ identical particles. Its vectors are characterized by the occupation numbers $n_i$, with:
$$n_1 + n_2 + \cdots + n_i + \cdots = N \quad \text{(A-4)}$$

where $n_1$ is the occupation number of the first basis vector $|u_1\rangle$ (i.e. the number of particles in $|u_1\rangle$), $n_2$ that of $|u_2\rangle$, ..., $n_i$ that of $|u_i\rangle$. In this series of numbers, some (even many) may be zero: a given state has no particular reason to always be occupied. It is therefore often easier to specify only the non-zero occupation numbers, which will be noted $n_i$, $n_j$, $n_k$, ... This series indicates that the first basis state having at least one particle is $|u_i\rangle$ and that it contains $n_i$ particles; the second occupied state is $|u_j\rangle$ with a population $n_j$, etc. As in (A-4), these occupation numbers add up to $N$.

Comment: In this chapter we constantly use subscripts of different types, which should not be confused. The subscripts $i$, $j$, $k$, $l$, ... denote different basis vectors of the state space $\mathscr{E}_1$ of a single particle; they span values given by the dimension of this state space, which often goes to infinity. They should not be confused with the subscripts used to number the particles, which can take $N$ different values and are labeled $q$, $q'$, etc. Finally, the subscript $\alpha$ distinguishes the different permutations of the $N$ particles, and can therefore take $N!$ different values.

A-1-a. Fock states for identical bosons

For bosons, the basis vectors can be written as in (C-15) of Chapter XIV:
$$|n_i, n_j, \ldots\rangle = c\; S_N\, |1: u_i;\ 2: u_i;\ \ldots;\ n_i: u_i;\ n_i + 1: u_j;\ \ldots;\ n_i + n_j: u_j;\ \ldots\rangle \quad \text{(A-5)}$$

where $c$ is a normalization constant; on the right-hand side, $n_i$ particles occupy the state $|u_i\rangle$, $n_j$ the state $|u_j\rangle$, etc. (because of the symmetrization, their order does not matter).

Let us calculate the norm of the right-hand term. It is composed of $N!$ terms, coming from each of the $N!$ permutations included in $S_N$, but only some of them are orthogonal to each other: all the permutations leading to redistributions of the first $n_i$ particles among themselves, of the next $n_j$ particles among themselves, etc., yield the same initial ket. On the other hand, if a permutation changes the individual state of one (or more than one) particle, it yields a different ket, actually orthogonal to the initial ket. This means that the different permutations contained in $S_N$ can be grouped into families of $n_i!\, n_j!\, \cdots$ equivalent permutations, all yielding the same ket; taking into account the factor $1/N!$ appearing in the definition of $S_N$, the coefficient in front of this ket becomes $n_i!\, n_j!\, \cdots / N!$ and its contribution to the norm of the ket is equal to the square of this number. On the other hand, the number of orthogonal kets is $N!/(n_i!\, n_j!\, \cdots)$. Consequently, if $c$ was equal to 1 in formula (A-5), the ket thus defined would have a norm equal to:
$$\frac{N!}{n_i!\, n_j!\, \cdots}\left[\frac{n_i!\, n_j!\, \cdots}{N!}\right]^2 = \frac{n_i!\, n_j!\, \cdots}{N!} \quad \text{(A-6)}$$
We shall therefore choose for $c$ the inverse of the square root of that number, leading to the normalized ket:
$$|n_i, n_j, \ldots\rangle = \sqrt{\frac{N!}{n_i!\, n_j!\, \cdots}}\; S_N\, |1: u_i;\ 2: u_i;\ \ldots;\ n_i: u_i;\ n_i + 1: u_j;\ \ldots;\ n_i + n_j: u_j;\ \ldots\rangle \quad \text{(A-7)}$$
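For instance (an added numerical illustration, not part of the original text): for $N = 3$ particles with $n_i = 2$ and $n_j = 1$, only three of the $3! = 6$ permuted kets are distinct, so that $S_3\,|1: u_i;\ 2: u_i;\ 3: u_j\rangle$ has norm $1/\sqrt{3}$; the factor $\sqrt{3!/(2!\,1!)} = \sqrt{3}$ in (A-7) restores the normalization.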

These states are called "Fock states", for which the occupation numbers are well defined.

For the Fock states, it is sometimes handy to use a slightly different but equivalent notation. In (A-7), these states are defined by specifying the occupation numbers of all the states that are actually occupied ($n \geq 1$). Another option would be to indicate all the occupation numbers, including those which are zero³; this is what we have explicitly done in (A-4). We then write the same kets as:
$$|n_1, n_2, \ldots, n_i, \ldots\rangle \quad \text{(A-8)}$$
Another possibility is to specify a list of $N$ occupied states, where $u_i$ is repeated $n_i$ times, $u_j$ repeated $n_j$ times, etc.:
$$|\underbrace{u_i, u_i, \ldots, u_i}_{n_i\ \text{times}},\ \underbrace{u_j, u_j, \ldots, u_j}_{n_j\ \text{times}},\ \ldots\rangle \quad \text{(A-9)}$$

As we shall see later, this latter notation is sometimes useful in computations involving both bosons and fermions.

A-1-b. Fock states for identical fermions

In the case of fermions, the operator $A_N$ acting on a ket where two (or more) numbered particles are in the same individual state yields a zero result: there are no such states in the physical space $\mathscr{E}_A(N)$. Hence we concentrate on the case where all the occupation numbers are either 1 or 0. We denote $|u_i\rangle$, $|u_j\rangle$, ..., $|u_l\rangle$, ... all the states having an occupation number equal to 1. The equivalent for fermions of formula (A-7) is written:
$$|n_i, n_j, \ldots, n_l, \ldots\rangle = \begin{cases} \sqrt{N!}\; A_N\, |1: u_i;\ 2: u_j;\ \ldots;\ N: u_l\rangle & \text{if all the } u_i, u_j, \ldots, u_l \text{ are different} \\ 0 & \text{if two of them are identical} \end{cases} \quad \text{(A-10)}$$

³ Remember that, by convention, $0! = 1$.

Taking into account the $1/N!$ factor appearing in definition (A-3) of $A_N$, the right-hand side of this equation is a linear superposition, with coefficients $\pm 1/\sqrt{N!}$, of $N!$ kets that are all orthogonal to each other (as we have chosen an orthonormal basis for the individual states $\{|u_i\rangle\}$); hence its norm is equal to 1. Consequently, Fock states for fermions are defined by (A-10). Contrary to bosons, the main concern is no longer how many particles occupy a state, but whether a state is occupied or not. Another difference with the boson case is that, for fermions, the order of the states matters. If for instance the first two states $u_i$ and $u_j$ are exchanged, we get the opposite ket:
$$|n_j, n_i, \ldots, n_l, \ldots\rangle = -\,|n_i, n_j, \ldots, n_l, \ldots\rangle \quad \text{(A-11)}$$
but this obviously does not change the physical meaning of the ket.

A-1-c. Fock space

The Fock states are the building blocks used to construct this whole chapter. We have until now considered separately the spaces $\mathscr{E}_S(N)$ or $\mathscr{E}_A(N)$ associated with different values of the particle number $N$. We shall now regroup them into a single space, called the "Fock space", using the direct sum⁴ formalism. For bosons:
$$\mathscr{E}_{\text{Fock}} = \mathscr{E}_S(0) \oplus \mathscr{E}_S(1) \oplus \mathscr{E}_S(2) \oplus \cdots \oplus \mathscr{E}_S(N) \oplus \cdots \quad \text{(A-12)}$$
and, for fermions:
$$\mathscr{E}_{\text{Fock}} = \mathscr{E}_A(0) \oplus \mathscr{E}_A(1) \oplus \mathscr{E}_A(2) \oplus \cdots \oplus \mathscr{E}_A(N) \oplus \cdots \quad \text{(A-13)}$$
(the sums go to infinity). In both cases, we have included on the right-hand side a first term associated with a total number of particles equal to zero. The corresponding space, $\mathscr{E}(0)$, is defined as a one-dimensional space, containing a single state called "vacuum" and denoted $|0\rangle$ or $|\text{vac}\rangle$. For bosons as well as fermions, an orthonormal basis for the Fock space can be built with the Fock states $|n_1, n_2, \ldots, n_i, \ldots\rangle$, relaxing the constraint (A-4): the occupation numbers may then take on any (integer) values, including zeros for all, which corresponds to the vacuum ket $|0\rangle$. Linear combinations of all these basis vectors yield all the vectors of the Fock space, including linear superpositions of kets containing different particle numbers. It is not essential to attribute a physical interpretation to such superpositions, since they can be considered as intermediate states of the calculation. Obviously, the Fock space contains many kets with well defined particle numbers: all those belonging to a single subspace $\mathscr{E}_S(N)$ for bosons, or $\mathscr{E}_A(N)$ for fermions. Two kets having different particle numbers are necessarily orthogonal; for example, all the kets having a non-zero total population are orthogonal to the vacuum state.

⁴ The direct sum of two spaces (of respective dimensions $P$ and $Q$) is a space of dimension $P + Q$, spanned by all the linear combinations of a vector from the first space with a vector from the second. A basis for the direct sum may be simply obtained by grouping together a basis for the first space and one for the second. For example, vectors of a two-dimensional plane belong to a space that is the direct sum of the one-dimensional spaces for the vectors of two axes of that plane.

Comments:
(i) Contrary to the distinguishable particle case, the Fock space is not the tensor product of the spaces of states associated with particles numbered 1, 2, ..., $q$, etc. First of all, for a fixed $N$, it only includes the totally symmetric (or antisymmetric) subspace of this tensor product; furthermore, the Fock space is the direct sum of such subspaces associated with each value of the particle number $N$. The Fock space is, however, the tensor product of Fock spaces $\mathscr{E}^{i}_{\text{Fock}}$ associated with the individual orthogonal states $|u_i\rangle$, each $\mathscr{E}^{i}_{\text{Fock}}$ being spanned by the kets $|n_i\rangle$ where $n_i$ takes on all integer values (from zero to infinity for bosons, from zero to one for fermions):
$$\mathscr{E}_{\text{Fock}} = \mathscr{E}^{1}_{\text{Fock}} \otimes \mathscr{E}^{2}_{\text{Fock}} \otimes \cdots \otimes \mathscr{E}^{i}_{\text{Fock}} \otimes \cdots \quad \text{(A-14)}$$
This is because the Fock states, which are a basis for $\mathscr{E}_{\text{Fock}}$, may be written as the tensor product:
$$|n_1, n_2, \ldots, n_i, \ldots\rangle = |n_1\rangle \otimes |n_2\rangle \otimes \cdots \otimes |n_i\rangle \otimes \cdots \quad \text{(A-15)}$$

It is often said that each individual state defines a "mode" of the system of identical particles. Decomposing the Fock state into a tensor product allows considering the modes as describing different and distinguishable variables. This will be useful on numerous occasions (see for example Complements BXV, DXV and EXV).
(ii) One should not confuse a Fock state with an arbitrary state of the Fock space. The occupation numbers of the individual states are all well defined in a Fock state (also called "number state"), whereas an arbitrary state of the Fock space is a linear superposition of these eigenstates, with several non-zero coefficients.

A-2. Creation operators

Choosing a basis of individual states $\{|u_i\rangle\}$, we now define the action in the Fock space of the creation operator⁵ $a_i^\dagger$ of a particle in the state $|u_i\rangle$.

A-2-a. Bosons

For bosons, we introduce the linear operator $a_i^\dagger$ defined by:
$$a_i^\dagger\, |n_1, n_2, \ldots, n_i, \ldots\rangle = \sqrt{n_i + 1}\; |n_1, n_2, \ldots, n_i + 1, \ldots\rangle \quad \text{(A-16)}$$
As all the states of the Fock space may be obtained by linear superposition of the $|n_1, n_2, \ldots, n_i, \ldots\rangle$, the action of $a_i^\dagger$ is defined in the entire space. It adds a particle to the system, which goes from a state of $\mathscr{E}_S(N)$ to a state of $\mathscr{E}_S(N+1)$, and in particular from the vacuum to a state having one single occupied state. Creation operators acting on the vacuum allow building any Fock state: repeated application of (A-16) leads to:
$$|n_1, n_2, \ldots, n_i, \ldots\rangle = \frac{1}{\sqrt{n_1!\, n_2!\, \cdots\, n_i!\, \cdots}}\; \left(a_1^\dagger\right)^{n_1} \left(a_2^\dagger\right)^{n_2} \cdots \left(a_i^\dagger\right)^{n_i} \cdots\; |0\rangle \quad \text{(A-17)}$$

Comment: Why was the factor $\sqrt{n_i + 1}$ introduced in (A-16)? We shall see later (§ B) that, together with the factors of (A-7), it simplifies the computations.
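As a small added check of (A-16) and (A-17) (an illustration, not part of the original text): applying $a_i^\dagger$ twice to the vacuum gives $a_i^\dagger\,|0\rangle = |n_i = 1\rangle$ and then $a_i^\dagger\,|n_i = 1\rangle = \sqrt{2}\,|n_i = 2\rangle$, so that $(a_i^\dagger)^2|0\rangle = \sqrt{2!}\,|n_i = 2\rangle$, in agreement with the $1/\sqrt{n_i!}$ factor of (A-17).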

⁵ A similar notation was used for the harmonic oscillator.

A-2-b. Fermions

For fermions, we define the operator $a_i^\dagger$ by:
$$a_i^\dagger\, |n_j, \ldots, n_l, \ldots\rangle = |n_i = 1, n_j, \ldots, n_l, \ldots\rangle \quad \text{(A-18)}$$
where the newly created state $u_i$ appears first in the list of states in the ket on the right-hand side. If we start from a ket where the individual state $u_i$ is already occupied ($n_i = 1$), the action of $a_i^\dagger$ leads to zero, as in this case (A-10) gives:
$$a_i^\dagger\, |n_i = 1, n_j, \ldots, n_l, \ldots\rangle = 0 \quad \text{(A-19)}$$
Formulas (A-16) and (A-17) are also valid for fermions, with all the occupation numbers equal to 0 or 1 (or else both members are zero).

Comment: Definition (A-18) must not depend on the specific order of the individual states in the ket on which the operator acts. It can easily be verified that any permutation of the states simply multiplies both members of the equality by its parity. It therefore remains valid independently of the order chosen for the individual states in the initial ket.

A-3. Annihilation operators

We now study the operator $a_i$, the Hermitian conjugate of $a_i^\dagger$ (we can write it simply $a_i$, since taking twice in a row the Hermitian conjugate of an operator brings you back to the initial operator).

A-3-a. Bosons

For bosons, we deduce from (A-16) that the only non-zero matrix elements of $a_i^\dagger$ in the orthonormal basis of Fock states are:
$$\langle n_1, n_2, \ldots, n_i + 1, \ldots |\, a_i^\dagger\, | n_1, n_2, \ldots, n_i, \ldots\rangle = \sqrt{n_i + 1} \quad \text{(A-20)}$$
They link two vectors having equal occupation numbers except for $n_i$, which increases by one going from the ket to the bra. The matrix elements of the Hermitian conjugate $a_i$ of $a_i^\dagger$ are obtained from relation (A-20), using the general definition (B-49) of Chapter II. The only non-zero matrix elements of $a_i$ are thus:
$$\langle n_1, n_2, \ldots, n_i, \ldots |\, a_i\, | n_1, n_2, \ldots, n_i + 1, \ldots\rangle = \sqrt{n_i + 1} \quad \text{(A-21)}$$

Since the basis we use is complete, we can deduce the action of the operator $a_i$ on kets having given occupation numbers:
$$a_i\, |n_1, n_2, \ldots, n_i, \ldots\rangle = \sqrt{n_i}\; |n_1, n_2, \ldots, n_i - 1, \ldots\rangle \quad \text{(A-22)}$$
(note that we have replaced $n_i + 1$ by $n_i$). As opposed to $a_i^\dagger$, which adds a particle in the state $|u_i\rangle$, the operator $a_i$ takes one away; it yields zero when applied to a ket where the state $|u_i\rangle$ is empty to begin with, such as the vacuum state:
$$a_i\, |0\rangle = 0 \quad \text{(A-23)}$$
We call $a_i$ "the annihilation operator" for the state $|u_i\rangle$.
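To make these actions concrete, here is a small numerical sketch (an addition to the text, not part of the original): a single bosonic mode is represented by matrices truncated at a maximum occupation number, and relations (A-16), (A-22) and (A-23) are checked directly. The truncation value n_max and the use of NumPy are illustrative assumptions.

import numpy as np

n_max = 6  # arbitrary truncation of the occupation number (illustrative assumption)

# annihilation operator: <n-1| a |n> = sqrt(n), cf. (A-22)
a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)
a_dag = a.conj().T                       # creation operator, cf. (A-16)/(A-20)

ket2 = np.zeros(n_max + 1); ket2[2] = 1.0        # Fock state |n = 2>

print(a_dag @ ket2)    # component n = 3 equals sqrt(3): a^+ |2> = sqrt(3) |3>
print(a @ ket2)        # component n = 1 equals sqrt(2): a |2> = sqrt(2) |1>

vac = np.zeros(n_max + 1); vac[0] = 1.0
print(a @ vac)         # the null vector: a |0> = 0, cf. (A-23)
# Because of the truncation, a_dag acting on |n_max> gives 0 instead of
# sqrt(n_max + 1) |n_max + 1>; this is an artifact of the finite matrix size.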

A-3-b. Fermions

For fermions, relation (A-18) allows writing the matrix elements:
$$\langle n_i = 1, n_j, \ldots, n_l, \ldots |\, a_i^\dagger\, | n_j, \ldots, n_l, \ldots \rangle = 1 \quad \text{(A-24)}$$
The only non-zero elements are those where all the individual occupied states are left unchanged in the bra and the ket, except for the state $u_i$, only present in the bra, but not in the ket. As for the occupation numbers, none change, except for $n_i$ which goes from 0 (in the ket) to 1 (in the bra). The Hermitian conjugation operation then yields the action of the corresponding annihilation operator:
$$a_i\, |n_i = 1, n_j, \ldots, n_l, \ldots\rangle = |n_j, \ldots, n_l, \ldots\rangle \quad \text{(A-25)}$$
or, if initially the state $u_i$ is not occupied:
$$a_i\, |n_j, \ldots, n_l, \ldots\rangle = 0 \quad \text{(A-26)}$$
Relations (A-22) and (A-23) are also valid for fermions, with the usual condition that all occupation numbers should be equal to 0 or 1; otherwise, the relations amount to 0 = 0.

Comment: To use relation (A-25) when the state $u_i$ is already occupied but not listed in the first position, we first have to bring it there; if this requires an odd permutation, a change of sign will occur. For example:
$$a_j\, |n_i = 1, n_j = 1, \ldots\rangle = -\, a_j\, |n_j = 1, n_i = 1, \ldots\rangle = -\, |n_i = 1, \ldots\rangle \quad \text{(A-27)}$$
For fermions, the operators $a_i^\dagger$ and $a_i$ therefore act on the individual state that is listed in the first position in the $N$-particle ket; $a_i$ destroys the first state in the list, and $a_i^\dagger$ creates a new state placed at the beginning of the list. Forgetting this could lead to errors in sign.

A-4. Occupation number operators (bosons and fermions)

Consider the operator $\hat{n}_i$ defined by:
$$\hat{n}_i = a_i^\dagger\, a_i \quad \text{(A-28)}$$
and its action on a Fock state. For bosons, if we apply successively formulas (A-22) and (A-16), we see that this operator yields the same Fock state, but multiplied by its occupation number $n_i$. For fermions, if the state $u_i$ is empty in the Fock state, relation (A-26) shows that the action of the operator $\hat{n}_i$ yields zero. If the state $u_i$ is occupied, we must first permute the states to bring $u_i$ to the first position, which may eventually change the sign in front of the Fock space ket. The successive application on this ket of (A-25) and (A-19) shows that the action of the operator $\hat{n}_i$ leaves this ket unchanged; we then move the state $u_i$ back to its initial position, which may introduce a second change in sign, canceling the first one. We finally obtain for fermions the same result as for bosons, except that the $n_i$ can only take the values 1 and 0. In both cases, the Fock states are the eigenvectors of the operator $\hat{n}_i$ with the occupation numbers as eigenvalues; consequently, this operator is named the "occupation number operator of the state $u_i$". The operator associated with the total number of particles is simply the sum:
$$\hat{N} = \sum_i \hat{n}_i = \sum_i a_i^\dagger\, a_i \quad \text{(A-29)}$$

A-5. Commutation and anticommutation relations

Creation and annihilation operators have very simple commutation (for bosons) and anticommutation (for fermions) properties, which make them easy tools for taking into account the symmetrization or antisymmetrization of the state vectors. To simplify the notation, each time the equations refer to a single basis of individual states $\{|u_i\rangle\}$, we shall write $a_i$ instead of $a_{u_i}$. If, however, this could lead to ambiguity, we will return to the full notation.

A-5-a. Bosons: commutation relations

Consider, for bosons, the two operators $a_i^\dagger$ and $a_j^\dagger$. If the two subscripts $i$ and $j$ are different, they correspond to orthogonal states $|u_i\rangle$ and $|u_j\rangle$. Using twice (A-16) then yields:
$$a_i^\dagger\, a_j^\dagger\, |n_1, \ldots, n_i, \ldots, n_j, \ldots\rangle = \sqrt{n_i + 1}\,\sqrt{n_j + 1}\; |n_1, \ldots, n_i + 1, \ldots, n_j + 1, \ldots\rangle \quad \text{(A-30)}$$
Changing the order of the operators yields the same result. As the Fock states form a basis, we can deduce that the commutator of $a_i^\dagger$ and $a_j^\dagger$ is zero if $i \neq j$. In the same way, it is easy to show that both operator products $a_i a_j$ and $a_j a_i$ acting on the same ket yield the same result (a ket having two occupation numbers lowered by 1); $a_i$ and $a_j$ thus commute if $i \neq j$. Finally, the same procedure allows showing that $a_i$ and $a_j^\dagger$ commute if $i \neq j$. Now, if $i = j$, we must evaluate the commutator of $a_i$ and $a_i^\dagger$. Let us apply (A-16) and (A-22) successively, first in that order, and then in the reverse order:
$$a_i\, a_i^\dagger\, |n_1, \ldots, n_i, \ldots\rangle = (n_i + 1)\, |n_1, \ldots, n_i, \ldots\rangle \qquad a_i^\dagger\, a_i\, |n_1, \ldots, n_i, \ldots\rangle = n_i\, |n_1, \ldots, n_i, \ldots\rangle \quad \text{(A-31)}$$
The commutator of $a_i$ and $a_i^\dagger$ is therefore equal to 1 for all values of the subscript $i$. All the previous results are summarized in three equalities valid for bosons:
$$[a_i, a_j] = 0 \qquad [a_i^\dagger, a_j^\dagger] = 0 \qquad [a_i, a_j^\dagger] = \delta_{ij} \quad \text{(A-32)}$$

A-5-b. Fermions: anticommutation relations

For fermions, let us first assume that the subscripts $i$ and $j$ are different. The successive action of $a_j^\dagger$ and $a_i^\dagger$ on an occupation number ket only yields a non-zero ket if $n_i = n_j = 0$; using twice (A-18) leads to:
$$a_i^\dagger\, a_j^\dagger\, |n_i = 0, n_j = 0, \ldots\rangle = |n_i = 1, n_j = 1, \ldots\rangle \quad \text{(A-33)}$$
but, if we change the order:
$$a_j^\dagger\, a_i^\dagger\, |n_i = 0, n_j = 0, \ldots\rangle = |n_j = 1, n_i = 1, \ldots\rangle = -\,|n_i = 1, n_j = 1, \ldots\rangle \quad \text{(A-34)}$$
Consequently, the sign change that goes with the permutation of the two individual states leads to:
$$a_i^\dagger\, a_j^\dagger = -\, a_j^\dagger\, a_i^\dagger \qquad \text{if } i \neq j \quad \text{(A-35)}$$
If we define the anticommutator $[A, B]_+$ of two operators $A$ and $B$ by:
$$[A, B]_+ = AB + BA \quad \text{(A-36)}$$
(A-35) may be written as:
$$[a_i^\dagger, a_j^\dagger]_+ = 0 \qquad \text{if } i \neq j \quad \text{(A-37)}$$
Taking the Hermitian conjugate of (A-35), we get:
$$a_j\, a_i = -\, a_i\, a_j \qquad \text{if } i \neq j \quad \text{(A-38)}$$
which can be written as:
$$[a_i, a_j]_+ = 0 \qquad \text{if } i \neq j \quad \text{(A-39)}$$
Finally, we show by the same method that the anticommutator of $a_i$ and $a_j^\dagger$ is zero: both products only yield a non-zero result when acting on a ket where $n_i = 1$ and $n_j = 0$, and these two occupation numbers are then interchanged. The computation goes as follows:
$$a_i\, a_j^\dagger\, |n_i = 1, n_j = 0, \ldots\rangle = -\, a_i\, |n_i = 1, n_j = 1, \ldots\rangle = -\, |n_i = 0, n_j = 1, \ldots\rangle \quad \text{(A-40)}$$
and:
$$a_j^\dagger\, a_i\, |n_i = 1, n_j = 0, \ldots\rangle = a_j^\dagger\, |n_i = 0, n_j = 0, \ldots\rangle = |n_i = 0, n_j = 1, \ldots\rangle \quad \text{(A-41)}$$
Adding those two equations yields zero, hence proving that the anticommutator is zero:
$$[a_i, a_j^\dagger]_+ = 0 \qquad \text{if } i \neq j \quad \text{(A-42)}$$
In the case where $i = j$, the limitation on the occupation numbers (0 or 1) leads to:
$$(a_i)^2 = 0 \qquad \text{and} \qquad (a_i^\dagger)^2 = 0 \quad \text{(A-43)}$$
Equalities (A-37) and (A-39) are therefore still valid if $i$ and $j$ are equal. We are now left with the computation of the anticommutator of $a_i$ and $a_i^\dagger$. Let us first examine the product $a_i\, a_i^\dagger$; it yields zero if applied to a ket having an occupation number $n_i = 1$, but leaves unchanged any ket with $n_i = 0$, since the particle created by $a_i^\dagger$ is then annihilated by $a_i$. We get the inverse result for the product $a_i^\dagger\, a_i$ where the order has been inverted: it yields zero if $n_i = 0$, and leaves the ket unchanged if $n_i = 1$. Finally, whatever the occupation number ket is, one of the terms of the anticommutator yields zero, the other 1, and the net result is always 1. Therefore:
$$[a_i, a_i^\dagger]_+ = 1 \quad \text{(A-44)}$$
All the previous results valid for fermions are summarized in the following three relations, which are for fermions the equivalent of relations (A-32) for bosons:
$$[a_i, a_j]_+ = 0 \qquad [a_i^\dagger, a_j^\dagger]_+ = 0 \qquad [a_i, a_j^\dagger]_+ = \delta_{ij} \quad \text{(A-45)}$$

A-5-c. Common relations for bosons and fermions

To regroup the results valid for bosons and fermions in common relations, we introduce the notation:
$$[A, B]_{-\eta} = AB - \eta\, BA \quad \text{(A-46)}$$
with:
$$\eta = +1 \ \text{for bosons} \qquad \eta = -1 \ \text{for fermions} \quad \text{(A-47)}$$
so that (A-46) is the commutator of $A$ and $B$ for bosons, and their anticommutator for fermions. We then have:
$$[a_i, a_j]_{-\eta} = 0 \qquad [a_i^\dagger, a_j^\dagger]_{-\eta} = 0 \qquad \text{for all } i \text{ and } j \text{ (equal or different)} \quad \text{(A-48)}$$
and the only non-zero combinations are:
$$[a_i, a_j^\dagger]_{-\eta} = \delta_{ij} \quad \text{(A-49)}$$

Change of basis

What are the effects on the creation and annihilation operators of a change of and have been introduced by their basis for the individual states? The operators action on the Fock states, defined by relations (A-7) and (A-10) for which a given basis of individual states was chosen. One could also choose any another orthonormal basis and define in the same way bases for the Fock state and creation and annihilation operators. What is the relation between these new operators and the ones we defined earlier with the initial basis? For creation operators acting on the vacuum state 0 , the answer is quite straightforward: the action of on 0 yields a one-particle ket, which can be written as: 0 = 1:

=

1:

=

0

(A-50)

This result leads us to expect a simple linear relation of the type: =

(A-51)

with its Hermitian conjugate: =

(A-52)

Equation (A-51) implies that creation operators are transformed by the same unitary relation as the individual states. Commutation or anticommutation relations are then conserved, since: =

=

(A-53)

1601

CHAPTER XV

CREATION AND ANNIHILATION OPERATORS FOR IDENTICAL PARTICLES

which amounts to (as expected): =

=

(A-54)

Furthermore, it is straightforward to show that the creation operators commute (or anticommute), as do the annihilation operators. Equivalence of the two bases We have not yet shown the complete equivalence of the two bases, which can be done following two different approaches. In the first one, we use (A-51) and (A-52) to define the creation and annihilation operators in the new basis. The associated Fock states are defined by replacing the by the in relations (A-17) for the bosons, and (A-18) for the fermions. We then have to show that these new Fock states are still related to the states with numbered particles as in (A-18) for bosons, and (A-10) for fermions. This will establish the complete equivalence of the two bases. We shall follow a second approach where the two bases are treated completely symmetrically. Replacing in relations (A-7) and (A-10) the by the , we construct the new Fock basis. We next define the operators by transposing relations (A-17) and (A-18) to the new basis. We then must verify that these operators obey relation (A-51), without limiting ourselves, as in (A-50), to their action on the vacuum state. (i) Bosons Relations (A-7) and (A-17) lead to: 0 =

!

1:

; 2:

;

;

:

;

+1:

; ;

+

:

;

(A-55)

where, on the right-hand side, the first particles occupy the same individual state , the following particles, numbered from + 1 to + , the individual state , etc. The equivalent relation in the second basis can be written:

0 =

!

1:

; 2:

;

;

:

;

+1:

; ;

+

:

;

(A-56)

with: +

+

=

+

+

=

(A-57)

Replacing on the right-hand side of (A-56), the first ket =

i

by: (A-58)

i

i

we obtain: i i

1602

!

1:

i;

2:

;

;

:

;

+1:

; ;

+

:

;

(A-59)

B. ONE-PARTICLE SYMMETRIC OPERATORS

Following the same procedure for all the basis vectors of the right-hand side, we can replace it by: i

j

i

i

j

i

!

1:

i;

2:

i

;

;

j

(A-60)

j

+1:

j;

+2:

j

;

6

or else , taking into account (A-55):

i i

j

i j

i

j

0

j

(A-61) We have thus shown that the operators act on the vacuum state in the same way as the operators defined by (A-51), raised to the powers , , .. When the occupation numbers , , .. can take on any values, the kets (A-56) span the entire Fock space. Writing the previous equality for and + 1, we see that the action on all the basis kets of and of yields the same result, establishing the equality between these two operators. Relation (A-52) can be readily obtained by Hermitian conjugation.

(ii) Fermions The demonstration is identical, with the constraint that the occupation numbers are 0 or 1. As this requires no changes in the operator or state order, it involves no sign changes.

B.

One-particle symmetric operators

Using creation and annihilation operators makes it much easier to deal, in the Fock space, with physical operators that are thus symmetric (§ C-4-a of Chapter XIV). We first study the simplest of such operators, those which act on a single particle and are called "one-particle operators".

B-1. Definition

Consider an operator $\hat{f}$ defined in the space of individual states; $\hat{f}(q)$ acts in the state space of particle $q$. It could be for example the momentum of the $q$-th particle, or its angular momentum with respect to the origin. We now build the operator associated with the total momentum of the $N$-particle system, or its total angular momentum, which is the sum over $q$ of all the $\hat{f}(q)$ associated with the individual particles. A one-particle symmetric operator acting in the space $\mathscr{E}_S(N)$ for bosons, or $\mathscr{E}_A(N)$ for fermions, is therefore defined by:
$$\hat{F}(N) = \sum_{q=1}^{N} \hat{f}(q) \quad \text{(B-1)}$$

(contrary to states, which are symmetric for bosons and antisymmetric for fermions, the physical operators are always symmetric). The operator $\hat{F}$ acting in the Fock space is defined as the operator $\hat{F}(N)$ acting either in $\mathscr{E}_S(N)$ or in $\mathscr{E}_A(N)$, depending on the specific case. Since the basis for the entire Fock space is the union of the bases of these spaces for all values of $N$, the operator $\hat{F}$ is thus well defined in the direct sum of all these subspaces. To summarize:
$$\hat{F}\big|_{\mathscr{E}_S(N)\ \text{or}\ \mathscr{E}_A(N)} = \hat{F}(N) \qquad N = 1, 2, 3, \ldots \quad \text{(B-2)}$$

Using (B-1) directly to compute the matrix elements of often leads to tedious manipulations. Starting with an operator involving numbered particles, we place it between states with numbered particles; we then symmetrize the bra, the ket, and take into account the symmetry of the operator (cf. footnote 1). This introduces several summations (on the particles and on the permutations) that have to be properly regrouped to be simplified. We will now show that expressing in terms of creation and annihilation operators avoids all these intermediate calculations, taking nevertheless into account all the symmetry properties. B-2.

Expression in terms of the operators

and

We choose a basis $\{|u_k\rangle\}$ for the individual states. The matrix elements $f_{kl}$ of the one-particle operator $\hat f$ are given by:
$$f_{kl}=\langle u_k|\hat f|u_l\rangle \tag{B-3}$$
They can be used to expand the operator itself as follows:
$$\hat f(q)=\sum_{k,l}|q:u_k\rangle\langle q:u_k|\,\hat f(q)\,|q:u_l\rangle\langle q:u_l|=\sum_{k,l}f_{kl}\,|q:u_k\rangle\langle q:u_l| \tag{B-4}$$

B-2-a. Action of $\hat F(N)$ on a ket with $N$ particles

Using in (B-1) the expression (B-4) for $\hat f(q)$ leads to:
$$\hat F(N)=\sum_{k,l}f_{kl}\,\sum_{q=1}^{N}|q:u_k\rangle\langle q:u_l| \tag{B-5}$$
The action of $\hat F(N)$ on a symmetrized ket written as in (A-9) therefore includes a sum of terms:
$$\sum_{q=1}^{N}|q:u_k\rangle\langle q:u_l|\;\;|n_i,\,n_j,\,\ldots\rangle \tag{B-6}$$
with coefficients $f_{kl}$. Let us use (A-7) or (A-10) to compute this ket for given values of $k$ and $l$. As the operator contained in the bracket is symmetric with respect to the exchange of particles, it commutes with the two operators $S$ and $A$ (§ C-4-a of Chapter XIV), and the ket (B-6) can be written as:
$$\sqrt{\frac{N!}{n_i!\,n_j!\cdots}}\;S_\eta\;\sum_{q=1}^{N}|q:u_k\rangle\langle q:u_l|\;\;|1:u_i;\,2:u_i;\,\ldots;\,n_i:u_i;\,n_i+1:u_j;\,\ldots;\,n_i+n_j:u_j;\,\ldots\rangle \tag{B-7}$$

In the summation over $q$, the only non-zero terms are those for which the individual state $|u_l\rangle$ coincides with the individual state occupied, in the ket on the right, by the particle labeled $q$; there are $n_l$ different values of $q$ that obey this condition (i.e. none or one for fermions). For these terms, the operator $|q:u_k\rangle\langle q:u_l|$ transforms the state $|q:u_l\rangle$ into $|q:u_k\rangle$, and then $S$ (or $A$) reconstructs a symmetrized (but not normalized) ket:
$$\sqrt{\frac{N!}{n_i!\,n_j!\cdots}}\;S_\eta\,|1:u_i;\,\ldots;\,q:u_k;\,\ldots;\,n_i+n_j:u_j;\,\ldots\rangle \tag{B-8}$$
This ket is always the same for all the $n_l$ values of $q$ selected above (for fermions, it might vanish, if the state $|u_k\rangle$ was already occupied in the initial ket). We shall then distinguish two cases:

(i) For $k\neq l$, and for bosons, the ket written in (B-8) equals:
$$\sqrt{\frac{n_k+1}{n_l}}\;\,|\ldots,\,n_k+1,\,\ldots,\,n_l-1,\,\ldots\rangle \tag{B-9}$$
where the square root factor comes from the variation in the occupation numbers $n_k$ and $n_l$, which changes the numerical coefficients in the definition (A-7) of the Fock states. As this ket is obtained $n_l$ times, the overall factor becomes $\sqrt{n_l\,(n_k+1)}$. This is exactly the factor produced by the action on the same symmetrized ket of the operator $a_{u_k}^\dagger a_{u_l}$, which also removes a particle from the state $|u_l\rangle$ and creates a new one in the state $|u_k\rangle$. Consequently, the operator $a_{u_k}^\dagger a_{u_l}$ reproduces exactly the same effect as the sum over $q$. For fermions, the result is zero except when, in the initial ket, the state $|u_l\rangle$ was occupied and the state $|u_k\rangle$ empty, in which case no numerical factor appears; as before, this is exactly what the action of the operator $a_{u_k}^\dagger a_{u_l}$ would do.

(ii) If $k=l$, for bosons the only numerical factor involved is $n_l$, coming from the number of terms in the sum over $q$ that yield the same symmetrized ket. For fermions, the only condition that yields a non-zero result is that the state $|u_l\rangle$ be occupied, which also leads to the factor $n_l$. In both cases, the sum over $q$ amounts to the action of the operator $a_{u_l}^\dagger a_{u_l}=\hat n_l$.

We have thus shown that:
$$\sum_{q=1}^{N}|q:u_k\rangle\langle q:u_l|=a_{u_k}^\dagger\,a_{u_l} \tag{B-10}$$
The summation over $k$ and $l$ in (B-5) then yields:
$$\hat F(N)=\sum_{k,l}f_{kl}\;a_{u_k}^\dagger\,a_{u_l}=\sum_{k,l}\langle u_k|\hat f|u_l\rangle\;a_{u_k}^\dagger\,a_{u_l} \tag{B-11}$$

B-2-b. Expression valid in the entire Fock space

The right-hand side of (B-11) contains an expression completely independent of the space $E_S(N)$ or $E_A(N)$ in which we defined the action of the operator $\hat F(N)$. Since we defined the operator $\hat F$ as acting as $\hat F(N)$ in each of these subspaces of fixed $N$, we can simply write:
$$\hat F=\sum_{k,l}\langle u_k|\hat f|u_l\rangle\;a_{u_k}^\dagger\,a_{u_l} \tag{B-12}$$
This is the expression of one-particle symmetric operators we were looking for. Its form is valid for any value of $N$ and the particles are no longer numbered; it contains equal numbers of creation and annihilation operators, which act only on the occupation numbers.

Comment:

Choosing the proper basis $\{|u_k\rangle\}$, it is always possible to diagonalize the Hermitian operator $\hat f$ and write:
$$\hat f\,|u_k\rangle=f_k\,|u_k\rangle \tag{B-13}$$
Equality (B-11) is then simply written as:
$$\hat F=\sum_k f_k\;a_{u_k}^\dagger a_{u_k}=\sum_k f_k\;\hat n_k \tag{B-14}$$
where $\hat n_k=a_{u_k}^\dagger a_{u_k}$ is the occupation number operator in the state $|u_k\rangle$ defined in (A-28).

B-3. Examples

A first very simple example is the operator $\hat N$ corresponding to the total number of particles, already described in (A-29):
$$\hat N=\sum_i a_{u_i}^\dagger a_{u_i}=\sum_i \hat n_i \tag{B-15}$$
As expected, this operator does not depend on the basis chosen to count the particles, as we now show. Using the unitary transformations (A-51) and (A-52) of the operators, and with the full notation for the creation and annihilation operators to avoid any ambiguity, we get:
$$\sum_s a_{v_s}^\dagger a_{v_s}=\sum_s\sum_{i,j}\langle u_i|v_s\rangle\langle v_s|u_j\rangle\;a_{u_i}^\dagger a_{u_j}=\sum_{i,j}\langle u_i|u_j\rangle\;a_{u_i}^\dagger a_{u_j} \tag{B-16}$$
which shows that:
$$\sum_s a_{v_s}^\dagger a_{v_s}=\sum_i a_{u_i}^\dagger a_{u_i} \tag{B-17}$$
For a spinless particle one can also define the operator corresponding to the probability density at point $\mathbf r_0$:
$$\hat f=|\mathbf r_0\rangle\langle\mathbf r_0| \tag{B-18}$$
Relation (B-12) then leads to the "particle local density" (or "single density") operator:
$$\hat D(\mathbf r_0)=\sum_{k,l}u_k^*(\mathbf r_0)\,u_l(\mathbf r_0)\;a_{u_k}^\dagger a_{u_l} \tag{B-19}$$
The same procedure as above shows that this operator is independent of the basis $\{|u_k\rangle\}$ chosen in the space of individual states.

Let us assume now that the chosen basis is formed by the eigenvectors $|\mathbf k_i\rangle$ of a particle's momentum $\hbar\mathbf k_i$, and that the corresponding annihilation operators are noted $a_{\mathbf k_i}$. The operator associated with the total momentum of the system can be written as:
$$\hat{\mathbf P}=\sum_i \hbar\mathbf k_i\;a_{\mathbf k_i}^\dagger a_{\mathbf k_i}=\sum_i \hbar\mathbf k_i\;\hat n_{\mathbf k_i} \tag{B-20}$$
As for the kinetic energy of the particles, its associated operator is expressed as:
$$\hat H_0=\sum_i \frac{\hbar^2 k_i^2}{2m}\;a_{\mathbf k_i}^\dagger a_{\mathbf k_i}=\sum_i \frac{\hbar^2 k_i^2}{2m}\;\hat n_{\mathbf k_i} \tag{B-21}$$
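Since (B-20) and (B-21) are diagonal in the occupation numbers, their average values in a plane-wave Fock state are plain weighted sums over the occupied modes. The following minimal numerical sketch (not taken from the text) evaluates them; the box size, mass and chosen occupations are illustrative assumptions.

```python
# Sketch: <P> and <H_0> for a plane-wave Fock state given by its occupation numbers n_k,
# following (B-20) and (B-21). All physical parameters below are illustrative choices.
import numpy as np

hbar = 1.054571817e-34     # J s
m = 9.109e-31              # kg (electron mass, for illustration)
L = 1e-8                   # m, edge of the cubic box (periodic boundary conditions)

# allowed wave vectors k = 2*pi*n/L; keep a small illustrative set of occupied modes
ns = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0)]
k_vectors = [2 * np.pi * np.array(n) / L for n in ns]
occupations = [1, 1, 1, 1, 1]          # n_k for each mode (fermions: 0 or 1)

# <P> = sum_k hbar k n_k   and   <H_0> = sum_k (hbar^2 k^2 / 2m) n_k
P = sum(n * hbar * k for n, k in zip(occupations, k_vectors))
E0 = sum(n * hbar**2 * np.dot(k, k) / (2 * m) for n, k in zip(occupations, k_vectors))

print("total momentum <P> =", P, "kg m/s")
print("kinetic energy <H0> =", E0, "J")
```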

B-4. Single particle density operator

Consider the average value $\langle\hat F\rangle$ of a one-particle operator $\hat F$ in an arbitrary $N$-particle quantum state. It can be expressed, using relation (B-12), as a function of the average values of the operator products $a_{u_k}^\dagger a_{u_l}$:
$$\langle\hat F\rangle=\sum_{k,l}\langle u_k|\hat f|u_l\rangle\;\big\langle a_{u_k}^\dagger a_{u_l}\big\rangle \tag{B-22}$$
This expression is close to that of the average value of an operator for a physical system composed of a single particle. Remember (Complement EIII, § 4-b) that if such a system is described by a density operator $\hat\rho_1(1)$, the average value of any operator $\hat f(1)$ is written as:
$$\langle\hat f(1)\rangle=\mathrm{Tr}\big\{\hat f(1)\,\hat\rho_1(1)\big\}=\sum_{k,l}\langle u_k|\hat f|u_l\rangle\,\langle u_l|\hat\rho_1|u_k\rangle \tag{B-23}$$
The above two expressions can be made to coincide if, for the system of identical particles, we introduce a "density operator reduced to a single particle" $\hat\rho_1$ whose matrix elements are defined by:
$$\langle u_l|\hat\rho_1|u_k\rangle=\big\langle a_{u_k}^\dagger a_{u_l}\big\rangle \tag{B-24}$$
This reduced operator allows computing the average values of all the single particle operators as if the system consisted of a single particle:
$$\langle\hat F\rangle=\mathrm{Tr}\big\{\hat f\,\hat\rho_1\big\} \tag{B-25}$$
where the trace is taken in the state space of a single particle. The trace of the reduced density operator thus defined is not equal to unity, but to the average particle number, as can be shown using (B-24) and (B-15):
$$\mathrm{Tr}\,\hat\rho_1=\sum_k\big\langle a_{u_k}^\dagger a_{u_k}\big\rangle=\big\langle\hat N\big\rangle \tag{B-26}$$
This normalization convention can be useful. For example, the diagonal matrix element of $\hat\rho_1$ in the position representation is simply the average of the particle local density defined in (B-19):
$$\langle\mathbf r_0|\hat\rho_1|\mathbf r_0\rangle=\big\langle\hat D(\mathbf r_0)\big\rangle \tag{B-27}$$
It is however easy to choose a different normalization for the reduced density operator: its trace can be made equal to 1 by dividing the right-hand side of definition (B-24) by the factor $\langle\hat N\rangle$.
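As a quick illustration of (B-24)-(B-26), here is a small sketch (not from the text) for the particularly simple case where the state is a single Fock state, in which case $\langle a_{u_k}^\dagger a_{u_l}\rangle = n_k\,\delta_{kl}$ and the reduced operator is diagonal; the chosen occupations and the operator $\hat f$ are arbitrary.

```python
# Sketch: one-particle reduced density matrix for a single Fock state |n_1, n_2, ...>.
import numpy as np

occupations = np.array([1, 1, 0, 2, 0])      # n_k in a chosen single-particle basis
rho1 = np.diag(occupations).astype(float)    # <u_l| rho_1 |u_k> = <a_k^dagger a_l> = n_k delta_kl

print(np.trace(rho1))        # equals the total particle number, as in (B-26)

# average of a one-particle operator f: <F> = Tr(f rho_1), as in (B-25);
# here f is taken diagonal in the same basis (illustrative values)
f = np.diag([0.0, 1.0, 2.0, 3.0, 4.0])
print(np.trace(f @ rho1))    # = sum_k f_k n_k, in agreement with (B-14)
```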

C. Two-particle operators

We now extend the previous results to the case of two-particle operators.

C-1. Definition

Consider a physical quantity involving two particles, labeled $q$ and $q'$. It is associated with an operator $\hat g(q,q')$ acting in the state space of these two particles (the tensor product of the two individual state spaces). Starting from this binary operator, the easiest way to obtain a symmetric $N$-particle operator is to sum all the $\hat g(q,q')$ over all the particles $q$ and $q'$, where the two subscripts $q$ and $q'$ range from 1 to $N$. Note, however, that in this sum all the terms where $q=q'$ add up to form a one-particle operator of exactly the same type as those studied in § B-1. Consequently, to obtain a genuine two-particle operator we shall exclude the terms where $q=q'$ and define:
$$\hat G(N)=\frac{1}{2}\sum_{\substack{q,q'=1\\ q\neq q'}}^{N}\hat g(q,q') \tag{C-1}$$
The factor $1/2$ present in this expression is arbitrary but often handy. If for example the operator describes an interaction energy that is the sum of the contributions of all the distinct pairs of particles, $\hat g(q,q')$ and $\hat g(q',q)$ corresponding to the same pair are equal and appear twice in the sum over $q$ and $q'$: the factor $1/2$ avoids counting them twice. Whenever $\hat g(q,q')=\hat g(q',q)$, it is equivalent to write $\hat G(N)$ in the form:
$$\hat G(N)=\sum_{q<q'}\hat g(q,q') \tag{C-2}$$
As with the one-particle operators, expression (C-1) defines symmetric operators separately in each physical state space having a given particle number $N$. This definition may be extended to the entire Fock space, which is their direct sum over all $N$. This results in a more general operator $\hat G$, following the same scheme as for (B-2):
$$\hat G(N)\;\;\text{acting in each subspace of fixed }N=1,2,3,\ldots\;\;\Longrightarrow\;\;\hat G \tag{C-3}$$

C-2. A simple case: factorization

Let us first assume the operator $\hat g(1,2)$ can be factored as:
$$\hat g(1,2)=\hat f(1)\,\hat h(2) \tag{C-4}$$
The operator written in (C-1) then becomes:
$$\hat G(N)=\frac{1}{2}\sum_{q\neq q'}\hat f(q)\,\hat h(q')=\frac{1}{2}\Big[\sum_{q=1}^{N}\hat f(q)\Big]\Big[\sum_{q'=1}^{N}\hat h(q')\Big]-\frac{1}{2}\sum_{q=1}^{N}\hat f(q)\,\hat h(q) \tag{C-5}$$
The right-hand side of this expression starts with a product of one-particle operators, each of which can be replaced, following (B-11), by its expression in terms of the creation and annihilation operators:
$$\sum_{q=1}^{N}\hat f(q)=\sum_{k,l}f_{kl}\;a_k^\dagger a_l\qquad\text{and}\qquad\sum_{q'=1}^{N}\hat h(q')=\sum_{m,n}h_{mn}\;a_m^\dagger a_n \tag{C-6}$$
As for the last term on the right-hand side of (C-5), it is already a one-particle operator:
$$\sum_{q=1}^{N}\hat f(q)\,\hat h(q)=\sum_{k,l,n}f_{kl}\,h_{ln}\;a_k^\dagger a_n \tag{C-7}$$
This leads to:
$$\hat G(N)=\frac{1}{2}\sum_{k,l}\sum_{m,n}f_{kl}\,h_{mn}\;a_k^\dagger a_l\,a_m^\dagger a_n-\frac{1}{2}\sum_{k,l,n}f_{kl}\,h_{ln}\;a_k^\dagger a_n \tag{C-8}$$
We can then use the general relations (A-49) to transform the operator product:
$$a_l\,a_m^\dagger=\eta\,a_m^\dagger a_l+\delta_{lm} \tag{C-9}$$
Including this form in the first term on the right-hand side of (C-8) yields, for the $\delta_{lm}$ contribution:
$$\frac{1}{2}\sum_{k,l,n}f_{kl}\,h_{ln}\;a_k^\dagger a_n \tag{C-10}$$
which exactly cancels the second term of (C-8). Consequently, we are left with:
$$\hat G(N)=\frac{1}{2}\sum_{k,l,m,n}f_{kl}\,h_{mn}\;a_k^\dagger\,a_m^\dagger\,a_n\,a_l \tag{C-11}$$
As the right-hand side of this expression has the same form in all the spaces of fixed $N$, it is also valid for the operator $\hat G$ acting in the entire Fock space.

C-3. General case

Any two-particle operator $\hat g(1,2)$ may be decomposed as a sum of products of single particle operators:
$$\hat g(1,2)=\sum_{\alpha,\beta}c_{\alpha\beta}\;\hat f_\alpha(1)\,\hat h_\beta(2) \tag{C-12}$$
where the coefficients $c_{\alpha\beta}$ are numbers (cf. footnote 7). Hence expression (C-1) can be written as:
$$\hat G(N)=\frac{1}{2}\sum_{\alpha,\beta}c_{\alpha\beta}\sum_{\substack{q,q'=1\\ q\neq q'}}^{N}\hat f_\alpha(q)\,\hat h_\beta(q') \tag{C-13}$$
In this linear combination with coefficients $c_{\alpha\beta}$, each term (corresponding to a given $\alpha$ and $\beta$) is of the form (C-5) and can therefore be replaced by expression (C-11). This leads to:
$$\hat G(N)=\frac{1}{2}\sum_{\alpha,\beta}c_{\alpha\beta}\sum_{k,l,m,n}(f_\alpha)_{kl}\,(h_\beta)_{mn}\;a_k^\dagger\,a_m^\dagger\,a_n\,a_l \tag{C-14}$$
The right-hand side of this equation has the same form in all the spaces of fixed $N$; hence it is valid in the entire Fock space. Furthermore, we recognize in the summation over $\alpha$ and $\beta$ the matrix element of $\hat g$ as defined by (C-12):
$$\sum_{\alpha,\beta}c_{\alpha\beta}\,(f_\alpha)_{kl}\,(h_\beta)_{mn}=\langle 1:u_k;\,2:u_m|\,\hat g(1,2)\,|1:u_l;\,2:u_n\rangle \tag{C-15}$$
The final result is then:
$$\hat G=\frac{1}{2}\sum_{k,l,m,n}\langle 1:u_k;\,2:u_m|\,\hat g(1,2)\,|1:u_l;\,2:u_n\rangle\;a_k^\dagger\,a_m^\dagger\,a_n\,a_l \tag{C-16}$$
which is the general expression for a two-particle symmetric operator.

As for the one-particle operators, each term of expression (C-16) contains equal numbers of creation and annihilation operators. Consequently, these symmetric operators do not change the total number of particles, as was obvious from their initial definition.

(Footnote 7: The two-particle state space is the tensor product of the two spaces of individual states (see § F-4-b of Chapter II). In the same way, the space of operators acting on two particles is the tensor product of the spaces of operators acting separately on each of them. For example, the operator for the interaction potential between two particles can be decomposed as a sum of products of two operators: the first one a function of the position of the first particle, and the second one a function of the position of the second particle.)

C-4. Two-particle reduced density operator

Relation (C-16) implies that the average value of any two-particle operator may be written as:
$$\langle\hat G\rangle=\frac{1}{2}\sum_{k,l,m,n}\langle 1:u_k;\,2:u_m|\,\hat g(1,2)\,|1:u_l;\,2:u_n\rangle\;\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle \tag{C-17}$$

Figure 1: Physical interaction between two identical particles: initially in the states $|u_k\rangle$ and $|u_l\rangle$ (schematized by the letters $k$ and $l$), the particles are transferred to the states $|u_i\rangle$ and $|u_j\rangle$ (schematized by the letters $i$ and $j$).

This expression is similar to the average value of an operator for a two-particle system having a density operator $\hat\rho_2(1,2)$:
$$\langle\hat g(1,2)\rangle=\sum_{k,l,m,n}\langle 1:u_k;\,2:u_m|\,\hat g(1,2)\,|1:u_l;\,2:u_n\rangle\;\langle 1:u_l;\,2:u_n|\,\hat\rho_2(1,2)\,|1:u_k;\,2:u_m\rangle \tag{C-18}$$
which leads us to define a two-particle reduced density operator $\hat\rho_2$ by:
$$\langle 1:u_l;\,2:u_n|\,\hat\rho_2\,|1:u_k;\,2:u_m\rangle=\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle \tag{C-19}$$
In this definition we have left out the factor $1/2$ of (C-17), since this leads to a normalization of $\hat\rho_2$ that is often more convenient: its matrix elements in the position representation yield directly the double density (as well as the field correlation functions that we shall study in § B-3-b of Chapter XVI). The trace of $\hat\rho_2$ is then:
$$\mathrm{Tr}\,\hat\rho_2=\sum_{k,m}\big\langle a_k^\dagger\,a_m^\dagger\,a_m\,a_k\big\rangle=\big\langle\hat N(\hat N-1)\big\rangle \tag{C-20}$$
It is obviously possible to divide the right-hand side of the definition of $\hat\rho_2$ either by the factor 2, or else by the factor $\langle\hat N(\hat N-1)\rangle$ if we wish its trace to be equal to 1.

C-5. Physical discussion; consequences of the exchange

As mentioned in the introduction of this chapter, the equations no longer contain labeled particles, permutations, symmetrizers or antisymmetrizers; the total number of particles has also disappeared. We may now continue the discussion begun in § D-2 of Chapter XIV concerning the exchange terms, but in a more general way, since we no longer specify the total particle number $N$.

C-5-a. Two terms in the matrix elements

Consider a physical process (schematized in Figure 1) where, in a system of identical particles, an interaction produces a transfer from the two states $|u_k\rangle$ and $|u_l\rangle$ towards the two states $|u_i\rangle$ and $|u_j\rangle$; we assume that the four states involved are all different. In the summation over the indices of (C-16), the only terms involved in this process are those where the bra contains either $u_i$ and $u_j$, or the opposite $u_j$ and $u_i$; as for the ket, it must contain either $u_k$ and $u_l$, or the opposite $u_l$ and $u_k$. We are then left with four terms:
$$\frac{1}{2}\,\langle 1:u_i;\,2:u_j|\hat g|1:u_k;\,2:u_l\rangle\;a_i^\dagger a_j^\dagger a_l a_k\;+\;\frac{1}{2}\,\langle 1:u_j;\,2:u_i|\hat g|1:u_l;\,2:u_k\rangle\;a_j^\dagger a_i^\dagger a_k a_l$$
$$+\;\frac{1}{2}\,\langle 1:u_j;\,2:u_i|\hat g|1:u_k;\,2:u_l\rangle\;a_j^\dagger a_i^\dagger a_l a_k\;+\;\frac{1}{2}\,\langle 1:u_i;\,2:u_j|\hat g|1:u_l;\,2:u_k\rangle\;a_i^\dagger a_j^\dagger a_k a_l \tag{C-21}$$
However, since the numbers used to label the particles are dummy variables, the first two matrix elements shown in (C-21) are equal, and so are the last two. In addition, the products of creation and annihilation operators obey the following relations, for bosons ($\eta=+1$) as well as for fermions ($\eta=-1$):
$$a_i^\dagger a_j^\dagger a_l a_k=a_j^\dagger a_i^\dagger a_k a_l\qquad\text{and}\qquad a_i^\dagger a_j^\dagger a_k a_l=a_j^\dagger a_i^\dagger a_l a_k=\eta\;a_i^\dagger a_j^\dagger a_l a_k \tag{C-22}$$
These relations are obvious for bosons, since we only commute either creation operators or annihilation operators among themselves. For fermions, as we assumed all the states to be different, the anticommutation of the operators $a^\dagger$, or of the operators $a$, leads to sign changes; these may cancel out depending on whether the number of anticommutations is even or odd. If we now double the sum of the first and last terms of (C-21), we obtain the final contribution of this process to (C-16):
$$\Big[\langle 1:u_i;\,2:u_j|\hat g|1:u_k;\,2:u_l\rangle+\eta\,\langle 1:u_i;\,2:u_j|\hat g|1:u_l;\,2:u_k\rangle\Big]\;a_i^\dagger a_j^\dagger a_l a_k \tag{C-23}$$
Hence we are left with two terms whose relative sign depends on the nature (bosons or fermions) of the identical particles. They correspond to a different "switching point" for the incoming and outgoing individual states (Fig. 2). For bosons, the product of the four operators in (C-23) acting on an occupation number ket introduces the square root:
$$\sqrt{(n_i+1)(n_j+1)\,n_k\,n_l} \tag{C-24}$$
For large occupation numbers, this square root may considerably increase the value of the matrix element. For fermions, however, this amplification effect does not occur. Furthermore, if the direct and exchange matrix elements of $\hat g$ are equal, they cancel each other in (C-23) and the corresponding transition amplitude of this process is zero.

Figure 2: Two diagrams representing schematically the two terms appearing in equation (C-23); they differ by an exchange of the individual states of the outgoing particles. They correspond, in a manner of speaking, to a different "switching point" for the incoming and outgoing states. The solid lines represent the particles' free propagation, and the dashed lines their binary interaction.

C-5-b. Particle interaction energy; the direct and exchange terms

Many physics problems involve computing the average particle interaction energy. For the sake of simplicity, we shall only study here spinless particles (or, equivalently, particles all in the same internal spin state, so that the corresponding quantum number does not come into play) and assume their interactions to be binary. These interactions are then described by an operator $\hat W_{\rm int}$, diagonal in the $\{|\mathbf r_1,\,\mathbf r_2,\,\ldots,\,\mathbf r_N\rangle\}$ basis (eigenstates of all the particles' positions), which multiplies each of these states by the function:
$$W_{\rm int}(\mathbf r_1,\,\mathbf r_2,\,\ldots,\,\mathbf r_N)=\frac{1}{2}\sum_{q\neq q'}W_2(\mathbf r_q,\,\mathbf r_{q'}) \tag{C-25}$$
In this expression, the function $W_2(\mathbf r_q,\,\mathbf r_{q'})$ yields the diagonal matrix elements of the operator $\hat W_2(\mathbf R_q,\,\mathbf R_{q'})$ associated with the two-particle interaction $(q,q')$, where $\mathbf R_q$ is the quantum operator associated with the classical position $\mathbf r_q$. The matrix elements of this operator in the $\{|1:u_i;\,2:u_j\rangle\}$ basis are simply obtained by inserting a closure relation for each of the two positions. This leads to:
$$\langle 1:u_i;\,2:u_j|\,\hat W_2(\mathbf R_1,\mathbf R_2)\,|1:u_k;\,2:u_l\rangle=\int\!\mathrm d^3r_1\!\int\!\mathrm d^3r_2\;W_2(\mathbf r_1,\mathbf r_2)\;u_i^*(\mathbf r_1)\,u_j^*(\mathbf r_2)\,u_k(\mathbf r_1)\,u_l(\mathbf r_2) \tag{C-26}$$

α. General expression

Replacing in (C-16) the operator $\hat g(1,2)$ by $\hat W_2(\mathbf R_1,\mathbf R_2)$ and taking (C-26) into account, we get:
$$\hat W_{\rm int}=\frac{1}{2}\sum_{i,j,k,l}\int\!\mathrm d^3r_1\!\int\!\mathrm d^3r_2\;W_2(\mathbf r_1,\mathbf r_2)\;u_i^*(\mathbf r_1)\,u_j^*(\mathbf r_2)\,u_k(\mathbf r_1)\,u_l(\mathbf r_2)\;a_i^\dagger\,a_j^\dagger\,a_l\,a_k \tag{C-27}$$
We can thus write the average value of the interaction energy in any normalized state $|\Phi\rangle$ as:
$$\overline{W}_{\rm int}=\langle\Phi|\,\hat W_{\rm int}\,|\Phi\rangle=\frac{1}{2}\int\!\mathrm d^3r_1\!\int\!\mathrm d^3r_2\;W_2(\mathbf r_1,\mathbf r_2)\;G_2(\mathbf r_1,\mathbf r_2) \tag{C-28}$$
where $G_2(\mathbf r_1,\mathbf r_2)$ is the spatial correlation function defined by:
$$G_2(\mathbf r_1,\mathbf r_2)=\sum_{i,j,k,l}u_i^*(\mathbf r_1)\,u_j^*(\mathbf r_2)\,u_k(\mathbf r_1)\,u_l(\mathbf r_2)\;\langle\Phi|\,a_i^\dagger\,a_j^\dagger\,a_l\,a_k\,|\Phi\rangle \tag{C-29}$$

Specific case: the Fock states Let us assume the state Φ is a Fock state, with specified occupation numbers Φ =

1

:

1;

2

:

2;

;

:

;

(C-30)

We can compute explicitly, as a function of the Φ

:

, the average values:

Φ

(C-31)

contained in (C-29). We first notice that to get a non-zero result, the two operators must create particles in the same states from which they were removed by the two annihilation operators . Otherwise the action of the four operators on the ket Φ will yield a new Fock state orthogonal to the initial one, and hence a zero result. We must therefore impose either = and = , or the opposite = and = , or eventually the special case where all the subscripts are equal. The first case leads to what we call the “direct term”, and the second, the “exchange term”. We now compute their values. (i) Direct term, = and = , shown on the left diagram of Figure 3. If = = = , the four operators acting on Φ reconstruct the same ket, multiplied by the factor ( 1); this yields a zero result for fermions. If = , we can move the operator = just to the right of the first operator to form the particle number operator ˆ . This permutation in the operators’ order does not change anything: for bosons, we are moving commuting operators, and for fermions, two anticommutations introduce two minus signs, which cancel each other. The same goes for the operators with subscript , leading to the particle number ˆ . Finally, the direct term is equal to: dir 2 (r1

r2 ) =

(r1 ) =

1614

2

(r2 )

2

+

(

1)

(r1 )

2

(r2 )

2

(C-32)

C. TWO-PARTICLE OPERATORS

Figure 3: Schematic representation of a direct term (left diagram where each particle remains in the same individual state) and an exchange term (right diagram where the particles exchange their individual states). As in Figure 2, the solid lines represent the particles free propagation, and the dashed lines their binary interaction.

where the second sum is zero for fermions ( is equal to 0 or 1). (ii) Exchange term, = and = , shown on the right diagram of Figure 3. The case where all four subscripts are equal is already included in the direct term. To get the operators’ product ˆ ˆ starting from the product , we just have to permute the two central operators ; when = this operation is of no consequence for bosons, but introduces a change of sign for fermions (anticommutation). The exchange term can ex therefore be written as 2 (r1 r2 ), with: ex 2 (r1

r2 ) =

(r1 )

(r2 ) (r2 ) (r1 )

(C-33)

=

Finally, the spatial correlation function (or double density) of the direct and exchange terms: 2 (r1

r2 ) =

dir 2 (r1

r2 ) +

ex 2 (r1

r2 )

2 (r1

r2 ) is the sum (C-34)

where the factor in front of the exchange term is 1 for bosons and 1 for fermions. 2 2 The direct term only contains the product (r1 ) (r2 ) of the probability densities associated with the individual wave functions (r1 ) and (r2 ); it corresponds to noncorrelated particles. We must add to it the exchange term, which has a more complex mathematical form and reveals correlations between the particles, even when they do not interact with each other. These correlations come from explicitly taking into account the fact that the particles are identical (symmetrization or antisymmetrization of the state vector). They are sometimes called “statistical correlations ” and their spatial dependence will be studied in more detail in Complement AXVI . Conclusion The creation and annihilation operators introduced in this chapter lead to compact and general expressions for operators acting on any particle number . These expressions involve the occupation numbers of the individual states but the particles are no longer 1615
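The statistical correlations contained in (C-32)-(C-34) are easy to visualize numerically. The sketch below (an illustration, not from the text) evaluates the double density of a Fock state of spinless fermions occupying a few one-dimensional plane waves $u_k(x)=e^{ikx}/\sqrt L$; the box length and the set of occupied wave vectors are arbitrary assumptions.

```python
# Sketch: direct + exchange pair density (C-32)-(C-34) for 1D plane waves, all with n_k = 1.
import numpy as np

L = 10.0
occupied_k = 2 * np.pi * np.arange(-3, 4) / L      # 7 occupied plane waves
eta = -1.0                                         # -1 for fermions, +1 for bosons
N = len(occupied_k)

def G2(x1, x2):
    """Double density G_2(x1, x2) for this Fock state."""
    # direct term: sum_{i != j} |u_i(x1)|^2 |u_j(x2)|^2 = N(N-1)/L^2 (uniform here)
    direct = N * (N - 1) / L**2
    # exchange term: sum_{i != j} u_i*(x1) u_j(x1) u_j*(x2) u_i(x2)
    s = np.sum(np.exp(1j * occupied_k * (x1 - x2)))
    exchange = (abs(s)**2 - N) / L**2
    return direct + eta * exchange

print(G2(0.0, 0.0))     # vanishes for fermions: the "exchange hole" at x1 = x2
print(G2(0.0, L / 2))   # away from coincidence, G2 oscillates around N(N-1)/L^2
```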

Conclusion

The creation and annihilation operators introduced in this chapter lead to compact and general expressions for operators acting on any particle number $N$. These expressions involve the occupation numbers of the individual states, but the particles are no longer numbered. This considerably simplifies the computations performed on "$N$-body systems", like interacting bosons or fermions. The introduction of approximations such as the mean field approximation used in the Hartree-Fock method (Complement DXV) will also be facilitated. We have shown the complete equivalence between this approach and the one where we explicitly take into account the effect of permutations between numbered particles. It is important to establish this link for the study of certain physical problems. In spite of the overwhelming efficiency of the creation and annihilation operator formalism, the labeling of particles is sometimes useful or cannot be avoided. This is often the case for numerical computations, dealing with numbers or simple functions that require numbered particles and which, if needed, will be symmetrized (or antisymmetrized) afterwards.

In this chapter, we have only considered creation and annihilation operators with discrete subscripts. This comes from the fact that we have only used discrete bases $\{|u_i\rangle\}$ or $\{|v_s\rangle\}$ for the individual states. Other bases could be used, such as the position eigenstates $|\mathbf r\rangle$ of a spinless particle. The creation and annihilation operators are then labeled by a continuous subscript $\mathbf r$. Fields of operators are thus introduced at each point of space: they are called "field operators" and will be studied in the next chapter.

COMPLEMENTS OF CHAPTER XV, READER'S GUIDE

AXV: PARTICLES AND HOLES
In an ideal gas of fermions, one can define creation and annihilation operators of holes (absence of a particle). Acting on the ground state, these operators allow building excited states. This is an important concept in condensed matter physics. Easy to grasp, this complement can be considered a preliminary to Complement EXV.

BXV: IDEAL GAS IN THERMAL EQUILIBRIUM; QUANTUM DISTRIBUTION FUNCTIONS
Studying the thermal equilibrium of an ideal gas of fermions or bosons, we introduce the distribution functions characterizing the physical properties of a particle or of a pair of particles. These distribution functions will be used in several other complements, in particular GXV and HXV. Bose-Einstein condensation is introduced in the case of bosons. The equation of state is discussed for both types of particles.

CXV to FXV form a series of four complements discussing the behavior of particles interacting through a mean field created by all the others. They are important, since the mean field concept is widely used throughout many domains of physics and chemistry.

CXV: CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION
This complement shows how to use a variational method for studying the ground state of a system of interacting bosons. The system is described by a one-particle wave function in which all the particles of the system accumulate. This wave function obeys the Gross-Pitaevskii equation.

DXV: TIME-DEPENDENT GROSS-PITAEVSKII EQUATION
This complement generalizes the previous one to the case where the Gross-Pitaevskii wave function is time-dependent. This allows us to obtain the excitation spectrum (Bogolubov spectrum), and to discuss metastable flows (superfluidity).

EXV: FERMION SYSTEM, HARTREE-FOCK APPROXIMATION
An ensemble of interacting fermions can be treated by a variational method, the Hartree-Fock approximation, which plays an essential role in atomic, molecular and solid state physics. In this approximation, the interaction of each particle with all the others is replaced by a mean field created by the other particles. The correlations introduced by the interactions are thus ignored, but the fermions' indistinguishability is accurately treated. This allows computing the energy levels of the system to an approximation that is satisfactory in many situations.

FXV: FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION
We often have to study an ensemble of fermions in a time-dependent situation, as for example electrons in a molecule or a solid subjected to an oscillating electric field. The Hartree-Fock mean field method also applies to time-dependent problems. It leads to a set of coupled equations of motion involving a Hartree-Fock mean field potential, very similar to the one encountered for time-independent problems.

GXV and HXV show that the mean field approximation can also be used to study the properties, at thermal equilibrium, of systems of interacting fermions or bosons. The variational method amounts to optimizing the one-particle reduced density operator. It permits generalizing to interacting particles a number of results obtained for an ideal gas (Complement BXV).

GXV: FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM
The trial density operator at non-zero temperature can be optimized using a variational method. This leads to self-consistent Hartree-Fock equations, of the same type as those derived in Complement EXV. We thus obtain an approximate value for the thermodynamic potential.

HXV: APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURES (FERMIONS AND BOSONS)
This complement discusses various applications of the method described in the previous complement: spontaneous magnetism of an ensemble of repulsive fermions, equation of state for bosons and instability in the presence of attractive interactions.



Complement AXV
Particles and holes

1. Ground state of a non-interacting fermion gas
2. New definition for the creation and annihilation operators
3. Vacuum excitations

Creation and annihilation operators are frequently used in solid state physics, where the notions of particle and hole play an important role. A good example is the study of metals or semiconductors, where we talk about an electron-hole pair created by photon absorption. A hole means an absence of a particle, but it has properties similar to a particle, such as a mass, a momentum, an energy; the holes obey the same fermion statistics as the electrons they replace. Using creation or annihilation operators allows a better understanding of the hole concept. We will remain in the simple framework of a free particle gas, but the concepts can be generalized to the case of particles placed in an external potential or in a Hartree-Fock mean potential (Complement EXV).

1. Ground state of a non-interacting fermion gas

Consider a system of non-interacting fermions in their ground state. We assume for simplicity that they are all in the same spin state, and thus introduce no spin index (the generalization to several spin states is fairly simple). As we showed in Complement CXIV, this system in its ground state is described by a state where all the occupation numbers of the individual states having an energy lower than the Fermi energy are equal to 1, while all the other individual states are empty. In momentum space, the only occupied states are the individual states whose wave vector $\mathbf k$ is included in a sphere (called the "Fermi sphere") of radius $k_F$ (the "Fermi radius") given by:
$$\frac{\hbar^2(k_F)^2}{2m}=E_F=\frac{\hbar^2}{2m}\left[6\pi^2\,\frac{N}{L^3}\right]^{2/3} \tag{1}$$
where we have used the notation of formula (7) in Complement CXIV: $E_F$ is the Fermi energy (proportional to the particle density to the power $2/3$), and $L$ the edge length of the cube containing the particles. When the system is in its ground state, all the individual states inside the Fermi sphere are occupied, whereas all the other individual states are empty. Choosing for the individual states the plane wave basis, noted $|u_{\mathbf k_i}\rangle$ to make the wave vector $\mathbf k_i$ explicit, the occupation numbers are:
$$n_{\mathbf k_i}=1\quad\text{if }|\mathbf k_i|\le k_F\,;\qquad n_{\mathbf k_i}=0\quad\text{if }|\mathbf k_i|>k_F \tag{2}$$
(Footnote 1: In Complement CXIV we had assumed that both spin states of the electron gas were occupied, whereas this is not the case here. This explains why the bracket in formula (1) contains the coefficient $6\pi^2$ instead of $3\pi^2$.)

In a macroscopic system, the number of occupied states is very large, of the order of the Avogadro number ($\sim 10^{23}$). The ground state energy is given by:
$$E_0=\sum_{\mathbf k_i}n_{\mathbf k_i}\,e_{k_i} \tag{3}$$
with:
$$e_{k_i}=\frac{\hbar^2(k_i)^2}{2m} \tag{4}$$

The sum over $\mathbf k_i$ in (3) must be understood as a sum over all the $\mathbf k_i$ values that obey the boundary conditions in the box of volume $L^3$, with the restriction that the length of the vector $\mathbf k_i$ be smaller than or equal to $k_F$.

2. New definition for the creation and annihilation operators

We now consider this ground state as a new "vacuum" $|\tilde 0\rangle$ and introduce creation operators that, acting on this vacuum, create excited states for this system. We define:
$$\begin{aligned} b_{\mathbf k_i}^\dagger&=a_{\mathbf k_i}^\dagger\,;\qquad b_{\mathbf k_i}=a_{\mathbf k_i} &&\text{if }|\mathbf k_i|>k_F\\ b_{\mathbf k_i}^\dagger&=a_{\mathbf k_i}\,;\qquad b_{\mathbf k_i}=a_{\mathbf k_i}^\dagger &&\text{if }|\mathbf k_i|\le k_F \end{aligned} \tag{5}$$
Outside the Fermi sphere, the new operators $b_{\mathbf k_i}^\dagger$ and $b_{\mathbf k_i}$ are therefore simply the operators that create (or annihilate) a particle in a momentum state that is not occupied in the ground state. Inside the Fermi sphere, the roles are reversed: the operator $b_{\mathbf k_i}^\dagger$ creates a missing particle, that we shall call a "hole"; the adjoint operator $b_{\mathbf k_i}$ repopulates that level, hence destroying the hole.

It is easy to show that the anticommutation relations for the new operators are:
$$\big[b_{\mathbf k_i},\,b_{\mathbf k_j}^\dagger\big]_+=\delta_{ij} \tag{6}$$
as well as:
$$\big[b_{\mathbf k_i},\,b_{\mathbf k_j}\big]_+=\big[b_{\mathbf k_i}^\dagger,\,b_{\mathbf k_j}^\dagger\big]_+=0 \tag{7}$$
which are the same as for ordinary fermions. Finally, the cross anticommutation relations, between a hole operator ($|\mathbf k_i|\le k_F$) and a particle operator ($|\mathbf k_j|>k_F$), are:
$$\big[b_{\mathbf k_i},\,b_{\mathbf k_j}\big]_+=\big[b_{\mathbf k_i},\,b_{\mathbf k_j}^\dagger\big]_+=\big[b_{\mathbf k_i}^\dagger,\,b_{\mathbf k_j}\big]_+=\big[b_{\mathbf k_i}^\dagger,\,b_{\mathbf k_j}^\dagger\big]_+=0 \tag{8}$$
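These relations can be verified numerically on a small number of modes. The sketch below (an assumption-laden illustration, not the text's derivation) builds fermionic operators on three modes with the standard Jordan-Wigner construction, takes two of the modes as "inside the Fermi sphere", defines the operators of (5), and checks the anticommutation relations (6)-(8).

```python
# Sketch: numerical check of the hole/particle anticommutation relations on 3 modes.
import numpy as np
from functools import reduce

n_modes = 3
I = np.eye(2)
Z = np.diag([1.0, -1.0])                 # phase string, basis ordered (|0>, |1>)
a1 = np.array([[0.0, 1.0],
               [0.0, 0.0]])              # single-mode annihilation: a|1> = |0>

def annihilation(j):
    """Jordan-Wigner annihilation operator for mode j (0-based)."""
    ops = [Z] * j + [a1] + [I] * (n_modes - j - 1)
    return reduce(np.kron, ops)

a = [annihilation(j) for j in range(n_modes)]

# modes 0 and 1 are taken "inside the Fermi sphere", mode 2 "outside" (illustrative choice)
inside = [True, True, False]
b = [a[j].conj().T if inside[j] else a[j] for j in range(n_modes)]   # eq. (5)

def anticomm(x, y):
    return x @ y + y @ x

for j in range(n_modes):
    for k in range(n_modes):
        assert np.allclose(anticomm(b[j], b[k].conj().T), np.eye(8) * (j == k))
        assert np.allclose(anticomm(b[j], b[k]), 0.0)
print("hole/particle operators satisfy the fermionic anticommutation relations")
```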

3. Vacuum excitations

Imagine, for example, that with this new point of view we apply an annihilation operator $b_{\mathbf k_i}$, with $|\mathbf k_i|\le k_F$, to the "new vacuum" $|\tilde 0\rangle$. The result must be zero, since it is impossible to annihilate a non-existent hole. From the old point of view and according to (5), this amounts to applying the creation operator $a_{\mathbf k_i}^\dagger$ to a system in which the individual state $|u_{\mathbf k_i}\rangle$ is already occupied, and the result is indeed zero, as expected. On the other hand, if we apply the creation operator $b_{\mathbf k_i}^\dagger$, with $|\mathbf k_i|\le k_F$, to the new vacuum, the result is not zero: from the old point of view, it removes a particle from an occupied state, and from the new point of view it creates a hole that did not exist before. The two points of view are consistent.

Instead of talking about particles and holes, one can also use a general term, excitations (or "quasi-particles"). The creation operator of an excitation with $|\mathbf k_i|\le k_F$ is the hole creation operator $b_{\mathbf k_i}^\dagger=a_{\mathbf k_i}$; the creation operator of an excitation with $|\mathbf k_i|>k_F$ is the particle creation operator $b_{\mathbf k_i}^\dagger=a_{\mathbf k_i}^\dagger$. The vacuum state $|0\rangle$ defined initially is a common eigenvector of all the particle annihilation operators, with eigenvalue zero; in a similar way, the new vacuum state $|\tilde 0\rangle$ is a common eigenvector of all the excitation annihilation operators. We therefore call it the "quasi-particle vacuum".

As we have neglected all particle interactions, the system Hamiltonian is written as:
$$\hat H=\sum_{\mathbf k_i}e_{k_i}\,a_{\mathbf k_i}^\dagger a_{\mathbf k_i}=\sum_{|\mathbf k_i|>k_F}e_{k_i}\,b_{\mathbf k_i}^\dagger b_{\mathbf k_i}+\sum_{|\mathbf k_i|\le k_F}e_{k_i}\,b_{\mathbf k_i}\,b_{\mathbf k_i}^\dagger \tag{9}$$
Taking into account the anticommutation relations between the operators $b_{\mathbf k_i}$ and $b_{\mathbf k_i}^\dagger$, we can rewrite this expression as:
$$\hat H=E_0+\sum_{|\mathbf k_i|>k_F}e_{k_i}\,b_{\mathbf k_i}^\dagger b_{\mathbf k_i}-\sum_{|\mathbf k_i|\le k_F}e_{k_i}\,b_{\mathbf k_i}^\dagger b_{\mathbf k_i} \tag{10}$$
where $E_0$ has been defined in (3) and simply shifts the origin of all the system energies. Relation (10) shows that holes (excitations with $|\mathbf k_i|\le k_F$) have a negative energy, as expected since they correspond to missing particles. Starting from the ground state, to increase the system energy while keeping the particle number constant, we must apply an operator $b_{\mathbf k_j}^\dagger b_{\mathbf k_i}^\dagger$ (with $|\mathbf k_j|>k_F$ and $|\mathbf k_i|\le k_F$) that creates both a particle and a hole: the system energy is then increased by the quantity $e_{k_j}-e_{k_i}$; conversely, to decrease the system energy, the adjoint operator $b_{\mathbf k_i}\,b_{\mathbf k_j}$ must be applied.

Comments:

(i) We have discussed the notion of hole in the context of free particles, but nothing in the previous discussion requires the one-particle energy spectrum to be simply quadratic as in (4). In semiconductor physics for example, particles often move in a periodic potential, and occupy states in the "valence band" when their energy is lower than the Fermi level, whereas the others occupy the "conduction band", separated from the previous band by an "energy gap". Sending in a photon with an energy larger than this gap allows the creation of an electron-hole pair, easily studied in the formalism we just introduced.

A somewhat similar case occurs when studying the relativistic Dirac wave equation, where two energy continua appear: one with energies greater than the electron rest energy $mc^2$ (where $m$ is the electron mass, and $c$ the speed of light), and one with negative energies less than $-mc^2$, associated with the positron (the antiparticle of the electron, having the opposite charge). The energy spectrum is relativistic, and thus different from formula (4), even inside each of these two continua. The general formalism nevertheless remains valid, the operators $b^\dagger$ and $b$ now describing, respectively, the creation and annihilation of a positron. The Dirac equation however leads to difficulties, by introducing for example an infinity of negative energy states, assumed to be all occupied to avoid problems. A proper treatment of this type of relativistic problem must be done in the framework of quantum field theory.

(ii) An arbitrary $N$-particle Fock state $|\Phi\rangle$ does not have to be the ground state to be formally considered as a "quasi-particle vacuum". We just have to consider any annihilation operator acting on an already occupied individual state as a creation operator of a hole (i.e. of an excitation); we then define the corresponding hole (or excitation) annihilation operators, which all have in common the eigenvector $|\Phi\rangle$ with eigenvalue zero. This comment will be useful when studying the Wick theorem (Complement CXVI). In § E of Chapter XVII, we shall see another example of a quasi-particle vacuum where, this time, the new annihilation operators no longer act on individual states but on states of pairs of particles.



Complement BXV
Ideal gas in thermal equilibrium; quantum distribution functions

1. Grand canonical description of a system without interactions
   1-a. Density operator
   1-b. Grand canonical partition function, grand potential
2. Average values of symmetric one-particle operators
   2-a. Fermion distribution function
   2-b. Boson distribution function
   2-c. Common expression
   2-d. Characteristics of Fermi-Dirac and Bose-Einstein distributions
3. Two-particle operators
   3-a. Fermions
   3-b. Bosons
   3-c. Common expression
4. Total number of particles
   4-a. Fermions
   4-b. Bosons
5. Equation of state, pressure
   5-a. Fermions
   5-b. Bosons

This complement studies the average values of one- or two-particle operators for an ideal gas in thermal equilibrium. It includes a discussion of several useful properties of the Fermi-Dirac and Bose-Einstein distribution functions, already introduced in Chapter XIV. To describe thermal equilibrium, statistical mechanics often uses the grand canonical ensemble, where the particle number may fluctuate, with an average value fixed by the chemical potential $\mu$ (cf. Appendix VI, where a number of concepts useful for reading this complement can be found). This potential plays, with respect to the particle number, a role similar to the one the inverse temperature $\beta=1/(k_BT)$ plays with respect to the energy ($k_B$ is the Boltzmann constant). In quantum statistical mechanics, Fock space is a good choice for the grand canonical ensemble, as it easily allows changing the total number of particles. As a direct application of the results of §§ B and C of Chapter XV, we shall compute the average values of symmetric one- or two-particle operators for a system of identical particles in thermal equilibrium. We begin in § 1 with the density operator for non-interacting particles, and then show in §§ 2 and 3 that the average values of the symmetric operators may be expressed in terms of the Fermi-Dirac and Bose-Einstein distribution functions, increasing their application range and hence their importance. In § 5, we shall study the equation of state of an ideal gas of fermions or bosons at temperature $T$ contained in a volume $V$.

1. Grand canonical description of a system without interactions

We first recall how a system of non-interacting particles is described, in quantum statistical mechanics, by the grand canonical ensemble; more details on this subject can be found in Appendix VI, § 1-c.

1-a. Density operator

Using relations (42) and (43) of Appendix VI, we can write the grand canonical density operator (whose trace has been normalized to 1) as:
$$\hat\rho_{\rm eq}=\frac{1}{Z}\,e^{-\beta(\hat H-\mu\hat N)} \tag{1}$$
where $Z$ is the grand canonical partition function:
$$Z=\mathrm{Tr}\Big\{e^{-\beta(\hat H-\mu\hat N)}\Big\} \tag{2}$$
In these relations, $\beta=1/(k_BT)$ is the inverse of the absolute temperature $T$ multiplied by the Boltzmann constant $k_B$, and $\mu$ the chemical potential (which may be fixed by a large reservoir of particles). The operators $\hat H$ and $\hat N$ are, respectively, the system Hamiltonian and the particle number operator defined by (B-15) in Chapter XV. Assuming the particles do not interact, equation (B-1) of Chapter XV allows writing the system Hamiltonian as a sum of one-particle operators, in each subspace having a total number of particles equal to $N$:
$$\hat H=\sum_{q=1}^{N}\hat h(q) \tag{3}$$
Let us call $\{|u_k\rangle\}$ the basis of the individual states that are the eigenstates of the operator $\hat h$. Noting $a_k^\dagger$ and $a_k$ the creation and annihilation operators of a particle in these states, $\hat H$ may be written as in (B-14):
$$\hat H=\sum_k e_k\,a_k^\dagger a_k=\sum_k e_k\,\hat n_k \tag{4}$$
where the $e_k$ are the eigenvalues of $\hat h$. The operator (1) can then also be written as:
$$\hat\rho_{\rm eq}=\frac{1}{Z}\,\prod_k e^{-\beta(e_k-\mu)\,\hat n_k} \tag{5}$$
We shall now compute the average values of all the one- or two-particle operators for a system described by the density operator (1).
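Because (5) factorizes mode by mode, every thermal average reduces to independent single-mode calculations. The following sketch (illustrative parameters, not from the text) checks this on a single fermionic mode, reproducing the single-mode factor of the partition function and the Fermi-Dirac occupation obtained below in § 2-a.

```python
# Sketch: single fermionic mode in the grand canonical ensemble, represented by 2x2 matrices.
import numpy as np

beta, mu, e = 2.0, 0.3, 1.0
n_op = np.diag([0.0, 1.0])                       # occupation number of the mode
rho_unnorm = np.diag(np.exp(-beta * (e - mu) * np.diag(n_op)))
Z = np.trace(rho_unnorm)                         # = 1 + exp[-beta(e - mu)]
n_avg = np.trace(n_op @ rho_unnorm) / Z

print(Z, 1 + np.exp(-beta * (e - mu)))
print(n_avg, 1 / (np.exp(beta * (e - mu)) + 1))  # matches the Fermi-Dirac function
```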

1-b. Grand canonical partition function, grand potential

In statistical mechanics, the "grand potential" $\Phi$ associated with the grand canonical equilibrium is defined as the (natural) logarithm of the partition function, multiplied by $-k_BT$ (cf. Appendix VI, § 1-c-β):
$$\Phi=-k_BT\,\ln Z \tag{6}$$
where $Z$ is given by (2). The trace appearing in that equation is easily computed in the basis of the Fock states built from the individual states $|u_k\rangle$, as we now show. The trace of a tensor product of operators (Chapter II, § F-2-b) is simply the product of the traces of each operator. The Fock space has the structure of a tensor product of the spaces associated with each of the $|u_k\rangle$ (each being spanned by the kets having a population $n_k$ ranging from zero to infinity - see comment (i) of § A-1-c in Chapter XV); we must thus compute a product of traces in each of these spaces. For a fixed $k$, we sum the diagonal elements over all the values of $n_k$, then take the product over all $k$'s, which leads to:
$$Z=\prod_k\;\sum_{n_k}\exp\big[-\beta\,n_k\,(e_k-\mu)\big] \tag{7}$$

α. Fermions

For fermions, as $n_k$ can only take the values 0 or 1 (two identical fermions never occupy the same individual state), we get:
$$Z_{\rm fermions}=\prod_k\Big[1+e^{-\beta(e_k-\mu)}\Big] \tag{8}$$
and:
$$\Phi_{\rm fermions}=-k_BT\,\sum_k\ln\Big[1+e^{-\beta(e_k-\mu)}\Big] \tag{9}$$
The index $k$ must be summed over all the individual states. In case these states are also labeled by orbital and spin subscripts, the latter must also be included in the summation. Let us consider, for example, particles having a spin $S$ and contained in a box of volume $V$ with periodic boundary conditions. The individual stationary states may be written $|\mathbf k,\nu\rangle$, where $\mathbf k$ obeys the periodic boundary conditions (Complement CXIV, § 1-c) and the spin subscript $\nu$ takes $(2S+1)$ values. Assuming the particles to be free in the box (no spin Hamiltonian), each value of $\nu$ yields the same contribution to $\Phi_{\rm fermions}$; in the large volume limit, expression (9) then becomes:
$$\Phi_{\rm fermions}=-(2S+1)\,k_BT\,\frac{V}{(2\pi)^3}\int\!\mathrm d^3k\;\ln\Big[1+e^{-\beta(e_k-\mu)}\Big] \tag{10}$$

β. Bosons

For bosons, the summation over $n_k$ in (7) goes from $n_k=0$ to infinity, which introduces a geometric series whose sum is readily computed. We therefore get:
$$Z_{\rm bosons}=\prod_k\frac{1}{1-e^{-\beta(e_k-\mu)}} \tag{11}$$
which leads to:
$$\Phi_{\rm bosons}=k_BT\,\sum_k\ln\Big[1-e^{-\beta(e_k-\mu)}\Big] \tag{12}$$
For a system of free particles with spin $S$, confined in a box with periodic boundary conditions, we obtain in the large volume limit:
$$\Phi_{\rm bosons}=(2S+1)\,k_BT\,\frac{V}{(2\pi)^3}\int\!\mathrm d^3k\;\ln\Big[1-e^{-\beta(e_k-\mu)}\Big] \tag{13}$$
In a general way, for fermions as well as bosons, the grand potential directly yields the pressure $P$, as shown in relation (61) of Appendix VI:
$$\Phi=-P\,V \tag{14}$$
Using the proper derivatives with respect to the equilibrium parameters (temperature, chemical potential, volume), it also yields the other thermodynamic quantities such as the energy, the specific heats, etc.
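For a large box, the grand potential integral (10) is easy to evaluate numerically. The sketch below does this for spinless fermions with a simple radial quadrature; the reduced units and the chosen temperature, chemical potential and volume are illustrative assumptions.

```python
# Sketch: grand potential (10) and pressure (14) for an ideal spinless Fermi gas (2S+1 -> 1).
import numpy as np

hbar, m, kB = 1.0, 1.0, 1.0        # reduced units (illustrative)
T, mu, V = 1.0, 0.5, 100.0
beta = 1.0 / (kB * T)

k = np.linspace(1e-6, 20.0, 20001)                  # radial grid; cutoff chosen large enough
dk = k[1] - k[0]
eps = hbar**2 * k**2 / (2 * m)
integrand = k**2 * np.log1p(np.exp(-beta * (eps - mu)))
Phi = -kB * T * V / (2 * np.pi)**3 * 4 * np.pi * np.sum(integrand) * dk

print("grand potential Phi =", Phi, "   pressure P = -Phi/V =", -Phi / V)
```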

2. Average values of symmetric one-particle operators

Symmetric quantum operators for one, and then for two, particles were introduced in a general way in Chapter XV (§§ B and C). The general expression for a one-particle operator is given by equation (B-12) of that chapter. We can thus write:
$$\langle\hat F\rangle=\sum_{k,l}\langle u_k|\hat f|u_l\rangle\;\big\langle a_k^\dagger a_l\big\rangle \tag{15}$$
with, when the state of the system is given by the density operator (1):
$$\big\langle a_k^\dagger a_l\big\rangle=\mathrm{Tr}\big\{\hat\rho_{\rm eq}\,a_k^\dagger a_l\big\}=\frac{1}{Z}\,\mathrm{Tr}\Big\{e^{-\beta(\hat H-\mu\hat N)}\,a_k^\dagger a_l\Big\} \tag{16}$$
This trace can be computed in the Fock state basis $\{|n_1,\ldots,n_k,\ldots,n_l,\ldots\rangle\}$ associated with the eigenstate basis $\{|u_k\rangle\}$ of $\hat h$. If $k\neq l$, the operator $a_k^\dagger a_l$ destroys a particle in the individual state $|u_l\rangle$ and creates another one in the different state $|u_k\rangle$; it therefore transforms the Fock state $|n_1,\ldots,n_k,\ldots,n_l,\ldots\rangle$ into a different, hence orthogonal, Fock state $|n_1,\ldots,n_k+1,\ldots,n_l-1,\ldots\rangle$. The exponential operator then acts on this ket, merely multiplying it by a constant. Consequently, if $k\neq l$, all the diagonal elements of the operator whose trace is taken in (16) are zero, and the trace vanishes. If $k=l$, the average value may be computed as for the partition function, since the Fock space has the structure of a tensor product of the individual states' spaces: the trace is the product of the contribution of the mode $k$ and of the contributions of all the other modes. We can thus write, in a general way:
$$\big\langle a_k^\dagger a_l\big\rangle=\frac{\delta_{kl}}{Z}\;\Big[\sum_{n_k}n_k\,e^{-\beta\,n_k(e_k-\mu)}\Big]\;\prod_{k'\neq k}\Big[\sum_{n_{k'}}e^{-\beta\,n_{k'}(e_{k'}-\mu)}\Big] \tag{17}$$
For $k=l$, this expression yields the average particle number $\langle\hat n_k\rangle$ in the individual state $|u_k\rangle$.

2-a. Fermion distribution function

As the occupation number $n_k$ only takes the values 0 and 1, the first bracket in expression (17) is equal to $e^{-\beta(e_k-\mu)}$; as for the contribution of the other modes ($k'\neq k$) in the second bracket, it has already been computed when we determined the partition function. We therefore obtain:
$$\big\langle\hat n_k\big\rangle=\frac{1}{Z}\;e^{-\beta(e_k-\mu)}\,\prod_{k'\neq k}\Big[1+e^{-\beta(e_{k'}-\mu)}\Big] \tag{18}$$
Multiplying both the numerator and the denominator by $1+e^{-\beta(e_k-\mu)}$ allows reconstructing the full partition function in the numerator and, after simplification by $Z$, we get:
$$\big\langle\hat n_k\big\rangle=\frac{e^{-\beta(e_k-\mu)}}{1+e^{-\beta(e_k-\mu)}}=f_\beta(e_k-\mu) \tag{19}$$
We find again the Fermi-Dirac distribution function $f_\beta$ (§ 1-b of Complement CXIV):
$$f_\beta(e-\mu)=\frac{1}{e^{\beta(e-\mu)}+1} \tag{20}$$
This distribution function gives the average population of each individual state of energy $e$; its value is always less than 1, as expected for fermions. The average value at thermal equilibrium of any one-particle operator is now readily computed by using (19) in relation (15).

2-b. Boson distribution function

The contribution of the mode $k$ (first bracket of (17)) can now be expressed as:
$$\sum_{n_k=0}^{\infty}n_k\,e^{-\beta n_k(e_k-\mu)}=-\frac{1}{\beta}\,\frac{\partial}{\partial e_k}\sum_{n_k=0}^{\infty}e^{-\beta n_k(e_k-\mu)}=-\frac{1}{\beta}\,\frac{\partial}{\partial e_k}\;\frac{1}{1-e^{-\beta(e_k-\mu)}} \tag{21}$$
We then get:
$$\big\langle\hat n_k\big\rangle=\frac{1}{Z}\;\frac{e^{-\beta(e_k-\mu)}}{\big[1-e^{-\beta(e_k-\mu)}\big]^2}\;\prod_{k'\neq k}\frac{1}{1-e^{-\beta(e_{k'}-\mu)}} \tag{22}$$
which, using (11), amounts to:
$$\big\langle\hat n_k\big\rangle=\frac{e^{-\beta(e_k-\mu)}}{1-e^{-\beta(e_k-\mu)}}=n_\beta(e_k-\mu) \tag{23}$$
where the Bose-Einstein distribution function $n_\beta$ is defined as:
$$n_\beta(e-\mu)=\frac{1}{e^{\beta(e-\mu)}-1} \tag{24}$$

This distribution function gives the average population of the individual state of energy $e$. The only constraint on this population, for bosons, is to be positive. The chemical potential is always less than the lowest individual energy; in case this energy is zero, $\mu$ must always be negative, which avoids any divergence of the function $n_\beta$. Hence for bosons, the average value of any one-particle operator is obtained by inserting (23) into relation (15).

2-c. Common expression

We define the function $f_\eta^\beta$ as equal to either the function $f_\beta$ for fermions, or the function $n_\beta$ for bosons. We can write for both cases:
$$f_\eta^\beta(e-\mu)=\frac{1}{e^{\beta(e-\mu)}-\eta} \tag{25}$$
where the number $\eta$ is defined as:
$$\eta=\begin{cases}-1&\text{for fermions}\\ +1&\text{for bosons}\end{cases} \tag{26}$$

2-d. Characteristics of the Fermi-Dirac and Bose-Einstein distributions

We already gave in Complement CXIV (Figure 3) the form of the Fermi-Dirac distribution. Figure 1 shows the variations of both this distribution and the Bose-Einstein distribution. For the sake of comparison, it also includes the variations of the classical Boltzmann distribution:
$$f_{\rm Boltzmann}^\beta(e-\mu)=e^{-\beta(e-\mu)} \tag{27}$$
which takes on values intermediate between the two quantum distributions. For a non-interacting gas contained in a box with periodic boundary conditions, the lowest possible energy is zero and all the others are positive. The exponential $e^{\beta(e-\mu)}$ is therefore always greater than $e^{-\beta\mu}$. We are now going to distinguish several cases, starting with the most negative values of the chemical potential.

(i) For a negative value of $\mu$ whose modulus is large compared to $k_BT$ (i.e. for $-\beta\mu\gg 1$, which corresponds to the right-hand side of the figure), the exponential in the denominator of (25) is always much larger than 1 (whatever the energy $e$), and the distribution reduces to the classical Boltzmann distribution (27). Bosons and fermions then have practically the same distribution; the gas is said to be "non-degenerate".

(ii) For a fermion system, the chemical potential has no upper bound, but the population of an individual state can never exceed 1. If $\mu$ is positive, with $\beta\mu\gg 1$:
- for low values of the energy, the term 1 in the denominator is much larger than the exponential; the population of each individual state is then almost equal to 1, its maximum value;
- if the energy increases to values of the order of $\mu$, the population decreases, and when $e-\mu\gg k_BT$ it becomes practically equal to the value predicted by the Boltzmann exponential (27). Most of the particles occupy, however, the individual states whose energy is less than or comparable to $\mu$, and whose population is close to 1. The fermion system is then said to be "degenerate".

Figure 1: Quantum distribution functions of Fermi-Dirac (for fermions, lower curve) and of Bose-Einstein (for bosons, upper curve) as a function of the dimensionless variable $\beta(e-\mu)$; the dashed intermediate curve represents the classical Boltzmann distribution $e^{-\beta(e-\mu)}$. In the right-hand side of the figure, corresponding to large negative values of $\mu$, the particle number is small (low density region) and the two quantum distributions practically join the Boltzmann distribution. The system is then said to be non-degenerate, or classical. As $\mu$ increases, we reach the central and left-hand sides of the figure, and the distributions become more and more different, reflecting the increasing gas degeneracy. For bosons, $\mu$ cannot be larger than the one-particle ground state energy, assumed here to be zero; the divergence observed as $\beta(e-\mu)\to 0$ corresponds to Bose-Einstein condensation. For fermions, the chemical potential can increase without limit, and for all the energy values the distribution function tends towards 1 (without ever exceeding 1, due to the Pauli exclusion principle).

(iii) For a boson system, the chemical potential cannot be larger than the lowest individual energy value, which we assumed to be zero. As $\mu$ tends towards zero through negative values and $e-\mu\to 0$, the denominator of the distribution function becomes very small, leading to very large populations of the corresponding states. The boson gas is then said to be "degenerate". On the other hand, for energies of the order of or larger than $k_BT$, and as was the case for fermions, the boson distribution becomes practically equal to the Boltzmann distribution.

(iv) Finally, for situations intermediate between the extreme cases described above, the gas is said to be "partially degenerate".
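The three distributions (20), (24) and (27) are plain functions of the single variable $\beta(e-\mu)$, and the regimes discussed above are easy to check numerically; the sketch below is a direct transcription (the sample values of the argument are arbitrary).

```python
# Sketch: Fermi-Dirac, Bose-Einstein and Boltzmann distributions of § 2 as functions of x = beta*(e - mu).
import numpy as np

def fermi_dirac(x):
    return 1.0 / (np.exp(x) + 1.0)

def bose_einstein(x):      # requires x > 0, i.e. e > mu
    return 1.0 / (np.exp(x) - 1.0)

def boltzmann(x):
    return np.exp(-x)

x = np.array([0.1, 1.0, 3.0, 6.0])
print(fermi_dirac(x))
print(bose_einstein(x))
print(boltzmann(x))
# for x >> 1 the three curves merge (non-degenerate regime); for x -> 0+ the Bose-Einstein
# population diverges, which signals Bose-Einstein condensation when mu -> 0
```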

3. Two-particle operators

For a two-particle symmetric operator $\hat G$ we must use formula (C-16) of Chapter XV, which yields:
$$\langle\hat G\rangle=\frac{1}{2}\sum_{k,l,m,n}\langle 1:u_k;\,2:u_m|\,\hat g(1,2)\,|1:u_l;\,2:u_n\rangle\;\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle \tag{28}$$
with:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle=\frac{1}{Z}\,\mathrm{Tr}\Big\{e^{-\beta(\hat H-\mu\hat N)}\;a_k^\dagger\,a_m^\dagger\,a_n\,a_l\Big\} \tag{29}$$

As the exponential operator in the trace is diagonal in the Fock basis, this trace is non-zero on the double condition that the states $|u_k\rangle$ and $|u_m\rangle$ associated with the creation operators be exactly the same as the states $|u_n\rangle$ and $|u_l\rangle$ associated with the annihilation operators, whatever the order. In other words, to get a non-zero trace we must have either $k=l$ and $m=n$, or $k=n$ and $m=l$, or both.

3-a. Fermions

As two fermions cannot occupy the same quantum state, the product $a_k^\dagger a_m^\dagger$ is zero if $k=m$; we therefore assume $k\neq m$, which allows, using for $\hat\rho_{\rm eq}$ expression (5) (which is a product over modes), to perform independent calculations for the different modes. The case $k=l$ and $m=n$ yields, using the anticommutation relations:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_m\,a_k\big\rangle=\big\langle\hat n_k\,\hat n_m\big\rangle \tag{30}$$
and the case $k=n$ and $m=l$ yields:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_k\,a_m\big\rangle=-\big\langle\hat n_k\,\hat n_m\big\rangle \tag{31}$$
We begin with term (30). As $k$ and $m$ are different, the operators $\hat n_k$ and $\hat n_m$ act on different modes, which belong to different factors in the density operator (5). The average value of the product is thus simply the product of the average values:
$$\big\langle\hat n_k\,\hat n_m\big\rangle=\big\langle\hat n_k\big\rangle\,\big\langle\hat n_m\big\rangle \tag{32}$$
$$\big\langle\hat n_k\,\hat n_m\big\rangle=f_\beta(e_k-\mu)\,f_\beta(e_m-\mu) \tag{33}$$
As for the second term (31), it is just the opposite of the first one. Consequently, we finally get:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle=\big[\delta_{kl}\,\delta_{mn}-\delta_{kn}\,\delta_{ml}\big]\;f_\beta(e_k-\mu)\,f_\beta(e_m-\mu) \tag{34}$$
The first term on the right-hand side is called the direct term; the second one is the exchange term, and it carries a minus sign, as expected for fermions.

3-b. Bosons

For bosons, the creation and annihilation operators relating to different modes commute with each other.

α. Average value calculation

If $k\neq m$, a calculation similar to the one we just did yields:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle=\big[\delta_{kl}\,\delta_{mn}+\delta_{kn}\,\delta_{ml}\big]\;n_\beta(e_k-\mu)\,n_\beta(e_m-\mu) \tag{35}$$



which differs in two ways from (34): the result now involves the Bose-Einstein distribution, and the exchange term is positive. If $k=m$, only one individual state comes into play and a new calculation is needed, which we now perform. Using for $\hat\rho_{\rm eq}$ expression (5), we get, after summing a geometric series as in (11):
$$\big\langle a_k^\dagger\,a_k^\dagger\,a_k\,a_k\big\rangle=\big\langle\hat n_k(\hat n_k-1)\big\rangle=\frac{1}{Z}\,\bigg[\sum_{n_k=0}^{\infty}n_k(n_k-1)\,e^{-\beta n_k(e_k-\mu)}\bigg]\;\prod_{k'\neq k}\frac{1}{1-e^{-\beta(e_{k'}-\mu)}} \tag{36}$$
The sum appearing in this equation can be written as:
$$\sum_{n_k=0}^{\infty}n_k(n_k-1)\,e^{-\beta n_k(e_k-\mu)}=\left[\frac{1}{\beta^2}\,\frac{\partial^2}{\partial e_k^2}+\frac{1}{\beta}\,\frac{\partial}{\partial e_k}\right]\frac{1}{1-e^{-\beta(e_k-\mu)}} \tag{37}$$
The first order derivative term yields:
$$\frac{1}{\beta}\,\frac{\partial}{\partial e_k}\;\frac{1}{1-e^{-\beta(e_k-\mu)}}=-\,\frac{e^{-\beta(e_k-\mu)}}{\big[1-e^{-\beta(e_k-\mu)}\big]^2} \tag{38}$$
and the second order derivative term is:
$$\frac{1}{\beta^2}\,\frac{\partial^2}{\partial e_k^2}\;\frac{1}{1-e^{-\beta(e_k-\mu)}}=\frac{e^{-\beta(e_k-\mu)}}{\big[1-e^{-\beta(e_k-\mu)}\big]^2}+\frac{2\,e^{-2\beta(e_k-\mu)}}{\big[1-e^{-\beta(e_k-\mu)}\big]^3} \tag{39}$$
Summing these two terms yields:
$$\frac{2\,e^{-2\beta(e_k-\mu)}}{\big[1-e^{-\beta(e_k-\mu)}\big]^3} \tag{40}$$
Multiplying this result by $\big[1-e^{-\beta(e_k-\mu)}\big]$ and by the product over $k'\neq k$ at the end of the right-hand side of (36) reconstructs the partition function, which cancels out the first factor $1/Z$. We are then left with:
$$\big\langle a_k^\dagger\,a_k^\dagger\,a_k\,a_k\big\rangle=2\,\big[n_\beta(e_k-\mu)\big]^2 \tag{41}$$
This result proves that (35) remains valid even in the case $k=m$.

β. Physical discussion: occupation number fluctuations

For two different physical states $|u_k\rangle$ and $|u_m\rangle$, the average value $\langle\hat n_k\,\hat n_m\rangle$ for an ideal gas is simply equal to the product of the average values $\langle\hat n_k\rangle=n_\beta(e_k-\mu)$ and $\langle\hat n_m\rangle=n_\beta(e_m-\mu)$; this is a consequence of the total absence of interaction between the particles. The same is true for the average value of any product of operators relating to two different modes.

Now if $k=m$, we note the factor 2 in relation (41). As we now show, this factor leads to the presence of strong fluctuations of the operator $\hat n_k$, the particle number in the state $|u_k\rangle$. The calculation shows that:
$$\big\langle(\hat n_k)^2\big\rangle=\big\langle\hat n_k(\hat n_k-1)\big\rangle+\big\langle\hat n_k\big\rangle=2\,\big[n_\beta(e_k-\mu)\big]^2+n_\beta(e_k-\mu) \tag{42a}$$
The square of the root mean square deviation $\Delta n_k$ is therefore given by:
$$(\Delta n_k)^2=\big\langle(\hat n_k)^2\big\rangle-\big\langle\hat n_k\big\rangle^2=\big[n_\beta(e_k-\mu)\big]^2+n_\beta(e_k-\mu) \tag{42b}$$
The fluctuations of this operator are therefore larger than its average value, which implies that the population of each state is necessarily poorly defined (in the sense of footnote 1) at thermal equilibrium. This is particularly true for large $\langle\hat n_k\rangle$: in an ideal boson gas, a largely populated individual state is associated with a very large population fluctuation. This is due to the shape of the Bose-Einstein distribution (24), a decreasing exponential which is maximal at the origin: the most probable occupation number is always $n_k=0$. Hence it is impossible to obtain a very large average $\langle\hat n_k\rangle$ without introducing a distribution spreading over many values of $n_k$. Complement HXV (§ 4-a) discusses certain consequences of these fluctuations for an ideal gas. It also shows that as soon as a weak repulsive interaction between the particles is introduced, the fluctuations greatly diminish and almost completely disappear, since their presence would lead to a very large increase of the potential energy.
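Relation (42a) can be checked directly: for a single boson mode at thermal equilibrium, the occupation distribution is geometric. The sketch below (arbitrary value of $\beta(e-\mu)$, and a truncation chosen large enough) verifies $\langle n^2\rangle = 2\langle n\rangle^2 + \langle n\rangle$ by explicit summation.

```python
# Sketch: occupation fluctuations of a single thermal boson mode, eq. (42a).
import numpy as np

x = 0.05                                     # beta*(e - mu) > 0, illustrative value
w = np.exp(-x)
n = np.arange(0, 20000)                      # truncation, large enough for convergence
p = (1 - w) * w**n                           # normalized geometric distribution of n
n_avg = np.sum(n * p)
n2_avg = np.sum(n**2 * p)

print(n_avg, 1 / (np.exp(x) - 1))            # Bose-Einstein average population (24)
print(n2_avg, 2 * n_avg**2 + n_avg)          # fluctuations comparable to <n> itself
```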

3-c. Common expression

To summarize, we can write in all cases:
$$\big\langle a_k^\dagger\,a_m^\dagger\,a_n\,a_l\big\rangle=\big[\delta_{kl}\,\delta_{mn}+\eta\,\delta_{kn}\,\delta_{ml}\big]\;f_\eta^\beta(e_k-\mu)\,f_\eta^\beta(e_m-\mu) \tag{43}$$
with:
$$\eta=-1,\;\;f_\eta^\beta=f_\beta\;\;\text{for fermions}\,;\qquad\eta=+1,\;\;f_\eta^\beta=n_\beta\;\;\text{for bosons} \tag{44}$$
As shown in relation (C-19) of Chapter XV, this average value is simply the matrix element $\langle 1:u_l;\,2:u_n|\,\hat\rho_2\,|1:u_k;\,2:u_m\rangle$ of the two-particle reduced density operator. To get the general expression for the average of any symmetric two-particle operator, we simply insert (43) into (28). Consequently, for independent particles, the average values of all these operators are simply expressed in terms of the quantum Fermi-Dirac and Bose-Einstein distribution functions. Complement CXVI will show how the Wick theorem allows generalizing these results to operators involving any number of particles.

(Footnote 1: A physical observable is said to have a well defined value in a given quantum state if, in this state, its root mean square deviation is small compared to its average value.)

4. Total number of particles

The operator corresponding to the total number of particles is the sum over all the individual states:
$$\hat N=\sum_k\hat n_k \tag{45}$$
and its average value is given by:
$$\bar N=\mathrm{Tr}\big\{\hat\rho_{\rm eq}\,\hat N\big\}=\sum_k f_\eta^\beta(e_k-\mu) \tag{46}$$
As $f_\eta^\beta$ increases as a function of $\mu$, the total number of particles is controlled (for fixed $\beta$) by the chemical potential.

4-a. Fermions

For the sake of simplicity, we study the ideal gas properties without taking the spin into account, which amounts to assuming that all the particles are in the same spin state (the spin can easily be accounted for by adding the contributions of the different individual spin states). For a large physical system, the energy levels are very close together and the discrete sum in (46) can be replaced by an integral. This leads to:
$$\bar N=N_{\rm id}^{\rm f}(\beta,\mu) \tag{47}$$
where the function $N_{\rm id}^{\rm f}(\beta,\mu)$ is defined as (the subscript "id" stands for ideal gas):
$$N_{\rm id}^{\rm f}(\beta,\mu)=\frac{V}{(2\pi)^3}\int\!\mathrm d^3k\;\frac{1}{e^{\beta(e_k-\mu)}+1} \tag{48}$$
Figure 2 shows the variations of the function $N_{\rm id}^{\rm f}(\beta,\mu)$ as a function of $\mu$, for fixed values of $\beta$ and of the volume $V$. To deal with dimensionless quantities, one often introduces the "thermal wavelength" $\lambda_T$ as:
$$\lambda_T=\hbar\,\sqrt{\frac{2\pi\beta}{m}}=\hbar\,\sqrt{\frac{2\pi}{m\,k_BT}} \tag{49}$$
We can then use in the integral of (48) the dimensionless variable:
$$\boldsymbol\kappa=\frac{\lambda_T}{\sqrt{2\pi}}\,\mathbf k\qquad\text{so that}\qquad\beta\,e_k=\frac{\kappa^2}{2} \tag{50}$$
and write:
$$N_{\rm id}^{\rm f}(\beta,\mu)=\frac{V}{(\lambda_T)^3}\;I_{3/2}(\beta\mu) \tag{51}$$

COMPLEMENT BXV



0

Figure 2: Variations of the particle number ( ) for an ideal fermion gas, as a function of the chemical potential , and for different fixed temperatures ( = 1 ( )). For = 0 (lower dashed line curve), the particle number is zero for negative values of , and proportional to 3 2 for positive values of . For a non-zero temperature = 1 (thick line curve), the curve is above the previous one, and never goes to zero. Also shown are the curves obtained for temperatures twice ( = 2 1 ) and three times ( = 3 1 ) as large. The units chosen for the axes are the thermal energy ( 1 )3 , where 1 associated with the thick line curve, and the particle number 1 = is the thermal wavelength at temperature 1 . 1 Largely negative values of correspond to the classical region where the fermion gas is not degenerate; the classical ideal gas equations are then valid to a good approximation. In the region where , the gas is largely degenerate and a Fermi sphere shows up clearly in the momentum space; the total number of particles has only a slight temperature dependence and varies approximately as 3 2 . This figure was kindly contributed by Geneviève Tastevin. with2 : 3 2(

)=

3 2

d3

1 2

+1

=

2

d

(52)

+1

0

where, in the second equality, we made the change of variable: =

2

Note that the value of

(53) 3 2

only depends on a dimensionless variable, the product

If the particles have a spin 1 2, both contributions

+

and

.

from the

two spin states must be added to (46); in the absence of an external magnetic field, the 2 The subscript 3 2 refers to the subscript used for more general functions ( ), often called the +1 Fermi functions in physics. They are defined by ( )= ( 1) , where is the “fugacity” =1

= . Expanding in terms of the function 1 1 + 1 properties of the Euler Gamma function, it can be shown that

1636

= 3 2(

)=

1+ ).

3 2(

and using the



IDEAL GAS IN THERMAL EQUILIBRIUM; QUANTUM DISTRIBUTION FUNCTIONS

individual particle energies do not depend on their spin direction, and the total particle number is simply doubled: = 4-b.

+

+

=2

(

)

(54)

Bosons

For the sake of simplicity, we shall also start with spinless particles, but including several spin states is fairly straightforward. For bosons, we must use the Bose-Einstein distribution (24) and their average number is therefore: =

(

1

)=

(

)

(55)

1

We impose periodic boundary conditions in a cubic box of edge length . The lowest individual energy3 is = 0. Consequently, for expression (55) to be meaningful, must be negative or zero: 0

(56)

Two cases are possible, depending on whether the boson system is condensed or not. .

Non-condensed bosons

When the parameter takes on a sufficiently negative value (much lower than the opposite of the individual energy 1 of the first excited level), the function in the summation (55) is sufficiently regular for the discrete summation to be replaced by an integral (in the limit of large volumes). The average particle number is then written as: =

(

)

(57)

with: (

)=

3

(2 )

1

d3

(

)

(58)

1

Performing the same change of variables as above, this expression becomes: (

)=

(

3

)

3 2(

)

(59)

with4 : 3 2(

)=

3 2

d3

1 2

1

=

2

d 0

1

(60)

3 Defining other boundary conditions on the box walls will lead in general to a non-zero ground state energy; choosing that value as the common origin for the energies and the chemical potential will leave the following computations unchanged. 4 The subscript 3 2 refers to the subscript used for the functions ( ), often called, in physics, the Bose functions (or the polylogarithmic functions). They are defined by the series ( )= . =1 3 2. The exact value of the number defined in (61) is thus given by the series =1

1637

COMPLEMENT BXV



The variations of ( ) as a function of total particle number tends towards a limit negative values, where is the number: =

3 2 (0)

= 2 612

(61)

As the function increases with (

are shown in Figure 3. Note that the 3 as tends towards zero through

)

(

, we can write: (62)

3

)

There exists an insurmountable upper limit for the total particle number of a noncondensed ideal Bose gas.

Figure 3: Variations of the total particle number ( ) in a non-condensed ideal Bose gas, as a function of and for fixed = 1 ( ). The chemical potential is always negative, and the figure shows curves corresponding to several temperatures = 1 (thick line), = 2 1 and = 3 1 . Units on the axes are the same as in Figure 2: the thermal energy = 1 , and the particle number 1 = ( 1 )3 , where 1 associated with curve 1 . As the chemical potential 1 is the thermal wavelength for this same temperature tends towards zero, the particle numbers tend towards a finite value. For = 1 , this value is equal to is given by (61). 1 (shown as a dot on the vertical axis), where This figure was kindly contributed by Geneviève Tastevin

.

Condensed bosons As 0

1638

gets closer to zero, the population (

0

)=

1

1 1

0

0

of the ground state becomes: (63)



IDEAL GAS IN THERMAL EQUILIBRIUM; QUANTUM DISTRIBUTION FUNCTIONS

This population diverges in the limit = 0 and, when gets small enough, it can become arbitrarily large. It can, for example, become proportional5 to the volume , in which case it adds a finite contribution 0 to the particle numerical density (particle number per unit volume) as . This particularity is limited to the ground state, which, in this case, plays a very different role than the other levels. Let us show, for example, that the first excited state population does not yield a similar effect. Assuming the system to be contained in a cubic 2 2 2 box6 of edge length , the population of the first excited energy level 1 } (2 ) can be written as: (

1

)=

1 (

)

1

2

1 1

0

(64) 1

(we assume the box to be large enough so that , which means 1 1); this population can therefore be proportional only to the square of , i.e. to the volume to the power 2 3. It shows that this first excited level cannot make a contribution to the particle density in the limit ; the same is true for all the other excited levels whose contributions are even smaller. The only arbitrary contribution to the density comes from the ground state. This arbitrarily large value as 0 obviously does not appear in relation (59), which predicts that the density ( ) is always less than a finite value as shown by (62). This is not surprising: as the population varies radically from the first energy level to the next, we can no longer compute the average particle number by replacing in (55) the discrete summation by an integral and a more precise calculation is necessary. Actually, only the ground state population must be treated separately, and the summation over all the excited states (of which none contributes to the density divergence) can still be replaced by an integral as before. Consequently, to get the total population of the physical system we simply add the integral on the right-hand side of (57) to the contribution 0 of the ground level: =

(

0) +

(65)

0

where

0 is defined in (63). As 0, the total population of all the excited levels (others than the ground level) remains practically constant and equal to its upper limit (62); only the ground state has a continuously increasing population 0 , which becomes comparable to the total population of all the excited states when the right-hand sides of (63) and (62) are of the same order of magnitude: 3

&

0

&

3

(66)

( being of course always negative). When this condition is satisfied, a significant fraction of the particles accumulates in the individual ground level, which is said to have a 5 The

limit where while the density remains constant is often called the “thermodynamic limit”. 6 As above, we assume periodic conditions on the box walls. Another choice would be to impose zero values for the wave functions on the walls: the numerical coefficients of the individual energies would be changed, but not the line of reasoning.

1639



COMPLEMENT BXV

“macroscopic population” (proportional to the volume). We can even encounter situations where the majority of the particles all occupy the same quantum state. This phenomenon is called “Bose-Einstein condensation” (it was predicted by Einstein in 1935, following Bose’s studies of quantum statistics applicable to photons). It occurs when the total density reaches the maximum predicted by formula (62), that is: 2 612

=

3

(67)

3

This condition means that the average distance between particles is of the order of the thermal wavelength . Initially, Bose-Einstein condensation was considered to be a mathematical curiosity rather than an important physical phenomenon. Later on, people realized that it played an important role in superfluid liquid Helium 4, although this was a system with constantly interacting particles, hence far from an ideal gas. For a dilute gas, BoseEinstein condensation was observed for the first time in 1995, and in a great number of later experiments. 5.

Equation of state, pressure

The “equation of state” of a fluid at thermal equilibrium is the relation that links, for a given particle number , its pressure , volume , and temperature = 1 . We have just studied the variations of the total particle number. We shall now examine the pressure of a fermion or boson ideal gas. 5-a.

Fermions

The grand canonical potential of a fermion ideal gas is given by (9). Equation (14) indicates that, for a system at thermal equilibrium, this grand potential is equal to the opposite of the product of the volume and the pressure . We thus have: = =

ln 1 +

(

)

d3 ln 1 +

3

(2 )

(

)

(68)

(where the second equality is valid in the limit of large volumes). Simplifying by = 3 , we get the pressure of a fermion system contained in a box of macroscopic dimension: = (2 )

3

d3 ln 1 +

1

=

3

5 2

(

(

)

)

(69)

with: 5 2

(

3 2

)= =

2

d 0

1640

d3 ln 1 + ln 1 +

2

(70)



IDEAL GAS IN THERMAL EQUILIBRIUM; QUANTUM DISTRIBUTION FUNCTIONS

where

has been defined in (53). To obtain the equation of state, we must find a relation between the pressure , the volume , and the temperature of the physical system, assuming the particle number to be fixed. We have, however, used the grand canonical ensemble (cf. Appendix VI), where the temperature is determined by the parameter and the volume is fixed, but where the particle number can vary: its average value is a function of a parameter, the chemical potential (for fixed values of and ). Mathematically, the pressure appears as a function of , and and not as the function of , and the particle number we were looking for. We can nevertheless vary , and obtain values of the pressure and particle number of the system and consequently explore, point by point, the equation of state in this parametric form. To obtain an explicit form of the equation of state would require the elimination of the chemical potential using both (47) and (69); there is generally no algebraic solution, and people just use the parametric form of the equation of state, which allows computing all the possible state variables. There also exists a “virial expansion” in powers of the fugacity , which allows the explicit elimination of at all the successive orders; its description is beyond the scope of this book. 5-b.

Bosons

The pressure of an ideal boson gas is derived from the grand potential (12), taking into account its relation (14) to the pressure and volume : = =

(

ln 1 3 3

(2 )

)

(

ln 1

)

(71)

(the second relation being valid in the limit of large volumes). This leads to: 3

=

(2 ) 1

=

3

5 2

(

(

ln 1

3

)

)

(72)

with: 5 2

(

2

=

2

d3 ln 1

3 2

)=

d

ln 1

(73)

0

As

0, the contribution

0

of the ground level to the pressure written in (71)

is: 0

=

ln 1

ln [

]

(74)

When the chemical potential tends towards zero as in (66), it leads to: 0

ln

3

(75) 1641

COMPLEMENT BXV



which therefore goes to zero in the limit of large volumes. For a large system, the ground level contribution to the pressure remains negligible compared to that of all the other individual energy levels, whose number gets bigger as the system gets larger. Contrary to what we encountered for the average total particle number, the condensed particles’ contribution to the pressure goes to zero in the limit of large volumes. As we have seen for fermions, the equation of state must be obtained by eliminating the chemical potential between equations (72) yielding the pressure and (65) yielding the total particle number. As opposed to an ideal fermion gas, whose particle number and pressure increase without limit as and the density increase, the pressure in a boson system is limited. As soon as the system condenses, only the particle number in the individual ground state continues to grow, but not the pressure. In other words, the physical system acquires an infinite compressibility, and becomes a “marginally pathological” system (a system whose pressure decreases with its volume is unstable). This pathology comes, however, from totally neglecting the bosons’ interactions. As soon as repulsive interactions are introduced, no matter how small, the compressibility will take on a finite value and the pathology will disappear. This complement is a nice illustration of the simplifications incurred by the systematic use, in the calculations, of the creation and annihilation operators. We shall see in the following complements that these simplifications still occur when taking into account the interactions, provided we stay in the framework of the mean field approximation. Complement BXVI will even show that for an interacting system studied without using this approximation, the ideal gas distribution functions are still somewhat useful for expressing the average values of various physical quantities.

1642



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

Complement CXV Condensed boson system, Gross-Pitaevskii equation

1

2

3

4

Notation, variational ket . . . . . . . . . . . . . . . . . . . . . 1643 1-a

Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1644

1-b

Choice of the variational ket (or trial ket) . . . . . . . . . . . 1644

First approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 1645 2-a

Trial wave function for spinless bosons, average energy . . . . 1645

2-b

Variational optimization . . . . . . . . . . . . . . . . . . . . . 1646

Generalization, Dirac notation

. . . . . . . . . . . . . . . . . 1648

3-a

Average energy . . . . . . . . . . . . . . . . . . . . . . . . . . 1648

3-b

Energy minimization . . . . . . . . . . . . . . . . . . . . . . . 1649

3-c

Gross-Pitaevskii equation . . . . . . . . . . . . . . . . . . . . 1650

Physical discussion . . . . . . . . . . . . . . . . . . . . . . . . . 1651 4-a

Energy and chemical potential . . . . . . . . . . . . . . . . . 1651

4-b

Healing length . . . . . . . . . . . . . . . . . . . . . . . . . . 1652

4-c

Another trial ket: fragmentation of the condensate . . . . . . 1654

The Bose-Einstein condensation phenomenon for an ideal gas (no interaction) of identical bosons was introduced in § 4-b- of Complement BXV . We show in the present complement how to describe this phenomenon when the bosons interact. We shall look for the ground state of this physical system within the mean field approximation, using a variational method (see Complement EXI ). After introducing in § 1 the notation and the variational ket, we study in § 2 spinless bosons, for which the wave function formalism is simple and the introduction of the creation and annihilation operators does not lead to any major computation simplifications. This will lead us to a first version of the Gross-Pitaevskii equation. We will then come back in § 3 to Dirac notation and the creation operators, to deal with the more general case where each particle may have a spin. Defining the Gross-Pitaevskii potential operator, we shall obtain a more general version of that equation. Finally, some properties of the Gross-Pitaevskii equation will be discussed in § 4, as well as the role of the chemical potential, the existence of a relaxation (or “healing”) length, and the energetic consequences of “condensate fragmentation” (these terms will be defined in § 4-c).

1.

Notation, variational ket

We first define the notation and the variational family of state vectors that will lead to relatively simple calculations for a system of identical interacting bosons. 1643



COMPLEMENT CXV

1-a.

Hamiltonian

The Hamiltonian operator we consider is the sum of operators for the kinetic energy 0 , the one-body potential energy ext , and the interaction energy int : =

0

+

ext

+

(1)

int

The first term 0 is simply the sum of the individual kinetic energy operators associated with each of the particles : 0

=

0(

)

(2)

where : 0(

P2 2

)=

(3)

(P is the momentum of particle ). Similarly, ext is the sum of the external potential operators 1 (R ), each depending on the position operator R of particle : ext

=

1 (R

)

(4)

=1

Finally, int

int

1 2

=

is the sum of the interaction energy associated with all the pairs of particles: 2 (R

R )

(this summation can also be written as a sum over 1 2). 1-b.

(5)

= =1

, while removing the prefactor

Choice of the variational ket (or trial ket)

Let us choose an arbitrary normalized quantum state =1

: (6)

and call the associated creation operator. The -particle variational kets we consider are defined by the family of all the kets that can be written as: Ψ =

1

0 (7) ! where can vary, only constrained by (6). Consider a basis of the individual state space whose first vector is 1 = . Relation (A-17) of Chapter XV shows that this ket is simply a Fock state whose only non-zero occupation number is the first one: Ψ =

1

=

2

=0

3

=0

(8)

An assembly of bosons that occupy the same individual state is called a “Bose-Einstein condensate”. Relation (8) shows that the kets Ψ are normalized to 1. We are going to vary , and therefore Ψ , so as to minimize the average energy: = Ψ 1644

Ψ

(9)

• 2.

CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

First approach

We start with a simple case where the bosons have no spin. We can then use the wave function formalism and keep the computations fairly simple. 2-a.

Trial wave function for spinless bosons, average energy

Assuming one single individual state to be populated, the wave function Ψ(r1 r2 is simply the product of functions (r): Ψ(r1 r2

r ) = (r1 )

(r2 )

(r )

(10)

with: (r) = r

(11)

This wave function is obviously symmetric with respect to the exchange of all particles and can be used for a system of identical bosons. In the position representation, each operator 0 ( ) defined by (3) corresponds to }2 2 ∆r , where ∆r is the Laplacian with respect to the position r ; consequently, we have: 0

}2

=

2

d3

d3

d3

(r )

(r )

1

=1

(r1 )

(r1 )

∆ (r )

(r )

(12)

In this expression, all the integral variables others than r simply introduce the square of the norm of the function (r), which is equal to 1. We are just left with one integral over r , in which r plays the role of a dummy variable, and thus yields a result independent of . Consequently, all the values give the same contribution, and we can write: 0

}2

=

d3

2

(r)∆ (r)

(13)

As for the one-body potential energy, a similar calculation yields: ext

d3

=

(r)

1 (r)

(r)

(14)

Finally, the interaction energy calculation follows the same steps, but we must keep two integral variables instead of one. The final result is proportional to the number ( 1) 2 of pairs of integral variables: 2

=

(

1)

d3

2

d3

The variational average energy =

0

+

ext

+

2

(r) (r )

2

(r r ) (r) (r )

(15)

is the sum of these three terms: (16) 1645

r )

COMPLEMENT CXV

2-b.



Variational optimization

We now optimize the energy we just computed, so as to determine the wave functions (r) corresponding to its minimum value. .

Variation of the wave function Let us vary the function (r) by a quantity: (r)

(r) +

(r)

(17)

where (r) is an infinitesimal function and an arbitrary number. A priori, (r) must be chosen to take into account the normalization constraint (6), which forces the integral of the (r) modulus squared to remain constant. We can, however, use the Lagrange multiplier method (Appendix V) to impose this constraint. We therefore introduce the multiplier (we shall see in § 4-a that this factor can be interpreted as the chemical potential) and minimize the function: d3

=

(r) (r)

(18)

This allows considering the infinitesimal variation (r) to be free of any constraint. The variation of the function is now the sum of 4 variations, coming from the three terms of (16) and from the integral in (18). For example, the variation of 0 yields:

0

=

}2 2

d3

(r) ∆ (r) +

(r) ∆

(r)

(19)

which is the sum of a term proportional to and another proportional to . This is true for all 4 variations and the total variation can be expressed as the sum of two terms: =

1

+

2

(20)

the first being the (r) contribution and the second, that of (r). Now if is stationary, must be zero whatever the choice of , which is real. Choosing for example = 0 imposes 1 + 2 = 0, and the choice = 2 leads (after multiplication by ) to 1 2 = 0. Adding and subtracting the two relations shows that both coefficients 1 and 2 must be zero. In other words, we can impose to be zero as just (r) varies but not (r) – or the opposite1 . .

Stationary condition: Gross-Pitaevskii equation

We choose to impose the variation to be zero as only (r) varies and for = 0. We must first add contributions coming from (13) and (14), then from (15). For this last contribution, we must add two terms, one coming from the variations due to (r), and the other from the variation due to (r ). These two terms only differ by the notation 1 This means that the stationary condition may be found by varying indifferently the real or imaginary part of (r).

1646



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

in the integral variable and are thus equal: we just keep one and double it. We finally add the term due to the variation of the integral in (18), and we get: d3

=

(r) }2

2

∆+

1 (r)

+(

d3

1)

2

(r r )

(r ) (r )

(r)

(21)

This variation must be zero for any value of (r); this requires the function that multiplies (r) in the integral to be zero, and consequently that (r) be the solution of the following equation, written for (r): }2 ∆+ 2

1 (r)

+(

d3

1)

2

(r r )

(r )

2

(r) =

(r)

(22)

This is the time-independent Gross-Pitaevskii equation. It is similar to an eigenvalue Schrödinger equation, but with a potential term: 1 (r)

+(

d3

1)

2

(r r )

(r )

2

(23)

which actually contains the wave function in the integral over d3 ; it is therefore a nonlinear equation. The physical meaning of the potential term in 2 is simply that, in the mean field approximation, each particle moves in the mean potential created by all the others, each of them being described by the same wave function (r ); the factor ( 1) corresponds to the fact that each particle interacts with ( 1) other particles. The Gross-Pitaevskii equation is often used to describe the properties of a boson system in its ground state (Bose-Einstein condensate). .

Zero-range potential

The Gross-Pitaevskii equation is often written in conjunction with an approximation where the particle interaction potential has a microscopic range, very small compared to the distances over which the wave function (r) varies. We can then substitute: 2 (r

r)=

(r

r)

(24)

where the constant is called the “coupling constant”; such a potential is sometimes known as a “contact potential” or, in other contexts, a “Fermi potential”. We then get: }2 ∆+ 2

1 (r)

+(

1)

(r)

2

(r) =

(r)

(25)

Whether in this form2 or in its more general form (22), the equation includes a cubic term in (r). It may render the problem difficult to solve mathematically, but it also is the source of many interesting physical phenomena. This equation explains, for example, the existence of quantum vortices in superfluid liquid helium. 2 Strictly speaking, in what is generally called the Gross-Pitaevskii equation, the coupling constant is replaced by 4 }2 0 , where 0 is the “scattering length”; this length is defined when studying the collision phase shift ( ) (Chapter VIII, § ), as the limit of 0 ( ) 0. This scattering 0 when length is a function of the interaction potential 2 (r r ), but generally not merely proportional to it, as opposed to the matrix elements of 2 (r r ). It is then necessary to make a specific demonstration for this form of the Gross-Pitaevskii equation, using for example the “pseudo-potential” method.

1647



COMPLEMENT CXV

.

Other normalization

Rather than normalizing the wave function (r) to 1 in the entire space, one sometimes chooses a normalization taking into account the particle number by setting: d3

(r)

2

=

(26)

the wave function we have used until now. At each This amounts to multiplying by point r of space, the particle (numerical) density (r) is then given by: (r) =

(r)

2

(27)

With this normalization, the factor ( generally be taken equal to 1 for large }2 ∆+ 2

1 (r)

+

(r)

2

(r) =

1) in (25) is replaced by ( 1) , which can . The Gross-Pitaevskii equation then becomes: (r)

As already mentioned, we shall see in § 4-a that 3.

(28) is simply the chemical potential.

Generalization, Dirac notation

We now go back to the previous line of reasoning, but in a more general case where the bosons may have spins. The variational family is the set of the -particle state vectors written in (7). The one-body potential may depend on the position r, and, at the same time, act on the spin (particles in a magnetic field gradient, for example). 3-a.

Average energy

To compute the average energy value Ψ Ψ , we use a basis vidual state space, whose first vector is 1 = . Using relation (B-12) of Chapter XV, we can write the average value 0

=

Ψ

0

Ψ

of the indi0

as: (29)

Since Ψ is a Fock state whose only non-zero population is that of the state 1 , the ket Ψ is non-zero only if = 1; it is then orthogonal to Ψ if = 1. Consequently, the only term left in the summation corresponds to = = 1. As the operator 1 1 multiplies the ket by its population , we get: 0

=

1

0

1

(30)

With the same argument, we can write: ext

1648

=

1

1

1

(31)



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

Using relation (C-16) of Chapter XV, we can express the average value of the interaction energy as3 : =

2

1 2

1:

;2 :

2 (1

2) 1 :

;2 :

Ψ

Ψ

(32)

In this case, for the second matrix element to be non-zero, both subscripts and must be equal to 1 and the same is true for both subscripts and ( otherwise the operator will yield a Fock state orthogonal to Ψ ). When all the subscripts are equal to 1, the operator multiplies the ket Ψ by ( 1). This leads to: (

=

2

1)

1:

2

1; 2

:

2 (1

1

2) 1 :

1; 2

:

(33)

1

The average interaction energy is therefore simply the product of the number of pairs ( 1) 2 that can be formed with particles and the average interaction energy of a given pair. We can replace 1 by , since they are equal. The variational energy, obtained as the sum of (30), (31) and (33), then reads: =

[

3-b.

0

+

1]

+

(

1) 2

1 : ;2 :

2 (1

2) 1 : ; 2 :

(34)

Energy minimization

Consider a variation of

:

+

(35)

where is an arbitrary infinitesimal ket of the individual state space, and an arbitrary real number. To ensure that the normalization condition (6) is still satisfied, we impose and to be orthogonal: =0

(36)

so that remains equal to 1 (to the first order in ). Inserting (35) into (34) to obtain the variation d of the variational energy, we get the sum of two terms: the first one comes from the variation of the ket , and is proportional to ; the second one comes from the variation of the bra , and is proportional to . The result has the form: =

1

+

(37)

2

The stationarity condition for must hold for any arbitrary real value of . As before (§ 2-b- ), it follows that both 1 and 2 are zero. Consequently, we can impose the variation to be zero as just the bra varies (but not the ket ), or the opposite. Varying only the bra, we get the condition: 0= 3 We

[

0

+

1

]

use the simpler notation

+

2 (1

(

1) 2

2) for

[ 1:

;2 :

+ 1 : ;2 : 2 (R1

2 (1

2) 1 : ; 2 :

2 (1

2) 1 : ; 2 :

(38) ]

R2 ).

1649



COMPLEMENT CXV

As the interaction operator 2 (1 2) is symmetric, the last two terms within the bracket in this equation are equal. We get (after simplification by ): 0=

[

3-c.

0

+

]

1

+(

1) 1 :

;2 :

2 (1

2) 1 : ; 2 :

(39)

Gross-Pitaevskii equation

To deal with equation (39), we introduce the Gross-Pitaevskii operator fined as a one-particle operator whose matrix elements in an arbitrary basis given by: =(

1) 1 :

;2 :

2 (1

2) 1 :

, deare

;2 :

(40)

which leads to: =(

1) 1 : ; 2 :

2 (1

2) 1 :

;2 :

(41)

where and are two arbitrary one-particle kets – this can be shown by expanding these two kets on the basis and using relation (40). Note that this potential operator does not include an exchange term; this term does not exist when the two interacting particles are in the same individual quantum state. Equation (39) then becomes: 0=

0

+

1

+

(42)

This stationarity condition must be verified for any value of the bra , with only the constraint that it must be orthogonal to (according to relation (36)). This means that the ket resulting from the action of the operator on must have 0+ 1+ zero components on all the vectors orthogonal to ; its only non-zero component must be on the ket itself, which means it is necessarily proportional to . In other words, must be an eigenvector of that operator, with eigenvalue (real since the operator is Hermitian): 0

+

+

1

=

(43)

We have just shown that the optimal value equation: [

0

+

1

+

]

of

is the solution of the Gross-Pitaevskii

=

(44)

which is a generalization of (28) to particles with spin, and is valid for one- or twobody arbitrary potentials. For each particle, the operator represents the mean field created by all the others in the same state .

Comment: The Gross-Pitaevskii operator (1) = ( where

(2)

2 (1

2)

(2) is the projection operator

(2) =

1650

1) Tr2

is simply a partial trace over the second particle:

1:

1:

2:

(45) (2) of the state of particle 2 onto

2:

=

1:

;2 :

1:

;2 :

: (46)



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

To show this, let us compute the partial trace on the right-hand side of (45). To obtain this trace (Complement EIII , § 5-b), we choose for particle 2 a set of basis states whose first vector 1 coincides with : Tr2

(2)

2 (1

2)

=

1:

;2 :

(2)

2 (1

2) 1 :

;2 :

(47)

Replacing (2) by its value (46) yields the product of (for the scalar product associated with particle 1) and 1 (for the one associated with particle 2). This leads to: Tr2

(2)

2 (1

2)

= 1:

;2 :

2 (1

2) 1 :

;2 :

(48)

which is simply the initial definition (40) of . Relation (45) is therefore another possible definition for the Gross-Pitaevskii potential.

4.

Physical discussion

We have established which conditions the variational wave function must obey to make the energy stationary, but we have yet to study the actual value of this energy. This will allow us to show that the parameter is in fact the chemical potential associated with the system of interacting bosons. We shall then introduce the concept of a relaxation (or “healing”) length, and discuss the effect, on the final energy, of the fragmentation of a single condensate into several condensates, associated with distinct individual quantum states. 4-a.

Energy and chemical potential

Since the ket [

0

+

1

is normalized, multiplying (44) by the bra

+

]

and by

=

, we get: (49)

We recognize the first two terms of the left-hand side as the average values of the kinetic energy and the external potential. As for the last term, using definition (41) for , we can write it as: =

(

1) 1 : ; 2 :

2 (1

2) 1 : ; 2 :

(50)

which is simply twice the potential interaction energy given in (33) when leads to: =

0

+

1

To find the energy

+2

2

=

, note that

+

2 is the sum of

2

[ +

[

0

+

1]

]

. This (51)

2

2

and of half the kinetic and

external potential energies. Adding the missing halves, we finally get for =

=

1

: (52)

An advantage of this formula is to involve only one- (and not two-) particle operators, which simplifies the computations. The interaction energy is implicitly contained in the factor . 1651



COMPLEMENT CXV

The quantity does not yield directly the average energy, but it is related to it, as we now show. Taking the derivative, with respect to , of equation (34) written for = , we get: d d

=

[

0

+

1]

1 2

+

2 (1

2)

For large , one can safely replace in this equation ( plication by , we obtain a sum of average energies: d d

=

0

+

1

+2

2

(53) 1 2) by (

1); after multi-

(54)

Taking relation (51) into account, this leads to: d d

=

(55)

We know (Appendix VI, § 2-b) that in the grand canonical ensemble, and at zero temperature, the derivative of the energy with respect to the particle number (for a fixed volume) is equal to the chemical potential. The quantity , introduced mathematically as a Lagrange multiplier, can therefore be simply interpreted as this chemical potential. 4-b.

Healing length

The “healing length” is an important concept that characterizes the way a solution of the time-independent Gross-Pitaevskii equation reacts to a spatial constraint (for example, the solution can be forced to be zero along a wall, or along the line of a vortex core). We now calculate an approximate order of magnitude for this length. Assuming the potential 1 (r) to be zero in the region of interest, we divide equation (28) by (r) and get: }2 ∆ (r) + 2 (r)

(r)

2

=

(56)

Consequently, the left-hand side of this equation must be independent of r. Let us assume (r) is constant in an entire region of space where the density is 0 , independent of r: 0

=

(r)

2

(57)

but constrained by the boundary conditions to be zero along its border. For the sake of simplicity, we shall treat the problem in one dimension, and assume (r) only depends on the first coordinate of r; the wave function must then be zero along a plane (supposed to be at = 0). We are looking for an order of magnitude of the distance over which the wave function goes from a practically constant value to zero, i.e. for the spatial range of the wave function transition regime. In the region where (r) is constant, relation (56) yields: = 1652

0

(58)



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

Figure 1: Variation as a function of the position of the wave function ( ) in the vicinity of a wall (at = 0) where it is forced to be zero. This variation occurs over a distance of the order of the healing length defined in (61); the stronger the particle interactions, the shorter that length. As increases, the wave function tends towards a constant plateau, of coordinate 0 , represented as a dashed line. On the other hand, in the whole region where particular close to the origin, we have: }2 ∆ (r) 2 (r)

=

(r) has significantly decreased, and in

(59)

0

In one dimension4 , we then get the differential equation: }2 d2 ( ) 2 d 2

0

( )

whose solutions are sums of exponential functions =

(60) , with:

}2 2

(61) 0

The solution that is zero for = 0 is the difference between these two exponentials; it is proportional to sin( ), a function that starts from zero and increases over a characteristic length . Figure 1 shows the wave function variation in the vicinity of the wall where it is forced to be zero. The stronger the interactions, the shorter this “healing length” ; it varies as the inverse of the square root of the product of the coupling constant and the density 0 . From a physical point of view, the healing length results from a compromise between the repulsive interaction forces, which try to keep the wave function as constant as possible in space, and the kinetic energy, which tends to minimize its spatial derivative (while the wave function is forced to be zero at = 0); is equal (except for a 2 coefficient) to the de Broglie wavelength of a free particle having a kinetic energy comparable to the repulsion energy 0 in the boson system. 4 A more precise derivation can be given by verifying that the one-dimensional equation (56).

( )=

0

tanh

2 is a solution of

1653

COMPLEMENT CXV

4-c.



Another trial ket: fragmentation of the condensate

We now show that repulsive interactions do stabilize a boson “condensate” where all the particles occupy the same individual state, as opposed to a “fragmented” state where some particles occupy a different state, which can be very close in energy. Instead of using a trial ket (7), where all the particles form a perfect Bose-Einstein condensate in a single quantum state , we can “fragment” this condensate by distributing the particles in two distinct individual states. Consequently, we take a trial ket where particles are in the state and = in the orthogonal state : 1 !

Ψ =

0

!

(62)

We now compute the change in the average variational energy. In formula (29) giving the average kinetic energy, for the operator to yield a Fock state identical to Ψ , we must have either = = , or = = . This leads to: 0

=

+

0

(63)

0

The computation of the one-body potential energy is similar and leads to: =

ext

+

1

(64)

1

In both cases, the contributions of two populated states are proportional to their respective populations, as expected for energies involving a single particle. As for the two-body interaction energy, we use again relation (32). It contains the operator , which will reconstruct the Fock state Ψ in the following three cases: - = = = = or yields the contribution: (

1) 2

1: +

;2 : (

1) 2

2 (1

2) 1 :

1:

;2 :

;2 : 2 (1

2) 1 :

;2 :

(65)

= = and = = , or = = and = = ; these two possibilities yield the same contribution (since the 2 operator is symmetric), and the 1 2 factor disappears, leading to the direct term: 1:

;2 :

2 (1

2) 1 :

;2 :

(66)

- Finally, = = and = = , or = = and = = , yield two contributions whose sum introduces the exchange term (here again without the factor 1 2): 1:

;2 :

2 (1

2) 1 :

;2 :

(67)

The direct and exchange terms have been schematized in Figure 3 in Chapter XV (replacing by , and by ), with the direct term on the left, and the exchange term on the right. 1654



CONDENSED BOSON SYSTEM, GROSS-PITAEVSKII EQUATION

The variational energy can thus be written as: =

[

[

0

(

+

1) 2

(

+

1) 2

+

1]

]+

1:

;2 :

1:

;2 :

[

[

2 (1

0

2) 1 :

2 (1

2) 1 :

+

1:

;2 :

2 (1

2) 1 :

;2 :

+

1:

;2 :

2 (1

2) 1 :

;2 :

+

1]

]

;2 : ;2 :

(68)

As above, the interaction between particles in the same state contributes a term proportional to ( 1) 2, the number of pairs of particles in that state; the same is true for the interaction term between particles in the same state . The direct term associated with the interaction between two particles in distinct states is proportional to , the number of such pairs. But to this direct term we must add an exchange term, also proportional to , corresponding to an additional interaction. This increased interaction is due to the bunching effect of two bosons in different quantum states, that will be discussed in more detail in § 3-b of Complement AXVI . As they are indistinguishable, two bosons occupying individual orthogonal states show correlations in their positions; this increases the probability of finding them at the same point in space. This increase does not occur when the two bosons occupy the same individual quantum state. We now assume the diagonal matrix elements of [ 0 + 1 ] between the two states and to be practically the same. For example, if these two states are the lowest energy levels of spinless particles in a cubic box of edge , the corresponding energy difference is proportional to 1 2 – hence very small in the limit of large . We also assume all the matrix elements of 2 (1 2) to be equal, which is the case if the (microscopic) range of the particle interaction potential is very small compared to the distances over which the wave functions of the two states vary. We can therefore replace in all the matrix elements the kets and by the same ket . Since + = , we obtain: = +

[ 1 [ 2

+

0

+

(

1]

1) + 1 : ;2 :

(

1) + 2

2 (1

] 1 : ;2 :

2 (1

2) 1 : ; 2 :

2) 1 : ; 2 :

(69)

However: (

1) = (

+

)(

+

1) =

(

1) +

(

1) + 2

(70)

so that: =

[

0

+

1]

+

(

1) 2

1 : ;2 :

2 (1

2) 1 : ; 2 :

+∆

(71)

with: ∆

=

1 : ;2 :

2 (1

2) 1 : ; 2 :

(72) 1655

COMPLEMENT CXV



We find again result (34), but with an additional term ∆ , the exchange term. Two cases are then possible, depending on whether the particle interactions are attractive or repulsive. In the first case, the fragmentation of the condensate lowers the energy and leads to a more stable state. Consequently, when the particle interactions are attractive, a condensate where only one individual state is occupied tends to split into two condensates, which might each split again, and so on. This means that the initial single condensate is unstable (we will come back and discuss this instability in § 2-b of Complement FXV for the more general case of thermal equilibrium at non-zero temperature). On the contrary, for repulsive interactions the fragmentation increases the energy and leads to a less stable state: repulsive interactions therefore tend to stabilize the condensate in a single individual quantum state5 . This result will be interpreted in § 3-b of Complement AXVI in terms of changes of the particle position correlation function (bunching effect of bosons). As for the ideal gas, an intermediate case between the two previous ones, it is a marginal borderline case: adding any infinitesimal attractive interaction, no matter how small, destabilizes any condensate.

5 We are discussing here the simple case of spinless bosons, contained in a box. When the bosons have several internal quantum states, and in other geometries, more complex situations may arise where the ground state is fragmented [4].

1656



TIME-DEPENDENT GROSS-PITAEVSKII EQUATION

Complement DXV Time-dependent Gross-Pitaevskii equation

1

2

3

Time evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-a Functional variation . . . . . . . . . . . . . . . . . . . . . . . 1-b Variational computation: the time-dependent Gross-Pitaevskii equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-c Phonons and Bogolubov spectrum . . . . . . . . . . . . . . . Hydrodynamic analogy . . . . . . . . . . . . . . . . . . . . . . 2-a Probability current . . . . . . . . . . . . . . . . . . . . . . . . 2-b Velocity evolution . . . . . . . . . . . . . . . . . . . . . . . . Metastable currents, superfluidity . . . . . . . . . . . . . . . 3-a Toroidal geometry, quantization of the circulation, vortex . . 3-b Repulsive potential barrier between states of different . . . 3-c Critical velocity, metastable flow . . . . . . . . . . . . . . . . 3-d Generalization; topological aspects . . . . . . . . . . . . . . .

1657 1658 1659 1660 1664 1664 1665 1667 1667 1669 1671 1674

In this complement, we return to the calculations of Complement CXV , concerning a system of bosons all in the same individual state. We now consider the more general case where that state is time-dependent. Using a variational method similar to the one we used in Complement CXV , we shall study the time variations of the -particle state vector. This amounts to using a time-dependent mean field approximation. We shall establish in § 1 a time-dependent version of the Gross-Pitaevskii equation, and explore some of its predictions such as the small oscillations associated with Bogolubov phonons. In § 2, we shall study local conservation laws derived from this equation for which we will give a hydrodynamic analogy, introducing a characteristic relaxation length. Finally, we will show in § 3 how the Gross-Pitaevskii equation predicts the existence of metastable flows and superfluidity. 1.

Time evolution

We assume that the ket describing the physical system of relation (7) of Complement CXV : ^ Ψ( ) =

1 !

()

0

bosons can be written using

(1)

but we now suppose that the individual ket is a function of time ( ) . The creation operator ( ) in the corresponding individual state is then time-dependent: () 0 =

()

We will let the ket ()

() =1

(2) ( ) vary arbitrarily, as long as it remains normalized at all times: (3) 1657

COMPLEMENT DXV



We are looking for the time variations of ( ) that will yield for ^ Ψ( ) variations as close as possible to those predicted by the exact -particle Schrödinger equation. As the one-particle potential 1 may also be time-dependent, it will be written as 1 ( ). 1-a.

Functional variation

Let us introduce the functional of Ψ( ) : 1

[ Ψ( ) ] =

d Ψ( )

( ) Ψ( )

}

0

+

} 2

Ψ( 0 ) Ψ( 0 )

Ψ( 1 ) Ψ( 1 )

(4)

It can be shown that this functional is stationary when Ψ( ) is solution of the exact Schrödinger equation (an explicit demonstration of this property is given in § 2 of Complement FXV . If Ψ( ) belongs to a variational family, imposing the stationarity of this functional allows selecting, among all the family kets, the one closest to the exact solution of the Schrödinger equation. We shall therefore try and make this functional stationary, choosing as the variational family the set of kets ^ Ψ( ) written as in (1) where the individual ket ( ) is time-dependent. As condition (3) means that the norm of ^ Ψ( ) remains constant, the second bracket in expression (4) must be zero. We now have to evaluate the average value of the Hamiltonian ( ) that, actually, has been already computed in (34) of Complement CXV : ^ Ψ( ) [ ( )] ^ Ψ( ) =

() [ (

+

1) 2

0

+

1(

)] ( )

1 : ( ); 2 : ( )

2 (1

2) 1 : ( ); 2 : ( )

(5)

The only term left to be computed in (4) contains the time derivative. This term includes the diagonal matrix element: ^ Ψ( )

}

d d

} ^ Ψ( ) = 0 !

1

()

() =0

d d

1

()

0

(6)

For an infinitesimal time , the operator is proportional to the difference ( + ) ( ), hence to the difference between two creation operators associated with two slightly different orthonormal bases. Now, for bosons, all the creation operators commute with each other, regardless of their associated basis. Therefore, in each term of the summation over , we can move the derivative of the operator to the far right, and obtain the same result, whatever the value of . The summation is therefore equal to times the expression: 1

1

0

()

! Now, we know that:

()

d d

0

(7)

1

() 1658

()

0 =

! 1: () =

!

() 0

(8)



TIME-DEPENDENT GROSS-PITAEVSKII EQUATION

Using in (6) the bra associated with that expression, multiplied by ^ Ψ( )

}

d d

^ Ψ( ) = }

0

()

d d

0 =

()

}

d d

()

)]

}

d d

()

, we get: (9)

Regrouping all these results, we finally obtain: 1

^ Ψ( ) =

d

() [

0

+

1(

0

(

+ 1-b.

1) 2

1 : ( ); 2 : ( )

2 (1

2) 1 : ( ); 2 : ( )

(10)

Variational computation: the time-dependent Gross-Pitaevskii equation

We now make an infinitesimal variation of ()

() +

()

(): (11)

in order to find the kets ( ) for which the previous expression will be stationary. As in the search for a stationary state in Complement CXV , we get variations coming from the infinitesimal ket ( ) and others from the infinitesimal bra ( ) ; as is chosen arbitrarily, the same argument as before leads us to conclude that each of these variations must be zero. Writing only the variation associated with the infinitesimal bra, we see that the stationarity condition requires ( ) to be a solution of the following equation, written for ( ) : }

d d

() =[

0

+

1(

The mean field operator CXV by a partial trace: (1 ) = ( ( )

where ( )

=

1) Tr2

)+

( )]

()

( ) is defined as in relations (45) and (46) of Complement

( )

(2)

2

(1 2)

is the projector onto the ket ()

()

(12)

(13)

(): (14)

As we take the trace over particle 2 whose state is time-dependent, the mean field is also time-dependent. Relation (12) is the general form of the time-dependent Gross-Pitaevskii equation. Let us return, as in § 2 of Complement CXV , to the simple case of spinless bosons, interacting through a contact potential: 2

(r r ) =

(r

r)

(15)

Using definition (13) of the Gross-Pitaevskii potential, we can compute its effect in the position representation, as in Complement CXV . The same calculations as in §§ 2-b- and 2-b- of that complement allow showing that relation (12) becomes the Gross-Pitaevskii 1659



COMPLEMENT DXV

time-dependent equation ( by ): }2 ∆+ 2

(r ) =

}

is supposed to be large enough to permit replacing

1 (r

)+

(r )

Normalizing the wave function (r ) to d3

(r )

2

2

(r )

1

(16)

:

=

(17)

equation (16) simply becomes: }2 ∆+ 2

(r ) =

}

1 (r

)+

(r )

2

(r )

(18)

Comment: It can be shown that this time evolution does conserve the norm of ( ) , as required by (3). Without the nonlinear term of (16), it would be obvious since the usual Schrödinger equation conserves the norm. With the nonlinear term present, it will be shown in § 2-a that the norm is still conserved.

1-c.

Phonons and Bogolubov spectrum

Still dealing with spinless bosons, we consider a uniform system, at rest, of particles contained in a cubic box of edge length . The external potential 1 (r) is therefore zero inside the box and infinite outside. This potential may be accounted for by forcing the wave function to be zero at the walls. In many cases, it is however more convenient to use periodic boundary conditions (Complement CXIV , § 1-c), for which the wave function of the individual lowest energy state is simply a constant in the box. We thus consider a system in its ground state, whose Gross-Pitaevskii wave function is independent of r: (r ) = with a

0(

)=

1

}

3 2

(19)

value that satisfies equation (16):

=

3

=

0

(20)

3 where 0 = is the system density. Comparing this expression with relation (58) of Complement CXV allows us to identify with the ground state chemical potential. We assume in this section that the interactions between the particles are repulsive (see the comment at the end of the section):

0 1660

(21)

• .

TIME-DEPENDENT GROSS-PITAEVSKII EQUATION

Excitation propagation

Let us see which excitations can propagate in this physical system, whose wave function is no longer the function (19), uniform in space. We assume: (r ) =

0(

)+

(r )

(22)

where (r ) is sufficiently small to be treated to first order. Inserting this expression in the right-hand side of (16), and keeping only the first-order terms, we find in the interaction term the first-order expression: 2

(r )

(r ) =

(2

=

0

)

0

2

0 2

+

2 0

+ }

(23)

We therefore get, to first-order: }

(r ) =

}2 ∆+2 2

(r ) +

0

2

0

}

(r )

which shows that the evolution of (r ) is coupled to that of conjugate equation can be written as: }

(r ) =

}2 ∆ 2

2

(r )

0

0

2

}

(24) (r ). The complex

(r )

(25)

We can make the time-dependent exponentials on the right-hand side disappear by defining: (r ) = (r ) (r ) = (r )

}

(26)

}

This leads us to a differential equation with constant coefficients, which can be simply expressed in a matrix form:

}

(r ) (r )

=

}2 2

∆+

0

0



0

where we have used definition (20) for to replace 2 solutions having a plane wave spatial dependence: (r ) = (k ) k r (r ) = (k )

(r ) (r )

0 }2 2

0

by

0.

(27)

If we now look for

(28)

kr

the differential equation can be written as:

}

(k ) (k )

=

}2 2

2

+

0 0

0 }2 2

2

0

(k ) (k )

(29)

1661

COMPLEMENT DXV



The eigenvalues } (k) of this matrix satisfy the equation: }2 2

2

+

}2 2

} (k)

0

2

2 0)

} (k) + (

0

=0

(30)

that is: }2 2

2

[} (k)]

2

2

+

+(

0

2 0)

=0

(31)

The solution of this equation is: }2 2

} (k) =

2

2

+

(

0

0)

2

=

}2 2

2

}2 2

2

+2

(32)

0

(the opposite value is also a solution, as expected since we calculate at the same time the evolution of ( ) and of its complex conjugate; we only use here the positive value). Setting: 0

=

2 }

(33)

0

relation (32) can be written: (k) =

} 2

2

(

2

+

2 0)

(34)

The spectrum given by (32) is plotted in Figure 1, where one sees the intermediate regime between the linear region at low energy, and the quadratic region at higher energy. It is called the “Bogolubov spectrum” of the boson system. .

Discussion

Let us compute the spatial and time evolution of the particle density (r t) when (r ) obeys relation (28). The particle density at each point r of space is the sum of the densities associated with each particle, that is times the squared modulus of the wave function (r ). To first-order in (r ), we obtain: (r t) =

0(

)

}

[

(r )] + c.c.

(35)

(where c.c. stands for “complex conjugate”). Using (26) and (28), we can finally write: (r t) = =

0(

3 2

)

}

(k 0)

[k r

}

(k 0)

(k) ]

+ c.c.

[k r

(k) ]

+ c.c. (36)

Consequently, the excitation spectrum we have calculated corresponds to density waves propagating in the system with a phase velocity (k) . In the absence of interactions, ( = 0 = 0), this spectrum becomes: } (k) = 1662

}2 2

2

(37)



TIME-DEPENDENT GROSS-PITAEVSKII EQUATION

Figure 1: Bogolubov spectrum: variations of the function (k) given by equation (32) as a function of the dimensionless variable = 1, we get a linear 0 . When spectrum (the arrow in the figure shows the tangent to the curve at the origin), whose slope is equal to the sound velocity ; when 1, the spectrum becomes quadratic, as for a free particle. which simply yields the usual quadratic relation for a free particle. Physically, this means that the boson system can be excited by transferring a particle from the individual ground state, with wave function 0 (r) and zero kinetic energy, to any state k (r) having an energy }2 2 2 . In the presence of interactions, it is no longer possible to limit the excitation to a single particle, which immediately transmits it to the others. The system’s excitations become what we call “elementary excitations”, involving a collective motion of all the particles, and hence oscillations in the density of the boson system. If 0 , we see from (34) that: (k) where =

(38) is defined as:

} 2

0

0

=

(39)

For small values of , the interactions have the effect of replacing the quadratic spectrum (37) by a linear spectrum. The phase velocity of all the excitations in this value domain is a constant . It is called the “sound velocity ” in the interacting boson system, by analogy with a classical fluid where the sound wave dispersion relation is linear, as predicted by the Helmholtz equation. We shall see in § 3 that the quantity plays a fundamental role in the computations related to superfluidity, especially for the critical velocity determination. If, on the other hand, 0 , the spectrum becomes: } (k)

}2 2

2

+

0

+

(40)

(the following corrections being in 02 2 , 04 4 , etc.). We find again, within a small correction, the free particle spectrum: exciting the system with enough energy allows 1663



COMPLEMENT DXV

exciting individual particles almost as if they were independent. Figure 1 shows the complete variation of the spectrum (32), with the transition from the linear region at low energies, to the quadratic region at high energies. Comment: As we assumed the interactions to be repulsive in (21), the square roots in (32) and (39) are well defined. If the coupling constant becomes negative, the sound velocity will become imaginary, and, as seen from (32), so will the frequencies ( ) (at least for small values of ). This will lead, for the evolution equation (29), to solutions that are exponentially increasing or decreasing in time, instead of oscillating. An exponentially increasing solution corresponds to an instability of the system. As already encountered in § 4-c of Complement CXV , we see that a boson system becomes unstable in the presence of attractive interactions, however small they might be. In § 4-b of Complement HXV , we shall see that this instability persists even for non-zero temperature. In a general way, an attractive condensate occupying a large region in space tends to collapse onto itself, concentrating into an ever smaller region. However, when it is confined in a finite region (as is the case for experiments where cold atoms are placed in a magneto-optical trap), any change in the wave function that brings the system closer to the instability also increases the gas energy; this results in an energy barrier, which allows the system of condensed attractive bosons to remain in a metastable state.

2. Hydrodynamic analogy

Let us return to the study of the time evolution of the Gross-Pitaevskii wave function and of the density variations n(r, t), without assuming as in § 1-c that the boson system stays very close to uniform equilibrium. We will show that the Gross-Pitaevskii equation can take a form similar to the hydrodynamic equation describing a fluid's evolution. In this discussion, it is useful to normalize the Gross-Pitaevskii wave function to the particle number, as in equation (17). Equation (16) can then be written as:

    iħ ∂φ(r, t)/∂t = [ −(ħ²/2m) Δ + V₁(r, t) + g n(r, t) ] φ(r, t)          (41)

where the local particle density n(r, t) is given by:

    n(r, t) = |φ(r, t)|²          (42)

2-a. Probability current

Since:

    ∂n(r, t)/∂t = φ*(r, t) ∂φ(r, t)/∂t + φ(r, t) ∂φ*(r, t)/∂t          (43)

the time variation of the density may be obtained by multiplying (41) by φ*(r, t), its complex conjugate by φ(r, t), and then subtracting the two results; the potential terms in V₁(r, t) and g n(r, t) cancel out, and we get:

    ∂n(r, t)/∂t = (iħ/2m) [ φ*(r, t) Δφ(r, t) − φ(r, t) Δφ*(r, t) ]          (44)




Let us now define a vector J(r, t) by:

    J(r, t) = (ħ/2mi) [ φ*(r, t) ∇φ(r, t) − φ(r, t) ∇φ*(r, t) ]          (45)

If we compute the divergence of this vector, the terms in ∇φ*·∇φ cancel out and we are left with terms identical to the right-hand side of (44), with the opposite sign. This leads to the conservation equation:

    ∂n(r, t)/∂t + ∇·J(r, t) = 0          (46)

J(r, t) is thus the probability current associated with our boson system. Integrating over all space, using the divergence theorem, and assuming n(r, t) (hence the current) goes to zero at infinity, we obtain:

    d/dt ∫ d³r n(r, t) = d/dt ∫ d³r |φ(r, t)|² = 0          (47)

This shows, as announced earlier, that the Gross-Pitaevskii equation conserves the norm of the wave function describing the particle system. We now set:

    φ(r, t) = √n(r, t) e^{iα(r, t)}          (48)

The gradient of this function is written as:

    ∇φ(r, t) = [ ∇√n(r, t) + i √n(r, t) ∇α(r, t) ] e^{iα(r, t)}          (49)

Inserting this result in (45), we get:

    J(r, t) = (ħ/m) n(r, t) ∇α(r, t)          (50)

or, defining the particle local velocity v(r, t) as the ratio of the current to the density:

    v(r, t) = J(r, t)/n(r, t) = (ħ/m) ∇α(r, t)          (51)

We have thus defined a velocity field, similar to the velocity field of a fluid in motion in a certain region of space; this velocity field is irrotational (zero curl everywhere).
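As a concrete illustration (not part of the original text), the sketch below builds a one-dimensional wave function φ = √n e^{iα} on a grid and checks numerically that the current computed from definition (45) coincides with the hydrodynamic form (50), n (ħ/m) ∂α/∂x; the profiles chosen for n and α are arbitrary examples.

    import numpy as np

    hbar, m = 1.0, 1.0
    x = np.linspace(-10, 10, 2001)
    dx = x[1] - x[0]

    n = np.exp(-x**2 / 4)               # arbitrary smooth density profile
    alpha = 0.3 * x + 0.1 * np.sin(x)   # arbitrary smooth phase profile
    phi = np.sqrt(n) * np.exp(1j * alpha)

    dphi = np.gradient(phi, dx)
    J_def = (hbar / (2 * m * 1j)) * (np.conj(phi) * dphi - phi * np.conj(dphi))   # equation (45)
    J_hydro = (hbar / m) * n * np.gradient(alpha, dx)                             # equation (50)

    print("max |J_def - J_hydro| =", np.max(np.abs(J_def.real - J_hydro)))  # ~0 up to grid error
    # The velocity field v = J/n equals (hbar/m) d(alpha)/dx everywhere, as in (51).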

2-b. Velocity evolution

We now compute the time derivative of this velocity. Taking the derivative of (48), we get:

    iħ ∂φ(r, t)/∂t = e^{iα(r, t)} [ iħ ∂√n(r, t)/∂t − ħ √n(r, t) ∂α(r, t)/∂t ]          (52)

so that we can isolate the time derivative of α(r, t) by the following combination:

    iħ [ φ*(r, t) ∂φ(r, t)/∂t − φ(r, t) ∂φ*(r, t)/∂t ] = −2ħ n(r, t) ∂α(r, t)/∂t          (53)




The left-hand side of this relation can be computed with the Gross-Pitaevskii equation (18) and its complex conjugate, as we now show. We first take the divergence of the gradient (49) to obtain the Laplacian:

    Δφ(r, t) = ∇·∇φ(r, t) = { Δ√n(r, t) + 2i ∇√n(r, t)·∇α(r, t) + i √n(r, t) Δα(r, t) − √n(r, t) [∇α(r, t)]² } e^{iα(r, t)}          (54)

We then insert the time derivative of φ(r, t) given by the Gross-Pitaevskii equation (18) in the left-hand side of relation (53), which becomes:

    −(ħ²/2m) [ φ*(r, t) Δφ(r, t) + φ(r, t) Δφ*(r, t) ] + 2 [ V₁(r, t) + g n(r, t) ] n(r, t)
        = −(ħ²/2m) { 2 √n(r, t) Δ√n(r, t) − 2 n(r, t) [∇α(r, t)]² } + 2 [ V₁(r, t) + g n(r, t) ] n(r, t)          (55)

This result must be equal to the right-hand side of (53). We therefore get, after dividing both sides by 2n(r, t):

    ħ ∂α(r, t)/∂t = (ħ²/2m) [ (1/√n(r, t)) Δ√n(r, t) − (∇α(r, t))² ] − [ V₁(r, t) + g n(r, t) ]          (56)

Using (51), we finally obtain the evolution equation for the velocity v(r, t):

    m ∂v(r, t)/∂t = −∇ { V₁(r, t) + g n(r, t) + (1/2) m v²(r, t) − (ħ²/2m) (1/√n(r, t)) Δ√n(r, t) }          (57)

This equation looks like the classical Newton equation. Its right-hand side includes the sum of the forces corresponding to the external potential 1 (r ), and to the mean interaction potential with the other particles (r ); the third term in the gradient is the classical kinetic energy gradient1 (as in Bernoulli’s equation of classical hydrodynamics). The only purely quantum term is the last one, as shown by its explicit dependence on }2 . It involves spatial derivatives of (r ), and is only important if the relative variations of the density occur over small enough distances (for example, this term is zero for a uniform density). This term is sometimes called “quantum potential”, or “quantum pressure term” or, in other contexts, “Bohm potential”. A frequently used approximation is to consider the spatial variations of (r ) to be slow, which amounts to ignoring this quantum potential term: this is the so-called Thomas-Fermi approximation. We have found for a system of particles a series of properties usually associated with the wave function of a single particle, and in particular a local velocity directly proportional to its phase gradient2 . The only difference is that, for the -particle case, 1 It is a “total derivative” term (the derivative describing, in a fluid, the motion of each particle). As the velocity field has a zero curl according to (51), a simple vector analysis calculation shows this term to be equal to (v ∇) v; it can therefore be accounted for by replacing on the left-hand side of (57) the partial derivative by the total derivative d d = + v ∇. 2 The quantum potential is still present for a single particle, since making = 0 in (57) does not change this potential. For = 0, the Gross-Pitaevskii equation simply reduces to the standard Schrödinger equation, valid for a single particle.
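The relative size of the quantum potential can be checked numerically. The following sketch (added for illustration, not part of the original text) evaluates the term (ħ²/2m) Δ√n/√n of (57) for a Gaussian density profile of width σ and compares it with the interaction term g n at the center; it shows that the quantum term becomes negligible when the density varies slowly (large σ), which is the Thomas-Fermi regime mentioned above. All parameter values are arbitrary.

    import numpy as np

    hbar, m, g = 1.0, 1.0, 1.0

    def quantum_vs_interaction(sigma, n0=1.0):
        """Compare (hbar^2/2m) * Lap(sqrt(n))/sqrt(n) with g*n at the center of a Gaussian profile."""
        x = np.linspace(-10 * sigma, 10 * sigma, 4001)
        dx = x[1] - x[0]
        n = n0 * np.exp(-x**2 / sigma**2)
        sqrt_n = np.sqrt(n)
        lap = np.gradient(np.gradient(sqrt_n, dx), dx)
        quantum = (hbar**2 / (2 * m)) * lap / sqrt_n
        center = len(x) // 2
        return abs(quantum[center]), g * n[center]

    for sigma in (0.5, 2.0, 10.0):
        q, i = quantum_vs_interaction(sigma)
        print(f"sigma={sigma:5.1f}  |quantum term|={q:.3e}  g*n={i:.3e}  ratio={q/i:.3e}")
    # The ratio drops as sigma grows: slowly varying densities justify the Thomas-Fermi approximation.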





Figure 2: A repulsive boson gas is contained in a toroidal box. All the bosons are supposed to be initially in the same quantum state describing a rotation around the axis. As we explain in the text, this rotation can only slow down if the system overcomes a potential energy barrier that comes from the repulsive interactions between the particles. This prevents any observable damping of the rotation over any accessible time scale; the fluid rotates indefinitely, and is said to be superfluid.

we must add to the external potential V₁(r, t) a local interaction potential g n(r, t), which does not significantly change the form of the equations but introduces some nonlinearity that can lead to completely new physical effects.

3. Metastable currents, superfluidity

Consider now a system of repulsive bosons contained in a toroidal box with a rotational axis Oz (Figure 2); the shape of the torus cross-section (circular, rectangular or other) is irrelevant for our argument and we shall use cylindrical coordinates r, φ and z. We first introduce solutions of the Gross-Pitaevskii equation that correspond to the system rotating inside the toroidal box, around the Oz axis. We will then show that these rotational states are metastable, as they can only relax towards lower energy rotational states by overcoming a macroscopic energy barrier: this is the physical origin of superfluidity.

3-a. Toroidal geometry, quantization of the circulation, vortex

To prevent any confusion with the azimuthal angle φ, we now call χ the Gross-Pitaevskii wave function. The time-independent Gross-Pitaevskii equation then becomes (in the absence of any potential except the wall potentials of the box):

    { −(ħ²/2m) [ ∂²/∂r² + (1/r) ∂/∂r + (1/r²) ∂²/∂φ² + ∂²/∂z² ] + g |χ(r)|² } χ(r) = μ χ(r)          (58)

We look for solutions of the form:

    χ(r) = χ_ℓ(r, z) e^{iℓφ}          (59)




where ℓ is necessarily an integer (otherwise the wave function would be multi-valued). Such a solution has an angular momentum with a well defined component along Oz, equal to ℓħ per atom. Inserting this expression in (58), we obtain the equation for χ_ℓ(r, z):

    −(ħ²/2m) [ (1/r) ∂/∂r ( r ∂χ_ℓ(r, z)/∂r ) + ∂²χ_ℓ(r, z)/∂z² ] + ( ℓ²ħ²/2mr² ) χ_ℓ(r, z) + g |χ_ℓ(r, z)|² χ_ℓ(r, z) = μ_ℓ χ_ℓ(r, z)          (60)

which must be solved with the boundary conditions imposed by the torus shape to obtain the ground state (associated with the lowest value of the chemical potential μ_ℓ). The term in ℓ²ħ²/2mr² is simply the rotational kinetic energy around Oz. If the torus radius R is very large compared to the size of its cross-section, the term ℓ²ħ²/2mr² may, to a good approximation, be replaced by the constant ℓ²ħ²/2mR². It follows that the same solution of (60) is valid for any value of ℓ as long as the chemical potential is increased accordingly. Each value of the angular momentum ℓħ thus yields a ground state, and the larger ℓ, the higher the corresponding chemical potential. All the coefficients of the equation being real, we shall assume, from now on, the functions χ_ℓ(r, z) to be real. As the wave function is of the form (59), its phase only depends on φ, and expression (51) for the fluid velocity is written as:

    v = (ℓħ / mr) e_φ          (61)

where e_φ is the tangential unit vector (perpendicular both to r and to the Oz axis). Consequently, the fluid rotates along the toroidal tube, with a velocity proportional to ℓ. As v is a gradient, its circulation along a closed loop "equivalent to zero" (i.e. which can be contracted continuously to a point) is zero. If the closed loop goes around the torus, the path is no longer equivalent to zero and its circulation may be computed along a circle where r and z remain constant, and φ varies from 0 to 2π; as the path length equals 2πr, we get:

    ∮ v · ds = 2πℓ ħ/m          (62)

(with a + sign if the rotation is counterclockwise and a − sign in the opposite case). As ℓ is an integer, the velocity circulation around the center of the torus is quantized in units of h/m. This is obviously a pure quantum property (for a classical fluid, this circulation can take on a continuous set of values). To simplify the calculations, we have assumed until now that the fluid rotates as a whole inside the toroidal ring. More complex fluid motions, with different geometries, are obviously possible. An important case, which we will return to later, concerns the rotation around an axis still parallel to Oz, but located inside the fluid. The Gross-Pitaevskii wave function must then be zero along a line inside the fluid itself, which thus contains a singular line. This means that the phase may change by 2π as one rotates around this line. This situation corresponds to what is called a "vortex", a little swirl of fluid rotating around the singular line, called the "vortex core line". As the circulation of the velocity only depends on the phase change along the path going around the vortex core, the quantization relation (62) remains valid. Actually, from a historical point of view, the Gross-Pitaevskii equation was first introduced for the study of superfluidity and of the quantization of the vortex circulation.
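A short numerical check of the quantization (62), added here as an illustration and not part of the original text: for the velocity field (61), the circulation along a circle of radius r around the torus axis is independent of r and equal to ℓ h/m.

    import numpy as np

    hbar, m = 1.0, 1.0
    h = 2 * np.pi * hbar

    def circulation(l, r, npts=100000):
        """Circulation of v = (l*hbar/(m*r)) e_phi along a circle of radius r."""
        v_phi = l * hbar / (m * r)          # equation (61), tangential component
        ds = r * (2 * np.pi / npts)         # arc length element
        return np.sum(np.full(npts, v_phi) * ds)

    for l in (0, 1, 2, 3):
        for r in (0.5, 1.0, 2.0):
            print(f"l={l}  r={r:3.1f}  circulation/(h/m)={circulation(l, r)/(h/m):.6f}")
    # The circulation equals l*(h/m) whatever the radius: it is quantized, as stated in (62).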

3-b. Repulsive potential barrier between states of different ℓ

A classical rotating fluid will always come to rest after a certain time, due to the viscous dissipation at the walls. In such a process, the macroscopic rotational kinetic energy of the whole fluid is progressively degraded into numerous smaller scale excitations, which end up simply heating the fluid. Will a rotating quantum fluid of repulsive bosons, described by a wave function χ_ℓ(r), behave in the same way? Will it successively evolve towards the state χ_{ℓ−1}(r), then χ_{ℓ−2}(r), etc., until it comes to rest in the state χ₀(r)? We have seen in § 4-c of Complement CXV that, to avoid the energy cost of fragmentation, the system always remains in a state where all the particles occupy the same quantum state. This is why we can use the Gross-Pitaevskii equation (18).

α. A simple geometry

Let us first assume that the wave function χ(r, t) changes smoothly from χ_ℓ(r) to χ_{ℓ′}(r) according to:

    χ(r, t) = c_ℓ(t) χ_ℓ(r) + c_{ℓ′}(t) χ_{ℓ′}(r)          (63)

where the modulus of c_ℓ(t) decreases with time from 1 to 0, whereas that of c_{ℓ′}(t) does the opposite. Normalization imposes that at all times t:

    |c_ℓ(t)|² + |c_{ℓ′}(t)|² = 1          (64)

In such a state, let us show that the numerical density n(φ; t) now depends on φ (this was not the case for either state χ_ℓ or χ_{ℓ′} separately). The transverse dependence of the density, as a function of the variables r and z, is barely affected³. The variations of n(φ; t) are given by:

    n(φ; t) = |c_ℓ(t)|² [χ_ℓ(r, z)]² + |c_{ℓ′}(t)|² [χ_{ℓ′}(r, z)]² + χ_ℓ(r, z) χ_{ℓ′}(r, z) [ c_ℓ*(t) c_{ℓ′}(t) e^{i(ℓ′−ℓ)φ} + c.c. ]          (65)

where c.c. stands for the complex conjugate of the preceding factor. The first two terms are independent of φ, and are just a weighted average of the densities associated with each of the states χ_ℓ and χ_{ℓ′}. The last term oscillates as a function of φ with an amplitude |c_ℓ(t)| |c_{ℓ′}(t)|, which is only zero if one of the two coefficients c_ℓ(t) or c_{ℓ′}(t) is zero. Calling β_ℓ the phase of the coefficient c_ℓ(t), this last term is proportional to:

    c_ℓ*(t) c_{ℓ′}(t) e^{i(ℓ′−ℓ)φ} + c.c. = 2 |c_ℓ(t)| |c_{ℓ′}(t)| cos [ (ℓ′ − ℓ)φ + β_{ℓ′} − β_ℓ ]          (66)

Whatever the phases of the two coefficients c_ℓ(t) and c_{ℓ′}(t), the cosine will always oscillate between −1 and +1 as a function of φ. Adjusting those phases, one can deliberately change the value of φ for which the density is maximum (or minimum), but this will always occur somewhere on the circle. Superposing two states necessarily modulates the density. Let us evaluate the consequences of this density modulation on the internal repulsive interaction energy of the fluid. As we did in relation (15), we use for the interaction

³ or not at all, if we suppose the functions χ_ℓ(r, z) and χ_{ℓ′}(r, z) to be equal.




energy the zero range potential approximation, and insert it in expression (15) of Complement CXV. Taking into account the normalization (17) of the wave function, we get:

    V_int(t) = (g/2) ∫ d³r |χ(r, t)|⁴ = (g/2) ∫ r dr ∫ dz ∫₀^{2π} dφ [ n(φ; t) ]²          (67)

We must now include the square of (65) in this expression, which will yield several terms. The first one, in |c_ℓ(t)|⁴, leads to the contribution:

    |c_ℓ(t)|⁴ V̄_int          (68)

where V̄_int is the interaction energy for the state χ_ℓ(r). The second contribution is the similar term for the state χ_{ℓ′}, and the third one a cross term in 2 |c_ℓ(t)|² |c_{ℓ′}(t)|². Assuming, to keep things simple, that the densities associated with the states χ_ℓ and χ_{ℓ′} are practically the same, the sum of these three terms is just:

    [ |c_ℓ(t)|² + |c_{ℓ′}(t)|² ]² V̄_int = V̄_int          (69)

Up to now, the superposition has had no effect on the repulsive internal interaction energy. As for the cross terms between the terms independent of φ in (65) and the terms in e^{±i(ℓ′−ℓ)φ}, they will cancel out when integrated over φ. We are then left with the cross terms in 2 |c_ℓ(t)|² |c_{ℓ′}(t)|², whose integral over φ yields:

    2 |c_ℓ(t)|² |c_{ℓ′}(t)|² (g/2) ∫ r dr ∫ dz ∫₀^{2π} dφ [ χ_ℓ(r, z) ]² [ χ_{ℓ′}(r, z) ]²          (70)

Assuming as before that the densities associated with the states χ_ℓ and χ_{ℓ′} are practically the same, we obtain, after integration over r and z:

    2 |c_ℓ(t)|² |c_{ℓ′}(t)|² V̄_int          (71)

Adding (69), we finally obtain:

    V_int(t) = [ 1 + 2 |c_ℓ(t)|² |c_{ℓ′}(t)|² ] V̄_int          (72)

We have shown that the density modulation associated with the superposition of states always increases the internal repulsion energy: this modulation does lower the energy in the low density region, but the increase in the high density region outweighs the decrease (since the repulsive energy is a quadratic function of the density). The internal energy therefore varies between V̄_int and the maximum (3/2) V̄_int, reached when the moduli of c_ℓ(t) and c_{ℓ′}(t) are both equal to 1/√2.
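A quick numerical confirmation of the factor 3/2 (an added illustration, not part of the original text): for a density n(φ) = n̄ [1 + 2|c_ℓ||c_{ℓ′}| cos φ] with |c_ℓ|² + |c_{ℓ′}|² = 1, the interaction energy, proportional to the integral of n², is multiplied by 1 + 2|c_ℓ|²|c_{ℓ′}|², which indeed reaches 3/2 when the two moduli are equal.

    import numpy as np

    phi = np.linspace(0.0, 2 * np.pi, 100001)

    def enhancement(p):
        """p = |c_l|^2 and 1-p = |c_l'|^2; returns V_int(t)/V_int_bar for the modulated density."""
        amp = 2 * np.sqrt(p * (1 - p))               # modulation amplitude 2|c_l||c_l'|
        n = 1.0 + amp * np.cos(phi)                  # uniform density set to 1
        return np.trapz(n**2, phi) / np.trapz(np.ones_like(phi), phi)

    for p in (1.0, 0.9, 0.75, 0.5):
        print(f"|c_l|^2={p:4.2f}  V_int/V_bar={enhancement(p):.4f}  formula={1 + 2*p*(1-p):.4f}")
    # The maximum 1.5 is reached for |c_l|^2 = |c_l'|^2 = 1/2, i.e. moduli equal to 1/sqrt(2).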

β. Other geometries, different relaxation channels

There are many other ways for the Gross-Pitaevskii wave function to go from one rotational state to another. We have limited ourselves to the simplest geometry to introduce the concept of energy barriers with minimal mathematics. The fluid could




transit, however, through more complex geometries, such as the frequently observed creation of a vortex on the wall, the little swirl we briefly talked about at the end of § 3-a. A vortex introduces a 2π phase shift around a singular line along which the wave function is zero. Once the vortex is created, and contrary to what was the case in (62), the velocity circulation along a loop going around the torus is no longer independent of its path: it will change by 2πħ/m depending on whether the vortex is included in the loop or not. Furthermore, as the vortex moves in the fluid from one wall to another, it can be shown that the proportion of fluid conserving the initial circulation decreases, while the proportion having a circulation whose quantum number ℓ differs by one unit increases. Consequently, this vortex motion progressively changes the rotational angular momentum. Once the vortex has vanished on the other wall, the final result is a decrease by one unit of the quantum number ℓ associated with the fluid rotation. The continuous passage of vortices from one wall to the other therefore yields another mechanism that allows the angular momentum of the fluid to decrease. The creation of a vortex, however, is necessarily accompanied by a non-uniform fluid density, described by the Gross-Pitaevskii equation (this density must be zero along the vortex core). As we have seen above, this leads to an increase in the average repulsive energy between the particles (the fluid elastic energy). This process thus also encounters an energy barrier (discussed in more detail in the conclusion). In other words, the creation and motion of vortices provide another "relaxation channel" for the fluid velocity, with its own energy barrier and associated relaxation time. Many other geometries can be imagined for changing the fluid flow. Each of them is associated with a potential barrier, and therefore a certain lifetime. The relaxation channel with the shortest lifetime will mainly determine the damping of the fluid velocity, which may take, in certain cases, an extraordinarily long time (dozens of years or more), hence the name "superfluid".

3-c. Critical velocity, metastable flow

For the sake of simplicity, we will use in our discussion the simple geometry of § 3-a. The transposition to other geometries involving, for example, the creation of vortices in the fluid would be straightforward. The main change would concern the height of the energy barrier⁴. With this simple geometry, the potential to be used in (60) is the sum of a repulsive potential g [χ_ℓ(r, z)]² and a kinetic energy of rotation around Oz, equal to ℓ²ħ²/2mr². We now show that, in a given state ℓ, these two contributions can be expressed as a function of two velocities. First, relation (61) yields the rotation velocity v_ℓ associated with state ℓ:

    v_ℓ = ℓħ / mr          (73)

and the rotational energy is simply written as:

    ℓ²ħ² / 2mr² = (1/2) m (v_ℓ)²          (74)

4 When several relaxation channels are present, the one associated with the lowest barrier mainly determines the time evolution.





As for the interaction term (term in g on the left-hand side), we can express it in a more convenient way, defining as before the numerical density n₀:

    n₀ = [ χ_ℓ(r, z) ]²          (75)

and using the definition (39) for the sound velocity c. It can then be written in a form similar to (74):

    g n₀ = m c²          (76)

The two velocities v_ℓ and c allow an easy comparison of the respective importance of the kinetic and potential energies in a state ℓ. We now compare the contributions of these two terms, either for states with a given ℓ, or for a superposition of states (63). To clarify the discussion and be able to draw a figure, we will use a continuous variable, defined as the average ⟨J_z⟩ of the component along Oz of the angular momentum:

    ⟨J_z⟩ = ℓħ |c_ℓ(t)|² + ℓ′ħ |c_{ℓ′}(t)|²          (77)

This expression varies continuously between ℓħ and ℓ′ħ when the relative weights of |c_ℓ(t)|² and |c_{ℓ′}(t)|² are changed while imposing relation (64); the continuous variable:

    ℓ̄ = ⟨J_z⟩ / ħ          (78)

allows making interpolations between the discrete integer values of ℓ. Using the normalization relation (64) of the wave function (63), we can express |c_ℓ(t)|² as a function of ℓ̄:

    ℓ̄ = (ℓ − ℓ′) |c_ℓ(t)|² + ℓ′          (79)

The variable ℓ̄ characterizes the modulus of each of the two components of the variational function (63). A second variable is needed to define the relative phase between these two components, which comes into play for example in (66). Instead of studying the time evolution of the fluid state vector inside this variational family, we shall simply give a qualitative argument, for several reasons. First of all, it is not easy to characterize precisely the coupling between the fluid and the environment by a Hamiltonian that can change the fluid rotational angular momentum (for example, the wall's irregularities may transfer energy and angular momentum from the fluid to the container). Furthermore, as the time-dependent Gross-Pitaevskii equation is nonlinear, its precise solutions are generally found numerically. This is why we shall only qualitatively discuss the effects of the potential barrier found in § 3-b. The higher this barrier, the more difficult it is for ℓ̄ to go from ℓ to ℓ′. Let us evaluate the variation of the average energy as a function of ℓ̄. For integer values of ℓ̄, relation (74) shows that the average rotational kinetic energy varies as the square of ℓ̄; in between, its value can be found by interpolation as in (77). As for the potential energy, we saw that a continuous variation of c_ℓ(t) and c_{ℓ′}(t) necessarily involves a coherent superposition, which has an energy cost and increases the repulsive potential interaction. In particular, this interaction energy is multiplied by the factor 3/2 when the moduli of c_ℓ(t) and c_{ℓ′}(t) are equal (i.e. when ℓ̄ is an integer plus 1/2). As a result, to the quadratic variation of the rotational kinetic energy, we must add an oscillating variation of the potential energy, minimum for all the integer values of ℓ̄, and maximum half-way between. The oscillation amplitude is given by:

    g n₀ / 2 = (1/2) m c²          (80)

Figure 3 shows three plots of the variation of the system energy as a function of the average value . The lowest one, shown as a dotted line, corresponds to a superposition of the state with the state = 1, for a very small value of the coupling constant (weak interactions, gas almost ideal). In this case and according to (39), the sound velocity is also very small and we are in the case . Comparing (74) and (80) then shows that the potential energy contribution is negligible compared to the variation of the rotational kinetic energy between the two states. As a result, the modulation on this dotted line is barely perceptible, and this curve presents a single minimum at = 0: whatever the initial rotational state, no potential barrier prevents the fluid rotational velocity from returning to zero (for example under the effect of the interactions with the irregularities of the walls containing the fluid). The other two curves in Figure 3 correspond to a much larger value of , hence, according to (39), to a much higher value of . There are now several values of for which is small compared to . The dashed line corresponds, as for the previous curve, to a superposition of the two states = 1 and = 1; the solid line (for the same value of ) to a superposition of = 3 and = 0, corresponding to the case where the system goes directly from the state = 3 to the rotational ground state in the torus, with = 0. It is obviously this last curve that presents the lowest energy barrier starting from = 3 (shown with a circle in the figure). This is normal since this is the curve that involves the largest variation in the kinetic energy, in a sense opposite to that of the potential energy variation. It is thus the direct transition from = 3 to = 0 that will determine the possibility for the system to relax towards a state of slower rotation. Let us again use (74) and (80) to compare the kinetic energy variation and the height of the repulsive potential barrier. All the states , with velocities much larger than , have a kinetic energy much bigger than the maximum value of the potential energy: no energy barrier can be formed. On the other hand, all the states with velocities much smaller than cannot lower their rotational state without going over a potential barrier. In between these two extreme cases, there exists (for a given ) a “critical” value corresponding to the onset of the barrier. It is associated with a “critical velocity” = } , of the order of the sound velocity , fixing the maximum value of for which this potential barrier exists. If the fluid rotational velocity in the torus is greater than , the liquid can slow down its rotation without going over an energy barrier, and dissipation occurs as in an ordinary viscous liquid – the fluid is said to be “normal”. If, however, the fluid velocity is less than the critical velocity, the physical system must necessarily go over a potential barrier (or more) to continuously tend towards = 0. As this barrier results from the repulsion between all the particles and their neighbors, it has a macroscopic value. In principle, any barrier can be overcome, be it by thermal excitation, or by the quantum tunnel effect. However the time needed for this passage may take a gigantic value. First of all, it is extremely unlikely for a thermal fluctuation to reach a macroscopic energy value. As for the tunnel effect, its transition probability decreases exponentially with the barrier height and becomes extremely low for a macroscopic object. 



Figure 3: Plots of the energy of a rotating repulsive boson system, in a coherent superposition of the state and the state , as a function of its average angular momentum , expressed in units of }. The lower dotted curve corresponds to the case where = 1 and the interaction constant is small (almost ideal gas). The potential energy is then negligible and the total energy presents a single minimum in = 0. Consequently, whatever the initial rotational state of the fluid, it will relax to a motionless state = 0 without having to go over any energy barrier, and its rotational kinetic energy will dissipate: it behaves as a normal fluid. The other two curves correspond to a much larger value of – therefore, according to (39) to a much higher value of . The dashed curve still corresponds to a superposition of the rotational states and = 1, and the solid line to the direct superposition of the state = 3 (shown with a circle in the figure) and the ground state = 0. The solid line curve presents the smallest barrier, hence determining the metastability of the current. The higher the coupling constant , the more states presenting a minimum in the potential energy appear. They correspond to flow velocities in the torus that are smaller than the critical velocity. To go from the rotational state = 1 to the motionless state = 0, the system must go over a macroscopic energy barrier, which only occurs with a probability so small it can be considered equal to zero. The rotational current is therefore permanent, lasting for years, and the system is said to be superfluid. On the other hand, the states with higher values of , for which the curve presents no minima, correspond to a normal fluid, whose rotation can slow down because of the viscosity (dissipation of the kinetic energy into heat).

Consequently, the relaxation times of the fluid velocity may become extraordinarily large and, on the human scale, the rotation can be considered to last indefinitely. This phenomenon is called "superfluidity".
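The qualitative argument above can be visualized with a small script (an added illustration, not from the original text) that reproduces the kind of curves shown in Figure 3: a kinetic energy interpolated between the two components of the superposition (63), plus an oscillating interaction cost drawn with the amplitude m c²/2 of (80). The barrier protecting the rotating state disappears when the rotation velocity v_ℓ exceeds a value of the order of the sound velocity c; the numerical values below are illustrative only.

    import numpy as np

    hbar, m, r = 1.0, 1.0, 1.0          # illustrative units; r = torus radius

    def energy(p, l, lp, c):
        """Average energy per particle for the superposition (63), with p = |c_l|^2 and 1-p = |c_l'|^2.
        Kinetic part: weighted average of the two rotational energies (74);
        potential part: schematic oscillating excess with the amplitude m*c^2/2 of (80)."""
        kinetic = (hbar**2 / (2 * m * r**2)) * (p * l**2 + (1 - p) * lp**2)
        barrier = 0.5 * m * c**2 * 4 * p * (1 - p)
        return kinetic + barrier

    def has_barrier(l, lp, c):
        """True if relaxing from state l to state l' requires passing over an energy maximum."""
        p = np.linspace(0.0, 1.0, 2001)
        e = energy(p, l, lp, c)
        return e.max() > e[-1] + 1e-12      # e[-1] is the energy of the initial pure state l

    for c in (0.1, 1.0, 5.0):
        flags = [f"l={l}->{l-1}: {'barrier' if has_barrier(l, l-1, c) else 'none'}" for l in (1, 2, 3, 5)]
        print(f"c = {c:4.1f} :", "  ".join(flags))
    # Slow rotation (v_l = l*hbar/(m*r) well below c) is protected by a barrier: superfluid flow.
    # Fast rotation (v_l well above c) has no barrier and can decay like a normal fluid.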

3-d. Generalization; topological aspects

Our argument remained qualitative for several reasons. To begin with, we showed the existence for the fluid of a critical velocity v_c, of the order of c, without giving its precise value. It would require a more detailed study of the potential curves, such as the ones plotted in Figure 3, to obtain the precise values of the parameters for which the potential barrier appears or disappears. We also limited ourselves to simple geometries that could be described by a single variable ℓ̄, not taking into account other possible




deformations of the wave function. Various situations could occur, such as the creation of vortices or more complex processes, which would require a more elaborate mathematical treatment. In other words, we would have to take into account the existence of other relaxation channels for the moving fluid to come to rest, and look for the one leading to the lowest potential barrier, thereby determining the lifetime of the superfluid current. There is, however, a more general way to address the problem, which shows that our basic conclusions are not limited to the particular case we have studied. It is based on the topological aspects of the wave function phase. When this phase varies by 2 as we go around the torus, it expresses a topological property characterized by the winding number , which is an integer and cannot vary continuously. This is why, as long as the phase is well defined everywhere – i.e. as long as the wave function does not go to zero – we cannot go continuously from to 1. We already saw this in the particular example of the wave function (63): when the modulus of ( ) varies in time from 1 to 0, while the modulus of ( ) does the opposite, we necessarily went through a situation where the wave function went to zero through interference, in a plane corresponding to a certain value of ; but the phase of the wave function is undetermined in this plane, and as we cross it, the phase undergoes a discontinuous jump. Now the canceling of the wave function of a great number of condensed bosons means the density must also be zero at that point, hence larger in other points of space. This spatial density variation introduces an energy increase, due to the finite compressibility of the fluid (as we saw in § 3-b, the energy increase in the high density regions is larger than the energy decrease in low density regions). This means there is an energy barrier opposing the change in the number of turns of the phase. The height of this barrier must now be compared with the kinetic energy variation. As seen above, there is a drastic change in the flow regime, depending on whether the fluid velocity is smaller or larger than a certain critical velocity . In the first case, superfluidity allows a current to flow without dissipation, lasting practically indefinitely. In the second, no energy consideration opposes dissipation, and the rotation slows down progressively, as in an ordinary liquid. The essential idea to remember is that superfluidity comes from the repulsive interactions, and for two reasons. First of all, they explain the presence of the energy barrier, responsible for the metastability. The second reason, even more essential, is that the repulsion between bosons constantly tends to put all the fluid particles in the same quantum state - see § 4-c of Complement CXV ; thanks to this property, we were able to characterize the intermediate rotational states by a very simple wave function (63). This implies that the quantum fluid can only occupy a very limited number of states, compared to a situation where the particles would be distinguishable. Consequently, it has a hard time dissipating its kinetic energy into heat, as a classical fluid would do, and it therefore maintains its rotation over such long times that a slowing down is practically impossible to observe.





Complement EXV
Fermion system, Hartree-Fock approximation

1. Foundation of the method
   1-a. Trial family and Hamiltonian
   1-b. Energy average value
   1-c. Optimization of the variational wave function
   1-d. Equivalent formulation for the average energy stationarity
   1-e. Variational energy
   1-f. Hartree-Fock equations
2. Generalization: operator method
   2-a. Average energy
   2-b. Optimization of the one-particle density operator
   2-c. Mean field operator
   2-d. Hartree-Fock equations for electrons
   2-e. Discussion

Introduction

Computing the energy levels of a system of N electrons, interacting with each other through the Coulomb force, and placed in an external potential V₁(r), is a very important problem in physics and chemistry. It is encountered in the determination of the energy levels of atoms (in which case the external potential for the electrons¹ is the Coulomb potential −Zq²/4πε₀r created by the nucleus), of molecules as well, of electrons in a solid (submitted to a periodic potential), in an aggregate or a nanocrystal, etc. It is a problem where two ingredients simultaneously play an essential role: the fermionic character of the electrons, which forbids them to occupy the same individual state, and the effects of their mutual interactions. Ignoring the Coulomb repulsion between electrons would make the calculation fairly simple, and similar to that of § 1 in Complement CXIV, concerning free fermions in a box; the free plane wave individual states would simply have to be replaced by the energy eigenstates of a single particle placed in the potential V₁(r). This would lead to a 3-dimensional Schrödinger equation, which can be solved with very good precision, although not necessarily analytically. However, be it in atoms or in solids, the repulsion between electrons plays an essential role. Neglecting it would lead us to conclude, for example, that, as Z increases, the size of atoms decreases due to the attractive effect of the nucleus, whereas the opposite occurs²! For interacting particles, even without taking the spin into account, an exact

¹ We assume the nucleus mass to be infinitely larger than the electron mass. The electronic system can then be studied assuming the nucleus fixed and placed at the origin.
² The Pauli exclusion principle is not sufficient to explain why an atom's size increases with its atomic number Z. One can evaluate the approximate size of a hypothetical atom with non-interacting electrons (we consider the atom's size to be given by the size of the outermost occupied orbit). The radius of each orbit varies approximately as 1/Z, whereas the highest value of the principal quantum number of the occupied states varies approximately as Z^(1/3); the size we are looking for therefore varies approximately as Z^(−1/3).




computation would require solving a Schrödinger equation in a 3 -dimensional space; this is clearly impossible when becomes large, even with the most powerful computer. Hence, approximation methods are needed, and the most common one is the HartreeFock method, which reduces the problem to solving a series of 3-dimensional equations. It will be explained in this complement for fermionic particles. The Hartree-Fock method is based on the variational approximation (Complement EXI ), where we choose a trial family of state vectors, and look for the one that minimizes the average energy. The chosen family is the set of all possible Fock states describing the system of fermions. We will introduce and compute the “self-consistent” mean field in which each electron moves; this mean field takes into account the repulsion due to the other electrons, hence justifying the central field method discussed in Complement AXIV . This method applies not only to the atom’s ground state but also to all its stationary states. It can also be generalized to many other systems such as molecules, for example, or to the study of the ground level and excited states of nuclei, which are protons and neutrons in bound systems. This complement presents the Hartree-Fock method in two steps, starting in § 1 with a simple approach in terms of wave functions, which is then generalized in § 2 by using Dirac notation and projector operators. The reader may choose to go through both steps or go directly to the second. In § 1, we deal with spinless particles, which allows discussing the basic physical ideas and introducing the mean field concept keeping the formalism simple. A more general point of view is exposed in § 2, to clarify a number of points and to introduce the concept of a one-particle (with or without spin) effective Hartree-Fock Hamiltonian. This Hamiltonian reduces the interactions with all the other particles to a mean field operator. More details on the Hartree-Fock methods, and in particular their relations with the Wick theorem, can be found in Chapters 7 and 8 of reference [5]. 1.

1. Foundation of the method

Let us first expose the foundation of the Hartree-Fock method in a simple case where the particles have no spin (or are all in the same individual spin state) so that no spin quantum number is needed to define their individual states, specified by their wave functions. We introduce the notation and define the trial family of the -particle state vectors. 1-a.

1-a. Trial family and Hamiltonian

We choose as the trial family for the state of the N-fermion system all the states that can be written as:

    |Ψ⟩ = a†_{θ₁} a†_{θ₂} ⋯ a†_{θ_N} |0⟩          (1)

where a†_{θ₁}, a†_{θ₂}, ..., a†_{θ_N} are the creation operators associated with a set of N normalized individual states |θ₁⟩, |θ₂⟩, ..., |θ_N⟩, all orthogonal to each other (and hence distinct). The state |Ψ⟩ is therefore normalized to 1. This set of individual states is, at the moment, arbitrary; it will be determined by the following variational calculation.




For spinless particles, the corresponding wave function Ψ(r₁, r₂, ..., r_N) can be written in the form of a Slater determinant (Chapter XIV, § C-3-c-β):

    Ψ(r₁, r₂, ..., r_N) = (1/√N!) | θ₁(r₁)   θ₂(r₁)   ⋯   θ_N(r₁)  |
                                   | θ₁(r₂)   θ₂(r₂)   ⋯   θ_N(r₂)  |
                                   |   ⋮         ⋮             ⋮    |
                                   | θ₁(r_N)  θ₂(r_N)  ⋯   θ_N(r_N) |          (2)
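As an added illustration (not part of the original text), the following sketch evaluates the Slater determinant (2) numerically for a few arbitrary orthonormal orbitals and checks its defining properties: the N-particle wave function changes sign when two particle positions are exchanged, and vanishes when two positions coincide.

    import numpy as np
    from math import factorial, sqrt, pi

    def theta(i, x):
        """Three orthonormal harmonic-oscillator orbitals, used only as an example basis."""
        h = [1.0, 2 * x, 4 * x**2 - 2][i]                 # Hermite polynomials H_0, H_1, H_2
        return h * np.exp(-x**2 / 2) / sqrt(2**i * factorial(i) * sqrt(pi))

    def slater(positions):
        """Equation (2): Psi(r_1,...,r_N) = (1/sqrt(N!)) det[ theta_j(r_i) ]."""
        N = len(positions)
        mat = np.array([[theta(j, ri) for j in range(N)] for ri in positions])
        return np.linalg.det(mat) / sqrt(factorial(N))

    r = [0.3, -1.1, 0.8]
    print("Psi(r1,r2,r3)          =", slater(r))
    print("Psi with r1,r2 swapped =", slater([r[1], r[0], r[2]]))   # opposite sign: antisymmetry
    print("Psi with r1 = r2       =", slater([r[0], r[0], r[2]]))   # vanishes: Pauli exclusion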

The system Hamiltonian is the sum of the kinetic energy, the one-body potential energy and the interaction energy:

    H = H₀ + V_ext + W_int          (3)

The first term, H₀, is the operator associated with the fermion kinetic energy, sum of the individual kinetic energies:

    H₀ = Σ_{q=1}^{N} (P_q)² / 2m          (4)

where m is the particle mass and P_q the momentum operator of particle q. The second term, V_ext, is the operator associated with their energy in an applied external potential V₁:

    V_ext = Σ_{q=1}^{N} V₁(R_q)          (5)

where R_q is the position operator of particle q. For electrons with charge q placed in the attractive Coulomb potential of a nucleus of charge −Zq positioned at the origin (Z is the nucleus atomic number), this potential is attractive and equal to:

    V₁(r) = − (Z q² / 4πε₀) (1/r)          (6)

where ε₀ is the vacuum permittivity. Finally, the term W_int corresponds to their mutual interaction energy:

    W_int = (1/2) Σ_{q ≠ q′} W₂(R_q, R_{q′})          (7)

For electrons, the function W₂ is given by the Coulomb repulsive interaction:

    W₂(r, r′) = (q² / 4πε₀) 1/|r − r′|          (8)

The expressions given above are just examples; as mentioned earlier, the Hartree-Fock method is not limited to the computation of the electronic energy levels in an atom.


1-b. Energy average value

Since state (1) is normalized, the average energy in this state is given by:

    ⟨H⟩ = ⟨Ψ| H |Ψ⟩          (9)

Let us evaluate successively the contributions of the three terms of (3), to obtain an expression which we will eventually vary.

α. Kinetic energy

Let us introduce a complete orthonormal basis of the one-particle state space by adding to the set of states ( = 1, 2, ..., ) other orthonormal states; the subscript now ranges from 1 to , dimension of this space ( may be infinite). We can then expand 0 as in relation (B-12) of Chapter XV: 0

P2 2

=

(10)

where the two summations over and range from 1 to the kinetic energy can then be written: 0

P2 2

=

0

2

1

1

. The average value in Ψ of

2

0

(11)

which contains the scalar product of the ket: 1

0 =

2

1

2

(12)

by the bra: 0

2

=

1

1

2

(13)

Note however that in the ket, the action of the annihilation operator a_{θ_l} yields zero unless it acts on a ket where the individual state |θ_l⟩ is already occupied; consequently, the result will be different from zero only if the state |θ_l⟩ is included in the list of the states |θ₁⟩, |θ₂⟩, ..., |θ_N⟩. Taking the Hermitian conjugate of (13), we see that the same must be true for the state |θ_k⟩, which must be included in the same list. Furthermore, if k ≠ l the resulting kets have different occupation numbers, and are thus orthogonal. The scalar product will therefore only differ from zero if k = l, in which case it is simply equal to 1. This can be shown by moving the state |θ_k⟩ to the front, both in the bra and in the ket; this will require two transpositions with two sign changes which cancel out, or none if the state was already in the front. Once the operators have acted, the bra and the ket correspond to exactly the same occupied states and their scalar product is 1. We finally get:

    ⟨H₀⟩ = Σ_{i=1}^{N} ⟨θ_i| P²/2m |θ_i⟩          (14)




Consequently, the average value of the kinetic energy is simply the sum of the average kinetic energies in each of the occupied states |θ_i⟩. For spinless particles, the kinetic energy operator is actually a differential operator −(ħ²/2m)Δ acting on the individual wave functions. We therefore get:

    ⟨H₀⟩ = −(ħ²/2m) Σ_{i=1}^{N} ∫ d³r θ_i*(r) Δ θ_i(r)          (15)

β. Potential energy

As the potential energy V₁ is also a one-particle operator, its average value can be computed in a similar way. We obtain:

    ⟨V_ext⟩ = Σ_{i=1}^{N} ⟨θ_i| V₁(R) |θ_i⟩          (16)

that is, for spinless particles:

    ⟨V_ext⟩ = Σ_{i=1}^{N} ∫ d³r V₁(r) |θ_i(r)|²          (17)

As before, the result is simply the sum of the average values associated with the individual occupied states.

γ. Interaction energy

The average value of the interaction energy W₂ in the state |Ψ⟩ has already been computed in § C-5 of Chapter XV. We just have to replace, in the relations (C-28) as well as (C-32) to (C-34) of that chapter, the populations by 1 for all the occupied states |θ_i⟩ and by zero for the others, and to rename the wave functions as θ_i(r). We then get:

    ⟨W_int⟩ = ⟨Ψ| W_int |Ψ⟩ = (1/2) Σ_{i,j=1}^{N} ∫ d³r ∫ d³r′ W₂(r, r′) [ |θ_i(r)|² |θ_j(r′)|² − θ_i*(r) θ_j*(r′) θ_i(r′) θ_j(r) ]          (18)

We have left out the condition i ≠ j, no longer useful since the i = j terms are zero. The second line of this equation contains the sum of the direct and the exchange terms. The result can be written in a more concise way by introducing the projector P_N over the subspace spanned by the N kets |θ_i⟩:

    P_N = Σ_{i=1}^{N} |θ_i⟩⟨θ_i|          (19)

Its matrix elements are:

    ⟨r| P_N |r′⟩ = Σ_{i=1}^{N} θ_i(r) θ_i*(r′)          (20)




This leads to:

    ⟨W_int⟩ = (1/2) ∫ d³r ∫ d³r′ W₂(r, r′) [ ⟨r| P_N |r⟩ ⟨r′| P_N |r′⟩ − ⟨r| P_N |r′⟩ ⟨r′| P_N |r⟩ ]          (21)
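The direct and exchange structure of (18) and (21) is easy to evaluate numerically. The sketch below (added here as an illustration, not part of the original text) takes a few orthonormal orbitals on a one-dimensional grid, together with an arbitrary model interaction W₂, and computes the direct and exchange contributions to ⟨W_int⟩; all choices (orbitals, interaction, parameters) are illustrative only.

    import numpy as np

    x = np.linspace(-8, 8, 401)
    dx = x[1] - x[0]

    def orbital(i):
        """A few orthonormal orbitals (lowest harmonic-oscillator functions), as an example."""
        h = [np.ones_like(x), 2 * x, 4 * x**2 - 2][i]
        psi = h * np.exp(-x**2 / 2)
        return psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)

    orbs = [orbital(i) for i in range(3)]                       # N = 3 occupied states
    W2 = 1.0 / (np.abs(x[:, None] - x[None, :]) + 0.5)          # model repulsive interaction

    direct = exchange = 0.0
    for ti in orbs:
        for tj in orbs:
            ni, nj = np.abs(ti)**2, np.abs(tj)**2
            direct += 0.5 * np.sum(ni[:, None] * W2 * nj[None, :]) * dx * dx
            cross = ti * tj                                      # real orbitals here
            exchange += 0.5 * np.sum(cross[:, None] * W2 * cross[None, :]) * dx * dx

    print("direct term   =", direct)
    print("exchange term =", exchange)
    print("<W_int>       =", direct - exchange)                  # equation (18): direct minus exchange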

Comment: The matrix elements of are actually equal to the spatial non-diagonal correlation function 1 (r r ), which will be defined in Chapter XVI (§ B-3-a). This correlation function can be expressed as the average value of the product of field operators Ψ(r): 1 (r

r ) = Ψ (r)Ψ(r )

For a system of 1 (r

r)=

=

fermions in the states 1

1

(22) 1

Ψ (r)Ψ(r ) 1 (r) (r )

2

2

,

, ..,

2

, we can write:

2 1

=

2

=1

(r) (r )

(23)

Inserting this relation in (18) we get: =

int

1 2

d3

d3

2 (r

r)

1 (r

1 (r

r)

r)

1 (r

r)

1 (r

r)

(24)

Comparison with relation (C-28) of Chapter XV, which gives the same average value, shows that the right-hand side bracket contains the two-particle correlation function 2 (r r ). For a Fock state, this function can therefore be simply expressed as two products of one-particle correlation functions at two points: 2 (r

1-c.

r)=

1 (r

r)

1 (r

r)

1 (r

r)

1 (r

r)

(25)

1-c. Optimization of the variational wave function

We now vary Ψ to determine the conditions leading to a stationary value of the total energy : =

0

+

+

ext

(26)

int

where the three terms in this summation are given by (15), (16) and (18). Let us vary one of the kets , being arbitrarily chosen between 1 and : +

(27)

or, in terms of an individual wave function: (r)

(r) +

(r)

(28)

This will yield the following variations: 0

1682

=

}2 2

d3

[

(r) ∆ (r) +

(r) ∆

(r)]

(29)




and: ext

d3

=

1 (r) [

(r)

(r) +

(r)

(r)]

(30)

As for the variation of int , we must take from (18) two contributions: the first one from the terms = , and the other from the terms = . These contributions are actually equal as they only differ by the choice of a dummy subscript. The factor 1 2 disappears and we get: int

d3

=

d3

2 (r

r)

(r) (r) +

(r)

(r)

(r )

2

=1

(r) (r ) (r ) (r)

(r) (r )

(r ) (r)

(31)

The variation of is simply the sum of (29), (30) and (31). We now consider variations , which can be written as: (r) =

(r)

with

(32)

(where is a first order infinitely small parameter). These variations are proportional to the wave function of one of the non-occupied states, which was added to the occupied states to form a complete orthonormal basis; the phase is an arbitrary parameter. Such a variation does not change, to first order, either the norm of , or its scalar product with all the occupied states ; it therefore leaves unchanged our assumption that the occupied states basis is orthonormal. The first order variation of the energy is obtained by inserting and its complex conjugate into (29), (30) and (31); we then get terms in in the first case, and terms in in the second. For to be stationary, its variation must be zero to first order for any value of ; now the sum of a term in and another in will be zero for any value of only if both terms are zero. It follows that we can impose to be zero (stationary condition) considering the variations of and to be independent. Keeping only the terms in , we obtain the stationary condition of the variational energy: d3

(r)

}2 2

∆ (r) +

1 (r)

(r) + (33)

+

d3

2 (r

r)

(r)

(r )

2

(r ) (r ) (r)

=0

=1

or, taking (20) into account: d3

(r) +

}2 2 d3

∆ (r) +

1 (r)

(r)+ (34)

2 (r r ) [ r

r

(r)

r

r

(r )]

=0

This relation can also be written as:: d3

(r)

[ (r)] = 0

(35) 1683




where the integro-differential operator (r): [ (r)] =

}2 2

∆+

1 (r)

+

d3

is defined by its action on an arbitrary function

2 (r

d3

r) r

r

2 (r

r) r

(r) r

(r )

(36)

This operator depends on the diagonal r r and non-diagonal r r spatial correlation functions associated with the set of states occupied by the fermions. Relation (35) thus shows that the action of the differential operator on the function (r) yields a function orthogonal to all the functions (r) for . This means that the function [ (r)] only has components on the wave functions of the occupied states: it is a linear combination of these functions. Consequently, for the energy to be stationary there is a simple condition: the invariance under the action of the integro-differential operator of the -dimensional vector space F , spanned by all the linear combinations of the functions (r) with = 1 2 . Comment: One could wonder why we limited ourselves to the variations written in (32), proportional to non-occupied individual states. The reason will become clearer in § 2, where we use a more general method that shows directly which variations of each individual states are really useful to consider (see in particular the discussion at the end of § 2-a). For now, it can be noted that choosing a variation proportional to the same wave function (r) would simply change its norm or phase, and therefore have no impact on the associated quantum state (in addition, a change of norm would not be compatible with our hypotheses, as in the computation of the average values we always assumed the individual states to remain normalized). If the state does not change, the energy must remain constant and writing a stationary condition is pointless. Similarly, to give (r) a variation proportional to another occupied wave function (r) (where is included between 1 and ) is just as useless, as we now show. In this operation, the creation operator acquires a component on (Chapter XV, § A-6), but the state vector expression (1) remains unchanged. The state vector thus acquire a component including the square of a creation operator, which is zero for fermions. Consequently, the stationarity of the energy is automatically ensured in this case. 1-d.

1-d. Equivalent formulation for the average energy stationarity

Operator can be diagonalized in the subspace F , as can be shown3 from its definition (36) – a more direct demonstration will be given in § 2. We call (r) its 3 As any Hermitian operator can be diagonalized, we simply show that (36) leads to matrix elements 3 obeying the Hermitian conjugation relation. Let us verify that the two integrals [ 2 (r)] 1 (r) 3 and [ 1 (r)] are complex conjugates of each other. For the contributions to these matrix 2 (r) elements of the kinetic and potential (in 1 ) energy, we simply find the usual relations insuring the corresponding operators are Hermitian. As for the interaction term, the complex conjugation is obvious for the direct term; for the exchange term, a simple inversion of the integral variables 3 and 3 , plus the fact that 2 (r r ) is equal to 2 (r r) allows verifying the conjugation.





eigenfunctions. These functions (r) are linear combinations of the (r) corresponding to the states appearing in the trial ket (1), and therefore lead to the same -particle state, because of the antisymmetrization4 . The basis change from the (r) to the (r) has no effect on the projector onto to the subspace , whose matrix elements appearing in (36) can be expressed in a way similar to those in (20): r =

r

(r)

(r )

(37)

=1

Consequently, the eigenfunctions of the operator }2 2

∆+

1 (r)

d3

+

2 (r

r)

(r )

obey the equations: 2

(r)

=1 3

d

2 (r

(38)

r)

(r )

(r)

(r ) =

(r)

=1

where are the associated eigenvalues. These relations are called the “Hartree-Fock equations”. For the average total energy associated with a state such as (1) to be stationary, it is therefore necessary for this state to be built from individual states whose orthogonal wave functions 1 , 2 , .. , are solutions of the Hartree-Fock equations (38) with = 1, 2, .. , . Conversely, this condition is sufficient since, replacing the (r) by solutions (r) of the Hartree-Fock equations in the energy variation (34) yields the result: d3

(r)

(r)

which is zero for all to the solutions 1-e.

(39)

(r) variations, since, according to (32), they must be orthogonal (r). Conditions (38) are thus equivalent to energy stationarity.

1-e. Variational energy

Assume we found a series of solutions for the Hartree-Fock equations, i.e. a set of eigenfunctions (r) with the associated eigenvalues . We still have to compute the minimal variational energy of the -particle system. This energy is given by the sum (26) of the three terms of kinetic, potential and interaction energies obtained by replacing in (15), (16) and (18) the (r) by the eigenfunctions (r): =

0

+

ext

+

int

(40)

4 A determinant value does not change if one adds to one of its column a linear combination of the others. Hence we can add to the first column of the Slater determinant (2) the linear combination of the 2 (r), 3 (r), ... that makes it proportional to 1 (r). One can then add to the second column the combination that makes it proportional to 2 (r), etc. Step by step, we end up with a new expression for the original wave function Ψ(r1 r2 r ), which now involves the Slater determinant of the (r). It is thus proportional to this determinant. A demonstration of the strict equality (within a phase factor) will be given in § 2.





(the subscripts indicate we are dealing with the average energies after the HartreeFock optimization, which minimizes the variational energy). Intuitively, one could expect this total energy to be simply the sum of the energies , but, as we are going to show, this is not the case. Multiplying the left-hand side of equation (38) by (r) and after integration over d3 , we get: d3

=

}2

(r)

2

d3

+

2 (r

∆+

1 (r)

r)

(r)

(r )

2

(r)

(r )

(r )

(r)

(41)

=1

We then take a summation over the subscript , and use (15), (16) and (18), the replaced by the : =

+

0

+2

ext

being

(42)

int

=1

This expression does not yield the stationary value of the total energy, but rather a sum where the particle interaction energy is counted twice. From a physical point of view, it is clear that if each particle energy is computed taking into account its interaction with all the others, and if we then add all these energies, we get an expression that includes twice the interaction energy associated with each pair of particles. The sum of the does contain, however, useful information that enables us to avoid computing the interaction energy contribution to the variational energy. Eliminating int between (40) and (42), we get: =

1 2

+

+

0

(43)

ext

=1

where the interaction energy is no longer present. One can then compute

0

and

using the solutions of the Hartree-Fock equations (38), without worrying about

ext

the interaction energy. Using (15) and (17) in this relation, we can write the total energy as: =

1 2

d3

+ =1

(r )

2

1 (r

)

=1

}2 2

d3

(r ) ∆

(r )

=1

(44) The total energy is thus half the sum of the of the one-body average potential energy. 1-f.

, of the average kinetic energy, and finally

Hartree-Fock equations

Equation (38) may be written as: }2 ∆+ 2 1686

1 (r)

+

dir

(r)

(r)

d3

ex

(r r )

(r ) =

(r)

(45)

• dir

where the direct dir

(r) and exchange

d3

(r) =

(r )

2

2 (r

=1 ex

(r r ) =

ex


(r r ) potentials are defined as:

r) (46)

(r )

(r)

2 (r r )

=1

Note that the terms = coming from the two potentials cancel each other; hence they can be eliminated from the two summations, without changing the final result. The contribution of the direct potential is sometimes called the “Hartree term”, and the contribution of the exchange potential, the “Fock term”. The first is easy to understand: with the exception of the term = , it corresponds to the interaction of a particle at point r with all the others at points r , averaged for each of them by its density 2 distribution (r ) . As for the exchange potential, and in spite of its name, this term is not, strictly speaking, a potential; it is not diagonal in the position representation, even though it basically comes from a particle interaction which is diagonal in that representation. This peculiar non-diagonal form actually comes from the combination of the fermion antisymmetrization and the variational approximation. This exchange potential is homogeneous to a potential divided by the cube of a length. It is obviously a Hermitian operator as it is derived from a potential 2 (r r ) which is real and symmetric with respect to r and r . A more intuitive and simplified version of these equations was suggested by Hartree, in which the exchange potentials are ignored in (45). Without the integral term, these equations become very similar to a series of Schrödinger equations for independent particles, each of them moving in the mean potential created by all the others (still with the exception of the term = in the summation). Including the Fock term should, however, lead to more precise calculations. Using for the potentials their expressions (46), the Hartree-Fock equations (45) become a set of coupled equations. They are nonlinear, since the direct and exchange potentials depend on the functions (r). Even though they look like linear eigenvalue equations with eigenfunctions (r) as solutions, a linear resolution would actually require knowing in advance the solutions, since these functions also appear in the potentials (46). The term “self-consistent” is used to characterize this type of situation and the solutions (r) it leads to. There are no general analytical methods to solve nonlinear self-consistent equations of this type, even in their simplified Hartree version, and numerical methods using successive approximations are commonly used. We start from a series of plausible functions (0) (r), and compute with (46) the associated potentials. Considering them to be fixed, we obtain linear eigenvalue equations which can be solved quite readily with computers (the single very complicated equation in a 3 -dimensional space has been replaced by independent 3-dimensional equations); we have to diagonalize a Hermitian operator to get a new series of orthonormal functions, resulting from the first iteration, and called (1) (1) (1) (r) and . The second iteration starts from these (r), to compute the new potential values, and get new linear differential equations. Solving these equations yields (2) (2) ( ) the next order (r), , etc. After a few iterations, one expects the (r) and ( ) to vary only slightly with the iteration order ( ), in which case the Hartree-Fock 1687




equations have been solved to a good approximation. Using (44) we can then compute the energy we were looking for. It is also possible that physical arguments can help us choose directly adequate trial functions (r) without any iteration. Inserting them in (44) then directly provides the energy. Comments: (i) The solutions of the Hartree-Fock equations may not be unique. Using the iteration process described above, one can easily wind up with different solutions, depending on (0) the initial choice for the (r) functions. This multiplicity of solutions is actually one of the method’s advantages, as it can help us find not only the ground level but also the excited levels. (ii) As we shall see in § 2, taking into account the 1 2 spin of the electrons in an atom does not bring major complications to the Hartree-Fock equations. It is generally assumed that the one-body potential is diagonal in a basis of the two spin states, labeled + and , and that the interaction potential does not act on the spins. We then simply assemble + (r) associated with the spin + particles, with + equations, for + wave functions other equations, for wave functions (r) associated with spin particles. These two sets of equations are not independent, since they contain the same direct potential (computed using (46), whose first line includes a summation over of all the = + + wave functions). As for the exchange potential, it does not lead to any coupling between the two sets of equations: in the second line of (46), the summation over only includes particles in the same spin state for the following reason. If the particles have opposite spins, they can be recognized by the direction of their spin (the interaction does not act on the spins), and they no longer behave as indistinguishable particles. The exchange effects only arise for particles having the same spin.
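To make the self-consistent iteration described above concrete, here is a minimal sketch (added for illustration, not part of the original text) of the successive-approximation scheme for the simplified Hartree version of the equations, in which the exchange potential is dropped: starting from trial orbitals, one builds the direct (Hartree) potential of (46), diagonalizes the resulting one-particle Hamiltonian on a grid, refills the N lowest orbitals, and repeats until the orbitals no longer change. The one-dimensional setting, the softened interaction and all parameter values are arbitrary illustrative choices.

    import numpy as np

    # 1D grid, external potential and model interaction (illustrative choices).
    x = np.linspace(-10, 10, 301)
    dx = x[1] - x[0]
    N = 2                                                # number of occupied orbitals
    V1 = 0.5 * x**2                                      # external one-body potential
    W2 = 1.0 / (np.abs(x[:, None] - x[None, :]) + 1.0)   # softened repulsive interaction

    # Kinetic energy operator (finite differences), with hbar = m = 1.
    T = (np.diag(np.full(len(x), 2.0)) - np.eye(len(x), k=1) - np.eye(len(x), k=-1)) / (2 * dx**2)

    def hartree_potential(orbitals):
        """Direct (Hartree) term of (46); the exchange term is ignored in this sketch."""
        density = sum(np.abs(phi)**2 for phi in orbitals)
        return W2 @ density * dx

    orbitals = [np.exp(-(x - x0)**2) for x0 in (-1.0, 1.0)]          # initial guesses
    orbitals = [phi / np.sqrt(np.sum(phi**2) * dx) for phi in orbitals]

    for iteration in range(50):
        H = T + np.diag(V1 + hartree_potential(orbitals))            # potentials held fixed
        energies, vectors = np.linalg.eigh(H)                        # linear eigenvalue problem
        new = [vectors[:, n] / np.sqrt(dx) for n in range(N)]        # refill the N lowest orbitals
        change = max(np.max(np.abs(np.abs(a) - np.abs(b))) for a, b in zip(new, orbitals))
        orbitals = new
        if change < 1e-8:
            break

    print("converged after", iteration + 1, "iterations")
    print("occupied one-particle energies:", energies[:N])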

2.

Generalization: operator method

We now describe the method in a more general way, using an operator method that leads to more concise expressions, while taking into account explicitly the possible existence of a spin – which plays an essential role in the atomic structure. We will identify more precisely the mathematical object, actually a projector, which we vary to optimize the energy. Physically, this projector is simply the one-particle density operator defined in § B-4 of Chapter XV. This will lead to expressions both more compact and general for the Hartree-Fock equations. They contain a Hartree-Fock operator acting on a single particle, as if it were alone, but which includes a potential operator defined by a partial trace which reflects the interactions with the other particles in the mean field approximation. Thanks to this operator we can get an approximate value of the entire system energy, computing only individual energies; these energies are obtained with calculations similar to the one used for a single particle placed in a mean field. With this approach, we have a better understanding of the way the mean field approximately represents the interaction with all the other particles; this approach can also suggest ways to make the approximations more precise. We assume as before that the -particle variational ket Ψ is written as: Ψ = 1688

1

2

0

(47)



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

This ket is derived from individual orthonormal kets , but these kets can now describe particles having an arbitrary spin. Consider the orthonormal basis of the one-particle state space, in which the set of ( = 1, 2, ... ) was completed by other orthonormal states. The projector onto the subspace is the sum of the projections onto the first kets : =

(48) =1

This is simply the one-particle density operator defined in § B-4 of Chapter XV (normalized by a trace equal to the particle number and not to one), as we now show. Relation (B-24) of that chapter can be written in the basis: =

1

(49)

where the average value is taken in the quantum state (47). In this kind of Fock state, the average value is different from zero only when the creation operator reconstructs the population destroyed by the annihilation operator, hence if = , in which case it is equal to the population of the individual states . In the variational ket (47), all the populations are zero except for the first states ( = 1, 2, ... ), where they are equal to one. Consequently, the one-particle density operator is represented by a matrix, diagonal in the basis , and whose first elements on the diagonal are all equal to one. It is indeed the matrix associated with the projector , and we can write: 1

=

(50)

As we shall see, all the average values useful in our calculation can be simply expressed as a function of this operator. 2-a.

Average energy

We now evaluate the different terms included in the average energy, starting with the terms containing one-particle operators. .

Kinetic and external potential energy Using relation (B-12) of Chapter XV, we obtain for the average kinetic energy 0

:

0

P2 2

=

(51)

The same argument as that for the evaluation of the matrix elements (49) shows that the average value in the state (47) is only different from zero if = ; in that case, it is equal to one when , and to zero otherwise. This leads to: 0

= =1

P2 2

= Tr1

P2 2

(52) 1689



COMPLEMENT EXV

The subscript 1 was added to the trace to underline the fact that this trace is taken in the one-particle state space and not in the Fock space. The two operators included in the trace only act on that same particle, numbered arbitrarily 1; the subscript 1 could obviously be replaced by the subscript of any other particle, since they all play the same role. The average potential energy coming from the external potential is computed in a similar way and can be written as:

ext

=

1

= Tr1

(53)

1

=1

.

Average interaction energy, Hartree-Fock potential operator

The average interaction energy int can be computed using the general expression (C-16) of Chapter XV for any two-particle operator, which yields: int

=

1 2

1:

;2 :

2 (1

2) 1 :

;2 :

(54)

For the average value in the Fock state Ψ to be different from zero, the operator must leave unchanged the populations of the individual states and . As in § C-5-b of Chapter XV, two possibilities may occur: either = and = (the direct term), or = and = (the exchange term). Commuting some of the operators, we can write: =

+

=[

]

(55)

where and are the respective populations of the states and . Now these populations are different from zero only if the subscripts and are between 1 and , in which case they are equal to 1 (note also that we must have = to avoid a zero result). We finally get5 :

int

=

1 2

1:

;2 :

2 (1

2) 1 :

;2 :

=

1:

;2 :

2 (1

2) 1 :

;2 :

(56)

(the constraint = may be ignored since the right-hand side is equal to zero in this case). Here again, the subscripts 1 and 2 label two arbitrary, but different particles, that could have been labeled arbitrarily. We can therefore write:

int

=

1 2

1:

;2 :

2 (1

2) [1

ex (1

2)] 1 :

;2 :

(57)

=1

where ex (1 2) is the exchange operator between particle 1 and 2 (the transposition which permutes them). This result can be written in a way similar to (53) by introducing a 5 As

1690

in the previous complement, we have replaced

2 (R1

R2 ) by

2 (1

2) to simplify the notation



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

“Hartree-Fock potential” , similar to an external potential acting in the space of particle 1; this potential is defined as the operator having the matrix elements: (1)

=

1:

;2 :

2 (1

2) [1

ex (1

2)] 1 :

;2 :

(58)

=1

This operator is Hermitian, since, as the two operators commute, we can write: (1)

=

1:

;2 :

1:

;2 :

2 (1

2) [1

ex

and

2

are Hermitian and

ex (1

2)] 1 :

;2 :

2 (1

2) 1 :

;2 :

=1

=

[1

ex (1

2)]

=1

=

(1)

(59)

Furthermore, we recognize in (58) the matrix element of a partial trace on particle 2 (Complement III , § 5-b): (1) = Tr2

(2)

2 (1

2) [1

ex (1

2)]

(60)

where the projector has been introduced inside the trace to limit the sum over to its first terms, as in (57). The one-particle operator (1) is thus the partial trace over a second particle (with the arbitrary label 2) of a product of operators acting on both particles. As the summation over is now taken into account, we are left in (57) with a summation over , which introduces a trace over the remaining particle 1, and we get: int

=

1 Tr1 2

(1)

(1)

(61)

This average value depends on the subspace chosen with the variational ket Ψ in two ways: explicitly as above, via the projector (1) that shows up in the average value (61), but also implicitly via the definition of the Hartree-Fock potential in (60). .

Role of the one-particle reduced density operator

All the average values can be expressed in terms of the projector onto the subspace of the space the individual states spanned by the individual states 1 , , which means, according to (50), in terms of the one-particle reduced density 2 , .... operator 1 = . Hence it is this operator that is the pertinent variable to optimize rather than the set of individual states: certain variations of those states do not change , and are meaningless for our purpose. Furthermore, the choice of the trial ket Ψ is equivalent to that of . In other words, the variational ket Ψ built in (1) does not depend on the basis chosen in the subspace : if we choose in this subspace any orthonormal basis other than the basis, and if we replace in (1) the by the , the ket will remain the same 1691



COMPLEMENT EXV

(to within a non-relevant phase factor) as we now show. As seen in § A-6 of Chapter XV, each operator is a linear combination of the , so that in the product of all the ( = 1, 2, .. ) we will find products of operators . Relation (A-43) of Chapter XV however indicates that the squares of any creation operators are zero, which means that the only non-zero products are those including once and only once each of the different operators . Each term is then proportional to the ket Ψ built from the . Consequently, the two variational kets built from the two bases are necessarily proportional. As definition (1) ensures they are also normalized, they can only differ by a phase factor, which means they are equivalent from a physical point of view. It is thus the operator = 1 that best embodies the trial ket Ψ . 2-b.

Optimization of the one-particle density operator

We now vary =

+

0

1

= +

1

int

to look for the stationary conditions for the total energy: P2 + 2

= Tr1

1

+

1 2

(62)

We therefore consider the variation: +

(63)

which leads to the following variations for the average values of the one-particle operators: 0

+

1

P2 + 2

= Tr1

(64)

1

As for the interaction energy, we get two terms: =

int

1 Tr1 2

1 + Tr1 2

(65)

which are actually equal since: Tr1

(1)

(1) = Tr1 2

(1)

(2)

2 (1

2) [1

ex (1

2)]

(66)

and we recognize in the right-hand side of this expression the trace: Tr2

(2)

(2)

(67)

As we can change the label of the particle from 2 to 1 without changing the trace, the two terms of the interaction energy are equal. As a result, we end up with the energy variation: P2 + 2

= Tr1

1

+

To vary the projector 0

1692

0

+

(68) , we choose a value

0

of

and make the change: (69)



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

where is any ket from the space of individual states, and any real number; no other individual state vector varies except for 0 . The variation of is then written as: =

+

0

(70)

0

We assume has no components on any , that is no components in , since this would change neither , nor the corresponding projector . We therefore impose: =0

(71)

which also implies that the norm of (70) into (68), we obtain: =

Tr1

P2 + 2

1

remains constant6 to first order in

0

+

+

. Inserting

0

Tr1

P2 + 2

1

+

(72)

0

For the energy to be stationary, this variation must remain zero whatever the choice of the arbitrary number . Now the linear combination of two exponentials and will remain zero for any value of only if the two factors in front of the exponentials are zero themselves. As each term can be made equal to zero separately, we obtain: P2 + 2

0 = Tr1

1

+

P2 + 2

=

0

1

+

(73)

0

This relation must be satisfied for any ket orthogonal to the subspace This means that if we define the one-particle Hartree-Fock operator as: =

P2 + 2

1

+

(74)

the stationary condition for the total energy is simply that the ket to :

As this relation must hold for any 0 chosen among the 1 , 2 , .... that the subspace is stable under the action of the operator (74).

¯

=

6 Since 0

must belong

, it follows

Mean field operator

We can then restrict the operator

(

0

(75)

0

2-c.

.

+

(1)

P2 + 2

(71) shows that )( 0 + )=

1 (1)

0

+

(1)

to that subspace: (1)

is orthogonal to any linear combinations of the = 1+ second order terms. 0 +

(76) , we can write

1693

COMPLEMENT EXV



This operator, acting in the subspace spanned by the kets , is a Hermitian linear operator, hence it can be diagonalized. We call its eigenvectors ( = 1, 2, .. ), which are linear combinations of the kets . The stationary condition for the energy (75) amounts to imposing the to be not only eigenvectors of ¯ , but also of the operator defined by (74) in the entire one-particle state space (without the restriction to ); consequently, the must obey: (77)

=

Operator is defined in (74), where the operator is given by (60) and depends on the projector . This last operator may be expressed as a function of the in the same way as with the , and relation (48) may be replaced by: =

(78) =1

Relations (77), together with definition (60) where (78) has been inserted, are a set of equations allowing the self-consistent determination of the ; they are called the Hartree-Fock equations. This operator form (77) is simpler than the one obtained in § 1-c; it emphasizes the similarity with the usual eigenvalue equation for a single particle moving in an external potential, illustrating the concept of a self-consistent mean field. One must keep in mind, however, that via the projector (78) included in , this particle moves in a potential depending on the whole set of states occupied by all the particles. Remember also that we did not carry out an exact computation, but merely presented an approximate theory (variational method). The discussion in § 1-f is still relevant. As the operator depends on the , the Hartree-Fock equations have an intrinsic nonlinear character, which generally requires a resolution by successive approximations. We start from a set of individual states 0 to build a first value of and the operator , which are used to compute the Hamiltonian (74). Considering this Hamiltonian now fixed, the Hartree-Fock equations (77) become linear, and can be solved as usual eigenvalue equations. This leads to new values 1 for the , and finishes the first iteration. In the second iteration, we use the 1 in (78) to compute a new value of the mean field operator ; considering again this operator as fixed, we solve the eigenvalue equation and obtain the second iteration values 2 for the , and so on. If the initial values 0 are physically reasonable, one can hope for a rapid convergence towards the expected solution of the nonlinear Hartree-Fock equations. The variational energy can be computed in the same way as in § 1-e. Multiplying on the left equation (77) by the bra , we get: P2 + 2

=

1

+

(79)

After summing over the subscript , we obtain: = =1

1694

=1

P2 + 2

1

+

= Tr1

(1)

P2 + 2

1

+

(80)



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

Taking into account (51), (53), and (61), we get: =

+

0

+2

1

(81)

int

=1

where the particle interaction energy is counted twice. To compute the energy eliminate int between (26) and this relation and we finally obtain:

=

1 2

2-d.

+

+

0

, we can

1

(82)

=1

Hartree-Fock equations for electrons

Assume the fermions we are studying are particles with spin 1 2, electrons for example. The basis r of the individual states used in § 1 must be replaced by the basis formed with the kets r , where is the spin index, which can take 2 distinct values noted 1 2, or more simply . To the summation over d3 we must now add a summation over the 2 values of the index spin . A vector in the individual state space is now written: d3

=

(r

) r

(83)

= 1 2

with: (r

)= r

(84)

The variables r and play a similar role but the first one is continuous whereas the second is discrete. Writing them in the same parenthesis might hide this difference, and we often prefer noting the discrete index as a superscript of the function , and write: (r) = r

(85)

Let us build an particle variational state Ψ from orthonormal states , with = 1, 2, .. , . Each of the describes an individual state including the spin and position variables; the first + values of ( =1 2 + ) are equal to +1 2, the last are equal to 1 2, with + + = (we assume + and are fixed for the moment but we may allow them to vary later to enlarge the variational family). In the space of the individual states, we introduce a complete basis whose first kets are the , but where the subscript varies from 1 to infinity7 . We assume the matrix elements of the external potential 1 to be diagonal for ; these two diagonal matrix elements can however take different values 1 (r), which allows including the eventual presence of a magnetic field coupled with the spins. We also assume the particle interaction 2 (1 2) to be independent of the spins, and diagonal in the position representation of the two particles, as is the case, for example, for the Coulomb 7 The subscript determines both the orbital and the spin state of the particle; the index independent since it is fixed for each value of .

is not

1695

COMPLEMENT EXV



interaction between electrons. With these assumptions, the Hamiltonian cannot couple states having different particle numbers + and . Let us see what the general Hartree-Fock equations become in the r representation. In this representation, the effect of the kinetic and potential operators are well known. We just have to compute the effect of the Hartree-Fock potential . To obtain its matrix elements, we use the basis 1 : r ; 2 : to write the trace in (60): (1) r

r

=

1:r

;2 :

2:

2:

2)] 1 : r

;2 :

=1

=1 2 (1

2) [1

ex (1

As the right-hand side includes the scalar product 2 : the sum over disappears and we get:

2:

(86) which is equal to

,

(1) r

r =

1:r

;2 :

2 (1

2) [1

ex (1

2)] 1 : r

;2 :

(87)

=1

(i) We first deal with the direct term contribution, hence ignoring in the bracket the term in ex (1 2). We can replace the ket 2 : by its expression: 2:

=

d3

(r2 ) 2 : r2

2

(88)

As the operator is diagonal in the position representation, we can write: 2 (1

2) 1 : r

;2 :

=

2 (1

=

d3

2)

d3

2

(r2 )

2

(r2 ) 1 : r 2 (r

; 2 : r2

r2 ) 1 : r

; 2 : r2 (89)

The direct term of (87) is then written:

d3

2 (r

2

r2 )

(r2 ) 1 : r

;2 :

1:r

; 2 : r2

(90)

=1

where the scalar product of the bra and the ket is equal to finally obtain: (r

r)

d3

2

2 (r

r2 )

(r2 )

2

=

(r

(r

r)

r)

dir (r)

(r2 ) . We

(91)

=1

with: dir (r)

d3

= =1

1696

2 (r

r)

(r )

2

(92)



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

This component of the mean field (Hartree term) contains a sum over all occupied states, whatever their spin is; it is spin independent. (ii) We now turn to the exchange term, which contains the operator ex (1 2) in the bracket of (87). To deal with it, we can for example commute in (87) the two operators 2 (1 2) and ex (1 2); this last operator will then permute the two particles in the bra. Performing this operation in (90), we get, with the minus sign of the exchange term: 3

2 (r

2

r2 )

(r2 ) 1 :

;2 : r

1:r

; 2 : r2

(93)

=1

The scalar product will yield the products of (r r2 ), making the integral over 3 disappear; this term is zero if = , hence the factor . Since 2 (r r) = 2 (r r ), we are left with: 2 2 (r

r)

(r)

(r )

=

(r r )

(94)

=

where the sum is over the values of for which = = (hence, limited to the first , depending on the case); the exchange potential ex has + values of , or the last been defined as: ex (r

r)=

2 (r

r)

(r )

(r)

(95)

=

As is the case for the direct term, the exchange term does not act on the spin. There are however two differences. To begin with, the summation over is limited to the states having the same spin ; second, it introduces a contribution which is non-diagonal in the positions (but without an integral), and which cannot be reduced to an ordinary potential (the term “non-local potential” is sometimes used to emphasize this property). We have shown that the scalar product of equation (77) with r introduces three potentials (in addition to the the one-body potential 1 ), a direct potential dir (r) and two exchange potentials ex (r r ) with = 1 2. Equation (77) then becomes, in the r representation, a pair of equations: }2 ∆+ 2

1

(r) +

dir. (r)

(r)

d3

ex

(r r )

(r ) =

(r)

(96)

These are the Hartree-Fock equations with spin and in the position representation, widely used in quantum physics and chemistry. It is not necessary to worry, in these equations, about the term in which the subscript in the summation appearing in (92) and (95) is the same as the subscript (of the wave function we are looking for); the contributions = cancel each other exactly in the direct and exchange potentials. Both the “Hartree term” giving the direct potential contribution, and the “Fock term ” giving the exchange potential, can be interpreted in the same way as above (§ 1-f). The Hartree term contains the contributions of all the other electrons to the mean potential felt by one electron. The exchange potential, on the other hand, only involves electrons in the same spin state, and this can be simply interpreted: the exchange effect only occurs for two totally indistinguishable particles. Now if these particles are in 1697

COMPLEMENT EXV



orthogonal spin states, and as the interactions do not act on the spins, one can in principle determine which is which and the particles become distinguishable: the quantum exchange effects cancel out. As we already pointed out, the exchange potential is not a potential stricto sensu. It is not diagonal in the position representation, even though it basically comes from a particle interaction that is diagonal in position. It is the antisymmetrization of the fermions, together with the chosen variational approximation, which led to this peculiar non-diagonal form. It is however a Hermitian operator, as can be shown using the fact that the initial potential 2 (r r ) is real and symmetric with respect to r and r . 2-e.

Discussion

The resolution of the nonlinear Hartree-Fock equations is generally done by the successive iteration approximate method discussed in § 1-f. There is no particular reason for the solution of the Hartree-Fock equations to be unique8 ; on the contrary, they can yield solutions that depend on the states chosen to begin the nonlinear iterations. They can actually lead to a whole spectrum of possible energies for the system. This is how the ground state and excited state energies of the atom are generally computed. The atomic orbitals discussed in Complement EVII , the central field approximation and the electronic “configurations” discussed in Complement BXIV can now be discussed in a more precise and quantitative way. We note that the exchange energy, introduced in this complement for a two-electron system, is a particular case of the exchange energy term of the Hartree-Fock potential. There exist however many other physical systems where the same ideas can be applied: nuclei (the Coulomb force is then replaced by the nuclear interaction force between the nucleons), atomic aggregates (with an interatomic potential having both repulsive and attractive components, see Complements CXI and GXI ), and many others. Once a Hartree-Fock solution for a complex problem has been found, we can go further. One can use the basis of the eigenfunctions just obtained as a starting point for more precise perturbation calculations, including for example correlations between particles (Chapter XI). In atomic spectra, we sometimes find cases where two configurations yield very close mean field energies. The effects of the interaction terms beyond the mean field approximation will then be more important. Perturbation calculations limited to the space of the configurations in question permits obtaining better approximations for the energy levels and their wave functions; one then speaks of “mixtures”, or of “interactions between configurations”. Comment: The variational method based on the Fock states is not the only one that leads to the Hartree-Fock equations. One could also start from an approximation of the two-particle density operator by a function of the one-particle density operator and write: (1 2)

1 [1 2

ex (1

2)]

(1)

(2)

(97)

Expressing the energy of the -particle system as a function of , we minimize it by varying this operator, and find the same results as above. This method amounts to a closure of the hierarchy of the -body equations (§ C-4 of Chapter XVI). We have in 8 They

1698

all yield, however, an upper limit for the ground state energy



FERMION SYSTEM, HARTREE-FOCK APPROXIMATION

fact already seen with equation (21) and in § 2-a that the Hartree-Fock approximation amounts to expressing the two-particle correlation functions as a function of the oneparticle correlation functions. In terms of correlation functions (Complement AXVI ), this amounts to replacing the two-particle function (four-point function) by a product of one-particle functions (two-point function), including an exchange term. Finally, another method is to use the diagram perturbation theory; the Hartree-Fock approximation corresponds to retaining only a certain class of diagrams (class of connected diagrams).

Finally note that the Hartree-Fock method is not the only one yielding approximate solutions of Schrödinger’s equation for a system of interacting fermions; in particular, one can use the “electronic density functional” theory (a functional is a function of another function, as for instance the action in classical lagrangian mechanics). The method is used to obtain the electronic structure of molecules or condensed phases in physics, chemistry, and materials science. Its study nevertheless lies outside the scope of this book, and the reader is referred to [6], which summarizes the method and gives a number of references.

1699



FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION

Complement FXV Fermions, time-dependent Hartree-Fock approximation

1 2

3

4

Variational ket and notation . . . . . . . . . . . . . . . . . . . 1701 Variational method . . . . . . . . . . . . . . . . . . . . . . . . 1702 2-a

Definition of a functional . . . . . . . . . . . . . . . . . . . . 1702

2-b

Stationarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1703

2-c

Particular case of a time-independent Hamiltonian . . . . . . 1705

Computing the optimizer . . . . . . . . . . . . . . . . . . . . . 1705 3-a

Average energy . . . . . . . . . . . . . . . . . . . . . . . . . . 1705

3-b

Hartree-Fock potential . . . . . . . . . . . . . . . . . . . . . . 1706

3-c

Time derivative . . . . . . . . . . . . . . . . . . . . . . . . . . 1707

3-d

Functional value . . . . . . . . . . . . . . . . . . . . . . . . . 1707

Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . 1707 4-a

Time-dependent Hartree-Fock equations . . . . . . . . . . . . 1708

4-b

Particles in a single spin state . . . . . . . . . . . . . . . . . . 1709

4-c

Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1709

The Hartree-Fock mean field method was introduced in Complement EXV for a time-independent problem: the search for the stationary states of a system of interacting fermions (the search for its thermal equilibrium will be discussed in Complement GXV . In this complement, we show how this method can be used for time-dependent problems. We start, in § 1, by including a time dependence in the Hartree-Fock variational ket (time-dependent Fock state). We then introduce in § 2 a general variational principle that can be used for solving the time-dependent Schrödinger equation. We then compute, in § 3, the function to be optimized for a Fock state; the same mean field operator as the one introduced in Complement EXV will here again play a very useful role. Finally, the time-dependent Hartree-Fock equations will be obtained and discussed in § 4. More details on the Hartree-Fock methods in general can be found, for example, in Chapter 7 of reference [5], and especially in its Chapter 9 for time-dependent problems. 1.

Variational ket and notation -particle state vector ^ Ψ ( ) to be of the form:

We assume the ^ Ψ( ) =

1(

)

2(

)

( )

0

(1)

where the 1 ( ) , 2 ( ) , ..., ( ) are the creation operators associated with an arbitrary series of orthonormal individual states 1 ( ) , 2 ( ) , ..., ( ) which depend on time . This series is, for the moment, arbitrary, but the aim of the following variational calculation is to determine its time dependence. 1701



COMPLEMENT FXV

As in the previous complements, we assume that the Hamiltonian is the sum of three terms: a kinetic energy Hamiltonian, an external potential Hamiltonian, and a particle interaction term: =

0

+

ext (

)+

(2)

int

with: 0

= =1

(

(P ) 2

2

(3)

is the particles’ mass, P the momentum operator of particle ), and: ext (

)=

1 (R

)

(4)

R )

(5)

=1

and finally: int

2.

=

1 2

2 (R =

Variational method

Let us introduce a general variational principle; using the stationarity of a functional of the state vector Ψ( ) , it will yield the time-dependent Schrödinger equation. 2-a.

Definition of a functional

Consider an arbitrarily given Hamiltonian ( ). We assume the state vector Ψ( ) to have any time dependence, and we note Ψ( ) the ket physically equivalent to Ψ( ) , but with a constant norm: Ψ( )

Ψ( ) =

(6)

Ψ( ) Ψ( ) of Ψ( ) is defined as1 :

The functional

1

Ψ( )

=

d Re

Ψ( )

0 1

=

d 0

} 2

Ψ( )

d Ψ( ) d

}

d ( ) Ψ( ) d d Ψ( ) Ψ( ) d

(7) Ψ( )

( ) Ψ( )

where 0 and 1 are two arbitrary times such that 0 1 . In the particular case where the chosen Ψ( ) is equal to a solution Ψ ( ) of the Schrödinger equation: }

d Ψ () = d

() Ψ ()

(8)

1 The notation where the differential operator d d is written between a bra and a ket means that the operator takes the derivative of the ket that follows (and not of the bra just before).

1702



FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION

the bracket on the first line of (7) obviously cancels out and we have: Ψ ()

=0

(9)

Integrating by parts the second term2 of the bracket in the second line of (7), we get the same form as the first term in the bracket, plus an already integrated term. The final result is then: 1

Ψ( )

=

d

Ψ( )

}

0

+

} 2

d

Ψ( )

()

Ψ( )

Ψ( 0 ) Ψ( 0 )

1

=

d d

}

0

d d

()

Ψ( 1 ) Ψ( 1 )

Ψ( )

(10)

where we have used in the second line the fact that the norm of Ψ( ) always remains equal to unity. This expression for is similar to the initial form (7), but without the real part. 2-b.

Stationarity

Suppose now Ψ( ) has an arbitrary time dependence between 0 and 1 , while keeping its norm constant, as imposed by (6); the functional then takes a certain value , a priori different from zero. Let us see under which conditions will be stationary when Ψ( ) changes by an infinitely small amount Ψ( ) : Ψ( )

Ψ( ) +

Ψ( )

(11)

For what follows, it will be convenient to assume that the variation Ψ( ) is free; we therefore have to ensure that the norm of Ψ( ) remains constant, equal to unity3 . We introduce Lagrange multipliers (Appendix V) ( ) to control the square of the norm at every time between 0 and 1 , and we look for the stationarity of a function where the sum of constraints has been added. This sum introduces an integral, and we the function in question is: 1

Ψ( )

=

Ψ( )

d

( ) Ψ( ) Ψ( )

0 1

=

d

Ψ( )

0

where

}

d d

()

()

Ψ( )

(12)

( ) is a real function of the time .

2 If we integrate by parts the first term rather than the second, we get the complex conjugate of equation (10), which brings no new information. 3 For the normalization of Ψ ( ) to be conserved to first order, it is necessary (and sufficient) for

the scalar product Ψ ( ) Ψ ( ) multiplier ( ) is not needed

to be zero or purely imaginary. If this is the case, the Lagrangian

1703

COMPLEMENT FXV



The variation of to first order is obtained by inserting (11) in (10). It yields the sum of a first term 1 containing the ket Ψ( ) and of another 2 containing the bra Ψ( ) : 1

1

=

d

Ψ( )

}

0

d d

1

2

=

d

Ψ( )

}

0

()

d d

()

() ()

Ψ( ) Ψ( )

(13)

We now imagine another variation for the ket: Ψ( )

Ψ( ) +

Ψ( )

(14)

which yields a variation of ; in this second variation, the term in Ψ( ) becomes Ψ( ) becomes 1 = 1 , whereas the term in 2 = 2 . Now, if the functional is stationary in the vicinity of Ψ( ) , the two variations and are necessarily zero, as are also and + . In those combinations, only terms in 1 appear for the first one, and in for the second; consequently they must both be zero. As a 2 result, we can write the stationarity conditions with respect to variations of the bra and the ket separately. Let us write for example that 2 = 0, which means the right-hand side of the second line in (13) must be zero. As the time evolution between 0 and 1 of the bra Ψ( ) is arbitrary, this condition imposes this bra multiplies a zero-value ket, at all times. Consequently, the ket Ψ( ) must obey the equation: }

d d

()

()

Ψ( ) = 0

(15)

which is none other than the Schrödinger equation associated with the Hamiltonian ( ) + ( ). Actually, ( ) simply introduces a change in the origin of the energies and this only modifies the total phase4 of the state vector Ψ( ) , which has no physical effect. Without loss of generality, this Lagrange factor may therefore be ignored, and we can set: ()=0

(16)

A necessary condition5 for the stationarity of is that Ψ( ) obey the Schrödinger equation (8) – or be physically equivalent (i.e. equal to within a global time-dependent phase factor) to a solution of this equation. Conversely, assume Ψ( ) is a solution of the Schrödinger equation, and give this ket a variation as in (11). It is then obvious from the second line of (13) that 2 is zero. As for 1 , an integration by parts over time shows that it is the complex conjugate of 2 , and therefore also equal to zero. The 4 If

in (15) we set Ψ( ) = } dd

( )

Θ( ) , we see that Θ( ) obeys the differential equation obtained

by replacing ( ) by ( ) in (15). If we simply choose for ( ) the integral over time of the function ( ), this constant will disappear from the differential equation. 5 The same argument as above, but starting from the variation , would lead to the complex conjugate of (8), and hence to the same equation.

1704



FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION

functional is thus stationary in the vicinity of any exact solution of the Schrödinger equation. ^ Suppose we choose any variational family of normalized kets Ψ ( ) , but which stat ^ now includes a ket Ψ ( ) for which is stationary. A simple example is the case where is a family 0 that contains the exact solution of the Schrödinger equation; according to what we just saw, this exact solution will make stationary, and conversely, the ket stat ( ) . In this case, imposing the variation of that makes stationary is necessarily Ψ^ 0

to be zero allows identifying, inside the family 0 , the exact solution we are looking for. If we now change the family continuously from 0 to , in general will no longer contain the exact solution of the Schrödinger equation. We can however follow the modifications stat ( ) . Starting from an exact solution of the at all times of the values of the ket Ψ^ equation, this ket progressively changes, but, by continuity, will stay in the vicinity of this exact solution if stays close to 0 . This is why annulling the variation of in the family is a way of identifying a member of that family whose evolution remains close to that of a solution of the Schrödinger equation. This is the method we will follow, using the Fock states as a particular variational family. 2-c.

Particular case of a time-independent Hamiltonian

If the Hamiltonian is time-independent, one can look for time-independent kets Ψ to make the functional stationary. The function to be integrated in the definition of the functional also becomes time-independent, and we can write as: =(

1

0)

Ψ

Ψ

(17)

Since the two times 0 and 1 are fixed, the stationarity of is equivalent to that of the diagonal matrix element of the Hamiltonian Ψ Ψ . We find again the stationarity condition of the time-independent variational method (Complement EXI ), which appears as a particular case of the more general method of the time-dependent variations. Consequently, it is not surprising that the Hartree-Fock methods, time-dependent or not, lead to the same Hartree-Fock potential, as we now show. 3.

Computing the optimizer

The family of the state vectors we consider is the set of Fock kets ^ Ψ( ) defined in (1). We first compute the function to be integrated in the functional (10) when Ψ( ) takes the value ^ Ψ( ) . 3-a.

Average energy

For the term in ( ), the calculation is identical to the one we already did in § 1-b of Complement EXV . We first add to the series of orthonormal states ( ) with = 1, 2, ..., other orthonormal states ( ) with = + 1, + 2, ..., to obtain a complete orthonormal basis in the space of individual states. Using this basis, we can express the one-particle and two-particle operators according to relations (B-12) and (C-16) of Chapter XV. This presents no difficulty since the average values of creation and annihilation operator products are easily obtained in a Fock state (they only differ 1705



COMPLEMENT FXV

from zero if the product of operators leaves the populations of the individual states unchanged). Relations (52), (53) and (57) of Complement EXV are still valid when the become time-dependent. We thus get for the average kinetic energy: =

0

() =1

P2 2

()

(18)

for the external potential energy: ext (

) =

()

1 (R

)

()

(19)

=1

and for the interaction energy: =

int

3-b.

1 2

1:

( ); 2 :

()

2 (1

2) [1

ex (1

2)] 1 :

( ); 2 :

()

(20)

Hartree-Fock potential

We recognize in (20) the diagonal element ( = ) of the Hartree-Fock potential operator (1 ) whose matrix elements have been defined in a general way by relation (58) of Complement EXV : ()

(1 ) =

()

1:

( ); 2 :

()

2 (1

2) [1

ex (1

2)] 1 :

( ); 2 :

()

(21)

We also noted in that complement EXV that (1 ) is a Hermitian operator. It is often handy to express the Hartree-Fock potential using a partial trace: (1 ) = Tr2 where

(2 )

2 (1

2) [1

ex (1

2)]

is the projector onto the subspace spanned by the

(2 ) =

2:

() 2:

(22) kets

()

(): (23)

=1

As we have seen before, this projector is actually nothing bu the one-particle reduced density operator 1 normalized by imposing its trace to be equal to the total particle number : (2 ) =

1 (2

)

(24)

The average value of the interaction energy can then be written as: int

1706

=

1 Tr1 2

1 (1

)

(1 )

(25)

• 3-c.

FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION

Time derivative

As for the time derivative term, the function it contains can be written as: 0

()

()

1

()

1

d d

()

=1

()

() 0

(26)

In this summation, all terms involving the individual states other than the state (which is undergoing the derivation) lead to an expression of the type: 0

()

() 0

(27)

which equals 1 since this expression is the square of the norm of the state ( ) 0 , which is simply the Fock state = 1 . As for the state , it leads to a factor written in the form of a scalar product in the one-particle state space: 0

()

3-d.

d d

() 0 =

d d

()

()

(28)

Functional value

Regrouping all these results, we can write the value of the functional 1

^ Ψ( ) =

d =1

1 2 4.

1:

()

}

0

d d

()

()

P2 + 2

1(

)

in the form:

() (29)

( ); 2 :

()

2 (1

2) [1

ex (1

2)] 1 :

( ); 2 :

()

=1

Equations of motion

We now vary the ket ()

() +

( ) according to: ()

with

(30)

As in complement EXV , we will only consider variations ( ) that lead to an actual ^ variation of the ket Ψ( ) ; those where ( ) is proportional to one of the occupied ^ states ( ) with yield no change for Ψ( ) (or at the most to a phase change) and are thus irrelevant for the value of . As we did in relations (32) or (69) of Complement EXV , we assume that: () =

()

()

with

(31)

where

( ) is an infinitesimal time-dependent function. The computation is then almost identical to that of § 2-b in Complement EXV . When ( ) varies according to (31), all the other occupied states remaining constant, the only changes in the first line of (29) come from the terms = . In the second line, the changes come from either the = terms, or the = terms. As the 2 (1 2) 1707

COMPLEMENT FXV



operator is symmetric with respect to the two particles, these variations are the same and their sum cancels the 1 2 factor. All these variations involve terms containing either the ket ( ) , or the bra () . Now their sum must be zero for any value of , and this is only possible if each of the terms is zero. Inserting the variation (31) of ( ) , and canceling the term in leads to the following equality: d d

1

d

()

()

}

0

()

P2 + 2

()

1(

)

() (32)

1:

( ); 2 :

()

2 (1

2) [1

ex (1

2)] 1 :

( ); 2 :

()

=0

=1

As we recognize in the function to be integrated the Hartree-Fock potential operator (1 ) defined in (21), we can write: 1

d

()

()

}

0

with

P2 2

d d

1(

)

()

()

=0

(33)

.

4-a.

Time-dependent Hartree-Fock equations

As the choice of the function ( ) is arbitrary, for expression (33) to be zero for any ( ) requires the function inside the curly brackets to be zero at all times . Stationarity therefore requires the ket: }

d d

P2 2

1(

)

()

()

(34)

to have no components on any of the non-occupied states ( ) with ( words, stationarity will be obtained if, for all values of between 1 and }

d d

() =

P2 + 2

1(

)+

()

() +

()

). In other , we have: (35)

where ( ) is any linear combination of the occupied states ( ) ( ). As we pointed out at the beginning of § 4, adding to one of the ( ) a component on the already occupied individual states has no effect on the -particle state (aside from an eventual change of phase), and therefore does not change the value of ; consequently, the stationarity of this functional does not depend on the value of the ket ( ) , which can be any ket, for example the zero ket. Finally, if the ( ) are equal to the solutions ( ) of the equations:

}

d d

() =

P2 + 2

1(

)+

()

()

(36)

the functional is indeed stationary for all times. Furthermore, as we saw in Complement EXV that ( ) is Hermitian, so is the operator on the right-hand side of (36). 1708



FERMIONS, TIME-DEPENDENT HARTREE-FOCK APPROXIMATION

Consequently, the kets ( ) follow an evolution similar to the usual Schrödinger evolution, described by a unitary evolution operator (Complement FIII ). Such an operator does not change either the norm nor the scalar products of the kets: if the kets () initially formed an orthonormal set, this remains true at any later time. The whole calculation just presented is thus consistent; in particular, the norm of the -particle state vector ^ Ψ ( ) is constant over time. Relations (36) are the time-dependent Hartree-Fock equations. Introducing the one-particle mean field operator allowed us not only to compute the stationary energy levels, but also to treat time-dependent problems. 4-b.

Particles in a single spin state

Let us return to the particular case of fermions all having the same spin state, as in § 1 of Complement EXV . We can then write the Hartree-Fock equations in terms of the wave functions as: }

(r ) =

}2 ∆+ 2

1 (r)

+

dir (r;

d3

)

(r ) ex (r

r; )

(r

)

(37)

using definitions of (46) of that complement for the direct and exchange potentials, which are now time-dependent. There is obviously a close relation between the Hartree-Fock equations, whether they are time-dependent or not. 4-c.

Discussion

As encountered in the search for a ground state with the time-independent HartreeFock equations, there is a strong similarity between equations (36) and an ordinary Schrödinger equation for a single particle. Here again, an exact solution of these equations is generally not possible, and we must use successive approximations. Assume for example that the external time-dependent potential 1 ( ) is zero until time 0 and that for 0 , the physical system is in a stationary state. With the time-independent HartreeFock method we can compute an approximate value for this state and hence a series of initial values for the individual states ( 0 ) . This determines the initial Hartree-Fock potential. Between time 0 and a slightly later time 0 + ∆ , the evolution equation (36) describes the effect of the external potential 1 ( ) on the individual kets, and allows obtaining the ( 0 + ∆ ) . We can then compute a new value for the Hartree-Fock potential, and use it to extend the computation of the evolution of the ( ) until a later time 0 + 2∆ . Proceeding step by step, we can obtain this evolution until the final time 1 . For the approach to be precise, ∆ must be small enough for the Hartree-Fock potential to change only slightly from one time step to another. Another possibility is to proceed as in the search for the stationary states. We start from a first family of orthonormal kets, now time-dependent, and which are not too far from the expected solution over the entire time interval; we then try to improve it by successive iterations. Inserting in (21) the first series of orthonormal trial functions, we get a first approximation of the Hartree-Fock potential and its associated dynamics. We then solve the corresponding equation of motion, with the same initial conditions at 1709

COMPLEMENT FXV



= 0 , which yields a new series of orthonormal functions. Using again (21), we get a value for the Hartree-Fock potential, a priori different from the previous one. We start the same procedure anew until an acceptable convergence is obtained. Applications of this method are quite numerous, in particular in atomic, molecular, and nuclear physics. They allow, for example, the study of the electronic cloud oscillations in an atom, a molecule or a solid, placed in an external time-dependent electric field (dynamic polarisability), or the oscillations of nucleons in their nucleus. We mentioned in the conclusion of Complement EXV that the time-independent Hartree-Fock method is sometimes replaced by the functional density method; this is also the case when dealing with time-dependent problems. In concluding this complement we underline the close analogy between the HartreeFock theory and a time-independent or a time-dependent mean field theory. In both cases the same Hartree-Fock potential operators come into play. Even though they are the result of an approximation, these operators have a very large range of applicability.

1710



FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM

Complement GXV Fermions or Bosons: Mean field thermal equilibrium

1

2

3

Variational principle . . . . . . . . . . . . . . . . . . . . . . . . 1-a Notation, statement of the problem . . . . . . . . . . . . . . . 1-b A useful inequality . . . . . . . . . . . . . . . . . . . . . . . . 1-c Minimization of the thermodynamic potential . . . . . . . . . Approximation for the equilibrium density operator . . . . 2-a Trial density operators . . . . . . . . . . . . . . . . . . . . . . 2-b Partition function, distributions . . . . . . . . . . . . . . . . . 2-c Variational grand potential . . . . . . . . . . . . . . . . . . . 2-d Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . Temperature dependent mean field equations . . . . . . . . 3-a Form of the equations . . . . . . . . . . . . . . . . . . . . . . 3-b Properties and limits of the equations . . . . . . . . . . . . . 3-c Differences with the zero-temperature Hartree-Fock equations (fermions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-d Zero-temperature limit (fermions) . . . . . . . . . . . . . . . 3-e Wave function equations . . . . . . . . . . . . . . . . . . . . .

1712 1712 1713 1715 1716 1716 1717 1721 1722 1725 1726 1727 1728 1729 1729

Understanding the thermal equilibrium of a system of interacting identical particles is important for many physical problems: conductor or semiconductor electronic properties, liquid Helium or ultra-cold gas properties, etc. It is also essential for studying phase transitions, various and multiple examples of which occur in solid and liquid physics: spontaneous magnetism appearing below a certain temperature, changes in electrical conduction, and many others. However, even if the Hamiltonian of a system of identical particles is known, calculation of the equilibrium properties cannot, in general, be carried to completion: these calculations present real difficulties in the handling of state vectors and interaction operators, where non-trivial combinations of creation and annihilation operators occur. One must therefore use one or several approximations. The most common one is probably the mean field approximation, which, as we saw in Complement EXV , is the base of the Hartree-Fock method. In that complement, we showed, in terms of state vectors, how this method could be used to obtain approximate values for the energy levels of a system of interacting particles. As we consider here the more complex problem of thermal equilibrium, which must be treated in terms of density operators, we show how the Hartree-Fock method can be extended to this more general case. We are going to see that, thanks to this approach, one can obtain compact formulas for an approximate value of the density operator at thermal equilibrium, in the framework of the grand canonical ensemble. The equations to be solved are fairly similar1 to those of Complement EXV . The Hartree-Fock method also gives a value of the 1 They

are not simply the juxtaposition of that complement’s equations: one could imagine writing

1711

COMPLEMENT GXV



thermodynamic grand potential, which leads directly to the pressure of the system. The other thermodynamic quantities can then be obtained via partial derivatives with respect to the equilibrium parameters (volume, temperature, chemical potential, eventually external applied field, etc. – see Appendix VI). It is clearly a powerful method even though it still is an approximation as the particles interactions are treated via a mean field approach where certain correlations are not taken into account. Furthermore, for bosons, it can only be applied to physical systems far from Bose-Einstein condensation; the reasons for this limitation will be discussed in detail in § 4-a of Complement HXV . Once we have recalled the notation and a few generalities, we shall establish (§ 1) a variational principle that applies to any density operator. It will allow us to search in any family of operators for the one closest to the density operator at thermal equilibrium. We will then introduce (§ 2) a family of trial density operators whose form reflects the mean field approximation; the variational principle will help us determine the optimal operator. We shall obtain Hartree-Fock equations for a non-zero temperature, and study some of their properties in the last section (§ 3). Several applications of these equations will be presented in Complement HXV . The general idea and the structure of the computations will be the same as in Complement EXV , and we keep the same notation: we establish a variational condition, choose a trial family, and then optimize the system description within this family. This is why, although the present complement is self-contained, it might be useful to first read Complement EXV . 1.

Variational principle

In order to use a certain number of general results of quantum statistical mechanics (see Appendix refappend-6 for a more detailed review), we first introduce the notation. 1-a.

Notation, statement of the problem

We assume the Hamiltonian is of the form: =

0

+

ext

+

(1)

int

which is the sum of the particles’ kinetic energy external potential: ext

=

1(

=

1 2

their coupling energy

)

ext

with an

(2)

and their mutual interaction int

0,

2 (R

R )

int ,

which can be expressed as: (3)

=

We are going to use the “grand canonical” ensemble (Appendix VI, § 1-c), where the particle number is not fixed, but takes on an average value determined by the chemical those equations independently for each energy level, and then performing a thermal average. We are going to see (for example in § 2-d- ) that the determination of each level’s position already implies thermal averages, meaning that the levels are coupled.

1712



FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM

potential . In this case, the density operator is an operator acting in the entire Fock space (where can take on all the possible values), and not only in the state space for particles (which is is more restricted since it corresponds to a fixed value of ). We set, as usual: =

1

(4)

where is the Boltzmann constant and the absolute temperature. At the grand canonical equilibrium, the system density operator depends on two parameters, and the chemical potential , and can be written as: eq

=

1

exp

(5)

with the relation that comes from normalizing to 1 the trace of

eq :

= Tr exp

(6)

The function is called the “grand canonical partition function” (see Appendix VI, § 1-c). The operator associated with the total particle number is defined in (B17) of Chapter XV. The temperature and the chemical potential are two intensive quantities, respectively conjugate to the energy and the particle number. Because of the particle interactions, these formulas generally lead to calculations too complex to be carried to completion. We therefore look, in this complement, for approximate expressions of eq and that are easier to use and are based on the mean field approximation. 1-b.

A useful inequality

Consider two density operators Tr

= Tr

and

, both having a trace equal to 1:

=1

(7)

As we now show, the following relation is always true: Tr

ln

Tr

ln

(8)

We first note that the function ln , defined for 0, is always larger than the function 1, which is the equation of its tangent at = 1 (Fig. 1). For positive values of and we therefore always have: ln

1

(9)

or, after multiplying by : ln

ln

the equality occurring only if

(10) = . 1713

COMPLEMENT GXV



Figure 1: Plot of the function ln . At = 1, this curve is tangent to the line = 1 (dashed line) but always remains above it; the function value is thus always larger than 1.

Let us call the eigenvalues of corresponding to the normalized eigenvectors , and the eigenvalues of corresponding to the normalized eigenvectors . Used for the positive numbers and , relation (10) yields: ln

ln

(11)

We now multiply this relation by the square of the modulus of the scalar product: 2

=

(12)

and sum over and . For the term in ln of (11), the summation over yields in (12) the identity operator expanded on the basis ; we then get = 1, and are left with the sum over of ln , that is the trace Tr ln . As for the term in ln , the summation over introduces: ln

= ln

(13)

and we get: ln

= Tr

ln

(14)

As for the terms on the right-hand side of inequality (11), the term in =

1714

= Tr

=1

yields: (15)



FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM

and the one in also yields 1 for the same reasons, and both terms cancel out. We finally obtain the inequality: Tr

ln

Tr

ln

0

(16)

which proves (8). Comment: One may wonder under which conditions the above relation becomes an equality. This requires the inequality (11) to become an equality, which means = whenever the scalar product (12) is non-zero; consequently all the eigenvalues of the two operators and must be equal. In addition, the eigenvectors of each operator corresponding to different eigenvalues must be orthogonal (their scalar product must be zero). In other words, the eigenvalues and the subspaces spanned by their eigenvectors are identical, which amounts to saying that = . 1-c.

Minimization of the thermodynamic potential

The entropy associated with any density operator defined by relation (6) of Appendix VI: =

Tr

having a trace equal to 1 is

ln

(17)

The thermodynamic potential of the grand canonical ensemble is defined by the “grand potential” Φ, which can be expressed as a function of by relation (Appendix VI, § 1-c- ): Φ=

= Tr

+

ln

(18)

Inserting (5) into (18), we see that the value of Φ at equilibrium, Φeq , can be directly obtained from the partition function : Φeq = Tr

1

+ 1

=

ln

Tr

+ eq

=

ln

eq

ln

(19)

We therefore have: =

Φ

or

Φ

=

ln

(20)

Consider now any density operator and its associated function Φ obtained from (18). According to (5) and (20), we can write: = ln

+ ln

= ln

Φ

(21)

Inserting this result in (18) yields: Φ = Tr

1

[ ln

+ Φ

Now relation (16), used with Tr [ln

ln

eq ]

0

+ ln ] =

eq ,

=

1

Tr [ ln

+ ln ]



(22)

is written as: (23) 1715

COMPLEMENT GXV



Relation (22) thus implies that for any density operator have: Φ

having a trace equal to 1, we

Φeq

(24)

the equality occurring if, and only if, = eq . Relation (24) can be used to fix a variational principle: choosing a family of density operators having a trace equal to 1, we try to identify in this family the operator that yields the lowest value for Φ. This operator will then be the optimal operator within this family. Furthermore, this operator yields an upper value for the grand potential, with an error of second order with respect to the error made on . 2.

Approximation for the equilibrium density operator

We now use this variational principle with a family of density operators that leads to manageable calculations. 2-a.

Trial density operators

The Hartree-Fock method is based on the assumption that a good approximation is to consider each particle as independent of the others, but moving in the mean potential they create. We therefore compute an approximate value of the density operator by replacing the Hamiltonian $\hat{H}$ by a sum of independent particles' Hamiltonians $\tilde{h}_0(q)$:
$$\tilde{H}_0 = \sum_{q=1}^{N} \tilde{h}_0(q) \tag{25}$$
We now introduce the basis $\{|\tilde{\theta}_k\rangle\}$ of the individual state space and the associated creation and annihilation operators $\tilde{a}^{\dagger}_k$ and $\tilde{a}_k$, where the $|\tilde{\theta}_k\rangle$ are the eigenvectors of the one-particle operator $\tilde{h}_0$:
$$\tilde{h}_0 \,|\tilde{\theta}_k\rangle = \tilde{e}_k \,|\tilde{\theta}_k\rangle \tag{26}$$
The symmetric one-particle operator $\tilde{H}_0$ can then be written, according to relation (B-14) of Chapter XV:
$$\tilde{H}_0 = \sum_k \tilde{e}_k \; \tilde{a}^{\dagger}_k \tilde{a}_k \tag{27}$$
where the real constants $\tilde{e}_k$ are the eigenvalues of the operator $\tilde{h}_0$. We choose as trial operators, acting in the Fock space, the set of operators $\tilde{\rho}$ that can be written in the form corresponding to an equilibrium in the grand canonical ensemble – see relation (42) of Appendix VI. We then set:
$$\tilde{\rho} = \frac{1}{\tilde{Z}} \exp\left[ -\beta \left( \tilde{H}_0 - \mu \hat{N} \right) \right] \tag{28}$$
where $\tilde{H}_0$ is any symmetric one-particle operator, the constant $\beta$ the inverse of the temperature defined in (4), $\mu$ a real constant playing the role of a chemical potential, and $\tilde{Z}$ the trace of the exponential:
$$\tilde{Z} = \mathrm{Tr}\left\{ \exp\left[ -\beta \left( \tilde{H}_0 - \mu \hat{N} \right) \right] \right\} \tag{29}$$
Consequently, the relevant variables in our problem are the states $|\tilde{\theta}_k\rangle$, which form an arbitrary orthonormal basis of the individual state space, and the energies $\tilde{e}_k$. These variables determine $\tilde{H}_0$ as well as $\tilde{\rho}$, and we have to find which of their values minimizes the function:
$$\tilde{\Phi} = \mathrm{Tr}\left\{ \tilde{\rho} \left( \hat{H} - \mu \hat{N} \right) \right\} + k_B T \,\mathrm{Tr}\left\{ \tilde{\rho} \ln \tilde{\rho} \right\} \tag{30}$$
Taking (27) and (28) into account, we can write:
$$\tilde{\rho} = \frac{1}{\tilde{Z}} \exp\left[ -\beta \sum_k \left( \tilde{e}_k - \mu \right) \tilde{a}^{\dagger}_k \tilde{a}_k \right] \tag{31}$$
The following computations are simplified since the Fock space can be considered as the tensor product of independent spaces associated with the individual states $|\tilde{\theta}_k\rangle$; consequently, the trial density operator (28) can be written as a tensor product of operators, each acting on a single mode $k$:
$$\tilde{\rho} = \frac{1}{\tilde{Z}} \bigotimes_k \exp\left[ -\beta \left( \tilde{e}_k - \mu \right) \tilde{a}^{\dagger}_k \tilde{a}_k \right] \tag{32}$$

2-b. Partition function, distributions

Equality (32) has the same form as relation (5) of Complement B$_{XV}$, with a simple change: the replacement of the free particle energies $e_k = \hbar^2 k^2 / 2m$ by the energies $\tilde{e}_k$, which are as yet unknown. As this change does not affect the mathematical structure of the density operator, we can directly use the results of Complement B$_{XV}$.

α. Variational partition function

The function $\tilde{Z}$ only depends on the variational energies $\tilde{e}_k$, since the trace of (32) may be computed in the basis $\{|\tilde{\theta}_k\rangle\}$, which yields:
$$\tilde{Z} = \prod_k \sum_{n_k} \exp\left[ -\beta \, n_k \left( \tilde{e}_k - \mu \right) \right] \tag{33}$$
We simply get an expression similar to relation (7) of Complement B$_{XV}$, obtained for an ideal gas. Since for fermions $n_k$ can only take the values 0 and 1, we get:
$$\tilde{Z} = \prod_k \left[ 1 + e^{-\beta (\tilde{e}_k - \mu)} \right] \tag{34}$$
whereas for bosons $n_k$ varies from 0 to infinity, so that:
$$\tilde{Z} = \prod_k \frac{1}{1 - e^{-\beta (\tilde{e}_k - \mu)}} \tag{35}$$
In both cases we can write:
$$\ln \tilde{Z} = -\eta \sum_k \ln\left[ 1 - \eta \, e^{-\beta (\tilde{e}_k - \mu)} \right] \tag{36}$$
with $\eta = +1$ for bosons, and $\eta = -1$ for fermions. Computing the entropy can be done in a similar way. As the density operator $\tilde{\rho}$ has the same form as the one describing the thermal equilibrium of an ideal gas, we can use, for a system described by $\tilde{\rho}$, the formulas obtained for the entropy of a system without interactions.
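As a quick numerical illustration (a minimal sketch using NumPy, with arbitrary illustrative energies), relation (36) can be cross-checked against the product forms (34) and (35):

```python
import numpy as np

def log_Z(energies, mu, beta, eta):
    """Relation (36): ln Z = -eta * sum_k ln(1 - eta * exp(-beta (e_k - mu)))."""
    return -eta * np.sum(np.log(1.0 - eta * np.exp(-beta * (energies - mu))))

energies = np.linspace(0.1, 3.0, 10)      # illustrative variational energies
beta, mu = 1.5, -0.2                      # mu below the lowest energy (needed for bosons)

z_fermions = np.prod(1.0 + np.exp(-beta * (energies - mu)))        # relation (34)
z_bosons = np.prod(1.0 / (1.0 - np.exp(-beta * (energies - mu))))  # relation (35)
print(np.isclose(log_Z(energies, mu, beta, -1), np.log(z_fermions)))   # True
print(np.isclose(log_Z(energies, mu, beta, +1), np.log(z_bosons)))     # True
```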

β. One particle, reduced density operator

Let us compute the average value of $\tilde{a}^{\dagger}_k \tilde{a}_l$ with the density operator $\tilde{\rho}$:
$$\langle \tilde{a}^{\dagger}_k \tilde{a}_l \rangle = \mathrm{Tr}\left\{ \tilde{\rho} \; \tilde{a}^{\dagger}_k \tilde{a}_l \right\} \tag{37}$$
We saw in § 2-c of Complement B$_{XV}$ that:
$$\mathrm{Tr}\left\{ \tilde{\rho} \; \tilde{a}^{\dagger}_k \tilde{a}_l \right\} = \delta_{kl} \; f_{\beta}(\tilde{e}_k - \mu) \tag{38}$$
where the distribution function $f_{\beta}$ is noted $f_{FD}$ for fermions, and $f_{BE}$ for bosons:
$$f_{\beta}(\tilde{e} - \mu) = f_{FD}(\tilde{e} - \mu) = \frac{1}{e^{\beta(\tilde{e} - \mu)} + 1} \quad \text{for fermions} \qquad\qquad f_{\beta}(\tilde{e} - \mu) = f_{BE}(\tilde{e} - \mu) = \frac{1}{e^{\beta(\tilde{e} - \mu)} - 1} \quad \text{for bosons} \tag{39}$$
When the system is described by the density operator $\tilde{\rho}$, the average populations of the individual states $|\tilde{\theta}_k\rangle$ are therefore determined by the usual Fermi-Dirac or Bose-Einstein distributions. From now on, and to simplify the notation, we shall write simply $|\theta_k\rangle$ for the kets $|\tilde{\theta}_k\rangle$. We can introduce a "one-particle reduced density operator" $\tilde{\rho}_1(1)$ by$^2$:
$$\tilde{\rho}_1(1) = \sum_k f_{\beta}(\tilde{e}_k - \mu) \; |1{:}\,\theta_k\rangle \langle 1{:}\,\theta_k| \tag{40}$$
where the 1 enclosed in parentheses and the subscript 1 on the left-hand side emphasize that we are dealing with an operator acting in the one-particle state space (as opposed to $\tilde{\rho}$, which acts in the Fock space); needless to say, this subscript has nothing to do with the initial numbering of the particles, but simply refers to any single particle among all the system particles. The diagonal elements of $\tilde{\rho}_1(1)$ are the individual state populations. With this operator, we can compute the average value over $\tilde{\rho}$ of any one-particle operator $\hat{F}$:
$$\langle \hat{F} \rangle = \mathrm{Tr}\left\{ \tilde{\rho} \, \hat{F} \right\} \tag{41}$$
as we now show. Using the expression (B-12) of Chapter XV for any one-particle operator$^3$, as well as (38), we can write:
$$\langle \hat{F} \rangle = \sum_{k,l} \langle \theta_k | \hat{F}(1) | \theta_l \rangle \; \mathrm{Tr}\left\{ \tilde{\rho} \; \tilde{a}^{\dagger}_k \tilde{a}_l \right\} = \sum_k \langle \theta_k | \hat{F}(1) | \theta_k \rangle \; f_{\beta}(\tilde{e}_k - \mu) \tag{42}$$
that is:
$$\langle \hat{F} \rangle = \mathrm{Tr}_1\left\{ \hat{F}(1) \; \tilde{\rho}_1(1) \right\} \tag{43}$$

$^2$ Contrary to what is usually the case for a density operator, the trace of this reduced operator is not equal to 1, but to the average particle number – see relation (44). This different normalization is often more convenient when studying systems composed of a large number of particles.

As we shall see, the density operator $\tilde{\rho}_1(1)$ is quite useful since it allows obtaining in a simple way all the average values that come into play in the Hartree-Fock computations. Our variational calculations will simply amount to varying $\tilde{\rho}_1(1)$. This operator contains, in a certain sense, all the properties of the variational density operator $\tilde{\rho}$ chosen in (28) in the Fock space. It plays the same role$^4$ as the projector onto the occupied individual states (which also encapsulates the essence of the variational $N$-particle ket) played in Complement E$_{XV}$. In a general way, one can say that the basic principle of the Hartree-Fock method is to reduce the binary correlation functions of the system to products of single-particle correlation functions (more details on this point will be given in § 2-b of Complement C$_{XVI}$).

The average value of the operator $\hat{N}$ for the total particle number is written:
$$\langle \hat{N} \rangle = \mathrm{Tr}\left\{ \tilde{\rho} \, \hat{N} \right\} = \mathrm{Tr}_1\left\{ \tilde{\rho}_1(1) \right\} = \sum_k f_{\beta}(\tilde{e}_k - \mu) \tag{44}$$
Both functions $f_{FD}$ and $f_{BE}$ increase as a function of $\mu$, so that, for any given temperature, the total particle number is controlled by the chemical potential. For a large physical system whose energy levels are very close, the orbital part of the discrete sum in (44) can be replaced by an integral. Figure 1 of Complement B$_{XV}$ shows the variations of the Fermi-Dirac and Bose-Einstein distributions. We also mentioned there that for a boson system, the chemical potential cannot exceed the lowest value of the energies $\tilde{e}_k$; when it approaches that value, the population of the corresponding level diverges, which is the Bose-Einstein condensation phenomenon we will come back to in the next complement. For fermions, on the other hand, the chemical potential has no upper bound since, whatever its value, the population of states having an energy lower than $\mu$ cannot exceed 1.
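As an illustration of how the chemical potential controls the particle number in relation (44), here is a minimal sketch (the function names and the toy energy spectrum are ours, purely for illustration) that inverts (44) numerically for both statistics:

```python
import numpy as np
from scipy.optimize import brentq

def f_dist(e, mu, beta, eta):
    """Relation (39): Bose-Einstein (eta = +1) or Fermi-Dirac (eta = -1) distribution."""
    x = np.clip(beta * (e - mu), -700.0, 700.0)   # avoid overflow in the exponential
    return 1.0 / (np.exp(x) - eta)

def chemical_potential(energies, n_target, beta, eta):
    """Solve relation (44), sum_k f(e_k - mu) = <N>, for mu by bisection."""
    total = lambda mu: f_dist(energies, mu, beta, eta).sum() - n_target
    # For bosons mu must stay below the lowest energy; for fermions it is unbounded above.
    upper = energies.min() - 1e-9 if eta == +1 else energies.max() + 50.0 / beta
    return brentq(total, energies.min() - 1e3, upper)

# Toy discrete spectrum (illustrative values only).
energies = np.linspace(0.0, 5.0, 200)
beta = 2.0
print(chemical_potential(energies, n_target=50.0, beta=beta, eta=-1))  # fermions
print(chemical_potential(energies, n_target=50.0, beta=beta, eta=+1))  # bosons
```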

γ. Two particles, distribution functions

We now consider an arbitrary two-particle operator $\hat{G}$ and compute its average value with the density operator $\tilde{\rho}$. The general expression of a symmetric two-particle operator is given by relation (C-16) of Chapter XV, and we can write:
$$\langle \hat{G} \rangle = \frac{1}{2} \sum_{k,l,m,n} \langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_l | \, \hat{G}(1,2) \, | 1{:}\,\theta_m \,;\, 2{:}\,\theta_n \rangle \;\; \mathrm{Tr}\left\{ \tilde{\rho} \; \tilde{a}^{\dagger}_k \tilde{a}^{\dagger}_l \tilde{a}_n \tilde{a}_m \right\} \tag{45}$$

$^3$ The notation for the one- and two-particle operators has been slightly changed with respect to Chapter XV, to avoid any confusion with the distribution functions $f_{\beta}$.
$^4$ For fermions, and when the temperature approaches zero, the distribution function included in the definition of $\tilde{\rho}_1(1)$ becomes a step function, and $\tilde{\rho}_1(1)$ does indeed coincide with the zero-temperature projector onto the occupied individual states.

We follow the same steps as in § 2-a of Complement E$_{XV}$: we use the mean field approximation to replace the computation of the average value of a two-particle operator by that of average values of one-particle operators. We can, for example, use relation (43) of Complement B$_{XV}$, which shows that:
$$\mathrm{Tr}\left\{ \tilde{\rho} \; \tilde{a}^{\dagger}_k \tilde{a}^{\dagger}_l \tilde{a}_n \tilde{a}_m \right\} = \left[ \delta_{km} \delta_{ln} + \eta \, \delta_{kn} \delta_{lm} \right] f_{\beta}(\tilde{e}_k - \mu) \, f_{\beta}(\tilde{e}_l - \mu) \tag{46}$$
We then get:
$$\langle \hat{G} \rangle = \frac{1}{2} \sum_{k,l} f_{\beta}(\tilde{e}_k - \mu) \, f_{\beta}(\tilde{e}_l - \mu) \left[ \langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_l | \, \hat{G}(1,2) \, | 1{:}\,\theta_k \,;\, 2{:}\,\theta_l \rangle + \eta \, \langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_l | \, \hat{G}(1,2) \, | 1{:}\,\theta_l \,;\, 2{:}\,\theta_k \rangle \right] \tag{47}$$
which, according to (40), can also be written as:
$$\langle \hat{G} \rangle = \frac{1}{2} \sum_{k,l} \langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_l | \; \tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; \hat{G}(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] | 1{:}\,\theta_k \,;\, 2{:}\,\theta_l \rangle \tag{48}$$
where $P_{\mathrm{ex}}(1,2)$ is the exchange operator between particles 1 and 2. Indeed, since:
$$\tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; |1{:}\,\theta_k \,;\, 2{:}\,\theta_l \rangle = f_{\beta}(\tilde{e}_k - \mu) \, f_{\beta}(\tilde{e}_l - \mu) \; |1{:}\,\theta_k \,;\, 2{:}\,\theta_l \rangle \tag{49}$$
and as the operators $\tilde{\rho}_1(1)$ and $\tilde{\rho}_1(2)$ are diagonal in the basis $\{|\theta_k\rangle\}$, the right-hand side of (48) reproduces (47); it is simply the sum of the diagonal elements, in the basis $\{|1{:}\,\theta_k \,;\, 2{:}\,\theta_l\rangle\}$, of the operator:
$$\tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; \hat{G}(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \tag{50}$$
that is, a (double) trace over the two particles 1 and 2. This leads to:
$$\langle \hat{G} \rangle = \frac{1}{2} \, \mathrm{Tr}_{1,2}\left\{ \tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; \hat{G}(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \tag{51}$$
As announced above, the average value of a two-particle operator can thus be expressed, within the Hartree-Fock approximation, in terms of the one-particle reduced density operator $\tilde{\rho}_1(1)$; note that this relation is not linear.


Comment: The analogy with the computations of Complement E$_{XV}$ becomes obvious if we regroup its equations (57) and (58) and write:
$$\langle \hat{W}_{\mathrm{int}} \rangle = \frac{1}{2} \, \mathrm{Tr}_{1,2}\left\{ P_N(1) \, P_N(2) \; W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \tag{52}$$
Replacing $W_2(1,2)$ by $\hat{G}(1,2)$, we get a relation very similar to (51), except for the fact that the projectors $P_N$ must be replaced by the one-particle operators $\tilde{\rho}_1$. In § 3-d, we shall come back to the correspondence between the zero and non-zero temperature results.

2-c. Variational grand potential

We now have to compute the grand potential $\tilde{\Phi}$ written in (30). As the exponential form (28) of the trial operator makes it easy to compute $\ln \tilde{\rho}$, we see that the terms in $\mu \hat{N}$ cancel out, and we get:
$$\tilde{\Phi} = \mathrm{Tr}\left\{ \tilde{\rho} \left[ \hat{H} - \tilde{H}_0 \right] \right\} - k_B T \ln \tilde{Z} \tag{53}$$
We now have to compute the average energy, with the density operator $\tilde{\rho}$, of the difference between the Hamiltonians $\hat{H}$ and $\tilde{H}_0$, respectively defined by (1) and (25). We first compute the trace:
$$\mathrm{Tr}\left\{ \tilde{\rho} \, \hat{H} \right\} = \langle \hat{H}_0 \rangle + \langle \hat{V}_{\mathrm{ext}} \rangle + \langle \hat{W}_{\mathrm{int}} \rangle \tag{54}$$
starting with the kinetic energy contribution $\hat{H}_0$ in (1). We call $\hat{h}_0$ the individual kinetic energy operator:
$$\hat{h}_0 = \frac{\mathbf{P}^2}{2m} \tag{55}$$
($m$ is the particle mass). Equality (43) applied to $\hat{h}_0$ yields the average kinetic energy when the system is described by $\tilde{\rho}$:
$$\langle \hat{H}_0 \rangle = \mathrm{Tr}_1\left\{ \hat{h}_0 \; \tilde{\rho}_1(1) \right\} = \sum_k \langle \theta_k | \hat{h}_0 | \theta_k \rangle \; f_{\beta}(\tilde{e}_k - \mu) \tag{56}$$
This result is easily interpreted: each individual state contributes its average kinetic energy, multiplied by its population. The computation of the average value of the external potential follows the same steps:
$$\langle \hat{V}_{\mathrm{ext}} \rangle = \mathrm{Tr}_1\left\{ \hat{v}_1 \; \tilde{\rho}_1(1) \right\} = \sum_k \langle \theta_k | \hat{v}_1 | \theta_k \rangle \; f_{\beta}(\tilde{e}_k - \mu) \tag{57}$$
(as in Complement E$_{XV}$, the operator $\hat{v}_1$ is the one-particle external potential operator). To complete the calculation of the average value of $\hat{H}$, we now have to compute the trace $\mathrm{Tr}\{\tilde{\rho}\,\hat{W}_{\mathrm{int}}\}$, the average value of the interaction energy when the system is described by $\tilde{\rho}$. Using relation (51), we can write this average value as a double trace:
$$\langle \hat{W}_{\mathrm{int}} \rangle = \frac{1}{2} \, \mathrm{Tr}_{1,2}\left\{ \tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \tag{58}$$
We now turn to the average value of $\tilde{H}_0$. The calculation is simplified since $\tilde{H}_0$ is, like $\hat{H}_0$, a one-particle operator; furthermore, the $|\theta_k\rangle$ have been chosen to be the eigenvectors of $\tilde{h}_0$ with eigenvalues $\tilde{e}_k$ – see relation (26). We just replace $\hat{h}_0$ by $\tilde{h}_0$ in (56), and obtain:
$$\langle \tilde{H}_0 \rangle = \sum_k \langle \theta_k | \tilde{h}_0 | \theta_k \rangle \; f_{\beta}(\tilde{e}_k - \mu) = \sum_k \tilde{e}_k \; f_{\beta}(\tilde{e}_k - \mu) \tag{59}$$
Regrouping all these results and using relation (36), we can write the variational grand potential as the sum of three terms:
$$\tilde{\Phi} = \tilde{\Phi}_1 + \tilde{\Phi}_2 + \tilde{\Phi}_3 \tag{60}$$
with:
$$\tilde{\Phi}_1 = \mathrm{Tr}_1\left\{ \left[ \hat{h}_0 + \hat{v}_1 \right] \tilde{\rho}_1(1) \right\}$$
$$\tilde{\Phi}_2 = \frac{1}{2} \, \mathrm{Tr}_{1,2}\left\{ \tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \; W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\}$$
$$\tilde{\Phi}_3 = -\sum_k \left\{ \tilde{e}_k \; f_{\beta}(\tilde{e}_k - \mu) - \eta \, k_B T \, \ln\left[ 1 - \eta \, e^{-\beta(\tilde{e}_k - \mu)} \right] \right\} \tag{61}$$

2-d. Optimization

We now vary the eigenenergies $\tilde{e}_k$ and eigenstates $|\theta_k\rangle$ of $\tilde{h}_0$ to find the value of the density operator $\tilde{\rho}$ that minimizes the average value $\tilde{\Phi}$ of the potential. We start with the variations of the eigenstates, which induce no variation of $\tilde{\Phi}_3$. The computation is actually very similar to that of Complement E$_{XV}$, with the same steps: variation of the eigenvectors, followed by the demonstration that the stationarity condition is equivalent to a series of eigenvalue equations for a Hartree-Fock operator (a one-particle operator). Nevertheless, we will carry out this computation in detail, as there are some differences. In particular, and contrary to what happened in Complement E$_{XV}$, the number of states $|\theta_k\rangle$ to be varied is no longer fixed by the particle number $N$; these states form a complete basis of the individual state space, and their number can go to infinity. This means that we can no longer give one (or several) state(s) a variation orthogonal to all the other $|\theta_k\rangle$; the variation will necessarily be a linear combination of these states. In a second step, we shall vary the energies $\tilde{e}_k$.

α. Variations of the eigenstates

As the eigenstates $|\theta_k\rangle$ vary, they must still obey the orthogonality relations:
$$\langle \theta_k | \theta_l \rangle = \delta_{kl} \tag{62}$$
The simplest idea would be to vary only one of them, $|\theta_l\rangle$ for example, and make the change:
$$|\theta_l\rangle \;\longrightarrow\; |\theta_l\rangle + |\mathrm{d}\theta_l\rangle \tag{63}$$
The orthogonality conditions would then require:
$$\langle \theta_k | \mathrm{d}\theta_l \rangle = 0 \qquad \text{for all } k \neq l \tag{64}$$
preventing $|\mathrm{d}\theta_l\rangle$ from having a component on any ket other than $|\theta_l\rangle$: in other words, $|\mathrm{d}\theta_l\rangle$ and $|\theta_l\rangle$ would be colinear. As $|\theta_l\rangle$ must remain normalized, the only possible variation would thus be a phase change, which affects neither the density operator $\tilde{\rho}_1(1)$ nor any average value computed with it. Such a variation changes nothing and is therefore irrelevant.

It is actually more interesting to vary two eigenvectors simultaneously, which we will call $|\theta_l\rangle$ and $|\theta_m\rangle$, as it is now possible to give $|\theta_l\rangle$ a component on $|\theta_m\rangle$, and the reverse. This does not change the two-dimensional subspace spanned by these two states; hence their orthogonality with all the other basis vectors is automatically preserved. Let us give the two vectors the following infinitesimal variations (without changing their energies $\tilde{e}_l$ and $\tilde{e}_m$):
$$\mathrm{d}|\theta_l\rangle = e^{i\chi} \, \mathrm{d}\alpha \; |\theta_m\rangle \qquad\qquad \mathrm{d}|\theta_m\rangle = -e^{-i\chi} \, \mathrm{d}\alpha \; |\theta_l\rangle \qquad \text{with } l \neq m \tag{65}$$
where $\mathrm{d}\alpha$ is an infinitesimal real number and $\chi$ an arbitrary but fixed real number. For any value of $\chi$, we can check that the variation of $\langle \theta_l | \theta_l \rangle$ is indeed zero (it contains the scalar products $\langle \theta_l | \theta_m \rangle$ or $\langle \theta_m | \theta_l \rangle$, which are zero), as is the symmetrical variation of $\langle \theta_m | \theta_m \rangle$, and that we have:
$$\mathrm{d}\langle \theta_l | \theta_m \rangle = e^{-i\chi}\,\mathrm{d}\alpha \, \langle \theta_m | \theta_m \rangle - e^{-i\chi}\,\mathrm{d}\alpha \, \langle \theta_l | \theta_l \rangle = 0 \tag{66}$$

The variations (65) are therefore acceptable, for any real value of $\chi$. We now compute how they change the operator $\tilde{\rho}_1(1)$ defined in (40). In the sum over $k$, only the $k = l$ and $k = m$ terms will change. The $k = l$ term yields a variation:
$$f_{\beta}(\tilde{e}_l - \mu) \; \mathrm{d}\alpha \left[ e^{i\chi} \, |\theta_m\rangle\langle\theta_l| + e^{-i\chi} \, |\theta_l\rangle\langle\theta_m| \right] \tag{67}$$
whereas the $k = m$ term yields a similar variation, but where $f_{\beta}(\tilde{e}_l - \mu)$ is replaced by $-f_{\beta}(\tilde{e}_m - \mu)$. This leads to:
$$\mathrm{d}\tilde{\rho}_1(1) = \mathrm{d}\alpha \left[ f_{\beta}(\tilde{e}_l - \mu) - f_{\beta}(\tilde{e}_m - \mu) \right] \left[ e^{i\chi} \, |\theta_m\rangle\langle\theta_l| + e^{-i\chi} \, |\theta_l\rangle\langle\theta_m| \right] \tag{68}$$
We now include these variations in the three terms of (61); as the distributions $f_{\beta}$ are unchanged, only the terms $\tilde{\Phi}_1$ and $\tilde{\Phi}_2$ will vary. The infinitesimal variation of $\tilde{\Phi}_1$ is written as:
$$\mathrm{d}\tilde{\Phi}_1 = \mathrm{Tr}_1\left\{ \left[ \hat{h}_0 + \hat{v}_1 \right] \mathrm{d}\tilde{\rho}_1(1) \right\} \tag{69}$$
As for $\mathrm{d}\tilde{\Phi}_2$, it contains two contributions, one from $\mathrm{d}\tilde{\rho}_1(1)$ and one from $\mathrm{d}\tilde{\rho}_1(2)$. These two contributions are equal since the operator $W_2(1,2)$ is symmetric (particles 1 and 2 play an equivalent role). The factor $1/2$ in $\tilde{\Phi}_2$ disappears and we get:
$$\mathrm{d}\tilde{\Phi}_2 = \mathrm{Tr}_{1,2}\left\{ \mathrm{d}\tilde{\rho}_1(1) \; \tilde{\rho}_1(2) \; W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \tag{70}$$
We can regroup these two contributions, using the fact that for any operator $\hat{O}(1,2)$, it can be shown that:
$$\mathrm{Tr}_{1,2}\left\{ \mathrm{d}\tilde{\rho}_1(1) \; \hat{O}(1,2) \right\} = \mathrm{Tr}_1\left\{ \mathrm{d}\tilde{\rho}_1(1) \; \mathrm{Tr}_2\left\{ \hat{O}(1,2) \right\} \right\} \tag{71}$$


This equality is simply demonstrated$^5$ by using the definition of the partial trace $\mathrm{Tr}_2$ of the operator $\hat{O}(1,2)$ with respect to particle 2. We then get:
$$\mathrm{d}\tilde{\Phi} = \mathrm{d}\tilde{\Phi}_1 + \mathrm{d}\tilde{\Phi}_2 = \mathrm{Tr}_1\left\{ \mathrm{d}\tilde{\rho}_1(1) \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] \right\} \tag{72}$$
Inserting now the expression (68) for $\mathrm{d}\tilde{\rho}_1(1)$, we get two terms, one proportional to $e^{i\chi}$, another one to $e^{-i\chi}$; the value of the first is:
$$\mathrm{d}\alpha \; e^{i\chi} \left[ f_{\beta}(\tilde{e}_l - \mu) - f_{\beta}(\tilde{e}_m - \mu) \right] \; \mathrm{Tr}_1\left\{ |\theta_m\rangle\langle\theta_l| \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] \right\} \tag{73}$$
Now, for any operator $\hat{A}(1)$, we can write:
$$\mathrm{Tr}_1\left\{ |\theta_m\rangle\langle\theta_l| \; \hat{A}(1) \right\} = \sum_k \langle \theta_k | \theta_m \rangle \, \langle \theta_l | \hat{A}(1) | \theta_k \rangle = \langle \theta_l | \hat{A}(1) | \theta_m \rangle \tag{74}$$
so that the variation (73) can be expressed as:
$$\mathrm{d}\alpha \; e^{i\chi} \left[ f_{\beta}(\tilde{e}_l - \mu) - f_{\beta}(\tilde{e}_m - \mu) \right] \; \langle \theta_l | \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] | \theta_m \rangle \tag{75}$$

The term in $e^{-i\chi}$ has a similar form, but it does not have to be computed, for the following reason. The variation $\mathrm{d}\tilde{\Phi}$ is the sum of a term in $e^{i\chi}$ and another in $e^{-i\chi}$:
$$\mathrm{d}\tilde{\Phi} = \mathrm{d}\alpha \left[ e^{i\chi} \, c_1 + e^{-i\chi} \, c_2 \right] \tag{76}$$
and the stationarity condition requires $\mathrm{d}\tilde{\Phi}$ to be zero for any choice of $\chi$. Choosing $\chi = 0$ yields $c_1 + c_2 = 0$; choosing $\chi = \pi/2$ and dividing by $i$, we get $c_1 - c_2 = 0$. Adding and subtracting those two relations shows that both coefficients $c_1$ and $c_2$ must be zero. Consequently, it suffices to impose that the term in $e^{i\chi}$, and hence expression (75), be zero. When $\tilde{e}_l \neq \tilde{e}_m$, the distribution functions are not equal, and we get:
$$\langle \theta_l | \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] | \theta_m \rangle = 0 \qquad (\text{if } \tilde{e}_l \neq \tilde{e}_m) \tag{77}$$
(if $\tilde{e}_l = \tilde{e}_m$, however, we have not yet obtained any particular condition to be satisfied$^6$).

$^5$ The definition of partial traces is given in § 5-b of Complement E$_{III}$. The left-hand side of (71) can be written as a double sum over the diagonal elements $\langle 1{:}\,\theta_k ; 2{:}\,\theta_l | \, \mathrm{d}\tilde{\rho}_1(1) \, \hat{O}(1,2) \, | 1{:}\,\theta_k ; 2{:}\,\theta_l \rangle$. We then insert, after $\mathrm{d}\tilde{\rho}_1(1)$, a closure relation on the kets $|1{:}\,\theta_{k'} ; 2{:}\,\theta_{l'}\rangle$, with $l' = l$ since $\mathrm{d}\tilde{\rho}_1(1)$ does not act on particle 2. The sum over $l$ is then, by definition, the matrix element between $\langle 1{:}\,\theta_{k'}|$ and $|1{:}\,\theta_k\rangle$ of the partial trace over particle 2 of the operator $\hat{O}(1,2)$, and we get the right-hand side of (71).
$^6$ This was expected, since this choice does not lead to any variation of the trial density operator.

β. Variation of the energies

Let us now see what happens if the energy $\tilde{e}_l$ varies by $\mathrm{d}\tilde{e}_l$. The function $f_{\beta}(\tilde{e}_l - \mu)$ then varies by $\mathrm{d}f_{\beta}(\tilde{e}_l - \mu)$ which, according to relation (40), induces a variation of $\tilde{\rho}_1(1)$:
$$\mathrm{d}\tilde{\rho}_1(1) = \mathrm{d}f_{\beta}(\tilde{e}_l - \mu) \; |\theta_l\rangle\langle\theta_l| \tag{78}$$
and thus leads to variations of the expressions (61) of $\tilde{\Phi}_1$ and $\tilde{\Phi}_2$. Their sum is:
$$\mathrm{d}\tilde{\Phi}_1 + \mathrm{d}\tilde{\Phi}_2 = \mathrm{Tr}_1\left\{ \mathrm{d}\tilde{\rho}_1(1) \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] \right\} \tag{79}$$
where the factor $1/2$ in $\tilde{\Phi}_2$ has disappeared since the variations induced by $\mathrm{d}\tilde{\rho}_1(1)$ and $\mathrm{d}\tilde{\rho}_1(2)$ are equal. Inserting (78) in this relation and using again (74), we get:
$$\mathrm{d}\tilde{\Phi}_1 + \mathrm{d}\tilde{\Phi}_2 = \mathrm{d}f_{\beta}(\tilde{e}_l - \mu) \; \langle \theta_l | \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \right] | \theta_l \rangle \tag{80}$$

As for $\tilde{\Phi}_3$, its variation is the sum of a term in $\mathrm{d}\tilde{e}_l$, coming from the explicit presence of the energies in its definition (61), and a term in $\mathrm{d}f_{\beta}$. If we let only the energy vary (not taking into account the variation of the distribution function), we get a zero result, since the two contributions cancel:
$$\left[ -f_{\beta}(\tilde{e}_l - \mu) + \frac{e^{-\beta(\tilde{e}_l - \mu)}}{1 - \eta \, e^{-\beta(\tilde{e}_l - \mu)}} \right] \mathrm{d}\tilde{e}_l = 0 \tag{81}$$
Consequently, we just have to vary the distribution function by $\mathrm{d}f_{\beta}$, and we get:
$$\mathrm{d}\tilde{\Phi}_3 = -\tilde{e}_l \; \mathrm{d}f_{\beta}(\tilde{e}_l - \mu) \tag{82}$$
Finally, after simplification by $\mathrm{d}f_{\beta}(\tilde{e}_l - \mu)$ (which, by hypothesis, is different from zero), imposing the variation $\mathrm{d}\tilde{\Phi} = \mathrm{d}\tilde{\Phi}_1 + \mathrm{d}\tilde{\Phi}_2 + \mathrm{d}\tilde{\Phi}_3$ to be zero leads to the condition:
$$\langle \theta_l | \left[ \hat{h}_0 + \hat{v}_1 + \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} - \tilde{e}_l \right] | \theta_l \rangle = 0 \tag{83}$$
This expression does look like the stationarity condition at constant energy (77), but now the two subscripts are the same, and a term in $\tilde{e}_l$ is present in the operator.

3. Temperature dependent mean field equations

Introducing a Hartree-Fock operator acting in the single particle state space allows writing the stationarity relations just obtained in a more concise and manageable form, as we now show.

3-a. Form of the equations

Let us define a temperature dependent Hartree-Fock operator $W_{HF}(\beta)$ as the partial trace that appears in the previous equations:
$$W_{HF}(1,\beta) = \mathrm{Tr}_2\left\{ \tilde{\rho}_1(2) \, W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \right\} \tag{84}$$
It is thus an operator acting on the single particle 1. It can be defined just as well by its matrix elements between the individual states:
$$\langle \theta_k | \, W_{HF}(1,\beta) \, | \theta_l \rangle = \sum_m f_{\beta}(\tilde{e}_m - \mu) \; \langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_m | \; W_2(1,2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \; | 1{:}\,\theta_l \,;\, 2{:}\,\theta_m \rangle \tag{85}$$
Equation (77) is valid for any two chosen values $l$ and $m$, as long as $\tilde{e}_l \neq \tilde{e}_m$. When $l$ is fixed and $m$ varies, it simply means that the ket:
$$\left[ \hat{h}_0 + \hat{v}_1 + W_{HF}(\beta) \right] |\theta_l\rangle \tag{86}$$
is orthogonal to all the eigenvectors $|\theta_m\rangle$ having an eigenvalue different from $\tilde{e}_l$; it has a zero component on each of these vectors. As for equation (83), it yields the component of this ket on $|\theta_l\rangle$, which is equal to $\tilde{e}_l$. The set of $|\theta_m\rangle$ (including those having the same eigenvalue as $|\theta_l\rangle$) forms a basis of the individual state space, defined by (26) as the basis of eigenvectors of the individual operator $\tilde{h}_0$. Two cases must be distinguished:

(i) If $\tilde{e}_l$ is a non-degenerate eigenvalue of $\tilde{h}_0$, the set of equations (77) and (83) determines all the components of the ket (86). This shows that $|\theta_l\rangle$ is an eigenvector of the operator $\hat{h}_0 + \hat{v}_1 + W_{HF}(\beta)$ with the eigenvalue $\tilde{e}_l$.

(ii) If this eigenvalue of $\tilde{h}_0$ is degenerate, relation (77) only proves that the eigen-subspace of $\tilde{h}_0$ with eigenvalue $\tilde{e}_l$ is stable under the action of the operator $\hat{h}_0 + \hat{v}_1 + W_{HF}(\beta)$; it does not yield any information on the components of the ket (86) inside that subspace. It is possible, though, to diagonalize $\hat{h}_0 + \hat{v}_1 + W_{HF}(\beta)$ inside each of the eigen-subspaces of $\tilde{h}_0$, which leads to a new basis of eigenvectors $\{|\varphi_k\rangle\}$, now common to $\tilde{h}_0$ and to $\hat{h}_0 + \hat{v}_1 + W_{HF}(\beta)$. We now reason in this new basis, where all the kets $[\hat{h}_0 + \hat{v}_1 + W_{HF}(\beta)]|\varphi_k\rangle$ are proportional to $|\varphi_k\rangle$. Taking (83) into account, we get:
$$\left[ \hat{h}_0 + \hat{v}_1 + W_{HF}(\beta) \right] |\varphi_k\rangle = \tilde{e}_k \, |\varphi_k\rangle \tag{87}$$
As we just saw, the basis change from the $|\theta_k\rangle$ to the $|\varphi_k\rangle$ only occurs within the eigen-subspaces of $\tilde{h}_0$ corresponding to given eigenvalues $\tilde{e}_k$; one can then replace the $|\theta_k\rangle$ by the $|\varphi_k\rangle$ in the definition (40) of $\tilde{\rho}_1(1)$ and write:
$$\tilde{\rho}_1(1) = \sum_k f_{\beta}(\tilde{e}_k - \mu) \; |1{:}\,\varphi_k\rangle \langle 1{:}\,\varphi_k| \tag{88}$$
Inserting this relation in the definition (84) of $W_{HF}(\beta)$ leads to a set of equations involving only the eigenvectors $|\varphi_k\rangle$. For all the values of $k$ we get a set of equations (87), which, associated with (84) and (88) defining the potential $W_{HF}(\beta)$ as a function of the $|\varphi_k\rangle$, are called the temperature dependent Hartree-Fock equations.

3-b. Properties and limits of the equations

We now discuss how to apply the mean field equations we have obtained, and their limits of validity, which are more stringent for bosons than for fermions.

α. Using the equations

The Hartree-Fock equations form a self-consistent and nonlinear system: the eigenvectors and eigenvalues of the density operator $\tilde{\rho}_1(1)$ are solutions of an eigenvalue equation (87) that itself depends on $\tilde{\rho}_1(1)$. This situation is reminiscent of the one encountered with the zero-temperature Hartree-Fock equations and, a priori, no exact solution can be found. As in the zero-temperature case, we proceed by iteration: starting from a physically reasonable density operator $\tilde{\rho}_1(1)$, we use it in (84) to compute a first value of the Hartree-Fock potential operator. We then diagonalize this operator to get its eigenkets and eigenvalues $\bar{e}_k$. Next, we build the operator $\tilde{\rho}_1(1)$ that has the same eigenkets, but whose eigenvalues are the $f_{\beta}(\bar{e}_k - \mu)$. Inserting this new operator in (84), we get a second iteration of the Hartree-Fock operator. We again diagonalize this operator to compute new eigenvalues and eigenvectors, on which we build the next approximation of $\tilde{\rho}_1(1)$, and so on. After a few iterations, we may expect convergence towards a self-consistent solution.
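The following minimal sketch illustrates this iteration loop for fermions on a finite single-particle basis. The one-body matrix `h1` and the two-body matrix elements `w2[k, l, m, n] = ⟨1:k, 2:l|W₂|1:m, 2:n⟩` are assumed to be supplied by the user (hypothetical inputs, not defined in the text); this is only an illustration of the scheme described above, not the book's algorithm in detail.

```python
import numpy as np

def fermi(e, mu, beta):
    return 1.0 / (np.exp(np.clip(beta * (e - mu), -700.0, 700.0)) + 1.0)

def hartree_fock_T(h1, w2, mu, beta, n_iter=200, mix=0.5):
    """Finite-temperature Hartree-Fock iteration sketch (fermions, eta = -1).

    h1 : (d, d) one-body matrix  <k| h0 + v1 |l>          (assumed given)
    w2 : (d, d, d, d) array      <1:k, 2:l| W2 |1:m, 2:n>  (assumed given)
    Returns the self-consistent one-particle density matrix and orbital energies.
    """
    d = h1.shape[0]
    rho1 = 0.5 * np.eye(d)                       # crude starting guess for rho_1(1)
    for _ in range(n_iter):
        # Relation (84): partial trace over particle 2, direct minus exchange (eta = -1).
        direct = np.einsum('mn,knlm->kl', rho1, w2)
        exchange = np.einsum('mn,knml->kl', rho1, w2)
        w_hf = direct - exchange
        # Relation (87): diagonalize h0 + v1 + W_HF(beta).
        energies, vectors = np.linalg.eigh(h1 + w_hf)
        # Relation (88): rebuild rho_1 from the new orbitals and their populations.
        occ = fermi(energies, mu, beta)
        rho1_new = (vectors * occ) @ vectors.conj().T
        rho1 = mix * rho1_new + (1.0 - mix) * rho1   # damping helps convergence
    return rho1, energies
```

Damping the update (the `mix` parameter) is a common practical choice when such self-consistent loops oscillate instead of converging.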

β. Validity limit

For a fermion system, there is no fundamental general limit to the use of the Hartree-Fock approximation. The pertinence of the final result obviously depends on the nature of the interactions, and on whether a mean field treatment of these interactions is a good approximation. One can easily understand that the larger the interaction range, the more each particle will be submitted to the action of many others. This leads to an averaging effect that improves the mean field approximation. If, on the other hand, each particle only interacts with a single partner, strong binary correlations may appear, which cannot be correctly treated by a mean field acting on independent particles.

For bosons, the same general remarks apply, but the populations are no longer limited to 1. When, for example, Bose-Einstein condensation occurs, one population becomes much larger than the others, and presents a singularity that is not accounted for in the calculations presented above. The Hartree-Fock approximation therefore has more severe limitations than for fermions, and we now discuss this problem.

For a boson system in which many individual states have comparable populations, taking the interactions into account by the Hartree-Fock mean field yields as good an approximation as for a fermion system. If the system, however, is close to condensation, or already condensed, the mean field equations we have written are no longer valid. This is because the trial density operator (31) contains, for each individual quantum state, a distribution function that varies as for an ideal gas, i.e. as an exponentially decreasing function of the occupation numbers. Now we saw in § 3-b of Complement B$_{XV}$ that, in an ideal gas, the fluctuations of the particle numbers in each of the individual states are as large as the average values of those particle numbers. If an individual state has a large population, these fluctuations can become very important, which is physically impossible in the presence of repulsive interactions. Any population fluctuation increases the average value of the square of the occupation number (equal to the sum of the squared average value and the squared fluctuation), and hence of the interaction energy (proportional to the average value of that square). A large fluctuation in the populations would therefore lead to an important increase of the repulsive interaction energy, in contradiction with the minimization of the thermodynamic potential. In other words, the finite compressibility of the physical system, introduced by the interactions, prevents any large fluctuation of the density. Consequently, the fluctuations in the number of condensed particles predicted by the trial Hartree-Fock density operator are not physically acceptable in the presence of condensation.

It is worth analyzing more precisely the origin of this limit of the Hartree-Fock approximation, in terms of correlations between the particles. Relation (51) concerns any two-particle operator $\hat{G}$. It shows that, using the trial density operator (31), the two-particle reduced density operator can be written as:
$$\tilde{\rho}_2(1,2) = \tilde{\rho}_1(1) \, \tilde{\rho}_1(2) \left[ 1 + \eta \, P_{\mathrm{ex}}(1,2) \right] \tag{89}$$

Its diagonal matrix elements are then written:
$$\langle 1{:}\,\theta_k \,;\, 2{:}\,\theta_l | \; \tilde{\rho}_2(1,2) \; | 1{:}\,\theta_k \,;\, 2{:}\,\theta_l \rangle = \langle 1{:}\,\theta_k | \, \tilde{\rho}_1 \, | 1{:}\,\theta_k \rangle \, \langle 2{:}\,\theta_l | \, \tilde{\rho}_1 \, | 2{:}\,\theta_l \rangle + \eta \, \langle 1{:}\,\theta_k | \, \tilde{\rho}_1 \, | 1{:}\,\theta_l \rangle \, \langle 2{:}\,\theta_l | \, \tilde{\rho}_1 \, | 2{:}\,\theta_k \rangle \tag{90}$$

Differences with the zero-temperature Hartree-Fock equations (fermions)

The main difference between the approach we just used and that of Complements CXV and EXV is that these complements were only looking for a single eigenstate of the Hamiltonian ˆ , generally its ground state. If we are now interested in several of these states, we have to redo the computation separately for each of them. To study the 1728



FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM

properties of thermal equilibrium, one could imagine doing the calculations a great many times, and then weigh the results with occupation probabilities. This method obviously leads to heavy computations, which become impossible for a macroscopic system having an extremely large number of levels. In the present complement, the Hartree-Fock equations yield immediately thermal averages, as well as eigenvectors of a one-particle density operator with their energies. Another important difference is that the Hartree-Fock operator now depends on the temperature, because of the presence in (85) of a temperature dependent distribution function – or, which amounts to the same thing, of the presence in (84) of an operator dependent on , and which replaces the projector (2) onto all the populated individual states. The equations obtained remind us of those governing independent particles, each finding its thermodynamic equilibrium while moving in the self-consistent mean field created by all the others, also including the exchange contribution (which can be ignored in the simplified “Hartree” version). We must keep in mind, however, that the Hartree-Fock potential associated with each individual state now depends on the populations of an infinity of other individual states, and these populations are function of their energy as well as of the temperature. In other words, because of the nonlinear character of the Hartree-Fock equations, the computation is not merely a juxtaposition of separate mean field calculations for stationary individual states. 3-d.

Zero-temperature limit (fermions)

Let us check that the Hartree-Fock method for non-zero temperature yields the same results as the zero temperature method explained in Complement EXV for fermions. In § 2-d of Complement BXV , we introduced for an ideal gas the concept of a degenerate quantum gas. It can be generalized to a gas with interactions: in a fermion system, when 1, the system is said to be strongly degenerate. As the temperature goes to zero, a fermion system becomes more and more degenerate. Can we be certain that the results of this complement are in agreement with those of Complement EXV , valid at zero temperature? We saw that the temperature comes into play in the definition (85) of the mean Hartree-Fock potential, . In the limit of a very strong degeneracy, the Fermi-Dirac distribution function appearing in the definition (40) of 1 (1) becomes practically a step function, equal to 1 for energies less than the chemical potential , and zero otherwise (Figure 1 of Complement BXV . In other words, the only populated states (and by a single fermion) are the states having energies less than , i.e. less than the Fermi level. Under such conditions, the 1 (2) of (84) becomes practically equal to the projector (2) which, in Complement EXV , appears in the definition (52) of the zero-temperature Hartree-Fock potential; in other words, the partial trace appearing in this relation (85) is then strictly limited to the individual states having the lowest energies. We thus obtain the same Hartree Fock equations as for zero temperature, leading to the determination of a set of individual eigenstates on which we can build a unique -particle state. 3-e.

Wave function equations

Let us write the Hartree-Fock equations (87) in terms of wave functions: these equations are strictly equivalent to (87), written in terms of operators and kets, but their 1729

COMPLEMENT GXV



form is sometimes easier to use, in particular for numerical calculations. Assuming the particles have a spin, we shall note the wave functions (r) = r

(r), with: (91)

where the spin quantum number can take (2 + 1) values; according to the nature of the particles, the possible spins are = 0, = 1 2, = 1 etc. As in Complement EXV (§ 2-d), we introduce a complete basis for the individual state space, built from kets that are all eigenvectors of the spin component along the quantization axis, with eigenvalue . For each value of , the spin index takes on a given value and is not, therefore, an independent index. As for the potentials, we assume here again that 1 is diagonal in , but that its diagonal elements 1 (r) may depend on . The interaction potential, however, is described by a function 2 (r r ) that only depends on r r , but does not act on the spins. To obtain the matrix elements of ( ) in the representation r , we use (85) after replacing the by the (we showed in § 3 that this was possible). We now multiply both sides by r and r , and sum over the subscripts and ; we recognize in both sides the closure relations: = r

r

and

= r

r

(92)

This leads to: ( ) r (

r

= ) 1:r

;2 :

2 (1

2) [1 +

(1 2)] 1 : r

;2 :

(93)

As in § C-5 of Chapter XV, we get the sum of a direct term (the term 1 in the central bracket) and an exchange term (the term in ex ). This expression contains the same matrix element as relation (87) of Complement EXV , the only difference being the presence of a coefficient ( ) in each term of the sum (plus the fact that the summation index goes to infinity). (i) For the direct term, as we did in that complement, we insert a closure relation on the particle 2 position: 2:

3

=

(r2 ) 2 : r2

2

(94)

Since the interaction operator is diagonal in the position representation, the part of the matrix element of (93) that does not contain the exchange operator becomes: 3

2

(r2 )

2

1:r

; 2 : r2

2 (1

2) 1 : r

; 2 : r2

(95)

The direct term of (93) is then written: (r

r)

d3

2

2 (r

r2 )

(

)

(r2 )

2

which is equivalent to relation (91) of Complement EXV . 1730

(96)



FERMIONS OR BOSONS: MEAN FIELD THERMAL EQUILIBRIUM

(ii) The exchange term is obtained by permutation of the two particles in the ket appearing on the right-hand side of (93); the diagonal character of 2 (1 2) in the position representation leads to the expression: 1:r

1:

2:

2:r

2 (r

r)

(97)

For the first scalar product to be non-zero, the subscript must be such that = ; in the same way, for the second product to be non-zero, we must have = . For both conditions to be satisfied, we must impose = , and the exchange term (93) is equal to: (

)

(r)

(r )

2 (r

r)

(98)

=

where the summation is over all the values of $l$ such that $\nu_l = \nu$: this term only exists if the two interacting particles are totally indistinguishable, which requires that they be in the same spin state (see the discussion in Complement E$_{XV}$). We now define the direct and exchange potentials by:
$$V_{\mathrm{dir}}(\mathbf{r}) = \int \mathrm{d}^3 r' \; W_2(\mathbf{r} - \mathbf{r}') \sum_l f_{\beta}(\tilde{e}_l - \mu) \left| \varphi_l(\mathbf{r}') \right|^2$$
$$V_{\mathrm{ex}}^{\nu}(\mathbf{r}, \mathbf{r}') = W_2(\mathbf{r} - \mathbf{r}') \sum_{l \,;\, \nu_l = \nu} f_{\beta}(\tilde{e}_l - \mu) \; \varphi_l(\mathbf{r}) \, \varphi_l^*(\mathbf{r}') \tag{99}$$
The equalities (87) then lead to the Hartree-Fock equations in the position representation:
$$\left[ -\frac{\hbar^2}{2m} \Delta + V_1(\mathbf{r}) + V_{\mathrm{dir}}(\mathbf{r}) \right] \varphi_k(\mathbf{r}) + \eta \int \mathrm{d}^3 r' \; V_{\mathrm{ex}}^{\nu}(\mathbf{r}, \mathbf{r}') \, \varphi_k(\mathbf{r}') = \tilde{e}_k \, \varphi_k(\mathbf{r}) \tag{100}$$
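For concreteness, here is a minimal one-dimensional sketch (a toy grid, Gaussian orbitals and a model interaction, all of them our own illustrative assumptions) of how the direct potential and the exchange kernel of the definitions just written can be assembled numerically from a set of orbitals and their populations:

```python
import numpy as np

# Grid, trial orbitals phi[l, x] and occupations f_l = f_beta(e_l - mu): all illustrative.
x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
phi = np.array([np.exp(-0.5 * (x - c) ** 2) for c in (-1.0, 0.0, 1.0)])
phi /= np.sqrt((np.abs(phi) ** 2).sum(axis=1, keepdims=True) * dx)   # normalize
occ = np.array([0.9, 0.5, 0.1])

w2 = 2.0 * np.exp(-np.abs(x[:, None] - x[None, :]))      # model interaction W2(x - x')

density = (occ[:, None] * np.abs(phi) ** 2).sum(axis=0)  # sum_l f_l |phi_l(x')|^2
v_dir = (w2 * density[None, :]).sum(axis=1) * dx         # direct potential, first line of (99)
v_ex = w2 * np.einsum('l,lx,ly->xy', occ, phi, phi.conj())   # exchange kernel, second line of (99)
```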

The general discussion of § 3-b can be applied here without any changes. These equations are both nonlinear and self-consistent, as the direct and exchange potentials are themselves functions of the solutions $\varphi_k(\mathbf{r})$ of the eigenvalue equations (100). This situation is reminiscent of the zero-temperature case, and we can, once again, look for solutions using iterative methods. The number of equations to be solved, however, is infinite, and no longer equal to the finite number $N$, as already pointed out in § 3-c. The set of solutions must span the entire individual state space. Along the same line, in the definitions (99) of the direct and exchange potentials, the summations over $l$ are not limited to $N$ states, but go to infinity. However, even though the number of these wave functions is in principle infinite, it is limited in practice (for numerical calculations) to a high but finite number. As for the initial conditions used to start the iteration process, one can choose for example the states and energies of a free fermion gas, but any other starting guess is equally possible.

Conclusion

There are many applications of the previous calculations, and more generally of the mean field theory. We give a few examples in the next complement, which are far from exhausting the richness of the possible range of applications. The main physical idea is to




reduce, whenever possible, the calculation of the various physical quantities to a problem similar to that of an ideal gas, where the particles have independent dynamics. We have indeed shown that the individual level populations, as well as the total particle number, are given by the same distribution functions as for an ideal gas – see relations (38) and (44). The same goes for the system entropy $S$, as already mentioned at the end of § 2-b-α. If we replace the free particle energies by the modified energies $\tilde{e}_k$, the analogy with independent particles is quite strong. If we now want to compute other thermodynamic quantities, such as the average energy, we can no longer use the ideal gas formulas; we must go back to the equations of § 2-c. The grand potential may be calculated by inserting in (61) the $|\varphi_k\rangle$ and the $\tilde{e}_k$ obtained from the Hartree-Fock equations. Another method uses the fact that $\langle \hat{N} \rangle$ is given by ideal gas formulas that contain the distribution $f_{\beta}$, and hence does not require any further calculation. As:
$$\langle \hat{N} \rangle = \frac{1}{\beta} \frac{\partial}{\partial \mu} \ln \tilde{Z} \tag{101}$$
we can integrate $\langle \hat{N} \rangle$ over $\mu$ (between $-\infty$ and the current value of $\mu$, for a fixed value of $\beta$) to obtain $\ln \tilde{Z}$, and hence the grand potential. From this grand potential, all the other thermodynamic quantities can be calculated by taking the proper derivatives (for example a derivative with respect to $\beta$ to get the average energy). We shall see an example of this method in § 4-a of the next complement. We must however keep in mind that all these calculations derive from the mean field approximation, in which we replaced the exact equilibrium density operator by an operator of the form (32). In many cases this approximation is good, even excellent, as is the case, in particular, for a long-range interaction potential: each particle then interacts with many others, which enhances the averaging effect of the interaction potential. It remains, however, an approximation: if, for example, the particles interact via a "hard core" potential (infinite potential when their mutual distance becomes smaller than a certain microscopic distance), the particles, in the real world, can never be found at a distance from each other smaller than the hard core diameter; this impossibility is not taken into account in (32). Consequently, there is no guarantee of the quality of a mean field approximation in all situations, and there are cases for which it is not sufficient.




Complement H$_{XV}$ — Applications of the mean field method for non-zero temperature (fermions and bosons)

1. Hartree-Fock for non-zero temperature, a brief review
2. Homogeneous system
   2-a. Calculation of the energies
   2-b. Quasi-particles
3. Spontaneous magnetism of repulsive fermions
   3-a. A simple model
   3-b. Resolution of the equations by graphical iteration
   3-c. Physical discussion
4. Bosons: equation of state, attractive instability
   4-a. Repulsive bosons
   4-b. Attractive bosons

In the previous complement, we presented the Hartree-Fock (mean field) method for non-zero temperatures, which has numerous applications – a few of them will be discussed in this complement. We start in § 1 with a brief review of the results obtained with this method in the previous complement, which will be used here. The general properties of a homogeneous system are then studied in § 2; this particular case is often encountered, which gives it a special importance. The last two sections are concerned with the study of phase transitions in homogeneous systems. Section 3 studies fermions; we show how the mean field theory predicts the existence of a transition in which a fermion system becomes spontaneously magnetic because of the repulsion between particles (even though this repulsion is assumed to be completely independent of the spins). Finally, the last section deals with bosons and the study of their equation of state. This will allow us to show, in particular, the appearance of an instability when the bosons are attractive and close to Bose-Einstein condensation.

1. Hartree-Fock for non-zero temperature, a brief review

We start with a brief review of the results obtained previously (§ 2 of Complement B$_{XV}$ and § 3 of Complement G$_{XV}$), which will be useful for what follows. For an ideal gas, the distribution function is $f_{FD}$ for fermions, or $f_{BE}$ for bosons:
$$f_{\beta}(\bar{e} - \mu) = f_{FD}(\bar{e} - \mu) = \frac{1}{e^{\beta(\bar{e} - \mu)} + 1} \quad \text{for fermions} \qquad\qquad f_{\beta}(\bar{e} - \mu) = f_{BE}(\bar{e} - \mu) = \frac{1}{e^{\beta(\bar{e} - \mu)} - 1} \quad \text{for bosons} \tag{1}$$

where $\mu$ is the chemical potential. The average total particle number $\langle \hat{N} \rangle$ is then obtained by a sum over all the accessible individual states, labeled by the subscript $k$:
$$\langle \hat{N} \rangle = \sum_k f_{\beta}(\bar{e}_k - \mu) \tag{2}$$
The temperature dependent Hartree-Fock equations (mean field equations) in the position representation are given by relation (100) of Complement G$_{XV}$:
$$\left[ -\frac{\hbar^2}{2m} \Delta + V_1(\mathbf{r}) + V_{\mathrm{dir}}(\mathbf{r}) \right] \varphi_k(\mathbf{r}) + \eta \int \mathrm{d}^3 r' \; V_{\mathrm{ex}}(\mathbf{r}, \mathbf{r}') \, \varphi_k(\mathbf{r}') = \bar{e}_k \, \varphi_k(\mathbf{r}) \tag{3}$$
where $\eta = +1$ for bosons, $\eta = -1$ for fermions, and where $V_{\mathrm{dir}}(\mathbf{r})$ and $V_{\mathrm{ex}}(\mathbf{r}, \mathbf{r}')$ are given by relation (99) of that same complement (we assume the interaction potential does not act on the spin quantum numbers):
$$V_{\mathrm{dir}}(\mathbf{r}) = \int \mathrm{d}^3 r' \; W_2(\mathbf{r} - \mathbf{r}') \sum_l f_{\beta}(\bar{e}_l - \mu) \left| \varphi_l(\mathbf{r}') \right|^2$$
$$V_{\mathrm{ex}}(\mathbf{r}, \mathbf{r}') = W_2(\mathbf{r} - \mathbf{r}') \sum_{l \,;\, \nu_l = \nu} f_{\beta}(\bar{e}_l - \mu) \; \varphi_l(\mathbf{r}) \, \varphi_l^*(\mathbf{r}') \tag{4}$$

Homogeneous system

We assume from now on that the physical system is subjected to boundary conditions created by a one-body potential, which confines the particles inside a cubic box of edge length ; this potential is zero ( 1 = 0) inside the box, and takes on an infinite value outside. To take this confinement into account, we shall use the periodic boundary conditions (Complement CXIV , § 1-c), for which the normalized eigenfunctions of the kinetic energy are written as: 1

kr

(5)

3 2

where the possible wave vectors k are those whose three components are integer multiples of 2 . Because of the spin, the eigenvectors of the kinetic energy are labeled by the values of both k and , and are written k , with: r

k

=

k (r)

=

1 3 2

kr

(6)

The index (or ) that labeled the basis vectors in the previous complements is now replaced by two indices, k and (which are independent, as opposed to the indices and used in § 3-e of Complement GXV ). We shall finally assume that the particle interaction is invariant under translation: 2 (r1 r2 ) only depends on r1 r2 . We are going to see that, in such a case, solutions of the Hartree-Fock equations can be found without having to search for the eigenfunctions of the (Hartree-Fock) operator written on the left-hand side of (3); these solutions are simply the plane waves written in (5). Only the operator’s eigenvalues k remain to be calculated, and can be interpreted as the energies of independent objects called “quasi-particles” (§ 2-b).

1734



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

Comment: We shall verify that these plane waves are solutions of the Hartree-Fock equations, while neither being necessarily the only ones, nor even those leading to the lowest energy of the total system. A phenomenon called “symmetry breaking” (translation symmetry in this case) could occur and introduce solutions whose moduli vary in space and correspond to lower energies. The Wigner crystal of electrons is such an example, where the particle density spontaneously shows a periodic spatial modulation. Another example of spontaneous symmetry breaking will be discussed in § 3-c of this complement. There are many other cases (in nuclear physics in particular) where the Hartree-Fock method can be used to study symmetry breaking phenomena. 2-a.

Calculation of the energies

As the plane waves are obviously eigenfunctions of the kinetic energy, and since the potential is zero inside the box, we just have to demonstrate that they are also eigenfunctions of the direct and exchange potentials. Inserting (5) in (4), we get: dir (r)

1

=

3

(

k

)

(

k

)

d3

2 (r

r )

k 0 3

=

(7)

k

where 0

0

is defined as (with a change of variable r d3

=

2 (r

d3

r )=

r = s):

2 (s)

(8)

The direct potential is therefore a constant, independent of the position r; multiplying an exponential k r , it yields a function proportional to it. This means that k r is an eigenfunction of the direct potential. As for the exchange potential, using the second relation of (4), we get: ex (r

r)=

1

(

3

)

k

k (r r )

2 (r

r)

(9)

k

The exchange potential is thus also translation-invariant (it only depends on the difference r r ). Consequently, the last term on the left-hand side of (3) can be written as: (

3

)

k

d3

kr

k (r r )

2 (r

r)

3 2

k kr

=

(

3 2

k

)

3

(k

k)

(10)

k

where (with the change of variable r (k

k)

=

d3

(k k ) s

2 (s)

r = s): (11) 1735



COMPLEMENT HXV

Consequently, the exchange term simply multiplies the plane wave (

3

)

k

(k

kr

3 2

by: (12)

k)

k

To sum up, we showed that, for a uniform system, the plane waves are indeed solutions of the Hartree-Fock equations (3). It is no longer necessary to solve the eigenvector equations, but we simply have to replace in (3) the k (r) by plane waves. This leads to: =

k

+

0 3

( k

where

k

)+

(

3

k

)

(k

k)

(13)

k

is the kinetic energy of a free particle:

=

}2 k2 2

(14)

We have obtained self-consistent conditions for the eigenvalues, which are a set of coupled nonlinear equations because of the ( k ) dependence on the energies k . Comment: The exchange term contains the Fourier transform at the spatial frequency k k of the particle interaction potential; the direct term, however, contains the Fourier component at zero spatial frequency. This property is easy to understand from a physical point of view. Consider two particles, having respectively an initial momentum }k and }k . We saw in Chapter VIII (§ B-4-a) that the effect, to first order (Born approximation), of an interaction potential is proportional to the Fourier transform of that potential, calculated at the value of the variation of the relative momentum between the two particles (Chapter VII, § B-2-a); this variation is none other than the momentum transfer between the particle as they interact. Consequently, it is normal that the system energy is the sum of two terms: a direct term where no particle changes its momentum (no momentum transfer, hence a Fourier variable equal to zero); and another one where the two particles exchange their momenta, so that the relative momentum changes sign and the Fourier variable is proportional to the difference k k. 2-b.

Quasi-particules

Equations (13) yield the individual energies k , which are the sum of the free particle energy }2 2 2 and a contribution from the interactions. One can look at them as energies of individual objects1 , often referred to as “quasi-particles”. The populations of the corresponding levels, as well as the total number of quasi-particles, are given by the same distribution functions as for an ideal gas – see relations (39) and (44) of Complement GXV . The same is true for the system entropy , as we already mentioned at the end of § 2-b- of that complement. Provided we replace the free particle energy by the modified energies k , the analogy with independent particles is quite strong. 1 The concept of quasi-particle is not necessarily limited to systems whose interacting particles are free inside a box; it remains valid for non-zero 1 potentials (a harmonic potential for example). The first term in (13) must then be replaced by the particle energy in the potential 1 , and the direct and exchange terms will have a different expression.

1736

• 3.

APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

Spontaneous magnetism of repulsive fermions

Consider a system of spin 1 2 fermions, contained in a box. To make the computations easier, we shall make a few simplifying hypotheses. They lead to a simple model, giving a good illustration of the nonlinear character of the Hartree-Fock theory. They involve the resolution of a set of nonlinear equations, containing only two variables – equations we will write in (22). 3-a.

A simple model

We assume the mutual interaction potential to be repulsive and to have a very short range 0 . In relation (13) the only vectors k and k that matter are those for which the distributions ( k ) and ( k ) are not negligible. If, for all these vectors, the products 0 and k) s may be 0 are very small compared to 1, the product (k replaced by zero in (11), and we get: (k

.

(15)

0

k)

Energy of the quasi-particles; spin state populations As

k

=

=

+

1 for fermions, equation (13) can be simplified: 0 3

(

)

k

(

k

k

)

(16)

k

or else: k

=

+

0 3

( k

)

k

(17)

=

Consequently, the energy of a quasi-particle with a given is only modified by the interaction with quasi-particles having a different spin component (opposite spin if the particles have a spin = 1 2). This result was to be expected since if the spins of the two quasi-particles are parallel, they cannot be distinguished; the Pauli principle then forbids them to approach at a distance closer than 0 , and they cannot interact. On the other hand, if their spins are opposite, they can be identified by the direction of their spin (we have assumed the interaction does not act on the spins): they behave as distinguishable particles, the exclusion principle does not apply, and they now interact. We note the total particle numbers respectively in the spin state + and + or : =

(

)

k

(18)

k

Equation (17) shows that the energies of the + and to: k+

=

+

;

k

=

+

+

spin states are modified according

(19) 1737

COMPLEMENT HXV



where the coupling constant

(having the dimension of an energy) is defined by:

0 3

=

(20)

Since the particle numbers only depend on the difference between the energies and the chemical potential , we can account for the terms in appearing in (19) by keeping the energies for free particles, but lowering the chemical potentials by the quantity . Calling ( ), as in relation (47) of Complement BXV , the total number of fermions in an ideal gas: 3

(

)=

1

d3

2

(

)

+1

(21)

we get, for an interacting gas: =

+

(22) =

+

These equations determine the populations of the two spin states as a function of the parameters (or the temperature), and finally the volume . These are, however, two coupled equations since the population depends on and conversely. + Finding their solution is not obvious, and we shall use a change of variables and resort to a graphic construction. .
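Before turning to the graphical method, here is a minimal numerical sketch of what solving these two coupled equations by direct alternating iteration looks like (the discrete toy spectrum, the parameter values, and the helper name `n_ideal` are all illustrative assumptions, not the book's data):

```python
import numpy as np

def n_ideal(mu, beta, energies):
    """Ideal Fermi gas particle number for a discrete set of orbital energies."""
    x = np.clip(beta * (energies - mu), -700.0, 700.0)
    return np.sum(1.0 / (np.exp(x) + 1.0))

# Toy discretization of the box levels (illustrative values only).
energies = np.sort(np.random.default_rng(1).uniform(0.0, 10.0, 2000))
beta, mu, g = 2.0, 4.0, 0.002        # g plays the role of the coupling constant of (20)

n_plus, n_minus = 1.0, 1.0           # starting guess for the two spin populations
for _ in range(500):
    # Each population sees a chemical potential lowered by g times the other population.
    n_plus, n_minus = (n_ideal(mu - g * n_minus, beta, energies),
                       n_ideal(mu - g * n_plus, beta, energies))
print(n_plus, n_minus)
```

For a weak coupling the two populations converge to a common value; for a stronger coupling this naive iteration starts oscillating between two values, which is precisely the two-cycle behavior analyzed graphically in the next subsection.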

Change of variables

It is useful to write the previous relations in terms of dimensionless variables. We shall thus introduce the “thermal wavelength” by: 2

=}

=}

2

(23)

We can now make the same change of integration variable as in § 4-a of Complement BXV : =

(24)

2

Relation (21) then becomes: 3

(

)=

3 2(

)

(25)

where: 3 2(

)=

3 2

d3

1 2

+1

(26)

These relations are just the same as those written in (51) and (52) in Complement BXV . The value of 3 2 only depends on a dimensionless variable, the product . As opposed 1738



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

to which is an “extensive” quantity (proportional2 to the volume = system), 3 2 ( ) is an “intensive” quantity (independent of the volume).

3

We can also replace the two unknowns, which are the populations two spin states, by two dimensionless and intensive variables :

of the of the

3

=

(27)

To characterize the interactions appearing in relations (22) via the constant , we introduce the dimensionless parameter : 3

=

=

0

Replacing in equations (22) the +

(1)

= =

(1)

( (

( )=

by the

, and

by , we get the simpler form:

)

(29)

+)

where the function (1)

(28)

2 }2

3 2

(1)

(

is defined by: )

(30)

(this function depends not only on , but also on the parameters and ). As (22), the system (29) contains two coupled equations: allows computing directly + , and vice versa. 3-b.

+

Resolution of the equations by graphical iteration

We now show how to solve equations (29) by a graphical method. The two variables and can be uncoupled by noting that: +

=

(1)

(1)

(

+)

with the same equation for . We now introduce a second iterated function function (1) (function of the same function) as: (2)

( )=

(31) (2)

of the

[ ( )]

(32)

+)

(33)

This leads to: +

=

(2)

(

Applying the function (2) to the variable yields the same value , which is said to be the abscissa of a “fixed point” of this function. Graphically, the fixed points of any function are at the intersections of the curve representing the function with the first bisector. 2 In a more general way, Appendix VI recalls that a quantity is said to be extensive if, in the limit of large volumes (for and constant), its ratio to the volume tends toward a constant; this does not prevent the quantity from containing terms in 2 for example. On the other hand, it is said to be intensive if, in this same limit (and without dividing by the volume), it tends toward a constant.

1739

COMPLEMENT HXV

.



Iterations of a function Consider, from a general point of view, the equation: =

( )

(34)

whose solutions correspond to the fixed points of the function . These solutions can be found by iteration: starting from an approximate value 1 of the solution, we compute ( 1 ), then use 2 = ( 1 ) as a new value of the variable to compute 3 = ( 2 ), etc. It can be shown that this iteration process converges toward the solution of equation (34), hence toward the fixed point on the first bisector, if the slope of the function at that point is included between 1 and +1, that is if: 1

( )

+1

(35)

where is the derivative of the function . The fixed point of the application is then said to be stable. On the other hand, if that slope is outside the interval [ 1 +1], the fixed point on the first bisector becomes unstable; the iteration method for no longer converges. We can also introduce the “second iterated function” (2) ( ) = [ ( )]. Any fixed point of is necessarily a fixed point of (2) . The inverse is not true, as it is possible to get a “two-order cycle” where two different values of are swapped under the effect of : 2 1

= =

( (

1)

(36)

2)

In such a case, 1 and 2 are both fixed points of (2) , but not of (we shall see below an example of such a situation, illustrated by Figure 3). These fixed points can be stable for (2) , in which case they constitute a “stable cycle of order two” for the initial function . After a certain number of iterations of , the solution converges toward a series taking alternatively two distinct values, 1 and 2 . The process may repeat itself: it is possible for the fixed point of (2) to next become unstable, and yield fixed points for an iterated function of a higher order, and hence to a stable cycle of that order. .
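A short numerical illustration of these two regimes (stable fixed point versus stable cycle of order two) can be obtained with any decreasing map qualitatively similar to the first iterated function; the map and parameter values below are our own illustrative choices, not the functions of this complement:

```python
import numpy as np

def iterate(f, x0, n):
    """Return the sequence x0, f(x0), f(f(x0)), ... of length n."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(f(xs[-1]))
    return np.array(xs)

# x -> a * exp(-x) is decreasing, like the first iterated function.
# Small a: |slope| < 1 at the fixed point, the iteration converges to it.
# Large a: the fixed point becomes unstable and a stable two-cycle appears.
for a in (1.0, 5.0):
    xs = iterate(lambda x: a * np.exp(-x), x0=0.3, n=60)
    print(a, xs[-4:])   # a = 1: four nearly equal values; a = 5: an alternating pair
```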

Form of the function

(1)

Relation (25) shows that the variations of 3 2 ( ) as a function of are very similar to the variations of ( ) as a function of , already studied in Complement BXV (Figure 2). The equality (30) shows that the plot of (1) ( ) can be deduced from that of 3 2 ( ): reversing the variable

(symmetry with respect to the vertical axis)

multiplying this variable by

(scale change of the abscissa )

and finally shifting to the right the abscissa origin by the value

.

This leads to the solid line curve in Figure 1, that plots a constantly decreasing function (for fixed values of , and ). 1740



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

As the parameter changes (for and constant), we get a set of curves representing (1) ( ) for each value of . For = 0, all these curves go through the same point of ordinate 3 2 ( ). If = 0, the curve is a simple horizontal line going through this point. When becomes slightly positive, the curve starts decreasing but still extends along the abscissa axis; it has a small negative slope at the origin. As increases more and more, the curve contracts more and more toward the ordinate axis and its slope at the origin is more and more negative. In the limit where goes to infinity, the curve becomes a straight vertical line. .

Form of the function

(2)

Figure 1 also shows the geometric construction used to go from (1) to (2) . For a given value of , we start from point 1 of ordinate (1) ( ), and draw a horizontal line until it crosses the first bisector in 2 , which transfers the ordinate of 1 onto the abscissa. From the intersection point 2 we draw a vertical line that intersects the function (1) at point 3 of ordinate (2) ( ), which we simply transfer to the initial abscissa to get the final point (surrounded by a triangle). This construction shows that (2) ( ) is an increasing function of confined between two horizontal asymptotes: the abscissa axis, and a horizontal line of ordinate ). The larger the value of , the faster the increase of (2) ( ). 3 2( .

Influence of the coupling parameter on the fixed points

We now discuss the influence of the parameter 0 on the stability of single or multiple fixed points. (i) The trivial case where 0 = = 0 (no interaction between the fermions) is particularly simple: the two curves are now identical horizontal lines, whose zero slope makes their intersection with the first bisector obviously stable. We then get the ideal gas results, with equal + and densities. (ii) As long as (hence 0 ) is weak enough, the slope of (1) at the intersection point is small and the corresponding fixed point remains stable, as shown in Figure 1. This same point is obviously a fixed point for (2) as well; as the derivative of a function 2 [ ( )] with respect to is [ ( )] ( ), that is [ ( )] at a point where = ( ), (2) the slope of is less than 1, and that fixed point is also stable. In such a case, both functions have only one common fixed point, which determines the only solution of the equations: the numbers + and are necessarily equal since they correspond to a fixed point of (1) . The only effect of the fermion repulsion is to lower in an equal way the densities associated with each of the spin states. (iii) If now (or 0 ) gets larger, we come to a situation, for a certain critical value of , where the slope of (1) at the intersection with the first bisector becomes equal to 1, and that of (2) now takes the value +1. The corresponding critical situation is plotted in Figure 2, where the curve representing the function (2) is now tangent to the first bisector at their intersection (even osculating3 to it, as their contact is of order two). For both functions, the fixed point is now right at the border of its stability domain.

3 The first derivative of the function [ ( )] is equal to ( ) [ ( )], and its second derivative, to ( ) [ ( )]+[ ( )]2 [ ( )]. At a fixed point, that second derivative becomes ( ) ( ) [1 + ( )], which cancels out when ( ) = 1.

1741

COMPLEMENT HXV



Figure 1: Plots of the functions (1) (solid line) and (2) (dashed line) as a function of . Starting from any initial value of the variable , we place a point 1 on the first iterated (1) curve, whose ordinate we transfer on the axis by using the first bisector (point ); a new vertical intersection with the solid line (point 3 ) yields the second iterated 2 value (2) , that must be simply transferred to the initial value of the variable (final point surrounded by a triangle). The whole dashed line curve can thus be constructed point by point. This method shows that when , that curve is asymptotic to the abscissa axis; when + , the curve now has another horizontal asymptote, of ordinate (1) (0) = 3 2 ( ), represented by a line with smaller dashes. The general form of the second iterated function is plotted in the figure: a uniformly increasing function between those two asymptotic values. In the case represented here, the coupling constant is supposed to be weak enough for the two curves to intersect the first bisector with slopes of moduli less than 1; we then get a unique stable solution where and + are equal, which corresponds to a non-polarized spin system. (iv) Beyond that situation, as shown in Figure 3, the curve representing (2) intersects the first bisector in three points; the middle one is unstable as it corresponds to a slope larger than 1, but the two points on each sides are stable since they are associated with slopes between 1 and +1. As far as the central point is concerned, the slightest perturbation moves the iteration away from that point. On the other hand, the other two points are fixed stable points for (2) ; they correspond to a physically acceptable solution of equations (29). As those two points are not fixed for (1) , two different values, , are swapped under the action of the function (1) (two-cycle fixed points, + and represented by the arrows in the figure). We get a solution of the equations where the spin state populations are different: the gas develops a spontaneous polarization when the repulsion goes beyond a certain critical value and a phase transition occurs.

Comment: For convenience, we discussed the emergence of the spontaneous polarization as a function and : the plot of the curves is then simply of the parameter 0 , for fixed values of

1742



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

Figure 2: Plots of the functions (1) (solid line) and (2) (dashed line) in the critical case = , where the function (1) intersects the first bisector with a slope equal to 1; the slope of (2) is then equal to +1, and this function is not only tangent but also osculating to the first bisector (in other words it intersects this bisector at three points grouped together).

obtained by a scale change along the axis. In general, however, the phase transition is observed by changing either the physical system density, or its temperature, while keeping the interactions constant. Our line of reasoning can be applied to this case, keeping the interactions constant while changing either the chemical potential that controls the particle density, or , which is the inverse of the temperature. When either of these two parameters gets larger, the ordinate at the origin of 3 2 ( ) gets higher, which increases the absolute values of the slopes of (1) and (2) ; the same phenomenon as above (instability and phase transition) will thus occur when the temperature is lowered or the density increased. If 3 2 ( ) 1, relation (25) shows that the particle number contained in a volume ( )3 is large compared to one, which means that the average distance between the particles is smaller than the thermal wavelength; the fermion gas is then degenerate.

3-c.

Physical discussion

As the spins carry a magnetic moment, a spontaneous polarization implies a transition towards a ferromagnetic phase. The origin of this phenomenon comes from an equilibrium between two opposite tendencies. On one hand, the “motor” of the transition comes from the fact that, to minimize the repulsion energy, the system tends to put all its particles in the same spin state (polarized system), which prevents them from interacting. This is because the Pauli principle forbids them to be at the same point in space, and as we assumed their interaction potential to be of zero-range, they can no longer interact. On the other hand, the system polarization (for a fixed total density) increases its kinetic energy: the same number of particles must be placed in a single Fermi sphere, instead of two, which results in a sphere with a larger radius, i.e. a higher Fermi level. It also changes the system entropy. The compromise between gain and loss (for 1743

COMPLEMENT HXV



Figure 3: Beyond the critical point, the function (1) intersects the first bisector with a slope less than 1, and there are now three distinct intersection points of the function (2) and the bisector. The middle point corresponding to an (2) slope larger than 1 is unstable, but the other two points (surrounded by circles) are stable. These two points are swapped under the action of (1) (two-cycle fixed points, represented by the arrows). They yield different values for the spin densities, which leads to the appearance of a spontaneous spin polarization.

the grand potential) varies as a function of the parameters; when those parameters take a value where gain and loss balance each other perfectly, a spontaneous ferromagnetic transition occurs. A more detailed study is possible; examining the shapes of the curves we plotted, we deduce that the conditions that favor the transition are: strong repulsion, high density, low temperature. It is worth noting that no Hamiltonian acting on the spins comes into play in this phase transition. Even though the interactions are totally independent of the spin, the Fermi-Dirac statistics has an effect on the spins, and can induce a transition polarizing those spins. At the critical point (Figure 2), the two new stable points appear at the same place, and move away from each other in a continuous way. The phase transition is therefore continuous, which puts it into the category of second order phase transitions. The study of critical transitions is a very large domain of physics that we cannot discuss here in a general way. We can, however, take the analysis a little further, without too much difficulty: we note, from the equations written above, that the distance between the two stable points increases, beyond the critical point where = and = , as the square root of the difference (or ). In other words, the system spontaneous magnetization varies as the square root of the distance to the critical point, which is typical of the so-called “Hopf bifurcation”. In addition, at the critical point, the magnetic susceptibility of the spin system diverges. Comments: (i) A very general concept plays a role here: spontaneous symmetry breaking. The first symmetry breaking concerns the two opposed directions along the quantization axis

1744



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

. Equations (29) are invariant upon a permutation of + and ; for any solution of these equations, there exists another one where these two variables are interchanged, and where the spin magnetization points in the opposite direction. This was to be expected since nothing physically distinguishes those two directions. The symmetry is said to be broken if the stable solutions of the set of equations are asymmetric, corresponding to different values of + and ; there are then necessarily two (or more) distinct solutions, symmetric to one another. Furthermore, the quantization axis we used is arbitrary; had we chosen a different direction, we would have found that the spontaneous magnetization could point in any spatial direction. This again was to be expected since our problem is rotation invariant. The ferromagnetic transition phenomenon we have just studied corresponds to a spontaneous breaking of the rotational symmetry of the usual space, often called, in terms of group symmetry, “ (3) symmetry breaking”. There are many other second order transitions that break various symmetry groups, as for example the symmetry (1) for the superfluid transition, etc. (ii) A mean field theory like the one we used – i.e. an approximate theory – may identify the existence of a critical transition (second order transition) as explained above, but does not allow an exhaustive study of all its aspects, in particular in the vicinity of the critical point. Several critical phenomena (large wavelength critical fluctuations for example) cannot be accounted for with such an approximation, and require more elaborate theoretical methods.

4.

Bosons: equation of state, attractive instability

For bosons, the equations (3) are very similar to those we used for fermions, except for a change of sign of , and hence of the exchange potential. For a barely degenerate system, this modifies the interaction effects, but does not drastically change their consequences. On the other hand, for a system of degenerate bosons, the situation is radically different since expression (1) presents a singularity when ( ) is zero – whereas none occurs for fermions. As pointed out in Complement BXV , this is the origin of the “Bose-Einstein condensation” phenomenon: as the chemical potential increases, the singularity becomes significant when gets close (through lower values) to the lowest individual energy among the , that is close to the ground level energy 0 . The population of this level then increases more and more and can become “extensive” (proportional to the system volume in the limit of large volumes). Actually, using the Hartree-Fock equations for condensed boson systems leads to some difficulties, which will be briefly discussed below – see Comment (ii) of § 4-a. We shall limit ourselves to the study of non-condensed systems, not excluding the possibility that they approach condensation. We assume the bosons are without spin, and, as we did for fermions, that the range of the interaction potential 2 (s) appearing in relation (11) is short enough so that: (k

k)

=

(37)

0

In that case, the direct and exchange contributions in (13) are equal. For a homogeneous system, this equation then becomes: k

=

+

2

0 3

(

k

)=

+

2

0 3

(38)

k

1745

COMPLEMENT HXV

where



is the average total number of particles: =

(

)=

k

(

k

∆ )

k

(39)

k

with: 2

∆ =

0 3

(40)

We therefore find that the average total number of particles is the same as for a boson gas without interactions, provided the chemical potential is replaced by an effective chemical potential = + ∆ . The same holds true for the average population of each individual state k. As in Complement BXV , we note ( ) the function yielding the particle number for an ideal gas of bosons: (

)=

(

)=

k

(2 )

k

1

d3

3

(

)

1

(41)

(the second equality is valid for large volumes). Equation (39) then becomes: = 4-a.

(

)

(

+∆ )

(42)

Repulsive bosons

For repulsive interactions, Figure 4 shows how to graphically obtain, by a geometric construction, the system density predicted by equation (42). For a given chemical potential, the particle number decreases because of the repulsion, which takes the system further away from condensation; consequently, its description by the Hartree-Fock equations is a good approximation. To first order in 0 , relation (42) may be approximated by:

= Noting Φ( =

1

(

)+∆

(

)

( 0

) )

(

3

)

(43)

) the grand potential, relation (62) of Appendix VI shows that: ln

Integrating over Φ(

2

(

)=Φ (

=

Φ(

)

(44)

relation (43) from )+

0 3

(

to the value , we get the grand potential: )

2

(45)

where Φ ( ) is the grand potential for the ideal gas, at the same temperature and chemical potential. In addition, relation (62) of Appendix VI shows that the grand potential is equal to minus the product of the volume and the pressure : Φ( 1746

)=

(46)



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

Figure 4: Geometric solutions of equations (40) and (42) for a gas of repulsive bosons. On the graph plotting the total number of particles as a function of the chemical potential, we draw a line starting from the point on the abscissa axis with chemical potential , and 3 having the slope 2 0 . It intersects the curve at a point whose abscissa is and ordinate . As 0 is positive for a repulsive gas, we see that the interactions lower the density (at constant temperature and chemical potential). The text explains how this geometric construction yields the equation of state for the interacting gas. This means that if, at constant , we vary the parameter in (43) and (45), we obtain in the plane ( , ) a curve representing the pressure as a function of the particle number in the volume , which is an isothermal line of the equation of state. Repeating this plot for several values of , we get a set of curves covering the whole equation of state, taking into account the changes introduced by the interactions. Comments: (i) To keep the computations as simple as possible, we limited ourselves to the first order in 0 . It is however possible to make the graphical construction of Fig. 4 more precise by including the higher order terms. (ii) We discussed in § 3-b- of Complement GXV the limits of the Hartree-Fock approximation for bosons, which can no longer be used when the physical system gets too close to Bose-Einstein condensation. The graphical construction shown in Figure 4 loses its physical meaning if the intersection point on the curve is too close to the contact point of the curve with the vertical axis. 4-b.

Attractive bosons

Attractive interactions ( 0 0) result in an increase of the effective chemical potential, and consequently raises the value of . This in turn increases the effective chemical potential, and this positive feedback may even induce an avalanche effect leading to an instability if is too close to zero. 1747

COMPLEMENT HXV



Figure 5: Graphical construction similar to that of Figure 4, but for an attractive boson gas (where 0 is negative). When the attractive potential 0 is not too large, the line noted 1 in the figure yields two possible solutions, only one of which is close to the solution in the absence of interactions, and hence suitable for our approximation. As 0 increases, for a certain critical value we only get one solution (tangent line 2), then none (line 3). In this last case, no solution signals an instability of the gas, which collapses onto itself because of the attractive interactions. Starting from a nearly condensed ideal gas, the closer it is to condensation, the weaker the attractive interactions necessary to trigger the instability.

The geometrical construction that yields ∆ and from the intersection of a straight line with a curve is shown on Fig. 5. If 0 is weak enough, and for a fixed value of , we get two intersection points, corresponding to possible solutions. We only keep the first one, yielding the lowest value of ∆ . The other point yields a high value of ∆ , which changes radically and increases considerably the system density; in that case, chances are the approximate mean field treatment of the interactions is no longer valid. Beyond the value of 0 for which the straight line becomes tangent to the curve, the couple of equations (39) and (40) do not have a solution: there no longer exists any stable solution. Figure 5 also shows that as the chemical potential gets closer to zero, the effect of the attractive interactions between bosons becomes more and more important; weak interactions are enough to render the system unstable. The reason we did not find any solution to the equations is that we assumed, in the computations, that the system was perfectly homogeneous; now this homogeneity cannot be maintained beyond a certain attraction intensity. We must therefore enlarge the theoretical framework, and include the possibility for the system to become spontaneously inhomogeneous. A more precise study would show that the system may develop local instabilities, hence breaking spontaneously the translation invariance symmetry. In the limit of large systems (thermodynamic limit), condensed bosons tend to collapse onto themselves under the effect 1748



APPLICATIONS OF THE MEAN FIELD METHOD FOR NON-ZERO TEMPERATURE

of an attractive interaction, however weak it may be4 . As a general conclusion, the Hartree-Fock method applied to fermions yields results valid in a very large parameter range. As an example, it allowed computing effects of the interactions on the particle number and the pressure of the system. In addition, this method was able to predict the existence of phase transitions. This is also true for non-degenerate bosons, and the mean field method actually has a very large number of applications that we cannot detail here. We must, however, keep in mind that when Bose-Einstein condensation occurs, certain predictions pertaining to the condensate may not be realistic from a physical point of view, as they depend too closely on the mean field approximation which does not properly account for the correlations between particles.

4 If the interaction potential is attractive at large distance, but strongly repulsive at short range (hard core for example), the system spontaneously forms a high density liquid or solid.

1749

Chapter XVI

Field operator A

B

C

D

Definition of the field operator . . . . . . . . . . . . . . . . A-1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2 Commutation and anticommutation relations . . . . . . . . Symmetric operators . . . . . . . . . . . . . . . . . . . . . . B-1 General expression . . . . . . . . . . . . . . . . . . . . . . . B-2 Simple examples . . . . . . . . . . . . . . . . . . . . . . . . B-3 Field spatial correlation functions . . . . . . . . . . . . . . . B-4 Hamiltonian operator . . . . . . . . . . . . . . . . . . . . . Time evolution of the field operator (Heisenberg picture) C-1 Contribution of the kinetic energy . . . . . . . . . . . . . . C-2 Contribution of the potential energy . . . . . . . . . . . . . C-3 Contribution of the interaction energy . . . . . . . . . . . . C-4 Global evolution . . . . . . . . . . . . . . . . . . . . . . . . Relation to field quantization . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . .

1752 1752 1754 1755 1755 1756 1758 1761 1763 1763 1764 1764 1765 1765

Introduction This chapter is a continuation of the previous chapter and uses the same mathematical tools. The main difference is that, until now, we have mainly used discrete bases in the individual state space, or . In this chapter, we shall use a continuous basis, which is the basis, for spinless particles, of the position eigenvectors (see Chapter II, § E). As they now depend on the position r, the creation and annihilation operators become field operators depending on a continuous subscript r. They are the operator analog of the classical fields (which are numbers and not operators), and are often called “field operators”. They are useful for concisely describing numerous properties of identical particle systems. They have commutation relations for bosons, and anticommutation relations for fermions. This chapter is a preparation for Chapters XIX and XX, where we will introduce the quantization of the electromagnetic field. Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER XVI

FIELD OPERATOR

After defining these operators in § A-1, we discuss some of their properties. Their commutation and anticommutation relations are examined in § A-2. As in Chapter XV, we then study (§ B) symmetric operators and their expression as a function of the field operators; special attention will be given to the operators associated with the field correlation functions. In § C, we shall use the Heisenberg picture to study the time dependence of these operators. As a conclusion, we shall briefly come back (in § D) to the field quantization procedure and its link to the concept of identical particles. A.

Definition of the field operator

The field operator is defined as the annihilation operator , but associated with a basis of individual states labeled by the continuous position index r instead of a discrete index . Our starting point will be the basis change relation (A-52) of Chapter XV: =

(A-1)

where the subscripts and label the kets of the two orthonormal bases and in the individual state space. In what follows, and as already done in Chapter XV, we will simplify the notation and often replace the subscript by , and the subscript by . A-1.

Definition

We will first define the field operator for spinless particles, and later generalize it to the case with spin. A-1-a.

Spinless particles

We replace, in relation (A-1), the basis by the basis of vectors r , where r symbolizes three continuous indices (the three vector components). The operator now becomes an operator depending on the continuous index r, that we shall call “field operator” for the system of identical particles we consider. We could write it simply as r , but we shall prefer the commonly used classical notation Ψ(r). Like any annihilation operator, Ψ(r) acts in the Fock space where it lowers by one unit the particle number. In (A-1), the coefficient appearing in the sum is now the wave function (r) associated with the ket : (r) = r

(A-2)

and this relation becomes: Ψ(r) =

(r)

(A-3)

Formally, definition (A-3) looks like the expansion of a wave function on the basis functions (r); here, however, the “components” are operators and no longer simple complex numbers. In the same way as annihilates a particle in the state , the operator Ψ(r) now annihilates a particle at point r. 1752

A. DEFINITION OF THE FIELD OPERATOR

It does not depend on the basis , i.e. on the wave functions chosen to define it in (A-3), as we now show. We insert in this equality the closure relation in any arbitrarily chosen basis , and use again (A-1): Ψ(r) =

=

r

r

(A-4)

(we temporarily came back to the explicit notation for the annihilation operators). We can thus write: Ψ(r) =

(r)

(A-5)

which means Ψ(r) satisfies, in the new basis, a relation similar to (A-3). We now take the Hermitian conjugate of (A-3): Ψ (r) =

(r)

(A-6)

The operator Ψ (r) creates a particle at point r, as can be shown for example by computing the ket resulting from its action on the vacuum: Ψ (r) 0 =

(r)

0 =

(r)

(A-7)

that is: Ψ (r) 0 =

r = r

(A-8)

which represents, as announced, a particle localized at point r. One can easily invert formulas (A-3) and (A-6) by writing, for example: d3

d3

(r) Ψ(r) =

(r)

=

=

(r) (A-9)

or else, by Hermitian conjugation: d3 A-1-b.

(r) Ψ (r) =

(A-10)

Particles with spin

When the particles have a spin , the basis vector r used above must be replaced by the basis vector r , where is the spin index, which can take 2 + 1 discrete values ( = , +1 + ). To all the summations over d3 , we must now add a summation on the 2 + 1 values of the spin index . As an example, a basis vector in the individual state space must now be written: +

d3

=

(r

) r

(A-11)

=

1753

CHAPTER XVI

FIELD OPERATOR

with: (r

)= r

(A-12)

The variables r and play a similar role. The first one is, however, continuous, whereas the second is discrete. Writing them in the same parenthesis might hide this difference and one often prefers noting the discrete index as a superscript of the function , writing for example: (r) = r

(A-13)

Let us again use relation (A-1). On the left-hand side, the index now symbolizes both the position r and the spin quantum number , which leads us to define a field operator Ψ (r) having 2 + 1 spin components. Inserting (A-13) in the right-hand side of (A-1), we get: Ψ (r) =

(r)

(A-14)

The Hermitian conjugate operator Ψ (r) now creates a particle at point r with a spin : Ψ (r) 0 =

= r

r

(A-15)

As we did above, we can invert those relations. In relation (A.51) of Chapter XV (basis change), we replace by , and by r (which means that the summation over is replaced by an integral over d3 and a summation over ), and use equality (A-11); we therefore get: +

d3

=

(r) Ψ (r)

(A-16)

=

which is the analog, in the presence of spin, of relation (A-10). A-2.

Commutation and anticommutation relations

Commutation relations for field operators are analogous to those obtained in § A-5 of Chapter XV, but the discrete index is now replaced by a continuous index. A-2-a.

Spinless particles

The commutator (or anticommutator) of two field operators: [Ψ(r) Ψ(r )]

=

(r) (r ) [

]

=0

(A-17)

is indeed zero, as expected from expression (A-48) of Chapter XV. In the same way, by Hermitian conjugation: Ψ (r) Ψ (r )

1754

=

(r)

(r )

=0

(A-18)

B. SYMMETRIC OPERATORS

However, when we (anti)commute the field operator and the adjoint operator, we get: Ψ(r) Ψ (r )

=

(r)

(r )

(A-19)

which yields, taking into account the commutation relations (A-49) of Chapter XV: Ψ(r) Ψ (r )

=

(r)

(r ) =

r = r r

r

(A-20)

Finally, we get: Ψ(r) Ψ (r )

= (r

r)

(A-21)

which is the equivalent of the relations (A-49) of Chapter XV in the case of a continuous basis. A-2-b.

Particles with spin

Relation (A-17) now becomes: [Ψ (r) Ψ (r )]

=

(r)

(r ) [

]

=0

(A-22)

Relation (A-18) remains valid even if the field operators have spin indices. Finally, relation (A-19) becomes: Ψ (r) Ψ (r )

=

(r

= r

r

=

(r

)

(r

) (A-23)

that is: Ψ (r) Ψ (r )

B.

r)

(A-24)

Symmetric operators

In the previous chapter, we wrote one- or two-particle symmetric operators in terms of creation and annihilation operators in the discrete states . We are now going to express those operators in terms of the field operator (and its Hermitian conjugate). B-1.

General expression

We start with spinless particles. We can either directly transpose expressions (B-12) and (C-16) of Chapter XV to a continuous basis r (replacing the sums by integrals), or insert in those expressions the form (A-9) for the operators . In both cases, we get: =

d3

d3

r

r

Ψ (r)Ψ(r )

(B-1) 1755

CHAPTER XVI

FIELD OPERATOR

and: 1 2

=

d3

d3

d3

1 : r; 2 : r

d3

1 : r ;2 : r

Ψ (r)Ψ (r )Ψ(r )Ψ(r )

(B-2)

where, as in relation (C-16) of Chapter XV, the order of the annihilation operators is the inverse of the order that appears in the ket of the matrix element. Expression (B-1) reminds us of the average value (1) of the operator (1) for a single particle (without spin), described by the wave function d3

(1) =

d3

1

r1

1

r1

(r1 ):

(r1 ) (r1 )

(B-3)

Both expressions are not equivalent since (B-1) concerns any number of identical particles, rather than a single one; furthermore, the Ψ are now operators, and their respective order matters – as opposed to the order of the in (B-3). As for formula (B-2), it can be compared to the average value of an operator (1 2) acting on two particles, 1 and 2, both described by the same wave function . Here again, the order of the field operators is important, as opposed to the order in a product of wave functions. For particles with spins, we simply complement each integral over r with a sum over the spin index , we include this index in the matrix elements and add a spin index to the field operator. As an example, relation (B-1) is generalized to: d3

=

d3

r =

r

Ψ (r)Ψ (r )

(B-4)

=

As for relation (B-2), we get four summations over the spin indices , and the operator matrix elements are taken between bras and kets where an index is added to the variable r. B-2.

Simple examples

We start with a few examples concerning operators for one spinless particle. For a single particle, the operator associated with the local density at point r0 is: (B-5)

r0 r0 and its matrix elements are: r r0 r0 r = (r The corresponding (

)

(r0 ) =

r0 ) (r

r0 )

(B-6)

-particle operator is written as: : r0

: r0

(B-7)

=1

Replacing

by r0 r0 in (B-1) yields the operator acting in the Fock space:

(r0 ) = Ψ (r0 )Ψ(r0 ) 1756

(B-8)

B. SYMMETRIC OPERATORS

This operator annihilates a particle at point r0 , and immediately recreates it at the same point. The average value: (r0 ) = Φ

(r0 ) Φ

(B-9)

in a normalized state Φ of the -particle system yields the particle density associated with this state at point r0 . The operator , total number of particles, has been written in (B-15) of Chapter XV. As the discrete summation index is changed to a continuous index r, the summation becomes an integral over all space: =

d3 Ψ (r)Ψ(r)

(B-10)

As expected, it is the integral over d3 of the operator (r). The operator 1 (r) describing the one-particle potential energy is also diagonal in the position representation; in the Fock space, it becomes the operator 1 : 1

=

d3

1 (r)Ψ

(r)Ψ(r)

(B-11)

As for the particle current, it can be deduced from the expression of the current j(r0 ) associated with a single particle of mass ; it is the product of the operators giving the local density r0 r0 and the velocity p (product that must obviously be symmetrized): j(r0 ) =

1 2

r0 r0

p

+

p

(B-12)

r0 r0

If the particle is described by a wave function (r), a simple calculation1 shows that the average value of this operator yields the usual expression for the probability current – see equation (D-17) of Chapter III: J(r0 ) =

} 2

[

(r0 )∇ (r0 )

(r0 )∇

(r0 )]

(B-14)

The current J(r0 ) of a system of identical particles is obtained by replacing in (B-1) the operator by j(r0 ): J(r0 ) =

} 2

Ψ (r0 )∇Ψ(r0 )

Ψ(r0 )∇Ψ (r0 )

(B-15)

Another way of arriving at this equality is to use the substitution process mentioned in § B-1. To obtain the operator we are looking for in terms of Ψ(r), we start from the 1 Let us calculate the average value Ψ j(r ) Ψ . The first term on the right-hand side of (B-12) 0 yields:

1 2

Ψ r0

r0 p Ψ =

} 2

Ψ (r0 )∇ (r0 )

(B-13)

since the action of the operator p in the position representation is given by (} )∇. The second term is its complex conjugate, and we therefore get (B-14).

1757

CHAPTER XVI

FIELD OPERATOR

expression of the average value for a single particle, described by the wave function (r), which is then replaced by the field operator Ψ(r). For particles with spin, the local density at point r0 , with spin , is written in the same way: (

)

(r0

)=

: r0

: r0

(B-16)

=1

and yields, in the Fock space, the operator: (r0

) = Ψ (r0 )Ψ (r0 )

(B-17)

The total density is obtained by a summation over : (r0 ) =

Ψ (r0 )Ψ (r0 )

(B-18)

=

For particles with spin, the operator associated with the total particle number, or the operator probability current Jr0 , can be obtained in a similar way. B-3.

Field spatial correlation functions

Operators associated with spatial correlation functions can also be defined with field operators; their average values are very useful for characterizing the field properties at different points of space. When we reason in terms of fields, we generally characterize each correlation function by the number of points concerned, which is different from the number of particles involved: the two-point functions characterize the properties of a single particle, the four-point ones concern two particles, etc. The reason is simple: a one-particle density operator is characterized by non-diagonal elements r0 r0 depending on two positions, a two-particle density operator involves elements depending on four positions, etc. B-3-a.

Two-point correlation functions

Defining a non-diagonal operator depending on two parameters r0 and r0 , we can generalize relation (B-5): (B-19)

r0 r0

A calculation very similar to the one leading to equation (B-7) – we simply add a “prime” to the second r0 – gives the -particle symmetric operator: (

)

(r0 r0 ) =

: r0

: r0

(B-20)

=1

which yields, in the Fock space, the operator: (r0 r0 ) = Ψ (r0 )Ψ(r0 ) 1758

(B-21)

B. SYMMETRIC OPERATORS

We thus obtain an operator that annihilates a particle at point r0 and recreates it at a different point r0 . When the -particle system is described by a quantum state Φ , we call two-point field correlation function, 1 (r0 r0 ), the average value: 1 (r0

r0 ) = Ψ (r0 )Ψ(r0 ) = Φ Ψ (r0 )Ψ(r0 ) Φ

which also yields the matrix element, in the operator2 : Ψ (r0 )Ψ(r0 ) = r0

r

(B-22) representation, of the one-particle

(B-23)

r0

Demonstration: The matrix elements of the density operator r0

( )

r0 = Tr

( )

: r0

: r0

=

( )

of particle : r0

are:

: r0

For an -particle system, we define the one-particle density operator all the particles:

(B-24) by a sum over

( )

=

(B-25)

=1

(be careful: the trace of this operator is , not 1). Its matrix elements are the sum of the average values written in (B-24), i.e. the average value of the one-particle symmetric operator obtained by summing over the : r0 : r0 . This result is simply the operator ( ) (r0 r0 ) of (B-20), which, as seen above, yields in the Fock space expression (B-21). We then simply take the average value of each side of this expression to get equality (B-23).

The average value (B-22) at two different points plays an important role in the study of Bose-Einstein condensation. For a system at thermal equilibrium, this average value generally tends to zero rapidly as the distance between r and r increases; it only remains non-zero in a domain of microscopic size. However, for a Bose-Einstein condensed gas, this average value behaves quite differently as it tends toward a non-zero value at large distance. This difference is actually the “Penrose and Onsager condensation criterion ”; they have defined the existence of such a condensation as the appearance of a non-zero value of the matrix element of at large distance; this definition is quite general as it applies not only to the ideal gas but also to systems of interacting particles. Particles with spin: If the particles have a non-zero spin , we use, as a basis, the kets r where takes the (2 + 1) values , + 1 , .., + , and we add a index to the field operators. We then define (2 + 1)2 two-point field correlation functions as the average values: Ψ (r0 )Ψ (r0 ) =

1 (r0

; r0

)

2 Note the inversion of the variable order between the function and the matrix element of .

(B-26) 1

(or the variables of Ψ

and Ψ)

1759

CHAPTER XVI

FIELD OPERATOR

The same computation that led to (B-23) for spinless particles, can be repeated with no changes other than the simple replacement of the kets (or bras) r by r ; it shows that these average values yield the matrix elements of the one-particle density operator: r0

= Ψ (r0 )Ψ (r0 )

r0

(B-27)

(here again we have chosen to normalize to ator). B-3-b.

the trace of the one-particle density oper-

Higher order correlation functions

One can also start with the two-particle operator, which now depends on four positions: (1 : r0 r0 ; 2 : r0 r0 ) = 1 : r0 1 : r0

2 : r0

2 : r0

(B-28)

In this case, the expression of is not symmetric with respect to the exchange of particles 1 and 2, as opposed to what happens for an interaction energy. The operator is then defined without the 1 2 factor of relation (C-1) of Chapter XV: (

)

(r0 r0 r0 r0 ) =

( : r0 r0 ; =1;

: r0 r0 )

(B-29)

=

and yields in the Fock space the operator factor 1 2, then leads to:

(r0 r0 r0 r0 ). Relation (B-2), without this

(r0 r0 r0 r0 ) = Ψ (r0 )Ψ (r0 )Ψ(r0 )Ψ(r0 )

(B-30)

In this case, the operator annihilates two particles at two points and recreates them at two others. A computation very similar to the one leading to (B-22) and (B-23) enables us to show, using (B-2), that the matrix elements of the two-particle density operator can be written3 : 1 : r0 ; 2 : r0

1 : r0 ; 2 : r0 = Ψ (r0 )Ψ (r0 )Ψ(r0 )Ψ(r0 )

(B-31)

This density operator, whose trace is equal to ( 1), plays an essential role in the study of correlations between particles. A particularly important example of a higher order correlation function corresponds to the case where r0 = r0 and r0 = r0 . We then get: (

)

(r0 r0 r0 r0 ) = =1;

=

=1;

=

=

3 One

1760

: r0

: r0

: r0 ;

: r0

: r0

: r0 ;

: r0

: r0

can also use relation (C-19) of Chapter XV to get the same result.

(B-32)

B. SYMMETRIC OPERATORS

which yields in the Fock space the operator: (r0 r0 r0 r0 ) = Ψ (r0 )Ψ (r0 )Ψ(r0 )Ψ(r0 )

(B-33)

The expression on the right-hand side of (B-32) characterizes the probability of finding any particle at r0 and any other one at r0 . In the same way as the average value (B-9) gives the one-particle density, the average value: (r0 r0 r0 r0 ) = Φ

(r0 r0 r0 r0 ) Φ =

2 (r0

r0 )

(B-34)

gives the two-particle “double density”, which contains information on all the binary correlations between the particle positions. We are now in a position to again obtain expression (C-28) of Chapter XV, and more precisely justify the interpretation we gave of the average value of the interaction energy written in (C-27) of that chapter. We replace in (B-33) the field operators (or their adjoints) by their expansion (A-3) on the operators (or the ); we then get (C-28) of Chapter XV, r0 being replaced by r1 and r0 by r2 . If r0 = r0 , we can check4 that this operator is equal to the product of the simple densities defined in (B-8): (r0 r0 r0 r0 ) =

(r0 )

(r0 )

(B-35)

Obviously this relation between operators does not mean that the double density

2 (r0

r0 )

is merely the product (r0 ) (r0 ) of the simple densities: the average value of a product of operators is not, in general, equal to the product of the average values. When we studied the function 2 (§ C-5-b in Chapter XV), we did find the presence of an exchange term that introduces “statistical correlations” between particles, even in the absence of interactions. Particles with spin: For particles with non-zero spin, we just have to add an index to each of the kets or bras, as well as to the field operators; this brings up to (2 + 1)4 the number of 4-point correlation functions. The matrix elements of the two-body density operator are then given by the average values: 1:r

;2 : r

1:r

;2 : r

= Ψ (r)Ψ

(r )Ψ

(r )Ψ (r ) (B-36)

B-4.

Hamiltonian operator

We now establish the expression, in terms of the field operator, of the Hamiltonian operator for a system of identical (spinless) particles. Two formulas will be useful for this computation. The first one transposes to three dimensions the formula (34) of Appendix II: 1 k (r r ) (r r ) = d3 (B-37) 3 (2 ) 4 If the particles are bosons, we just permute the commuting operators to bring Ψ(r ) to the second 0 position and obtain the result. If we are dealing with fermions, two successive anticommutations are necessary to get that result, and the corresponding two minus signs cancel each other.

1761

CHAPTER XVI

FIELD OPERATOR

The second one is obtained from (B-37) by a double derivation with respect to r: ∆ (r

r)=

1 3

(2 )

d3

k (r r )

2

(B-38)

The matrix elements of a single particle’s kinetic energy are written as: 2

r

2

r =

d3

r k

}2 2

2

k r =

}2 ∆ (r 2

r)

(B-39)

In the Fock space, it corresponds to the following operators (an integration by parts5 was used to go from the first to the second relation): 0

=

}2 2

d3 Ψ (r)∆Ψ(r) =

}2 2

d3 ∇Ψ (r) ∇Ψ(r)

(B-40)

As in § B-1, we obtain here an expression similar to the average value of an operator (here the kinetic energy) for one particle; but the gradient of the wave function must be replaced by that of a field operator, and the order of the operators can matter. The system Hamiltonian includes in general an interaction term, which makes it a two-particle operator and requires using formula (B-2). For a two-particle system, we know that the interaction yields an operator that is diagonal in the r r representation; furthermore, it only depends on the relative position r r (and not on r and r separately). Consequently, the matrix element in (B-2) takes the form: r r

r r

= (r

r ) (r

r )

2

(r

r)

(B-41)

where 2 (r r ) is the interaction potential energy between two particles located at a relative position r r (this interaction is often isotropic, in which case 2 only depends on the relative distance r r ). Finally, starting from (B-2), we get the following expression for the Hamiltonian operator: =

d3

}2 ∇Ψ (r) ∇Ψ(r) + 2 1 + d3 d3 2 (r 2

1 (r)Ψ

(r)Ψ(r) (B-42)

r ) Ψ (r )Ψ (r )Ψ(r )Ψ(r )

The first term corresponds to the particles’ kinetic energy, the second to the external potential 1 (r) acting separately on each particle, and the third one to the mutual interaction between particles; note that this last term involves four field operators, whereas the first two involve only two. The same comment as above still applies: this expression is reminiscent of the average energy of a system of two particles, both described by the same wave function; but now we are dealing with operators that do not commute. This Hamiltonian can also be expressed directly via the one-particle simple density (r0 r0 ) and the two-particle double density (r0 r0 r0 r0 ) operators, as we now show. Inserting relations (B-21) and (B-30) in expression (B-42), we obtain: =

d3

}2 ∇r0 ∇r0 2 1 + d3 2

(r0 r0 ) 3

+ r0 =r0 =r

2 (r

r )

1 (r)

(r r) (B-43)

2 (r r r r )

5 The value of the already integrated terms must be taken at infinity, and we assume that all the states of the physical system are limited to a finite volume; as a result, those terms do not play any role and can be ignored.

1762

C. TIME EVOLUTION OF THE FIELD OPERATOR (HEISENBERG PICTURE)

where the notation ∇r0 and ∇r0 represents the gradient taken with respect to the variables r0 (for the first one), and r0 (for the second); once these gradients have been computed in the kinetic energy term of (B-43), both variables take on the same value r. The fact that the Hamiltonian operator can be directly expressed in terms of the simple and double density operators can be useful. For example, to determine the ground state energy of an -particle system, we do not have to compute the state wave function, which involves all the correlations of order 1 to between the particles; it is sufficient to know the average values of these two densities. There exist, in certain cases, approximation methods that yield directly good estimates of these simple and double densities, hence allowing an access to the -body energy. Complements EXV and GXV discuss the Hartree-Fock method, which is based on an approximation where the two-particle density operator is simply expressed as a function of the one-particle density operator, i.e. the double density as a function of the simple density (Complement GXV , § 2-b- ); this allows convenient mean field calculations. C.

Time evolution of the field operator (Heisenberg picture)

The operators we have considered until now correspond to the “Schrödinger picture”, where the time evolution of the system is determined by the time evolution of its state vector. It may, however, be more convenient to adopt the Heisenberg picture (Complement GIII ), where this time evolution is transferred to the operators associated with the system’s physical quantities. For spinless particles, let us call Ψ (r; ) the operator corresponding, in the Heisenberg picture, to Ψ(r): Ψ (r; ) = (

}

Ψ(r)

}

(C-1)

is the Hamiltonian operator), and whose time dependence follows the equation: }

Ψ (r; ) = Ψ (r; )

(C-2)

We are going to compute successively the commutator of Ψ (r; ) with each of the three terms on the right-hand side of (B-42). The evolution equation for the field operator involves all the terms of the Hamiltonian (B-42): the kinetic, potential and interaction energies. C-1.

Contribution of the kinetic energy

In order to determine the commutator of the field operator with the kinetic energy, we first transpose the equations (A-17), (A-18) and (A-21) to the Heisenberg picture. Actually, they can be used without any changes: the unitary transform of a product by (C-1) is the product of the unitary transforms, that of the commutator (or of the anticommutator) is the commutator (or the anticommutator) of the transforms, and numbers like zero or the function (r r ) are invariant. Those three relations are therefore still valid in the Heisenberg picture, if we simply add an index to the field operators. We now take their derivative with respect to the positions; only (A-21) yields a non-zero result: Ψ (r; ) ∇r Ψ (r ; )

= ∇r

(r

r)=

∇r (r

r)

(C-3) 1763

CHAPTER XVI

FIELD OPERATOR

Taking (B-40) into account, we can write the commutator to be evaluated as: Ψ (r; )

() =

0

}2 2

d3

Ψ (r; ) ∇Ψ (r ; ) ∇Ψ (r ; )

(C-4)

In the term to be integrated on the right-hand side, a sign is introduced each time we permute two field operators, or two adjoints; when we permute a field operator and an adjoint, we must add to the result the right-hand side of (C-3). Adding and subtracting two equal terms, we then obtain for the function to be integrated: Ψ (r; ) ∇Ψ (r ; ) ∇Ψ (r ; ) ∇Ψ (r ; ) (Ψ (r; )∇Ψ (r ; )) + ∇Ψ (r ; ) (Ψ (r; )∇Ψ (r ; )) ∇Ψ (r ; ) ∇Ψ (r ; )Ψ (r; )

(C-5)

that is: Ψ (r; ) ∇Ψ (r ; )

∇Ψ (r ; )

∇Ψ (r ; ) [Ψ (r; ) ∇Ψ (r ; )]

+

= ∇r The integration over d3 finally get: Ψ (r; ) C-2.

() =

0

r ) ∇Ψ (r

(r

)

(C-6)

then yields the Laplacian at r of the field operator, and we }2 ∆Ψ (r; ) 2

(C-7)

Contribution of the potential energy

Instead of (C-4), it is now the commutator: Ψ (r; )

1

() =

d3

1 (r

) Ψ (r; ) Ψ (r ; )Ψ (r ; )

(C-8)

which comes into play. The calculation is similar to the previous one, but without the gradients which were applied to the field operators depending on r . The right-hand side of (C-6) now becomes simply: (r

r )Ψ (r )

(C-9)

and the integration over d3 Ψ (r; ) C-3.

1

() =

1 (r)

is straightforward, so that: Ψ (r; )

(C-10)

Contribution of the interaction energy

It is now the commutator of Ψ (r; ) with a product of four field operators that will have to be integrated: Ψ (r; ) Ψ (r ; )Ψ (r ; )Ψ (r ; )Ψ (r ; ) 1764

(C-11)

D. RELATION TO FIELD QUANTIZATION

We will not go through the details of the calculation, a bit long but without any real difficulties; as in the previous two cases, it involves the repeated application of the commutation relations. The result is that the commutator of the field with the interaction energy can be written as: d3 C-4.

2 (r

r )Ψ (r ; )Ψ (r ; )Ψ (r; )

(C-12)

Global evolution

Regrouping the three previous terms, we get the evolution equation for the field operator: }

+

}2 ∆ 2

1 (r)

Ψ (r; ) =

d3

2 (r

r ) Ψ (r ; )Ψ (r ; )Ψ (r; ) (C-13)

The left hand-side includes the differential operator of the usual Schrödinger equation for a single particle in a potential 1 (r); however, as already pointed out, Ψ is not a simple function here, but an operator. The right-hand side includes the binary interaction effects; its presence implies that the evolution equation of the field operator is not “closed”. Its evolution depends not only on the operator itself, but also on a term containing the product of three fields (or their conjugates). Analyzing in a similar way the evolution of such a product of three factors, we see that it depends on that product, and also on the product of 5 fields (or their conjugates); in turn, the evolution of a product of 5 fields will involve 7 others, etc. We thus get a series of more and more complex equations, often called a “hierarchy”of equations. They are in general very hard to solve exactly. This is why it is frequent to use an approximation by truncating this hierarchy at a certain stage, or else by eliminating the coupling term at a certain order, or replacing it by a more convenient expression. Many different methods have been proposed to accomplish this, the most well-known being the mean field approximation (Complements EXV and GXV ). D.

Relation to field quantization

In conclusion, we make some remarks concerning the field quantization procedures, and their relation to the identical particle concept. Consider a single spinless particle in an external potential well, and call (r) the wave functions associated with its stationary states in that potential (the subscript runs from 1 to infinity; we assume, for the sake of simplicity, that the spectrum is entirely discrete). The (r) form a basis on which we can expand any particle wave function (r): (r) =

(r)

(D-1)

We already noted the similarity between this formula and equality (A-3), where the only difference is that the numbers are replaced by the operators . Along the same line, relations (B-8) and (B-15) are reminiscent of a particle’s probability density of presence, and of its probability current. Finally, expression (B-42) is very similar to the average 1765

CHAPTER XVI

FIELD OPERATOR

value of the energy of a system of two particles, both placed in the same state described by the wave function (r), once we replace that usual wave function by an operator depending on the parameter r. Consequently, the creation and annihilation operator method has an air of “second quantization”: we start with the quantum wave function for one (or two) particle(s) (“first quantization”), and in a second stage, we replace the wave functions coefficients by operators (“second quantization”). However, we must keep in mind that we do not, in reality, quantize the same physical system twice; the main difference comes from the fact that we go from a very small number of particles, one or two, to a very large number of identical particles. Field operators can also appear when quantizing a classical field, such as the electromagnetic field. This is the object of Chapters XIX and XX, where we will show how the concept of a photon emerges, as the elementary excitation of the electromagnetic field. We shall also see how the electric and magnetic fields, which were classical functions, become operators defined at each point in space, creating and annihilating photons. Generally speaking, a system consisting of an ensemble of identical bosons and a system obtained by quantizing a classical field obey exactly the same equations. The particles of the first system play the role of field quanta for the second system, and the field operators then satisfy commutation relations. The two physical systems are therefore perfectly equivalent. In the case of the electromagnetic field, the particles in questions are the photons and they have a zero mass. However, this is not necessarily the case for all fields. Moreover, quantum fields associated with a system of identical fermions also exist. These fields do not have a direct classical correspondence and their operators obey anticommutation relations. In particle physics, one simultaneously takes into account fermonic and bosonic fields, associated in general with non-zero mass particles.

1766

COMPLEMENTS OF CHAPTER XVI, READER’S GUIDE AXVI : SPATIAL CORRELATIONS IN A BOSON OR FERMION IDEAL GAS

In this complement we study the properties of the spatial correlation functions in systems of fermions or bosons. For fermions, we establish the existence of an “exchange hole” which corresponds to the impossibility for fermions with parallel spins to be found at the same point in space. For bosons, we discuss their tendency to bunch (group together). Recommended in a first reading

BXVI : SPATIO-TEMPORAL COORELATION FUNCTIONS, GREEN’S FUNCTIONS

Green’s functions are a very general tool for the theoretical study of -body systems. In this complement, they are first introduced in ordinary space, then in reciprocal space (the Fourier momentum space). Knowledge of these functions allows calculation of numerous physical properties of the system. Slightly more difficult than the previous complement

CXVI : WICK’S THEOREM

Wick’s theorem permits calculating average values of any product of creation and annihilation operators, for an ideal gas system in thermal equilibrium. The calculation involves a very useful concept, the operator “contraction”.

1767



SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

Complement AXVI Spatial correlations in an ideal gas of bosons or fermions

1

System in a Fock state . . . . . . . . . . . . . 1-a Two-point correlations . . . . . . . . . . . . . 1-b Four-point correlations . . . . . . . . . . . . . Fermions in the ground state . . . . . . . . . 2-a Two-point correlations . . . . . . . . . . . . . 2-b Correlations between two particles . . . . . . Bosons in a Fock state . . . . . . . . . . . . . 3-a Ground state . . . . . . . . . . . . . . . . . . 3-b Fragmented state . . . . . . . . . . . . . . . . 3-c Other states . . . . . . . . . . . . . . . . . . .

2

3

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

1769 1770 1771 1771 1772 1774 1775 1775 1776 1777

In this complement, we establish a certain number of properties of the correlation functions, arising solely from the particle statistics (i.e. from the fact they are either bosons or fermions), and independent of their possible interactions. To keep the calculations simple, we assume that the -particle system is described by a Fock state, characterized by the occupation numbers of each individual state . We shall see that fermions and bosons behave very differently: whereas the latter tend to bunch, the former tend to avoid each other, as indicated by the existence of an “exchange hole”. We give in § 1 the general expression of these correlation functions, without making any hypothesis concerning the nature of the individual states; the physical system is not necessarily homogeneous in space. In §§ 2 and 3, we study successively bosons and fermions, assuming the physical system to be contained in a box of volume in which the particles are free. The periodic boundary conditions (Complement CXIV ) allow taking into account the confinement while maintaining translation invariance (the system is perfectly homogeneous in space), which makes the calculations easier. For spinless particles, the individual states correspond to plane waves normalized in the volume : 1

(r) =

k r

(1)

where the k are chosen to satisfy the periodic boundary conditions. 1.

System in a Fock state

We assume the state Φ of the -particle system to be a Fock state built from the basis of individual states , with occupation numbers (not greater than 1 for fermions): Φ =

1

:

1;

2

:

2;

;

:

;

(2)

This will be the case, for example, if the particles do not interact (ideal gas) and if the system is in a stationary state, such as its ground state. 1769



COMPLEMENT AXVI

1-a.

Two-point correlations

For spinless particles, relations (A-3) and (B-21) of Chapter XVI lead to: (r r ) =

(r) (r )

(3)

Now the average value in a Fock state of the operator product is zero if = : the successive action of the two operators leads to another Fock state with the same particle number, but with two different occupation numbers – therefore to an orthogonal state. If, on the other hand, = , that product becomes the particle number operator that now acts on its eigenket, with eigenvalue . We thus get: Φ

Φ =

(where 1 (r

(4)

is the Kronecker delta); this yields: r)=

(r r ) =

(r) (r )

(5)

The physical interpretation of this result is the following: for a single particle in the individual state , the function 1 would simply be (r) (r ); for an -particle system in a Fock state, each individual state gives a contribution, multiplied by a coefficient equal to its population. Particles with non-zero spin: For particles with spin, we define the

(r) by:

(r) = r

(6)

When we add a discrete spin index (r

; r

)=

[

(r)]

to r, formula (3) becomes:

(r )

(7)

We obtain, with the same reasoning: 1 (r

;r

)=

(r

; r

) =

[

(r)]

(r )

(8)

whose physical interpretation is similar to the previous one. One often chooses a basis of individual states , such that each ket corresponds to a well defined value of the spin: each index indicates both an individual orbital state and a value of (which then becomes a function of ). In that case, for a given , the wave function (r) is only defined for a single value of the index ; conversely, for a given , the wave functions are different from zero only if the index (or ) belongs to a certain domain ( ). In expression (8), and are fixed, and the index must necessarily belong to both ( ) and ( ), or else the result is zero. This leads to: 1 (r

;r

)=

[

(r)]

(r )

( )

This correlation function is therefore zero if

1770

= .

(9)

• 1-b.

SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

Four-point correlations

We limit ourselves to the calculation of

(r r r r )

in the “diagonal case”

r = r and r = r ; we call r1 the common value of r and r , and r2 the common value of r and r . Such a diagonal correlation function was already written in (B-20) of Chapter XVI and in § C-5-a of Chapter XV; this is the only one that plays a role in the particle interaction energy, as the associated operator is also diagonal in the position representation. In the absence of spin, and for a Fock state, the calculation of the correlation function 2 (r1 r2 ) was carried out in § C-5-b of Chapter XV, where relations (C-32) to (C-34) yield the value of this function: 2 (r1

r2 ) =

(

1)

+

2

(r1 )

(r1 )

2

(r2 )

2

+

2

(r2 ) +

(r1 ) (r2 )

(r2 ) (r1 )

(10)

=

where = +1 for bosons, = 1 for fermions. The second line of this equation contains a “direct term” only involving the moduli squared of the wave functions; it also contains an “exchange term” where the phase of the wave functions come into play, and which changes sign depending on whether we are dealing with a system of bosons or fermions. Particles with non-zero spin: In the presence of spins, we must add, as previously, the corresponding spin indices , and the correlation function becomes: 2 (r1

+

1 ; r2

2)

= 1

=

(

(r1 )

2

1

1) 2

(r2 )

2

(r1 ) 2 1

+

2

(r2 ) 2 + 2

(r1 )

2

(r2 )

(r2

1

(r1 )

(11)

If we choose, as in (9), a basis of individual states in which each ket has a well defined value of the spin, the summation is simpler and we get: 2 (r1

1 ; r2

+

2)

=

1 2

(

( 1)

1

( 1)

( 2 ); =

+

1 1 2

(r1 )

1

1) (r1 ) 2

2

(r2 )

(r1 ) 2 2

(r2

2

2

2

(r2 )

(r2 ) 2 +

+

(12) 1

(r1 )

As above, ( ) corresponds to the domain of the index for the wave function (r) to exist (otherwise, it is not defined). If the spin states are different ( 1 = 2 ), the only contribution to the correlation function comes from the second line (the direct term). The exchange term which follows only concerns particles being in the same spin state; it changes sign depending on whether they are bosons or fermions. No exchange term exists for particles having orthogonal spins. This comes from the fact that to behave as strictly identical objects, two particles must occupy the same spin state; otherwise, their spin direction could, at least in principle, be used to distinguish them.

2.

Fermions in the ground state

Consider an ideal gas of fermions, contained in a volume , and having a spin equal to 1 2 (as for electrons); the index can only take on two values 1 2. If we assume 1771



COMPLEMENT AXVI

the gas to be in its ground state, this corresponds, for an ideal gas, to a Fock state: for each of the two spin states, all the individual states with an energy lower than a certain value (called the Fermi energy) have an occupation number equal to 1, all the others being empty. We shall proceed as in Complement CXIV (in particular for the study of the magnetic susceptibility) and attribute to each of the two spin states a different Fermi energy – this is useful to account for an average spin orientation (under the influence of a magnetic field for example). For all the values of the index corresponding to = +1 2, we thus assume that the occupation numbers are equal to 1 if they correspond to plane waves (1) having a wave vector smaller than the Fermi vector + , and zero otherwise. This Fermi vector is linked to the Fermi energy by the relation: +

=

+ 2

}2

(13)

2

where is the mass of each particle. In a similar way, for all the spin values = 1 2, we assume that only the states with wave vectors smaller than the Fermi vector are occupied, with a relation similar to (13) in which the index + is replaced by . The total particle number in each of the two spin states are then: =

1

(14)

k

(the summation runs over all the states having a population equal to 1). In the limit of large volumes , this expression becomes an integral: 3

=

3

3

(2 )

d

=

k

2

2

2

d =

0

(15)

2

6

Depending on whether + is larger or smaller than , the spins + or the majority, the populations being equal if + = . 2-a.

will make up

Two-point correlations

Let us compute the average value of the operator (r ; r tinguishing the two cases where and are equal or different. (i) Same spin states Taking (1) into account, relation (9) then yields: 1 (r

;r

)=

(r

; r

) =

1

k

(r

r)

)

defined in (3), dis-

(16)

k

where the notation for the spins is simplified from 1 2 to . In the limit of large volumes, the summation over k becomes an integral, and we get: 1 (r

;r

)=

1 3

(2 )

d3

k (r

r)

(17)

k

For r = r , this function simply yields the particle density , already computed. For r = r , the function to be computed is the Fourier transform of a function of k that only 1772



SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

Figure 1: Plot of the function

( ) as a function of the dimensionless variable

=

.

depends on its modulus. Using relation (59) of Appendix I in volume II, we can write: 1 (r

;r

(r

)=

r)

(18)

(0)

with1 : ( )= =

3

d

3

sin

=

0

3 3

sin

cos

3 3

cos

+

1 2

sin 0

(19)

Finally, we get: 1 (r

;r

)=

(r

r)

(20)

Figure 1 is a plot of ( ) as a function of the mutual distance between the two points. It shows that, for each spin state, the “non-diagonal” one-particle correlation function presents a maximum at r = r , then rapidly decreases to zero over a distance of the order of a few Fermi wavelengths, =2 . A system of free fermions, in its ground state, does not show any long-range “non-diagonal order”. (ii) Opposed spin states. Relation (9) shows that the two-point correlation function is zero between two states of different spins; there is no non-diagonal order.

1 An

3

arbitrary coefficient 3 has been introduced in the function to make it tend towards 1 when its variable tends towards zero, which allows dropping the factor (0).

1773

COMPLEMENT AXVI

2-b.



Correlations between two particles

We start from relation (12). In the second line, the condition = may be ignored as, for fermions, the = terms exactly cancel those on the third line; we can therefore consider the indices and as independent. Two cases must be distinguished: (i) If 1 = 2 , the three terms in (12) remain, but their behavior as a function of the volume are different. This is because, in the limit of large volumes, each of the summations over or is proportional to the volume, whereas the moduli squared of the wave functions are each proportional to 1 . For a large system, as the first of the three terms only contains a single sum over , it varies as 1 and is thus negligible. We are left with the two other terms: 2 2

(r

;r

)=

(r)

2

2

(r)

( = )

(r )

( = ) 2

2

1

=

(r

k

r)

(21)

k

The same sum as (17) appears again in this relation. This leads to, in the limit of large volumes: 2 2

(r

;r

)=

1

[

(r

2

r )]

(22)

The Pauli principle forbids particles in the same spin states to be at the same point in space; as expected, expression (22) goes to zero when r = r . As the distance between particles increases, the function (r r ) goes to zero, and the two-body correlation function tends towards the square of the one-body density , indicating that the long-range correlations disappear. This change of behavior occurs over a characteristic distance of the order of , comparable to the distance over which the non-diagonal order disappears. A plot of the spatial variations of the correlation function is given in Figure 2; it shows clearly the existence of an “exchange hole” corresponding to the mutual particle exclusion over this characteristic distance. (ii) If 1 = 2 , of the three terms of (12) only the second one (the direct term) is non-zero and yields a constant: 2

(r

;r

)=

+ 2

(23)

It is simply the product of the densities of the two kinds of spins; in the absence of interactions, particles with different spins do not show any correlation. This is because physically two particles at positions r and r and in different spin states, can in principle be identified by the direction of their spin; consequently, they no longer behave as really indistinguishable quantum particles, and no Fermi statistical effects may be observed. As we assumed the particles did not interact with each other, no spatial correlations can develop.

1774



SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

Figure 2: Plot, as a function of the dimensionless variable r r , of the correlation function 2 (r ; r ) between the positions r and r of two particles in the same spin state, in a free fermion gas. As the Pauli principle forbids two particles to be at the same point in space, this function goes to zero at the origin, which creates an “exchange hole”. As the distance increases, the function approaches 1 over a distance of the order of the inverse of the Fermi vector associated with this spin state.

Comment: To keep the computation simple, we considered a system of non-interacting fermions, contained in a cubic box and in its ground state. The properties we discussed are, however, more general. In particular, it can be shown that a fermion system always exhibits an exchange hole for particles with identical spins, whether they interact or not; for a system at thermal equilibrium, the hole width gets smaller as the temperature increases, and goes from the Fermi wavelength at low temperature (degenerate system) to the thermal wavelength at high temperature (non-degenerate system).

3.

Bosons in a Fock state

The situation is radically different for bosons, as there is no upper boundary for the occupation numbers . 3-a.

Ground state

For non-interacting spinless bosons in their ground state, the occupation number of the individual state 0 having the lowest energy is equal to the total particle number , all the other occupation numbers being equal to zero. Relation (5) then yields: 0

1 (r; r

)=

(r r )

=

0 (r) 0 (r

)

(24) 1775



COMPLEMENT AXVI

As the wave function 0 (r) extends over the entire volume , the modulus of this wave function does not decrease as the distance between r and r increases and becomes comparable to the size of the system, as opposed to what occurs for fermions. This asymptotic behavior of 1 (r; r ) has been used by Penrose and Onsager to define a general criterion for Bose-Einstein condensation, valid also for interacting bosons . As for the two-particle correlation function, formula (10) yields: 2 (r

r )=

(r r r

r )

=

(

1)

0 (r)

2

0 (r

)

2

(25)

k0 r If the ground state wave function is of the form , this function is simply equal 2 to the constant ( 1) , independent of r and r . A system of bosons that are all in the same quantum state does not show any spatial correlations.

3-b.

Fragmented state

We now assume the -boson system to be in a “fragmented” state: instead of all the particles being in the same individual state, 1 particles are in the state 1 and 2 in the state 2 , with = 1 + 2 . Relation (5) then yields: 1 (r; r

)=

1 (r) 1 (r

1

)+

2 (r) 2 (r

2

)

(26)

When r = r , expressions (24) and (26) contain the moduli squared of the wave functions (r ), which are all equal to 1 (for a system contained in a box of volume with periodic boundary conditions); both expressions (24) and (26) are therefore equal. On the other hand, when r and r are different, the phases of the two terms in (26) do not coincide any longer, and (destructive) interference effects can lower the modulus of 1 (r; r ). Consequently, the fragmentation of a physical system into two states decreases the modulus of the non-diagonal terms of 1 (r; r ). Obviously, the more fragmented states there are, the more noticeable the decrease. Relation (10) now becomes: 2 (r

r )=

1

+

(

1

1 2

1) 2

1 (r) 1 (r)

2

2

1 (r

2 (r

2

2

) +

) +

2

(

1)

2

1 (r) 1 (r

)

2 (r

2 (r)

)

2

2 (r)

2 (r

)

+ c.c.

2

(27)

where the factor 2 in the second line comes from the fact that either = 1 and = 2, or the opposite; the last two terms of this expression correspond to the exchange term, and the notation c.c. indicates the complex conjugate of the previous term. Replacing k1 2 r the wave functions by , and assuming 1 and 2 to be very large compared 2 to 1, we obtain the square of the sum ( 1 + 2 ) = 2 , and we can write: 2 2 (r

r )

2

+2

1 2 2

cos [(k2

k1 ) (r

r )]

(28)

The first term is simply the square of the one-particle density ; it does not have any spatial dependence and is what we expect in the absence of any particle correlation. On the other hand, the exchange term is position dependent; it presents a maximum when r = r , and oscillates at the spatial frequency k2 k1 . This exchange term enhances the probability of finding two bosons close to one another (bunching effect 1776



SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

coming from the Bose-Einstein statistics); the probability of finding them at a greater distance is then lower, and then increases again, etc. For a short interaction range, only the first maximum plays a role, and increases the average value of the interaction. The consequences of that effect, in terms of the internal interaction energy, has been discussed in § 4-c of Complement CXV . 3-c.

Other states

We now consider situations described by Fock states where 0 is still very large, but where other states are also occupied, with populations much smaller than 0 . (i) One could for instance place a finite fraction 0 of particles in the ground state, and distribute the remaining fraction 1 among a large number of states, 0 whose individual populations remain small and vary regularly with the index . This leads to: 1 (r; r

where

)=

(r

(r

0

0 (r) 0 (r

)+

(r

r)

(29)

r ) is given by:

r)=

1

k

(r

=0

r)

=

1 (2 )

3

d3

(k )

k

(r

r)

(30)

This function is the Fourier transform of the distribution = (k ) for k = 0. As all the are positive or zero, the function (r r ) presents a maximum when r = r , since this is where all the exponentials k (r r) are in phase; it then decreases when the difference r r increases, as all the phases spread out. If the distribution is a regular function of width ∆ (a Gaussian for example), the function (r r ) tends towards zero over a distance ∆ 1 ∆ , in general much smaller than the size of the system. Figure 3 is a plot of the function 1 (r; r ) when the particles are contained in a box, so that we can use (1). As we assumed 0 to be the ground state, the corresponding wave function is the inverse of the square root of the volume, and (29) becomes: 1 (r; r

where 0

=

0

)=

0

+

(r

r)

(31)

is the density of atoms in the ground state: 0

(32)

After the decrease linked to that of (r r ), and occurring over the interval ∆ 1 ∆ , the function does not tend towards zero (as it would for fermions), but towards a constant proportional to the population 0 . As already mentioned, this particular behavior is the base for the Penrose and Onsager criterion that defines, in a general way, the appearance of Bose-Einstein condensation.

1777



COMPLEMENT AXVI

Figure 3: Plot of the function 1 (r; r ) = 0 + (r r ) for bosons, as a function of the distance r r . The function starts by decreasing over an interval of the order of the inverse of ∆ ; at a larger distance, it tends towards a constant 0 proportional to the ground state population. The fact that it does not go to zero indicates the presence of a long-range non-diagonal order, and the existence of a highly populated individual level. As for the two-body correlation function, relation (10) shows the existence of three kinds of terms for a system having only one single highly populated state 0 : – the terms corresponding to two particles in the highly populated state (condensate), which come from the first line2 of (10), and yield again (25), replacing by 0. – the crossed terms in 0 , which yield: 2

0

(r)

2

0 (r

2

) +

0 (r) 0 (r

)

(r ) (r) +

(r) (r )

0 (r

)

0 (r)

=0

(33) Inserting the value (1) for the wave functions (assuming k0 = 0), we get a contribution to 2 (r r ) equal to: 0 2

2+

k

(r

r

)+

k

(r

r)

0,

we can write this result in the form:

)

(34)

=0

Using relation 2

0

=0 0

=

+ Re [ (r

r )]

(35)

2 In the limit of large volumes, we have assumed that 0 is the only population proportional to the volume. In the first line of (10), the term = 0 then contains the product of 20 and the two wave functions squared, each proportional to the inverse of the volume; this term is therefore independent of the volume. On the other hand, the terms = 0 of the second line contains one summation over , introducing a factor proportional to the volume (in the limit of large volumes), but also two wave functions squared, each inversely proportional to the volume. The net result is a contribution inversely proportional to the volume, hence negligible compared to the previous one.

1778



SPATIAL CORRELATIONS IN AN IDEAL GAS OF BOSONS OR FERMIONS

where Re means “real part” and is the function defined in (30). – the terms corresponding to two particles in states other than the ground state, and which yield: 2

1+

(k

k ) (r

r)

(36)

=

In (34), as well as in (36), we notice that the contributions from all the states = 0 have various phases in general; they are, however, all in phase when r = r and the correlation function then presents a maximum. The bosons have thus a tendency for bunching, and this effect is felt over a distance ∆ 1 ∆ , as if they were attracted to one another. It is, however, a purely statistical effect linked to the bosonic character of the particles, since we assumed there were no interactions between the particles. (ii) One could also imagine the population distribution to be regular, without favoring any individual state, in which case the 0 contribution vanishes; we are then left with the contribution from the terms of (36), which is maximum when r = r for the same reasons as above. The general behavior of the two-body correlation function is shown in Figure 3: it presents a maximum at the origin, and then tends towards zero at large distance (in this case, 0 = 0). Once again, identical bosons exhibit a bunching tendency. Comment: Suppose that, instead of assuming the bosonic system to be in a Fock state (a pure state), it is at thermal equilibrium, described by the thermodynamic equilibrium. This would lead to results similar to those we just derived, but with ∆ , the thermal wavelength of the particles [24]. The boson bunching tendency is a quite general property.

1779



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

Complement BXVI Spatio-temporal correlation functions, Green’s functions

1

2

3

Green’s functions in ordinary space . . . . . . . . . . . . . . 1781 1-a

Spatio-temporal correlation functions

. . . . . . . . . . . . . 1782

1-b

Two- and four-point Green’s functions . . . . . . . . . . . . . 1786

1-c

An example, the ideal gas . . . . . . . . . . . . . . . . . . . . 1787

Fourier transforms . . . . . . . . . . . . . . . . . . . . . . . . . 1790 2-a

General definition . . . . . . . . . . . . . . . . . . . . . . . . 1790

2-b

Ideal gas example

. . . . . . . . . . . . . . . . . . . . . . . . 1791

2-c

General expression in the presence of interactions . . . . . . . 1792

2-d

Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1794

Spectral function, sum rule . . . . . . . . . . . . . . . . . . . 1795 3-a

Expression of the one-particle correlation functions . . . . . . 1795

3-b

Sum rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1796

3-c

Expression of various physical quantities . . . . . . . . . . . . 1797

This complement discusses the properties of the spatio-temporal correlation functions of an ensemble of identical particles, generalizing the spatial correlation functions defined in § C-5-b of Chapter XV and § B-3 of Chapter XVI; the corresponding Green’s functions shall also be introduced. We first study (§ 1) the normal and anti-normal spatio-temporal correlation functions, then the Green’s function, and discuss some of their properties, illustrated with the example of an ideal gas. We then study in § 2 the Fourier transforms of these functions for physical systems that are translation invariant both in space and time; we shall write their general expression in the presence of interactions. In § 3, we finally introduce the “spectral function”, which leads, for interacting particles, to very simple expressions for various physical quantities, in a form similar to the one used for an ideal gas. 1.

Green’s functions in ordinary space

In the previous complement, we studied the spatial dependence of the correlation functions, taken at a given time . We now take into account the temporal dependence, using the Heisenberg picture (Complement GIII ) where the operators are time-dependent. To keep the notation simple, we assume, from now on, that either the spin is zero for bosons ( = 0), or else, in the general case (fermions and bosons), that all the particles are in the same spin state. As mentioned before, the generalization to the case where is nonzero would only require adding an index to all the field operators. In the Heisenberg picture, the field operator Ψ(r) becomes a time-dependent operator Ψ (r t): Ψ (r ) =

}

Ψ(r)

}

(1) 1781

COMPLEMENT BXVI



} where is the evolution operator, expressed as a function of the system Hamiltonian (including the particle interactions when present), that we assume to be timeindependent. Consider a system of identical particles, fermions or bosons, described by a density operator . The spatio-temporal correlation functions and the Green’s functions are defined as the average values, computed with , of the products of a number of field operators Ψ (r ) and their Hermitian conjugates Ψ (r ) taken at different space-time points (r ), (r ),... etc.

1-a.

Spatio-temporal correlation functions

The density operator may contain very complex correlations between particles, and its time evolution can be very complicated in the presence of interactions. We are going to define a certain number of functions that characterize its most simple and useful properties, as they only pertain to a small number of particles. .

Two-point normal and anti-normal functions

The one-particle spatio-temporal “normal” functions are defined by1 : 1 1

(r ; r

) = Tr

Ψ (r )Ψ (r

1

and “anti-normal”

) = Ψ (r )Ψ (r

1

correlation

) (2)

(r ; r

) = Tr

Ψ (r

)Ψ (r ) = Ψ (r

)Ψ (r )

The normal ordering is obtained when the creation operator is on the left and the annihilation on the right; it is the opposite for the anti-normal order. Note that it is only the order of the two operators that changes between 1 and 1 ; the position and time variables attributed to each of the two operators Ψ and Ψ remain the same. In the particular case where = , the normal correlation function simply yields the matrix elements of the one-particle density operator – see formulas (B-22) and (B23) of Chapter XVI. The normal correlation function is therefore a generalization, at different times, of this matrix element, which will prove to be useful. To understand the physical meaning of these two definitions in an intuitive way, we start with the anti-normal function and consider the simple case where an -particle system is in a pure state Φ0 (its ground state, for example). We then get: 1

(r ; r

) = Φ0 Ψ (r = Φ0

}

)Ψ (r ) Φ0 Ψ(r ) ( ) } Ψ (r)

}

Φ0

(3)

The right-hand side reads as follows (from right to left): starting from the system initial state Φ0 , we let it evolve according to its own Hamiltonian until the time , when we create a particle at point r; we then let the ( + 1)-particle system freely evolve until time , and we finally annihilate a particle at point r . Consequently, the function 1 } is the scalar product of the ket thus obtained with the state Φ0 , result of the 1 To be consistent with the notation of Chapter XVI (§ B), we choose a notation where, for the trace, the first group of variables (r ) of function 1 is associated with the operator Ψ , the second group (r ) to the operator Ψ. Note however that the opposite convention can also be found in the literature.

1782



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

free evolution of the initial ket over the same time interval (without the creation or annihilation of a particle). In other words, the perturbation created by the creation of a particle, followed by its later destruction, changes the state of the physical system; the value of 1 is given by the probability amplitude of finding the system in the same state as the one it would have reached in the absence of this perturbation. The previous interpretation is natural when ; if this is not the case, the mathematical definition of 1 is the same, but the intermediate evolution stage goes backward in time. In this process, and as expected, the dynamics of the system remains unchanged, including the particle interactions. Furthermore, the additional particle is not simply a particle juxtaposed to the pre-existing system, it is indistinguishable from the others and hence undergoes indistinguishability effects (more details on this point will be given in § 1-a- ). For the normal function, Ψ (r ) acts before Ψ (r ), which means this function has a natural interpretation if : the system evolves freely until , at which time a particle is annihilated (which amounts to the creation of a “hole”, see Complement AXV AXV ); the system, with 1 particles, then evolves freely until time , when a particle is created (which annihilates the hole). The normal function is therefore the analog of the anti-normal function, provided we replace the additional particle by a hole, and we invert the times. .

Physical discussion

The definition of the correlation functions contains particle creation and annihilation operators, but this does not imply that such physical processes really occur in our system. Going from an -particle system to another one with 1 particles can be mathematically useful but only plays an intermediate role since we finally return, via a second operator, to the same number of particles. Furthermore, the action of an operator Ψ(r) is not merely a local destruction of a particle, neither is the action of Ψ (r) the simple creation of a particle at point r, juxtaposed with the already present particles: the quantum indistinguishability that concerns all the particles (including the new one) plays an essential role. A few simple examples (i) For the normal correlation function, the perturbation starts with the annihilation of a particle (creation of a hole), followed by a later creation of a particle (destruction of a hole). At time = 0, the annihilation operator Ψ (r ) can be expanded according to formula (A-3) of Chapter XVI: Ψ(r ) =

(r )

(4)

where the effect of each operator depends on the population of the state as it introduces the factor . Consider a very simple case: a gas of bosons, all in the same quantum state 0 (ideal gas of totally condensed bosons). Acting on a state Φ0 where only the individual state 0 is occupied, all the terms of the sum (4) yield zero, except for the = 0 term. The effect of the operator Ψ(r ) on Φ0 is to actually destroy a particle in the state 0 ; as r varies, the result is still the same, simply multiplied by the coefficient 0 (r ). If, for example, the ideal gas is placed in a trap where the individual ground state is 0 , the operator Ψ(r ) yields a ket with an appreciable norm only if

1783

COMPLEMENT BXVI



r falls in a domain where 0 (r ) is not negligible; if it falls outside this domain, the resulting ket is practically zero. In a general way, for fermions as for bosons, Ψ(r ) can obviously destroy particles only while acting on an already occupied individual state. For bosons, in addition, the factor means that the operator Ψ(r ) gives more weight to the highly populated individual states rather than those with low occupation numbers. Consequently, the creation of a hole is not a local process at point r . (ii) For the anti-normal correlation function, we start with the creation of a particle by the operator Ψ (r). In the case of bosons, and because of the factor + 1 introduced by , Ψ (r) tends to preferentially create particles in states having a high population . Let us go back to the previous example of a large number of bosons in a trap, all in the same individual ground state 0 . When 0 (r) is not negligible, the supplementary boson is created in the same ground state. If, on the other hand, r is far away from the trap center and falls in a domain where 0 (r) is practically zero, the boson is actually created at point r but without perturbing very much the bosons already present. For fermions, on the contrary, it is impossible to create a particle in an already occupied state; in a Fock state, an additional fermion can only be created in a state orthogonal to all the initially occupied states. Let us assume the ideal gas of fermions is in its ground state, and contained in a harmonic trap; the energy levels, up to the Fermi level, are all occupied. The effect of the creation operator Ψ (r) at a point close to the trap center is to create an additional particle in a state that can be expanded on all the individual stationary states of the trap; as this state must be orthogonal to all the already occupied states, it only has components on the non-occupied states, which have an energy higher than the Fermi level. Now the corresponding wave functions take on small values at the center of the trap, and are maximum in the classical turning point regions2 , which means, in this case, at the edge of the existing fermion cloud (or even further). If position r is close to the center of the trap, the additional fermion will be added on the periphery or outside the cloud of fermions. On the other hand, if r falls outside, in a region of space where all the wave functions of the occupied states are practically zero, the additional particle will be created practically right at point r. Examples of these various situations will be given in § 1-c.

.

Properties

The complex conjugate of 1 is obtained by changing the field operators’ order in (2) and replacing them by their Hermitian conjugates: 1

(r ; r

The complex two points (r and . A transforms of

)

= Ψ (r

)Ψ (r ) =

1

(r

; r )

(5)

conjugation of the function 1 is therefore equivalent to exchanging the ) and (r ), which amounts to a parity operation on the variables r r similar property is easily demonstrated for 1 . As a result, the Fourier these functions with respect to the variables r r and must be real.

2 The modulus squared of a stationary state wave function yields, at each point, the probability density of presence for this state. This probability is maximal in the regions of space where, classically, the particle spends the most time, i.e. where its velocity is small, as is the case for the classical turning point regions. Figure 6 of Chapter V gives an example of such a situation.

1784



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

In addition, when the system is in a state that is translation invariant in space and time3 , the correlation functions only depend on the differences r r and . We note that the linear combination 1 (r ; r ) (r ; r ) involves the 1 average value of the operator: Ψ (r

)

Ψ (r )

(6)

where = +1 for bosons, = 1 for fermions. If = , and taking into account relation (A-19) of Chapter XVI, this equality becomes: 1

.

(r ; r

)

1

(r ; r

) = Tr

(r

r ) = (r

r)

(7)

Temporal evolution, four-point functions

We now use, in definition (2) of 1 , the Hermitian conjugate of the evolution equation for Ψ (r ), written in (C-13) of Chapter XVI; this leads to: }

+

}2 ∆r 2

1 (r)

1

d3

2 (r

r )

=

=

d3

(r ; r

)

Ψ (r; )Ψ (r ; )Ψ (r ; )Ψ (r ; ) 2 (r

r )

2

(r ; r

;r

;r

)

(8)

which involves the normal two-particle (or four-point) correlation function, whose general expression is: 2

(r ; r

;r

;r

) = Tr

Ψ (r )Ψ (r

)Ψ (r

)Ψ (r

)

(9)

Taking into account a time dependence generalizes formula (B-30) of Chapter XVI (or more precisely its average value). We obtain, in a similar way, the equation giving the variation of 1 with respect to the variables r and , or that giving the evolution of the anti-normal function 1 . The four groups of space-time variables the function 2 depends on, are, in general, independent. Most of the time, however, we only need the “diagonal part” 2 (r ; r ) of the correlation function, obtained for r = r, = and r = r , = ; this diagonal part is analogous to the two-particle correlation function (two positions, two times) in classical statistical mechanics. When the system is translation-invariant both in space and time, this function only depends on the differences r r and . Whereas, for the function 1 , a hole is created and then destroyed, for the function 2 there are now two holes first created and then destroyed; the natural order of the increasing times is given by the time variables of the operators in (9), taken from right to left. One could define, in a similar way, a function 2 where two particles would be created at the beginning and then destroyed at later times (compared to the normal correlation function, the role of particles and holes are interchanged). Consequently, when the particles interact, the evolution equation of 1 involves another higher order 3 This is the case if the system Hamiltonian is translation invariant, and if the system is in an eigenstate of , or described by a density operator that is a function of .

1785

COMPLEMENT BXVI



correlation function, 2 . In turn, the evolution of 2 involves correlation functions of an even higher order, 3 etc. This means that, because of the interactions, the set of equations is not “closed”, but includes a complete “hierarchy” of a large number of equations, involving correlations of higher and higher order. 1-b.

Two- and four-point Green’s functions

Equation (8) is a linear partial differential equation, with a right-hand side sometimes called a “source term”. In our case, this right-hand side does not contain any singular function. But when this right-hand side is modified to include a delta function, the new solutions of the equation are called the “Green’s functions” . We now show how to introduce Green’s functions in the problem we are concerned with. .

Two-point Green’s function

The two-point Green’s function 1 is obtained by a combination of the two correlation functions of § 1-a. We saw in § 1-a that, when , the anti-normal correlation function is the most natural as it includes the propagation of a particle from to . On the other hand, for , the most natural one is the normal function that involves the propagation of a hole from to . We can combine those two possibilities into one by setting: 1 (r

;r

)=

(

)

1

(r ; r

)+

(

)

1

(r ; r

)

(10)

where ( ) is the Heaviside function of the variable (equal to 1 if 0, zero otherwise), and where equals +1 for bosons, 1 for fermions; we shall see later on, for example in § 1-c- , that the introduction of the factor simplifies the following computations. When we take the derivative of the two-point Green’s function 1 with respect to time, the discontinuities introduced by the Heaviside functions yield delta functions; the precise calculation will be done in § 1-c- for an ideal gas, allowing us to verify that 1 is indeed a Green’s function. Using this type of function is quite useful in a number of problems, for example those involving Fourier transforms, or in perturbation calculations. .

Four-point Green’s function By analogy with (9) we define a two-particle (or four-point) Green’s function: 2 (r

;r

;r

;r

) = Tr

Ψ (r

)Ψ (r

)Ψ (r

)Ψ (r ) (11)

where is the time ordering operator, which orders the 4 times by decreasing values from left to right (by definition, this operator also includes a factor , where is the parity of the permutation necessary for this time ordering; this may result in a change of sign for fermions). 1786

• 1-c.

SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

An example, the ideal gas

When the system considered is an ideal gas, it is possible to get explicit values of the previous functions. As before, we assume the gas to be contained in a box of volume , with periodic boundary conditions. Using relation (A-3) of Chapter XVI in the case where the (r) are plane waves, we can write the field operator Ψ(r) as: k r

Ψ(r) =

(12)

k

where the sum covers all the wave vectors k satisfying the periodic boundary conditions. This expansion is convenient as, for an ideal gas, the time dependence of the operators in the Heisenberg picture k is particularly simple, as we now show. The Hamiltonian can be written as: =

}

k

(13)

k

k

with: k

=

} (k ) 2

2

(14)

As the operator

k

k

commutes with any annihilation operator

different momentum4 , and since [

]=}

k

k

k

k

k

=

k

k

pertaining to a

, we get: (15)

k

This corresponds, in the Heisenberg picture, to an evolution: k

()=

k

(16)

k

The time evolution of Ψ (r ) is then written as: (k

r

k

)

Ψ (r ) =

.

(17)

k

Normal correlation function Inserting this result in (2), we get: (k 1

(r ; r

r

k

) (k

)=

r

k

) k

k

(18)

We now show that, because of the translation invariance, the average value k k must be zero whenever k = k . Assume, for example, that the system density operator 4 For fermions, two minus signs are introduced because of the anticommutations, but they cancel each other. If the momentum is the same, we have k k k = + k = 0 + k and k k k k

k

k

= 0, so that

k

k

k

=

k

.

1787

COMPLEMENT BXVI



is the canonical thermal equilibrium operator = . This operator is diagonal in the basis of the Fock states, and the trace of the product k k will be zero unless both annihilation and creation operators act on the same individual state: the non-zero condition is therefore k = k . In the same way, if we are now using the grand canonical ( ) equilibrium = , the operator is still diagonal in the same basis, and the same rule applies. In a general way, one can see5 that the translation invariance of requires: k

where

= Tr

k

k

k

=

(19)

is the average value of the population operator

=

: (20)

We can then write: 1

(r ; r

)=

1

[ k (r

r)

k

(

)]

(21)

The normal correlation function is simply the sum of all the contributions from the individual states occupied by free particles, labeled by the index k . Each contributes proportionally to its average population , and has a spatio-temporal dependence given by the progressive wave [k (r r) k ( )] it is associated with. As expected for a translation invariant (both in space and time) system, this normal function only depends on the differences r r and . Taking (14) into account, it obeys the partial differential equation: +

} ∆r 2

1

(r ; r

)=0

(22)

which corresponds to the free propagation of particles in an ideal gas (a similar equation exists for the variables and r, with a change of sign for the time derivative term). Expression (21) allows shedding new light on the physical interpretation of the normal correlation function given in § 1-a, in terms of the creation, in the -particle system, of a “hole”, which then propagates until it is annihilated at a later time. In an ideal gas, each term of the sum in (21) is a free wave plane: in the absence of interactions, the particles can freely propagate along straight lines. The appearing in the formula shows that the hole can only propagate along already populated individual states in the -particle state: a hole can only be created in an already occupied quantum state, as pointed out already in § 1-a- . As a result, the created hole is not a point-like object actually localized at point r : it is only built from superpositions of free occupied states, whereas for a truly point-like excitation, one would have to combine values of k extending to infinity.

5 To prove this in a general way, we simply have to note that the operator is translation k k invariant only if k = k . To show this, one can use the expression of the translation operator as an exponential of the operator associated with the total momentum (Complement EII , § 3).

1788

• .

SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

Anti-normal correlation function

For the anti-normal function 1 (r ; r ), the calculation is practically the same, the only difference being the inversion of the order of the operators k and k , which leads to: 1 )= [1 + ] [k (r r) k ( )] (23) 1 (r ; r It obeys the same partial differential equation as the normal function. It follows from (21) and (23) that the linear combination: (r ; r

1

)

1

(r ; r

1

)=

[k (r

r)

k

(

)]

(24)

is independent of the state of the system; for = , it is simply equal to the function (r r ). We shall see later the relation between this expression and the spectral function. Let us come back, here again, to the interpretation of the effect of the operator Ψ (r), discussed in § 1-a- . For a fermion system ( = 1), relation (23) clearly shows that the particle creation does not involve any of the already occupied individual states k , with = 1, since the corresponding term is zero. The object created by the excitation cannot have any component on the already occupied individual states, which simply means that its wave function must remain orthogonal to those of all the fermions already present. For a boson system, the effect is just the opposite: if, for instance, an individual state of bosons is highly populated compared to all the others, the term in (23) corresponding to this large will be dominant: the additional particle will be mainly in the same individual state as all the other particles already present. .

Two-point Green’s function Using definition (10), we obtain the Green’s function 1 (r

=

;r 1

1 (r

;r

):

) [ k (r

r)

k

(

)]

(

) [1 +

]+

(

)

(25)

If we take the derivative with respect to the time , the two Heaviside step functions yield delta functions with opposite signs. We then get the partial differential equation: +

} ∆r 2

1 (r

;r

)=

+

} ∆r 2

1 (r

;r

)=

(

)

(

)

k

(r

r)

[1 +

]

(26)

that is: (r

r)

(27)

As expected, the right-hand side is a product of delta functions of the set of variables characterizing a Green’s function. In the presence of interactions, this partial differential equation is no longer valid; we must add to its right-hand side the interaction contributions, which involve Green’s functions of a higher order. 1789

COMPLEMENT BXVI

.



Four-point functions

The computation of the four-point correlation functions is very similar to the one we just explained; we simply use again relation (17) to get their explicit expressions for an ideal gas contained in a box. The results are the same as those obtained in Complement AXVI , as for example in relation (21), except for the fact that we must now multiply each k spatial plane wave k r by the associated time evolution factor . 2.

Fourier transforms

From now on, we shall only study systems that are translation-invariant in space and time. Consequently, the correlation functions only depend on the difference of the variables r r and , so that we can choose to cancel both r and . 2-a.

General definition

Let us introduce the two (double) Fourier transforms with respect to time and space: 1

(k

d3

)=

(

d

kr

)

(0 0; r

1

)

(28)

(we have set r = 0 and = 0 in the function 1 ). These functions are real, as shown by the parity relation (5). The Fourier transform 1 of the Green’s function 1 introduced in (10) is called the “one-particle propagator”; it is defined by: 1 (k

)=

d3

(

d

kr

)

1 (0

0; r

)

(29)

For a system contained in a volume with periodic boundary conditions (Complement FXI ), the integrals over d3 in the above formulas must be taken over the volume . They yield the coefficients of a Fourier series where the vectors k take the discrete values k corresponding to the boundary conditions. This series characterizes the spatial dependence. As for the time dependence, the Fourier transform is a continuous function6 . The inverse transformation relations are: 1

(0 0; r

)=

1

d 2

k

(k

r

)

(k

1

)

(30)

where, in the limit of very large volumes, the discrete summation becomes an integral 3 with the coefficient (2 ) : 1

(0 0; r

)=

d3 3

(2 )

d 2

(k r

)

1

(k

)

(31)

6 As opposed to the space variables, the time variable is not confined to a finite variation domain, and time Fourier transforms may be singular; such an example, concerning an ideal gas, will be given in § 2-b, where we shall introduce, as a convergence factor, a decreasing exponential (where tends toward zero through positive values).

1790



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

Comment: One can also express the functions 1 ( ), not as Fourier transforms as in (28), but directly from the average values of products of creation k and annihilation k operators in the individual states k . Formulas (A-3) and (A-6) of Chapter XVI tell us that: Ψ(r) =

1

k

r

and

k

Ψ (r) =

1

k

r k

(32)

Inserting these relations in the definitions (2), then in (28), we get: 1

(k

1

)=

d3

(

d

k r

)

k

r k

k

( )

(33)

where k ( ) is the operator k in the Heisenberg picture. The integral over the volume selects a single term, = , and cancels the volume . Furthermore, the translation invariance means that neither the density operator , nor the system Hamiltonian, have matrix elements between state vectors with different total momentum. Now the operator increases that total momentum by } and k ( ) decreases it by } . The average k value

k

k

( )

summation over 1

(k

is therefore zero unless

is equal to , and that eliminates the

. Finally:

)=

d

k

k

( )

(34)

In a similar way, we can also show that: 1

2-b.

(k

)=

d

k

( )

(35)

k

Ideal gas example

For an ideal gas contained in a box (with periodic boundary conditions), the k are discrete. Replacing k by k in (28) and using expression (21), after replacing the dummy index k by k , we get a product of exponentials to be integrated. The integral over d3 , combined with the factor 1 , yields a Kronecker delta k k that eliminates the summation over k ; the integral over d yields 2 ( k ), and we get: 1

(k

)=2

(

k

)

(36)

In a similar way: 1

(k

)=2

[1 +

] (

k

)

(37)

and we can finally write: 1

(k

)

1

(k

)=2

(

k

)

(38)

These expressions are particularly simple, but no longer valid when the particles interact. They are, however, useful as a reference point to understand the interaction effects. 1791



COMPLEMENT BXVI

The one-particle propagator 1 (k ) defined in (29) can be obtained in a similar way; the integral over d3 is unchanged, but the integral over time now yields: (

d

k

)

( ) [1 +

]+

(

)

(39)

which does not converge. For the term in ( ), a classical method is to introduce a convergence factor by changing into + , with 0 through positive values; for the ( ) term, the change is to . We get: 1 (k

1+

)= =

[ [

k

k

+

+ ]

]

+

[

]

k

2

+

2

[

k

] +

(40)

2

In the limit where 0, the two fractions on the right-hand side yield principal parts and delta functions – see relation (12) of Appendix II. The first term on the right-hand side yields both a principal part [1 ( k )] and a delta function ( k ), whereas the second one (in ) yields only a delta function ( ). If the system is diluted, k it is in the classical regime (i.e. non-degenerate), where all the occupation numbers are small compared to 1 and where the exchange effects are weak; the second term, associated with the indistinguishability between the particles, is then negligible. 2-c.

General expression in the presence of interactions

A translation invariant system has a Hamiltonian that commutes with its total momentum. We can then build a basis with state vectors that are, for each particle number and for each value of the total momentum }K, eigenvectors of with energies ; we shall note K these energies as it is often useful to explicitly keep track of the values of and K that define the subspace corresponding to that eigenvalue in the spectrum of . We call Φ the corresponding eigenvectors, where the index accounts for possible degeneracies of these eigenvalues. We assume the system to be in a stationary state, and translation invariant. This means that its density operator cannot have non-diagonal matrix elements between eigenvectors corresponding to different momenta or energies; in each of the subspaces common to both of these quantities, we can choose the basis Φ that diagonalizes the density operator and set: K

= Φ

Φ

K

(41)

K

We now use this basis to compute the trace appearing in (2). We first insert expression (12) for the field operators (and their conjugates) as a function of the operators k ; taking into account the exponentials introduced by the operators in the Heisenberg picture, we get: 1

(0 0; r Φ

K

)= k

1

Φ

k r K K

Φ

K

K

K k

Φ

K

e

K

K

}

(42)

Several simplifications can be made on the right-hand side of this equation. First of all, as the operator k destroys a particle, we necessarily have = 1, or else the 1792



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

matrix element of k would be zero; the sum over disappears. Along the same line, as this operator decreases the total momentum by ~k , we must also have K = K k , and the sum over K is also eliminated. Now, for the matrix element of k to be nonzero, and since this operator adds ~k to the momentum ~ (K k ), to recover the initial momentum ~K, we necessarily have k = k . Once these simplifications have been made, we insert the result in definition (28) of (k

1

)=

1

d3

(

d

Φ

K

K

kr

1

(k

)

K

), and get:

k r

k

Φ

2

1 K k

e[

1 K

K

k

]

}

(43)

On the first line, the integral over 3 yields a Kronecker delta that forces k = k; the integral then yields , which cancels the volume appearing in the denominator. The time integral in relation (43) yields a 2 coefficient multiplied by a delta function of the variable 1

+

(k

}. We finally get:

K

1K k

)=

2

Φ

K

K

k

Φ

1 K k

2

K

1K k

(44)

}

K

The same type of calculation also yields: 1

(k

)=

2

Φ

K

k

K

Φ

+1 K+k

2

+1 K+k

K

(45)

}

K

Let us assume, in addition, that our system is at thermal equilibrium, and described by the grand canonical density operator: =

1

(46)

with the classical notation: mann constant) and The two functions condition”: 1

(k

)=

= Tr 1

(}

is the total particle number,

and

) 1

=1

(

is the Boltz-

is the grand canonical partition function. 1

(k

then obey a simple relation, often called “boundary

)

(47)

This relation turns out to be crucial in many calculations involving Green’s functions; we shall use it in § 3.

1793



COMPLEMENT BXVI

Demonstration: To establish this relation, we rewrite equality (45) using the fact that Hermitian conjugates. We then get: 1

(k

)=2

Φ

K

K

+1 K+k

k

2

Φ

k

k

are

(48)

K

+1 K+k

K

and

}

We now permute the indices and , as well as the indices and , we change the dummy summation index to = + 1, and finally replace the dummy variable by = + . This yields: 1

(k Φ

)=2 K

K

K

k

1 K

Φ

1 k

2

K

k

1 K

(49)

k

}

In this summation, just as in (44), the lowest value of the index that gives a contribution is = 1; we therefore get the same expression as (44), with (aside from the irrelevant change of the dummy index into ) just one modification, the replacement of K by K 1k . However, since: K

=

1

[

]

K

(50)

the ratio of these two diagonal elements factor: 1 K

k

+ (

1)+

K

K

1 k

K

=

introduces in the integral a

K

1 K

(51)

k

Now, the delta function in (49) allows replacing, in the exponent of this factor, the energy difference by } : [

K

1 K

k

]=

[}

]

(52)

This factor comes out of the summation and relation (47) is established. 2-d.

Discussion

Expressions (44) and (45) give an idea of the behavior of the functions 1 and 1 . For an ideal gas, relations (36) and (37) show that they are singular functions, actually delta functions forcing the energy } to take on exactly the one-particle kinetic energy }2 2 2 . This is because, in an ideal gas, one can choose Fock states as stationary states Φ K , where each individual state, with a given momentum }k, has a well defined population. In such a case, the operator couple, via its matrix element Φ

K

k

Φ

k in +1 K+k

expression (45) of , one state Φ

1 +1 K+k

can only to the

initial state Φ K : the Fock state, where the occupation number k is larger by one unit. Consequently, the energy difference K always takes the same value, +1 K+k 2 2 the energy } 2 of an individual particle with momentum }k. We find again the results of § 2-b. Let us assume now that the system includes mutual interactions. The matrix +1 element Φ K then yields the scalar product of an -particle system k Φ K+k 1794



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

stationary state with another state where a particle with a well defined momentum }k has been removed from a stationary state of a system with +1 particles. This new state has no reason to also be stationary, because of the particle interactions. It is therefore expected to have a non-zero scalar product with a whole series of kets Φ K with different energies, which can become more numerous as the system enlarges. The sum over no longer reduces to a single value of the energy; in the limit of large systems, it becomes a continuous sum, which absorbs the delta function, so that the function ) takes, for a given k, non-negligible values in a whole domain. A priori, 1 (k its variations can take any value in this domain; however, if the interactions are not too strong and by comparison with the ideal gas, we expect the scalar product to have 2 2 non-negligible values mostly in an energy domain close to 2 . For each K +} k, the interaction effect is to enlarge the energy peak, infinitely narrow for an ideal gas, by giving it a width that increases as the interaction becomes stronger. This width is interpreted in terms of a finite lifetime of the excitation created when a free particle is removed from a stationary state of the ( + 1)-particle system; as the moduli of the two +1 matrix elements Φ K and Φ +1 are equal, this k Φ K+k K+k k Φ K lifetime can also be interpreted as that of the excitation created when a free particle is added to a stationary state of the -particle system.

3.

Spectral function, sum rule

In view of formula (7), it is natural to introduce the real function (k

)=

(k

1

=

)

1

d

k

(k

( )

)

(k

) as: (53a) (53b)

k

where we have used (34) and (35). We call (k ) the “spectral function”; it is real since both functions 1 and 1 are real. For an ideal gas, formula (38) shows that we simply have: (k

)=2

(

)

(54)

but this equality is no longer valid as soon as the particles interact. We are going to show, however, that the spectral function allows expressing the correlation functions with a general formula, reminiscent of that established for an ideal gas, even in the presence of interactions. 3-a.

Expression of the one-particle correlation functions

Inserting (47) in the definition of (k

)=

(}

)

(k

), we get: (55)

1

that is: 1

(k

)=

(k

)

(}

)

(56) 1795

COMPLEMENT BXVI



where (} ) is the usual Bose-Einstein distribution for bosons, and the Fermi-Dirac distribution for fermions: (}

1

)=

(}

(57)

)

Using again (47), we get: (k

1

(}

)=

)

(k

)

(}

)=

(k

) 1+

(}

)

(58)

Knowing the spectral function (k ), we can determine the one-particle correlation functions with formulas similar to those of the ideal gas, containing the same quantum distribution functions . Note, however, that in the presence of interactions, the energy } and momentum }k variables are independent, whereas for an ideal gas, they are constrained by relation (54). 3-b.

Sum rule

We now insert relations (28) and (2) in (53a); since the operators Ψ and Ψ coincide at = 0, we get: (k

)=

d3

(

d

kr

) Tr

Ψ (r

)

Ψ (r = 0)

(59)

Taking a summation over , a time delta function is introduced: d

(k

)=2

(k

)=

d3

kr

Tr

Ψ(r )

Ψ (r = 0)

(60)

that is: d 2

d3

kr

(r ) = 1

(61)

In the presence of interactions, we do not know, a priori, how the spectral function depends on k and . However, for each value of k, the interactions can only modify the frequency distribution, but not its integral over . We have seen that for a gas of interacting particles, the effect of the interactions is to “spread” the spectral function over a certain domain of frequencies, all the while obeying the sum rule (61). There is no reasons for the spectral function to present any particular shape, or still contain delta functions of frequency. It often happens, however, that it presents marked peaks, whose narrowness signals the existence of excitations in the system, behaving almost like free particles (with long lifetimes). These excitations are called “quasi-particles” as they are, in a way, an extension – once the interactions have been introduced – of the independent particles of the ideal gas. The peak associated with a particle of momentum ~k is not, in general, centered on the energy = }2 2 2 of a free particle: in addition to the spreading, the energies of the quasi-particles are shifted by the interactions. This spreading and shifting is reminiscent of the results obtained in Complement DXIII , where we studied, to lowest order, the coupling of a discrete state with a continuum. In other words, one can say that the interactions couple the state of a free particle with momentum ~k, to a continuum of states having different momenta, 1796



SPATIO-TEMPORAL CORRELATION FUNCTIONS, GREEN’S FUNCTIONS

which explains the analogy. Note, however, that the present results concern not a single but an ensemble of identical particles, and its properties at thermal equilibrium; the physical situation is therefore different. To sum up, to go from the ideal gas (where is necessarily equal to ) to an interacting gas, we simply introduce in the Green’s function, and for each value of k, a weighting by a spectral function (k ); this function distributes the dependence over a certain frequency domain. We are going to show that from the spectral function we can infer many properties of an interacting physical system. But this obviously does not mean that the spectral function is easy to compute! On the contrary, in most interacting physical systems, we do not know its exact value. Its very existence, however, independently of its precise mathematical form, is a very useful conceptual tool. 3-c.

Expression of various physical quantities

The spectral function contains information on a great number of physical properties of interacting systems, in a form much more concise than the density operator of the -body system: that operator contains everything but is mathematically far more complicated than a simple function. Consider first the particle density, given by: (r) = Ψ (r)Ψ(r) =

1

(r 0; r 0)

(62)

according to the definition (2) of the normal function 1 . For a translation invariant system, this density is independent of r, and we can set r = 0. The density is then the value, at the origin, of the normal function (31), that is, taking (56) into account: d3

=

3

(2 )

d 2

1

(k

)=

d3 (2 )

d 2

3

(k

)

(}

)

(63)

Let us now study a quantity furnishing more precise information, the particle momentum distribution, and compute the average number k of particles having a momentum }k. We assume the system is contained in a cubic box of volume . Relations (A-9) and (A-10) of Chapter XVI, applied to the case where the basis wave functions are given by (12), yield: k

=

k k

=

1

d3

d3

k (r

r)

Ψ (r )Ψ(r)

(64)

Replacing the integral variable r by s = r r , the average value Ψ (r )Ψ(r + s) appears, which is independent of r because of the translation invariance; we can then replace, in this average value, r by zero, and the integral over 3 simply yields the volume . We are left with: k

=

d3

ks

Ψ (0)Ψ(s)

Now, performing the integral over

(65) of the definition (28) of

1

(k

), and taking (2) 1797



COMPLEMENT BXVI

into account, we get: d

1

(k

)=2

d3

kr

=2

d3

kr

1

(0 0; r 0)

Ψ (0)Ψ(r )

(66)

which is identical to (65) within a factor of 2 . It then follows that7 : k

=

1 2

d

1

(k

)=

d 2

(k

)

(}

)

(67)

In the same way, one can show that the system average energy8 per unit volume is given by: =

d3 3

(2 )

d 2

}

+ 2

(k

)

(}

)

(68)

where is defined in (14). Once this function is known, one can get, by integration over , the logarithm of the partition function, which in turn yields, by derivation, all the thermodynamic quantities (particle density, pressure, etc.). It is remarkable that the spectral function, whose definition comes from the one-particle Green’s functions and could therefore be expected to only contain information on the one-particle density operator, actually allows computing all these physical quantities that depend on the correlations between the particles, and hence on their interactions. With this method, the study of -body properties is reduced to the computation of functions mathematically defined for a single particle. It generalizes, in a way, the ideal gas equations, while taking rigorously into account the presence of an ensemble of indistinguishable particles at thermal equilibrium. It is therefore quite powerful. Nevertheless, this obviously does not solve the problem of calculating the equilibrium properties of an interacting system; in practice, getting precise values for the spectral function can pose a very difficult mathematical problem. Numerous approximation methods have been developed to try and resolve this, using in particular the concept of “self energy” and of perturbation diagrams, but this is beyond the scope of this complement.

7 For an ideal gas of bosons, the chemical potential is always below the lowest individual energy, so that the distribution function never diverges. As in relation (67) is integrated from to + , this divergence now seems unavoidable. Relation (55), however, shows that the spectral function (k ) for

bosons goes to zero for ~ = , as long as the function 1 remains regular at this point. Consequently, integrals (63), (67) and (68) do not present any divergences associated with that value 8 Relation (68) can be demonstrated by first computing the time evolution of Ψ (r ) and Ψ (r ) to get the expression for } Ψ (r )Ψ (r ); one then takes its average value and performs a Fourier transformation to get the average value of the energy – see for example § 2.2 of reference [7].

1798



WICK’S THEOREM

Complement CXVI Wick’s theorem

1

2

Demonstration of the theorem . . . . . . . . . . . . 1-a Statement of the problem . . . . . . . . . . . . . . . 1-b Recurrence relation . . . . . . . . . . . . . . . . . . . 1-c Contractions . . . . . . . . . . . . . . . . . . . . . . 1-d Statement of the theorem . . . . . . . . . . . . . . . Applications: correlation functions for an ideal gas 2-a First order correlation function . . . . . . . . . . . . 2-b Second order correlation functions . . . . . . . . . . 2-c Higher order correlation functions . . . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

1799 1799 1800 1802 1804 1804 1805 1806 1808

For an ideal gas at thermal equilibrium, we computed in Complement BXV the average values of one- and two-particle operators, and showed they could all be expressed in terms of the one-body quantum distributions (the Fermi-Dirac distribution for fermions, and Bose-Einstein for bosons). In this complement we establish a theorem that allows generalizing those results to operators involving any number of particles. The demonstration of Wick’s theorem is explained in § 1, and will be applied in § 2 to the calculation of correlation functions in an ideal gas. 1.

Demonstration of the theorem

Let us consider an ideal gas at thermal equilibrium, described by the grand canonical ensemble (Appendix VI, § 1-c), with the density operator: =

1

(1)

where = 1 is the inverse of the temperature multiplied by the Boltzmann constant , the chemical potential, and the Hamiltonian: =

with: k

=

~2 2

The grand canonical partition function

2

(2) is defined by:

= Tr 1-a.

(3)

Statement of the problem

We wish to calculate the average value =

1 2

of a product of operators: (4) 1799



COMPLEMENT CXVI

where each of the operators :

is, either an annihilation operator

, or a creation operator

=

(5)

Taking into account relations (A-48) and (A-49) of Chapter XV, we have: [

]

=

0 if

and are both creation or annihilation operators is an annihilation, and a creation operator if is a creation, and an annihilation operator

if

=

(6)

with

= +1 for bosons and = 1 for fermions. Assuming the quantum state is described by the density operator the average value of is: = Tr

given in (1), (7)

1 2

As is diagonal in the basis of the Fock states associated with the , this average value will be different from zero only if the series of operators contains, for each creation operator , an annihilation operator in that same individual state; they must exactly balance one another and must therefore appear the same number of times. In particular, the average value will always be zero if is odd; from now on, we shall assume that = 2 , being an integer. 1-b.

Recurrence relation

We have to compute: 2

= Tr

1 2

(8)

2

We first start by changing the order of 1 and 2 , using one of the relations (6); we will then continue to progressively shift 1 towards the right, by permuting it first with 3 , then with 4 , etc. until the permutation with 2 brings it to the very last position. As a trace is invariant under a circular permutation, the operator 1 can then be moved back all the way to the first position, ahead of ; a last commutation with , that we compute just below, returns it to its initial position, and allows computing the value of as a function of the average values of a product of 2( 1) operators. The 2 computation goes as follows: 2

= Tr =[

[

1

2]

1

2]

2]

1

3 4

Tr

3 4

+ Tr =[

Tr +

2

2

[

+

2 2

+ [

2 3 1 4

2

3 4

2

+ [

]

Tr

1

2

2 1 3 4

2

1

3]

Tr

2 4

2

1

3]

Tr

2 4

2

2 3 4

2

2 3 4

+ 1800

Tr

2

1

Tr

2

+

1 1

(9)



WICK’S THEOREM

Most of the terms on the right-hand side are in general zero: the first is non-zero only if 1 and 2 are two conjugated operators (an annihilation and a creation operator associated with the same individual quantum state); the second is non-zero if this is also the case for operators 1 and 3 , etc. After a circular permutation under the trace, the last term of the sum can be written as: 2

1

Tr

1

2 3

(10)

2

We shall now relate both operators 1 and 1 , showing that they are proportional to each other. Assume, for example, that 1 is a creation operator ; in the operator defined in (1), all the terms = commute with and the change of order for the operators leads to two expressions, to be compared: (

)

(

and

)

(11)

By action on the Fock vectors, it is easy to check (as we assumed the system was in thermal equilibrium) that: (

)

(

=

)

(

)

(12)

which leads to: =

(

)

(13)

If 1 is an annihilation operator, the same reasoning shows that the change of order ( ) introduces the inverse factor: . To sum up: =

1

(

)

1

(14)

1

with a + sign in the exponential if 1 is a creation operator, and a sign if ( 1 annihilation operator. Consequently, the last term in (9) is equal to 2 1 with a factor = since = 1. Moving this last term to the left-hand side, we get: 2

(

1 + [

1

3]

1

)

=[

Tr

1

2]

Tr

2 4

+

2

2 2

[

1

3

1 )

is an 2 ,

2

+ 2

]

Tr

2 3

2

1

(15)

On the right-hand side of this equality, all the (anti)commutators [ 1 2 ] are actually numbers, and many of them are zero: as before, the only non-zero ones are those for which the two concerned operators are conjugates of each other (a creation and an annihilation operator for the same individual quantum state). The average value of the product of 2 operators we are looking for can therefore be expressed as a linear combination of average values of products of 2 2 operators. 1801



COMPLEMENT CXVI

1-c.

Contractions

We now define the “contraction” of two operators

and

as the number, written

, defined by: =

1 (

1

[

)

]

=

( (

))

[

]

(16)

where, as above, in the denominator a + sign is chosen in the exponential if 1 is a creation operator, and a sign if 1 is an annihilation operator. The function is the Fermi-Dirac distribution for fermions, and that of Bose-Einstein for bosons: (

)=

1 (

(17)

)

The contraction is zero if it concerns two operators and acting on different individual quantum states; it is also zero if the operators are both creation or both annihilation operators in the same individual quantum state. If is the creation operator, and the annihilation operator in the same individual state, the contraction is simply equal to the distribution function ( ) since: =

=

+ (

1

)

=

(

)

(18)

In the opposite case (antinormal order), the contraction is given by: =

=

1 (

1

)

=1+

(

)

(19)

Relation (15) can thus be rewritten as: 2

=

1 2

3

2

+

1 3

2 4

2

+

+

2

2

1 2

2 3

2

1

(20)

where the traces have been replaced with quantum averages. We shall then reason by recurrence: each of the average values on the right-hand side of (20) is of the same type as 2 written in (8), except for the fact that has been lowered by one unit. Dealing with each of the average values 2 2 as we did for 2 , that last average value now appears as a double sum of terms containing two contractions and average values 2 4 . Continuing as many times as necessary, we end up with an average value 2 expressed as the sum of diverse products of contractions. As an illustration, let us consider a few simple examples. If = 1, we get directly:

1 2

=

1 2

(21)

This simple relation can actually be used as a definition of contractions, instead of (16). If 1 is a creation operator and 2 the corresponding annihilation operator, we get the result (18), equivalent to relations (19) and (23) of Complement BXV ; if the operator’s order is reversed, we get (19) which comes directly from the previous result and from the 1802



WICK’S THEOREM

commutation or anticommutation relation (6). For all the other cases, we find zero on each side of the equality. If = 2, we use a first time relation (20), and obtain: 1 2 3 4

=

1 2

3 4

+

1 3

2 4

+

1 4

(22)

2 3

Using again this same relation, we compute each of the average values of the product of two operators , which yields: 1 2 3 4

=

1 2 3 4

+

=

1 2 3 4

+

1 3 2 4 1 2 3 4

+

+

1 4 2 3

(23)

1 2 3 4

In the second line, we have used a generalization of the notation of products of contractions. When two operators inside a contraction are separated, a permutation is needed to group them. For fermions, this introduces a sign given by the parity of the required permutation, but no sign change for bosons. When two contractions are embedded, we group together all pairs of operators belonging to the same contraction and, for fermions, we multiply by the parity of the corresponding permutation1 ; for bosons, no sign change is introduced. In the present case, we therefore have : 1 2 3 4

=

and

1 3 2 4

1 2 3 4

=

1 4 2 3

(24)

The final result (23) only contains products of contractions, i.e. of distribution functions. One can easily check that, among those three products, a maximum of two are non-zero.

Comments: (i) Another notation is frequently used, where operators and contractions are embedded in the same average value, for instance: 1 2

2

=(

)

1 2

(25)

2

where, for fermions, is the parity of the permutation needed to bring operator next to ; for bosons, = 1. This can be generalized to cases where several contraction appear, embedded or not. (ii) In the limit of zero temperature where

, relation (16) simplifies into:

= 0 if the two operators are of the same nature (both creation, or both annihilation operators)

(26)

as well as: =

1 si 0 si

(27)

1 We multiply by 1 if, when writing the permutation, the number of crossings between brackets is odd. This is for instance the case in the permutation in the left of (24), but not that on the right.

1803

COMPLEMENT CXVI



and: =

0 si 1 si

(28)

(the second lines ( ) of these relations are useful only for fermions, since for bosons cannot be larger than ). 1-d.

Statement of the theorem

The recurrence over

we have been using leads to Wick’s theorem:

“The average value 1 2 is the sum of all the complete systems of 2 contractions that can be made on the string of operators 1 2 2 . Each system is the product of binary contractions (16); for fermions, this product is multiplied by parity factors associated with each of them.” The word “complete” means in this case that in every considered system of contractions, each operator listed in the string of operators is taken in one and only one contraction. The parity factor first includes the parity 1 of the permutation that brings right after 1 the operator it is contracted with; these two operators are then taken out of the list of the . In the remaining list, we again compute the parity 2 of the permutation needed to bring together the next two operators to be contracted, and it is multiplied by 1 . We continue this until all the contractions have been taken into account, and obtain the product 1 2 of all the parities involved. Among all the system of contractions, a very large number yield zero. The only non-zero ones are those for which every contraction contains a creation and an annihilation operator in the same individual quantum state. This rule significantly limits the number of contractions involved in the final result. As seen above, the theorem yields again the results of Complement BXV . For example, if (as is the case in the formula for the two-particle symmetric operators) the first two operators are creation operators, and the last two annihilation operators, the first system of contraction in (23) yields zero, and we are left with the last two, corresponding to the two terms of equation (43) in Complement BXV . The main interest of the theorem is, however, that it allows getting, almost without calculations, the average value of the product of any number of operators. Comment :

Until now, we assumed that the operators were creation or annihilation operators associated with the basis of individual states formed by the one-particle Hamiltonian eigenvectors. If this is not the case, and we wish to compute the average value of the product of creation and annihilation operators associated with any other basis, we first use formulas (A-51) and (A-52) of Chapter XV to express those operators in terms of the operators associated with the eigenbasis of the one-particle Hamiltonian, and then use Wick’s theorem. 2.

Applications: correlation functions for an ideal gas

As an illustration of the use of Wick’s theorem, we now compute the -order correlation functions in an ideal gas at thermal equilibrium. Thanks to Wick’s theorem, they can 1804



WICK’S THEOREM

each be expressed as simple products of first order correlation functions. As a first step, we will derive, in a simpler way, a number of results already obtained in § 3 of Complement BXV ; these will then be generalized to correlation functions of a higher order. Consider a gas of spinless particles, confined by a one-body potential inside a cubic box of edge length ; this potential is zero inside the box, and becomes infinite outside. We use periodic boundary conditions to account for this confinement (Complément CXIV , § 1-c); the normalized eigenfunctions k (r) of the kinetic energy are then written: k

(r) =

1

kr

(29)

3 2

where the possible wave vectors k are those whose three components are integer multiples of 2 . 2-a.

First order correlation function

Relation (B-21) of Chapter XVI defines the first order correlation function which depends on the two positions r1 and r1 : 1 (r1

r1 ) = Ψ (r1 )Ψ(r1 )

1,

(30)

Using relations (A-3) and (A-6) of Chapter XVI, the field operator can be expressed as a function of the annihilation operators k in the state (29), and its adjoint, as a function of the creation operators k in that same state. Taking into account (29), this leads to: 1 (r1

r1 ) =

1

(k

r1 k r1 )

k

(31)

k k

3 k

At thermal equilibrium, all the average values of operators state described by the density operator written in (1): = Tr

are taken in the

(32)

We can then use Wick’s theorem in a particularly simple case, since in (31) the only contraction that comes into play is the one containing

k

k

. Relation (18) thus

applies and we get: 1 (r1

r1 ) =

1

k (r1 r1 )

3

(

)

k

(33)

k

The correlation function 1 (r r ) is therefore directly (to within a constant factor) the Fourier transform of the distribution function ( k ) itself. The definition of 1 can be generalized, using the expressions of the field operators in the Heisenberg picture; this leads to a correlation function depending on space and time: 1 (r1

; r1

) = Ψ (r1 )Ψ (r1

)

(34) 1805



COMPLEMENT CXVI

For free particles (ideal gas), we have (§ 1-c of Complement BXV ): Ψ (r ) =

1

(k r

)

(35)

k

3 2 k

where is the (angular) Bohr frequency associated with the energy of a particle of mass , with wave vector k: =

} 2

2

(36)

For an ideal gas, we simply multiply each exponential k r by e to go from the Schrödinger to the Heisenberg representation. Expression (33) is then generalized as: 1 (r1

; r1

)=

1

[ k (r1

r1 )

(

)]

(

3

)

k

(37)

k

Note that this correlation function only depends on the differences in positions (space homogeneity) and times (time translation invariance). 2-b.

Second order correlation functions

.

Application of Wick’s theorem The second order correlation function is defined as: 2 (r1

r1 ; r2 r2 ) = Ψ (r1 )Ψ (r2 )Ψ(r2 )Ψ(r1 )

(38)

Here again, the average value is computed with the density operator at thermal equilibrium. The same calculation as in § 2-a leads to: 2 (r1

r1 ; r2 r2 ) =

1

(k

r1 +k

r2 )

r2 k r1 k

k k

6 k

k

k

k

k

(39)

k

As we already saw in (23), using Wick’s theorem yields two contraction systems, one where k is contracted with k (and hence k with k ), and another one where k is contracted with k (and hence k with k ): k k

k

et

k

k k

k

k

The second contraction involves an odd permutation, and introduces a factor . We therefore obtain (after changing the dummy index k into k ): 2 (r1

=

r1 ; r2 r2 ) 1

[k (r1

r1 )+k

(r2

r2 )]

6 k

[k (r2

+

r1 )+k

(r1

r2 )]

k

(

k

)

(

k

)

(40)

that is, taking (33) into account: 2 (r1

1806

r1 ; r2 r2 ) =

1 (r1

r1 )

1 (r2

r2 ) +

1 (r1

r2 )

1 (r2

r1 )

(41)



WICK’S THEOREM

This means that the second order correlation function is simply expressed as the sum of two products of first order correlation functions. The first is the direct term, and corresponds to totally uncorrelated particles. The second is the exchange term, a consequence of the quantum indistinguishability of the particles; it has an opposite sign for fermions and bosons. As in Complement AXVI , we will show that this term introduces correlations between the particles. .

Double density

Of particular interest is the “diagonal” case where r1 = r1 and r2 = r2 , as the function 2 (r1 r1 ; r2 r2 ) becomes very simple to interpret: it is the “double density” 2 (r1 r2 ) characterizing the probability of finding a particle at point r1 and another one at point r2 . The above relation takes on the simplified form: 2 (r1

r2 ) =

1 (r1

r1 )

1 (r2

r2 ) +

1 (r1

r2 )

1 (r2

r1 )

(42)

If, in addition, r1 = r2 , this function indicates the probability of finding two particles at the same point. We then get: for fermions for bosons

2 (r; r) 2 (r; r)

=0 =2 [

1 (r

(43)

2

r)]

For fermions, we see as expected that one can never find two of them at the same point in space, a consequence of Pauli exclusion principle. For bosons, we find that the double density is twice the square of the one-body density. Now, if both particles were uncorrelated, this double density should simply be equal to the square, without the factor two. This factor two thus indicates an increase in the probability of finding two bosons at the same point in space; it expresses the bunching tendency of identical bosons, a tendency that comes from a pure quantum statistical effect since we assumed the particle’s interactions to be zero. These results were already discussed in Complement AXVI – see in particular Figure 3. The Hartree-Fock method (mean field approximation), presented in Complements EXV and FXV , uses a variational ket (or a density operator) such that the binary correlation function 2 (r1 r2 ) is given by the sum of products of functions 1 , written in (42), even in the presence of interactions. Moreover, another way of introducing the Hartree-Fock approximation is to assume directly that the binary correlation function keeps this form even in the presence of interactions, which then allows a simple calculation of the interaction energy. Even though this method has numerous applications, and may be quite precise in certain cases, it does rely on an approximation: when the particles interact with each other, there is no general reason for 2 to remain linked to 1 by this relation, obtained with the assumption that the gas was ideal. .

Time-dependent correlation function

As we did for the first order correlation function, we can include time dependence in the field operators, and define: 2 (r1

1 ; r1

1 ; r2

2 ; r2

2)

= Ψ (r1

1 )Ψ

(r2

2 )Ψ

(r2

2 )Ψ

(r1

1)

(44) 1807



COMPLEMENT CXVI

To include the time dependence, we simply add, as above, in each spatial exponential with wave vector k, a time exponential with the corresponding angular frequency , which leads to: 2 (r1

1 ; r1

1 ; r2

1

2 ; r2

k

+ e 1 (r1

(

1 )+k

1

(r2 r2 )

(

2 )]

2

k i[k (r2 r1 )

=

=

r1 )

ei[k (r1

6

2)

1 ; r1

1)

(

2

1 )+k

1 (r2

2 ; r2

(r1 r2 ) 2)

(

+

1

1 (r1

2 )]

(

1 ; r2

2)

)

k 1 (r2

(

)

k

2 ; r1

1)

(45)

Hence, when time dependence is included, we get a factored relation similar to (41). As before, because of the space homogeneity and the time translation invariance, only the differences in the space and time variables appear in the correlation function expression. 2-c.

Higher order correlation functions

In a more general way2 , the (r1 r1 ; r2 r2 ; ; r

-order correlation function

is defined by:

r ) = Ψ (r1 )Ψ (r2 ) Ψ (r )Ψ(r ) Ψ(r2 )Ψ(r1 )

(46)

These functions give information on the correlated behavior of groups of particles in an ideal gas at equilibrium. Using Wick’s theorem, each of them can be expressed in terms of the first order correlation function 1 (r1 r1 ). As an example, let us study the correlation function of order three: 3 (r1

r1 ; r2 r2 ; r3 r3 ) = Ψ (r1 )Ψ (r2 )Ψ (r3 )Ψ(r3 )Ψ(r2 )Ψ(r1 ) 1 = 9 k

k

k

(k

r1 +k

k

k r2 +k

k r3 k r1 k

r2 k

k k

k

k

r3 ) k

k

(47)

Six contraction systems must be considered to compute the average values. In the first system, k and k are associated, k with k , and finally k with k . One can then permute the three vectors k , k and k in 5 different ways, with odd or even permutations. In each of the terms thus obtained, the sixfold summation on the wave vectors is reduced to a triple sum, which yields a product of functions 1 . This leads to: 3 (r1

+ +

r1 ; r2 r2 ; r3 r3 ) = 1 (r1 r2 ) 1 (r2 r3 ) 1 (r1 r1 ) 1 (r2 r3 )

(r1 r1 ) 1 (r2 r2 ) 1 (r3 r3 ) (r 1 3 r1 ) + 1 (r1 r3 ) 1 (r2 r1 ) 1 (r3 r2 ) (r 1 3 r2 ) + 1 (r1 r3 ) 1 (r2 r2 ) 1 (r3 r1 ) + 1 (r1 r2 ) 1 (r2 r1 ) 1 (r3 r3 ) 1

(48)

The computation can be generalized in the same way to correlation functions of any order; in an ideal gas, they are not independent since they are all simple products 2 We only consider here the so-called “normal”correlation functions, those where the Ψ come before the Ψ. In Complement BXVI we introduce more general correlation functions.

1808



WICK’S THEOREM

of first order correlation functions. In other words, the function 1 contains all the information necessary for computing correlations of any order. Finally, we can compute the triple density 3 (r1 r2 r3 ) by setting r1 = r1 , r2 = r2 and r3 = r3 in (48). The particular case r1 = r2 = r3 where all the positions are identical is interesting. For fermions, the triple density is zero, for the same reason as with the double density: Pauli principle does not allow several fermions to occupy the same point in space. For bosons, we find: 3 (r

r; r) = 6 [

3

1

(r r)]

(49)

For three identical bosons, the bunching tendency under the effect of their quantum statistics is even higher than for two bosons, introducing a factor 6 instead of 2. Comment: The results we have obtained are valid when the system’s density operator is that of an ideal gas at thermal equilibrium as in relation (1), but they could be quite different if the system is in another state. If, for example, we assume (as in Complement AXVI ) that the system is described by a Fock state, the relations between correlation functions can be totally different. The simplest case is that of an ideal gas of bosons in its ground state, where all the bosons occupy the same individual state; relations (24) and (25) of that complement indicate that: 2 (r1

r2 ) =

1

1 (r1

r1 )

1 (r2

r2 )

1 (r1

r1 )

1 (r2

r2 )

(50)

2 is thus simply the product of two functions 1 , without the exchange term of (42); consequently, the factor 2 in the second line of (43) no longer exists. In a similar way, one can show that the factor 6 of relation (49) is no longer present. In a general way, for an ensemble of bosons all in the same individual state, the bunching effects related to the indistinguishability of the particles are not present.

Following this line of thought, note that it is not possible to get the projector onto a Fock state (other than the vacuum) such as the one discussed above, by using the density operator (1) at thermal equilibrium, and taking its limit as the temperature goes to zero, i.e. as . This is because this density operator associates with each individual state an occupation number distribution that is always a decreasing exponential, and never a narrow curve centered around a high value of the particle number. Consequently, there exist large fluctuations of the particle number in each mode, and hence the presence of the factors 2 in (43) and 6 in (49), whatever the value of .

To conclude, let us mention that Wick’s theorem can take on diverse forms, in particular at zero or non-zero temperadure (see for instance Chapter 4 of Reference [5]). As we saw, thanks to this theorem, and when dealing with independent particles, the computation of correlation functions of any order, time-dependent or not, can be reduced to computing the product of first order correlation functions. It is obviously a great simplification. This property is reminiscent of random Gaussian variables in classical statistics: for such variables, all moments of any order can be expressed in terms of products of the lowest order moment. These properties are characteristic of an ideal gas: in a system where particles interact, the correlation functions of successive orders remain independent in general. Nevertheless, the use of Wick’s theorem is not limited to ideal gases; its range of application is much more general, and it is very useful in perturbation calculations where power series of the interaction potential are derived [5]. 1809

Chapter XVII

Paired states of identical particles A

B

C

D

E

Creation and annihilation operators of a pair of particles . 1813 A-1 Spinless particles, or particles in the same spin state . . . . . 1813 A-2 Particles in different spin states . . . . . . . . . . . . . . . . . 1816 Building paired states . . . . . . . . . . . . . . . . . . . . . . . 1818 B-1 Well determined particle number . . . . . . . . . . . . . . . . 1818 B-2 Undetermined particle number . . . . . . . . . . . . . . . . . 1820 B-3 Pairs of particles and pairs of individual states . . . . . . . . 1822 Properties of the kets characterizing the paired states . . . 1822 C-1 Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . 1822 C-2 Average value and root mean square deviation of particle number1825 C-3 “Anomalous” average values . . . . . . . . . . . . . . . . . . . 1828 Correlations between particles, pair wave function . . . . . 1830 D-1 Particles in the same spin state . . . . . . . . . . . . . . . . . 1831 D-2 Fermions in a singlet state . . . . . . . . . . . . . . . . . . . . 1834 Paired states as a quasi-particle vacuum; Bogolubov-Valatin transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 1836 E-1 Transformation of the creation and annihilation operators . . 1836 E-2 Effect on the kets k . . . . . . . . . . . . . . . . . . . . . . 1838 E-3 Basis of excited states, quasi-particles . . . . . . . . . . . . . 1840

Introduction Fock states were introduced in Chapter XV by the action on the vacuum of a product of individual state creation operators. A certain number of their properties were studied in § C-5-b- of Chapter XV and in Complement AXVI (exchange hole for fermions, bunching Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

effect for bosons). We also used Fock states in Complements CXV , DXV and FXV as variational kets to account, approximately, for the interactions between the particles. This led us, both for fermions and bosons, to a mean field theory where each particle can be seen as propagating in the mean field created by all the others. We now introduce a larger class of variational states to improve the accuracy of these results, allowing us to study many more properties of physical systems of identical particles. It concerns the “paired states” obtained by the action on the vacuum of a product of creation operators, no longer of individual particles but rather of pairs of particles (if these particles form a molecule, we are dealing with molecule creation operators). As we shall see in the course of this chapter, these paired states are more general than the Fock states, since they can be reduced to Fock states for certain values of the parameters characteristic of the pair1 . What is sought is an improvement of the variational method allowing us to ameliorate our treatment of the interactions, compared to that based on Fock states. The additional flexibility introduced by the paired states plays an essential role for the following simple reason: changing the properties of the pair wave function (r1 r2 ) used to build them, we modify the binary correlation function of the -particle system. We therefore take advantage of the power of the mean field method, while retaining the possibility of taking into account any binary correlations. Whereas using variational Fock states allows taking into account only statistical correlations (due to particle indistinguishability), the paired states enable us to add dynamic correlations (due to interactions). These latter correlations are essential: when dealing with binary interactions (as is the case with a standard Hamiltonian), these correlations actually determine the average value of the potential energy (Chapter XV, § C-5-b). Three-body, four-body, etc.. correlations are indeed present in the system, playing their role; they are not, however, directly involved in the energy. This explains why the optimization with paired states of only the binary correlations can lead to fairly good results in the study of -body systems. These possibilities have a wide range of applications for both fermions and bosons, which will be discussed in the complements. This chapter is centered on the study of the general properties of paired states, and introduces the tools for handling such states. We study, in parallel, fermions and bosons to highlight the numerous analogies between results obtained for both cases. We first introduce (§ A) the creation and annihilation operators for pairs of particles. We then build (§ B) the paired states and discuss some of their properties; this permits introducing (§ C) the concept of “normal average values” (average values of operators conserving the particle number) or “anomalous average values” (average values of operators changing the particle number). We then show in § D how the paired states allow us to actually vary the spatial correlation functions of a system of identical particles. This will lead us to introduce a function playing an important role in what follows (in particular in the complements of this chapter), the pair wave function pair , which is related to the anomalous average values. 
We then study in § E another interesting property of the paired states: they can be related to the concept of “quasi-particle” thanks to the introduction of new creation and annihilation operators resulting from a linear transformation of the initial operators (Bogolubov transformation). As the paired states are eigenkets of the new annihilation operators with a zero eigenvalue, they behave as 1 For example, we shall clarify at the end of § C-1-a why the Hartree-Fock method can be viewed as a particular case of the pairing method.

1812

A. CREATION AND ANNIHILATION OPERATORS OF A PAIR OF PARTICLES

a “quasi-particle vacuum”. Furthermore, the creation operators can associate with each paired state an entire basis of other orthogonal states, which are interpreted as states occupied by quasi-particles. This study of the necessary tools for handling the paired states will be continued in the first two complements, AXVII and BXVII . ComplementAXVII discusses a complementary aspect of pairing, the introduction of the pair field operators. These operators have a non-zero average value in paired states, and highlight the cooperative effects existing in those states. This can lead to the spontaneous appearance in the system of an order parameter, described by the same pair wave function pair as the one appearing in the computations of correlation functions in a paired state. In addition, Complement AXVII will show that the commutation properties of these operators are reminiscent of those of a boson field: in a certain sense, a composite object built from two identical particles (whether they are bosons or fermions) behaves as a boson. It is, however, only an approximation, as can be inferred from the corrective terms appearing in the computation of the commutators, which can sometimes play an important role. Complement BXVII discusses the computation of the energy average value in a paired state, whose expression is the basis of the following complements; it gives an example of how to deal with normal and anomalous average values in these computations. The last three complements apply these results to the variational study of interacting boson or fermion systems. For fermions, the paired states play an essential role in the BCS (Bardeen-Cooper-Schrieffer, theory of supraconductivity) theory of supraconductivity (Complement CXVII ), and explain the appearance of a pair field as a collective effect; paired states also come into play noticeably in nuclear physics, and in the study of ultra-cold fermionic atomic gases. For repulsive bosons (Complement EXVII ), paired states can be quite useful for studying the ground state properties, and to obtain, for example, the Bogolubov linear spectrum. In that case, the paired state is associated with another state (a coherent state, for example), whose role is to describe the condensate as an accumulation of a notable fraction of particles in a single individual quantum state.

A.

Creation and annihilation operators of a pair of particles

Let us introduce the creation or annihilation operators, no longer of a single particle, but of two identical particles in a bound state. We first assume the particles have no spin (or are both in the same spin state, so that no spin variable is needed).

A-1.

Spinless particles, or particles in the same spin state

Consider two identical particles (bosons or fermions in the same spin state), with positions r1 and r2 ; the system is contained in a cubic box of edge length and volume = 3 . These two particles occupy a bound state, characterized by a normalized wave function (r1 r2 ), forming a kind of binary “molecule”. The state of the system is defined by this wave function (as far as its internal orbital variables are concerned), by spin variables identical for both particles (since those spin variables are of no importance here, they need not be written explicitly in what follows), and finally by its external orbital variables (center of mass). The normalized wave function of a “molecule” having 1813

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

a total momentum }K is then: 3 2

(r1 r2 ) = ( )

K

K (r1 +r2 ) 2

(

3

=( )

k

K 2 +k

(r1

)

r1

r2 ) (

k) r2

K 2

(A-1)

k

where

k

k

=

(r) =

is the Fourier transform of : 1

kr

3

3 2

(r)

3

1

kr

k

3 2

(A-2)

k

We assume that the individual wave functions of the particles obey the periodic boundary conditions (Complement CXIV , § 1-c); in (A-1), each component of the wave vector of particle 1 or 2 can therefore take only the values 2 ,2 and 2 , where , and are any integer number (positive, negative, or zero). The normalization of the functions and is written: d3

2

(r)

2

=

k

3

=1

(A-3)

k

Moreover, for identical particles, the symmetrization (or antisymmetrization) requires the function (r) and its Fourier transform (k) to have the parity : k

=

(A-4)

k

( = +1 for bosons, = 1 for fermions). In terms of kets, relation (A-1) becomes: K (1

2) = ( )

3

d3

1

d3

( K2 +k) r1 ( K2

k

2

k) r2

1 : r1 ; 2 : r2

k

=

k

1:

k

K + k ;2 : 2

K 2

(A-5)

k

which, taking (A-4) into account, and changing the sign of the sum variable k, can also be written as: K (1

2) =

1 2

k k

1:

K + k ;2 : 2 +

1:

K 2 K 2

k k ;2 :

K +k 2

(A-6)

The expression between brackets in the summation is simply the (anti)symmetrized ket of two particles, the first one of momentum } (k + K 2), and the other one of momentum } ( k + K 2). Two cases must be distinguished: (i) If k = 0, to normalize the ket between brackets, we divide it by 2; we then get a Fock state where two individual states with different momenta are occupied (see 1814

A. CREATION AND ANNIHILATION OPERATORS OF A PAIR OF PARTICLES

the general definition of the Fock states in Chapter XV). The ket between brackets is thus equal to: 2

K 2 +k

K 2

k

0

(A-7)

(ii) If k = 0 and in the case of bosons, the ket between brackets is equal to twice the Fock state where a single individual level is occupied by two particles; this ket is equal to: 2

2

0

K 2

(A-8)

For fermions, the ket between brackets must be zero, which is indeed the case of the ket in (A-8). To sum up, whether we are dealing with fermions or bosons, and whether k is zero or not, the ket between brackets can always be expressed as (A-7). This leads to: K

1 2

=

k

K 2 +k

k

K 2

k

0

(A-9)

If the particles are all in the same spin state, remember that in this expression the spin index is implicit: each creation operator is associated with an individual state whose momentum is specified by the operator index, and whose spin state is the common spin state of all the particles. The creation operator K of a “molecule” having a total momentum }K can therefore be written as: K

1 2

=

k k

K 2 +k

K 2

(A-10)

k

Its action is to create two particles of momenta } [(K 2) k], with amplitudes given by the function k . As this function has the parity , we note that: k

K 2

K 2 +k

k

=

k

K 2

k

K 2 +k

=

k

K 2 +k

K 2

k

(A-11)

Accordingly, the contributions of opposite values of k double each other in (A-10). Such a redundancy will cause a problem in § B-2, when we write a tensor product. It is thus preferable to eliminate it right now and this is why we restrict the summation over k to half the wave vector space. Calling this half space, we shall write K in the form: K

=

2

k k

K 2 +k

K 2

k

(A-12)

For a “molecule” having a zero total momentum, this relation becomes: K=0

2

=

k

k

k

(A-13)

k

As for the annihilation operator of a molecule with total momentum }K, it is simply the adjoint of (A-12): K

=

2

k

K 2

k

K 2 +k

(A-14)

k

1815

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

We have reasoned in terms of “molecules” being created or annihilated, but the wave function (r) and its Fourier transform k are not related to any particular bound state, and do not imply the existence of any attraction potential between the two components of this “molecule”. Actually, in what follows, the k will play the role of freely adjustable parameters, for example when using a variational method. To illustrate this generality, we shall, from now on, talk about “pairs”. Comment:

If we choose for (A-4): k

=

1 [ 2

as in a Kronecker delta, with the symmetrization required by

k

k k0

+

k0 ]

k

(A-15)

we get, according to (A-12): K

=

1 2

K 2 +k0

K 2

k0

+

K 2

k0

K 2 +k0

=

K 2 +k0

K 2

(A-16)

k0

In the right-hand side of this relation, the momenta appearing as indices of the creation operators can take any given values, obtained by varying K and k0 . It is therefore possible, by a suitable choice of the pair’s parameters, to create two particles in individual states having any given momenta, and thereby obtain a Fock state. Successive applications of operators K (having, in general, different values of K and k0 ) can thus yield a Fock state with 2 particles whose momenta can take on any values. A-2.

Particles in different spin states

We assume the internal state of the pair is a tensor product of an orbital state depending on r1 r2 and a spin state . Equation (A-1) must then be replaced by: 1

2

K

(r1 r2 ) = 1 : r1 =( ) =( )

1;

3 2

2 : r2

2

ΦK

K (r1 +r2 ) 2

(

3 k

K 2 +k

(r1

r2 )

) r1 (

K 2

1

2

k) r2

1

(A-17)

2

k

This means that relation (A-1) is to be multiplied by written: K (1

2) = ( )

3

d3

d3

1

k

2

1

( K2 +k) r1 ( K2

; relation (A-5) is now

2

k ) r2

k 1 1

=

k k

1816

1 1

2

1 : r1

2

1;

2 : r2

2

1;

K 2

2

2

1:

K +k 2

2:

k

2

(A-18)

A. CREATION AND ANNIHILATION OPERATORS OF A PAIR OF PARTICLES

The function (r1 r2 ) is supposed to have an orbital parity equal to , and the spin ket , a parity with respect to the exchange of spins equal to , with, obviously: =

(A-19)

Hence: K (1

1 2

2) =

k

1

k

1

1:

2

K +k 2

1; 2

:

K 2

k

2

2

+

1:

K 2

k

2; 2

:

K +k 2

1

(A-20)

which shows that the creation operator of a pair is: K

=

1 2

k k

1 1

2

K 2 +k

1

K 2

k

(A-21)

2

2

As an example, for two fermions of spin 1 2 in a singlet state: K

=

1 2

k

K 2 +k

k

=+

K 2

k =

K 2 +k

=

K 2

k =+

(A-22)

Since = 1, the functions (r) and k are even. Using this parity, we can exchange the dummy indices k and k in the second term on the right-hand side, and change the order of the two creation operators, with a sign change (anticommutation of fermionic operators). This second term then doubles the first one, and we get: K

=

k k

K 2 +k

K 2

k

(A-23)

with the simplified notation we shall use from now on: k =+

noted:

k

k =

noted:

k

(A-24)

(and, of course, a similar notation for the creation operators ). Note in passing that, because of the presence of spins, no redundancy is present in the summation appearing in (A-23), and there is no need to restrict it to a half-space. Comments: (i) Taking for k a (symmetrized) delta function, as in (A-15), it is possible, as we pointed out before, to construct any Fock state with arbitrary momenta by successive application of operators K on the vacuum; note, however, that the total occupation numbers of the two spin states must remain equal. (ii) Choosing in (A-22) a function k that is even instead of odd for fermions, the operator written in (A-23) creates a fermion pair with a total spin state = 1, and a = 0 component. This is because replacing the minus sign by a plus sign in the middle of the bracket of relation (A-22) yields a triplet spin state; using the fact that k is now an odd function, the same reasoning leads to (A-23).

1817

CHAPTER XVII

B.

PAIRED STATES OF IDENTICAL PARTICLES

Building paired states

To avoid complex formulas, we will build the simplest possible paired states Ψ . We shall be guided by the Gross-Pitaevskii variational method (Complement CXV ), where we assumed that the state of the -particle system could be obtained from the vacuum by creating particles in the same individual state. However, instead of applying many times the creation operator k of a single particle to the particle vacuum, we shall now use the pair creation operator K . This difference is essential, in particular for fermions. As we know, it is impossible to create several fermions in the same individual quantum state, since the square, cube, etc. of any creation operator acting on a given individual state yields zero. We shall see, however, that the creation of pairs of fermions, all in the same quantum state, does not lead to a zero state vector. B-1.

Well determined particle number

We define the paired state Ψ (K) as the (non-normalized) state vector where = 2 particles form pairs, each having a total momentum }K: Ψ (K) =

0

K

(B-1)

where K has been defined in (A-12) or (A-21), depending on the case. To keep things simple, we assume in what follows that all the created pairs have zero total momentum; if this is not the case, we can change the reference frame and choose the one where the common value of the total momentum of all the pairs of particles is zero. The paired state Ψ is then written: Ψ

=

0

K=0

(B-2)

We shall first study (as in § A-1) the case of bosons or fermions in the same spin state. As for the case of particles in several spin states (as in § A-2), we shall, from now on, limit our study to fermions in a singlet state; this will allows exposing the general principle while avoiding more complex calculations. In both cases, the 2 -particle state only depends on the values of the parameters k . As soon as 1, we will see that the normalization of the ket Ψ (K) does not reduce to the simple condition (A-3), which 2 required the sum of the k to be equal to unity. This is why we shall consider from now on that the k are totally free variational parameters. For example, multiplying them all by the same constant, one can choose to vary at will the norm of Ψ (K) . This will offer a flexibility simplifying the computations. B-1-a.

Particles in the same spin state

For particles in the same spin state, we can use (A-13), which leads to: Ψ

2

=

k

k

k

0

(B-3)

k

where is the summation domain defined previously (half of the k space); remember that the physical system is assumed to be contained in a cubic box of side length and 1818

B. BUILDING PAIRED STATES

volume = 3 ; the periodic boundary conditions then fixes all the possible values for the summation over k. Note also that the spin index is implicit: k is the creation operator in the individual state defined by the momentum }k and the unique spin state we are concerned with. Initially, the parameters k were introduced as the Fourier components of the normalized pair wave function (r); the sum of their moduli squared was fixed to unity. This condition, however, does not ensure the normalization of Ψ , as we now show. Because of the power of the operator appearing in (B-3), factors containing square roots of occupation numbers will be introduced for each occurrence of the index k; the ket Ψ is therefore not a simple tensor product, and its norm is not simply the sum of the squared moduli of the k raised to the power . It will be simpler for the following computations to consider the k as entirely free parameters, and hence not impose a normalization of the state Ψ . On can choose to take a finite or infinite number of non-zero k . The simplest case is the one already discussed above, where k then becomes k k0 ; the ket Ψ proportional to a simple Fock state where only two states of opposite momenta are occupied. For other functions k , the structure of the paired state will be more complex; adjusting those parameters allows a fine tuning of the particle correlation properties, which is not possible with a simple Fock state. B-1-b.

Fermions in a singlet state

Another frequently encountered case concerns fermionic particles in a singlet state; we must then use operator (A-23). The paired state is then:

Ψ

=

K=0

0

=

k

k

k

0

with

=2

(B-4)

k

The summation over k runs over all the non-zero wave vectors, without the restriction (B-3) to the half-space (because of the spins, the pairs of states k , k and k , k are different). Here again we see that the normalization of Ψ does not simply reduce to condition (A-3). When 1, the same index k may appear twice (or more) in the expansion of the power of the operator on the right-hand side of (B-4); the corresponding component cancels out since the square of any fermionic creation operator is zero. The norm of the ket Ψ is therefore a complex expression. Rather than imposing this norm to be equal to one, it is easier to let it vary, and consider the k to be totally free variational parameters. B-1-c.

Consequences of the symmetrization

The state vectors (B-3) for bosons, and (B-4) for fermions, are not simple juxtapositions of pairs of particles, each being described by the relative wave function (r), with k as its Fourier transform, according to (A-2). As we already saw, the symmetrization or antisymmetrization of the 2 -particle paired states strongly affect their norm; it also affects the very structure of these states, which are not merely the tensor product of pair states. This is particularly obvious for fermions: expanding the sum of operators to the power in the curly brackets of (B-4), we get the product of sums 1819

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

over indices k1 , k2 ,.., k , and many terms will cancel out: all those for which two (or more) summation indices k are equal (in which case we get squares of creation operators, which are zero for fermions). There exists, however, a limiting case where the paired state practically describes the juxtaposition of binary molecules. It occurs when the wave function (r1 r2 ) varies over a very short range, and hence has a large number of significant Fourier components k . When this number is much larger than the number of pairs , most of the terms do not contain several occurrences of the same summation index k, and consequently the paired state vector is very close to the tensor product of the pairs of particles. This state vector describes a gas formed by very strongly bound binary molecules, each moving in the mean field created by all the others (it is, in a certain sense, a “molecule Fock state”). This is actually a very special case; in general, when the pair wave function does not obey that criterion, we can study many other physical situations, hence the interest for introducing paired states. Even though the values of k or the wave function (r) of a “molecule” are mathematically the starting ingredient that allows building Ψ , the resulting state after symmetrization has a complex structure, hard to describe in terms of molecules. On the other hand, this state has a simple property: it contains exactly = 2 particles, since this is the case for each of its non-zero components; as all the particles are paired, it contains exactly pairs. B-2.

Undetermined particle number

Computations with the ket Ψ written above (and in particular its normalization) are not easy: a great number of individual states k appear inside the curly brackets, which must be raised to a very large power . This practical difficulty leads us to introduce another variational state Ψpaired where the total number of particles is no longer fixed. This new state, which leads to simpler calculations2 , is defined, starting with (B-2), by: 1

Ψpaired = =0

!

Ψ

1

= =0

!

K=0

0

(B-5)

The Ψ are not normalized; multiplying all the k , and hence K=0 , by the same constant , changes their norm by the factor . This results in varying the relative weights of the terms in the serie (B-5). The larger , the more weight is placed on the high values of , which is a way, for example, of modifying the average particle number. In (B-5) we recognize the series expansion of an exponential, so that: Ψpaired = exp

K=0

0

(B-6)

This property will greatly simplify the following calculations and is the major reason for letting the total particle number fluctuate. Writing (B-5), we chose a state vector that is the superposition of states corresponding to different total particle numbers ; there are actually no physical processes taken into account in our approach that could create such a coherent superposition. This 2 This does not mean that computations with a variational state having a fixed number of particles are always impossible, as shown for example in the treatment of the BCS theory in § 5.4 and Appendix 5C of the book [8].

1820

B. BUILDING PAIRED STATES

operation reminds us of the passage from the canonical to the grand canonical ensemble where one introduces, for mathematical convenience, an (incoherent) statistical mixture of different values. In our present case, however, we are dealing with a coherent superposition, introduced arbitrarily as we just did, and we may wonder whether it might radically change the physics of our problem. This is actually not the case for two reasons. The first is that, for very large values of , we are going to show that the components of Ψpaired are only important in a domain of whose width is very small compared to the average value of the particle number; the distribution of the possible values for is thus very narrow, in relative value, and the particle number remains quite well defined. The second reason is that we shall compute average values of operators that, such as , conserve the total particle number, and for which the coherence of the state vector between kets of different values is irrelevant. The average value in the coherent state Ψpaired is therefore the weighted average of the average values obtained for each which, when the average value of the particle number is very large, are approximately the same (since the distribution is very narrow). In other words, the average values we are going to compute are good approximations of those we would obtain by projecting Ψpaired onto one of its main components with fixed ; using the coherent superposition (B-5) is thus very convenient from a mathematical point of view, without greatly perturbing the results from a physical point of view. A more detailed discussion of this question will be presented in § 1 of Complement BXVII . B-2-a.

Particles in the same spin state

When all the particles are in the same spin state, inserting (A-13) in (B-6) leads to: Ψpaired = exp

2

k

k

0

k

(B-7)

k

The operators k k and k k commute with each other (for fermions in the same spin state, two minus signs cancel each other as we commute products of two operators). It then follows that the exponential of the sum is a product of exponentials, and we can write: Ψpaired =

exp

2

k

k

k

0

k

=

(B-8)

k k

The state vector Ψpaired is then simply a tensor product3 of state vectors k

= exp

2

k

k

k

0

k

: (B-9)

3 The Fock space is the tensor product of the states associated with all the individual quantum states k , each having any positive occupation number. One can regroup those spaces in pairs corresponding to opposite values of k, and introduce spaces (k) whose tensor product is also the Fock space. To build a basis in those spaces, one must vary two occupation numbers. The restriction of the summation over k to a half-space , introduced above, prevents each component of the tensor product from appearing twice in (B-8).

1821

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

For fermions in a single spin state, the square of any creation operator is zero; the exponential reduces to the sum of the first two terms of its expansion: = 1+

k

B-2-b.

2

k

k

k

0

(fermions only)

(B-10)

Fermions in a singlet state

For paired fermions in a singlet state, the state Ψpaired will be called the “BCS state” (Complement CXVII ) and noted ΨBCS ; relation (A-23) must be used with K = 0. As the exponential of a sum of commuting operators4 is a product of operators, we get: ΨBCS = exp

k

k

k

0 =

k

(B-11)

k

with: = exp

k

k

k

k

0

(B-12)

As the square of any fermion creation operator is zero, the series expansion of the exponential is limited to its first two terms: = 1+

k

B-3.

k

k

k

0

(fermions only)

(B-13)

Pairs of particles and pairs of individual states

Pairs of states is an important concept not to be confused with pairs of particles. In (B-7) as well as in (B-11), the individual states intervene as “pairs of states” (k k). The number of those pairs (which can be infinite if is infinite) is not related to the particle number. For fermions in a singlet state, it is convenient to label the pair of states by the momentum k associated with the spin state , while remembering that the momentum associated with the spin state is the opposite, k. We shall systematically use this simplification in what follows. C.

Properties of the kets characterizing the paired states

Let us examine a few properties of the states k that will be useful in what follows. To keep things simple, we continue limiting in this § C the generality of the cases under study, and assume the particles in the same spin states are bosons; as for the particles in different spin states, we shall continue using the example of fermions in a singlet state. The generalization to other paired cases does not introduce any particular difficulties. C-1.

Normalization

The normalization of the states k is actually simpler for fermions than for bosons; this is because, as we shall see below, the series expansion of the exponential (B-12) contains only two terms for fermions, instead of an infinity for bosons. This is why we do not keep in this § C the same order as in § A and start with the study of spin 1/2 fermions. 4 The

operators k and k k products of two fermionic operators.

1822

k

associated with different pairs commute, since they are

C. PROPERTIES OF THE KETS CHARACTERIZING THE PAIRED STATES

C-1-a.

Fermions in a singlet state

We choose to normalize separately each of the k by multiplying them by a number k . This operation amounts to replacing k by: =

k

k

+

k

k

k

0

(C-1)

with: k

=

k

(C-2)

k

The normalization condition becomes: 2 k

2

+

k

=1

(C-3)

It then becomes natural to set:

k

= cos

k

k

= sin

k

where k and between 0 and 0

k

k

(C-4)

k

k

are the two variables5 the ket 2:

k

depends on. One can choose

k

(C-5)

2

so that cos k and sin k are positive and represent the moduli of k and k . We saw in § A-2 that k = k ; the functions k and k are therefore even with respect to k. The variational ket ΨBCS now becomes the normalized ket ΨBCS : ΨBCS = =

k k

k

+

cos

k

k

k k

k

+ sin

0 k

k

k

k

0

(C-6)

Comment: A particular case occurs when all the k are either zero or equal to 2. The ket ΨBCS then reduces to a simple Fock state, whose populations of individual states are either zero, or equal to one (for populations corresponding to states belonging to a pair for which k = 2). In that case, the phases k no longer play any role: instead of fixing a relative phase, they only determine the global phase of the state vector. If, furthermore, we choose k = 2 for all values of k whose modulus is less than a given value , and zero otherwise, the paired state now describes an ensemble of fermions filling two Fermi spheres (one for each spin state), which is simply the ground state of an ideal gas of fermions. The ket ΨBCS then reduces to the trial ket of the HartreeFock method of Complement BXV ; that method appears as a particular case of the more general pairing method used in this chapter. 5 The variable k determines the difference 2 k between the phases of k and k . We could also introduce a variable to determine their sum, but that would be pointless: such a variable would only change the total phase of the ket k , without any physical consequences.

1823

CHAPTER XVII

C-1-b.

PAIRED STATES OF IDENTICAL PARTICLES

Bosons in the same spin state

For bosons, the results are slightly different. To maintain a certain analogy, we shall use the same parameters k and k as for fermions, but it is now the hyperbolic sine and cosine of k that will come into play. Relation (B-9) leads to: =

k

=0

1 !

2

k

k

0 =

k

=0

1 [ !

k]

k

k

0

(C-7)

with6 : k

=

2

(C-8)

k

As mentioned before, the spin index that comes in addition to the index k is not written explicitly as its value does not change. Consequently: k

k

=0

=

2

1 !

=

4

2

!

k

2

=

k =0

1

(C-9)

2

1

k

We assumed, to sum the series, that: 2

1

k

(C-10)

It is useful in what follows to characterize the complex variable k by two real variables: k to define its modulus, and an angle k that characterizes its phase. We therefore set: k

= tanh

2

k

with:

k

k

0

(C-11)

Inequality (C-10) is automatically satisfied since the modulus of a hyperbolic tangent is always less than 1; as the function k is even – see relation (A-4) – so are the variable 7 k and the functions k and k . We then get: k

k

=

1 tanh2

1

The normalized kets k

=

1 cosh

k k

k

=

= cosh2

(C-12)

k

k

can be written as: 1 cosh

exp k

k

k

k

0

(C-13)

Replacing the k by the k , the ket Ψpaired becomes normalized to 1. Initially, the kets k , as well as their normalized version k , have been defined in the tensor product (B-8) only when k belongs to the half-space . They can, however, be defined by relations (C-7) and (C-13) for any k; we then have simply k = k , which was to be expected since k involves the two individual states k and k in the same way. 6 The

minus sign in this definition is arbitrary – a change of sign of the wave function (r) or of its Fourier transform k has no physical consequences – but it is convenient to introduce this sign to ensure coherence with the calculations in § E. 7 Furthermore, rotational invariance generally requires those functions to depend only on the modulus of k.

1824

C. PROPERTIES OF THE KETS CHARACTERIZING THE PAIRED STATES

C-2.

Average value and root mean square deviation of particle number

The particle number in the individual state k corresponds to the operator: k

=

(C-14)

k k

We are now going to compute the average value and the root mean square deviation of the particle number, first in a given pair of states, then for the system as a whole. C-2-a.

Fermions in a singlet state

Let us compute the average value of the particle number in the state ΨBCS , which is the tensor product of the states k , each being associated with the pair of states (}k = +; }k = ); as defined above, each pair is labeled by the wave vector k of the spin + particle. The particle number in each of these pairs of states corresponds to the operator: (pair k)

=

+

k

=

k

+

k

k

(C-15)

k

k

with eigenvalues 0, 1 and 2. Now k is given by (C-1), the sum of two components, one with zero particles, and the other with two particles. This leads to: (pair k)

k

=2

k

2

= 2 sin2

k

(C-16)

k

and: 2 (pair k)

k

k

=4

2 k

= 4 sin2

The root mean square deviation ∆ ∆

(pair k)

=

2

4

k

2

1

(pair k)

= 2 sin

k

(C-17)

k

of the particle number in a pair is thus:

k

cos

(C-18)

k

Consequently, the fluctuations of the particle number in each pair of states can be large. On the other hand, the fluctuations of the total particle number, obtained by summing over all the pairs, remain small. The average value of this total number is: 2

=2

k

sin2

=2

k

(C-19)

k

k

As we will show just below, the square of the fluctuation ∆ 2

2

[∆ ] = 4

k

1

2 k

of

is given by: (C-20)

k

Since 1 [∆ ]

2

2 k

1, we get: 2

4

k

=2

(C-21)

k

1825

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

so that: ∆

2

(C-22)

Hence, for large values of , the fluctuations of the particle number, in relative value, are very small, decreasing at least as fast as the inverse of the square root of the average value. Demonstration: The operator corresponding to the square of the particle number is: 2

2

=

(pair k)

+

(pair k)

k

(C-23)

(pair k )

k=k

As the state ΨBCS is a product of states of pairs, the latter are not correlated and the average value of this operator is written: 2

ΨBCS

ΨBCS 2

=

k

(pair k)

k

+

k

(pair k)

k

k

k

(pair k )

k

(C-24)

k=k

Expression (C-1) for (pair k)

=2

k

k

leads to:

k ; k

k

(C-25)

so that: 2

2

=4

k k

2

+4

k

2

(C-26)

k

k=k

Now the square of the average value is equal to the last terms of this equality, but without the constraint k = k in the summation. It follows that the root mean square ∆ is written as: 2

[∆ ]2 =

[

]2 = 4

2 k

4 k

(C-27)

k

which leads to (C-20). C-2-b.

Bosons in the same spin state

For bosons, each pair contains two individual states of opposite k. We show below that for each of them, we have: k

= sinh2

(C-28)

k

and that: [ 1826

2 k]

=2

2 k

+

k

(C-29)

C. PROPERTIES OF THE KETS CHARACTERIZING THE PAIRED STATES

The root mean square deviation of the distribution associated with the values of thus: ∆

k

=

2 k]

[

= sinh

2

cosh

k

2

=

k

k

+

k

is

k

(C-30)

k

(the average value of the particle number in a pair of states is 2 square deviation of that number is 2∆ k ).

k

, and the root mean

Demonstration: As

is symmetric with respect to the two individual states k and

k

k

=

k

k

k, we have: (C-31)

k

with: k

2

=

2

=

k

k

2

k

k

k

=0 2 k

=

(C-32)

2 2

1

k

so that: 2 k

k

=

k k

k

=

k

k

(C-33)

2

1

k

which leads to (C-28). The average value of the particle number squared is computed in a similar way. Using the identity 2 = ( 1) + to bring up the second derivative with respect to k 2 , we can write: k

[

k]

2 k

2

=

2

k =0 2 4

=

k

k

2 2

k

+

4

2

=

k 2 3

1

2 k

2

k

k

k

k 2 k

+

2 2

1

k

(C-34)

k

and hence: [

k]

2

=

k

[ k

k]

2 k

=2

2 k

+

k

(C-35)

k

The total number of particles is written: 2

=

k k

sinh2

=

k

(C-36)

k

1827

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

(as expected, each pair of states appears twice in this summation; one can also restrict the summation to the half-space provided we add a factor 2). We also have: 2

=

k k

=

k

k

k

=

[

2 k]

k

+

k

k

+

k

k k= k

+

k

+

k

k

k

(C-37a)

kk= k

where, in the last term, we have used the fact that the paired state is the product of uncorrelated pairs. But this state is symmetric in k and k, and operators k and k act on it in the same way. Therefore: 2

=

2

2 k]

[

2 k

+

k

k

k

(C-37b)

kk

(the constraint in the second summmation has been eliminated by subtracting a term in the first summation). We then get: 2

=2

[∆

2 k]

2

+

(C-37c)

k

The root mean square deviations ∆ k have been obtained in (C-30). Hence, the square of the root mean square deviation of the total number of particles is written as: 2

[∆ ] = 2

[∆

2 k]

sinh2

=2

k

k

cosh2

(C-38)

k

k

As for the fermion case, this square contains only a single summation on k, whereas the square of the total particle number contains two. Now the number of non-zero terms in those summations is the number of Fourier components necessary to describe the pair of particles used, in § B, to build the paired state in a cube of edge length (size of the momentum quantization box – see § A-1). This number is of the order of the cube of the ratio between and the size of the pair, hence a very large number, as it is the ratio between a macroscopic and a microscopic volume. A double summation over k therefore contains many more terms than a simple summation, and since all the terms are positive and of comparable magnitude, we have: 2

2

[∆ ]

(C-39)

We again find, as for fermions, that ∆ C-3.

.

“Anomalous” average values

For computing average values of the energy (in particular, in Complement BXVII ), we will need the average values of products of two creation or annihilation operators. For example, for bosons we will need to calculate: k

1828

k

k

k

and

k

k

k

k

(C-40)

C. PROPERTIES OF THE KETS CHARACTERIZING THE PAIRED STATES

We note, right away, that they concern operators that do not conserve the particle number, and this is the reason they are often called “anomalous average values”. One could be surprised that such average values come into play while studying physical processes that do not physically imply creation or destruction of particles. We will show that they actually occur in a very natural way in the calculation of the average value of a Hamiltonian that conserves the particle number. The reason is that k is only a component of the total state vector (B-8), in which it is associated with many other k ; in the total state vector, the particle number in the state k may, for example, decrease by 2 while the particle number in the state k simultaneously increases by the same quantity. We are, therefore, performing computations on the components of a state vector that has the same total particle number; the “anomalous” character is only apparent, and is due to the fact that we only consider part of the total state vector. C-3-a.

Fermions in a singlet state

Consider the action of the operator on the ket k written in (C-1). k k Only one of its component, in (k), remains and, after two anticommutations, we can write: k

k

k

=

k

=

k

k

k

k

0 =

k

k

k

k

k

k

k

k

0

0

(C-41)

Taking the scalar product of this ket with the bra remains; the average value is thus: k

k

=

k k

= sin

k

cos

2

k

k

, only its component

(k) 0

(C-42)

k

The anticommutation of these two operators then yields: k

k

k

k

=

k k

=

sin

k

cos

k

2

k

(C-43)

Taking the Hermitian conjugate of (C-42), we get: k

k

k

k

=

k k

= sin

k

k

k

whereas the average value of k

k

k

k

=

sin

k

cos

k

cos

2

k

k

(C-44)

is the opposite (anticommutation): 2

k

(C-45)

We saw, in § C-1-a, that the functions k and k are even; we can therefore change the sign of k on the left-hand side of the previous relations without changing the right-hand side. C-3-b.

Bosons in the same spin state

For bosons, it is easier to first compute the average value of a product of creation operators: k

k

k

k

=

1 cosh2

k k

k

k

k

(C-46) 1829

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

This expression contains the product of the ket:

k

k

k

=

k

[

k

k]

k

= ;

k

=

=0

=

( + 1) [

k]

= + 1;

k

k

= +1

(C-47)

=0

by the bra: [

k]

k

=

;

=

k

(C-48)

=0

To get a non-zero term, we must have ( + 1) [

k]

[

whose sum over

k]

= + 1, which leads to:

+1

(C-49)

yields, taking (C-32) into account:

( + 1) [

2

k]

k

We finally divide by

k

=[

k]

[

k

+ 1]

k

k

(C-50)

=0

k

k

k

k

=

tanh 2

=

as in (C-46), to obtain, inserting the value (C-11) of

k

2

k k

sinh

k

k

1 + sinh2 cosh

k:

k

k

(C-51)

As for the other “anomalous” average value k k k k , a simple Hermitian conjugation operation shows that it is the complex conjugate of the previous one: k

k

k

k

=

2

k

sinh

k

cosh

k

(C-52)

As was the case for fermions, the functions k and k are even, which allows changing the sign of k on the left-hand side of the previous equations without changing the result.

D.

Correlations between particles, pair wave function

As already mentioned in this chapter’s introduction, one of the major interest of the paired states is to allow varying the spatial correlation functions of a system of identical particles. In addition to the purely statistical correlations, coming from the indistinguishability of the particles and already present in an ideal gas, we now have a way to include dynamic correlations due to the interactions. Using paired states instead of simple Fock states allows, for example, a better optimization of the energy. We shall limit our study to the two-particle diagonal correlation function, as it is the one that fixes the average value of the interaction Hamiltonian. This will lead us to introduce a new wave function, that we shall name the “pair wave function”. In the complements following 1830

D. CORRELATIONS BETWEEN PARTICLES, PAIR WAVE FUNCTION

this chapter we shall also study non-diagonal correlation functions; it will concern the one-particle correlation function, whose long range behavior may signal the existence of Bose-Einstein condensation, as well as the two-particle correlation function. In a general way, one may wonder about the physical significance of correlation functions computed in states Ψpaired or ΨBCS , since these states are coherent superpositions of kets containing different particle numbers . However, correlation functions are average values of operators keeping the particle number constant, and hence independent of the coherence between kets of different values. Furthermore, we saw in § C-2 that for large values of the average particle number , the relative fluctuations of that number were negligible. In the limit of large

, one can thus expect the results

obtained with Ψpaired or ΨBCS to be very close to those obtained with the Ψ , for which these fluctuations are strictly zero. This question will be discussed in more detail in § 1 of Complement BXVII . When studying correlation functions in the case where the paired particles are in the same spin state, the only relevant indices concern the orbital variables. We shall start with this simpler case, and study later the case of paired particles in a singlet state. D-1.

Particles in the same spin state

Relation (B-34) of Chapter XVI indicates that the two-particle diagonal correlation function 2 (r1 r2 ) can be written: 2

(r1 r2 ) = Ψ (r1 ) Ψ (r2 ) Ψ (r2 ) Ψ (r1 )

(D-1)

Replacing the field operators and their adjoints by expressions (A-3) and (A-6) of Chapter XVI, using as a basis the normalized plane waves, we get: 2

(r1 r2 ) =

1

[(k4 k1 ) r1 +(k3 k2 ) r2 ]

6

k1 k2 k3 k4

(D-2)

k1 k2 k3 k4

where the average value of the product of the 4 creation and annihilation operators must be taken in a paired state. Figure 1 symbolizes the different terms present in this correlation function. D-1-a.

Simplifications due to pairing

The computation is greatly simplified by noting that in a paired state, the populations of the two individual states having opposite wave numbers k and k must always be equal. Consequently, only those combinations of the 4 operators that do not change this equality will lead to non-zero average values. Three cases are then possible: – Case I: the two annihilation operators concern two individual states that do not belong to the same pair (k3 = k4 ); the two creation operators must then restore to their initial values the populations of these two same states, or else their average value will be zero; these are the so called “forward scattering” terms. We then have either k4 = k1 and k3 = k2 (direct term), or k4 = k2 and k3 = k1 (exchange term). – Case II: the two annihilation operators act on the two states of a first pair (k4 = k3 ), and the creation operators on the two states of another pair (k2 = k1 ). We then talk about a “pair annihilation-creation process” . 1831

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

Figure 1: This diagram symbolizes the terms that come into play in the computation of the correlation function of two particles at points r1 and r1 . The two incoming arrows at the bottom left-hand side represent the two particles eliminated by the annihilation operators; they are associated with a positive imaginary exponent of the position. The two outgoing arrows on the top right-hand side represent the two particles resulting from the action of the creation operators, associated with a negative imaginary exponent. The correlation function is the sum of these terms over all the values of the 4 k vecteurs.

– Case III: the two annihilation operators act on the two states of the same pair, and the creation operators replenish these same two states (this is a special case of the one we just discussed); another possibility is that the two annihilation operators act on the same individual state (all the wave numbers k must then be equal). Using these conditions on the values of the wave numbers in (D-2), we note that the terms corresponding to cases I and II include two summations over the wave vectors, whereas there is only one summation in the terms corresponding to case III. Consequently, for a large (macroscopic) volume 3 , there are far fewer terms coming from case III than from cases I and II. We shall therefore only take into account terms arising from case I and II. For the same reason, we shall ignore in our computation of these terms the constraints k3 = k4 or k3 = k1 , as this amounts to adding a negligible number of terms.

D-1-b.

Expression of the correlation function

The direct term is obtained for k4 = k1 and k3 = k2 ; it no longer has any spatial dependence. Since k1 and k2 are different, the average value of the product of operators can also be written k1 k1 k2 k2 – for fermions, the two minus signs coming from the anticommutations cancel each other. Now relation (B-8) shows that the paired state is a tensor product of pairs of states. This means that the average value we wish to determine is simply the product of the average values of the first two operators and of the last two operators, i.e. the product of the average values of two occupation numbers. We thus 1832

D. CORRELATIONS BETWEEN PARTICLES, PAIR WAVE FUNCTION

get a first contribution: 2 dir 2

1

(r1 r2 ) =

k1

6

k2

=

(D-3)

6

k1 k2

where the summation over k1 and k2 are considered as independent, since as we mentioned above, we can neglect the constraint linking these two indices. The exchange term is obtained for k4 = k2 and k3 = k1 ; it exhibits a spatial dependence. As we did for the direct term, we regroup the creation and annihilation operators acting on the same individual states, but this operation now involves only one commutation between operators. We then introduce a factor , equal to 1 for fermions, and we get: ex 2

(k2 k1 ) (r1 r2 )

(r1 r2 ) =

6

k1

(D-4)

k2

k1 k2

The pair annihilation-creation term k4 = k3 and k2 = k1 also exhibits a spatial dependence, but no longer involves average values of occupation numbers. Its expression is: pair-pair 2

(r1 r2 ) =

1

(k4 k1 ) (r1 r2 ) k1

6

k1

k4 k4

(D-5)

k1 k4

and its structure is schematized in Figure 2. Expression (D-5) contains average values of products of operators that do not conserve the particle number, but rather annihilate (or create) two of them. They are called “anomalous average values”. As we explained in § C-3, these anomalous average values come into play quite naturally in the computation of the average value of an operator that does conserve the particle number. Defining the “pair wave function” pair as: pair

(r) =

1

kr

k

3

(D-6)

k

k

this correlation function can be written as: pair-pair 2

(r1 r2 ) =

pair

(r1

r2 )

2

(D-7)

The complete correlation function contributions: 2

dir 2

(r1 r2 ) =

(r1 r2 ) +

ex 2

2

(r1 r2 ) is the sum of the three previous

(r1 r2 ) +

pair-pair 2

(r1 r2 )

(D-8)

For bosons in the same spin state, we can insert in this correlation function the average values given in (C-28), (C-51) and (C-52). We then get a binary correlation function that explicitly depends on the parameters k , as well as on the phases k , which both define the paired state. This clearly verifies that these parameters introduce flexibility in the two-body correlation function. For example, we find: pair

(r) =

1

sinh

3

k

cosh

k

(k r 2

k)

(D-9)

k

1833

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

Figure 2: Diagram symbolizing the pair-pair term of the binary correlation function, with the same convention as for Figure 1. The pair wave function thus directly depends on the phases k ; as we shall see in the complements, these phases actually play a major role in the optimization of the energy. For bosons, this wave function is always even since, as we saw in § C-1-b, this is the case for the functions k and k introduced in (C-11). Comment:

To describe systems of interacting bosons undergoing Bose-Einstein condensation (see § 4 of Complement BXVII and Complement EXVII , we shall add to the paired state Ψpaired another highly populated state with zero momentum (k = 0). This will introduce new terms in the correlation functions, in addition to those computed in this chapter. When the population of that individual state with zero momentum is very high, these additional terms may become dominant. D-2.

Fermions in a singlet state

For fermions with spin 1 2, since each spin can point in two directions, there exists a larger number of correlation functions. Several among them will be studied in §2 of Complement CXVII . We shall only compute one of them here, involving opposite spins, as it plays the most significant role: 2

(r1 ; r2 ) = Ψ (r1 ) Ψ (r2 ) Ψ (r2 ) Ψ (r1 )

(D-10)

Relation (D-2) now becomes: 2

(r1 ; r2 ) =

1

[(k4 k1 ) r1 +(k3 k2 ) r2 ] k1

6

k2

k3

k4

(D-11)

k1 k2 k3 k4

The diagram schematizing each term of this sum is obtained by adding spin indices to the positions in Figure 1 – as is done in Figure 4 of Complement CXVII . 1834

D. CORRELATIONS BETWEEN PARTICLES, PAIR WAVE FUNCTION

The computation is then similar to that of § D-1. The direct term is written: dir 2

(r1 ; r2 ) =

1 k1

6

=

k2

(D-12)

6

k1 k2

There is no exchange term where k4 = k2 and k3 = k1 , as it would correspond to the average value of an operator changing the direction of one of the spins in two different pairs, hence destroying the equality between populations of opposite spins in each pair; this term does exist, however, in the special case where k1 = k2 , but its contribution is negligible. Finally, the pair annihilation-creation term corresponds to k4 = k3 and k2 = k1 ; it is written: 1 pair-pair (k4 k1 ) (r1 r2 ) (r1 ; r2 ) = 6 (D-13) k4 k4 2 k1 k1 k1 k4

Here again, the pair-pair term involves anomalous average values. As before, we can define a pair wave function pair as: pair (r)

=

1

kr

k

3

=

k

1

kr

3

k

k

k

(D-14)

k

whose modulus squared appears in the correlation function: pair-pair 2

(r1 ; r2 ) =

pair

(r1

r2 )

2

(D-15)

Inserting relations (C-42) into (D-14) yields: pair

(r) =

1

kr

k k

3

=

1

k

sin

3

k

cos

k

(k r+2

k)

(D-16)

k

The important role of this pair wave function in the BCS condensation phenomenon will be discussed in detail in Complement CXVII . We will show in particular that this function not only plays a role in the diagonal binary correlation function; it also determines the long-range non-diagonal properties of the density operator, hence playing the role of an order parameter. We noted that the parameters k and k are even functions of k; consequently, the function pair (r) is also an even function of r. The total correlation function is then: 2

(r1 ; r2 ) =

6

+

pair

(r1

r2 )

2

(D-17)

Inserting in this result expression (D-16) for pair (r), we obtain the dependence of the correlation function on the parameters k and k . This illustrates how these parameters, which define the paired state, allow changing the correlation function. Comment:

In the particular case where all the k are either zero or equal to 2, we already mentioned (see end of § C-1-a) that the paired state becomes a Fock state in which the phases k no longer play any role. It is easy to check that the anomalous average values are then all equal to zero, as is, obviously, the function pair (r). On the other hand, for a different choice of the parameters k , the phases k play an especially important role, as will be shown for example in Complement CXVII . 1835

CHAPTER XVII

E.

PAIRED STATES OF IDENTICAL PARTICLES

Paired states as a quasi-particle vacuum; Bogolubov-Valatin transformations

The Hamiltonian of a noninteracting particle system can be written as: 0

=

where } Φ0 of

(} )

0

(E-1)

is the energy of an individual state labeled by the index . The ground state is an eigenvector of all the annihilation operators , with a zero eigenvalue:

Φ0 = 0

(E-2)

The paired ket Ψpaired is not an eigenvector of the usual annihilation operators . We shall, however, introduce in § E-1 a linear transformation of the and into new annihilation and creation operators, and show in § E-2 that Ψpaired is an eigenvector, with a zero eigenvalue, of all the new annihilation operators. The paired state will then appear as a “particle vacuum”. Furthermore, in § E-3, we shall associate with Ψpaired a family of operators having the same form as the Hamiltonian (E-1), but where the and are replaced by the new annihilation and creation operators. The interest of that association is the possibility, in certain cases (illustrated in the complements), to identify – with certain approximations if needed – an operator in this family with the Hamiltonian of a given physical situation. The problem of finding the ground state and the excited states is then solved, as if dealing with a system of independent particles. The state Ψpaired can then be considered as the ground state of the Hamiltonian of independent “quasi-particles”, while the new creation operators permit building a complete orthogonal basis of excited states. E-1.

Transformation of the creation and annihilation operators

For bosons in the same spin state, the state k belongs to the space k associated with the pair (k k); this space is generated by the action of two creation operators k and k on the vacuum. This is also the case for fermions in opposite spin states, if we simplify the notation k to k, as well as k to k (we have labeled each pair of individual states by the value of k associated with the spin ). For both cases, we now define two new couples of creation and annihilation operators that act in k . We introduce the two annihilation operators k and k , defined for k = 0, as well as the Hermitian conjugate operators k and k , as: k k

=

k k

=

k

k

+

k

+

k k

k

k k

=

k k

=

k

+ k

+

k k k

k

(E-3)

or: k

k k

k

=

k k

k

As for now, 1836

k

and

k

k

k

k

k k k

k

k

k

are any two complex numbers.

(E-4)

E. PAIRED STATES AS A QUASI-PARTICLE VACUUM; BOGOLUBOV-VALATIN TRANSFORMATIONS

As in Chapters XV and XVI, [ (bosons), and their anticommutator if as in

k (anti)commutes with k k remain:

[

k]

k

=

k k

k

k

and as

+

k

] denotes the commutator of and = 1 (fermions). We now compute [ k

(anti)commutes with

k

if k

=1 ] ; k

k , only the cross terms

(E-5)

k

For bosons, the commutator of k and k equals 1, and hence the commutator of k and 1; for fermions, the two anticommutators of those operators are equal k equals to 1, so that we obtain, in both cases: [

k]

k

=

k k

1

1

=0

(E-6)

By Hermitian conjugation, we get: k

=0

k

We now compute The one in

(E-7)

k

2

. This time, we get the two squared terms in

k

contains the (anti)commutator

k

k

k

2 k

and

equal to 1; the one in

2 k

.

2 k

contains, for bosons, the commutator of k and k which equals 1, and for fermions, the anticommutator of those two operators which is equal to +1. We therefore get: k

2

=

k

2

k

(E-8)

k

In a similar way: k

=

k

2 k

2

(E-9)

k

Finally, we are left with the computation of

k

k

and

k

k

. The first is

8 zero since k (anti)commutes both with k and itself , and that k (anti)commutes with itself and with k ; the reasoning is the same for the second, so that:

k

k

k

=0

k

=0

(E-10)

To sum up, it suffices to impose, for all values of k, the condition: 2 k

8 For

2 k

=1

(E-11)

fermions, its square is identically zero.

1837

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

to get the relations: k

k

k

=1

k

=1

(E-12)

so that the operators k and k , as well as their adjoints, obey the same relations of (anti)commutation as the usual annihilation and creation operators of identical particles. For fermions, we find again condition (C-3), which allows us to simply set, as in (C-4): k

= cos

k

k

= sin

k

k

(E-13)

k

We then see that the matrix in the right hand side of (E-4) is unitary. This unitary transformation of the creation and annihilation operators is called the “Bogolubov-Valatin transformation”. For bosons, we will set: k

= cosh

k

k

= sinh

k

k

(E-14)

k

Comparison with relation (C-11) shows that: k

k

=

(E-15)

k

The transformation of the creation and annihilation operators for bosons is called the “Bogolubov transformation”. E-2.

Effect on the kets

k

We now show that the vectors k are eigenkets of two annihilation operators k and k with eigenvalues zero. This property makes them similar to a usual vacuum state, which yields zero under the action of all the annihilation operators k . E-2-a.

Fermions in a singlet state

Let us compute the effect of those operators on the ket k defined by relation (C-1), that we write with the simplified notation already used above (k is associated with the spin index + and k with the spin index ): k

=

k

+

k

k

k

0

(E-16)

We start with the operator k defined in (E-3). Its k k term yields zero when acting on the term in k of k ; only the term in k remains, for which the operator lowers from one to zero the occupation number of the state k , since: k k

1838

k

0 =

k

0

(E-17)

E. PAIRED STATES AS A QUASI-PARTICLE VACUUM; BOGOLUBOV-VALATIN TRANSFORMATIONS

As for the operator k k , it yields zero when acting on the k term of k (for fermions, the square of a creation operator is zero), leaving only the term in k . This leads to: =

k

k

k k

k k

k

0 =0

k

(E-18)

The computation is the same for the operator k , except for the fact that the operator k must first anticommute with k before it can be regrouped with k and lower from one to zero the occupation number of the state k. The anticommutation therefore introduces a sign change, but as the definition of k does not contain any sign, we again find: =

k

k

k k

+

k

0 =0

k

We have shown that the two operators with eigenvalue zero. E-2-b.

(E-19) k

and

k

have the ket

k

as an eigenvector,

Bosons in the same spin state

Taking into account (E-15), relation (C-7) is written: =

k

=0

1 !

k k

k

k

0

(E-20)

Since: 1 k

k

0 =

k

k k

k

1

0 =( )

k

k

k

0

(E-21)

we have: k k

=

k

!

=0

or else, since

k k

k

k

k

1 !

k =0

=

k

where we have set k k

+

k

k

k

k

0

(E-22)

= k

k k

k

k

0 (E-23)

k

k

1. This leads to: =0

which clearly shows that k

k

commutes with all the operators in this expression:

=

k

1

k k

k

(E-24) is an eigenvector of the operator

=0

k

defined in (E-3): (E-25)

The same computation leads to: k

k

k

=

k k

k

(E-26) 1839

CHAPTER XVII

PAIRED STATES OF IDENTICAL PARTICLES

and hence to: k

k

=0

(E-27)

As for fermions, the two operators an eigenvalue zero. E-3.

k

and

k

have the ket

k

as an eigenvector with

Basis of excited states, quasi-particles

For bosons as for fermions, we just saw that the new creation and annihilation operators introduced in (E-3) and (E-4) have the same properties as the usual creation and annihilation operators. In particular, the two operators: ( k) =

k k

k

=

k

k

(E-28)

have as eigenvalues all the positive or zero integers, in perfect analogy with the operators corresponding to the population of individual states. By analogy with (E-1), it is therefore natural to introduce the operator: =

}

k k

+

k

k

(E-29)

k

where, for the moment, the are free parameters, as are the parameters which define the paired state (they will be fixed later on, depending on the physical problem we study). In relation (E-29), the summation is limited, as above, to a momentum half-space, which avoids taking opposite momenta into account twice. The eigenvalues of are all of the form: =

( k) +

k

}

(E-30)

k

where ( k ) and ( k ) are any positive or zero integers for bosons, and restricted to 0 or 1 for fermions. The ground state Φ0 ( ) of is an eigenvector of all the annihilation operators k and k with eigenvalues zero. Now we saw in (B-8) for bosons, and in (B-11) for fermions, that the paired state vector is a tensor product of states k , which are precisely the eigenvectors, with zero eigenvalues, of these two operators. The paired state, Ψpaired for bosons9 , or ΨBCS for fermions, is thus an eigenvector of with a zero eigenvalue (ground state). One can then obtain the other eigenstates of (excited states) by the action of the creation operators k and k on Φ0 ( ) . For bosons, each of these two operators will be able to act any number of times. For fermions, on the other hand, we shall only get 3 excited states, by the action of either k , or k , or their product; as these operators anticommute, any higher power of those operators’ product will yield zero. We finally note that operator (E-29) shares many of the properties of the Hamiltonian of an ensemble of particles without mutual interactions. Just as the usual creation operators 9 For bosons, in Complement B 0 to XVII , we will associate to that paired state a coherent state obtain the state Φ . But, as none of the operators k or k act in the Fock space associated with the individual state k = 0, the conclusions will be unchanged for Φ .

1840

E. PAIRED STATES AS A QUASI-PARTICLE VACUUM; BOGOLUBOV-VALATIN TRANSFORMATIONS

can add particles in a system of free identical particles, the creation operators k and k can be considered as the operators adding a supplementary “quasi-particle” into the physical system. These quasi-particles are not the same as particles in a system really without interactions, as illustrated by the expression of these creation operators. They yield, however, a basis of states in which we can reason as if there were no interactions, which is a very powerful framework for reasoning in many domains of physics. For the previous considerations to be relevant from a physical point of view, we have yet to show that the Hamiltonian of the problem we study can be approximated by an operator , provided we make a judicious choice of all the parameters k , k and . This is not a priori easy: the Hamiltonian of an ensemble of particles includes, in general, two-body interaction terms, and those are expressed in terms of sums of products of two creation operators k and two annihilation operators k , hence of 4 operators. Now, if we insert definitions (E-3) and (E-4) into (E-29) to express as a function of the old creation and annihilation operators k and k , it is clear that we shall only obtain combinations of products of 2 operators. We shall need to make certain approximations to be able to consider as a physically pertinent approximate Hamiltonian. Examples of such situations will be given in the complements. Conclusion In conclusion, the paired states are a powerful tool for studying both fermions and bosons. They provide a systematic method allowing a certain flexibility in variational calculations in the presence of interactions. Furthermore, starting from a paired ground state, we were able to build a whole basis of excited states using creation and annihilation operators matching that ground state. In the complements of this chapter, we shall use the paired states to study different problems and compute the optimal parameters most relevant for each situation. The physical results will be quite different, depending on the cases, especially for fermions or for bosons; but the main point remains that the paired states offer a unified framework for obtaining all these different results.

1841

COMPLEMENTS OF CHAPTER XVII, READER’S GUIDE The first two complements provide more details about a number of results given in the chapter, concerning various properties of the pair operators and the paired states. The following three complements apply these concepts to physical phenomena involving fermions, and then bosons.

AXVII : PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

The pair field operator is the analog, for a pair of particles, of the usual field operator for a single particle. It is a useful tool for computing average values in a paired state. The commutation relations of fermion pair operators are similar to those of bosons, except for an additional term due to the fermionic character of the pair constituents.

BXVII : AVERAGE ENERGY IN A PAIRED STATE

This complement explains the computation of the average energy in a paired state. For bosons, we add to this paired state a condensate, described by a coherent state. The results of this complement are used in Complements CXVII and EXVII .

CXVII : FERMION PAIRING, BCS THEORY

Even weak attractive interactions can greatly modify the ground state of a fermion system, via the BCS mechanism for pair formation. This complement discusses the theory of this phenomena, and its effect on the particle distribution and correlation functions, as well as its link to Bose-Einstein condensation of pairs of particles.

DXVII : COOPER PAIRS

The simple Cooper model studies the bound states of two weakly attracted particles, in the presence of a Fermi sphere that prevents the particles from occupying states inside that sphere. Whereas, in general, a minimum depth of an attractive potential is required for two particles to form a bound state in 3-D, the presence of the Fermi sphere ensures the existence of a bound state, no matter how weak the attraction is. The Cooper model accounts in a somewhat intuitive way for a number of results of the BCS theory.

EXVII : CONDENSED REPULSIVE BOSONS

For an ensemble of bosons, using paired states as variational states leads to the same results as the Bogolubov method based on operator transformations. We thus obtain the Bogolubov spectrum, compute the “quantum depletion” introduced by the interactions, etc.

1843



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Complement AXVII Pair field operator for identical particles

1

2

3

Pair creation and annihilation operators . . . . . . . . . . . 1-a Particles in the same spin state . . . . . . . . . . . . . . . . . 1-b Pairs in a singlet spin state . . . . . . . . . . . . . . . . . . . Average values in a paired state . . . . . . . . . . . . . . . . . 2-a Average value of a field operator; pair wave function, and order parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-b Average value of a product of two field operators; factorization of the order parameter . . . . . . . . . . . . . . . . . . . . . . 2-c Application to the computation of the correlation function (singlet pairs) . . . . . . . . . . . . . . . . . . . . . . . . . . . Commutation relations of field operators . . . . . . . . . . . 3-a Particles in the same spin state . . . . . . . . . . . . . . . . . 3-b Singlet pairs . . . . . . . . . . . . . . . . . . . . . . . . . . .

1846 1846 1849 1851 1851 1854 1858 1861 1861 1866

In Chapter XVI we introduced a field operator Ψ (r) acting in the state space of a system of identical particles. This operator was defined as a linear combination of annihilation operators associated with individual states having a given momentum. It proved to be a useful tool for various computations, and in particular for the determination of correlation functions. We then showed, in Chapter XVII, the relevance of paired states where, essentially, identical particles were grouped into pairs. We introduced creation and annihilation operators of pairs of particles in well defined momentum states, K and K . Consequently, it is natural to envisage the introduction of a field operator for pairs of particles, which will be the operator Φ (R) destroying a pair of particles whose center of mass is at point R and whose internal state is described by the wave function . Its adjoint, Φ (R), creates a pair of particles in that same state. In this complement, we will define these operators and study some of their properties. We start in § 1 by giving the expression of these field operators Φ (R) and Φ (R) for pairs described by any orbital state . We consider the case where the particles are either in the same spin state, or in a spin singlet state. We then study, in § 2, the average values, in paired states, of pair field operators and of products of such operators. These average values have some very interesting properties leading us, in particular, to introduce a new wave function pair (r), called the “pair wave function”, which is not simply the two-particle wave function pair (r) used to build the paired state. As we shall see in § 2-c, this new wave function explicitly appears in the binary correlation function of the particles’ positions. Moreover, its norm is linked to the number of quanta present in the field of condensed pairs. The origin of this pair function is the fact that pairs can collectively contribute to the creation of a field whose average value is what we shall call an “order parameter”. This non-zero order parameter indicates the existence of a macroscopic field associated with the pairs. We will show how it relates the “anomalous average values” (of operators that do not conserve particle number) to 1845

COMPLEMENT AXVII



the normal average values of a product of two field operators Φ (R) and Φ (R), that does conserve particle number. In particular, we shall use, in § 2-c, the properties of the pair field operator to get the correlation functions in a paired BCS state, and to study the consequences of the existence of the macroscopic field associated with the pairs. We shall finally study, in § 3, the commutation properties of these operators; they will be found to be similar to those of bosons (whether the particles building the pair are bosons or fermions), but not completely identical as corrective terms must be added to the boson commutator. We shall see that, since the pairs are strongly bound and have a spatial extension much smaller than all the characteristic dimensions of the problem, the pairs can be assimilated to bosons; if, however, the pairs are weakly bound (as is the case, in particular, for the BCS mechanism we will discuss in Complement CXVII , it is not possible to consider them as indivisible entities: the fermionic structure of their components plays an important role that cannot be ignored. 1.

Pair creation and annihilation operators

By analogy with the field operator for the particles composing the pairs, we now introduce a field operator concerning the pairs themselves. The adjoint of this field operator allows the direct creation of a pair of particles at a given point, and with a given internal state; as for the operator itself, it annihilates that same pair. 1-a.

Particles in the same spin state

We defined, in Chapter XVII, the operators without spin (or in the same spin state), as: K

K

1 2

=

1 2

=

k

K 2 +k

K 2

k

K 2

K 2 +k

k

=

1

k

K

for pairs of particles

k

(1)

is the Fourier transform of the wave function (r) characterizing

kr

3

3 2

and

k

In this expression, the pair: k

k

K

(r)

(2)

3

and is the edge length of a cube, of volume 3 , which contains the physical system. We now generalize these definitions to the case where the pair is not necessarily in a given orbital state, but in any state belonging to an orthonormal basis of states , with the index going from 1 to infinity; these states each have a wave function (r) whose Fourier transform is k . We therefore simply add an index to the previous definitions, as for example: K

=

1 2

k k

K 2 +k

K 2

k

(3)

We saw in Chapter XVII that the bosonic or fermionic character of the particles building the pair requires the functions k to have the parity with respect to the 1846

• variable k, which means +1 1

=

k

=

k

PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

with:

for bosons for fermions

(4)

Should k be of parity , it is easy to check by (anti)commutation of the operators that expression (3) yields zero. We can then only consider the case where the k , and hence the corresponding wave functions (r), have the parity . If, however, we need the basis of states associated with these wave functions to be complete, we can include states of any parity, 1. We must then remember that the operators K are zero whenever the index corresponds to a wave function of parity . .

Expression of

K

in terms of the particle field operator

Relation (A-10) in Chapter XVI permits replacing the creation operators k

=

1

d3

3

kr

Ψ (r)

k

by: (5)

where Ψ (r) is the adjoint of the field operator associated with the elementary components of the pair (the “atoms” of each “molecule”). Using twice this relation in (3), we get: K

=

1 2

k

3

d3

( K2 +k) r ( K2

d3

k) r

Ψ (r)Ψ (r )

or else, choosing as the integration variables R = (r + r ) 2 and x = r K

=

1 2

(6)

k

d3

3

KR

d3

kx k

Ψ (R +

k

x x )Ψ (R ) 2 2

r: (7)

The summation over k on the right-hand side leads to expression (A-2) of Chapter XVII for the wave function , and we can write: K

=

1 2

d3

3

KR

d3

(x) Ψ (R +

x x )Ψ (R ) 2 2

(8)

This other form for the operator already introduced in (3) demonstrates the fact that it creates a pair of particles in a molecular state characterized, for its external variables, by a plane wave of wave vector K, and for its internal variables, by the wave function . .

Pair field

For each internal state Chapter XVI, an operator Φ state : Φ

(R) =

1

of the pair, we can introduce, using relation (A-3) of (R) that creates a pair at point R and in the internal

KR K

3

(9)

K

1847



COMPLEMENT AXVII

Replacing in (8) the integral variable R by R , and using the result in equality (9), we get: Φ

1 2

(R) =

(R) =

R)

d3

(x) Ψ (R +

K

The sum over K of 3 , and we obtain: Φ

K (R

d3

3

1 2

K (R

d3

R)

then yields

(x) Ψ (R +

3

(R

x )Ψ (R 2

x ) 2

(10)

R ), which allows integrating over

x x )Ψ (R ) 2 2

(11)

This operator is therefore a product of field operators creating successively each of the two elements of the pair, which is easy to understand from a physical point of view. Note, however, that the two elements are not created at the same point, but symmetrically with respect to point R, and with a spatial distribution whose amplitude is given by the wave function (x) of the “molecule”. The spatial zone involved in the process thus extends over a distance of the order of the range of this wave function. As for the pair field operator itself, which annihilates a pair, it is defined by Hermitian conjugation of the previous relation: Φ

(R) =

1 2

d3

(x) Ψ(R

x x )Ψ(R+ ) 2 2

(12)

We now use relation (A-3) of Chapter XVI to come back to the annihilation operators in a basis of individual states with fixed momenta. Using (twice) this relation in (12), we get: Φ

1

(R) =

3

2

d3

k1 (R

(x)

x 2

)

k2 (R+ x 2)

k1 k2

(13)

k1 k2

This relation will be useful in what follows. .

Inversion; expression for the interaction energy

We call the individual states corresponding to the wave functions (r) and assume they form a complete basis. The closure relation on these states is written: k =

k

(

k) ( k

We now multiply relation (3) by ( (

k

)

K

=

1 2

K 2 +k

K 2

) =

k

) , and sum over

k

(14)

kk

to get: (15)

It is thus possible to invert relations (3) and express any two creation operators as a sum of pair creation operators, according to: k1 k2

1848

=

2

(k1 k2 ) 2

k1 +k2

(16)



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

where we have replaced K par k1 +k2 and k by (k1 k2 ) 2. By Hermitian conjugation, we get a similar relation for any product k3 k4 of annihilation operators. Interaction Hamiltonian: Any operator as:

for the binary interactions between particles can therefore be written

int

=

int

1 : k1 ; 2 : k2

2

(1 2) 1 : k3 ; 2 : k4

k1 k2 k3 k4 (k1

k2 ) 2

(k3

k4 ) 2

K

K

(17)

where 2 (1 2) is the binary interaction between particles (as, for example, in Complement EXV ; using momentum conservation, we have set: K = k1 + k2 = k3 + k4

(18)

Written in terms of pair creation and annihilation operators, int is the sum of quadratic terms, and no longer of fourth degree terms as was the case with operators for individual particles. Note, however, that one must be careful when using relation (17) since, as we shall see in § 3, the pair creation and annihilation operators do not obey the usual commutation relations. The action of an operator K on a paired state obtained by the action of on the vacuum, does not necessarily yield zero when K = K . K Pair creation operators are not as simple to handle as particle creation operators. 1-b.

Pairs in a singlet spin state

For a pair of spin 1 2 particles in a singlet spin state, we use relation (A-23) of Chapter XVII and add an index to represent the internal orbital state of the pair; this reads: K

=

k k

K 2 +k

K 2

(19)

k

The following computations apply directly to fermions in a singlet state, for which the functions k must be even with respect to the variable k. We noted however in Chapter XVII, in comment (ii) just before § B, that they can also apply to fermions in a triplet spin state, when the function k is odd; even though this case can be included in the following discussion, for the sake of simplicity we will continue to talk about singlet pairs. .

Expression of

in terms of the particle field operator

K

Relation (A-9) of Chapter XVI becomes here, taking into account the spin indices: k

=

1

d3

3

kr

Ψ (r)

(20)

Inserting this equality in (19) yields: K

=

1 k

3

d3

d3

( K2 +k) r ( K2

k) r

Ψ (r)Ψ (r )

(21)

k

1849



COMPLEMENT AXVII

As previously, the wave function and x = r r , and we get:

K

=

1

d3

3

KR

(x) appears when we use as integral variables R = (r + r ) 2

d3

(x) Ψ (R +

x x )Ψ (R ) 2 2

(22)

This yields the form of the operator creating a pair of particles in a singlet molecular state, characterized by a plane wave of wave vector K for its external variables, and by the wave function for its internal variables. .

Pair field We now insert relation (22) in (9); we get: Φ

(R) =

1

K (R

d3

3

R)

d3

(x) Ψ (R +

K

As before, the sum over K of Φ

d3

(R) =

K (R

(x) Ψ (R +

R)

yields

3

(R

x )Ψ (R 2

x ) 2

(23)

R ), and we get:

x x )Ψ (R ) 2 2

(24)

The same comments as in § 1-a- can be made: this operator successively creates the two elements of the pair at different points, with a probability amplitude given by the internal wave function (x) of the distance between these points. The field operator is obtained by Hermitian conjugation: Φ

d3

(R) =

(x) Ψ (R

x x )Ψ (R+ ) 2 2

(25)

It will often be convenient to come back and use the annihilation operators in a basis of individual states of fixed momenta. Using (twice) relation (A-14) of Chapter XVI, we get: Φ

(R) =

1 3

d3

k1 (R

(x)

x 2

)

k2 (R+ x 2)

k1

k2

(26)

k1 k2

Comment: For singlet pairs, we could invert those relations, as we did before, and express the interaction energy in terms of the pair creation and annihilation operators. It is, however, a bit more complicated in this case than when the pairs were in the same spin state: as we shall see in § 2-c- , it would be necessary to involve another pair creation operator (in a triplet state). This would lead to cumbersome notation, and the computation will not be presented here.

1850

• 2.

PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Average values in a paired state

We now compute the average values of pair field operators, or of products of such operators, in the paired state we defined Chapter XVII. We shall use relations (13) or (26), depending on whether the particles are in the same spin state, or in a singlet spin state. In both cases, the computation of the average value of those operators in a paired state involves the computation of average values of products of annihilation operators – i.e. of “anomalous” average values as defined in Chapter XVII. 2-a.

Average value of a field operator; pair wave function, and order parameter

Expressions for the paired kets were obtained in § B-2 of Chapter XVII as tensor products of states of pairs that are not eigenstates of the occupation number operators. These pairs all have a zero total momentum; we therefore assume, from now on, that K = 0. The average value computation of a pair field operator in these states will lead to a new wave function, that we will call the “pair wave function”. .

Particles in the same spin state

Relations (B-8) and (B-9) of Chapter XVII give the expression of the paired state vector Ψpaired for an ensemble of a large number of particles: Ψpaired =

exp

2

k

k

k

0

(27)

k D

The function k used to build this paired state is a priori totally independent of the functions k defining the pair field operators. In such paired states, the populations of the states of the same pair are always equal; consequently, the only non-zero average values k1 k2 are those in which the two annihilation operators act on the two states of the same pair, which have opposite momenta. As the total momenta of each pair is zero, we can set k1 = k2 in (13) and obtain: Φ

1

(R) =

3

2

d3

=

d3

k1 x

(x)

k1

k1

k1

(x)

pair

(x)

(28)

where the (non normalized) “pair wave function” has already been defined in (D-6) of Chapter XVII: pair

(x) = x

pair

=

1 3

kx

2

k

k

(29)

k

Changing the sign of the summation variable k, allows writing the pair wave function ¯pair (k) in the momentum representation as: ¯pair (k) = k

pair

=

1 3 2

2

k k

(30) 1851



COMPLEMENT AXVII

Note that because of the condition k1 = k2 (the total momentum of each pair is zero), the average value Φ (R) no longer depends on R. The average value of the pair field operator is thus: Φ

(R) = Φ

=

(31)

pair

As expected from the translation invariance of the system, it is independent of R. On the other hand, it depends on the internal state , and reaches a maximum when is equal to the normalized state norm proportional to pair : pair norm pair

pair

=

(32)

pair

pair

This computation therefore leads to a new state norm pair , different from the state that was used in Chapter XVII to build the paired state Ψpaired . Choosing for the first vector of the basis 1 = norm pair , the average value of the field is given by: Φ

1

(R) = Φ

1

=

pair

pair

(33)

This average value Φ 1 (R) is often called the “order parameter of the pairs”; its nonzero value is important as it indicates the existence of a field constructed collectively by the pairs. In the present case, the average value of this field is independent of R, as the paired state was built from pairs having a total momentum K = 0 and whose center of mass has a constant wave function. The average values k k , which according to (29) determine the pair wave function, have been called, in § C-3 of Chapter XVII, “anomalous average values”, as they involve operators that do not conserve particle number. For bosons in the same spin state, relation (C-52) of that chapter indicates that: k

k

=

2

k

sinh

k

cosh

k

(34)

One may wonder, of course, what the purpose of computing an anomalous average value is, as it can only be zero in a state with a fixed total particle number. We shall see, however, in § 2-b that these anomalous average values are a useful tool for computing average values of operators that do conserve the total number of particles and hence have a direct physical interpretation. For bosons, operators k and k commute, and hence the definition (29) shows that the wave function pair (x) is even: pair

( x) =

pair

(x)

(35)

The field mean value (31) is thus zero for any state of the basis whose wave function is odd: the postulate of symmetrization with respect to the pair components requires that pair to be in an even orbital state1 . 1 If the particles composing the pair were fermions in the same spin state, the conclusions would be opposite. The wave function would be odd (because of the anticommutation of the operators k and would be zero. k ); the average values for even internal states

1852

• .

PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Singlet pairs

For fermions in a singlet state, the paired state vector ΨBCS is given by relations (B-11) and (B-12) of Chapter XVII: ΨBCS =

exp

k

k

0

k

(36)

k

Following (26), we must add the spin index to the state k, and the spin index to the state k. Here again, the average value of the product of annihilation operators is different from zero only if their wave vectors are opposite, and relation (26) then leads to: Φ

1

(R) =

d3

3

kx

(x)

k

k

k

=

(37)

pair

with the definition (D-14) of Chapter XVII of the state normalized) wave function: pair

(x) = x

pair

=

1

kx

k

3

k

pair

, associated with the (non-

(38)

k

In a similar way, the pair wave function in the momentum representation ¯pair (k) can be defined as: ¯pair (k) =

1 k

3 2

(39)

k

This wave function can be interpreted in the same way as the wave function defining the orbital variables of a pair in the singlet state. As in (32), we define the normalized ket norm is orthogonal to norm and pair . The field average value (37) is zero if pair reaches a maximum for = 1 = norm ; this maximum is equal to: pair Φ

1

(R) = Φ

1

=

pair

(40)

pair

and defines the order parameter of the physical system. It indicates the presence of a field created collectively by the pairs. As noted before, since the total momentum of each pair is zero, this average value does not depend on R. The average values that come into play in that definition are given by relation (C-42) of Chapter XVII: k

k

=

k k

= sin

k

cos

k

2

k

(41)

(in the BCS state, the k and k are even functions of k). We noted, at the end of § C-1-a of that chapter that, in the specific case where the k are either zero or equal to 2, the paired ket is simply a Fock state of individual particles, hence a ket without pairing. Since (41) is then equal to zero, we see that the pair wave function is zero in the absence of pairing. 1853



COMPLEMENT AXVII

2-b.

Average value of a product of two field operators; factorization of the order parameter

The field operators do not conserve particle number, as opposed to the usual operators such as the Hamiltonian, the total momentum, the double density, etc. On the other hand, the product of operators Φ (R) Φ (R ) does conserve that number, and may help characterizing the properties of the pairs while being easier to interpret from a physical point of view. .

Particles in the same spin state Using relation (13) we get: Φ

(R)Φ

(R ) =

1

d3

6

2

k1 (R+ x 2)

d3

(x) x 2)

k2 (R

(x )

k3 (R

x 2

k4 (R + x2 )

)

k1 k2 k3 k4

k1 k2 k3 k4

(42) The integrals over d3 and d3 k

=

1 3 2

d3

yield Fourier transforms

kx

k

of the wave functions

(x):

(x)

(43)

and we get: Φ

(R) Φ

(R ) =

1 2

k1 3

k2

k4

2

k1 k2 k3 k4

2

[(k3 +k4 )R where, to simplify the notation, we have written

k3

(k1 +k2 ) R]

(44) k1 k2 k3 k4

(k) the Fourier transform of

k.

Computation of the average value k1 k2 k3 k4 This computation follows the same steps as the one in § D of Chapter XVII for the correlation between particles, as well as the one in § 3-a- of Complement BXVII for the interaction energy. Three cases must be distinguished: – (I) The “forward scattering” terms are obtained either for k4 = k1 and k3 = k2 (direct terms), or k3 = k1 and k4 = k2 (exchange terms). We assume these forward scattering terms concern two different pairs, meaning k1 = k2 . Since: k1 k2 k2 k1

=

k1 k2 k1 k2

=

k1

(45)

k2

their sum yields the contribution: Φ

1854

(R)Φ 1 = 2 3

(R )

forward

(k) [ k1 k2

(k) +

( k)]

K (R

R)

k1

k2

(46)



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

(as in § D-1-a of Chapter XVII, we may consider the summations over k1 and k2 as independent, since, for a large volume 3 , ignoring the constraint k1 = k2 leads to a negligible error); we have used the notation: K = k1 + k2 k1 k2 k = 2

(47)

When the parity of the function (k) is , the two terms in the bracket of (46) are equal and we get the simpler relation: Φ

(R)Φ

(R )

forward

1

=

(k)

3

(k)

K (R

R)

k1

k2

(48)

k1 k2

This result only depends on the difference R R (translation invariance); it goes to zero when R R becomes larger than the inverse of the momentum K distribution width of the function appearing on the right-hand side of (46), once it is summed over k, the difference in momenta. – (II) The terms corresponding to the annihilation-creation of different pairs are obtained for k2 = k1 and k4 = k3 , with k4 = k2 . Their contribution is written: Φ

(R)Φ =

(R )

paire-paire

1 k1

3

2

(k1 )

k1

k1

(k4 )

k4 k4

(49)

k4

Now, using (30) and the definition (2) of the Fourier components of each pair state we have: 1 3 2

2

k4 k4

(k4 ) =

k4

k4

k4 =

pair

,

(50)

pair

k4

The summation over k1 is computed in a similar way, via a simple complex conjugation. We then get on the right-hand side of (49) two scalar products, which finally yields: Φ

(R)Φ

(R )

pair-pair

=

pair

(51)

pair

Unlike the previous contribution, this one is independent of R R . – (III) The terms corresponding to the annihilation-creation of the same pair are obtained for k1 = k2 = k3 = k4 , and yield the average values k1 k1 and k1 k1 respectively. Those terms are just a particular case of the terms appearing in the summation (46) when k1 = k2 , and do not require a specific calculation. Finally, the terms k1 = k2 that we ignored in (I), and for which all the k’s must be equal, contain only one summation over the wave vectors; consequently, they are negligible compared to (46), and will be omitted in this computation. We are then left with the total (I) + (II), which yields: Φ

(R) Φ

(R ) = Φ

(R)Φ

(R )

forward

+ Φ

(R)Φ

(R )

pair-pair

(52) 1855

COMPLEMENT AXVII



where only the second term on the right-hand side does not go to zero when R R becomes large, which indicates a long-range non-diagonal order. According to (51), this second term reaches a maximum when the two internal states and are equal to the state norm defined in (32). It indicates the existence of a cooperative field of pair norm pairs that have a total momentum K = 0 as their external state, and pair as their internal state. Comparing (31) and (51) shows that: Φ

(R)Φ

(R )

pair-pair

= Φ

(R)

Φ

(R )

(53)

The pair-pair term of the two-point correlation function can thus be factored into a product of two one-point correlation functions; for = 1, we get the same function we previously called the “order parameter”. As already pointed out in § 2-a- , it is because the pairs have a zero total momentum that any R and R dependence has disappeared from both sides of (53), but this point is not essential. It is more important to note that introducing such an order parameter, a priori difficult to understand from a physical point of view as it is an average value that does not conserve particle number, is actually quite useful for computing other more physical parameters. We will make the connection between the factorization relation (53) and the Penrose-Onsager criterion for Bose-Einstein condensation in § 2-b- . .

Singlet pairs

Using (26) instead of (13) now leads to a relation very similar to (42); the factor 1 2 is, however, missing, and we must make the substitution: k1 k2 k3 k4

k1

k3

k2

(54)

k4

Relation (44) then becomes: Φ

(R) Φ

(R ) =

1

k1

3 k1 k2 k3 k4 (k1 +k2 ) R]

[(k3 +k4 ) R

k2

k4

2

k3 2

k1

k2

k3

(55)

k4

The rest of the calculation is very similar to the one we just did, and involves the sum of several terms: – (I) The forward scattering terms are obtained for k4 = k1 and k3 = k2 . In two different pairs, a particle is destroyed and then created again in the same individual state (as we now have spin indices, there is no exchange term in this case). The computation is the same as the one that yielded (48) for spinless particles; with the notation (47) for the wave vectors, we get here: Φ

(R)Φ

(R )

forward

=

1

(k)

3

(k)

K (R

R)

k1

k2

(56)

k1 k2

– (II) The terms corresponding to the annihilation-creation of different pairs are obtained for k2 = k1 and k4 = k3 , with k4 = k1 . The computation is now the same 1856



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

as the one that yielded (51). The right-hand side of (55) becomes: 1

(k1 )

3

(k4 )

k1

k1

k4

(57)

k4

k1 k4

Using the definition (38) for the pair wave function in the singlet case, we again obtain: Φ

(R)Φ

(R )

pair-pair

=

pair

(58)

pair

As mentioned above, any R dependence has disappeared from this average value since the paired state was built from pairs having a zero total momentum. – (III) The terms corresponding to the annihilation-creation of the same pair are obtained for k2 = k3 = k1 = k4 ; they are proportional to k1 and already k1 included in the terms (I). The terms where all the k’s are equal are neglected for the same reason as above. To sum up, we find as before: Φ

(R) Φ

(R ) = Φ

(R)Φ

(R )

= Φ

(R)Φ

(R )

forward forward

+ Φ

(R)Φ

+ Φ

(R)

(R ) Φ

pair-pair

(R )

(59)

We arrive, finally, at the same results as for spinless bosons, with the same long-range non-diagonal order of the pairs, as well as the factorization (53) of the order parameters. We shall see in Complement CXVII that this long-range order parameter is intimately linked to the nature of the BCS transition. Here again, the anomalous average values turn out to be useful tools for computing normal average values that conserve the particle number. .

Link with Bose-Einstein condensation of pairs

There is a close link between the order parameter of the pairs and the existence of Bose-Einstein condensation of those pairs. To show this, it is convenient to introduce the density operator of pairs, limiting ourselves, for the sake of simplicity, to the case of spinless particles. In Chapter XVII, the one-particle density operator for identical particles was given, in terms of the field operator, by its matrix elements (B-26): r

r

= Ψ (r)Ψ (r )

where r is the particle’s position and written2 : R

pair

R

= Φ

(R)Φ

(60) its spin. For pairs, the corresponding relation is

(R )

(61)

where R is the position of the center of mass, and and define the internal state of the pair; the index plays a role similar to that of a spin index for a single particle (even though it corresponds to an internal orbital state). 2 As we shall see in § 3, the pair field operators do not exactly satisfy the boson commutation relations. Consequently, operator (61) is not, strictly speaking, a density operator; to underline this difference, pair is sometimes called a “density quasi-operator”.

1857

COMPLEMENT AXVII



In the momentum representation, the diagonal matrix elements of this density operator are: pair

K

=

K

1 3

d3

K

d3

(R

R

) Φ

(R)Φ

(R )

(62)

Since Φ (R)Φ (R ) only depends on R R , we perform the change of variables X = R R ; the integral over d3 is then trivial and cancels the factor 1 3 ; we therefore get: pair

K

=

K

d3

K X

Φ

(R)Φ

(R

X)

(63)

Inserting relation (52) in this result, we get the sum of a contribution from the forward scattering term and from the pair-pair term. (i) The first contribution comes from inserting (48) in (63). The integral over d3 yields a delta function K K and a factor 3 that cancels the same factor in the denominator. As the sum k1 + k2 must now be equal to K , the double summation over k1 and k2 reduces to a summation over k. We then get: (k)

2 K 2

+k

K 2

k

(64)

k

This result is a regular function of K , related to the wave number dependence of the occupation numbers. (ii) The pair-pair term contains the integral of the function (53), which is a product of two constant order parameters; it therefore leads to: K0

3

Φ

2

(65)

The presence of the delta function K 0 shows that the K = 0 level has an additional population (number of quanta of the pair field) that does not exist for any other value of the momentum K; this population is simply the square of the order parameter, multiplied by the system’s volume; it is thus an extensive quantity. It indicates that the pairs of the system undergo Bose-Einstein condensation. As the corresponding population is proportional to the square of the order parameter, this clearly shows the close link between the long-range non-diagonal order, the order parameter and the existence of condensation. The factorization appearing in (53) is often called the “Penrose-Onsager condensation criterion”. 2-c.

Application to the computation of the correlation function (singlet pairs)

The average values of products of pair field operators can also be used to get the correlation functions between particles. We are going to show, in particular, that the correlation function 2 is the sum of an “incoherent” term, independent of the positions, and of a coherent term that involves the pair wave function defined previously. In order to keep the demonstration short, we shall limit the discussion to the case of fermions described by a paired state built from singlet pairs3 , but the transposition to spinless particles is fairly straightforward. 3 Condensed bosons will be studied in Complement E XVII . We will then show that the properties of the paired state, built from the k = 0 states, are not determined by the interactions within this paired state, but rather by the interactions with a condensate k = 0, external to the paired state. It is therefore a completely different case.

1858



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Relation (24) expresses the conjugate of the pair field operator as a function of products of creation operators for its constituent particles; we shall start by inverting this relation. .

Inversion of the relation between fields

The closure relation for the orthonormal basis of the wave functions =1 2 is written: (x )

(x) = (x

x)

(r) with (66)

This summation over must include even orbital functions (x) (associated with a pair field Φ describing paired fermions in a singlet state) as well as odd functions (associated with a pair field describing fermions paired in a triplet state). We then multiply (24) by (x ) and perform the summation over . We recognize in the integral on the right-hand side the closure relation (66), which yields: (x ) Φ

(R) = Ψ (R +

x x )Ψ (R ) 2 2

(67)

r1 + r2 2

(68)

This leads to: Ψ (r1 )Ψ (r2 ) =

(r1

r2 ) Φ

Creating two particles of opposite spins at points r1 and r2 thus amounts to creating a coherent superposition of pairs with a center of mass at (r1 + r2 ) 2, in a singlet or triplet spin state, and with coefficients equal to the wave functions taken at the position r1 r2 . The average value of this expression can be computed in a paired state, using relation (37). This leads to: Ψ (r1 )Ψ (r2 ) =

(r1

r2 )

(69)

pair

As pair is an even function, it easily follows that only the even will contribute to this average value; the triplet pair fields have a zero average value in a singlet pair state. As in § 2-a- , we can choose for the a basis whose first ket 1 coincides with the normalized pair ket norm pair . We then get: Ψ (r1 )Ψ (r2 ) = .

pair (r1

r2 )

pair

pair

pair pair

pair

=

pair (r1

r2 )

(70)

pair

4-point correlation function According to (68), the 4-point correlation function (for opposite spins) is written

as: Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 ) =

(r1

r2 )

(r1

r2 ) Φ

r1 + r2 2

Φ

r1 + r2 2

(71)

1859



COMPLEMENT AXVII

It is expressed in terms of the average values of products of pair creation and annihilation operators, hence in terms of the average values of products of fields for which the index plays the role of an internal state of the molecule. We will show below that it can be expressed as:

Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 ) =

1

(r1 ; r1 ) +

where 1 (r ; r transform of k 1

(r ; r

pair

1

(r1

r2 )

(r2 ; r2 ) pair

(r1

r2 )

(72)

) is the non-diagonal one-particle correlation function, the Fourier :

)=

1

k (r

r)

3

(73)

k

k

and with a similar definition for 1 (r ; r ), the occupation number k being simply replaced by k ; the pair wave function pair has already been defined in (38). The function 1 (r ; r ), being the Fourier transform of a regular function , tends toward zero when the difference r r is larger than a certain (microk scopic) limit; the only terms left are those on the second line of (72). Imagine then that positions r1 and r2 are close to each other, forming a first group, and that the same is true for positions r1 and r2 , forming a second group, while these two groups are far from each other. The non-diagonal correlation function can then be factored into a product of functions pair . This situation is reminiscent of the Penrose-Onsager criterion for Bose-Einstein condensation of bosons (Complement AXVI , § 3-a), but it now concerns the 4-point (instead of 2-point) non-diagonal correlation function. As the norm of pair is the order parameter, it again underlines the important role of this parameter. An important particular case of the 4-point correlation function is the two-body (diagonal) correlation function for opposite spins: 2

(r1 ; r2 ) = Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 )

(74)

The intensity of the pair field is therefore written: 2

(r1 ; r2 ) =

1 k1

6

k2

+

pair

(r1

r2 )

2

k1 k2

=

6

+

pair

(r1

r2 )

2

(75)

We find again relation (D-17) of Chapter XVII, but via another method. The two-body correlation function is the sum of a contribution independent of the positions (hence, with no correlations) and of the modulus squared of the pair wave function. This latter contribution comes from the term that, for pairs, indicates the existence of a long-range non-diagonal order (Bose-Einstein condensation). This is an important property, which is at the heart of the BCS mechanism, and which will be discussed in more detail in Complement BXVII . 1860



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Demonstration We insert in (71) relations (56) and (58). In the forward scattering term, we get the following expression: (r1 =

r2 ) r1

r1 r2

(k)

r2

r1

r2

(k)

K (r1 +r2

k

K (r1 +r2 r1 r2 ) 2 = k r1 r2 r1 r2 k 1 = 3 (k2 k1 ) (r1 r2 r1 +r2 ) 2 (k1 +k2 ) (r1 +r2

=

1

k2

(r 1

r1 )

k1

(r2

k

r1

r1

r2 ) 2

K (r1 +r2

r1

r2 ) 2

r2 ) 2

r2 )

(76)

3

where k and K were defined in (47). Inserting this result in (71), we get the first term of the right-hand side of (72) As for the pair annihilation-creation term (58), it yields: Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 ) =

(r1

= =

r1 pair

(r1

r2 )

r1

r2 r2 )

pair-pair

r2 pair

pair

r1

pair

pair

r2

pair

r1

r2 (77)

and we obtain the second term of the right-hand side of (72). Note that only the singlet pair fields (associated with the even function ) contribute to this term.

3.

Commutation relations of field operators

We now study the commutation relations between the pair field operators just defined. The “spin-statistics theorem” (Chapter XIV, § C-1) states that particles with integer spin are bosons, and particles with half-integer spin are fermions. If we consider two paired fermions, the rules for adding angular momenta (Chapter X) indicate that this composite system necessarily has an integer spin. Intuitively, one could thus expect two bound fermions to behave like a boson; this is the question we now discuss by examining the commutation relations between the operators K and K , and establishing the correction factors introduced by the underlying fermionic structure. 3-a.

Particles in the same spin state

Starting with spinless particles, we shall explain in this simple case the main commutation properties of the pair operators. If the pairs created and annihilated by the operators K and K and their Hermitian conjugates were really bosons, the commutator of these two operators should be equal to KK . We are going to show that the commutator does contain such a term, but with several additional corrections. 1861

COMPLEMENT AXVII

.



Commutation relations of the

K

Any product of two creation operators commute with any product of two creation operators (for fermions, two minus signs cancel each other when products of two operators cross each other); the same is true for two products of annihilation operators. We therefore have: K

K

=0

K

K

=0

The commutator of

K

K

=

(78) K

1 2

and (

k

has yet to be computed:

K

k)

K 2

k

k

K 2 +k

K 2

k

+k

K 2

(79)

k

We will show below (§ 3-a- ) that: K

K

=

+2

KK

(

k)

k

=

+2

KK

κ

K

K 2

κ− K 2

+k

κ− K2

K

(K 2) k

K

κ

(K 2) k

K κ

(80)

(in the second line, we have set κ = k + K 2 and used the parity of the function ); if needed, we can get rid of the coefficient on the right-hand side provided we change the sign of the subscript of (or of ). The first term K K , on the right-hand side of (80) , is exactly the commutator of two bosons with internal states and (spin states for example): this term is different from zero only if both the external and internal variables are the same (in the present case, these internal states are actually orbital states). This first term is, however, followed by an additional term that shows that the fermionic structure of the pairs still plays a role. This latter term is a one-particle operator in the sense defined in § B of Chapter XV; relation (B-12) of that chapter permits computing the matrix elements of the corresponding operator . This additional term contains creation and annihilation operators in normal order, which means that it will go to zero when the populations of the individual states tend to zero; in this limit, the pairs can be assimilated to bosons. When K = K and = (pairs in the same internal and external states), we get the simpler relation: 2 K

K

=1+2

κ κ

K 2

with the usual definition of the population operator k

=

k k

(81)

K κ

k:

(82)

The corrections to a purely bosonic commutation are then proportional to the populations of the individual states of the particles forming the pair, hence confirming the fact that they become negligible when the sum of all these populations is small enough. 1862

• .

PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Demonstration We start by computing the commutator: 1 2

=

3 4

1 2 3 4

(83)

3 4 1 2

where 1 2 3 4 are indices labeling any individual states. We can write: 1 2 3 4

=

23

1 4

+

=

23

1 4

+

13

2 4

+

3 1 2 4

=

23

1 4

+

13

2 4

+

24

3 1

+

3 1 4 2

=

23

1 4

+

13

2 4

+

24

3 1

+

14

3 1

+

1 3 2 4

3 2

+

14

3 2

(84)

3 4 1 2

so that: 1 2

=

3 4

23

1 4

+

24

13

2 4

+

(85)

Putting all the operators in normal order, we get4 : 1 2

=

3 4

23 14

+

13 24

+

24

3 1

+

13

4 2

+

23

4 1

+

14

3 2

(86)

The commutator appearing on the right-hand side of (79) is therefore equal to: KK

kk

+

KK

+

(K K ) 2 k+k

+

(K K ) 2 k

k

+

2) k

(K k

k

(K

2) k

(K K ) 2

k k

(K

2)+k

(K 2) k

(K 2)+k (K 2) k

+

(K K ) 2 k k

(K

2)+k

(K 2)+k

(87)

Inserting the first two terms back into (79), we get the following contribution to the commutator : K K 1 2

(

KK

k)

+

k

k

=

(

KK

k

k)

k

=

KK

(88)

k

where we have taken into account the parity with respect to k of the the functions k – see relation (A-4) of Chapter XVII – and used the fact that the internal states are orthonormal. As mentioned above, this K K is precisely what is expected for a boson commutation relation. It is, however, followed in (87) by four other terms, which are written: 1 2 1 2 2 2 4 Since

(

k)

(

k)

K

K 2

k

K

(K 2)+k (K 2)+k

(

k)

K

K 2

+k

K

(K 2) k (K 2) k

(

k)

K

k

k

k

k

=

K

K 2

K 2

K

k

+k

, we have

K

(K 2) k (K 2) k

(K 2)+k (K 2)+k

=

+

(89)

.

1863

COMPLEMENT AXVII



In each of them, and without modifying the result, we can change the sign of the summation dummy index k, or change the sign of the subscript of the functions or (provided we introduce a factor ). For example, in the second term, we can change the sign of the subscripts of the two functions and (two factors then cancel each other), then change the summation index k into k: this second term then doubles the first one. As for the third term, we simply change the sign of the subscript (K K ) 2 + k of the function (which introduces a factor canceling that same factor already present) and reproduce the first term. Finally, for the fourth term, a parity operation on the function followed by a change of the summation index from k to k makes it equal to the first. The four terms are therefore equal; choosing for example the expression of the third one, and replacing the summation index k by κ = k + K 2, we get relation (80).

.

Commutation relations of pair field operators

For the same reasons as explained above (commutation of any products of two annihilation operators), the operators Φ (R) all commute with each others; the same is true for the adjoint operators Φ (R). We have yet to examine the relations between the Φ (R) and the Φ (R ). Relations (11) and (12) show that: Φ

(R) Φ 1 2

(R ) =

d3

(x) Ψ(R

d3

(x )

x x x )Ψ(R+ ) Ψ (R + )Ψ (R 2 2 2

x ) 2

(90)

The computation will not be carried out explicitly (though it does not present any particular difficulty); it leads to: Φ

(R) Φ = (R + 16 = (R + 16 Ψ (

(R ) R) d3

[x]

[2 (R

R)

x] Ψ (2R

R

x x )Ψ(R ) 2 2

R) d3

(R

R

R+R +z +R 2

z)

(R

R) Ψ(

R + z)

R+R +z +R 2

R)

(91)

In the second more symmetrical form of this commutator, we have used the notation z = R R x. These relations are the equivalent, in the position representation, of the commutation relations (80) in the momentum representation (as already mentioned, it is possible to make the factor appear or disappear in front of the integral by changing the sign of the variable of one of the two functions or ). The commutator thus includes several terms. The first term in (R R ) corresponds to the commutation relation of a usual bosonic field (whether the pair constituents are bosons or fermions); the reflects the commutation of field components 1864



PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

corresponding to different orbital internal states of the pairs. To this term must be added a correction that depends on the structure of the pair, characterized by the functions (r) and (r). We again find a result similar to that obtained before: a first simple bosonic term, which only takes into account the simultaneous exchanges of the two constituents of a “molecule” with the two constituents of another one. This term is followed by a correction that comes from the possibility of exchanges other than those involving complete pairs. Note that this correction is expressed in terms of field operators of the elementary constituents themselves and not of the pairs, as was to be expected since it is the constituents themselves that are involved. This correction term is a one-particle operator, non-diagonal in the position representation, since it destroys a particle at point r and recreates another one at point r + 2 (R R), always at the same distance. To keep things simple, let us assume the dimensions of the “molecules” that define the pair field for the two internal states and we are concerned with, are both of the order of the same dimension 0 ; this means that the wave functions (r) and (r) go to zero when . In relation (91), the values of x that contribute to the integral 0 are those for which none of the two functions [x] and [2 (R R ) x] takes a negligible value; this requires that neither x , nor 2 (R R ) x , be large compared to 0 . This double condition imposes R R 0 , in which case there are values of x for which both functions take simultaneously large values and the correction to the commutator cannot be neglected. On the other hand, if R R 0 , there is no common domain where both functions and take on significant values and the integral over 3 is practically negligible. In other words, the molecular wave function range 0 also plays the role of the commutator correction range. The limit 0 0 can be obtained by choosing functions proportional to a function (Appendix II, § 1-b), whose width goes to zero as 0 and whose integral equals one (it takes values of the order of 1 3 in a domain of volume of the order of 3 ). For the sake of simplicity, we assume that = ; as it is the square of the function that is normalized to 1 (and not the function itself), we must choose: (r)

3 2

(r)

(92)

Using this form for the functions , the integral over d3 in (91) leads to the convolution of two delta functions, which yields a function (R R ) multiplied by the operator Ψ (R) Ψ (R); nevertheless, the coefficient 3 of this term yields zero in the limit where 0. Consequently, if the molecules’ size is very small compared to all the characteristic lengths of the system (such as the distance between molecules), the commutation relations of the field operator are exactly the same as for fields associated with bosons. In conclusion, when the “molecules” have no spatial overlap5 , the only relevant exchanges concern exchanges between both of their constituents. On the other hand, when the two molecules do overlap, individual exchanges between their constituents become possible. If the molecules are loosely bound, as in the example of the BCS fermion pairing mechanism (Complement CXVII ), they cannot be treated as bosons without structure, and one must use the complete formula (91) for the commutator.

5 This does not exclude the case where the distance between molecules is small or comparable to the de Broglie wavelength of their centers of mass: the gas of molecules may be degenerate.

1865



COMPLEMENT AXVII

3-b.

Singlet pairs

We now study the case of particles in a singlet pair, as in § 1-b. .

Commutation relations of the

K

As before, any products of two creation operators commute with any products of two creation operators; the same is true for any products of two annihilation operators. Relations (78) are thus still valid. We now have to compute the commutator: K

=

K

(

k)

k

k

K 2

k

K 2 +k

K 2

k

K 2

k

K 2

+k

(93)

k

We are going to show that: K

=

K

+

k

=

KK

(

k)

K

KK

K 2

K

+k

+

(K 2) k

κ− K 2

+

K

κ− K2

K

K κ,↓

κ

K 2

(K 2) k

+

K

κ

(94)

k K κ

(with, in the last line, the notation κ = k+K 2). Here again we find that the commutator of the two operators K and K includes, to begin, a purely bosonic term, followed by corrections containing operators in normal order ( which go to zero in the limit of low occupation numbers); a correction must be added for each of the two spin states. Demonstration To prove (94), we again use relation (86). As the indices 1, 2, 3 and 4 represent all the quantum numbers associated with an individual state, they must now contain the spin indices; these are added to the momentum indices, which play the same role as in the previous calculation. It then follows that the states 1 and 3 are always orthogonal, as are the states 2 and 4; the only terms remaining on the right-hand side of (86) are the terms in 23 and 14 , so that: =

K

K

+

(K K ) 2 k

k

(

k

k)

K 2

k

k

KK

k

kk

(95) K 2

+

k

or else (since the basis of the functions

k

(K K ) 2 k k

K 2

+k

K +k 2

is orthonormal):

KK

+

( k

k)

K

K 2

+k K

(K 2) k

+

K

K 2

K 2

k

+k K

(K 2)+k

K +k 2

(96)

We now modify the second term in the bracket of the summation, to make it similar to the first one: as the functions k have a definite parity ( ) with respect to k, we can change the index signs of and , and the sign of the dummy index k of the summation. The only difference between the two terms is then the spin directions, and we therefore obtain (94).

1866

• .

PAIR FIELD OPERATOR FOR IDENTICAL PARTICLES

Commutation relations of pair field operators A similar calculation to the one that led to (91) permits obtaining: Φ (R) Φ (R ) = (R

R)

+8

d3

Ψ (2R

R

[x]

[2 (R

R)

x x )Ψ (R ) + Ψ (2R 2 2

x] R

x x )Ψ (R ) 2 2

(97)

(a more symmetrical form of the right-hand side can be obtained by again using the notation z = R R x). The commutator is thus equal to that of elementary bosons plus a correction term. This latter term plays an important role over a distance R R, of the order of the range of the wave functions , and is the sum of contributions independent of the two spin states. Conclusion In conclusion, note that the pair field operator provides interesting insights concerning the physical properties of paired states in a many-body system. For a -particle state, built from a two-particle wave function , it leads to a new wave function pair when the particle indistinguishability is taken into account. In the framework of the BCS theory, we will see how this pair wave function allows characterizing the cooperative effects of pair interactions. Introducing an order parameter is also useful for showing the link between anomalous average values (which do not conserve particle number) and the normal average values. The results take on different forms for paired states of bosons or fermions. There is, however, a strong analogy between the two cases, which provides a unified framework for the study of different phenomena, such as Bose-Einstein condensation of particles or pairs.

1867



AVERAGE ENERGY IN A PAIRED STATE

Complement BXVII Average energy in a paired state

1

2

3

4

Using states that are not eigenstates of the total particle number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1869 1-a

Computation of the average values . . . . . . . . . . . . . . . 1870

1-b

A good approximation . . . . . . . . . . . . . . . . . . . . . . 1870

Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1871 2-a

Operator expression . . . . . . . . . . . . . . . . . . . . . . . 1871

2-b

Simplifications due to pairing . . . . . . . . . . . . . . . . . . 1874

Spin 1/2 fermions in a singlet state

. . . . . . . . . . . . . . 1874

3-a

Different contributions to the energy . . . . . . . . . . . . . . 1874

3-b

Total energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 1880

Spinless bosons . . . . . . . . . . . . . . . . . . . . . . . . . . . 1881 4-a

Choice of the variational state

. . . . . . . . . . . . . . . . . 1881

4-b

Different contributions to the energy . . . . . . . . . . . . . . 1882

4-c

Total energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 1887

In Chapter XVII, the paired states were introduced in a general way, without specifying any particular form of the Hamiltonian. In order to use the paired states Ψpaired in the framework of a variational method, i.e. to be able to minimize the average value of the energy of an -particle system, we must compute the average value of the energy in these paired states; this is the purpose of this complement. We start (§ 1) by examining the consequences of the fact that these states are not eigenvectors of the total particle number operator . In § 2, we clarify the notation and give the expression of the Hamiltonian . We then deal successively with the fermion case (§ 3) and the boson case (§ 4). This second case is slightly more complicated since it requires the adjunction of a specific state to describe the condensate. 1.

Using states that are not eigenstates of the total particle number

The paired states Ψpaired are coherent superpositions of states containing different numbers of particles. One may wonder how the average values computed in such states can be relevant for a physical system where has a fixed value. As we already mentioned in § D of Chapter XVII, this approach is correct for large values of the average particle number, provided the operators, whose average values we are computing, conserve the particle number (i.e. commute with the total particle number , as is the case for the Hamiltonian operator ). We are going to show in more detail that when these conditions are met, the average values do not depend on the state vector’s coherences between different values; they can thus be obtained using the paired states. 1869



COMPLEMENT BXVII

1-a.

Computation of the average values

The state Ψpaired defined in (B-5) of Chapter XVII is a superposition of states where the particle number is exactly =2 :

Ψ

1

Ψpaired =

!

=0

Ψ

(1)

As the matrix elements of the operator different eigenvalues are zero, we have: Ψpaired

between eigenkets of

=0

!

is the energy average value in the state Ψ =

Ψ Ψ

Ψ

Ψ Ψ

(2)

(3) ( ) as:

2

1

Ψ

!

Ψ

the diagonal element of Ψpaired

Ψ

:

Consequently, if we define the weight distribution

( )=

Ψ

2

1

=

Ψ

!

=0

where

2

1

Ψpaired =

corresponding to

(4) in Ψpaired is given by:

Ψpaired =

( )

(5)

=0

The average value

is then obtained by dividing this expression by the square of the

norm Ψpaired Ψpaired . In a general way, the diagonal element in Ψpaired of any operator that commutes with is given by a linear combination of the average values of this operator in the states Ψ with the weight distribution ( ). As an example, for any function of the operator

, we can write:

Ψapp

Ψapp =

( )

(2 )

(6)

=0

1-b.

A good approximation

For a system with a fixed eigenvalues and the kets Ψ 1870

= 2 particle number, we are trying to determine the ; the most direct method would be to vary separately



AVERAGE ENERGY IN A PAIRED STATE

each ket Ψ to optimize . This would lead, however, to complicated calculations. It turns out to be much more practical to vary Ψpaired and optimize the corresponding energy; this leads to nearly the same results for large particle numbers, as we now explain. We saw in § C-2 of Chapter XVII that the particle number fluctuations in a state Ψpaired are very small in relative value when is large. This means that the distribution ( ) has a sharp peak around a certain value 0 of , which determines half the average value of the particle number. Now if the energies are practically constant over the width of that distribution, the Hamiltonian diagonal matrix element (2) can be written: Ψpaired

2

1

Ψpaired

0

=0

=

0

!

Ψ

Ψ

Ψpaired Ψpaired

(7)

Making this diagonal matrix element stationary (keeping constant the norm of Ψpaired ) is equivalent to making The optimal value obtained for this matrix 0 stationary. element, divided by the squared norm of Ψpaired , yields a good approximation of the energy we are looking for. Once Ψpaired has been optimized in this way, it can be 0 projected onto the various subspaces with fixed particle numbers, and therefore obtain the Ψ , corresponding to stationary states with fixed particle numbers. In the following complements, we shall use the paired states rather than the states Ψ with fixed particle number. Comment : In the following complements, rather than optimizing the average energy, it is the difference between this average energy and the average particle number multiplied by the chemical potential that we shall optimize. As the two operators and commute with the total particle number, the line of reasoning we just followed also applies to that case.

2.

Hamiltonian

Consider a physical system composed of fermions or bosons, placed in a cubic box of edge length . 2-a.

Operator expression

The Hamiltonian is the same as the one used on several occasions, for example in Complement EXV (but we assume here that there is no external potential): =

0

+

The operator particle : 0

=

(8)

int 0

0(

is the sum of the kinetic energy operators

)=

P2 ( ) 2

0(

) associated with each

(9)

1871



COMPLEMENT BXVII

and

is the sum of the interaction energies between particles:

int

1 2

=

int

2 (R

R )

(10)

= =1

where 2 (R R ) only depends on the difference R R (translation invariance) and does not act on the spins. We now express in terms of creation and annihilation operators, according to formulas established in Chapter XV. We use the basis of individual states k , where k labels the momentum }k of a plane wave that satisfies the periodic conditions in the box; the index labels the spin state of the particles, but if they are all in the same spin state, it can be omitted in what follows. We get: =

k

k

+

k

1 2

+

1:k k ;k

; 2:k

2 (R1

R2 ) 1 : k

; 2:k

;k ;k k

k

k

(11)

k

with: }2 2

=

2

(12)

(since the interaction potential does not act on the spins, we were able to replace the spin index associated with k by the index , as well as the index associated with k by the index ). The matrix elements of 2 appearing in (11) can be written: d3

d3

1

2 (r1

2

r2 )

1

(k

k

) r1 (k

k

) r2

(13)

6

We make the following change of variables: R = (r1 +r2 ) 2 and r = r1 over d3 of the exponential yields the Kronecker delta function: 1

d3

3

(k+k

k

k

)R =

r2 . The integral

(14)

k+k k +k

which enforces the conservation of the total momentum: k+k =k +k

(15)

The integral over d q

=

1 3

3

d3

introduces the Fourier transform qr

q

of the potential1 :

2 (r)

(16)

with: q=

(k

(k

k)

k)

1 The factor 1 3 in (16) comes from the normalization of the plane waves edge length ; it ensures the potential has the dimension of an energy.

1872

(17)

2 kr

3 2

in a cube of



AVERAGE ENERGY IN A PAIRED STATE

Figure 1: Symbolic plot of a general interaction process where two particles of momenta ~k and ~k are replaced, as the result of their mutual interaction, by particles of momenta ~k and ~k . The indices and label the spins, which are not modified by the interaction. The horizontal line represents the momentum transfer ~q whose value is given by (17) and (18).

or else, taking (15) into account: q=k

k=k

(18)

k

The momentum transfer q gives the momentum variation of particle 1, as well as the opposite of the momentum variation of particle 2. Since 2 (r1 r2 ) is symmetric with respect to the exchange of the variables r1 and r2 , the functions 2 (r) and q are both even and real. The matrix element of the interaction potential is then: 1:k

; 2:k

2 (R1

R2 ) 1 : k

; 2:k

=

k+k k +k

q

(19)

and is schematized in Figure 1, where the horizontal line represents the momentum transfer ~q resulting from the interaction between the ingoing and outgoing particles. The interaction potential operator can thus be written: int

=

1 2

q kk k

k

k

k

k

(20)

k

where the summation over the k actually concerns only three wave vectors, since k = k+k k . In a frequently used approximation, one assumes the interaction potential range to be very small compared with the de Broglie wavelengths of all the particles involved 1873

COMPLEMENT BXVII



(contact potential). The variations with k of 2 (k) can then be neglected, and all the matrix elements of the interaction potential are equal to a given constant 0 (provided they conserve the total momentum; otherwise, they are obviously zero): 0

2-b.

=

1 3

d3

2 (r)

(21)

Simplifications due to pairing

In general, the computation of the average value of the operator (11) is very complex, due to the large number of possible interaction terms. However, as we already saw in § D-1-a of Chapter XVII, some simplifications occur for a paired state. The main reason is that in the various components of a paired state on Fock states, all the paired individual states have the same population. If the population of an individual state k changes, the population of the individual state k must change by the same quantity, otherwise the average value of the operator is zero. To get a non-zero average value in a paired state, the combination of creation and annihilation operators in the considered interaction term must respect this parity condition. Now the interaction operator (20) is a sum of terms containing two annihilation operators on the right, and two creation operators on the left. Only two possibilities exist for the population balance of all the pairs to be conserved upon the action of these four operators: either the two creation operators re-establish the initial populations of the two states that were depopulated by the annihilation operators (in which case none of the populations are changed); or else, the two annihilation operators destroy particles in the same pair of states, and the creation operators produce another pair (in which case the population of the first pair2 is lowered by 2, and the population of the second increased by 2). The two possibilities are combined in the particular case where the creation operators restore precisely the pair of particles destroyed by the annihilation operators. We are then led to the different cases examined in detail in § D-1-a of Chapter XVII: Case I (direct and exchange forward scattering terms), Case II (pair annihilation-creation terms) and Case III (combination of the two previous terms, yielding a negligible contribution). 3.

Spin 1/2 fermions in a singlet state

We now compute the average value of the operator , written in (11), in the state ΨBCS defined in § B-2-b of Chapter XVII. As far as the interaction energy is concerned, we will show that the terms associated with Case I only yield the usual mean field contributions, already discussed in the previous chapters. On the other hand, the terms associated with Case II are a direct consequence of the pairing, and are therefore totally new; they play a leading role in the BCS theory. The terms associated with Case III, being a particular case of the other two cases, generally play a negligible role. 3-a.

Different contributions to the energy

The different contributions to the energy will be computed successively, starting with the kinetic energy. 2 We defined in § C-2 of Chapter XVII, the pair population operator ˆ pair as the sum of the population operators of each of the two individual states forming the pair.

1874

• .

AVERAGE ENERGY IN A PAIRED STATE

Kinetic energy

The first term (kinetic energy) is, as for the particle number, the sum of the contributions of the pairs of states, labeled by k (each of the two states having the same kinetic energy): 0

=

(pair k)

k

=2

k

(k)

k

2

k

=2

sin

2

(22)

k

k

.

Interaction energy

The average of the interaction potential energy is the sum of the averages of the terms on the right-hand side of (20), i.e. of terms that belong to one of the three possibilities I, II and III cited above; we study them successively. – Case I (the creation operators repopulate the states depopulated by the annihilation operators) For such terms, the occupation numbers of each individual state remain unchanged in the course of the interaction process. They are “diagonal terms” (sometimes called “mean field terms”). Two cases may arise, depending on whether the spin index is the same as, or different from ; we examine each of these possibilities in turn. (i) If = , as the interaction potential does not act on the spin, we can trace each particle using its spin direction; it is as if the particles were distinguishable. If the creation operators repopulate exactly the same individual states depopulated by the annihilation operators, the only possible interaction is schematized in Figure 2, and corresponds to a forward scattering. As the momentum transfer q is zero, the potential term includes the constant 0 , and we get the following contribution to the average energy: 0

2

ΨBCS

k

ΨBCS

k

k

k

(23)

k= k

(the condition k = k comes from the fact that the pairs are different, each pair being labeled by the value of k associated with the spin +). Two anticommutations permit bringing the last operator k right after the first one k (with two sign changes that cancel each other). If we now sum all the contributions from = + and from = , we get: 0

2

k

k

k

k

k

k

k

k

k= k

+

k

k

k

k

k

k

k

(24)

k

We can show that the two terms inside the brackets yield the same contribution by interchanging the two dummy indices k and k in the summation. We thus double the first term, and after changing the sign of k , we get: 0

k

k

k

k

k

k

k

k

=

(k) 2

0

k=k

2

k

k=k

=

sin2

0

k

sin2

k

(25)

k=k

1875

COMPLEMENT BXVII



Figure 2: Schematic plot of the interaction between particles of opposite spins, which do not belong to the same pair (forward scattering). This diagram contributes to the particles’ mean field. When the particles are distributed in a large number of individual states, the value of the summation in the above expression is barely changed if we ignore the constraint k = k . If we now use for the expression (C-19) of Chapter XVII, we can write this contribution as: 0

2

(26)

4

According to relation (21), the constant 0 is proportional to the inverse of the volume 3 . This term can be interpreted as a mean field term, where 2 particles with a spin + interact with 2 particles having a spin ; a particle with a given spin direction feels the mean field exerted by all the particles with opposite spin, whose numerical density is 2 3. (ii) if = , it is no longer possible to distinguish the particles by the direction of their spin, and the indistinguishability effects play their full role. Two cases must be distinguished for these “diagonal terms”: either k = k and k = k , which yields a direct term; or k = k and k = k , which yields an exchange term. In both cases, the individual states populated in the bra and the ket are the same, and we are dealing with “diagonal processes” that can be called “mean field terms”. For the direct term, no particle changes its momentum, which again corresponds to a “forward scattering” (left-hand side of Figure 3), and the potential term again includes the constant 0 written in (21). The average value of this direct term is: 0

2

ΨBCS

k

k

k

k

ΨBCS

(27)

k=k

Here again, since k = k (otherwise we would have the square of an annihilation operator, which is zero), two anticommutations let us bring the operator k to the second position,

1876



AVERAGE ENERGY IN A PAIRED STATE

Figure 3: Interaction between particles having the same spin; the direct term (forward scattering) is schematized on the left, and the exchange term on the right. These two diagrams add their contributions to the diagram of Figure 2 to build the particles’ mean field.

and we get: 0

2

k

k

k

k

k

k

k

=

k

2

0

k=k

2

k

k

k=k

=

sin2

0

k

sin2

k

(28)

k=k

(the two values of yield the same contribution, hence the disappearance of the factor 1 2 on the right-hand side). As for the exchange term, we have k = k and k = k (right-hand side of Figure 3); for such a momentum exchange, the transfer q is no longer zero, but equal to: q=k

(29)

k

and the potential term now includes Furthermore, when k = k : k

k

k

=

k

k

k

k

k

obtained by inserting q = k

(30)

k

k

k in (16).

Apart from this sign change, the computation is the same as for the direct term. The sum of the two direct plus the exchange contributions finally yields: [ k=k

0

k

k]

2 k

2 k

=

[

0

k

k]

sin2

k

sin2

k

(31)

k=k

In the short-range potential approximation where k k = 0 , this sum is zero: the Pauli exclusion principle prevents particles having the same spin components from interacting via a contact potential.

1877

COMPLEMENT BXVII



– Case II (particles annihilated in a pair of states and restored in another pair) Considering the nature of the creation and annihilation operators it contains, this process may be called “pair annihilation-creation”. It plays an essential role in the BCS pairing, as we shall see in Complement CXVII ; the corresponding term in the Hamiltonian is thus often called the “pairing term”. We then have, on one side k = k and = , and on the other, k = k , so that, according to its definition (17), the momentum transfer is q = k k ; the corresponding diagram is shown in Figure 4. We are going to show below that its contribution to the energy can be written as: k

k

sin

k

cos

k

sin

k

cos

k

2 (

k

k

)

(32)

k=k

This term is new, in the sense that it is not a mean field term, like the previous ones, but that its existence is due to the pairing process. We will show in Complement CXVII that its contribution to the average energy plays an essential role in the BCS theory.

Figure 4: Interaction process between two particles in the same pair, which, in their final states, end up in another pair. In terms of creation and annihilation operators, this process is a “pair annihilation-creation” (two particles of the same pair are annihilated, while two particles are created in another pair). As opposed to the terms introduced by the other interaction processes, this term’s contribution to the energy depends on the pairing; it is sometimes called the “pairing term”, and is responsible for the energy gain in the BCS theory (Complement CXVII ). 1878



AVERAGE ENERGY IN A PAIRED STATE

Demonstration: If

= +, the contribution contains a product of “anomalous” average values: 1 2

k

ΨBCS

k

k

k

k

ΨBCS

k

k=k

=

1 2

k

k

k

k

k

k

k

k

k

(33)

k

k=k

that is, using (C-42) and (C-44) of Chapter XVII: 1 2

k

k

k

k k

k

k=k

=

1 2

k

k

sin

k

cos

k

sin

cos

k

2 k

(

k

k

)

(34)

k=k

If = , it is now the kets that come into play, and we obtain another k and k product of anomalous average values for which we must use (C-43) and (C-45) of Chapter XVII (as well as the fact that the functions of k are even, as indicated in that chapter): 1 2

k

ΨBCS

k

k

k

k

ΨBCS

k

k=k

=

1 2

k

k

k

k

k

(35)

k

k=k

This expression is the same as the previous one, since it only differs by the sign of the summation dummy indices k and k (remember that q is even). We therefore remove the factor 1 2 in (34) and get (32).

– Case III (particles annihilated in a pair of states, then restored in the same pair) We then have again k = k and = , but in addition k = k (and hence necessarily k = k ), as shown in Figure 5; this is another case of forward scattering. We now check that this term can be neglected. Its contribution to the energy is: 0

2 If

ΨBCS

k

k

k

ΨBCS

k

(36)

k

= +, we get (after two operator anticommutations): 0

2 and if

k

k

k

k

k

k

=

k

=

k k

2

(37)

k k

:

0

2

0

2

k

k

k

k

k

=

0

2

2 k

(38)

k

1879



COMPLEMENT BXVII

Figure 5: Interaction process where two particles of the same pair are scattered in the forward direction.

This term is the same as the previous one, as it only differs by the sign of the summation dummy index k. Taking into account expression (C-19) of Chapter XVII for ˆ , we can write the total contribution as: 2

0

k

=

k

0

ˆ

(39)

2

This contribution is interpreted as the average attraction energy in an ensemble of ˆ 2 pairs. When the average particle number is large, we can neglect (39) compared to (26). Consequently, the pairing effects we are going to discuss cannot be simply interpreted as an attraction among an ensemble of 2 pairs.

3-b.

Total energy

Finally, adding the terms (22), (31), (26) and the double of (34), we get the average energy3 : sin2

=2 k

k

+

0

4

2

+

[

0

k

k ] sin

2

k

sin2

k

cos

k

k=k

+

k

k

sin

k

sin

k

cos

k

2 (

k

k

)

kk

(40) 3 The summations over k have no restrictions, contrary to the tensor product appearing in relation (B-8) of Chapter XVII, where the summation is limited to a half-space to avoid redundancy.

1880



AVERAGE ENERGY IN A PAIRED STATE

The first term on the right-hand side corresponds to the kinetic energy, the second to the mean field for particles of opposite spins, the third one is the analogous term for particles having the same spin (it goes to zero for a short-range potential); these three terms were already present in the Hartree-Fock theory. The fourth term, however, is new: it corresponds to the pair annihilation-creation (pairing term) whose average value is non-zero only in a paired state. It is the only one that depends on the phases k , which will prove to be essential in the BCS theory (Complement CXVII ). 4.

Spinless bosons

For bosons, we must take into account the Bose-Einstein condensation phenomenon (Complements BXV , CXV and FXV ): in the ground state, a large fraction of the particles can occupy a single quantum state, the state k = 0. This is not the case for a paired state; we must therefore choose a variational state permitting such a condensation. We assume the interactions to be repulsive, in order to avoid the instabilities occurring for a system of attractive bosons (Complement HXV , § 4-b). 4-a.

Choice of the variational state

In Complement CXV , we used the Gross-Pitaevskii approximation to treat, in the simplest way, Bose-Einstein condensation: the system of bosons is supposed to be, at a given instant, in a state that is the product of identical individual states, generally chosen as the zero momentum state, k = 0; the system state is thus written as: Φ =

0

0

(41)

( 0 is the creation operator in the individual state k = 0). However, whereas such a state is suitable for an ideal gas ground state, it can only be an approximation for a gas of interacting particles: it is an eigenvector of the kinetic energy, but not of the operator associated with the interaction energy. The interaction potential actually couples this state to all the states where two particles are transferred from the individual state k = 0 toward any two individual states of opposite momenta ~k and ~k (because of momentum conservation), such as, for example, the state: 2

Φ =

k

0

0

k

(42)

where two states of a pair are occupied. This suggests using a state Ψpaired as a variational ket4 for describing the components of the system state vector associated with all the individual states k = 0. We must also include the components corresponding to the individual state k = 0; those will be described5 by a “coherent state” (Complement GV ). 4 The

interaction potential also couples a state such as (42) to numerous states of the form 3

0 , where q can take on any value. An exact theory would require taking those “unpaired” states into account, but leads to complex calculations. This is why we limit ourselves to a variational method in the framework of an approximation where the states k = 0 are only accessible to pairs (we assume ). 0 5 This individual state must be treated separately, as applying the general formula (B-9) of Chapter XVII, used when k = 0, to obtain k=0 would involve the exponential of the square of the operator k+q

q

k

0

1881

COMPLEMENT BXVII



We therefore choose a variational state vector of the form: Φ

=

Ψpaired =

0

0

(43)

k k

(the notation refers to the name Bogolubov). In this expression, Ψpaired is the paired state for spinless particles (B-8) of Chapter XVII, a tensor product of the normalized states k defined in (B-9) and (C-13). The domain of the tensor product in (43) is half the k-space to avoid (as seen previously) a double appearance of each state k ; the origin k = 0 is excluded from . This domain could eventually have an upper bound for k. As for 0 , it is the coherent state whose expression can be found, for example, in Complement GV , whose relations (65) and (66) provide6 : 0

=

0

2

0 0

0

=0

(44)

This state depends on a complex parameter its phase 0 : 0

=

0

=

characterized by its modulus

0

0

and

(45)

0

0

It is a normalized eigenvector of the operator 0

0,

0

with the eigenvalue

0:

(46)

0

The average particle number in the state k = 0 is thus: 0

0 0

0

=

0

0

0

0

=

0

(47)

The width of the corresponding distribution is 0 (Complement GV ), hence negligible compared to 0 (supposed to be a large number). The variational variables contained in the trial ket (43) are thus the set of k and , as well as 0 and 0 . k 4-b.

Different contributions to the energy

We now compute the average energy, in the variational state Φ of the Hamiltonian operator given by (11).

written in (43),

, leading to large fluctuations of the particle number in the state k = 0 (condensed particles). This k=0 would necessarily yield large fluctuations of the total particle number, as well as of the average repulsive energy, whereas, as we saw in § 3-b- of Complement GXV , those fluctuations are not possible precisely because of this repulsion. 6 One shold be careful about the change of notation: in Chapter V and its complements, 0 denotes the ground state of the one particle harmonic oscillator, which here corresponds to the vacuum 0 = 0 . In the present complement, 0 is the ket associated with a large number of particles occupying the same individual state, as is also the case of the wave function (r) of the Gross-Pitaevskii equation (Complement CXV ); with the notation of Complement GV , this ket would rather correspond to a 0 state.

1882

• .

AVERAGE ENERGY IN A PAIRED STATE

Kinetic energy

The kinetic energy term is the sum of the contributions from the different individual states k, with no contribution from the k = 0 state (since =0 = 0). Each term of the summation contains the operator k , whose average value in the factored (over the k) state (43) is given by the average value k k k in the state k . This average value is given by relation (C-33) of Chapter XVII as sinh2 k , which yields, for the average value of the kinetic energy in the state Φ the expression: =

k

k

k

sinh2

=

k=0

(48)

k

k=0

with: =

}2 2

where .

2

(49)

is the particle mass. Interaction energy

The average value of the interaction potential energy is a sum over four indices k k k k of the potential matrix elements described in (19). As opposed to what happened for the kinetic energy, these elements have no particular reason to cancel out if one (or several) of their indices is zero. We shall therefore compute the different contributions, arranging them in decreasing order of the number of their zero indices. A noticeable simplification of the computation occurs with the choice of the trial vector, as the coherent state 0 is one of its factors. Any time one of the four indices in the potential energy term is zero, the corresponding annihilation operator may be replaced by the complex number 0 . This is because the trial ket Φ is an eigenvector of the operator 0 with eigenvalue 0 - see relation (46). In the same way, each time one of the two indices k or k is zero, the creation operators on the left of the product, and hence acting on the bra Φ , can be replaced by 0 , since the Hermitian conjugate of relation (46) is: 0

0

=

0

(50)

0

These two operators are therefore simply replaced by numbers. Let us examine in turn all the possible cases. (i) If the four indices k k k k are zero, the interaction potential contributes via the constant 0 (forward scattering term), defined in (16) as the integral of the interaction potential 2 (r); this contribution is written as: forward

= 00

0

2

0

0 0 0 0

0

=

( 0

0)

2

2

(51)

The corresponding term is represented in Figure 6. There is no contribution from terms where three (and only three) indices k k k k are zero: total momentum conservation would require the fourth index to also be zero. (ii) If among the four indices k k k k two are zero, one concerning an annihilation operator, the other a creation operator, the momentum conservation requires the 1883

COMPLEMENT BXVII



Figure 6: Diagram symbolizing the interactions between particles in the individual state k = 0, and which remain in that state after the interaction (forward scattering term that yields the internal mean field of the condensate). other two operators to be k and k , with the same index k. This can yield either a direct term, or an exchange term. - the direct terms contain the average value of either the product k 0 0 k , or of the product 0 k k 0 ; it is again a forward scattering process and the potential appears via the constant 0 . The two average values can be factored into two terms 0 0 0 0 = 2 and k k k =sinh2 k ; they are thus equal and the corresponding contribution 0 is written as: direct

= 0

0

sinh2

0

k

(52)

k=0

where the subscript symbolizes the ensemble of the “excited” states, i.e. those with momentum ~k = 0 that have a non-zero kinetic energy. Introducing the average total number of particles in these excited states: =

sinh2

=

k k k=0

k

(53)

k=0

we can write: direct

= 0

0

0

(54)

This term is simply interpreted as coming from the interaction between 0 particles in the condensed state k = 0 and particles in the other individual states. - the exchange terms contain k 0 k 0 and 0 k 0 k . We are now dealing with a momentum transfer process, and the potential now appears via the constant k obtained by inserting q = k in (16). Otherwise, the computation is the same as for the direct 1884



AVERAGE ENERGY IN A PAIRED STATE

terms: the two average values can be factored, and the corresponding contribution is written: ex

= 0

k

0

sinh2

(55)

k

k=0

The two terms (54) and (55) correspond to mean field contributions associated with the interaction between k = 0 particles and k = 0 particles, taking into account the indistinguishability of the particles that led to the exchange term. (iii) if the product of operators contains the annihilation operator in the state k = 0 twice, the momentum conservation requires the product to be of the form k k 0 0 . We are now dealing with a process where two particles in the state k = 0 are replaced by a pair (k k), which amounts to creating a pair from particles initially in the condensate, as shown on the left-hand side of Figure 7; here again, the potential appears via the 2 constant k . The two annihilation operators introduce the factor [ 0 ] = 0 2 0 and the other two operators, an “anomalous average value” in a state k , which we already computed in (C-51) of Chapter XVII. We therefore get: 0

2

k

sinh

k

cosh

k

2 (

0

k)

(56)

k=0

If the product of operators contains the creation operator in the state k = 0 twice, it must necessarily be of the form 0 0 k k , which corresponds to the annihilation of a pair whose particles are transferred to the state k = 0 (right-hand side of Figure 7). This product is the Hermitian conjugate of the previous one, and its average value is the complex conjugate of the previous result. The sum of these two terms is the contribution of the processes of creation and annihilation of pairs from the condensate: k

0

sinh

k

cosh

k

cos 2 (

0

k)

(57)

k=0

These terms come from the pairing of particles, as opposed to other terms that are related to the mean field. We shall see in Complement EXVII the essential role they play in the Bose-Einstein condensation of an ensemble of bosons. (iv) There are matrix elements of the interaction potential involving a single particle in the k = 0 individual state, and three other particles in the k = 0 states. The corresponding terms have a zero average value in the state Φ because of the structure of its component Ψpaired , where the occupation numbers of two paired states must always vary together. (v) We have yet to compute the contribution of terms where none of the wave vectors are zero. The computation is very similar to that of § 3, and we shall again distinguish three cases: – Case I Terms containing interactions where particles are created in the states from which they were destroyed: a direct term in k k k k , and an exchange term in k k k k . The computation is the same as in § 3, except for the fact that no minus sign occurs in 1885



COMPLEMENT BXVII

Figure 7: The diagram on the left represents a process where two particles, initially in the k = 0 individual state, interact and end up in states of opposite momenta. The diagram on the right represents the inverse process, where two particles of opposite momenta collide and end up in the k = 0 individual state (i.e. in the condensate). As opposed to the previous terms, corresponding to the mean field of interacting particles, the terms corresponding to this diagram are introduced by the pairing process: they play a central role in the Bogolubov theory (Complement EXVII ). the exchange term. Result (31) therefore becomes, for bosons7 : 1 2

[ k=k

0

+

k

k]

2 k

2 k

=

1 2

[

0

+

k

k]

sinh2

k

sinh2

k

k=k 0

2

(

2

) +

1 2

k

k

sinh2

k

sinh2

k

(58)

kk

The first term is the direct term that, when 1, is interpreted as the effect of the interaction mean field between the ( 1) 2 different pairs of particles (when 1). It is corrected by a second exchange term, which expresses the increased interaction between particles due to the boson bunching effect. – Case II The “pair annihilation-creation” term in k k k k , which yields here, taking

7 As opposed to the fermion case, the contribution of the terms k = k is not zero but involves the average value of the operator k k 1 in the state k , see § C-2-b of Chapter XVII. However, for a large system, the number of individual states k is very important, and this contribution is totally negligible compared to (58). This is why we neglected this term. As for fermions, the summations over k do not have any restriction (no limitation to half the reciprocal space).

1886



AVERAGE ENERGY IN A PAIRED STATE

(C-51) and (C-52) of Chapter XVII into account: 1 2

k

k

k

k

k

k

k k

k

k

k=k

=

1 2

k

k

sinh k cosh

k

sinh

k

cosh

2 (

k

k

k

)

(59)

k=k

– Case III Finally, the case where only one pair is involved leads, as in the fermion case, to 2 a term proportional to 0 , negligible compared to the term in 0 of (58) when the average particle number is large. We shall therefore neglect it. 4-c.

Total energy

Regrouping the terms in

0

of (51), (54) and of (58), we get a total mean field

term: mean field

=

0

2

(

+

0

)

2

(60)

From a physical point of view, it is natural that this term be proportional to the square of the total particle number divided by 2, that is to the number of ways particles can be associated by pairs (when 1). If we now include (55), (57) and (59), we get for the total energy: sinh2

=

k

k=0

+

k

0

sinh2

k

+

0

2

sinh

k

(

0

+

cosh

) k

2

cos 2 (

k)

0

k=0

+

1 2

k k

sinh2

k

sinh2

k

+ sinh

k

sinh

k

cosh

k

cosh

k

cos 2 (

k

k

)

k k =0

(61) The second summation on the right-hand side describes the effect of momentum transfers between k = 0 particles and the condensate, as well as the processes of annihilation and creation of pairs from the condensate (this last process depends on the relative phases 0 k and, as already mentioned, arises from the pairing of particles). The last terms on the right-hand side, included in the double summation over k and k , correspond to interactions between particles in the k = 0 states. Since the number of individual states is very large, we have ignored the constraint k = k of relation (59), which has a negligible effect; furthermore, as we noted in Chapter XVII, it is justified to replace the imaginary exponential by a cosine. We have shown, in this complement, that the paired states are a useful tool for computing the average energy of an ensemble of interacting particles. In the following complements, we shall use these results successively for fermions and bosons.

1887



FERMION PAIRING, BCS THEORY

Complement CXVII Fermion pairing, BCS theory

1

2

3

4

Optimization of the energy . . . . . . . . . . . . . . . . 1-a Function to be optimized . . . . . . . . . . . . . . . . 1-b Cancelling the total variation . . . . . . . . . . . . . . 1-c Short-range potential, study of the gap . . . . . . . . Distribution functions, correlations . . . . . . . . . . . 2-a One-particle distribution . . . . . . . . . . . . . . . . . 2-b Two-particle distribution, internal pair wave function . 2-c Properties of the pair wave function, coherence length Physical discussion . . . . . . . . . . . . . . . . . . . . . 3-a Modification of the Fermi surface and phase locking . 3-b Gain in energy . . . . . . . . . . . . . . . . . . . . . . 3-c Non-perturbative character of the BCS theory . . . . Excited states . . . . . . . . . . . . . . . . . . . . . . . . 4-a Bogolubov-Valatin transformation . . . . . . . . . . . 4-b Broken pairs and excited pairs . . . . . . . . . . . . . 4-c Stationarity of the energies . . . . . . . . . . . . . . . 4-d Excitation energies . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

1890 1891 1893 1896 1899 1899 1901 1909 1914 1914 1917 1918 1919 1919 1920 1920 1922

We present in this complement the BCS mechanism for the pairing of fermions through attractive interactions. The three letters BCS refer to the names of J. Bardeen, L.N. Cooper and J.R. Schrieffer who proposed in 1957 [9] a theory for a physical phenomenon already observed in 1911 by H. Kamerlingh Onnes in Leiden, but as yet unexplained. This latter scientist observed that, below a certain temperature, the electrical resistivity of certain metals (mercury in his case) abruptly goes to zero as a phase transition occurs toward a so-called “superconducting” state. Along with this transition, many other spectacular effects occur, such as the expulsion of magnetic fields from the material. In this complement, we shall be concerned, with the general pairing mechanism of attractive fermions in the framework of BCS theory. We shall not, however, give any detail about the theory of metals, simply accepting the existence of an attraction between the fermions, without justifying its precise origin. In metals, this effective attraction comes from a coupling between electrons and phonons, and is therefore indirect, introducing an additional complexity to the problem. Furthermore, we shall not present any calculation of electrical resistivity, and hence not show that it can go to zero. The BCS theory is a mean field theory, of the same type as the Hartree-Fock theory (Complements DXV and EXV ). In this latter theory, particles are assumed to independently propagate in the mean field created by all the others; the system is described by an -particle Fock state. Here, we shall assume that the particles form pairs, and this hypothesis will lead us to use, as a variation trial ket, the ket ΨBCS introduced in Chapter XVII; this complement is a direct application of the results of that chapter. The state 1889

COMPLEMENT CXVII



we will choose does indeed mathematically resemble a Fock state of “molecules”, each composed of two particles. It should not be concluded, however, that this approximation reduces to a theory where each molecule is considered as an identifiable object moving in the mean field created by all the others. This naive picture is correct in the limit where the molecules are very strongly bound, but we shall see that it is totally inappropriate for loosely bound pairs such as those in the BCS theory. As already underlined in the introduction to Chapter XVII, the use of paired states brings a lot of flexibility to the mean field approach, as it allows modulating the binary correlation function between particles, and then to adapt it to interactions. We start (§ 1) by minimizing the energy to determine the optimal quantum state in the family considered. In § 2, we discuss some physical properties of the optimized BCS wave function, mainly in terms of one- or two-particle correlation functions, but also in terms of what is called “non-diagonal order” (Complements AXVI and AXVII ). Finally, in § 3, we shall study in more detail the physical content of the BCS pairing mechanism allowing the optimization of the energy of a fermion system, and in particular the role of phase locking (spontaneous symmetry breaking). For the sake of simplicity, we assume throughout this complement that the temperature is zero, but the BCS method can also be extended to the study of non-zero temperatures. This will lead to the study of excited states (§ 4), as will be briefly mentioned in § 4-d). Shortly before the BCS theory was established, Cooper proposed a model including two attractive fermions. He showed that the exclusion of their wave functions from the interior of a Fermi sphere led to a bound state having certain properties similar to those described later by the complete BCS theory. This theory can be considered to be a generalization to particles of the Cooper model, highlighting the collective effects leading to the properties of the BCS ground state. The Cooper model will be studied in Complement DXVII , and its analogies with the -particle theory will be underlined. In the present complement, we present the BCS theory, starting directly from the general results of Chapitre XVII; we shall also use the average energy values calculated in § 3 of Complement BXVII . We obviously cannot give here a detailed account of superconductivity theory and its various resulting effects, which would require an entire book. Limiting a large part of the computations to zero temperature situations already implies that numerous phenomena are outside the scope of this complement. To learn more about the subject, the reader can consult reference [8]. 1.

Optimization of the energy

Relation (B-11) of Chapter XVII yields the expression of the paired state1 ΨBCS : ΨBCS = exp

k k

k

k

0 =

k

(1)

k

1 This state is a superposition of components containing different numbers of particles. As already mentioned in Chapter XVII, one could also choose a variational state where the particle number is perfectly determined ([8], § 5-4 and Appendix C of Chap 5), but that would make the calculations a bit more complex.

1890



FERMION PAIRING, BCS THEORY

The ket ΨBCS was then normalized by separately normalizing each ket became the kets k : =

k

k

+

k

k

where the two functions k

=

k

k

0

k

and

k

, which

(2) k

are related by: (3)

k

and satisfy: 2 k

2

+

k

=1

(4)

In that chapter, k was introduced as the Fourier transform of the wave function (r) of the “diatomic molecule” used to build the paired state; until now, this state was not specified. Here, we shall consider the k as variational parameters. Choosing k = 0 leads to k = 0 and k = 1: in that case, the two individual states k and k are neither occupied nor paired. They will be, however, if k is not zero. In general, the number of non-zero k is, a priori, arbitrary (finite or infinite). We can, for example, limit their number by setting a maximum value for the modulus of k, and consider this maximum value as a supplementary variational parameter defining the trial ket. k We were led, in that same Chapter XVII, to set k = cos k and k = sin k k , relations that imply that k and k have opposite phases (a situation always possible to obtain by changing the global phase of the ket k , which has no physical consequences). In the present complement, it will be more convenient to assume that the phase of k is chosen in order to make k real and positive, and we will set: k

= cos

k

k

= sin

k

2

(5)

k

Relation (C-19) of Chapter XVII yields the particle number in the state ΨBCS : 2

=2

k k

1-a.

sin2

=2

k

(6)

k

Function to be optimized

The average particle number in the state ΨBCS may be changed by varying the k dependence of the k and k : as an example, choosing k = 1 and k = 0 for any value of k, the average number will be zero; on the other hand, if k is very small and can k equals 1 for a great number of k values, the average total particle number attain arbitrarily large values. As the energy minimization operation makes sense only for a fixed value of , we shall determine that value with the Lagrangian multiplier (chemical potential; see Appendix VI, § 1-c). We will optimize the k and k choices by introducing the variations d k and d k and cancelling the variation of the average value = . The volume 3 of the physical system and its chemical potential are supposed to be fixed; we can choose one of two equivalent sets of variables to be determined, either the k and the k , or the k and the k . 1891



COMPLEMENT CXVII

Relation (40) of Complement BXVII yields then have:

, whereas

is given by (6). We

= =2

(

2

)

0

+

k

2

4

k

+

+

[

0

k

k]

2

2

k

k

k=k

k k

k

k

k

(7)

k

kk

with, according to (5): 2 k

= sin2

;

k

k

k

= sin

k

cos

k

2

(8)

k

In the above expression for , is the kinetic energy of a free particle with momentum }k: }2 2 = (9) 2 and the k are the Fourier transforms of the interaction potential 2 (r): =

k

1 3

d3

kr

2

(r)

(10)

As this potential is rotationally invariant, the function k only depends on the modulus of k, and it must be real (Appendix I, § 2-e); as the potential is attractive, we can assume all the k to be negative. We saw in Chapter XVII that the first term of (7) corresponds to the kinetic energy, the second to the mean field (diagonal term) for particles with opposite spins, the third one to the similar term for particles with identical spins (in which the direct and exchange terms cancel each other for a short-range potential). Finally, the fourth term, on the second line (which contains k k ) plays a particularly important role in what follows; it comes from the pair annihilation-creation diagram schematized in Figure 4 of Complement BXVII . It is often called the “pairing term”. We use relation (6) to write:: 2

d[

] =2

d

=4

d

2

(11)

k

k

We then get: d

=2

d

2 k

k

+

k k

k

k

d[

k k] + k

k

d[

k k]

(12)

kk

where the variable is the kinetic energy with respect to the chemical potential, and corrected by the interaction effects 2 : =

2A

+

0

2

+

(

0

k

k)

2

k

factor 2 appears in front of the summation over k as the variations of k must be added, but is included in the factor 2 in front of the summation containing

1892

(13)

k

2

and .

k

2

in (7)



FERMION PAIRING, BCS THEORY

Comment: For the applications of the BCS theory, the choice of the interaction potential to be used in the equations is not necessarily self-evident. This is especially true in the superconductivity theory of metals, where the fermions involved are electrons which, isolated, interact via a repulsive Coulomb potential. In a metal, however, the direct repulsive interaction between electrons is mostly screened and they interact indirectly via the crystalline network deformations (phonons, see Complement JV ). This phenomenon leads to a long-range attractive component in their effective interaction, and explains why pairing between electrons is possible. This effective interaction depends on the phonon characteristics, and in particular on the Debye frequency of the solid under study. This is also true in the theory of an ultra-cold diluted fermionic gas, where we do not use directly the interatomic potential in the equations. This interatomic potential contains, at short-range, a strongly repulsive part (often assimilated to a “hard core”) and, at an intermediate distance, a strongly attractive well, permitting the formation of a large number of molecular states. Now when the gas under study is very dilute, the three-body collisions leading to these molecules are very rare, meaning these molecular states play practically no role; only the long distance effects of the potential have a real importance. In other words, the essential role is played by the asymptotic properties of the stationary collision states, as described by the scattering amplitude (Chapter VIII, relation B-9) and the associated phase shifts (Chapitre VIII, § C). The potential used in the BCS computations will therefore be an “effective potential”. Furthermore, as the collisions occur at very low energy, this effective potential only depends on the phase shift 0 associated with = 0. This phase shift is generally characterized by a “scattering length” 0 defined as 0 0; the effective potential will be attractive 0 when if this scattering length is negative. As this complement deals mainly with the quantum mechanism for BCS pairing, and not with the determination of a valid potential, we shall not examine this point further and assume a pertinent choice of the interaction potential has been made. 1-b.

Cancelling the total variation

It is obvious in (7) that the first three terms on the right-hand side depend only on the moduli of the k ; only the last term (annihilation-creation) depends on the phases must be minimal if we vary the k without changing the k , i.e. k . Now the function when we vary only this last term: k k

sin

k

cos

k

sin

k

cos

k

2 (

k

k)

(14)

kk

We assumed that all the potential matrix elements were negative, whereas relation (C-5) of Chapter XVII shows that the products sin k cos k are positive. The lowest value of this sum will be obtained when all the terms in the summation over k and k have the same phase in order to add coherently. This condition is called the “phase locking condition”, and will be discussed in more detail in § 3-a- . The minimum obtained does not depend on the absolute phase of the k , but only on their relative phases. One can then simply choose all the k to be equal to zero, which means that all the k are real and positive. This is the choice we shall adopt from now on. 1893



COMPLEMENT CXVII

In relation (12), the terms in d [ k are dummy indices), and we have:

d

=2

d(

k)

2

+2

k

k k

k k]

k

and d [

d(

k

k k]

now become equal (the k and

k k)

(15)

kk

We must now vary the k . For that purpose, we introduce the quantities ∆k , having the dimension of an energy, as: ∆k =

(

k k

)

k

(16)

k

k

The ∆k are real since the k and k are real, and positive since we assumed the interaction potential matrix elements to be negative. They are called “gaps” and play an important role in the BCS theory. It will be easier to discuss this role in the case of a very short-range potential; this will be done in § 1-c. The choice of the word “gap” will also be explained later (in § 4, see in particular Figure 7). The variation of can now be written as: d

=4

k

d

2

k

k

∆k [

The variations of d k and d (for k and k real) that: 2

d

k

k

+2

k

d

k

k

k

d

2

k

k

∆k

Multiplying by 2

k k

(17)

are, however, not independent since relation (4) requires

k

by

kd k

(

∆k

k)

k

(

k)

k k,

k.

The right-hand side of (17) then becomes:

2

d

(19)

k

k

k

k

kd k]

(18)

Cancelling the variation of 2

+

=0

This means we can replace d 4

kd k

k

with respect to all the

k

leads to:

2

=0

(20)

k

we get:

= ∆k (

2 k)

(

k)

2

(21)

or else: sin 2

k

= ∆k cos 2

(22)

k

One can then compute the sine and the cosine, since: 2

2

[cos 2 k ] = 1894

[cos 2 k ] 2

[cos 2 k ] + [sin 2 k ]

2

=

( ) 2

2 2

( ) + (∆k )

(23)



FERMION PAIRING, BCS THEORY

and we obtain: cos 2

k

=

= 2

( ) + (∆k ) sin 2

k

2

∆k

= 2

( ) + (∆k )

=

∆k

(24)

2

where we have set: =

2

( ) + (∆k )

2

(25)

We finally obtain: 1 1 [1 + cos 2 k ] = 1 2 2 1 1 2 [ k ] = [1 cos 2 k ] = 1 2 2

[

2 k]

=

(26)

They are many possibilities for rendering stationary the difference of average values, = , depending on the sign chosen in each equation, and for each value of k. This multiplicity of solutions is not surprising since the stationarity is obtained not only for the ground state, but also for all the possible excited states of the physical system; those will be discussed in § 4, and we will see that the stationarity conditions (26) include the possibility of “excited pairs” (§ 4-c- ). For the moment, we shall concentrate on the search for the ground state and look for the absolute minimum of the average value (7). Let us examine which signs must be chosen in (26) to get the ground state, i.e. the lowest possible value of in expression (7). The chemical potential of an ideal fermion gas in its ground state is positive, equal to the Fermi level, and proportional to the particle density to the power 2 3 (Complement CXIV , § 1-a). In the presence of a weak attractive potential, the factor ( ) in the first term on the right-hand side of (7) is negative when the modulus of k is small, and positive when . In the first case, 2 to minimize , it is better to choose values of [ k ] as large as possible, and hence the sign in the second equation (26) since k is negative in this case. On the other hand, 2 when , it is better to choose values of [ k ] as small as possible, and it is again the sign in the second equality that must be chosen. As a result, it is the sign that must be taken in the second equality (26), and hence the + sign in the first one. As we know that k and k are positive, we finally obtain for the ground state:

k

=

1 2

1+

k

=

1 2

1

(27)

Inserting these results in (21), we verify that the stationarity relations are fulfilled, independently of the sign of . They apply as long as the self-consistent condition derived 1895

COMPLEMENT CXVII



from the ∆k definition (16) is satisfied: ∆k =

1 2

2 k k

1

=

k

1 2

∆k k k

2

(

k

(28) 2

) + (∆k )

As we now show, this condition takes on a simpler form for a very short-range interaction potential. The above computation shows that starting from any function (r), or (which amounts to the same thing) from functions k considered to be entirely free variables, the optimization procedure yields values for k and k ; this in turn fixes the optimal k and determines the function (r) for building the paired state described in (1). 1-c.

Short-range potential, study of the gap

The matrix elements of the interaction potential were defined in Complement BXVII as the Fourier transforms of the potential – see relations (16) and (18) of that complement. For a regular potential of range , the matrix element necessarily varies when changes by a quantity of the order of 1 ; in particular, 0 when 1 . However, in many physical applications of the BCS theory, the wave vectors involved remain very small compared to 1 , and a useful approximation is to ignore the variations of the . We therefore consider them all equal to the same constant : k

=

(29)

The minus sign was introduced to make a positive number for an attractive potential; this number is inversely proportional to the volume , as shown by relation (10). Definition (13) of k now takes on a simplified form: =

(30)

2

that can be inserted in the relations (27) to get the functions k and k . Relation (16) also has a simpler version; all the ∆k take on the same value ∆: ∆=

k

(31)

k

k

In this case, there exists only one value of the gap, and since the k and k are real, this value is also real. We shall see in what follows that ∆ plays a particularly important role, especially in the dispersion curve characterizing the system excitations (see for example Figure 7 which shows the existence of an energy minimum equal to ∆). All the previous formulas apply, provided we replace the ∆k by ∆. Relations (24) then become, with the sign choice leading to the ground state: cos 2

k

=

= 2

( ) + ∆2 sin 2

k

=

∆ 2

( ) + ∆2 1896

=



(32)

• Equalities (27) are unchanged. When (25) and (30) into account, the value of ∆ 2

k

k

FERMION PAIRING, BCS THEORY

the second relation (32) shows that, taking goes to zero as:

∆ 2

(33)

It will be useful in what follows to know the asymptotic behaviors of the functions k and k , whose values as a function of k are given by (5). When , relation (33) shows that k goes to 1 whereas k goes to zero as: ∆ ∆2 + 0( 2 ) 2

k

1 2

+ 0(

1 4

)

(34)

This ensures the convergence of the summations (C-19) and (C-26) of Chapter XVII giving the average values of and of its square. .

Self-consistency condition and divergences The self-consistent condition (28) now takes on the simpler form: 2

∆=

2

1

=

k

∆ 2

k

(35)

2

( ) + ∆2

that is: 1

1=

2

(36)

2

( ) + ∆2

k

which is an implicit equation expressing the gap ∆ in terms of (as this latter parameter appears in the definition of ). Choosing a large volume 3 , we can replace the discrete summation over k by an integral. We shall assume, as mentioned in § 1, that the k in the variational ket (1) are zero when the modulus of k is larger than a given cutoff value . Under these conditions, the implicit equation for the gap becomes: 3

1=

3

2 (2 )

1

d3 0

(37)

2

( ) + ∆2

where is the upper limit of the wave vectors introduced in § 1; remember that is inversely proportional to the volume, and hence the right-hand side of this equation does not depend on that volume. The integral will diverge if is infinite, since when , the function to be integrated behaves as 1 1 1 2 . The value obtained for ∆ therefore depends on the value chosen for ; this upper limit then plays an important role. .

Calculation of the gap We note =

}2 2

the equivalent, in terms of energy, of the cutoff frequency

:

2

(38) 1897

COMPLEMENT CXVII



We now choose the energy as the integration variable. We then have to consider the density of states ( ), obtained3 by taking the differential of the definition of : 3

( )=

2

4

2 }2

3 2

(39)

Relation (37) then becomes: 1=

1

( )d

2

0

(40) 2

) + ∆2

(

where, to simplify relation (30), we introduced a chemical potential mean field energy4 : =

+

relative to the

(41)

2

In relation (40), the function to be integrated over contains a fraction that is maximum for = ; this fraction takes on significant values in an energy band of width ∆ centered, in k space, on the surface of the “Fermi sphere” (see Complement CXIV ) whose radius obeys: }2 2

2

=

(42)

As for the density of states ( ), it takes on low values in the vicinity of the center of that sphere, but increasingly larger ones outside. The inside of the sphere barely contributes to the summation, the main contribution coming from the outside, in between the Fermi surface and the cutoff energy . We can then find in this region an intermediate value ( ) without changing the integral, with: 0 for the density of states that can replace ( )

( )

0

(43)

The density of states can be removed from the integral, and we get: 1=

0

2

1

d 0

(

(44) 2

) + ∆2

As is inversely proportional to the volume, whereas, according to (39), the density of states is proportional to it, this relation is independent of the volume. In the physics of superconducting metals, the attractive interaction between the electrons is mediated by the motions of the crystal’s ions, i.e. by the phonons of the network. A cutoff energy naturally appears in the matrix elements of the interaction potential, the Debye energy } of the phonons. One often uses a simple model where, 3 This

density of states is defined in Complement CXIV , and given by formula (8) of that complement. In our case, as we do not have to take into account the two spin states, the density of states is half the one computed in that complement. 4 With the sign convention we chose for in (29), this mean field energy is equal to 2 per particle.

1898



FERMION PAIRING, BCS THEORY

in (28), the potential matrix elements k k are zero as soon as the difference in energy is larger than } , supposed to be much smaller than ; otherwise, they are all equal to a constant . The same computations as those that led to (44), yield in this case the gap equation5 for levels close to the Fermi surface: +}

1

2

1

d

) + ∆2

(

}

= 2

arsinh

} ∆

(45)

where is the density of states on the Fermi surface. If, furthermore, we make the approximation 1, we get: ∆=

} sinh (1

)

2}

exp

1

(46)

This important equation is called the “BCS gap equation”. It is worth noting that this expression cannot be expanded in a power series of when the interaction goes to zero: all the derivatives of ∆ with respect to are zero for = 0. Consequently, this expression cannot be obtained in the framework of a perturbation theory in powers of the interaction (this point will be discussed in § 3-c). 2.

Distribution functions, correlations

Inserting expressions (27) in the trial ket, we obtain the optimal state vector ΨBCS that best describes the ground state. We now examine the physical implications of that optimized quantum state, concerning the properties of the one- or two-particle distribution functions. These properties will be used later on in this complement to understand the origin of the energy lowering due to condensation into pairs of particles. 2-a.

One-particle distribution

As we are going to show, the properties of the one-particle distribution are fairly close to those of an ideal gas. .

Momentum space

Once the gap ∆ is obtained, we can use relations (30) and (32) to determine the values of k for each value of k; relation (C-16) of Chapter XVII then yields the average number of particles in each pair of states. As the two states composing the pair play the same role, the average number of particles in each of the states is simply half that number, that is sin2 k . Figure 1 plots, as a function of , the variation of the distribution function k obtained, which is the momentum distribution function of a particle, once the variables k and k have been optimized. For an ideal gas, we saw in Complement BXV that it is a Fermi-Dirac distribution; at zero temperature (as is the case here, since we are studying the ground state), this distribution is a “step function” equal to 1 for and to zero for (dotted curve). In our case, the transition 5 There are two equal contributions to the integral on the right-hand side, one from the values of above , the others from those below; this is why the factor 1 2 has disappeared from the second equality.

1899

COMPLEMENT CXVII



2

Figure 1: Plots of the one-particle distribution function k = k in the BCS state, as a function of the energy . In the absence of interactions, this function is equal to 1 for , and zero for (dotted line step function). In the presence of interactions, due to the pairing of fermions the curve is rounded off over an energy domain of the order of the gap ∆ (double arrow on the horizontal axis), and the variations occur around the value (value of shifted by the mean field effect). The dashed curve plots the product . This function is largest around = , in a k k as a function of the same variable domain spreading over a few ∆.

between 0 and 1 occurs around , hence for a value of chemical potential shifted by the mean field effect as indicated by relation (41); this energy shift due to the mean field is natural. What is more striking is that the curve no longer presents a discontinuous step, but varies progressively over an energy domain whose width is of the order of the gap ∆. The interaction effect depopulates certain pairs of states in favor of other pairs having higher kinetic energies. Certain fermions are promoted from the inside of the Fermi surface towards the outside, this effect occurring over a depth of the order of ∆. In k space, the perturbation introduced by the attractions is localized in the neighborhood of that surface; the fermions situated close to the center of the Fermi sphere are not concerned, whereas those close to that surface gain an energy of the order of the gap.

.

Position space

Relations (B-22) and (B-23) of Chapter XVI yield the one-particle correlation function in position space:

1 (r

1900

;r

) = Ψ (r)Ψ (r ) = r

r

(47)



FERMION PAIRING, BCS THEORY

where is the one-particle density operator. Taking into account formula (A-14) of Chapter XVI, applied to normalized plane waves, we can write: 1 (r

;r

1

)=

(k

r

(k

r

k r)

k

3

k

kk

1

=

k r)

ΨBCS

3

ΨBCS

k

k

(48)

kk

where k is the creation operator in the individual state of momentum }k and spin , and k the annihilation operator in the state of momentum }k and spin . Now, in the state ΨBCS , the occupation number of each momentum pair are either 0 or 2, which means that the average values of the product of these operators is zero whenever each of them concerns a different pair, or if the two individual states are in the same pair, but are different from each other. We now have: 1 (r

;r

1

)=

k (r

r)

k (r

r)

3

ΨBCS

k

k

ΨBCS

k

1

=

3

ˆk

(49)

2

(50)

k

with: k

= ΨBCS

ΨBCS =

k

k

k

The function 1 is proportional to the Fourier transform of the average population k of the individual state k . As this average population is a function whose width is of the order of the Fermi wave vector , the function 1 goes to zero when r r 1 , i.e. as soon as the difference in positions is no longer microscopic: in this system, there exists no “long-range non-diagonal order” of the one-particle correlation function. For r = r , we get: 1 (r

where

;r BCS

)=

2

3

ˆ

=

BCS

2

(51)

is the numerical particle density: ˆ

BCS

=

3

(52)

The function 1 (r ; r ) has no spatial dependence; the factor 1 2 reflects the fact that the total density BCS is shared equally among the two spin states. The results are the same as for an ideal gas. 2-b.

Two-particle distribution, internal pair wave function

As opposed to the one-particle distribution, the two-particle distribution is strongly affected by the BCS mechanism, which is to be expected since it is a pairing process. 1901



COMPLEMENT CXVII

.

Momentum space, peak in the distribution

Relation (C-19) of Chapter XV yields the expression for the matrix elements of the two-particle density operator in an unspecified basis. In the momentum representation, they are written: 1:k

3; 2

:k

1:k

4

1; 2

:k

=

2

k

1

k

2

k

4

k

(53)

3

We shall mainly consider the diagonal elements, characterizing the correlations between the momenta of two particles: 1:k

;2 : k

1:k

;2 : k

=

k

k

k

(54)

k

In this expression, the creation operators repopulate precisely the same states as those that have been depopulated by the annihilation operators. For = , in order to obtain a non-zero result, we must have k = k (otherwise we get the square of a fermionic operator, which is zero). We get: 1:k

;2 : k

1:k

;2 : k

=

2

2

k

(if k = k)

k

(55)

which means there exists no correlations between the momenta. For = , when k is different from k, two different pairs are involved, and we again get a product6 : 1:k

;2 : k

1:k

;2 : k

2

=

k

2 k

(if k =

(56)

k)

On the other hand, if k = k, only one pair is concerned, destroyed and then reconstructed by the operators (as before, this is a contribution from the diagram in Figure 4 of Complement BXVII ; the computation then involves a single state k , and we get: 1:k

;2 :

1:k

k

;2 :

k

=

2

(57)

k 4

This result is not the limit of the previous one when k k, which would be k ; the 2 value we obtain is larger, since k 1. This shows that for all the values that do not correspond to a pair of opposite momenta, the density operator is simply a product, involving no correlations between the particles’ momenta. This confirms what we found in the study of the one-particle density operator: all the k states having a momentum smaller than that of the Fermi level are populated, with a rounding off of the functions due to the pairing phenomenon. On the other hand, for opposite values of both the momenta and the spin values, as is the case for a pair, we observe a discontinuity of the diagonal correlation function: it (2) 4 2 jumps from k to the larger value k . The corresponding discontinuity k can be written as: (2) k

2

=

k

=

k k

4 k 2

=

2 k

1

2 k

(58)

6 If = , to get this result we use the fact that the function (k) is even. Now if k = k, two pairs are still involved, labeled by opposite values of momentum (remember that we chose the convention where each pair is labeled by the momentum of the spin + particle); but, here again, the parity of (k) leads to (k) 4 , in agreement with (56).

1902



FERMION PAIRING, BCS THEORY

Figure 2: The left part of the figure shows the distribution function of the momenta of two particles, assuming that the two momenta }k1 et }k2 are parallel (or antiparallel); a value ∆ = 1 10 has been chosen. In the descending part of the surface, and in the two corners where k1 + k2 vanishes, one can distinguish a small crest indicating a partial Bose-Einstein condensation. In order to see this effect more clearly, we cut the surface along vertical planes whose trace is indicated by the dashed lines in the right part of the figure. This leads to the curves of Figure 3. (2)

We shall see below (§ 2-b- ) that k is none other than the square of the k component of the pair wave function. This discontinuity is significant for the values of k for which the product k k takes on its largest values; Figure 1 shows that it corresponds to a region around the Fermi surface, with an energy bandwidth of the order of the gap ∆. The momentum distribution function depends on the 6 components of the two momenta, which does not allow a simple graphic representation. To simplify, we are going to assume the two particles’ momenta }k1 and }k2 are parallel (or antiparallel), so that the distribution we wish to represent becomes a surface in three-dimensional space: we plot 1 along one axis, 2 along the second, and the probability along the third perpendicular axis. We then obtain the surface shown in the left part of Figure 2, where it has been assumed that ∆ = 1 10. To explore this surfact in more detail, we plot in Figure 3 the curves we obtain by cutting this surface by vertical planes parallel to the first bisector of the 1 and 2 axes; we assume the difference 1 2 to be constant (dashed lines in the right part of Figure 2) and use, as the variable, the sum of these two momenta. The horizontal axis in Figure 3 then represents the dimensionless variable : =

+ 2

1

2

(59)

As

varies, the corresponding point in the plane 1 , 2 moves along a straight line. For a fixed value of = 1 2 , we must set = 0 for the wave vector components to have opposite values: 1 = 2 and 2 = 2. On the left-hand side of Figure 3, the difference is chosen equal to 1 4 ; we obtain an almost square curve, rounded off by the fact that ∆ is not zero (as was the case in Figure 1), and which presents 1903

COMPLEMENT CXVII



Figure 3: Plots of the distribution function for two particles of parallel (or antiparallel) momenta k1 and k2 , as a function of the dimensionless variable = ( 1 + 2 ) 2 ; the figure was plotted with the choice ∆ = 1 10. For the curve on the left-hand side, the difference ( 1 . 2 ) is chosen equal to 1 4 The curve looks like a bell shaped function, practically constant for small values of , and decreasing for larger values following a rounded slope similar to the one in Figure 1 (and all the more steep as the ∆ value is chosen smaller). No singularity of the distribution is clearly visible (except for a minuscule peak at the origin). For the curve on the right-hand side, the difference in momenta is chosen equal to 2} ; when is close to zero, the two momenta take on values that both fall into the rounded part of the one-particle distribution. A singularity at = 0 is now clearly visible, signaling an accumulation of “molecules” in a state where their center of mass does not move. The height of the central peak corresponds to the population of the discrete level having a zero total momentum, and its width reduces to zero as it is a discrete level.

a barely visible peak. But, as we saw before, the effects of the pairing are important when 1 = , i.e. when 2 . On the right-hand side of Figure 3, the 2 momentum difference is chosen equal to 2} , so that the momenta can both fall in the rounded part of the distributions. We observe, superposed on a “pedestal”, a narrow peak indicating an additional population in the level having a zero total momentum. The value of that population is given by the height of the peak, whose width is strictly zero for discrete levels. The singularity of this momentum distribution is then clearly visible. Therefore, a singularity appears in the momentum distribution of particle pairs, whose centers of mass present a condensation in momentum space. This is, however, a partial condensation: as opposed to the boson case, the condensation peak appears on a pedestal due the presence of a majority of non-condensed pairs. Actually, the only pairs involved are those whose two components have energies falling around the Fermi level , in a bandwidth of the order of the gap ∆. Despite these restrictions, it is nevertheless true that the condensation phenomenon into attractive BCS pairs has properties related to Bose-Einstein condensation for repulsive bosons. The link between that condensation and the appearance of an order parameter for the pair field is discussed in § 2-b- of Complement AXVII . 1904

• .

FERMION PAIRING, BCS THEORY

Position space, correlations described by the pair wave functions

We did not find any effects of the interactions on 1 . But, as pointed out before, since the BCS theory relies on pairing, one expects to find more interesting properties concerning the two-particle correlation functions. They will be studied now, limiting ourselves to the “diagonal” correlation function, as defined by relation (B-33) of Chapter XVI, including the spin variables as in (B-36). This function is written as: 2

(r

;r

)=

Ψ (r)Ψ (r )Ψ (r )Ψ (r)

= 1:r (

;2 : r

1:r

;2 : r

(60)

is the two-particle density operator), or else, as before: 2

(r

;r

1

)=

[(k

k

) r+(k

k

)r]

k

6 kk k

k

k

k

(61)

k

This expression includes the average values of the product of four operators, whose computation is similar to the one explained in § 3 of Complement BXVII for the average interaction energy, except that, in our case, the spin indices are fixed rather than appearing as summation indices. Figure 4 schematizes with a diagram each term of (61): the incoming arrows represent particles that disappear (annihilation operator action), the outgoing arrows represent those that will appear (creation operator action); each value of k is associated with a position value r, via an exponential k r for the incoming kr arrows, or for the outgoing ones, as well as with a value of the spin . Parallel spins: if = , the two destruction operators necessarily concern pairs with different k. To restore the populations of these two couples of states to an even value, the only possibility is to again give each one its initial value; otherwise the result will be zero. We must have, either k = k and k = k (direct term), or k = k and k = k (exchange term). In the first case, we obtain (after two anticommutations whose sign changes cancel each other) a result7 independent of the position variables: 1 k

6

k

k

k

k

k

k

1

=

k

2 k

6

kk

2 k

(62)

kk

and in the second case (after only one anticommutation): 1

(k

k

) (r

r

)

k

6

k

k

(k

k

k

k

k

k

k

kk

1

=

) (r

r

)

2 k

6

2

(63)

k

kk

Regrouping these two contributions, and using (6), we get: 2 2

(r

;r

)=

2

3

1

[ (r

2

r )]

7 If = , we must change the sign of k and k in k and result since we can change the sign of summation variables.

(64) k

but, as before, it does not change the

1905



COMPLEMENT CXVII

Figure 4: This diagram symbolizes each term involved in the binary correlation function. The two incoming arrows on the bottom represent particles that will disappear during the interaction under the action of the two annihilation operators; the two outgoing arrows on the top represent particles that appeared during the interaction under the action of the two creation operators. All the arrows are associated with an imaginary exponential of the position, with a positive argument for the incoming arrows, and a negative one for the outgoing arrows. The indices label the spins.

with an exchange term containing the (real) function: 2

(r) =

kr

2

(65)

k

k

This result has the same form as relation (22) of Complement AXVI , taking into account the fact that the population of each spin state is half of . It shows that the correlation function for two parallel spins exhibits an “exchange hole” very similar to the one plotted in Figure 2 of that complement, but with a slightly different shape, since here 2 the functions k are no longer exactly discontinuous step functions. The width of this exchange hole is of the order of the inverse of , the Fermi wave number related to the Fermi energy by = }2 2 2 . Opposite spins: If = , it is possible for the two annihilation or creation operators to act on the same pair of states; we are then dealing with a pair annihilationcreation term (term of type II according to the classification presented in § D-1-a of Chapter XVII). Figure 5 symbolizes the three types of diagrams playing a role in the computation of the correlation function for opposite spins: I (forward scattering), II (pair-pair) and III (special cases). The computation of their sum has been performed in § D-2 of that same chapter, and led to the following result: 2 2

1906

(r

;r

)

2

3

+

pair

(r

r)

2

(66)



FERMION PAIRING, BCS THEORY

We have used the following definition of the (non-normalized) “pair wave function”8 r ): pair (r pair

(r) =

1

kr

k k

3

=

k

∆ 2 3

1

kr

(67)

2

( ) + ∆2

k

which is simply the pair wave function already introduced in relation (D-14) of Chapter XVII. We find again relations (38) and (39) of Complement AXVII , where this wave function was obtained by a different method involving the two-particle field operator. The presence of the second term on the right-hand side of (66) is thus due to the non-zero average value of the pair field, introduced in that complement (non-zero order parameter). We have just shown that, contrary to what happens in an ideal gas (Complement AXVI , § 2-b), two particles with opposite spins may be spatially correlated. This correlation is described by the modulus squared of the wave function pair (r r ), defined by its spatial Fourier transform k k . This new wave function, different from the one we used at the beginning to build the -particle trial wave function, was introduced in § D-2 of Chapter XVII, as well as in Complement AXVII , starting from the pair field operator. The spatial correlation it characterizes is purely dynamic, as it does not exist in the absence of interactions. Its physical consequences, in terms of potential energy, will be discussed in § 3-a- . Physical discussion: on the right-hand side of (66), the first term does not contain any spatial dependence; it simply corresponds to the correlation function of an ensemble of totally independent particles. The second term, on the other hand, depends on the position differences r r ; we now discuss its physical origin in terms of quantum interference. This second term comes from the contribution of the pair annihilation-creation terms for which we have, in relation (61), = and = . Let us show that “cutting them in half”, they look like interference terms. They include average values of operator products that, when = , can be written: k

k

k

= ΨBCS

k

=

k

k

k

k

Ψ (k ) Ψ (k)

ΨBCS (68)

where Ψ (k) is defined as: Ψ (k) =

k

k

ΨBCS

(69)

Relation (66) then becomes: 2 2 (r

;r

)

2

3

+

1

(k

6

k

) (r

r

) Ψ (k ) Ψ (k)

(70)

kk

8 The factor 1 3 appearing in (67) permits defining a pair wave function independent of the dimension of the physical system, in the limit of large where the sum over the k becomes an integral over d3 multiplied by ( 2 )3 . As a result, the square of that wave function is not homogeneous to the inverse of a volume, as is generally the case for a particle wave function, but to 1 6 . Actually, it should be considered as a two-particle wave function, product of a constant wave function 1 3 2 of the center of mass of the pair (assumed to have a zero momentum) and a wave function describing its relative position variable.

1907

COMPLEMENT CXVII



Figure 5: Diagrams symbolizing various contributions to the binary correlation function for opposite spins ( = ). The diagram of type I corresponds to a process where two particles with opposite spins are destroyed and then re-created in exactly the same individual states (forward scattering). The type II diagram corresponds to the case where two particles of the same pair are destroyed, and then two particles are created in the states of another pair (pair annihilation-creation process). Finally, type III diagram is a special case of the previous diagrams, and yields a negligible contribution. It is the type II diagram that introduces the spatial dependance of the correlation function.

The position dependent term in the correlation function can be interpreted as resulting from the interference between a process where two particles of the same pair (k k) are annihilated, and a second process where the two annihilated particles are from another pair (k k ); these two processes are schematized in Figure 6. According to (1) and (69), we have: Ψ (k) =

k

k

= 0;

k

=0

k

(71)

k =k

If k = k , the two states Ψ (k)

and Ψ (k )

are neither identical, nor orthogonal;

they actually have identical components on all the pairs of states different from (k k) and (k k ), but these two pairs have the same component only for states where the 4 populations are zero. We can then write: Ψ (k ) Ψ (k) =

k

k

k k

(72)

Inserting this result in (70) yields relation (66), whose spatial dependence does come from the interference between the two processes schematized in Figure 6. One could also use relation (68) of Complement AXVII to express the product Ψ (r )Ψ (r) as a function of a sum of pair annihilation operators. This is another way of understanding the role of pairs in the determination of the binary correlation function expression (66). 1908



FERMION PAIRING, BCS THEORY

Figure 6: Diagram symbolizing two pair annihilation processes from the initial state ΨBCS , leading respectively to the states Ψ (k) and Ψ (k ) . As these two states are not orthogonal, an interference effect occurs that is at the origin of the position dependent part of the binary correlation function.

2-c.

Properties of the pair wave function, coherence length

The pair wave function plays an important role in the BCS theory, and not only for the binary correlation functions, as we already mentioned. Its range determines the coherence length of the system, and its norm is also related to the number of quanta present in the field of condensed pairs (Complement AXVII ). .

Form of the pair wave function

As the functions k and k only depend on the modulus of k, we can apply the Fourier transform formulas for this case – see Appendix I, relation (52). Replacing the discrete summation by an integral, the pair wave function (67) becomes:

pair (r) =

1 2

3

d3

kr

k k

=

1 (2 )

∆ 2

1

d 0

(

sin 2

) + ∆2 (73)

Therefore,the pair wave function is real. Figure 1 gives a plot of k k as a function of the energy ; it presents a maximum in the vicinity of the surface of the Fermi sphere, with a peak whose width is of the order of ∆. More details on the role this pair wave function plays in the correlation functions are given in Complement AXVII . The Fourier transform of pair (r) is thus concentrated around values of the modulus of k of the order of the Fermi wave vector . Its spread is such that the corresponding variation of energy is of the order of the gap ∆, which leads to the condition: }2 2 2



that is



(74) 1909



COMPLEMENT CXVII

This wave function oscillates9 as a function of the position r, at a spatial frequency approximately equal to the wave vector at the Fermi surface. These oscillations are damped over a length of the order of pair defined by (the arbitrary factor 2 is introduced to match the usual definition found in the literature): pair

=

2

=

2}2 ∆

=

1 4

(75)



which is of the order of the distance between fermions, multiplied by the ratio ∆, very large compared to 1. Each fermion pair extends over a relatively large volume, leading to a strong overlap between pairs. In a superconductor, the length pair is called the “coherence length”10 ; it characterizes the capacity of the physical system to adapt to spatial constraints, and plays a role analogous to the “healing length” in systems of condensed bosons (Complement CXV , § 4-b). We have shown that the pairing significantly modifies the correlation functions for opposite spins: the particles now become correlated, whereas this was absolutely not the case for an ideal gas. It is a positive correlation, leading to a bunching tendency (the opposite of a Pauli exclusion); this explains the decrease in the interaction energy of the particles (§ 2-c- ). On the other hand, the pairing has no significant effect on the correlation function of particles having the same spin direction; it remains similar to the correlation function of an ideal gas, with an exchange hole whose width is of the order of 1 . Relation (75) indicates that the width of this exchange hole is much smaller than the distance pair over which the modifications of the correlation function for opposite spins occur. As mentioned before, the pair wave function has little in common with the initial wave function (r1 r2 ) used in Chapter XVII to build the -particle variational ket, since (3) shows that the Fourier transform of is k = k k . This is not surprising: when building a trial ket by the repetitive action of the same pair creation operator, we do not simply juxtapose those pairs. The antisymmetrization effects are dominant, and in each term of the expansion to the power of operator k k k in relation (1), the result is zero each time the same value of k is repeated (the square of a fermionic creation operator is zero; two fermions cannot occupy the same individual state). This is why the antisymmetrization effects completely remodel the pairs formed in the system of identical particles. .

Norm of the pair wave function

According to (67), the component on k of the ket function pair (r) is written as: k

pair

=

1 3 2

k k

pair

associated with the wave

(76)

9 The existence of this oscillation is confirmed by the fact that its integral over the entire space is 2 , which is indeed practically zero practically zero; this integral is proportional to 0 0 , i.e. 0 1 0 since 0 1. 10 The coherence length pair should not be confused with the (London) “penetration depth” that characterizes the magnetic field exclusion from a superconductor, and that depends on the charges of the particles.

1910

• (the functions pair

pair

k

=

and

FERMION PAIRING, BCS THEORY

are even). The square of this ket’s norm is therefore:

k

1

2

(77)

k k

3 k

Replacing the discrete summation by an integral, we get: pair

pair

=

3

1 3

d3

2

2 k k

=

∆2 4 3

( ) 0

d (

2

) + ∆2

(78)

where, in the second equality, we chose as the integral variable, introducing the density of states ( ) defined in (39). The function to be integrated converges since ( ) varies as ; this function is concentrated around = , spreading over a width ∆ . We can then, to a good approximation, replace ( ) by its value at the Fermi energy, and extend the lower bound of the integral to . As the integral of a Lorentz function is known (Appendix II, § 1-b): d 2

) + ∆2

(

=

(79)



we get: pair

pair

=



3

4

(80)

We showed in § 2-a- of Complement AXVII that the norm pair pair yields the average value of the field Φpair (R) , hence the value of the order parameter. In § 2-bof that same complement we showed that the square of the norm, equal to pair pair , yields the large distance behavior of the average value Φpair (R) Φpair (R ) ; the quantity pair pair is thus related11 to the field pair intensity (or, in other words, to the total number of quanta in that field). In addition, we just saw in the above § 2-b- that a peak in the momentum distribution signaled the presence of a Bose-Einstein condensation. Inserting (76) into (58) shows that this peak height is: (2) k

=

2 k k

=

3

k

2

(81)

pair

The total particle number associated with this peak is: (2) k k

=

3

pair

pair

=

4



(82)

Consequently, the square of the norm, pair pair , multiplied by the volume, is also the total particle number in the condensation peak found in § 2-b- , which confirms the previous interpretation. 11 The

pair field operators Φpair and Φpair do not exactly satisfy the boson commutation relations

(Complement AXVII ); the operator Φpair Φpair is thus not, stricto sensu, an operator giving the number of quanta in the pair field.

1911

COMPLEMENT CXVII

.



Link to the interaction energy

The energy term on the third line of (7) can be written, in the zero range potential approximation (29): k k

k

k

k

k

=

k

=

6

k

k

k

kk

k

k pair

(0)

2

(83)

This result yields an energy proportional to and to the probability that the two components of a pair, described by the wave function pair (r), are found at the same point; this makes sense since the pair size is very large compared to the interaction potential range. .

Non-diagonal order

When studying Bose-Einstein condensation for bosons, we showed in § 3 of Complement AXVI that the one-particle non-diagonal correlation function did not go to zero at large distance, when a significant fraction of the particles occupy the same individual state. Nevertheless, in § 2-a of the present complement, we found that this was not the case for a system of paired fermions, where the non-diagonal order goes to zero over a microscopic distance. This can be understood from a physical point of view, since, in the present case, there is no accumulation of particles in the same individual quantum state. On the other hand, we saw in § 2-b- that the center of mass of the pairs of particles is subject to a phenomenon of partial accumulation, reminiscent of a Bose-Einstein condensation. It is thus natural to examine the properties of the non-diagonal functions relative to pairs, and compute the “non-diagonal position” average value: Ψ (r)Ψ (r)Ψ (r )Ψ (r )

(84)

With two positions r on the right, and two positions r on the left, this expression is the exact transposition to two particles of the one-particle non-diagonal function: a couple of particles with opposite spins are annihilated at point r , and then recreated at a different point r. In a more general way, we are going to evaluate the 4-point average value:

Ψ (r1 )Ψ (r2 )Ψ (r3 )Ψ (r4 )

(85)

which, following the same computation steps as for the one-particle functions, can be written as:

Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 ) =

1

[(k r1 +k

6 kk k

1912

k

r2 )

(k

r2 +k

r1 )]

k

k

k

k

(86)



FERMION PAIRING, BCS THEORY

In this equality, the matrix elements have already been evaluated in § 2-b- . We are going to show that this expression can be written as: Ψ (r1 )Ψ (r2 )Ψ (r2 )Ψ (r1 ) =

1 (r1

; r1 )

1 (r2

; r2 ) +

pair (r1

r2 )

pair

(r1

r2 )

(87)

where the one-particle non-diagonal distribution 1 has been defined in relation (49) for the case = , and the pair wave function pair in relation (67). This equality is the same as relation (72) of Complement AXVII , but is now obtained by another method. Demonstration: To compute expression (86), we distinguish several cases, as already explained on several occasions: (I) Forward scattering terms; if the annihilation operators do not act on two states of the same pair ((k = k ), this term will be non-zero only if k = k and k = k , in which case it is written: 1

[ k (r 1

r1 )+k

(r2

r2 )]

6

2 k

2 k

=

1 (r1

r1

)

1 (r2

r2

)

(88)

kk

As we already mentioned, for example in Chapter XVII (§ D-1-a), the constraints on the summation indices can be ignored if the size of the system, , is macroscopic; the two summations then become independent. (II) Terms corresponding to the annihilation-creation of different pairs; if k = k and k = k but k = k (annihilation-creation of different pairs), we get the contribution: 1

[k (r1

r2 ) k

(r2

r1 )]

6

k k k

k

=

pair (r1

r2 )

pair (r2

r1 )

(89)

kk

As, in addition, the function right-hand side of (87). (III) If k = k and k = same pair), we get: 1

k (r1 +r2

6

r2

r1 )

pair (r)

is even, we indeed obtain the second term of the

k , and furthermore k = k

2

(annihilation-creation of the

(90)

k

k

This term is negligible as it is proportional to same, whereas the term (I) is proportional to

6 2

6

when all the positions are the

.

Let us now assume the positions can be grouped two by two: r1 and r2 are close to each other, as are r1 and r2 , but that the two groups’ positions are further away from each other. Under these conditions, the first term in 1 on the right-hand side of (87), which has a microscopic range in (r1 r1 ) and (r2 r2 ), becomes very small. We are then left with the product of the pair wave functions. It follows that the non-diagonal 1913

COMPLEMENT CXVII



correlation function is simply the product of pair wave functions calculated at the relative positions12 . In the particular case where r1 = r2 = r and r2 = r1 = r , we get the pair correlation function (84), which obeys: Ψ (r)Ψ (r)Ψ (r )Ψ (r )

r r

pair

(0)

2

(91)

This non-zero long distance limit signals the existence of a non-diagonal order for the twoparticle density operator. It comes, as was already the case for the pair wave function, from contribution (II), meaning from terms corresponding to the annihilation-creation of different pairs. This situation is reminiscent of what we encountered in the case of a condensed boson gas; but in the present case, the non-diagonal order concerns the pairs and not the individual particles. 3.

Physical discussion

In an ideal gas of fermions, and as we saw in Complement AXVI , there already exist strong correlations between the particles, simply due to their indistinguishability (a purely statistical effect). In the presence of attractive interactions, the BCS mechanism introduces additional correlations (dynamic correlations) that lower the total system energy. We are going to show that this decrease in energy comes from a slight imbalance between an increase in kinetic energy and a decrease of the potential energy, the latter one slightly surpassing the first one. For clarity, we shall discuss this using the short-range potential approximation (§ 1-c) where all the matrix elements of the interaction potential are replaced by a constant ( being positive); all the ∆k are then equal to the same gap ∆. 3-a.

Modification of the Fermi surface and phase locking

The energy written in (7) first includes a kinetic energy term, then a mean field term expressed in terms of the average particle number. If that average number is constant, this term is independent of the quantum state of the system, and hence not related to the BCS pairing mechanism. By contrast, the last term in (7), which is the one we optimized in the variational calculation, is far more interesting; we call it the “pairing term”, and use the words “pairing potential energy” or else “condensation energy” for its optimized value paired . As the k and the k are real, paired can be written as: 2 paired

=

k k

(92)

k

where the k and the k take the optimized values given in (27). We see that to get a large condensation energy, the sum k k k must take the largest possible value. 12 As mentioned in note 8, the center of mass variables are not included in (87) since we assumed all the pairs to be at rest. If this were not the case, the long-range factorization of the non-diagonal order would be expressed as the product of a function of the two variables (r1 r2 ) and (r1 + r2 ) 2 by the complex conjugate of that same function of the two variables (r1 r2 ) and (r1 + r2 ) 2 (in other words by the product of a function of r1 and r2 by the complex conjugate of that same function of the two variables r1 and r2 ).

1914

• .

FERMION PAIRING, BCS THEORY

Compromise between the kinetic and potential energies

In an ideal gas, the ground state is the one for which all the individual states of energy lower than the Fermi level (chemical potential = ) are occupied by one particle, and all the states above are totally empty. In the k-space, the particles each occupy a state inside one of the two Fermi spheres of radius (with = }2 2 2 ), one associated with the + spin state, and one associated with the spin state. Using the ket (1), such a state simply corresponds to the case where: k k

= 0 and = 1 and

k k

=1 =0

for for

(ideal gas)

(93)

Whatever the value of k, one or the other of the functions k and k will be zero, and so is the product k k ; the condensation energy of an interacting system remains equal to zero as long as the state of the system does not differ from the ideal gas state. A condensation energy can only be obtained via a deformation of the Fermi distribution. It is the attractive interactions that actually distort this distribution to create an overlap between regions where both functions k and k are different from zero, as can be seen in Figure 1. This allows minimizing the pairing energy (92), but involves a transfer of particles from the inside of the Fermi sphere to the outside, hence toward states of higher kinetic energy; this has a cost in terms of kinetic energy. The optimization we performed amounts to looking for the most favorable balance between the gain in potential energy and the cost in kinetic energy. The condensation energy is proportional to the square of the integral of the dashed line curve in Figure 1, which presents a maximum in the vicinity of the Fermi energy . The largest contributions come from energies close to , over a width of the order of a few ∆ - but the figure also shows that the contributions to the condensation energy spread relatively far from the Fermi surface (the curve only decreases as the inverse of the energy’s distance from its maximum). The Fermi surface, which was perfectly defined for an ideal gas, becomes blurred over a certain energy domain. The two-particle correlation function expresses in more detail this optimization of the attractive potential energy. Relation (64) shows that, for parallel spins, no significant change of the correlation function occurs, when compared to that of an ideal gas (for which an exchange hole is already present in the binary correlation function) – hence no significant change of the corresponding interaction energy. This lack of effect comes from the fact that the BCS wave function only pairs particles with opposite spins. On the other hand, when the spin directions are opposite, relation (66) shows that the probability of presence of two particles at a short distance from each other is increased; the larger the pair wave function modulus at the origin (r = r ), the higher this increase will be. It directly yields the gain in the attractive potential energy. In other words, the BCS gain in energy comes from the fact that, because of the interactions, the system changes its wave function to optimize its pairing potential energy. It develops correlations that go beyond that of an ideal gas; they are referred to as “dynamic correlations”, as opposed to the statistical correlations (coming solely from the indistinguishability of the particles, such as those studied in Complement AXVI ). This produces a deformation of the ideal gas Fermi distribution that, instead of presenting an abrupt transition between the occupied and empty states (perfectly well defined Fermi sphere), presents at its edge a more progressive transition region (blurred Fermi sphere). The system’s state vector then becomes a superposition of states where the particle 1915

COMPLEMENT CXVII



number in each pair of states (k k) fluctuates. The potential energy term that drives the BCS mechanism is the “pair annihilation-creation” term computed in § 3-a- of Complement BXVII , and that is schematized on its Figure 4. It includes a sum of terms containing non-diagonal matrix elements of the form: (k

k)

= 2;

k )

(k

=0

2

(k

k)

= 0;

(k

k )

=2

(94)

(the occupation numbers of all the other pairs remaining the same); between the ket and the bra, a pair (k k ) is replaced by another one (k k). The BCS energy gain is due to the summation over all these non-diagonal terms; they are sensitive to the coherence of the state vector between these two components (where the numbers of pairs fluctuate in a correlated way) and hence to their relative phase. .

Phase locking and cooperative effects

In the computation of the § 1-b, the minimization of the energy led us to choose the phases of all the k to be equal, and they simply disappeared from the following calculations. To discuss the physical process at work, it is useful to reintroduce them with their non-specified values before the optimization, as they appeared in (14); when all the matrix elements of the interaction potential are equal, the average value of the pairing energy is written: paired

=

sin

k

cos

k

sin

k

cos

k

2 (

k

k)

(95)

kk

We mentioned above that adding to the phase k of each k any given common phase , left all the results unchanged. The energy is invariant with respect to a symmetry of the wave function, the one that concerns the global phase of the k . It is often called the “ (1) symmetry”, referring to the (1) unitary symmetry group of rotations around a circle, isomorphic to the group of phase changes for a complex number. On the other hand, changing one by one the phases of the k , leads to an obvious reduction (in absolute value) of the pairing energy (95): in the complex plane, vectors that were perfectly aligned, now take different directions and the modulus of their sum is reduced. We saw that this term is at the origin of the gain in energy provided by the BCS mechanism; it is clearly linked to the acquisition of a common phase by all the pairs of individual states. This is an example of a phenomenon called in physics “spontaneous symmetry breaking”: whatever the phase of each k , which can take on any value, it is essential that it be the same for all, otherwise most of the gain in energy will be lost. In a similar way, in the ferromagnetic transition in a solid, the space direction along which the spins will align is not a priori fixed, but it is essential that it be the same13 for all the spins. Note finally the cooperative character of the energy gain obtained, which, mathematically, corresponds to the presence of a double summation over k and k . Starting 13 For an ensemble of spins parallel to any given direction, each spin is in a state where the relative phase between the components on + and is the same. For the BCS mechanism, it is the phase between the components where the occupation number of the couple of states k k is 0 or 2 that takes on a value independent of k. The corresponding energy lowering results from an interference effect between states where two pairs k and k have respective occupation numbers k = 2 k = 0 and k = 0 k = 2; therefore it cannot be directly expressed in terms of pair populations.

1916



FERMION PAIRING, BCS THEORY

from a perfectly phase locked situation, destroying the phase locking of a single pair leads to an energy loss proportional to the number of pairs that remained phase locked; the individual energy of a single pair is not what is at stake. On the other hand, starting from a situation where the phases of all the pairs are random, changing a single phase k barely modifies the average energy. We are in the presence of a cooperative effect: the greater the number of pairs that are already phase locked, the higher the tendency for a new pair to become phase locked; this tendency can be seen, in a way, as resulting from a mean field created in a cumulative way by all the other pairs. Here again we see the analogy with a ferromagnetic material where, the greater the number of spins already aligned, the higher the gain in energy with the alignment of a new spin. 3-b.

Gain in energy

We now compute the gain in energy resulting from the pairs’ formation. We first insert in (7) relations (27), which yield the optimal values of the k and the k , and use the definition (25) for ; we also take all the potential matrix elements to be equal to the same constant . Since we then have: 2 k

=

1 2

1

k k

=

1 2

1

=

1 2

(96)

and: 2

=

∆ 2

(97)

we get:

BCS

=

2

4

∆2 4

+ k

1 kk

1

2

( ) + ∆2

(

(98)

2

) + ∆2

The first term on the right-hand side is the mean field term (as before, we have neglected 1 compared to the total number of particles). The second one corresponds to the kinetic energy, and the third one, to the interaction between pairs: ∆2 4

kk

1

1

2

2

( ) + ∆2

(

=

) + ∆2

∆2 2

1 k

(99)

2

( ) + ∆2

where to get the second equality we have used relation (36) to eliminate the summation over k . Using again the definition (25) for , we get: = BCS

2

4

1

+ k

2

( ) + ∆2

2

( ) + ∆2

∆2 2 (100) 1917

COMPLEMENT CXVII



On the right-hand side of this expression, the first term, corresponding to the mean field, is of no particular interest. The second one accounts for the change in energy due to the dynamic correlations introduced by the interactions; it characterizes the BCS mechanism. We first check the convergence of its summation over k, for a fixed value of ∆. This is not the case for each term in the parenthesis, which tends toward a constant when , leading to a summation in 1 1 2 that diverges. We are now going to show that the divergent terms cancel each other. We can write: ∆2 2

2

( ) + ∆2

3 ∆4 + 8 ( )3

(101)

and thus: 1

∆2 2

2

( ) + ∆2

2

( ) + ∆2

3 ∆4 + 8 ( )3

(102)

We have just shown that the divergent terms in the infinite summation of k in relation (100) cancel each other between the kinetic and interaction terms; the function in the 3 summation goes to zero, for large values, as 1 ( ) 1 6 , which ensures the convergence and does not require the introduction of a cutoff frequency (apart from the one we had to introduce before to ensure a finite value for ∆). This was also the case for the total number of particles. We thus see that once we have introduced an upper boundary (cutoff) in the integral determining the gap ∆, all the other important physical quantities remain finite, without having to reimpose this cutoff frequency. The precise determination of the energy requires, in general, the computation of somewhat complex integrals. It will not be detailed here, but yields the result: = BCS

0

4

[

2

]

1 2 ∆ 2

(103)

(remember that is the density of individual states at the surface of the Fermi sphere, and is proportional to the volume = 3 ). Finally, the energy gain resulting from the BCS pairing is given by: =

1 2 ∆ 2

(104)

It can be shown that the values of that contribute most to the energy are those that are lower than or comparable to the gap ∆; the energy change linked to the pairing phenomenon is mainly located in the vicinity of the Fermi surface. This result is often interpreted by saying that, in an ensemble of ∆ pairs, each pair gains an energy of the order of ∆, which explains the ∆2 dependence of (104); while this image is simple, it has its shortcomings (see note 13). 3-c.

Non-perturbative character of the BCS theory

Generally, the most basic way to take the interactions into account is to use a first order perturbation theory (Chapter XI), where the energy correction is the average value, in the initial non-perturbed state, of the perturbation Hamiltonian. Applied here, the first order correction to the energy is obtained by inserting the values (93) into (7). 1918



FERMION PAIRING, BCS THEORY

The first term (kinetic energy) on the right-hand side of (7) is unchanged, and the third one remains zero since, according to (93), the product k k is always zero for any value of k. We are left with the second term, which produces a mean field correction. To the next perturbation order, the effect of the potential is to change the ground state by transferring pairs of particles, initially both inside the Fermi sphere, toward individual states whose momenta fall outside the sphere (all the while keeping the total momentum constant); this changes at the same time the average kinetic energy (which is increased) and the interaction potential energy. The computations become more and more complex as the perturbation order increases. And above all, it is clear that this approach to higher and higher perturbation orders cannot account for the existence of the gap obtained in (46): as the function ∆ ( ) has all its derivatives with respect to equal to zero at = 0, it can not be expanded as a series in . The BCS theory is a non-perturbative method that solves this difficulty. However, it is not an exact method since it is a variational method, but the chosen wave function is sufficiently well adapted to allow the inclusion of important physical effects, without using any perturbation theory. 4.

Excited states

Up to now, we only studied the ground state of the system of attractive fermions. As soon as the temperature is no longer zero, excited states of the system begin to be populated. In this last section, we shall give a survey of the BCS theory predictions concerning the excited states. A study of the BCS theory at non-zero temperature can be found in more specialized books. 4-a.

Bogolubov-Valatin transformation

Relations (E-3) and (E-4) of Chapter XVII define the Bogolubov-Valatin transformations of the creation and annihilation operators of spin 1 2 fermions. With the notation of the present complement where the spin directions are explicit, they become: k k

=

k

=

k

k

k k

+

k k

k

which, by conjugation, yield the definitions of the Hermitian conjugate operators k: k k

=

k

=

k

k

+

k

and

k

k

k

(105)

k

k

(106)

For each value of k, we get a general transformation of the four initial creation and annihilation operators or into four new operators and . We showed in Chapter XVII that these operators obey the usual anticommutation relations of fermions. We also saw in that chapter that the ket k defined in (2): k

=

k

+

k

k

k

0

(107) 1919



COMPLEMENT CXVII

is an eigenvector of the two operators

k

and

k

k

k

=

k k

k

k

k

=

k k

k

with a zero eigenvalue: k

+

k

0 =0 0 =0

(108)

It follows that the variational ket ΨBCS of the ground state, written in (1), is an eigenvector, for any value of k, of all the operators k and k , with a zero eigenvalue. It is therefore also an eigenket of all the operators k k and k k with a zero eigenvalue, which is a minimum eigenvalue for operators defined as positive or zero. Furthermore, we showed that the repeated action of the creation operators k and k permits obtaining other states, which are also eigenvectors of the operators k k and k k . We are going to show that these operators k k and k k can be interpreted as corresponding to the occupation numbers of the excitations present in the physical system. 4-b.

Broken pairs and excited pairs

Letting

2

=

k

k

act on (107), we get:

k

2

k

=

k

k

k

k

k

0

k

0 =

2 k

k

+

2 k

k

0 (109)

which is a ket normalized to unity, and obviously non-zero (as opposed to the one resulting from the action of k ). Similarly, if we consider the action of k , we get another non-zero ket: k

k

=

k

0

(110)

These two new normalized kets are orthogonal to the initial ket k , since they correspond to an occupation number equal to 1, whereas the occupation numbers of k are 0 and 2. In these states, a pair has been replaced by a single particle, not belonging to any pair; they are called “broken pair” states. As the squares of the operators k and k are zero, the repeated application of any of the two does not allow constructing new orthogonal states; however this can be achieved with their cross product. Letting k act on the ket (110), we get the ket k : k

=

k

k

k

=

k k

k

k

0

(111)

which is another normalized ket, and orthonormal, as can be easily checked, to k ; letting now the two operators k and k , in the inverse order, act on k , we get the same ket, k , within a change of sign. The components of the two states k and contain occupation numbers equal to 0 or 2; the ket k is called “excited pair” k state. To go from k to k , we simply switch k and k , change the sign of k , and finally take their complex conjugate (this is true for the general case, but in the BCS pairing case, as k and k take on real values, this last step becomes unnecessary). 4-c.

Stationarity of the energies

Let us now show that the energies of these new states are stationary with respect to the variational parameters. 1920

• .

FERMION PAIRING, BCS THEORY

Broken pair According to (109), the action of k

ΨBCS =

0

k

k

on the ground state ΨBCS leads to the ket:

ΨBCS

(112)

where ΨBCS is just the ket ΨBCS whose k pair component has been removed from the product: ΨBCS =

(113)

k k =k

The energy average value in the state (112) is the sum of three terms: = }2

(i) the kinetic energy

2

2

associated with the state

0

k

(ii) the energy associated with the state ΨBCS ; the computation of that energy is the same as for ΨBCS , including the pair interaction energy, except for the fact that one less pair is now involved in the calculation. This slightly modifies the value of ∆, and hence the optimal value of the parameters k ; however, since the relative variation of ∆ is inversely proportional to the number of particles, we shall ignore this slight modification. (iii) finally, the interaction energy between the particle in the individual state k 0 and the particles described by ΨBCS ; the pair structure of that state means that the only contributions are a direct term in: 1 2

0

k

+

k

k

k

k

k

k

k

=

0

k =k

= (where 1 2

k k

(114)

0

is the average particle number in the state ΨBCS ) and an exchange term in: k

k

k

k

k

k

+

k

k

k

k

=

k =k

k

k

k

(115)

k =k

Note again that, for an interaction that does not act on the spins, the exchange is only possible with particles having the same spin. With the short-range potential approximation (29): 0

=

k

=

the term (114) becomes equal to yields a constant 2.

(116) , the term (115) to

2, and their sum simply

The parameters defining the variational state (113) are the set of the k and the k for k = k (the dependence on the parameter, which characterized the broken pair, is no longer present). These parameters are the ones that make the energy stationary, since we neglected the slight variation of the gap linked to the disappearance of a pair; they play no role either in (i) or in (iii).

We have just confirmed that k ΨBCS renders the energy stationary (in the framework of the variational approximation we are using). A symmetry argument shows that this is obviously also the case for the state k ΨBCS . 1921

COMPLEMENT CXVII

.



Excited pair

In the stationary relations (26), the change of k into k simply amounts to exchanging the signs (the k and k are real); the components of the ket k are part of the stationary solutions we have discarded in writing (27). We thus confirm that the excited pair corresponds to a stationary energy; it is actually the highest possible energy for the pair of states (k k). 4-d.

Excitation energies

In the 4-dimensional state space associated with each pair of states (k k), the creation operators acting on the k permit building a new basis of 4 orthonormal states, whose average energies are stationary. They can be considered to be the approximate eigenvectors of the system Hamiltonian. We now compute the corresponding eigenvalues. In the case of the broken pair, the excited state ΨBCS does not contain the same number of particles as ΨBCS ; it does not make sense to directly compare their energies. To make a valid comparison, we must take into account the presence of a particle reservoir whose energy increases by (chemical potential) each time it absorbs a particle. In other words, we must evaluate the variations of the average value .

.

Broken pair We now show that the variation of the average value

the breaking of the pair is simply the energy

associated with

defined in (25).

To do so, we compute the variation of this average value when the state ΨBCS is replaced by expression (112). Several terms come into play: (i) The variation of the average value of the kinetic energy is the difference between the energy of a particle and that of the population k 2 of the pair (k k), with a kinetic energy of 2 ; this difference is therefore 1 2 k2 . (ii) As for the potential energy, the passage from ΨBCS to the ket ΨBCS changes the average particle number from 2 k 2 (initial population of the pair) to 0, so that the 2 variation of is = 2 k 2 . The mean field term 4 in (7) varies by 2 2, that is k . The following term in (7) is zero for a short-range potential. Finally, the breaking of a pair has an impact on the binding energy in the last term of (7); if we change the dummy summation index into , and into , the terms that will change correspond to the terms = and the terms = , which double each other. The breaking of the pair leads to an increase of energy equal to: 2

k k

k

k

= 2∆

k k

k 2

= ∆

1

(

)

2

=

∆2

(117)

where we have used (31) and (27). (iii) We saw that the unpaired particle has a potential energy 2 2. This energy must be added to the variation of the mean field term calculated above, to give a contribution equal to 1 2 k 2 2.

1922



FERMION PAIRING, BCS THEORY

We now sum all the previous contributions and add the variation of a term

1

2 =

2 k

which yields

. The total variation is then: 1

2

2

2 k

+

∆2

(118)

or else, taking (13) into account: =

1 2

=

+ ∆2

1 =

+

∆2

(119)

We find the expected result14 .

.

Excited pair

We now assume that in the product that yields ΨBCS , the ket k is replaced by the orthogonal ket k written in (111), and which describes an “excited pair”. We are going to show that the variation of the average value associated with that excitation is 2 , twice the excitation associated with the breaking of the pair. To show this, we must, here again, add several variations. The first one comes from the 2 2 kinetic energy and yields 2 , that is 2 1 2 k ) 2 – see relation (4). k k The second one is the mean field term introduced by the fact that the average value of 2 2 the total particle number varies by 2 , which leads to a potential energy k k 2 variation 1 2 k . We must also account for a variation of the pair binding energy, which comes from the sign change of the product k k , which doubles the term (117). Because of the change in the average particle number, the term in gives a 2 2 contribution of 2 , that is 2 1 2 k 2 . Finally, all the terms found k k for the broken pair are just doubled here and we indeed find 2 .

.

Spectrum of the elementary excitations

We now have the energy of the three excited states associated with each pair of states: the energy (doubly degenerate since it corresponds to both kets k k and ) and the energy 2 (non-degenerate). The value of these energies is given in k k (25). Figure 7 plots the dispersion relation of these elementary excitations (energy of these excitations as a function of their momentum). The solid line corresponds to the spectrum associated with the breaking of a pair k, k, during which a particle disappears, as in relations (109) and (110); as the spectrum associated with the excitation of a pair can be simply obtained by the multiplication by a factor 2, it is not plotted in the figure. The dashed lines plot those same energies for an ideal gas (no interaction) for which ∆ = 0. The interaction effect creates a “gap” ∆, which yields a minimum value for the excitation energy that otherwise can go to zero for an ideal gas. Upon the breaking of a pair, we saw that, on average, the system’s total population 2 changes by 1 2 k . The curve in Figure 7 has a different interpretation, depending on 14 The calculation would be the same for an ideal gas; in the particular case where ∆ = 0 and according to (25), we would get: = .

1923

COMPLEMENT CXVII



Figure 7: The solid line plots the variation as a function of

of the excitation energy

2

= ( ) + ∆2 , with = and = ~2 2 2 (for the sake of simplicity, we assumed , so as to neglect the difference between and in the expression of ). This energy presents a minimum equal to the gap ∆ when is equal the the Fermi wave vector = (wave vector for which = ). The dashed line curve is the same function, but for ∆ = 0 (zero gap), hence for an ideal gas. The BCS theory predicts that the energy associated with the breaking of a pair is , and the energy associated with the excitation of a pair is twice that amount, i.e. 2 . The minimum of the energy is therefore the minimum of energy that must be supplied to the system in its BCS ground state to produce one of the previously computed excitations. As explained in the text, the region on the left-hand side of the curve corresponds to “hole type” excitations (the excitation leads to a particle loss for the physical system) and the region on the right-hand side to “particle type” excitations (the physical system gains a particle). For clarity, the figure does not take into account the mean field effects; these would lower all the energies by the same negative quantity, and change the chemical potential into defined by (41), so that would become = . These effects slightly shift the curve plotting to the left.

which part of the curve we analyze. On the left-hand side (decreasing function), the solid 2 line curve nearly perfectly matches the dashed curve, meaning k is practically equal 2 to 1 (see Figure 1). In that case, 1 2 k 1: a particle disappears in the course of the excitation, which is said to be of the “hole type”. Its energy is the energy needed to push one particle towards the reservoir that fixes the chemical potential, diminished by its initial kinetic energy , and to which we must add the mean field correction15 ; the excitation energy is therefore , a value that corresponds to the left-hand side 2 of the curve. As for the right-hand side part (increasing function), the constant k is practically zero: the excitation adds a particle to the system, and is said to be of the “particle type”. Its energy is equal to , energy necessary to promote a particle from its energy to a state of kinetic energy (with, as before, a mean field correction 15 This

1924

correction changes the initial energy

into

2, and hence

into

.



FERMION PAIRING, BCS THEORY

that changes into ). Finally, for the central part of the curve, we have a “mixed” excitation, of both hole and particle type; it is the region of the spectrum where the solid line parts the most from the dashed line, and where the BCS mechanism, which creates the gap ∆ plays an essential role. From these four energy levels associated with each pair of states, quantum statistical mechanics allows obtaining a density operator describing the thermal equilibrium of the system at temperature , as well as all the various thermodynamic functions. The corresponding development will not be exposed in this complement. We shall simply mention that it allows extending the validity of a certain number of results obtained at = 0, by simply introducing a gap ∆( ) that depends on the temperature. At zero temperature, ∆(0) is still given by (46), but the gap decreases as increases, and goes to zero for a certain critical temperature . This cancelling of the gap corresponds to a phase transition: as a system of attractive fermions is cooled down, when it reaches a certain temperature the pair condensation phenomenon occurs, which leads to a number of physical consequences. For example, the system’s specific heat first takes on values higher than in the absence of transition, then abruptly (exponentially) goes to zero as 0. Conclusion In conclusion, the choice of a variational basis of paired states sheds new light on the behavior of an ensemble of attractive fermions. We focused on the case of weak attractions, corresponding to electrons in superconducting metals; in such a situation, (∆ ) 1 and relation (75) shows that the pair range pair is very large compared to the distance between fermions. The one-particle distribution shown in Figure 1 is then very similar to the step function obtained for an ideal gas at zero temperature, the step being nevertheless rounded off over an energy band of width equal to ∆. In other words, the BCS pairing only slightly modifies the Fermi sphere of a perfect gas. Studying the properties of the optimal state, we were able to expose a number of important phenomena: existence of spatial dynamic correlations, which explain the increase of the average attraction (negative) energy between fermions of opposite spins; phase locking accounting for the cooperative aspect of the pairing (increase of the attractive energy overcoming the increase of kinetic energy, hence leading to a decrease of the system total energy); existence of a pair wave function describing fermions of opposite spins, and reminiscent of the Cooper pairs (Complement DXVII ); appearance of a “gap” in the elementary excitation spectrum explaining the robustness of the system’s ground state. Another interesting limiting case concerns strong attractive interactions where the pair range becomes very small compared to the distance between particles. “Molecules” are then really formed with a binding energy large (in absolute value) compared to the Fermi energy . Saying that the size of the bound state is small compared to the distance between particles amounts to saying that its momentum distribution width is large compared to the Fermi wave vector . This means that due to the attractive potential, the occupation of the individual states are spread over a large number of different momenta, which dilutes the effects of Pauli exclusion principle; these effects thus become negligible whereas they are essential in the BCS case. Instead of being positive, the chemical potential is now negative, close to . 
Relations (24) then show that the (and hence the populations of the individual states k) always remain small, nevertheless 1925

COMPLEMENT CXVII



extending up to energies of the order of . In that special case, the pair wave function 2 = sin cos , practically coincides with pair , with its Fourier components the wave function having as Fourier components ( ) =tg 2 , and which was initially used for building the paired state. As the molecules contain two strongly bound fermions, they behave as composite bosons (Complement AXVII , § 3), which may undergo Bose-Einstein condensation. Paired states enable us to see the continuous passage from one limiting case (BCS situation with a weakly perturbed Fermi sphere) to the other (condensation of strongly bound “molecules”). A detailed discussion of this continuous passage and its physical consequences is given in §4-6 of reference [10] and in [11]. In this complement, we emphasized the physical interpretation of the results obtained in the detailed calculations we presented; this will give the reader the necessary base for studying the experimental aspects of superconductivity, which are not presented here. Among the many aspects of superconductivity that have not been studied in this complement, we can list: transport phenomena and the disappearance of the electrical resistance; behavior in the presence of a magnetic field (Meissner-Ochsenfeld effect); experimental study of the elementary excitation spectrum and gap measurements via different methods (tunnel effect, magnetic resonance); Josephson effect. The interested reader can refer, for example, to the books of M. Tinkham [12], of R.D. Parks [13], or of A.J. Leggett already quoted [8]. The book of Combescot and Shiau [14] presents a good overview of the four main theoretical methods for studying superconductivity, the BCS variational method discussed in this complement being the first of this list.

1926



COOPER PAIRS

Complement DXVII Cooper pairs

1

Cooper model

. . . . . . . . . . . . . . . . . . . . . . . . . . . 1927

2

State vector and Hamiltonian . . . . . . . . . . . . . . . . . . 1927

3

Solution of the eigenvalue equation . . . . . . . . . . . . . . . 1929

4

Calculation of the binding energy for a simple case . . . . . 1929

In this complement, we present the “Cooper model” which was a first step towards the complete BCS theory. It yields some results of that theory without having to deal with the difficulties inherent to an -body problem. With this simplified model, we study the properties of two attractive fermions whose wave function, in the momentum representation, is excluded from the Fermi sphere. The model will be presented in § 1, where we show the existence of a bound state, occurring only because of the existence of that sphere. Furthermore, we shall see that the mathematical expression for the corresponding binding energy is reminiscent of the expression for the gap value ∆ obtained in the BCS theory. 1.

Cooper model

Among a large ensemble of identical fermions, we focus our attention on two of them, supposed to attract each other, in order to study their two-body wave function and energy levels. The presence of all the other fermions is simply accounted for by a Fermi sphere that, because of the Pauli exclusion principle, requires the components of that wave function to be zero inside that sphere. Such an approach is obviously not very rigorous: isolating two fermions among a large number of other indistinguishable fermions does not make much sense. Furthermore, it is hard to imagine why two of them would interact via an attractive potential, whereas all the others determining the Fermi sphere would be without interaction. However, the mathematical form of the results obtained with this model presents interesting similarities with the variational method where all the fermions are treated equally; it is thus useful to study this model. 2.

State vector and Hamiltonian

Consider two attractive fermions in a singlet spin state =0 =

1 [1: 2

2:

1:

2: ]

=0 : (1)

The relative motion of their position variables is described by the orbital ket Ψorb , and their center of mass is described by a zero momentum ket ΦK=0 . Their state vector is: Ψ = ΦK=0

Ψorb

=0

(2) 1927

COMPLEMENT DXVII



The state Ψorb is characterized by a wave function Ψorb (r): Ψorb (r) = r Ψorb

(3)

where: r = r1

(4)

r2

is the difference between the positions of the two particles (relative position). As the singlet state is even with respect to the exchange of the two particles, their fermionic character requires the wave function Ψorb (r) to be even with respect to particle exchange, i.e. with respect to a sign change of r: Ψorb ( r) = Ψorb (r)

(5)

We assume the operator describing the attractive interaction between the two particles to be independent of the spin. As in § B-2 of Chapter VII, we separate, in the two-particle Hamiltonian, the motion of the center of mass from the relative motion, and assume the center of mass is at rest. We are then left with a Hamiltonian $H_{\rm rel}$ that only acts on the space of the relative motion variables, and can be written:
$$H_{\rm rel} \;=\; \frac{\hat{\mathbf p}^{\,2}}{2\mu} + V(\hat{\mathbf r}) \tag{6}$$
where $\hat{\mathbf r} = \hat{\mathbf r}_1 - \hat{\mathbf r}_2$ is the operator associated with the relative position of the two particles, $\mu = m/2$ is the reduced mass, and $\hat{\mathbf p}$ is the operator associated with the momentum of the relative motion, defined as a function of the momenta $\hat{\mathbf p}_1$ and $\hat{\mathbf p}_2$ of the two particles:
$$\hat{\mathbf p} \;=\; \frac{\hat{\mathbf p}_1 - \hat{\mathbf p}_2}{2} \tag{7}$$
As mentioned above, we assume the presence of an ensemble of non-interacting fermions, whose Fermi level is $E_F$. We must then solve the eigenvalue equation:
$$H_{\rm rel}\,|\Psi_{\rm orb}\rangle \;=\; (E + 2E_F)\,|\Psi_{\rm orb}\rangle \tag{8}$$
when $|\Psi_{\rm orb}\rangle$ does not have any component inside the Fermi sphere of radius $k_F$, related to the Fermi level by:
$$E_F \;=\; \frac{\hbar^2 k_F^{\,2}}{2m} \tag{9}$$
In relation (8), $E$ is the eigenenergy measured with respect to twice the Fermi level. It is indeed natural to take $2E_F$ as the energy reference: it is the minimal energy to be given to the two fermions under study, in the absence of interaction, for their wave function $|\Psi_{\rm orb}\rangle$ to have zero components inside the Fermi sphere. With this convention for the energy origin, $E$ simply reflects the effect of the attractive interaction.

3. Solution of the eigenvalue equation

We now expand $|\Psi_{\rm orb}\rangle$ on the normalized eigenvectors $|\mathbf k\rangle$ (plane waves) of the momentum $\hat{\mathbf p}$:
$$|\Psi_{\rm orb}\rangle \;=\; \sum_{\mathbf k} g_{\mathbf k}\,|\mathbf k\rangle \tag{10}$$
Projected onto $|\mathbf k\rangle$, the eigenvalue equation (8) reads:
$$2\,e_{k}\,g_{\mathbf k} \;+\; \sum_{\mathbf k'} \langle \mathbf k | V | \mathbf k' \rangle\, g_{\mathbf k'} \;=\; (E + 2E_F)\, g_{\mathbf k} \tag{11}$$
where we have set, as usual:
$$e_{k} \;=\; \frac{\hbar^2 k^2}{2m} \tag{12}$$
The absence of components of $|\Psi_{\rm orb}\rangle$ inside the Fermi sphere leads to the relation:
$$g_{\mathbf k} \;=\; 0 \qquad \text{if } k < k_F \tag{13}$$
while (11) becomes:
$$\bigl[\,E + 2\,(E_F - e_{k})\,\bigr]\, g_{\mathbf k} \;=\; \sum_{\mathbf k'} V_{\mathbf k\mathbf k'}\; g_{\mathbf k'} \tag{14}$$
The matrix elements of the interaction operator $V$ are noted $V_{\mathbf k\mathbf k'}$:
$$V_{\mathbf k\mathbf k'} \;=\; \langle \mathbf k | V | \mathbf k' \rangle \tag{15}$$

4. Calculation of the binding energy for a simple case

Let us further simplify the model and assume that the potential matrix elements are such that:
$$V_{\mathbf k\mathbf k'} \;=\; -V \quad \text{if } k_F \le k,\,k' \le k_F + \Delta k\,; \qquad V_{\mathbf k\mathbf k'} \;=\; 0 \quad \text{otherwise} \tag{16}$$
where $\Delta k$ defines a wave vector domain $k_F \le k \le k_F + \Delta k$; these matrix elements can therefore be factored. Note that the minus sign in front of the constant $V$ was introduced to ensure that, for our present attractive potential, the constant $V$ is positive. When $k \le k_F + \Delta k$, the summation on the right-hand side of (14) becomes a constant $C$ independent of $\mathbf k$, with:
$$C \;=\; -\,V \sum_{k_F \le k' \le k_F + \Delta k} g_{\mathbf k'} \tag{17}$$
whereas, if $k > k_F + \Delta k$, this summation is zero. The solution of this equation is simply:
$$g_{\mathbf k} \;=\; \frac{C}{E + 2\,(E_F - e_{k})} \quad \text{if } k_F \le k \le k_F + \Delta k\,; \qquad g_{\mathbf k} \;=\; 0 \quad \text{otherwise} \tag{18}$$

We must add the self-consistency condition obtained by inserting this solution in the definition (17) of $C$:
$$C \;=\; -\,V \sum_{k_F \le k \le k_F + \Delta k} \frac{C}{E + 2\,(E_F - e_{k})} \tag{19}$$
that is, changing the sign of the denominator:
$$\frac{1}{V} \;=\; \sum_{k_F \le k \le k_F + \Delta k} \frac{1}{2\,(e_{k} - E_F) - E} \tag{20}$$
This condition is also an implicit equation for obtaining the energy $E$. Assuming the system is enclosed in a cubic box with a very large edge length $L$, the discrete summation can be replaced by an integral, and we get:
$$\frac{1}{V} \;=\; \left(\frac{L}{2\pi}\right)^{3} \int_{k_F \le k \le k_F + \Delta k} \mathrm d^3 k \;\; \frac{1}{2\,(e_{k} - E_F) - E} \tag{21}$$

We now choose, as the integration variable, the kinetic energy:
$$e \;=\; \frac{\hbar^2 k^2}{2m} \tag{22}$$
As $\mathrm d e = (\hbar^2 k / m)\,\mathrm d k$, this integral now includes a density of states $\rho(e)$:
$$\rho(e) \;=\; \frac{L^3}{2\pi^2}\,\frac{m\,k}{\hbar^2} \tag{23}$$
or:
$$\rho(e) \;=\; \frac{L^3\, m^{3/2}\,\sqrt{2e}}{2\pi^2\,\hbar^3} \tag{24}$$
The implicit equation for $E$ then becomes:
$$\frac{1}{V} \;=\; \int_{0}^{e_\Delta} \mathrm d e \;\; \frac{\rho(e_F + e)}{2e - E} \tag{25}$$
where the upper bound $e_\Delta$ is defined by:
$$e_F + e_\Delta \;=\; \frac{\hbar^2\,(k_F + \Delta k)^2}{2m} \tag{26}$$
As we assumed $\Delta k \ll k_F$ (hence $e_\Delta \ll e_F$), we can replace¹ in (25) the density of states $\rho(e_F + e)$ by $\rho(e_F)$, no longer dependent on the variable $e$, and that we simply note $\rho_F$:
$$\frac{1}{V} \;=\; \rho_F \int_{0}^{e_\Delta} \frac{\mathrm d e}{2e - E} \tag{27}$$

We can then perform the integration, which yields:
$$\frac{1}{\rho_F\,V} \;=\; \frac{1}{2}\,\Bigl[\ln\bigl(2e - E\bigr)\Bigr]_{0}^{\,e_\Delta} \;=\; \frac{1}{2}\,\ln\frac{2\,e_\Delta - E}{-E} \tag{28}$$

¹ Replacing $e$ by $e_F$ in (23), one can easily compute an order of magnitude for the density of states at the Fermi level. We find $\rho(e_F) \sim N/e_F$, hence a value proportional to the average particle number $N$.


We then have:
$$\mathrm e^{\,2/\rho_F V} \;=\; \frac{2\,e_\Delta - E}{-E} \tag{29}$$
The solution of this equation for $E$ is:
$$E \;=\; \frac{-\,2\,e_\Delta}{\mathrm e^{\,2/\rho_F V} - 1} \tag{30}$$
which, when $\rho_F\,V \ll 1$, can be simplified to:
$$E \;\simeq\; -\,2\,e_\Delta\, \exp\Bigl[-\frac{2}{\rho_F\,V}\Bigr] \tag{31}$$

We obtain a negative energy $E$ (measured with respect to $2E_F$), as expected for a bound state (the wave function can be normalized). As we cannot make a series expansion of the function $\exp(-1/x)$ in the vicinity of $x = 0$ (all its derivatives are zero at $x = 0$), this energy cannot be expressed as a power series of the interaction potential $V$; consequently, it cannot be obtained by an ordinary perturbation calculation. Note also that the energy goes to zero (through negative values) if the density of states $\rho(e_F)$ goes to zero, i.e. if $k_F$ goes to zero: the existence of the bound state is therefore linked to the presence of the Fermi sphere, whose role is to introduce a non-zero density of states. If the Fermi sphere disappears, so does the bound state. We find results similar to those found in Complement CXVII using the BCS theory, and in particular to expression (46) of that complement, which yields the gap ∆. To obtain that expression, we had to introduce an upper bound $\hbar\omega_c$ for the variation (in absolute value) of the energies around the Fermi energy; this upper bound plays a role comparable to that played by the energy $e_\Delta$ we just introduced in (25). We simply have to assume that $\hbar\omega_c = e_\Delta$ for the two results to become quite similar: they then only differ by a factor 2 in the exponential (the sign difference simply comes from the fact that the gap ∆ was defined as a positive quantity, whereas a binding energy is negative). The interest of the Cooper model is to clearly highlight the essential role played by the density of states (in the vicinity of the Fermi level) in the creation of the gap ∆ in the BCS theory.
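The non-perturbative character of results (30)-(31) is easy to check numerically. The short Python sketch below is only an illustration: the dimensionless coupling g = ρ_F V and the cutoff e_Δ are arbitrary values, not taken from the text. It evaluates the exact expression (30) and its weak-coupling limit (31), showing how the binding energy vanishes faster than any power of g as the coupling goes to zero.

```python
import numpy as np

e_delta = 1.0                        # cutoff energy e_Delta (arbitrary units, illustrative)
couplings = [0.5, 0.3, 0.2, 0.1]     # dimensionless coupling g = rho_F * V (illustrative)

for g in couplings:
    E_exact = -2.0 * e_delta / (np.exp(2.0 / g) - 1.0)   # relation (30)
    E_weak = -2.0 * e_delta * np.exp(-2.0 / g)            # weak-coupling form (31)
    print(f"g = {g:4.2f}   E_exact = {E_exact: .3e}   E_weak = {E_weak: .3e}")
```

Dividing g by two suppresses the binding energy by several orders of magnitude, which is the numerical signature of the essential singularity of exp(−2/g) at g = 0 discussed above.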

Complement EXVII Condensed repulsive bosons

1   Variational state, energy . . . . . . . . . . . . . . . . . . . . . . . . . 1935
    1-a  Variational ket . . . . . . . . . . . . . . . . . . . . . . . . . . .  1935
    1-b  Total energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1936
    1-c  N′ ≪ N₀ approximation . . . . . . . . . . . . . . . . . . . . . . . .  1936
2   Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  1937
    2-a  Stationarity conditions . . . . . . . . . . . . . . . . . . . . . . .  1938
    2-b  Solution of the equations . . . . . . . . . . . . . . . . . . . . . .  1939
3   Properties of the ground state . . . . . . . . . . . . . . . . . . . . . .  1940
    3-a  Particle number, quantum depletion . . . . . . . . . . . . . . . . . . 1940
    3-b  Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1942
    3-c  Phase locking; comparison with the BCS mechanism . . . . . . . . . . . 1944
    3-d  Correlation functions . . . . . . . . . . . . . . . . . . . . . . . .  1946
4   Bogolubov operator method . . . . . . . . . . . . . . . . . . . . . . . . . 1950
    4-a  Variational space, restriction on the Hamiltonian . . . . . . . . . .  1951
    4-b  Bogolubov Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . .  1952
    4-c  Constructing a basis of excited states, quasi-particles . . . . . . .  1954

In this complement we study the properties of an ensemble of repulsively interacting bosons¹ undergoing Bose-Einstein condensation. We know that, for an ideal gas, a system of bosons in its ground state is totally condensed: a single individual quantum state, corresponding to the lowest energy, is occupied by all the particles. In the presence of short-range interactions, and for a sufficiently dilute system, one expects its properties to remain close to those of an ideal gas, and in particular that a large fraction of the particles still occupy the same individual quantum state. We consider this to be the case for the system under study, so that the population of one individual quantum state is much larger than all the others. We shall assume that each of the states obeys the periodic boundary conditions in a box of side length $L$ (Complement CXIV), and that the state with the large population² is the state $\mathbf k = 0$ (whose momentum $\mathbf p = \hbar\mathbf k$ and kinetic energy are zero). Consequently, if $N_0$ is the average value of the population of this zero momentum level, we assume that:
$$N_0 \;\gg\; n_{\mathbf k} \qquad (\text{for any } \mathbf k \neq 0) \tag{1}$$

¹ We will not consider the case of attractive interactions, as they lead to an unstable physical state – see § 4-b of Complement HXV.
² This hypothesis simplifies the writing of the equations, but is not essential; in the case where it is a state of non-zero momentum $\mathbf k_0$ that is highly populated, one can go to the reference frame where this momentum is zero. This amounts to adding, in the initial frame, $\mathbf k_0$ to all the wave vectors appearing in the equations.


where $n_{\mathbf k} = \langle a_{\mathbf k}^{\dagger}\, a_{\mathbf k} \rangle$ is the average number of particles occupying an individual state $\mathbf k \neq 0$. The total particle number is:
$$N \;=\; N_0 \;+\; \sum_{\mathbf k \neq 0} n_{\mathbf k} \tag{2}$$

We have already used, in Complement CXV, a first approximation to study the ground state of a condensed boson system: we described the state of the N-particle system as the product of N identical individual state vectors. This led to the Gross-Pitaevskii equation. This approach implies that only one single individual state $\mathbf k = 0$ is occupied ($N_0 = N$ and $n_{\mathbf k} = 0$ for any $\mathbf k \neq 0$), as for an ideal gas. This obviously cannot be exact: it is clear that the interactions introduce dynamic correlations between the particles, which cannot be accounted for by a state vector that is a simple product of individual kets (hence without correlations). Actually, the effect of the interaction potential on the ground state is to transfer at least a fraction of the particles³ from the state $\mathbf k = 0$ to the states $\mathbf k \neq 0$; a model involving only one individual state is necessarily limited to the case where the potential effect is very weak, and hence where the number of particles outside $\mathbf k = 0$ remains very small. In Complement EXV, we introduced another approximation, based on the Hartree-Fock method; it is more general than the previous one as it allows taking into account a non-zero temperature. However, it still implies that each particle moves in the mean field created by all the others, ignoring the dynamic correlations; its description of the ground state is no better than the one derived from the Gross-Pitaevskii equation. Furthermore, this latter method proved to be problematic for a boson system undergoing Bose-Einstein condensation: we noted in § 3-b of Complement GXV that, for a system of condensed bosons, the Hartree-Fock approximation predicts, at the grand canonical equilibrium, very large fluctuations of the number of condensed particles. In the real world, these fluctuations are strongly limited by the repulsion between the particles, which clearly indicates that the predictions of the Hartree-Fock approximation concerning fluctuations are non-physical. In the present complement, we shall try to address these two problems: on one hand, we shall take into account the dynamic correlations introduced by the interactions in the physical system; on the other, we shall not let the number of condensed particles fluctuate arbitrarily. We will use a variational method, choosing a variational state that takes into account the binary correlations between the particles, but does not introduce unrealistic fluctuations of the particle number. This variational state will be built with the help of a paired state, enabling us to directly use the results of Chapter XVII. We will add an extra component, to account for the Bose-Einstein condensation in the individual $\mathbf k = 0$ state. Obviously, this is still not an exact calculation, as it involves a variational approximation, but it allows describing a physical situation more complex than the simple Gross-Pitaevskii approximation. This approach also highlights the many analogies, but also the differences, between the pairing of condensed bosons and the pairing of fermions. In a general way, this complement illustrates how variational methods allow adjusting the correlations between particle pairs. When dealing with binary interactions, as in a standard Hamiltonian, these correlations determine the average value of the potential energy (Chapter XV, § C-5-b). The higher order correlations (ternary, etc.) may be present and play a role in the system, but they are not directly involved in the energy.

³ This phenomenon is traditionally called “quantum depletion” and will be discussed in more detail in § 3-a.


This is why using the paired states to optimize only the binary correlations can lead to fairly good results. We introduce in § 1 the paired variational state, depending on a certain number of parameters, and compute the corresponding average energy. In § 2, we shall search for the optimal values of these parameters that minimize this energy, using an approximation where $N_0$ is large enough that we can neglect the interactions between the particles in the $\mathbf k \neq 0$ individual states. In § 3 we study the physical properties of the state thus obtained, such as the number of particles that are not in the $\mathbf k = 0$ state, the energy, and the correlation functions. We shall then develop, in § 4, a different point of view, the Bogolubov operator method. We shall choose a larger variational space, and use the results of § E in Chapter XVII to get the Bogolubov Hamiltonian, which can be directly diagonalized. This will confirm a certain number of previously obtained results. The reader only interested in the Bogolubov operator method can go directly to this paragraph, which is fairly self-contained. The conclusion of this complement will sum up the results obtained and the limits of this approximation method.

1. Variational state, energy

We are now going to directly apply the results of Complement BXVII, for the choice of the variational ket as well as for the computation of its average energy.

1-a. Variational ket

The (normalized) variational ket is of the form:
$$|\Phi_B\rangle \;=\; |\varphi_0\rangle \otimes |\Psi_{\rm paired}\rangle \;=\; |\varphi_0\rangle \otimes \bigotimes_{\mathbf k \in D} |\psi_{\mathbf k}\rangle \tag{3}$$
where the subscript $B$ refers to the name Bogolubov. In this expression, $|\Psi_{\rm paired}\rangle$ is the paired state for spinless particles written in (B-8) of Chapter XVII, which is a tensor product of the normalized states (C-13):
$$|\psi_{\mathbf k}\rangle \;=\; \frac{1}{\cosh\theta_{\mathbf k}}\; \exp\Bigl[\,x_{\mathbf k}\, a_{\mathbf k}^{\dagger}\, a_{-\mathbf k}^{\dagger}\,\Bigr]\,|0\rangle \tag{4}$$
with:
$$x_{\mathbf k} \;=\; -\,\mathrm e^{2i\zeta_{\mathbf k}}\; \tanh\theta_{\mathbf k} \qquad (\theta_{\mathbf k} \ge 0) \tag{5}$$
The domain $D$ of the tensor product in (3) is half the $\mathbf k$-space, which prevents (as we saw in Chapter XVII, § B-2-a) the double appearance of each pair of states $\mathbf k, -\mathbf k$; the origin $\mathbf k = 0$ is excluded from $D$. As for $|\varphi_0\rangle$, it is the coherent state already used in Complement BXVII, relation (44):
$$|\varphi_0\rangle \;=\; \mathrm e^{-|\alpha_0|^2/2}\; \sum_{n_0=0}^{\infty} \frac{(\alpha_0)^{n_0}}{\sqrt{n_0!}}\;|n_0\rangle \tag{6}$$
This state depends on a complex parameter $\alpha_0$, characterized by its modulus $|\alpha_0|$ and its phase $\zeta_0$:
$$\alpha_0 \;=\; |\alpha_0|\;\mathrm e^{\,i\zeta_0} \tag{7}$$
It is the normalized eigenvector of the operator $a_0$ with eigenvalue $\alpha_0$:
$$a_0\,|\varphi_0\rangle \;=\; \alpha_0\,|\varphi_0\rangle \tag{8}$$
The average particle number in the state $\mathbf k = 0$ is then:
$$N_0 \;=\; \langle\varphi_0|\,a_0^{\dagger}\,a_0\,|\varphi_0\rangle \;=\; |\alpha_0|^2 \tag{9}$$
The width of the corresponding distribution is $\sqrt{N_0}$ (Complement GV), hence negligible compared to $N_0$ (this number is supposed to be large). The variational variables contained in the trial ket (3) are thus the set of the $\theta_{\mathbf k}$ and $\zeta_{\mathbf k}$, as well as $|\alpha_0|$ and $\zeta_0$.

1-b. Total energy

Expression (61) of Complement BXVII yields the total energy in the form:
$$\begin{aligned}
\overline H \;=\; & \sum_{\mathbf k\neq 0} e_k\,\sinh^2\theta_{\mathbf k} \;+\; \frac{V_0}{2}\,N_0^{\,2}\\
& +\; N_0 \sum_{\mathbf k\neq 0}\Bigl[\,(V_0+V_{\mathbf k})\,\sinh^2\theta_{\mathbf k}\;-\;V_{\mathbf k}\,\sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}\,\cos 2(\zeta_0-\zeta_{\mathbf k})\,\Bigr]\\
& +\; \frac{1}{2}\sum_{\mathbf k,\mathbf k'\neq 0} V_{\mathbf k-\mathbf k'}\Bigl[\,\sinh^2\theta_{\mathbf k}\,\sinh^2\theta_{\mathbf k'} \;+\; \sinh\theta_{\mathbf k}\,\sinh\theta_{\mathbf k'}\,\cosh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k'}\,\cos 2(\zeta_{\mathbf k}-\zeta_{\mathbf k'})\,\Bigr]
\end{aligned}\tag{10}$$
where the matrix elements $V_{\mathbf k}$ of the particle interaction potential are defined, as in Chapter XVII, by:
$$V_{\mathbf k} \;=\; \frac{1}{L^3}\int \mathrm d^3 r\;\; \mathrm e^{-i\mathbf k\cdot\mathbf r}\; W_2(\mathbf r) \tag{11}$$
The term on the second line of (10) corresponds to the momentum exchanges between the $\mathbf k \neq 0$ particles and the $\mathbf k = 0$ condensate, as well as the pair annihilation-creation processes originating from the condensate. The terms in the last line, with a double summation over $\mathbf k$ and $\mathbf k'$, correspond to interaction effects between $\mathbf k \neq 0$ particles.

1-c. $N' \ll N_0$ approximation

As already pointed out in the introduction, for an ideal gas in its ground state, only one individual state is occupied, corresponding to the lowest energy; in that case, the average total particle number is equal to $N_0$, and all the populations $n_{\mathbf k}$ of the other $\mathbf k$ states are zero. We are going to assume that the system we study is a dilute gas where the interaction effects are limited, so that $N_0$ remains very large compared to the sum $N'$ of all the populations $n_{\mathbf k}$:
$$N' \;=\; \sum_{\mathbf k \neq 0} n_{\mathbf k} \;\ll\; N_0 \tag{12}$$




This hypothesis is more constraining than the one initially proposed in (1), since now the population $N_0$ must largely exceed the sum of all the other populations. Nevertheless it allows a simplification of the following computations while highlighting a certain number of general physical ideas. Under these conditions, the interactions between particles in the $\mathbf k \neq 0$ states and particles in the $\mathbf k = 0$ condensate are dominant compared to the interactions between particles that are both in $\mathbf k \neq 0$ states. The interaction term on the second line of (10), proportional to $N_0$, is therefore much larger than the term on the last line, which does not contain $N_0$. This is why we use the approximate average value:
$$\overline H \;\simeq\; \sum_{\mathbf k\neq 0} e_k\,\sinh^2\theta_{\mathbf k} \;+\; \frac{V_0}{2}\,N_0^{\,2} \;+\; N_0\sum_{\mathbf k\neq 0}\Bigl[\,(V_0+V_{\mathbf k})\,\sinh^2\theta_{\mathbf k}\;-\;V_{\mathbf k}\,\sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}\,\cos 2(\zeta_0-\zeta_{\mathbf k})\,\Bigr] \tag{13}$$

We have yet to determine the optimal values of the variables appearing in (13) by minimizing this energy average value with respect to each of them.

2. Optimization

The variational state $|\Phi_B\rangle$ depends on the variables $|\alpha_0|$ and $\zeta_0$ associated with the individual state $\mathbf k = 0$ (condensate), as well as on the angles $\theta_{\mathbf k}$ and the phases $\zeta_{\mathbf k}$ associated with all the other $\mathbf k \neq 0$ states. On the other hand, $N'$ is not a variational variable, but a function of the previous variables determined by relation (53) of Complement BXVII:
$$N' \;=\; \sum_{\mathbf k \neq 0} \langle a_{\mathbf k}^{\dagger}\, a_{\mathbf k}\rangle \;=\; \sum_{\mathbf k \neq 0} \sinh^2\theta_{\mathbf k} \tag{14}$$

As in Complement BXVII, we introduce a Lagrange multiplier $\mu$ (chemical potential, see Appendix VI) to fix the average total particle number; we thus impose the stationarity of the difference of two average values:
$$\overline F \;=\; \overline H \;-\; \mu\,\overline N \tag{15}$$
where $\overline N$ is the average total particle number in the variational state:
$$\overline N \;=\; \langle a_0^{\dagger}\,a_0\rangle \;+\; \sum_{\mathbf k \neq 0} \langle a_{\mathbf k}^{\dagger}\,a_{\mathbf k}\rangle \;=\; N_0 \;+\; N' \tag{16}$$
The function to be minimized is therefore:
$$\overline F \;=\; \frac{V_0}{2}\,(N_0 + N')^{2} \;-\; \mu\,N_0 \;+\; \sum_{\mathbf k \neq 0} (e_k - \mu)\,\sinh^2\theta_{\mathbf k} \;+\; N_0 \sum_{\mathbf k \neq 0} W_{\mathbf k} \tag{17}$$

with:
$$W_{\mathbf k} \;=\; V_{\mathbf k}\,\Bigl[\,\sinh^2\theta_{\mathbf k} \;-\; \sinh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k}\,\cos 2(\zeta_0 - \zeta_{\mathbf k})\,\Bigr] \tag{18}$$

2-a. Stationarity conditions

The function $\overline F$ must be made stationary with respect to all the variables. We shall start with the phases, then the parameters $\theta_{\mathbf k}$, and finally $N_0$.

α. Stationarity with respect to the phases: phase locking

The phases only intervene in the $W_{\mathbf k}$, as phase differences $\zeta_0 - \zeta_{\mathbf k}$. Since we have a repulsive potential, we assume $V_{\mathbf k}$ is positive; furthermore, as the variable $\theta_{\mathbf k}$ is always positive according to its definition in Chapter XVII, the product $\sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}$ is also always positive. Expression (18) shows that, whatever the value of $\theta_{\mathbf k}$, the minimization of the function $\overline F$ with respect to the phases $\zeta_0$ and $\zeta_{\mathbf k}$ requires the cosine to be equal to 1, that is:
$$\zeta_{\mathbf k} \;=\; \zeta_0 \qquad \text{for any } \mathbf k \tag{19}$$
Consequently, the phases used to build the paired states must all be equal to the phase defining the coherent state associated with $\mathbf k = 0$. We call this equality the “phase locking condition”.

β. Stationarity with respect to the $\theta_{\mathbf k}$

The stationarity of $\overline F$ with respect to each parameter $\theta_{\mathbf k}$ implies that, for any $\mathbf k$:
$$0 \;=\; V_0\,(N_0 + N')\,\frac{\partial}{\partial\theta_{\mathbf k}}\sinh^2\theta_{\mathbf k} \;+\; 2\,(e_k - \mu)\,\sinh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k} \;+\; N_0\,\frac{\partial W_{\mathbf k}}{\partial\theta_{\mathbf k}} \tag{20}$$
where the derivative of $W_{\mathbf k}$ is taken at the phase values that satisfy relation (19). Grouping on a first line the terms in $\sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}$ (including those coming from the derivative of $\sinh^2\theta_{\mathbf k}$), and on a second, those coming from the derivative of $\sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}$, we get:
$$0 \;=\; \Bigl[\,V_0\,(N_0 + N') - \mu + e_k + N_0 V_{\mathbf k}\,\Bigr]\; 2\,\sinh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k} \;-\; N_0 V_{\mathbf k}\,\bigl[\cosh^2\theta_{\mathbf k} + \sinh^2\theta_{\mathbf k}\bigr] \tag{21}$$

Relation (21) then becomes:
$$0 \;=\; \Bigl[\,e_k - \mu + V_0\,(N_0 + N') + N_0 V_{\mathbf k}\,\Bigr]\,\sinh 2\theta_{\mathbf k} \;-\; N_0 V_{\mathbf k}\,\cosh 2\theta_{\mathbf k} \tag{22}$$

that is:
$$\tanh 2\theta_{\mathbf k} \;=\; \frac{N_0 V_{\mathbf k}}{\,e_k - \mu + V_0\,(N_0 + N') + N_0 V_{\mathbf k}\,} \tag{23}$$

γ. Stationarity with respect to $N_0$

We now write the stationarity of $\overline F$ with respect to $N_0$. Taking into account relations (17) and (18), as well as the phase locking condition (19), we get:
$$0 \;=\; V_0\,(N_0 + N') \;-\; \mu \;+\; \sum_{\mathbf k \neq 0} V_{\mathbf k}\,\Bigl[\,\sinh^2\theta_{\mathbf k} - \sinh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k}\,\Bigr] \tag{24}$$

This result shows that the chemical potential is equal to:
$$\mu \;=\; V_0\,(N_0 + N') \;+\; \mu_1 \tag{25}$$
which is the sum of a mean field term $V_0\,(N_0 + N')$ created by all the particles, and another term $\mu_1$:
$$\mu_1 \;=\; \sum_{\mathbf k \neq 0} V_{\mathbf k}\,\Bigl[\,\sinh^2\theta_{\mathbf k} - \sinh\theta_{\mathbf k}\,\cosh\theta_{\mathbf k}\,\Bigr] \tag{26}$$
This last term is also the sum of two terms of different signs: a positive contribution coming from momentum transfer processes, leading to an increase of the repulsion energy due to the boson bunching effect; and a negative contribution due to the creation or annihilation of pairs from the condensate $\mathbf k = 0$ (Figure 7 of Complement BXVII), expressing the reduction of that energy due to the dynamic correlations induced by the interactions. Relation (23) then becomes:
$$\tanh 2\theta_{\mathbf k} \;=\; \frac{N_0 V_{\mathbf k}}{\,\tilde e_k + N_0 V_{\mathbf k}\,} \tag{27}$$
with:
$$\tilde e_k \;=\; e_k - \mu_1 \tag{28}$$

2-b. Solution of the equations

The ground state we are looking for depends on two parameters that are externally fixed: the volume $L^3$ of the physical system, and the chemical potential $\mu$ that controls the total number of particles. We must determine the variables $\theta_{\mathbf k}$ from (27), as well as $N_0$ from (24); this last relation includes $N'$, which is not an independent variable since it is determined by (14). We have a set of non-linear equations whose solution is not obvious, a priori: relation (27) determines the $\theta_{\mathbf k}(N_0)$, and hence $N'(N_0)$, as functions of $N_0$. But $N_0$ itself is determined as a function of $\mu$ and the variables $\theta_{\mathbf k}$ (directly, and indirectly through $N'$) by the stationarity condition (24). Inserting the $\theta_{\mathbf k}(N_0)$ in this relation, we get an implicit equation for $N_0$, reminiscent of the implicit equation for the gap ∆ in the BCS theory (Complement CXVII, § 1-c). A first approach for solving this implicit equation is to proceed by successive iterations, as in the Hartree-Fock method. We start from an approximate, reasonable value of $N_0$, such as the value obtained by assuming that $N'$ and $\mu_1$ are both zero. Using (27), we then get a first approximation for the $\theta_{\mathbf k}$ and for $N'$, that can be inserted in (24) to get a new value for $N_0$. Iterating the process, one can expect, as for the Hartree-Fock non-linear equations, a convergence after a certain number of cycles. Another approach is to not arbitrarily fix the chemical potential, but rather deduce it from the computation. We then start from an arbitrary $N_0$ value, yielding the values of the angles $\theta_{\mathbf k}$, then the value of $N'$ using (14); this fixes the total particle number $N = N_0 + N'$, and relations (25)-(26) yield the chemical potential. We shall use this simpler approach in what follows.
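A minimal numerical sketch of this second approach is given below, under the simplified interaction model introduced in § 3 (a constant matrix element $V_0$ up to a cutoff $k_c$), with the approximation $\mu_1 = 0$ and anticipating the explicit form of $\sinh^2\theta_{\mathbf k}$ derived in (30)-(32). All parameter values are illustrative, not taken from the text.

```python
import numpy as np

# Illustrative parameters (arbitrary units, not taken from the text)
hbar, m, L = 1.0, 1.0, 30.0
V0, kc = 2.0e-6, 5.0          # constant matrix element V_0 of model (29) and its cutoff k_c
N0 = 1.0e4                    # starting value chosen for the condensate population

k = np.linspace(1e-3, kc, 2000)          # radial grid of |k| values
dk = k[1] - k[0]
e_k = hbar**2 * k**2 / (2*m)             # kinetic energies e_k
dos = L**3 * k**2 / (2*np.pi**2)         # mode density: 4*pi*k^2 * (L/2pi)^3

v0 = N0 * V0                             # condensate interaction energy N0*V0
tanh2t = v0 / (e_k + v0)                 # relation (27) with mu_1 = 0
sinh2 = 0.5 * (1.0/np.sqrt(1.0 - tanh2t**2) - 1.0)   # sinh^2(theta_k)

N_prime = np.sum(dos * sinh2) * dk       # N' = sum over k != 0 of sinh^2(theta_k), relation (14)
N_total = N0 + N_prime
mu = V0 * N_total                        # chemical potential, relation (25) with mu_1 = 0

print(f"N' = {N_prime:.1f}   depletion N'/N = {N_prime/N_total:.3%}   mu = {mu:.3e}")
```

Starting from $N_0$ rather than from $\mu$ avoids any iteration: the chemical potential is simply read off at the end, which is exactly the "simpler approach" retained in the text.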



3. Properties of the ground state

We start by computing the total particle number. To highlight the general ideas while dealing with equations as simple as possible, we shall use a model where the potential matrix elements $V_{\mathbf k}$ are all equal to the same constant $V_0$, or else equal to zero:
$$V_{\mathbf k} \;=\; V_0 \quad \text{if } k \le k_c\,; \qquad V_{\mathbf k} \;=\; 0 \quad \text{if } k > k_c \tag{29}$$
where the “cutoff value” $k_c$ characterizes the potential range $b$ ($k_c \sim 1/b$). To further simplify, we shall consider, in each calculation, the case where $\mu_1 = 0$, i.e. where $\tilde e_k$ is the same as $e_k$.

3-a. Particle number, quantum depletion

We use relation (14) to compute , the average number of particles in the individual states k = 0. To get sinh2 k , let us first compute cosh2 k using: cosh2

k

cosh2 2 k cosh 2 k sinh2 2

=

2

+

=

(

= k

0 k

+2

1

1 tanh2 2

k

(30)

0 k)

Since we have: 2sinh2

k

= cosh2

k

+ sinh2

1 = cosh2

k

k

1

(31)

we can write: sinh2

k

=

1 2

+ (

0 k

+2

0 k)

1

(32)

1

(33)

Inserting this relation in (14) we get: =

1 2

+ (

k=0

0 k

+2

0 k)

Let us see what becomes of this expression in the simple model where the are equal to the ( is supposed to be negligible). Replacing the summation in (33) by an integral, we get: 3

=

16

3

3

d

}2 2

}2 2

2

2

}2 2

+ 2

0 k

+2

1

(34)

0 k

Using the matrix elements of the simplified potential, relation (29), the function to be integrated only depends on the modulus of k, and goes to zero if ; this means 1940

• that, using spherical coordinates, the integral over the integral variable s: }2

s=

2

k=

CONDENSED REPULSIVE BOSONS

only goes from 0 to

. We define

(35)

k

0 0

is the “healing length” introduced4 in § 4-b of Complement CXV :

where

}2

=

2

Noting

(36)

0 0

the upper bound of s coming from the upper bound

=

of : (37)

we can write: 3

=

4

2

2

3 2 0 0 }2

2

d

0

2

2

+1

(

2

+ 2)

1

(38)

The integral in (38) is still convergent if goes to infinity since, when , one can make a limited expansion in powers of the infinitely small = 1 2 and write: 2 2

(

+1 2

+ 2)

=

1+1 1+2

2 2

=1+

1 + 2 4

(39)

The integral also converges at the origin (the function to be integrated diverges as 1 , but the differential element is 2 d , which eliminates the divergence). This integral can be readily calculated, and for an infinite value of (very short-range potential) is equal to 2 3. We then get: 3

=

3

2

3 2 0 0 }2

(40)

We find that is proportional to the volume and to the product 0 0 to the power 3 2. When the interaction potential 0 is zero, we confirm that all the particles are in the individual state k = 0. As 0 starts increasing, the “non-condensed fraction” ( 0 + ) varies, at the beginning, proportionally to the power 3 2 of 0 . We have found that the effect of the interaction potential is to transfer a certain number of particles from the individual state k = 0 towards the k = 0 states. This effect is often called “quantum depletion”. It has nothing to do with a thermal excitation effect that would bring some particles from their ground state towards excited states, as a result of the coupling with a thermal reservoir at a non-zero temperature. The calculations we are performing in this complement concern the ground state, and we assume the temperature is rigorously zero. 4 In Complement C as a function of the parameter associated XV , we defined in (61) a constant with an interaction potential in (r). Such a potential corresponds to 0 = (where is the }2 2 volume), and relation (36) is indeed equivalent to = 0.

1941



COMPLEMENT EXVII

Comment: Note however that the result (40) was established with the hypothesis therefore only valid if: }2

0

2

(

0.

1 3

0)

It is

(41)

If is the range of the potential, and its order of magnitude, relation (11) shows that 3 3 the order of magnitude of the matrix elements 0 is ; the previous condition is then written: }2

1 3

3

2

(42)

3

0

The result is thus valid if remains small compared to the kinetic energy of a particle localized within the potential range , multiplied by the ratio between the average particle distance in the state k = 0 and . This requires the potential range to be sufficiently small. 3-b.

Energy

We now compute the energy, taking successively all the different terms of (13) into account. .

Kinetic energy

The first contribution comes from the kinetic energy, which according to (32) is written: sinh2

=

k

=

k=0

1 2

+ (

k=0

0 k

+2

0 k)

1

(43)

This term reminds us of the one we encountered in the computation of in (33), but the presence of the factor in the summation changes its properties. In the simplified potential model where the k are given by (29), and if we furthermore assume that = 0, the change of variable (35) leads to: 3

=

2 }2

2

4

3 2

[

5 2 0 0] 0

According to (39), when 2

+1

(

2

4 2

+ 2)

d

1 =

2

+1

(

2

4 2

+ 2)

1

, the function to be integrated behaves as: 4

1+

1 + 2 4

1 =

1 +0 2

1 2

and tends towards a constant. Consequently its integral over d is divergent if infinite; if is large but finite, the integral value depends linearly on the choice of 1942

(44)

(45) is .

• .

CONDENSED REPULSIVE BOSONS

Interaction with the condensate 2

In relation (13), the mean field term in the energy 0 ( 0 + ) 2 is known since has been obtained previously – see relation (38). We do not need to compute specifically its contribution to the average energy. The second term in the potential energy is proportional to 0 , and corresponds to the interactions between atoms outside the condensate ( k = 0 individual states) and inside the condensate (population of the k = 0 state); this term contains the sum of the k defined in (18). In order to evaluate that term, we need to compute the product sinh k cosh k : sinh

k

cosh

k

1 sinh2 2

=

sinh2 2 k cosh2 2 k sinh2 2

1 2

tanh2 2 k 1 tanh2 2

1 2

=

=

k

k

(46) k

or else, according to (27): sinh

k

cosh

k

=

0 k

2

(

+2

(47)

0 k)

Taking into account the phase locking condition (19) for the optimal variational state, the k defined in (18) are equal to: k

=

k

0

=

k

0

=

k=0

k

sinh

cosh

k

k

= We obtain the contribution

sinh2

2

(

+2

1

0 k)

k

(

k=0

+2

1

0 k)

(49)

For the simple model (29) already used before, and if 3 0

=

4

2

2 0 0

(48)

to the energy:

0

0

2

k

3 2 0 0 }2

2

= 0, this results becomes:

2

d

2

0

1

+2

(50)

The function to be integrated behaves, at infinity, as: 2 2

+2

1

2

1

1 2

+

1 =

1+0

which means the integral is not convergent when

1

(51)

2

, but depends linearly on

. 1943



COMPLEMENT EXVII

.

Ground state total energy With the same approximation +

0

1 2

=

(

+2

= 0, the sum of (43) and (49) yields:

0 k)

(52)

0 k

k=0

or else, using again the simplified model (29) for the interaction potential and adding (44) and (50): +

0

2

4

2

2

1

=

3 2 5 2 0 0)

(

}2

3

(

2

+ 2)

4

2

(53)

0

Relations (45) and (51) show that the function to be integrated tends towards 1 2 when ; we have again a divergent integral when (or ). Consequently, its value is a linear function of the chosen cutoff frequency . However, the fact that the limit of the function to be integrated is negative indicates that, for large values of , the decrease in potential energy overcomes the increase in kinetic energy. The ground state total energy ground is the sum of all the energies we just computed, including the mean field term: ground

where

=

+

0

2 0

(

0

+

2

) +

+

0

(54)

is given by (53).

Remark: The divergences appearing when the cutoff goes to infinity in our calculation of the energy are not fundamental. They occur because one assumes that the matrix elements $V_{\mathbf k}$ of the potential written in (29) remain constant at large $k$, whereas they tend to zero for a realistic potential. It is indeed possible to perform a more careful treatment of the potential, such as that mentioned in § 4.2 of Reference [15], and to obtain a finite result.

3-c.

Phase locking; comparison with the BCS mechanism

We just saw in § 2-a- how all the phases k had to become equal to the phase associated with the state k = 0 in order to minimize the repulsive energy between any particle in the k = 0 states and any particle in the k = 0 state. Fixing the phase differences to be zero is what we called the “phase locking condition”. This situation reminds us of the “symmetry breaking” of the BCS mechanism, where a common locking of all the relative phases of all the pairs (k k) enabled the building of a gap ∆, through a collective effect. For bosons, the equivalent of the collective mean field created by the pairs is the one created by the condensate of particles in the k = 0 state. This is why it is now the relative phase of the pair states with respect to that of the condensate that plays a role; in other words, we have now an “external” instead of an “internal” mechanism. The gain in energy will be exactly the same, whatever value is chosen for the phase 0 ; only the relative phase k 0 is relevant. Accordingly, and just as for fermions, the arbitrary choice of the phase leads to a symmetry breaking phenomenon. The analogy is reinforced by the fact that, for a fixed value of , we found in § 2-b that the value of the 0

1944



CONDENSED REPULSIVE BOSONS

particle number 0 in the k = 0 state is given by an implicit equation in 0 . Similarly, in the BCS theory, it is also an implicit equation that fixes the value of the gap ∆. Relations (40) and (53) include non-integer powers of the interaction potential 0 , and are thus non-analytic functions of that potential. They cannot be obtained by a perturbation theory as a power series expansion of 0 , and this is another analogy with the results of Complement BXVII . We also saw in that complement that, in the BCS mechanism, it is the energies in the vicinity of the Fermi level that play the most important role. This is not the case for a system of repulsive bosons. For example, in (44), the function whose integration over yields the kinetic energy is: ( )=

2

+1

(

2

4 2

+ 2)

1

(55)

whereas the one yielding the interaction potential energy between particles in the k = 0 states and particles in the k = 0 state is: 2 0 ( ) =

2 2

(

2

+ 2)

1

(56)

Figure 1 plots these two functions with dashed lines, as well as their sum with a solid line. It illustrates how, in the case of repulsive bosons, the effects of minimization of the potential energy overcome those of the kinetic energy. This minimization of the repulsion necessarily comes with a modification of the particles’ position correlation function, which must decrease at short distances; this interpretation will be substantiated in § 3-dwhere we study the binary correlation functions. We also note in the figure that the accumulated gain of energy is not due to a particular energy band: all the values contribute up to the limit imposed by the upper bound of the integral. We can further refine the analysis of the energy balance by looking at the gain of energy per individual quantum state. We must then remove from the previous relations the factor 2 coming from the density of states, and hence remove a factor 2 from (55) and (56). Figure 2 plots the resulting functions, which show that as long as is small or of the order of 1, the decrease in potential energy largely overcomes the increase in kinetic energy; on the other hand, the two contributions balance each other when 1. According to (35), the condition . 1 corresponds to: .

1

that is:

.

0 0

(57)

This means that individual states of low energy provide most of the decrease of the repulsive potential energy; the corresponding energy domain is proportional both to 0 and to the interaction matrix element 0 . Relations (27) also show that it is those energies in the system ground state that the energy minimization affects the most. From the physical point of view, it is understandable that particles having a low kinetic energy compared to the interaction energy 0 0 are the most affected by the interactions, whereas those with a kinetic energy large compared to 0 0 have their correlations only slightly modified by the interaction potential. However, as we noted before, even though the individual contribution of the highest energy state to the energy reduction is reduced, their large number (corresponding to a density of states proportional to 2 ) means that their contribution to the total energy remains significant.

1945



COMPLEMENT EXVII

Figure 1: Plots as a function of of the functions whose integral over yield the kinetic energy (upper dashed curve), the potential energy for the interaction with the condensate (lower dashed curve), as well as their sum (solid line). The increase 0 in the kinetic energy is overcome by the decrease in the potential energy, which ends up lowering the total energy. 3-d.

Correlation functions

As the system is contained in a box and obeys periodic boundary conditions, we expect the properties of the one-body correlation functions to be translation invariant. This does not rule out a possible spatial dependence of the correlation functions, as far as the differences in positions are concerned. This is what we want to elucidate now. .

One particle

Expanding the field operator Ψ(r) on the annihilation operators, according to relation (A-3) of Chapter XVI, we get for the one-particle correlation function 1 1

(r r ) = Φ

Ψ (r)Ψ(r ) Φ

=

1

(k

3

r

k r)

Φ

k k

Φ

(58)

kk

where expression (3) determines Φ . Since in this state the particle number in an individual state k is always the same as that number for the individual state k, the average value of k k in Φ will be different from zero only if k = k . In the summation over k, the k = 0 contribution introduces a term in 0 ; adding to it all the other contributions, we get:

1

1

(r r ) =

3

0

+

k

k (r

r)

(59)

k=0

When r = r , the function tot

1946

=

0

+ 3

1

is simply equal to the total particle density

tot :

(60)



CONDENSED REPULSIVE BOSONS

Figure 2: Plots as a function of of the kinetic energy (upper dashed curve), the potential energy (lower dashed curve), and the total energy (solid line) per individual state. It shows that it is the lowest kinetic energy states that make the largest contribution to the lowering of the energy. When r and r are different, the function 1 is the sum of two terms: – one term corresponding to particles in the k = 0 state (condensed particles), independent of the positions; this term does not decrease at large distance, but has an infinite range. – a second term corresponding to particles in the k =0 state, which is the transform of the particle distribution k , and therefore goes to zero when r and r move away from each other (it has a microscopic range). We find again the Penrose-Onsager criterion according to which it is the condensed fraction of a boson system that leads to an infinite range of the non-diagonal one-body correlation function (in the case of paired fermions, we found in Complement CXVII , §§ 2-a- and 2-b- , that this long range does not occur for the one-body correlation function, but only for the two-body correlation function). .

Two particles The diagonal two-particle correlation function is written: 2

(r r; r r ) = Φ 1 = 6

Ψ (r)Ψ (r )Ψ(r )Ψ(r) Φ (k kk k

k) r

(k

k

)r Φ

k k

k

k

Φ

(61)

k

We get the same simplifications as in § 4-b- of Complement BXVII in the computation of the average values of products of creation and annihilation operators: operators 0 placed on the right each yield a factor 0 and operators 0 placed on the left, each a factor 0 ; the average values of the other operator products are given by the results of § C in Chapter XVII. We must distinguish between several cases, depending on the number of values, among the 4 summation indices, which are equal to k = 0; we shall proceed by decreasing values of that number. 1947



COMPLEMENT EXVII

(i) If the four operators concern the k = 0 state (case represented in Figure 6 of Complement BXVII ), we get the contribution: (

0

2

1)

0 6

0 3

(62)

which is position independent. The case where only three of the summation indices are zero is not possible, as the corresponding term would contain the average value of an operator k (or of its Hermitian conjugate) in the state Φ , which is zero. (ii) If one creation and one annihilation operator concern the individual k = 0 state, two cases may occur and yield different types of terms: – direct terms in 0 k k 0 or k 0 0 k ; both contributions are equal and their sum leads to: 2

0 6

(63)

which is also position independent. – exchange terms in 0 k 0 equal and their sum is written:

0 6

k

(r

r

k =0

=

2

)

k

k

or

k (r

+

k

r)

0 k

0;

the corresponding terms are also

k

k=0

0 6

k

cos [k (r

r)]

(64)

k=0

which now depends on the difference in the positions r and r . These terms reflect the existence of a bunching effect between bosons; relation (C-28) of Chapter XVII yields the value of k : k

= sinh2

k

(65)

(iii) If the operators corresponding to k = 0 are of the same nature (both creation or both annihilation operators), we get terms such as the ones represented on Fig 7 of Complement BXVII , corresponding to the creation or annihilation of a pair from the condensate k = 0: – for a product of the type k k 0 0 where k and k are not zero but opposite (for the same reason as explained before), we get an anomalous average value in the state k , multiplied by the average value of the product of two operators 0 . The first 2 average value is given by relation (C-51) of Chapter XVII, and the second yields ( 0 ) , 2 0 that is 0 according to (7). – for a product of the type 0 0 k k where k and k opposite, we get the complex conjugate of the previous result. 1948

are not zero but



CONDENSED REPULSIVE BOSONS

The sum of the two previous results is then written: k (r

0 6

r)

sinh

k

cosh

k

2 (

0

k)

k=0 k

+

(r

r)

sinh

cosh

k

k

2 (

0

k

)

k =0

=

2

0 6

sinh

k

cosh

k

cos [k (r

r) + 2 (

0

k )]

(66)

k=0

(iv) Finally we have terms where none of the wave vectors is zero, corresponding to cases where the particles are in k = 0 states before and after the interaction. They include a direct term: (

)2

(67)

6

which is constant, an exchange term, and finally a pair annihilation-creation term. Compared to the previous terms (which are proportional 0 ), their relative value is of the order of 0 . Taking into account the exchange term and the pair creation-annihilation term leads to simple calculations of a type already performed; however, to be consistent with (12) and the corresponding energy approximations, we shall ignore those terms. The sum of (62), (63) and (67) yield the constant 2 6 , to which we add (64) and (66) and get: 2 2

(r r; r r )

6

+

2

0 6

sinh2

k

cos [k (r

r)]

k=0

sinh

k

cosh

k

cos [k (r

r) + 2 (

0

k )]

(68)

The position dependent term on the second line shows how the relative phases introduced in Φ control the relative particle position in the physical system; it confirms that the choice k = 0 does indeed decrease the probability of finding two particles close to each other. When the phase locking condition (19) is satisfied, the position dependent contribution becomes: 2

0 6

sinh2

k

sinh k cosh

k

cos [k (r

r)]

(69)

k=0

Since sinh k cosh k , the cosine in each term has a negative coefficient; it does decrease the probability 2 (r r; r r) of finding two particles at the same point r: the dynamic correlations appearing in the system tend to “antibunch” the particles, and hence reduce their repulsive interactions. The final result is a compromise between a sinh2 k term that leads to bunching (as for a non-interacting boson gas) and a antibunching term in sinh k cosh k that is larger, and involves anomalous average values (creation or annihilation of particle pairs in the condensate). 1949
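The sign of this coefficient is easy to verify numerically. The short Python sketch below uses illustrative values only (v0 stands for the condensate interaction energy $N_0V_0$, and the optimal relation (27) with $\mu_1 = 0$ is assumed); it confirms that $\sinh^2\theta_{\mathbf k} - \sinh\theta_{\mathbf k}\cosh\theta_{\mathbf k}$ is negative for every $\mathbf k$, which is the antibunching effect just described.

```python
import numpy as np

v0 = 1.0                                     # condensate interaction energy N0*V0, illustrative
e_k = np.array([0.1, 0.5, 1.0, 5.0, 20.0])   # kinetic energies (same units as v0)

tanh2t = v0 / (e_k + v0)                     # optimal tanh(2 theta_k), relation (27) with mu_1 = 0
cosh2t = 1.0 / np.sqrt(1.0 - tanh2t**2)
sinh2 = 0.5 * (cosh2t - 1.0)                 # sinh^2(theta_k)
sinhcosh = 0.5 * cosh2t * tanh2t             # sinh(theta_k)*cosh(theta_k) = sinh(2 theta_k)/2

print(sinh2 - sinhcosh)                      # negative for all k: net antibunching
```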

COMPLEMENT EXVII



Comments: (i) The correlation function (68) is invariant with respect to the exchange of r and r , as seen from the way the terms k and k are accounted for in the summation. Its Fourier transform only contains terms in cos [k (r r)], which can take on any value by an appropriate choice of the k and the k . As mentioned in the introduction, the variational state can lead to any correlation function; the results discussed above concern the optimal value of this correlation function. (ii) On several occasions, we assumed the chemical potential correction , defined in (26), to be zero, which enabled us to replace the by the . Let us now check that a non-zero value of this correction does not radically change the results we obtained. Using the model (29) where the non-zero matrix elements constant 0 , expression (26) is simply written: =

sinh2

0

sinh

k

k

cosh

=

k

k=0

sinh

0

k

k

are all equal to the same

k

0

where the summation is limited to vectors k having a modulus less than = with tanh2

0

=

. Setting: (71)

0

0, relation (27) that fixes the k

(70)

k=0

0 0

+

0 0

(1 + )

k

becomes: (72)

The corrective effect of and hence , is to lower the k ; this correction is however negligible if 0 0 . Consequently, the populations of the individual states k (which are equal to sinh2 k ) decrease when . 0 0 , but remain practically unchanged in the opposite case. The quantities resulting from a summation over k, such as , are then barely affected: the change in the function to be integrated only occurs for small values of , whose contributions, in any case, are weak because of the factor 2 in the integral (38). As for the energy, this is accentuated since the integral in (53) diverges if is infinite, which means it mainly depends on the large values of (if 1). Turning now to the correlation functions computed in § 3-d, they contain summations over k that lead to the same integrals; they are thus fairly insensitive to the value of . This explains why, aside from the predictions concerning the populations of small wave vectors, the approximation = 0 used in § 3 is reasonable in many cases. To push the analysis a step further we need to compute the value of the coefficient. This requires improving the precision of the calculations, and in particular taking into account the interactions of the particles in the k = 0 individual states. This is beyond the scope of this complement, and we shall simply accept that is small and note that only the small populations are changed when is not equal to zero.

4.

Bogolubov operator method

We now present a different point of view and introduce the Bogolubov method; it is based on the search for a readily diagonalizable operator form of the Hamiltonian (or of an approximate expression of this Hamiltonian). This method not only applies to the ground state, but it also enables the study of the excited states. We shall use the results of § E in Chapter XVII to introduce new operators that simplify the diagonalization of the Hamiltonian. 1950

• 4-a.

CONDENSED REPULSIVE BOSONS

Variational space, restriction on the Hamiltonian

The variational set we consider has been defined in (3); we assume: Φ

=

Ψpaired

0

(73)

where 0 is the coherent state (6), and Ψpaired , any paired state in the Fock space spanned by all the individual states others than k = 0. We call ( 0 ) the ensemble of kets expressed as (73). We now take the general Hamiltonian operator written in (8) of Complement BXVII , and consider its action restricted to such states; the corresponding matrix elements are of the type: Φ

Φ

(74)

where Φ and Φ are any two kets of ( 0 ). In the computation of this matrix element, the same simplifications as in § 1 will occur: any annihilation operator 0 on the right can be replaced by the number 0 , any creation operator 0 on the left by the complex conjugate 0 . We will further simplify the problem by assuming, as in (12), that 2 the total population of the k = 0 individual levels is much smaller than 0 = 0 , and keeping only certain terms among the Hamiltonian interaction terms. First, we study the forward scattering terms, which are the terms (k = k and k = k ) in relation (20) of Complement BXIII . Their expression is: 0

2

k

k k kk

In this equality, =

0

+

where 0 = k = 0, and =

k

=

0

2

k k k

k

kk

k k

=

kk

0

2

1

(75)

is the total number of particles operator: (76)

is the operator associated with the population of the individual state that associated with the total number of particles in the states k = 0 :

0 0

k k

(77)

k=0

As for all the other interactions terms, we shall proceed as in § 1-c and will only keep the terms that contain 0 – the others correspond to interactions between particles in the k = 0 individual states, assumed to be negligible when inequality (12) is satisfied. In all the terms we keep, there are either four or two creation or annihilation operators concerning the k = 0 state. Those containing the product 0 0 0 0 , or one of the two products k 0 0 k and 0 k k 0 , are already taken into account in the mean field term (75). We simply have to add: – the terms containing the products k 0 k 0 or 0 k 0 k , i.e. the exchange terms of § 4-b- in Complement BXVII ; they yield a contribution: 0 k k k

(78) 1951



COMPLEMENT EXVII

– the terms in k or the terms in 0 0 k their contribution is: 0 k

2

e2

0

k k

k 0 0, k

+e

corresponding to the pair creation from the condensate, corresponding to the annihilation of pairs into the condensate;

2

0

k

(79)

k

With the above conditions, we get a simplified version of the Hamiltonian becomes a reduced Hamiltonian : =

(

0

, which

1) 2

+

k k

+

0 k

k k

k=0

+

1 2 e 2

0

k k

+e

2

0

k

k

(80)

If the number of particles is fixed, the first term in the right hand side (mean field) introduces only the same displacement of all energies, without physical consequence. In the Bogolobov approximation, where the condition 0 is assumed, one often merely replaces by 0 in this term, which amounts to restricting the sum of (75) to the terms k = k = 0. Since the kets (73) are eigenvectors of 0 with eigenvalue 0, we may then replace the mean field operator (75) by the number 0 02 2. If, moreover, one assumes that 0 = 0, one obtains the simpler expression: 0

=

2 0

2

+

k k

+

0 k

k k

+

k k

k=0

+ 2

k

k

(81)

Either (80) or (81) can be used as the Hamiltonian within the Bogolubov approximation. Neither of these operators conserves the total particle number, because of its terms proportional to the product of two creation or two annihilation operators. Complement BXVII explained how such “anomalous” terms can, nevertheless, account for the interaction effects within the framework of certain approximations. We are now going to show that this expression can be put in the form of a Hamiltonian of independent particles, provided the operators undergo the transformation introduced in § E of Chapter XVII. 4-b.

Bogolubov Hamiltonian

We obtained in Chapter XVII the expression (E-29) of the Hamiltonian operator :

=

}

k k

+

k

k

(82)

k

which includes the Bogolubov operators for bosons: k k

1952

=

k k

+

=

k

k

+

k

k k k

(83)



CONDENSED REPULSIVE BOSONS

Remember that is half the momentum space, avoiding double counting of the same pairs of states in (82). Relation (E-15) of Chapter XVII expresses the k and k in terms of the two parameters k and k : k

= cosh

k

k

= sinh

k

k

(84)

k

As for the value of the parameter We then have: k k

=

2 k

k k

, it will be fixed later.

2

+

k

k

k

k

k

+

k k k

k

+

k k

(85)

k k

and: k

k

2

=

k

2

+

+

k k

k

k k

k k

+

k k k

k

(86)

The operators in these equalities can be rearranged in normal order, using the proper commutation; adding them both, we get: k k

+

k

k

= +2

2 2 k

2

+

k

+2

k k

k

+

k k k k

k

k

+2

k k k

(87)

k

that is, taking (84) into account: k k

+

k

k

= cosh2

k

+ 2sinh2 Operator

k

k k

+

+ sinh2

k

k

2

k

k

+

k

k

k

+

k

k

k k

2

k

(88)

2

k

(89)

can therefore be written:

=

cosh2

}

k

+

k k

k

k

k

+2sinh2

k

+ sinh2

2

k

k k

Now this Hamiltonian may be identified with the approximate Hamiltonian (80). To see this, we replace cosh2 k by expression (30), sinh2 k by the double of (47), and sinh2 k by (32); we are still in the simplified model where = 0, and hence the are replaced by the . Finally we choose for the value: }

(

=

+2

0 k)

and we assume that all the =

(

+

0 k)

(90) k

are zero. We then get:

k k

+

k

k

k

+

0 k

2

k k

+

k

k



fond

(91)

1953



COMPLEMENT EXVII

with, again using the value (32) for sinh2 ∆

ground

=

2

sinh2

}

k

k

=

1 2

k:

=

sinh2

}

k

k=0

(

+2

0 k)

(92)

0 k

k=0

Comparison with relations (52) and (54) shows that ∆ ground is none other than the energy ground already obtained, shifted by the mean field value: ∆

ground

=

0 ground

2

(93)

2

Finally, taking (80) into account, if the condition), we simply have: = 4-c.

+

k

are all chosen equal to zero (phase locking (94)

ground

Constructing a basis of excited states, quasi-particles

As ground is a number, it introduces a simple energy shift in the eigenvalues of compared to those of , with no effect on the eigenvectors. Now we saw in § E-3 of Chapter XVII that the eigenvalues of are known, and can be written as: =

[ ( k) + (

k )] }

(95)

k

where ( k ) and ( k ) are any positive or zero integers. As for the eigenstates associated with these energies, they can be simply obtained by the action on the ground state of the following product of creation operators: (

k)

(

k

k)

(96)

k

k

All things considered, the operator shares a lot of properties with the Hamiltonian of an ensemble of non-interacting particles. Just as the usual creation operators permit adding particles in a system of free identical particles, the k and k creation operators can be considered as adding an extra “quasi-particle” to the physical system. When acting on the ground state, the operator k yields a ket where both the energy and the momentum are well-defined: the energy is increased by the amount ~ specified in (90), the momentum is increased by }k with respect to the zero momentum of the ground state. This exact change of momentum occurs because the action of k on any ket creates two components: one component where one particle with momentum ~k is added, the other component where one particle with momentum ~k is suppressed. In both cases, the total momentum has increased5 by the same amount ~k. The operator k 5 This

tum P =

result can also be verified by calculating the commutator k

~k

k k

with

k

. One obtains P

P k = ~k k k + k k = ~k increase its eigenvalue by ~k.

1954

k

=

k

~k

k k

k

. As a consequence, the effect of k

P + k

k k

of the total momen~k

k

k

k

, or:

on any eigenstate of P is to



CONDENSED REPULSIVE BOSONS

therefore creates a quasi-particle of well-defined energy and momentum, and k annihilates it; of course, k and k have the same properties for the quasi-particle of opposite momentum. These quasi-particles do not coincide with particles of a system without interactions, as can be seen from the expression of those creation operators. They yield, however, a basis of states that permits reasoning as if there were no interactions; this provides a very powerful point of view in many fields of physics. We can assume, as in Complement DXV , that the interaction potential has a zero range: 2

(r r ) =

(r

r)

(97)

Relation (11) then becomes: k

=

1 3

d3

qr

2 (r)

=

(98)

3

and equality (90) is written as: } (k) =

(

+2

0)

=

}2 2

2

(

2

+

2 0)

(99)

with: 0

=

2 }

0

=

2

(100)
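As a quick numerical check of the dispersion relation (99), the following Python sketch (with illustrative parameters only, not values from the text) evaluates $\hbar\omega(k) = \sqrt{e_k\,(e_k + 2N_0V_0)}$ and compares it with its two limits: the linear phonon branch at small $k$ and the free-particle parabola at large $k$.

```python
import numpy as np

hbar, m = 1.0, 1.0
v0 = 1.0                         # condensate interaction energy N0*V0 (illustrative)
xi = hbar / np.sqrt(2*m*v0)      # healing length, relation (36)
c_sound = np.sqrt(v0/m)          # expected sound velocity in the phonon regime

for k in [0.05/xi, 0.2/xi, 1.0/xi, 5.0/xi]:
    e_k = hbar**2 * k**2 / (2*m)
    omega = np.sqrt(e_k*(e_k + 2*v0)) / hbar     # Bogolubov dispersion, relation (99)
    print(f"k*xi = {k*xi:5.2f}   hbar*omega = {hbar*omega:9.4f}   "
          f"phonon c*k = {c_sound*k:9.4f}   free e_k = {e_k:9.4f}")
```

For $k\xi \ll 1$ the computed $\hbar\omega$ coincides with the phonon value $c k$, while for $k\xi \gg 1$ it approaches the free-particle kinetic energy, reproducing the two regimes discussed below.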

In relation (100), ξ is the healing length defined in (36). Relation (99) is the same as relation (34) of Complement DXV, whose Figure 1 represents the quasi-particle spectrum. When the modulus of the wave vector k is smaller than the wave vector k₀, we get a linear spectrum whose slope corresponds to the sound velocity in the boson system; for values larger than k₀, the spectrum becomes quadratic, as for free particles.

Conclusion

The calculations presented in this complement illustrate the analogy between the pairing phenomena for attractive fermions and for repulsive bosons. In both cases, binary correlations in the particle positions are introduced by the interactions, resulting in a decrease of the interaction potential energy of the physical system; the paired states are a valuable tool for understanding this effect. In both cases, a relative phase locking phenomenon occurs, but the precise nature of that locking is different. For fermions, the energy gain is due to a collective effect, involving the pair-pair interactions and the relative phases of all the pairs of states (k, −k); each pair contributes to the value of the gap ∆ which, in turn, has an effect on all the others – this is translated mathematically by the presence of a double summation over k and k′ in the energy. This is reminiscent of a ferromagnetic system, where each spin contributes to the collective exchange field that acts on all its neighbors. As the interactions are supposed to be attractive, the locking of the relative phases to zero maximizes the pair-pair interactions, and hence minimizes the energy. For bosons, the major role is played by the relative phase of the pairs with respect to that of the reservoir composed of all the particles in the k = 0 state (condensate).


The physical process involved is illustrated in Figure 7 of Complement BXVII , where two particles emerge from the condensate to form a pair, or vice-versa – mathematically, the energy term contains only one summation over k. The relative phase locking it introduces will minimize the repulsion between these pairs and the condensate, and hence the total energy. Compared to the fermion case, the presence of a condensate independent of the pairs radically changes the nature of the phase locking.


Chapter XVIII

Review of classical electrodynamics

A   Classical electrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . 1959
    A-1  Basic equations and relations . . . . . . . . . . . . . . . . . . . .  1959
    A-2  Description in the reciprocal space . . . . . . . . . . . . . . . . .  1960
    A-3  Elimination of the longitudinal fields from the expression of the physical quantities . . . 1965
B   Describing the transverse field as an ensemble of harmonic oscillators . .  1968
    B-1  Brief review of the one-dimensional harmonic oscillator . . . . . . .  1968
    B-2  Normal variables for the transverse field . . . . . . . . . . . . . .  1969
    B-3  Discrete modes in a box . . . . . . . . . . . . . . . . . . . . . . .  1974
    B-4  Generalization of the mode concept . . . . . . . . . . . . . . . . . . 1975

Introduction

In the three previous chapters, we studied ensembles of identical particles, which allowed us to introduce the concept of quantum field operators. We now begin a new series of three chapters where this quantum field concept is applied to an important particular case: the electromagnetic field, made of identical bosons called “photons”. We start by noting that, in classical electromagnetism, the dynamics of the different field modes is exactly similar to that of a series of harmonic oscillators. Each of these modes may be quantized by the same method as that used for an elementary harmonic oscillator, for a single particle; this method has the great advantage of simplicity. It requires, however, establishing beforehand the equivalence between the modes of the classical electromagnetic field and harmonic oscillators; this is the main purpose of the present chapter. For the presentation to be self-contained, we first review a certain number of properties of classical electromagnetism. One complement is also devoted to a synthetic presentation of the Lagrangian formalism applied to this case. The reader already familiar with those aspects of classical electrodynamics may wish to go directly to the quantum treatment presented in Chapter XIX.

We start in § A with the Maxwell-Lorentz equations describing the coupled evolution of the electric field E(r, t), the magnetic field B(r, t) and the coordinates and velocities of the particles acting as sources for this electromagnetic field¹. We shall give the expressions for a certain number of constants of motion, such as the energy, or the linear and angular momenta of the global system “field + particles”. The vector potential A(r, t) and the scalar potential will also be introduced, as well as the gauge transformations that can be performed on these potentials. We shall then show that it is useful to take the spatial Fourier transforms of these fields, since in the reciprocal space, Maxwell’s equations have a simpler form. For a free electromagnetic field (in the absence of charged particles), they are no longer partial differential equations, as in ordinary space, but ordinary time-dependent differential equations. Furthermore, the concept of longitudinal or transverse field vectors has a clear geometrical significance in the reciprocal space². A field vector Ṽ(k, t) is longitudinal if Ṽ(k, t) is parallel to k at every point k of the reciprocal space, transverse if Ṽ(k, t) is perpendicular to k at every point k. We will show that two of the four Maxwell equations yield the values of the longitudinal electric and magnetic fields, whereas the other two describe the evolution of the transverse fields. It will become clear that the longitudinal electric field is simply the Coulomb electrostatic field created by the charged particles. Consequently, it is not an independent field variable, since it only depends on the coordinates of the particles³. Furthermore, choosing the Coulomb gauge amounts to choosing the longitudinal vector potential equal to zero; this permits eliminating the longitudinal fields from the expressions for all the physical quantities.

In § B, we establish the equivalence between the radiation field and an ensemble of one-dimensional harmonic oscillators. Maxwell’s equations for the transverse fields enable introducing linear combinations of the vector potential and the transverse electric field whose time evolution, in the absence of particles, is of the form e^{−iωt} where ω = ck. These variables, called normal variables, thus describe the eigenmodes of the free field vibrations. The dynamics of each of these eigenmodes is similar to that of a one-dimensional harmonic oscillator. The normal mode variable is the equivalent of the linear combination of the oscillator’s position and velocity that becomes, in the quantization process, the annihilation operator, fundamental in the quantum theory of the harmonic oscillator. Replacing the normal variables and their complex conjugates by annihilation and creation operators will yield, in Chapter XIX, the expressions for the various operators of the quantum theory.

1 We assume that the speeds of the particles are small compared to the speed of light, so as to use a non-relativistic description. 2 We shall note ˜ (k) the spatial Fourier transform of (r), the symbol “tilde” allowing a clear distinction between the functions in ordinary and reciprocal space. 3 As for the longitudinal magnetic field, it is simply zero.

1958

A. CLASSICAL ELECTRODYNAMICS

A.

Classical electrodynamics

A-1.

Basic equations and relations

A-1-a.

Maxwell’s equations

There are four Maxwell’s equations in vacuum, and in the presence of sources: ∇ E(r ) =

1

(r )

(A-1a)

0

∇ B(r ) = 0 ∇

E(r ) =



B(r ) =

(A-1b) B(r )

1

E(r ) +

2

(A-1c) 1 0

2

j(r )

(A-1d)

where is the velocity of light in vacuum and 0 the vacuum permittivity. These equations yield the divergence and the curl of the electric field E(r ) and the magnetic field B(r ). The charge density (r ) and current density j(r ) appearing in those equations can be expressed, in the non-relativistic limit, in terms of the positions r ( ) and the speeds v ( ) = dr ( ) d of the various particles of the system, each having a mass and a charge : (r ) =

[r

j(r ) =

A-1-b.

r ( )]

v ( ) [r

(A-2a)

r ( )]

(A-2b)

Lorentz Equations

Lorentz equations describe the dynamics of each particle and magnetic forces exerted by the fields E(r ) and B(r ): d2 r ()= d2

[E (r ( ) ) + v ( )

submitted to the electric

B (r ( ) )]

(A-3)

The particle and field evolutions are coupled: the particles move under the effect of the forces the fields exert on them, but they also act as sources for the evolution of those fields. A-1-c.

Constants of motion

Definitions (A-2a) of (r ) and (A-2b) of j(r ) lead to the continuity equation: (r ) + ∇ j(r ) = 0

(A-4)

which implies the time invariance of the total charge of the particle system: =

d3

(r ) =

(A-5) 1959

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

Other constants of motion exist: the total energy , the total momentum P and the total angular momentum J of the system field + particles. They are respectively given by: 1 2

= P =

v2 ( ) + v ( )+

J=

r ()

0

0

d3

2

E 2 (r ) +

d3 E(r )

v ( )+

0

2

B 2 (r )

B(r )

d3 r

[E(r )

(A-6a) (A-6b)

B(r )]

(A-6c)

Using (A-1) and (A-3), we can verify that the derivatives with respect to time of , P and J are indeed zero (for and P , see for example exercise 1 in Complement CI of [16] and its correction). A-1-d.

Scalar and vector potentials: gauge transformations

As we already saw in Complement HIII , the fields E(r ) and B(r ) can always be written in the form: E(r ) =

∇ (r )

B(r ) = ∇

A(r )

(A-7a)

A(r )

(A-7b)

where A(r ) and (r ) are the vector and scalar potentials defining a gauge. For any function (r ) of r and of , the transformation of these potentials obeying the relations:

A(r ) (r )

A (r ) = A(r ) + ∇ (r ) (r ) =

(r )

(r )

(A-8a) (A-8b)

leads to the same expression for E(r ) and B(r ); the same physical fields can therefore be represented by several different potentials A(r ) and (r ). The transformation (A8) associated with the function (r ) is called a gauge transformation. Relations (A-8) allow a flexibility on the choice of the gauge A , which allows introducing an additional condition. The Coulomb gauge, which we will use in this chapter and the following, is defined by the condition: ∇ A(r ) = 0

(A-9)

A geometrical interpretation of condition (A-9) in the reciprocal space will be given later. A-2.

Description in the reciprocal space

Using Fourier transforms, the equations of electrodynamics can be put in a form that simplifies calculations. 1960

A. CLASSICAL ELECTRODYNAMICS

A-2-a.

Spatial Fourier transforms

Let us introduce the Fourier transform of the electric field E(r ): ˜ E(k )=

1 (2 )3

2

kr

d3 E(r )

(A-10)

which enables us to write E(r ) as: E(r ) =

1 (2 )3

2

˜ d3 E(k )

kr

(A-11)

Analogous expressions can be written for all the physical quantities we just introduced: magnetic field, charge and current densities, scalar and vector potentials. It will be useful in what follows to recall the Parseval-Plancherel relation (Appendix I, § 2-c) showing the identity of the scalar products of two functions expressed in position space or in reciprocal space4 : d3

d3 ˜ (k) ˜ (k)

(r) (r) =

(A-12)

and the fact that the product of two functions in reciprocal space, is the Fourier transform of their convolution in position space: ˜ (k) ˜ (k) A-2-b.

FT

1 (2 )3

2

d3

(r ) (r

r)

(A-13)

Maxwell’s equations in reciprocal space

Maxwell’s equations take on a simpler form in the reciprocal space, clearly showing the differences between the longitudinal and transverse components of the various fields. Any vector field V˜ (k ) can be decomposed into a longitudinal field V˜ (k ), parallel at any point k to the vector k, and a transverse field V˜ (k ) perpendicular to k: V˜ (k ) = V˜ (k ) + V˜ (k )

(A-14)

with: V˜ (k ) = κ κ V˜ (k ) = k k V˜ (k ) V˜ (k ) = V˜ (k ) V˜ (k )

2

(A-15a) (A-15b)

where κ=k

(A-16)

is the unit vector along k.

4 The space of the vectors r ( ordinary space) is called “position space” whereas “reciprocal space” is the space of the wave vectors k.

1961

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

As the operator ∇ in position space corresponds to the operator k in reciprocal space, Maxwell’s equations (A-1) become in reciprocal space: 1 ˜ k E(k )= ˜(k )

(A-17a)

0

˜ k B(k )=0 k

˜ E(k )=

k

1 ˜ B(k )= 2

(A-17b) ˜ B(k )

(A-17c)

1 ˜ ˜ E(k )+ j(k ) 2

(A-17d)

0

Taking into account definitions (A-15) for the longitudinal and transverse components of a vector field, the first two equations (A-17a) and (A-17b) determine the ˜ ˜ longitudinal parts, projections of the fields E(k ) and B(k ) onto k: ˜ (k ) = E

˜(k ) 0

k

(A-18a)

2

˜ (k ) = 0 B

(A-18b)

˜ The last two equations (A-17c) and (A-17d) yield the rate of change E(k ) and ˜ ˜ ˜ B(k ) of the fields E(k ) and B(k ), and are the equations of motion of these ˜ fields. In the absence of sources (j(k ) = 0), i.e. for what we will call a “free” field, they are time-dependent differential equations, and no longer partial derivative equations as is the case in position space. A-2-c.

Longitudinal electric and magnetic fields

˜ (k ) is zero. Equation Equation (A-18b) shows the longitudinal magnetic field B ˜ (A-18a) expresses E (k ) as a product of two functions of k, ˜(k ) and k 0 2 whose Fourier transforms are written (relation (63) of Appendix I): ˜(k ) k 0

(r )

FT

2 FT

(A-19a)

(2 )3 2 r 4 0 3

(A-19b)

Using relation (A-13) then leads to: E (r ) = =

1 4

d3 0

1 4

(r

0

r r

r r r r r () r ()3 )

3

(A-20)

This means that at time , the longitudinal electric field coincides with the Coulomb field produced by the charge distribution (r ), computed as if this distribution were static and fixed at that instant .

1962

A. CLASSICAL ELECTRODYNAMICS

Comment The fact that the longitudinal electric field instantaneously follows the evolution of the charge distribution (r ) should not lead us to believe in an action at a distance propagating at an infinite speed. The contribution of the transverse field must also be taken into account, as only the total electric field E = E + E has a real physical meaning. It can be shown that the transverse electric field also has an instantaneous component, which balances exactly the longitudinal component so that the total field is always retarded (to ), as the electromagnetic interactions propagate at the speed of light (see exercise 3 and its correction in Complement CI of reference [16]).

The previous results show that the longitudinal fields are not independent quantities: they are either zero (in the case of the longitudinal magnetic field), or simply related to the particle coordinates r ( ) (in the case of the longitudinal electric field, whose expression is given by (A-20)). A-2-d.

Time evolution of the transverse fields

Now that we showed that the first two Maxwell’s equations determine the longitudinal part of the fields, let us consider the last two equations (A-17c) and (A-17d) and focus on their transverse components. Since k E = k E , they can be rewritten as: ˜ B(k )= ˜ (k ) = E

˜ (k ) E

k 2

˜ B(k )

k

(A-21a) 1˜ j (k )

(A-21b)

0

˜ (k ) and B(k ˜ which yield the time evolution of the transverse fields E ). Comment One can also study the longitudinal projections of the two Maxwell’s equations (A-17c) and (A-17d). The result is trivial for the first one: as both sides of the equation are transverse, their longitudinal projections are zero. As for the second equation, (A-17d), it leads to: ˜ (k ) + 1 j˜ (k ) = 0 E

(A-22)

0

Taking the scalar product of k with each side of this equation, using (A-18a) and the fact that k j˜ = k j˜ , we find: ˜ ˜(k ) + k j(k )=0

(A-23)

which is simply the continuity equation (A-4) in the reciprocal space, and does not provide any new information. A-2-e.

Potentials

In the reciprocal space, relations (A-7a) and (A-7b) between fields and potentials become: ˜ E(k )= ˜ B(k )= k

k ˜ (k ) ˜ A(k )

˜ A(k )

(A-24a) (A-24b) 1963

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

and the gauge transformations relations (A-8a) and (A-8b) are written: ˜ A(k )

˜ (k ) = A(k ˜ A ) + k ˜(k )

(A-25a)

˜ (k ) = ˜ (k )

(A-25b)

˜ (k )

˜(k )

where ˜(k ) is the Fourier transform of (r ). Since the last term in (A-25a) is a longitudinal vector, it is clear that a gauge ˜ (k ), which thus defines a gauge transformation does not change the transverse part A invariant physical field: ˜ (k ) = A ˜ (k ) A

(A-26)

˜ = 0, the transverse projections of relations (A-24a) and (A-24b) yield the Since k A equations: ˜ (k ) = E ˜ B(k )= k

˜ (k ) A

(A-27a)

˜ (k ) A

(A-27b)

˜ (k ) as a function of B(k ˜ Note that equation (A-27b) allows expressing A ), as we now show. Taking the vector product of k with each side of this equation, and using the identity: a

c) = (a c)b

(b

(a b)c

(A-28)

˜ (k ) = 0, we get: and the fact that k A ˜ (k ) = A

2

k

˜ B(k )

(A-29)

This equation, together with equation (A-27a), allow rewriting the two time evolution equations (A-21a) and (A-21b) for the transverse fields in a form only involving ˜ (k ) and A ˜ (k ): E ˜ (k ) = A ˜ (k ) = E

˜ (k ) E 2 2

˜ (k ) A

(A-30a) 1˜ j (k )

(A-30b)

0

In the absence of sources (j˜ (k ) = 0), we get two coupled time evolution equations for ˜ (k ) and A ˜ (k ). They will be useful later on for introducing the transverse fields E the field normal variables, and for the demonstration of the equivalence between the transverse field and an ensemble of harmonic oscillators. Time evolution equation for the transverse potential vector ˜ can be obtained by replacing E ˜ The time evolution equation for A ˜ A . We obtain: 2 2

+

2 2

˜ (k ) = 1 j˜ (k ) A

in (A-30b) by

(A-31)

0

which is written, in the position space:

1964

1

2

2

2

∆ A (r ) =

1 0

2

j (r )

(A-32)

A. CLASSICAL ELECTRODYNAMICS

A-2-f.

Coulomb gauge

Condition ∇ A(r ) = 0, which defined in (A-9) the Coulomb gauge, becomes in the reciprocal space: ˜ k A(k )=0

˜ (k ) = 0 A

(A-33)

In the Coulomb gauge, the longitudinal vector potential is therefore equal to zero; there only remains the transverse vector potential, which, as mentioned above, is a physical field. What can be said about the scalar potential in the Coulomb gauge? Let us consider the longitudinal part of each side of equation (A-24a). As the last term on the ˜ (k ) = k ˜ (k ), which right-hand side is transverse in the Coulomb gauge, we get E reads, in position space, E (r ) = ∇ (r ). The scalar potential is the potential whose gradient yields the longitudinal electric field. Equation (A-20) then shows that, to within a constant, (r ) is equal to: (r ) =

1 4

r

0

1 r ()

(A-34)

which is the Coulomb potential created by the charge distribution. Lorenz gauge In the present chapter and the next one, we shall mainly use the Coulomb gauge. Another gauge often used, in particular in the clearly covariant formulations of electrodynamics, is the Lorentz gauge 5 defined by the condition: ∇ A(r ) +

1 2

(r ) = 0

(A-35)

which can be written, using covariant notation: =0

(A-36)

The condition defining the Lorenz gauge thus keeps the same form in every Lorentz reference frame, which is not the case for the Coulomb gauge (since in relativity, a transverse field of zero divergence in one reference frame is no longer necessarily transverse in another frame). Nevertheless, an advantage of the Coulomb gauge is that it allows the immediate identification, in a given reference frame, of the field variables that are really independent.

A-3.

Elimination of the longitudinal fields from the expression of the physical quantities

It will be useful for the following discussion to eliminate the longitudinal fields from the expressions of the total energy and the total momentum given by equations (A-6a) and (A-6b). We shall express these physical quantities only in terms of the truly independent variables, such as particle coordinates and speeds, and transverse fields.

5 The

danish physicist Ludwig Lorenz is often confused with the dutch physicist Hendrik Lorentz.

1965

CHAPTER XVIII

A-3-a.

REVIEW OF CLASSICAL ELECTRODYNAMICS

Total energy

We start by eliminating the longitudinal electric field from the last term in expres˜ (k ) sion (A-6a). Using the Parseval-Plancherel equality (A-12) and the fact that E ˜ (k ) = 0, we can rewrite this term as: E 0

d3

2

E 2 (r ) +

2

B 2 (r ) =

long

+

trans

(A-37)

where: long

=

trans

=

0

2 0

2

˜ (k ) E ˜ (k ) d3 E d3

˜ (k ) E ˜ (k ) + E

(A-38a) 2

˜ (k ) B(k ˜ B )

(A-38b)

˜ (k ) by expression (A-18a). We get, taking (A-12) and (A-13) In (A-38a), we replace E into account: 1 ˜ (k )˜(k ) d3 2 2 0 1 ˜(r )˜(r ) = d3 d3 8 0 r r 1 = = Coul + 8 0 r r

=

long

Coul

(A-39)

=

The longitudinal field energy is thus equal to the Coulomb electrostatic energy Coul of the charge distribution (r ). In addition to the Coulomb interaction energy between different particles and , Coul also contains the energy Coul of the Coulomb field of each particle , which diverges for point particles. Expression (A-38b) for trans can be rewritten as a function of the variables ˜ ˜˙ (k ) and A ˜ (k ) introduced above for the transverse field: E (k ) = A

trans

=

0

2

d3

˜˙ (k ) A ˜˙ (k ) + A

2

˜ (k ) A ˜ (k ) A

(A-40)

Finally, the energy of the global system field + particles can be expressed in the form: =

1 2

r˙ 2 ( ) +

Coul

+

trans

(A-41)

where we used the simplified notation r˙ ( ) = dr ( ) d = v ( ). It is the sum of the kinetic energy of the particles, of their Coulomb energy, and of the energy of the transverse field. 1966

A. CLASSICAL ELECTRODYNAMICS

A-3-b.

Total momentum

Similar computations can be carried out for the total momentum P . The field contribution contained in the last term of (A-6b) can be written as: ˜ (k ) d3 E

0

˜ B(k )=

˜ (k ) d3 E

0

˜ B(k )

Plong

+

0

˜ (k ) d3 E

˜ B(k )

(A-42)

Ptrans

where we have separated the contributions to P coming from the longitudinal and transverse components of the electric field6 . Using (A-18a) and (A-27b), taking into account ˜ (k ) = 0, we get: identity (A-28) and the fact that k A Plong =

˜ (k ) k

d3

0

0

k

2

˜ (k ) A

˜ (k ) d3 ˜ (k )A

=

(A-43)

We then have: d3

Plong =

(r )A (r ) A (r

=

)

(A-44)

As we did above for (A-40), we can rewrite the expression of Ptrans as a function ˜ (k ) = A ˜˙ (k ) and A ˜ (k ) of the transverse field: of the variables E Ptrans =

0

=

0

˜˙ (k ) d3 A d3

k

˜ (k ) A

˜˙ (k ) A ˜ (k ) k A

(A-45)

The momentum of the global system field + particles can be written in the form: P =

[

r˙ ( ) +

A (r

)] + Ptrans

(A-46)

Let us finally introduce the quantity: p ()=

r˙ ( ) +

A (r

)

(A-47)

We shall see later that, in the Coulomb gauge electrodynamics, p ( ) is the conjugate momentum of r ( ), hence different from the mechanical momentum r˙ ( ). Expressed 6 The notation P long should not lead us to believe that Plong is a longitudinal field vector itself: it is actually the vector yielding the longitudinal electric field contribution to the momentum vector; the same comment applies to Ptrans .

1967

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

as a function of p ( ), the total energy (A-41) and total momentum (A-46) are written as: =

1

[p

2

P =

A (r

2

)] +

Coul

+

trans

p ( ) + Ptrans

(A-48) (A-49)

where trans and Ptrans were introduced in equations (A-38b) and (A-42). We shall see that actually coincides with the Hamiltonian in the Coulomb gauge of the global system field + particles. A-3-c.

Total angular momentum

Calculations similar to ones just presented, but that will not be detailed here7 , show that the contribution of the longitudinal electric field to the total angular momentum is equal to: Jlong =

0

d3 r

(E

B) =

r

A (r )

(A-50)

Adding Jlong to the particles’ angular momenta, we get, taking (A-47) into account: r˙ + Jlong =

r

r

p

(A-51)

so that we can finally write: J=

r

p + Jtrans

(A-52)

where: Jtrans =

B.

0

d3 [r

(E

B)]

(A-53)

Describing the transverse field as an ensemble of harmonic oscillators

B-1.

Brief review of the one-dimensional harmonic oscillator

The energy of a harmonic oscillator is given by: =

1 2

where d d

1 2

2 2

(B-1)

2 is the oscillation frequency, and ˙ the oscillator velocity: = ˙

7 These

1968

˙2 +

calculations can be found in § 1 of Complement BI in [16].

(B-2)

B. DESCRIBING THE TRANSVERSE FIELD AS AN ENSEMBLE OF HARMONIC OSCILLATORS

This velocity obeys: d ˙= d

2

(B-3)

so that the equation of motion of 2

¨+

is:

=0

(B-4)

Consequently, the time evolution of ( ) is given by a (real) linear combination of cos( ) and sin( ). The dynamic state of the classical harmonic oscillator is defined at each instant by two real variables ( ) and ˙ ( ). It is often useful to combine them into a single complex variable ( ) by setting: ()=

( )+

˙( )

(B-5)

where is an arbitrary (time-independent) constant. Relations (B-2) and (B-3) show that ( ) obeys the first order differential equation: ˙ =



)=

+

˙

=

(B-6)

The time dependence of the new variable ( ) is therefore simply . One can invert the system formed by equation (B-5) and its complex conjugate yielding , and compute and ˙ as a function of and . Inserting the expressions thus obtained in equation (B-1) for the energy , we obtain by a simple calculation8 : 2

=

4

2

(

The constant 2

4

2

=

+

)

(B-7)

can be chosen so that:

~ 2

(B-8)

This leads, after quantization, to the Hamiltonian operator: ˆ = ~ (ˆ ˆ + ˆˆ ) 2

(B-9)

which is the Hamiltonian of a harmonic oscillator9 . B-2. B-2-a.

Normal variables for the transverse field Vibration eigenmodes of the free transverse field

In the reciprocal space, expression (A-40) for the free transverse field energy trans ˜˙ (k ) and A ˜ (k ). For each value of k, we get is a sum of quadratic functions of A a harmonic oscillator Hamiltonian. The evolution introduces no coupling between the various spatial Fourier components of the transverse field. We see the advantage of 1969

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

Figure 1: For each vector k, the transverse fields can have two polarizations characterized by unit vectors ε1 (k) and ε2 (k) perpendicular both to each other and to k. working in the reciprocal space: it enables us to identify the eigenmodes of the field vibrations, in the absence of sources. Actually, for each k, the transverse field can have two different polarizations 10 characterized by unit vectors ε1 (k) and ε2 (k), both perpendicular to k and to each ˜ (k ), as an example: other, so that we can write for A ˜ (k ) = ˜ A

ε1 (k) (k

) ε1 (k) + ˜

ε2 (k) (k

˜

) ε2 (k) =

ε (k) (k

) ε (k) (B-10)

ε (k)

with: ˜

˜ (k ) ) = ε (k) A

ε (k) (k

(B-11)

The set k ε (k) defines what we shall call in this chapter a free field mode; they are the eigenmodes of the free field vibration, with a frequency: =

(B-12)

To simplify the notation, we shall write the last summation in (B-10) in a more compact form: ˜

ε (k) (k

˜

) ε (k)

ε (k



(B-13)

ε

ε (k)

Let us rewrite expression (A-40) for ˜ trans expliciting the components of the fields A (k ) and A˙ (k ) on the polarization vectors. We get: trans

=

0

2

˜˙

d3

ε (k

) ˜˙

ε (k

)+

2

˜

ε (k



ε (k

)

(B-14)

ε

8 In view of the quantization where and will be replaced by non-commuting operators ˆ and ˆ , we keep the sequence of and as they appear in the computations. 9 If ˆ and ˆ obey the canonical commutation relation [ˆ ˆ] = ~, relation (B-8) for the choice of also leads to the commutation relation [ˆ ˆ ] = 1. 10 We choose real vectors ε (k) and ε (k) corresponding to linear polarizations, but the choice of 1 2 these two polarizations is arbitrary, since they can always be rotated by any angle around k. It is also possible to perform a more general change of basis with complex vectors defining elliptical polarizations, for instance the right and left circular polarizations ε = (ε1 ε2 ) 2. Circular polarizations are useful when discussing electromagnetic spin — see § 3 of Complement BXIX . If complex (orthonormal) polarizations are used, ε (k) should be replaced by ε (k) in the right side hand of relation (B-11).

1970

B. DESCRIBING THE TRANSVERSE FIELD AS AN ENSEMBLE OF HARMONIC OSCILLATORS

Note that the components on the two polarizations ε are truly independent dynamic variables (generalized coordinates and velocities). This is not the case for the Cartesian components ˜ (k ) and ˜˙ (k ) (with = ), because of the transversality ˜ = 0. condition. For example, the components ˜ (k ) must obey Constraints on the dynamic variables in the reciprocal space ˜ (k ) = A ˜ ( k ). Since the fields are real in real space, we have the condition A In half the reciprocal space, the variables ˜ ε (k ) and ˜ ε (k ) can be considered as independent .

B-2-b.

Definition of the normal variables, free field case

Let us first assume that we are in the free field case (j˜ = 0), and we can replace ˜ (k ) by A ˜˙ (k ) in equations (A-30a) and (A-30b). As = , we get two the field E equations exactly similar to those of a harmonic oscillator (B-2) and (B-3), with A (k ) instead of ( ). This analogy suggests introducing, as in (B-5), a new transverse variable: α(k ) = =

˜ (k ) + ( ) A ( )

k 2

˜˙ (k ) A

˜ B(k )

1 ˜ E (k )

(B-15)

where ( ) is a real constant, not yet specified, which can depend on (its value will be chosen at the beginning of the next chapter). This definition, together with (A-30b), yields the equation of motion for α(k ): ˙ α(k )+

α(k ) = 0

(B-16)

As opposed to A (k ) that, according to (A-31), obeys a second order equation, this new variable α(k ) obeys a first order equation. It is a complex variable whose time evolution is proportional to , and not, as is the case for the variable A (k ), to a linear superposition of and + . It will be useful in what follows to consider the complex conjugate of equation (B-15):

α (k ) = =

˜ (k ) ( ) A ˜ ( k ) ( ) A

˜˙ (k ) A ˜˙ ( k ) A

(B-17)

To go from the first to the second line of (B-17), we used the fact that A is real in the real space, which leads to: ˜ (k ) = A ˜ ( k ) A

(B-18)

˜˙ . The transverse variables α(k ) and α (k ) are called A similar relation exists for A the transverse field normal variables. We will see in the next chapter that the quantization process will transform these variables into annihilation and creation operators of photons. 1971

CHAPTER XVIII

B-2-c.

REVIEW OF CLASSICAL ELECTRODYNAMICS

Equation of motion for the normal variables in the presence of sources

In the presence of sources, j˜ is no longer zero. We can still define the normal variables α(k ) by relations (B-15), but we must now keep the term in j˜ (k ) on the right-hand side of equation (A-30b). The same transformation that led us from equations (A-30a) and (A-30b) to (B-16) now yields a new equation of motion in the presence of sources: ˙ α(k )+

( )˜ j (k )

α(k ) =

(B-19)

0

This equation is strictly equivalent to Maxwell’s equations for the transverse fields. One can see this by taking the time derivative of equations (B-22a) and (B-22b) given below, and using (B-19) to get the time-dependent evolution equations (A-30a) and (A-30b) of these fields. Independence of the normal variables Another interest of the normal variables is that they are independent: there is no re˜ (k ) and lation between α(k ) and α ( k ) such as the one that exists between A ˜ ( k ). This is because the real and imaginary parts of α(k ) depend on two inA ˜ (k ) and its time derivative. It is easy to check, by dependent degrees of freedom, A changing the sign of k in (B-15) and by using (B-18) that:

α( k ) =

˜ (k ) + ( ) A

˜˙ (k ) = α (k ) A

(B-20)

The knowledge of the α(k ) in the entire reciprocal space does not entail the knowledge of the α (k ). Consequently, the integrals over k of the normal variables must be taken over the entire space, and not be limited to half the reciprocal space. B-2-d.

Expression of the physical quantities in terms of the normal variables

We are going to show that all the physical quantities can be expressed in terms of the normal variables. .

Transverse fields in the reciprocal space Replacing k by α ( k )=

k, we can rewrite equation (B-17) as:

˜ (k ) ( ) A

˜˙ (k ) A

(B-21)

˜ (k ) and A ˜˙ (k ) as a function of Using (B-15) and (B-21), we can now express A α(k ) and α ( k ). We get: ˜ (k ) = A ˜˙ (k ) = A

1972

1 [α(k ) + α ( k )] 2 ( ) 2 ( )

[α(k )

α ( k )]

(B-22a) (B-22b)

B. DESCRIBING THE TRANSVERSE FIELD AS AN ENSEMBLE OF HARMONIC OSCILLATORS

.

Energy and momentum of the transverse field

˜ (k ) and ˜˙ (k ) in the expression We insert relations (B-22a) and (B-22b) for A (A-40) for trans , using the more compact notation: α = α(k )

;

α = α( k )

(B-23)

We get:

trans

= =

0

2 0

2

2

d3

4

2(

α ) (α



)

α ) + (α + α ) (α + α )

2

d3

4

2(

α + 2α



)

α

(B-24)

(in these equations, we keep the ordered sequence of α and α as they appear in the computations, even though α and α are commuting numbers; the reason is that similar computations can be carried out in the quantum theory where α and α will be replaced by non-commuting operators). A change of variable k k in the integral of the terms in α α yields an integral of α α . We then get: 2 trans

=

0

d3

2(

4

)

α+α α ]



(B-25)

Expliciting the components of α and α on the two polarization vectors ε perpendicular to k, and using the simplified notation (B-13), we finally get: 2 trans

=

0

d3 ε

4

2(

)

[

ε (k

)

ε (k

)+

ε (k

)

ε (k

)]

(B-26)

This expression looks a lot like a sum of harmonic oscillator Hamiltonians; a suitable choice for the constant will be made in the next chapter. Similar calculations can be carried out for the transverse field momentum Ptrans 11 . Using equations (A-45), (B-22a) and (B-22b), we get: Ptrans =

0

d3 ε

.

4

2(

)

k [

ε (k

)

ε (k

)+

ε (k

)

ε (k

)]

(B-27)

Transverse fields in real space

˜ (k ), whose expression in Let us consider first the transverse potential vector A terms of the normal variables is given by (B-22a). To get its expression in real space, one must, taking (A-11) into account, multiply (B-22a) by (2 ) 3 2 k r and integrate over k. Making the change of variable k k in the integral containing α ( k ), we finally get: A (r ) =

1 (2 )3

2

d3 ε

1 2 ( )

ε (k



kr

+

ε (k



kr

(B-28)

11 The expression of the angular momentum J trans of the transverse field, in terms of the normal variables, will be computed in Complement BXIX .

1973

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

This relation (as well as the next two equations) is written in the general case where the polarizations may be complex, elliptical or circular (cf. note 10). This is why the term in ε (k ) contains a complex conjugate polarization ε . Similar calculations can be carried out for the transverse electric field as well as for the magnetic field. They yield: E (r ) =

B(r ) =

1 (2 )3

1 (2 )3

2

d3 ε

2

d3 ε

2 ( )

2 ( )

ε (k

ε (k





kr

ε

ε (k

kr

kr



ε (k



ε

(B-29)

kr

(B-30) where κ has been defined in (A-16) as the unit vector parallel to k. B-3.

Discrete modes in a box

So far, we have considered radiation propagating in an infinite space and used continuous Fourier transforms; in relation (A-11), the electric field is expanded on a continuous basis of normalized plane waves k r (2 )3 2 . It is often useful, however, to use a discrete basis, assuming the radiation to be contained in a box of finite volume, generally defined as a cube of edge length ; this will frequently occur in the next two chapters when dealing with the quantized radiation. The components of each wave vector must obey the boundary conditions in the box12 , and hence take on discrete values: =2

(B-31)

At the end of the computation, nothing prevents us from choosing a very large value of in order to check that the final result does not depend on . Instead of continuous spatial Fourier transforms, one must now introduce discrete Fourier series where each physical quantity is expanded in terms of normalized plane waves k r 3 2 . The expansion (A-11) of the electric field then becomes: E(r ) =

1

˜k ( ) E

3 2

kr

(B-32)

k

with13 : ˜k ( ) = E

1 3 2

d3 E(r )

kr

(B-33)

The summation in (B-32) is discrete, and the integral in (B-33) is now limited to the volume of the box. 12 One

can choose to impose the field being zero on the walls, but it is generally easier to enforce periodic boundary conditions (B-31), which lead to the same density of states. 13 In Appendix I, we used a slightly different definition for the Fourier series, with which the factor 1 3 2 would be missing from (B-32), but where (B-33) would contain a factor 1 3 . The definition we use here is chosen to directly yield an expansion of E(r ) on plane waves normalized in the cube.

1974

B. DESCRIBING THE TRANSVERSE FIELD AS AN ENSEMBLE OF HARMONIC OSCILLATORS

Note that if the field is zero outside the box, it is obviously possible to use the con˜ tinuous Fourier transform (A-10) to get the field component E(k ); however, this latter component is different from the discrete component E˜k ( ), because of the coefficients introduced in the definitions. The two components are related by: E˜k ( ) =

3 2

2

˜ E(k )

(B-34)

The same changes can be made on the Fourier transforms of all the other physical quantities such as the magnetic field, the vector potential, as well as the charge and current densities. The equations in the reciprocal space such as equations (A-17), (A-21), (A-25) and the following, remain valid if we replace the continuous variables k by discrete vari3 2 3 2 ables, since each side of those equations are multiplied by the same factor (2 ) . In the case of a zero field outside the box, the ε (k ) are also replaced by : k ε( ) =

3 2

2

ε (k

)

(B-35)

Coming back to ordinary space (r ) via the inverse Fourier transform, we must use relations of the type (B-32) instead of (A-11). Consequently, once we replace in ˜ ( k ) by the E ˜ k ( ), we must also introduce a multiplicative the integral over d3 the E 14 factor : d3

2

=

3 2

(B-36) k

B-4.

Generalization of the mode concept

In the absence of sources, the solution of the equation of motion (B-16) for the normal variable ε (k ) is very simple, since it is an exponential with an angular frequency = : ε (k

)=

ε (k

0)

(B-37)

Inserting (B-37) in the expressions we just obtained for the transverse fields and the other physical quantities, we see that the fields are linear superpositions of progressive plane waves, propagating independently of each other. The free field energy and momentum are the sum of the squared moduli of the various normal variables, each being timeindependent and proportional to ε (k 0) 2 . The modes k ε introduced in this chapter permit expanding the free transverse fields on progressive plane waves. Nevertheless, other expansions on monochromatic waves that are not necessarily plane waves are also possible; they involve other families of modes, as we are now show, coming back to equation (A-31). In the absence of sources, (+) any monochromatic solution of this equation, of the form A (r) , necessarily obeys equation: (∆ +

2

)A

(+)

(r) = 0

(B-38)

14 The product of the multiplicative factor of (B-34) and that of (B-35) yields the usual factor (2 obtained directly from (B-31).

)3 ,

1975

CHAPTER XVIII

REVIEW OF CLASSICAL ELECTRODYNAMICS

kr (which is simply the Helmholtz equation) with = . The plane waves are a possible basis of eigenfunctions for this eigenvalue equation, but not the only one. There exists other bases, such as the basis of stationary waves cos k r and sin k r, the basis of multipolar waves (radiation modes with a specific angular momentum, whereas plane waves have a specific linear momentum), or the basis corresponding to Gaussian modes. More generally, any linear combination of plane waves with the same modulus k can become a mode. Whatever basis is chosen, the transverse field energy will be a sum of the squared moduli of normal variables introduced in the expansions of the transverse fields on the eigenfunctions of that basis. The expression of the other physical quantities, however, will only have a simple form in a particular basis. As an example, the momentum of the transverse field is a sum of squared moduli only in the basis of progressive plane waves, whereas the field angular momentum has a simple form only in the basis of multipolar waves. Note finally that the field can be contained in a cavity with well defined boundary conditions. Finding the eigenfunctions of equation (B-38) obeying these boundary conditions is a way to determine the eigenmodes of this cavity.

To conclude this chapter, we can say that the free radiation field is equivalent to an ensemble of one-dimensional harmonic oscillators associated with the modes k ε labeled by their wave vector and their transverse polarization. Each mode is associated with a field normal variable, similar to the classical variable of the corresponding classical oscillator, and which will become, in the quantization process, the oscillator annihilation operator. The results established in this chapter will be the simple starting point for the radiation quantization explained in the next chapter.

1976

COMPLEMENT OF CHAPTER XVIII, READER’S GUIDE AXVIII : LAGRANGIAN ELECTRODYNAMICS

FORMULATION

OF

The dynamic equations for the electrodynamic field (Maxwell’s equations) can be obtained from the Lagrangian formalism based on a principle of least action. This enables introducing expressions for the conjugate momenta of the various field variables, as well as for the field Hamiltonian when coupled to charged particles. The results of this complement are not indispensable for reading the other chapters and complements. They offer, however, an overview of a more general approach to quantum electrodynamics, which is essential for a relativistic treatment of these problems and for the use of path integrals (Appendix IV).

1977



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

Complement AXVIII Lagrangian formulation of electrodynamics

1

2

3

Lagrangian with several types of variables . . . . . . . . . . 1980 1-a Lagrangian formalism with discrete and real variables . . . . 1980 1-b Extension to complex variables . . . . . . . . . . . . . . . . . 1982 1-c Lagrangian with continuous variables . . . . . . . . . . . . . 1984 Application to the free radiation field . . . . . . . . . . . . . 1986 2-a Lagrangian densities in real and reciprocal spaces . . . . . . . 1986 2-b Lagrange’s equations . . . . . . . . . . . . . . . . . . . . . . . 1987 2-c Conjugate momentum of the transverse potential vector . . . 1987 2-d Hamiltonian; Hamilton-Jacobi equations . . . . . . . . . . . . 1988 2-e Field commutation relations . . . . . . . . . . . . . . . . . . . 1989 2-f Creation and annihilation operators . . . . . . . . . . . . . . 1990 2-g Discrete momentum variables . . . . . . . . . . . . . . . . . . 1991 Lagrangian of the global system field + interacting particles1992 3-a Choice for the Lagrangian . . . . . . . . . . . . . . . . . . . . 1992 3-b Lagrange’s equations . . . . . . . . . . . . . . . . . . . . . . . 1993 3-c Conjugate momenta . . . . . . . . . . . . . . . . . . . . . . . 1995 3-d Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1995 3-e Commutation relations . . . . . . . . . . . . . . . . . . . . . . 1996

Introduction As shown in Appendix III, the dynamics of a system of point particles in an external potential can be described either by Newton’s equations, or by a Lagrangian with the principle of least action leading to Lagrange’s equations, equivalent to Newton’s equations. An advantage of the Lagrangian formalism is that it facilitates the quantization of the theory: it directly leads to the definition of the conjugate momenta of the particles’ coordinates, and of the system’s Hamiltonian, which is a function of the coordinates and the conjugate momenta. It then naturally introduces the canonical commutation relations, fundamental for the quantum description of the system. This complement will show, in a succinct way, how the Maxwell-Lorentz equations, studied in this chapter and the next, can be deduced from a Lagrangian and a principle of least action. This will give a more general justification for the expression of the Hamiltonian of the system “field + particles” postulated in Chapter XIX and for the commutation relations also postulated in that chapter1 . Another advantage of the Lagrangian formalism, that we shall not exploit here, is that it is well suited to a relativistic description of the system “field + particles” which is why it is used in the quantum theory of relativistic fields. 1 The relations postulated in Chapter XIX are justified a posteriori by the fact that they lead to the correct Heisenberg equations for the quantum operators associated with the particles and the fields.

1979



COMPLEMENT AXVIII

We start in § 1 by extending the computations of Appendix III to the case where the system coordinates are complex and not real, even though the Lagrangian remains a real quantity. We will also show that the principle of least action and Lagrange’s equations can be generalized to the field case, that is to a case where the system coordinates no longer depend on a discrete but on a continuous index, such as the point r in real space. The discussion will be illustrated in § 2, which studies the Lagrangian of the free radiation field in the absence of sources. The field will be described by its components in reciprocal space, which are complex quantities. The Lagrangian then depends only on the field components and their time derivatives, hence making the computations easier than if the fields were described by the real components in real space (this is because the Lagrangian in real space depends not only on the field components and their time derivatives, but also on their spatial derivatives). In this study, we shall establish the expression for the field Hamiltonian, and the canonical commutation relations of the components of that field. Finally, we give in § 3 the expression for the electrodynamic Lagrangian in the Coulomb gauge in the presence of sources; we show how Lagrange’s equations deduced from this Lagrangian coincide with the Maxwell-Lorentz equations studied in Chapter XVIII. Several important relations for the quantization of the theory will be established: expression for the conjugate momenta of the particles and fields; expression for the Hamiltonian of the global system field + particles; canonical commutation relations. The results obtained in this complement give a base for the quantization process more general than the simplified approach of Chapitre XIX. The interested reader can find a more detailed description of the electrodynamic Lagrangian and Hamiltonian formalism in Chapter II of reference [16] and its complements. 1.

Lagrangian with several types of variables

1-a.

Lagrangian formalism with discrete and real variables

The Lagrangian is a real function of dynamical variables composed of “generalized coordinates” ( ) labeled by a discrete index and of the corresponding “generalized velocities” ˙ ( ) = d ( ) d . is written: [

1(

)

2(

)

( ); ˙ 1 ( ) ˙ 2 ( )

˙ ( )]

(1)

Consider a possible motion of the system where the coordinates ( ) follow a certain “path” Γ between an initial time in and a final time 2 . The integral of along the path Γ is, by definition, the action Γ associated with this path: 2

Γ

=

d

[

1(

)

2(

)

( ); ˙ 1 ( ) ˙ 2 ( )

˙ ( )]

(2)

1

The principle of least action postulates that, among all the possible paths starting from the same initial conditions described by ( in ) and arriving at the same final conditions described by ( 2 ), the system will follow the one for which Γ presents an extremum (if the path varies, Γ is stationary). Consider an infinitesimal variation () and ˙ ( ) of the dynamical variables around this path of extremum action, which does not change the initial and final values of the coordinates, i.e. such that: ( 1980

in )

=

( 2) = 0

(3)



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

The corresponding variation of the action 2

=

()

+ ˙ ()

1

must be zero to first order in by: d d

˙ ()=

d

˙

(4)

( ) and ˙ ( ). We replace, in the last term of (4), ˙ ( )

()

(5)

and integrate by parts the corresponding term. The integrated part is zero because of (3). We then get: d d

2

=

() 1

˙

d

As must be zero for any variation must obey the equations: d d

˙

(6) ( ), the path actually followed by the system

=0

(7)

These relations are called Lagrange’s equations; they can be shown to be equivalent to Newton’s equations (Appendix III). The next step of the Lagrangian formalism is to introduce the conjugate momenta of the coordinates , defined by the equations: =

(8)

˙

as well as the Hamiltonian =

equal to:

˙

(9)

Let us take the differential of d

= =

d ˙1 + ˙ d [˙ d

˙d ]

d

˙1

d˙ (10)

To go from the first to the second line of (10), we used (7) and (8) to replace by ˙ and ˙ by . We assume the ˙ can be expressed as a function of the and the . The Hamiltonian is then a function of the coordinates and the conjugate momenta , whose evolution obeys, taking (10) into account, the 2 equations: d d

=

d = d

(11) 1981



COMPLEMENT AXVIII

called the Hamilton-Jacobi equations. Let us finally recall the canonical quantization process. One associates with the coordinates and the conjugate momenta the operators ˆ and ˆ obeying the commutation relations: [ˆ ˆ ] = ~

(12)

all the other commutators being equal to zero. These results are valid only if the coordinates are Cartesian components (see comment in § B-5 of Chapter III). 1-b.

Extension to complex variables

Let us assume, to keep things simple, that the index takes only = 2 values. With the two real coordinates 1 ( ) and 2 ( ) we build the complex variables: ()=

1 [ 2

1(

)+

2(

)]

()=

1 [ 2

1(

)

2(

)]

(13)

whose real and imaginary parts are, within a factor 1 2, equal to 1 ( ) and ( ), 1 ( ) and ( ). Equations (13) can be inverted and yield: 2 ( ) for 1(

)=

1 [ ( )+ 2

( )]

2(

)=

[ ()

2

( )]

2(

) for

(14)

Analogous equations can be written, relating ˙ ( ) and ˙ ( ) to ˙ 1 ( ) and ˙ 2 ( ) and vice versa: ˙ ( ) = 1 [ ˙ 1 ( ) + ˙ 2 ( )] 2

˙ 1( ) =

1 2

˙ ( ) = 1 [ ˙ 1( ) 2

˙( )+ ˙ ( )

˙ 2( ) =

˙( )

2

˙ 2 ( )]

(15)

˙ ()

(16)

Inserting in the Lagrangian (1) expressions (14) and (16) for the variables, we get a Lagrangian of the form () ( ) ˙ ( ) ˙ ( ) that depends on complex variables. Note however that just as in (1), this Lagrangian, even though it depends on complex variables, is still a real quantity since its time integral along a path Γ is an action (which is a real quantity). We now study what becomes of all the results established earlier ˙ and ˙ . using (1) when they are expressed as a function of .

Lagrange’s equations It is important, for what follows, to relate and ˙ 1 and ˙ 2 . Using (14) and (16), we can write: 1

=

˙ 1982

=

˙1

2

1 2

1

2

˙2

˙2 1 = ˙ 2

˙1

˙2

2

+

1

˙1 + ˙

=

˙ to

1,

2,

(17)

(18)



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

Subtracting from equation (17) the time derivative of equation (18), we get: d d

=

˙

1 2

1

d d

˙1

2

2

d d

(19)

˙2

The two parentheses on the right-hand side of this equation are zero since 1 and 2 obey Lagrange’s equation (7). It then follows that the left-hand side is also zero, as is its complex conjugate2 : d d

which proves that .

d d

=0

˙

˙

=0

(20)

also obey Lagrange’s equations3 .

and

Conjugate momenta In (18) we replace 1 ( 2

=

˙

˙ 1 and

˙ 2 by

1

and

2

(see Eq. (8)). We get:

2)

1

(21)

The complex conjugate of equation (21) is written: =

˙

1 ( 2

1

+

2)

(22)

To choose the definition of the conjugate momentum of the complex variable , it is useful to compare the way the conjugate momentum and the velocity ˙ are transformed upon the change of dynamical variables 1 2 . We compare the first equation (15) and the two equations (21) and (22). The velocity ˙ becomes ˙ 1 + ˙ 2 ; a wise choice would be to define the associate momentum in such a way that its transformation yields 1 + 2 . Equations (21) and (22) then clearly show that ˙ , but rather as ˙ ; the complex conjugate must not be defined as is ˙ then equal to : =

=

˙

(23)

˙

This is the definition we shall use in the rest of this complement. .

Hamiltonian

The quantity ˙ 1 1 + ˙ 2 2 appears in the definition (9) of the Hamiltonian. This quantity can be rewritten by replacing ˙ 1 and ˙ 2 by their expressions (16) as a function of ˙ and ˙ , as well as 1 and 2 by analogous expressions as a function of and : ˙1

1

+ ˙2

2 Since 3 These

sidering

2

1 ˙ 1 ( + ˙ ) ( 2 2 = ˙ + ˙ =

+

)+

2



˙ )

2

(

) (24)

˙ . is real, we have ( ) = and an analogous equation for results could have been obtained directly by the variational calculation leading to (6), conand to be independent variables.

1983

COMPLEMENT AXVIII



We can then write: = ˙1

1

+ ˙2

2

= ˙

We now take the differential of = ˙d

d

+

+ ˙

(25)

:

d˙+ ˙ d + d˙

d

d

˙



˙



(26)

Using (20) and (23), we get: = ˙ d

d

˙ d

+ ˙d

˙d

(27)

If ˙ and ˙ can be expressed in terms of the variables , , , , the Hamiltonian only depends on those variables and we deduce from (27) the Hamilton-Jacobi equations: d d

d = d

=

(28)

and the complex conjugate equations for and . Note that it is the partial derivative with respect to (and not with respect to ) that is equal to the total derivative of . .

Canonical commutation relations

Upon quantization, the different variables become operators ˆ1 , ˆ2 , ˆ1 , ˆ2 , ˆ , ˆ , ˆ , ˆ . The commutation relations between the operators ˆ , ˆ , ˆ , ˆ are obtained by expressing those operators in terms of ˆ1 , ˆ2 , ˆ1 , ˆ2 and using the commutation relations (12). We can easily check that the only non-zero commutators are [ ˆ ˆ ] and [ ˆ ˆ ] = [ ˆ ˆ ]. Using (21), (22) and (23), we obtain: 1 [ ˆ ˆ ] = [ˆ1 + ˆ2 ˆ1 ˆ2 ] 2 1 = ([ˆ1 ˆ1 ] + [ˆ2 ˆ2 ]) = ~ 2 1-c.

(29)

Lagrangian with continuous variables

We now assume that the dynamical variables of the system depend on a continuous index, such as the point r in real space (or the point k in the reciprocal space when the real space is infinite). In other words, they constitute a field (r), where the discrete index labels the component of the field if we are dealing with a vector field; in the reciprocal space, this field becomes ˜ (k). We shall only establish here Lagrange’s equations for a real field. In the upcoming § 2, we shall study the radiation field described by its complex components in the reciprocal space. We will then generalize to a complex field all the results established earlier for discrete and complex variables. This will yield the expression for the free field Hamiltonian and the commutation relations for the free field, which are essential for the quantum description of the field. The Lagrangian of a real field is now the integral in real space of a Lagrangian density L : = 1984

d3 L (r )

(30)



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

The Lagrangian density is a function of the field respect to and to the components of r: L (r ) = L (r ) ˙ (r ) (r ) with the notation ( = ˙ (r ) = (r ) (r ) =

(r) and of its partial derivatives with (31)

): (32)

(r )

(33)

Consider a possible path Γ for the field, going from the value (r in ) at an initial time in to the final value (r 2 ) at a final time 2 . The action Γ associated with this path is, by definition: 2

Γ

=

(r ) ˙ (r )

d3 L

d

(r )

(34)

1

The principle of least action postulates that among all the possible paths starting from the same initial state and ending at the same final state4 , the path(s) actually followed by the system is the one (or are those) for which Γ presents an extremum. Let us compute the variation of the action for an infinitesimal variation of the path, characterized by the infinitesimal variations (r ), ( (r ) ) and ( (r ) ). 2

=

d3

d

(r )

L

1

L + ˙ (r ) + ˙

(

(r ))

L (

) (35)

Using: (

˙ (r ) =

(

(r ))

(r )) =

(

(r ))

(36)

and performing an integration by parts of the terms proportional to ( (r )) and ( (r )), we find that the integrated terms are zero because of the boundary conditions for (r ) at the initial and final times, and for r . The remaining terms are therefore all proportional to (r ). Grouping them all, we find5 : 2

=

d3

d

(r )

L

1

As

d d

L ˙

must be zero for any time or spatial variations of L

d d

L ˙

L (

)

=0

L (

)

(37)

(r ), we can deduce that: (38)

which are the Lagrange equations for the field. 4 We also assume that the Lagrangian density is zero or tends to zero fast enough when r goes to infinity. L 5 The function does not directly depend on time . It nevertheless depends indirectly on if ˙

we replace, as in (31), the fields and their derivatives by their values for a given history of the field. By convention, we then denote by dd L ˙ the time derivative of this function at each point of space. It contains the sum of the contributions of the partial derivatives of the function with respect to all the initial variables (the fields and their derivatives).

1985

COMPLEMENT AXVIII

2.



Application to the free radiation field

We now study the free radiation field (in the absence of sources) starting from its Lagrangian density in reciprocal space. We choose the Coulomb gauge so that the longitudinal vector potential A is zero and A is reduced to A ; furthermore, the scalar potential is also zero since, in the Coulomb gauge, it would be the potential corresponding to the Coulomb field created by the charges (see relation (A-34) of Chapter XVIII), and we are assuming that there are no charges. The only fields we have to consider are thus the transverse electric field and the magnetic field related to the transverse potential vector by the following equations in real space: A˙ (r )

E (r ) =

B(r ) = ∇

A (r )

(39)

˜ B(k )= k

˜ (k ) A

(40)

and in the reciprocal space: ˜˙ (k ) A

˜ (k ) = E 2-a.

Lagrangian densities in real and reciprocal spaces

The Lagrangian density most commonly used in real space is6 : L (r ) =

0

2

E 2 (r )

2

B 2 (r )

(41)

where is the speed of light. Using (39), we see that this Lagrangian density depends both on A (r ) and A˙ (r ), as well as on the spatial derivatives of A (r ). We now go to the reciprocal space. The Lagrangian is then written as: d3 L˜(k )

=

(42)

where the Lagrangian density L˜(k ) in the reciprocal space is obtained from (41), rewriting the fields in the reciprocal space7 . Let us evaluate the contribution to the Lagrangian of the two terms in the bracket of (41). Using (40) and the Parseval-Plancherel equality – see relation (A-12) of Chapter XVIII – we can write: d3 E (r ) E (r ) = d3 B(r ) B(r ) = As the two vectors k we have: d3

(

k

˜˙ (k ) A ˜˙ (k ) d3 A d3

(

˜ (k ) and k A

˜ (k ) ( k A

k

˜ (k )) ( k A

˜ (k )) A

(43)

˜ (k ) are in the plane perpendicular to k, A

˜ (k ) = A

d3

2

˜ (k ) A ˜ (k ) A

(44)

Finally, the Lagrangian density in the reciprocal space is written as: L˜(k ) = 6 This

0

2

˜˙ (k ) A ˜˙ (k ) A

2 2

˜ (k ) A ˜ (k ) A

(45)

density has the advantage of being a relativistic invariant (Lorentz scalar). use the notation L˜(k ) for this Lagrangian density even though this function is not the Fourier transform of L (r ). 7 We

1986



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

The generalized coordinates of the field can be seen as components of the transverse potential vector, and the generalized velocities as the time derivatives of these coordinates. ˜ (k ) and A ˜˙ (k ), and not on the As opposed to L (r ), L˜(k ) depends only on A ˜ partial derivatives of A (k ) with respect to the k components. The computations will thus be simpler in the reciprocal space. ˜ (k ) It will be useful for what follows to introduce the Cartesian components of A in the reference frame formed by κ = k and the two polarization unit vectors ε1 (k) ˜ (k ) is transverse, it does not have and ε2 (k) in the plane perpendicular to k. Since A any component along κ, and we can write: L˜(k ) =

˜˙

0

2

ε (k

) ˜˙

ε (k

)

2 2

˜

ε (k



ε (k

)

(46)

ε

where ε is a simplified notation representing the summation over the two transverse polarizations ε1 (k) and ε2 (k) – see relation (B-13) of Chapter XVIII. 2-b.

Lagrange’s equations

Equation (38) becomes here: L˜(k ) ˜ (k ) ε

L˜(k ) =0 ˜˙ (k ) ε

d d

(47)

We then get, taking (46) into account: ¨ ˜

ε (k

)+

2 2

˜

ε (k

)=0

(48)

This equation coincides with equations (A-30a) and (A-30b) of Chapter XVIII, which give the time evolution of the transverse vector potential of a free field in the absence of sources (we set j˜ = 0). We recover, as expected, the predictions of Maxwell’s equations in the usual formulation of classical electrodynamics. 2-c.

Conjugate momentum of the transverse potential vector

˜ ε (k ) of the complex variable ˜ ε (k ), To define the conjugate momentum Π we use expression (23). Note however that the velocity ˜˙ ε (k ) appears several times in the integral over k of L˜(k ). Consequently, we must add all the corresponding contributions of the partial derivatives of L˜(k ) with respect to these various velocities ˜ ε (k ). This situation results from the in the definition of the conjugate momentum Π fact that the fields are real in real space. The Fourier transform properties then lead to (cf. relation (B-18) of Chapter XVIII): A (r ) = A (r )

˜ (k ) = A ˜ ( k ) A

(49)

and to an equivalent relation for the time derivatives of the components of the transverse potential vector. In the integral over k of ˜(k ) we get, in addition to the term ˜˙ (k ) ˜˙ ε (k ), the term ˜˙ ( k ) ˜˙ ε ( k ) which, according to (49), is equal to ε ε 1987



COMPLEMENT AXVIII

˜˙ ε (k ) ˜˙ ε (k ) and therefore doubles the first term. If we ignore the terms in we must then double the contribution of the terms in k, which yields: ˜ Π

ε (k

)=2 =

0

L˜(k ) ˜˙ (k ) ε ˜˙ (k ) = ε

0

˜

ε (k

)

k,

(50)

The conjugate momentum of the transverse potential is seen to be equal, within a factor 0 , to the transverse electric field. Another equivalent way to obtain (50) is to use, in the Lagrangian expression, only independent variables. The reality condition (49) ensures that, if one knows the variables in half the reciprocal space, one knows them in the entire space. One can define as the integral over only half the reciprocal space (where all the variables are independent) of a Lagrangian density, noted L¯(k ), equal to twice the initial density. Writing “ ” the integral over half a space (the bar indicating that the k space is divided into two parts), we get: d3 L¯(k )

=

(51)

with: L¯(k ) =

˜˙

0

ε (k

) ˜˙

ε (k

2

)

˜

ε (k



ε (k

) = 2 L˜(k )

(52)

ε

so that one can also define the conjugate momentum of the transverse potential vector as: ˜ Π

ε (k

2-d.

L¯(k ) = ˙˜ (k ) ε

)=

0

˜˙

ε (k

)

(53)

Hamiltonian; Hamilton-Jacobi equations

The Hamiltonian of the free radiation field is obtained by generalizing, to continuous variables, expression (25) established for discrete variables. To only include independent variables in the integral over k of the Hamiltonian density ¯ (k ), this integral is taken over only half the reciprocal space: d3

=

¯ (k )

(54)

where the Hamiltonian density ¯ (k ) is equal to: ˜˙

L¯(k ) +

¯ (k ) =

ε (k

˜ )Π

ε (k

) + ˜˙

ε (k

˜ )Π

ε (k

)

(55)

ε

which yields for =

0

, taking (53) and (52) into account: ˜˙

d3 ε

1988

ε (k

) ˜˙

ε (k

)+

2 2

˜

ε (k



ε (k

)

(56)



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

If the integral over the half-space in (56) is extended to the entire space, one must replace 0 by 0 2; we then get the same expression as (B-14) of Chapter XVIII for the energy of the free transverse field. The Hamiltonian obtained with the Lagrangian formalism coincides with the field energy. We can write expression (55) for ¯ (k ) as a function of only the variables ˜ ε (k ) ˜ ε (k ). Using (53), we get: and Π 1 ˜ Π

¯ (k ) =

0

ε

ε (k

˜ )Π

ε (k

)+

0

2 2

˜

ε (k



ε (k

)

(57)

Equations (28) can then be generalized and yield the Hamilton-Jacobi equations for ˜ ε (k ): ˜ ε (k ) and Π ¯ (k ) 1 ˜ = Π ε (k ) ˜ (k ) Π 0 ε ¯ (k ) 2 2 ˜ ˜˙ ε (k ) = Π = 0 ε (k ) ˜ (k ) ˜˙

ε (k

)=

(58a) (58b)

ε

It is easy to check that the two equations (58a) and (58b) are the same as Maxwell’s equations (A-30a) and (A-30b) of Chapter XVIII, which describes the evolution of the transverse fields in the absence of sources. Equation (58a) of the present complement and equations (A-30a) of Chapter XVIII are identical, and define the transverse electric field as a function of the time derivative of the transverse potential vector. As for equation (58b) of this complement and equation (A-30b) of Chapter XVIII, they are the same when j˜ = 0. They describe the evolution of the transverse electric field. 2-e.

Field commutation relations

Generalizing the canonical commutation relation (29) to the case of continuous variables, we get: ˆ ˜

ε (k)

ˆ ˜ Π

ε

(k ) = ~

εε

(k

k)

(59)

all the other commutators being equal to zero. The Kronecker delta εε of the vectors ε and ε is equal to 1 if both these vectors are the same, and to 0 if they are different. Comment The canonical commutation relations only apply to independent conjugate variables, which is the case for the components of the various fields along the transverse polarization directions. Now the field components on an arbitrary fixed reference frame e , e , e , ˜ (k) with = , are not independent because of the transversality condition ˜ (k) = 0. Therefore: ˆ ˆ ˜ ˜ (k) Π

(k ) = ~

(k

k)

(60)

ˆ ˜ (k ), we must express ˜ (k) and Π To get the correct commutation relation between ˆ both quantities as functions of their components along the two polarization vectors ε and

1989

COMPLEMENT AXVIII



ε0 , which are perpendicular both to each other and to k (here we choose a basis of linear, real, polarizations), and then use (59). As an example: ˆ ˜ (k) =

ˆ ˜

ε (k)

ˆ ˜

+

ε

(k)

(61)

where: =e

ε

=e

ε

(62)

We thus get: ˆ ˆ ˜ ˜ (k) Π

(k ) = ~ (

+

k)

) (k

(63)

This equation can be further transformed by noting that ε, ε0 and k mal basis, so that: +

2

+

=

(64)

ˆ ˆ (k) and Π ˜ Finally, the correct commutation relation between ˜ ˆ ˆ ˜ ˜ (k) Π

form an orthonor-

(k ) = ~

2

(k

k)

(k ) is written: (65)

We multiply both sides of (65) by k r k r (2 )3 and integrate over k and k . We then ˆ (r )], get on the left-hand side the commutator of the fields in real space8 , [ ˆ (r) Π 2 and on the right-hand side the Fourier transform of the function ( ) which is the transverse delta function 9 (r r ): ˆ ˆ (r) Π

2-f.

(r )

=

~ (2 )3

d3

d3

=

~ (2 )3

d3

k (r

k r

kr

(k

2 r ) 2

~

(r

r)

k) (66)

Creation and annihilation operators

The normal variable α(k ) introduced in equation (B-15) of Chapter XVIII has a time evolution , where = for the free field case. According to § B-1 of Chapter XVIII, this variable becomes, upon quantization, the annihilation operator ˆε (k ) of the harmonic oscillator associated with the mode k ε . As for the complex conjugate of this normal variable, it becomes the creation operator ˆε (k ). These two operators are therefore written as: ˆε (k ) =

( ) ˆ

ε (k

)+

ˆ Π

ε (k

)

ˆ Π

ε (k

)

0

ˆε (k ) =

( ) ˆ

ε (k

) 0

(67)

ˆ ˆ ˆ 8 For the term in Π ˜ (k ), we use the reality condition Π ˜ (k ) = Π ˜ ( k ) and change k into k in the integral over k . 9 The interested reader can find details on the properties of that function in Complement A of I reference [16].

1990



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

˙ ˆ ε (k). The quantity ( ) is a where we have used (53) to write ˆ ε (k) = (1 0 )Π normalization constant, arbitrary for now. It can, however, be determined by imposing for the commutator of the two operators (67) a generalization of the well-known relation [ˆ ˆ ] = 1 for the harmonic oscillator. We thus use the two equations (67) to compute the commutator ˆε (k) ˆ (k ) as a function of the commutators of the fields ˆ and ε

ˆ and of their adjoints. Since the only non-zero commutators are between ˆ and Π ˆ Π ˆ as well as between ˆ and Π , we get: ˆε (k) ˆε (k ) =

2

=

2

ˆ

( )

ε (k)

ˆ Π

0

( )

2~

εε

(k

ε

(k )

ˆ Π

ε (k)

ˆ

ε

(k )

k)

0

=

2

( )

2~ εε

(k

k)

(68)

0

To go from the first to the second line of (68), we used (59) and its complex conjugate. The constant ( ) is finally determined by imposing the commutators between the annihilation and creation operators to be equal to εε (k k ), which yields: 0

( )=

2~

0

=

(69)

2~

Inserting this relation in equality (B-25) of Chapter XVIII, we find that the contribution to the classical energy of the mode k ε is: } [ 2

ε (k) ε (k)

+

ε (k) ε (k)]

(70)

We shall see in Chapter XIX that it is indeed the equivalent of the expression for the quantized radiation Hamiltonian. 2-g.

Discrete momentum variables

We examined, in § B-3 of Chapter XVIII, the case where the radiation is contained in a box of finite volume 3 , which leads to a discrete summation over the momenta. Relation (59) then become: ˆ ˜

k

ˆ ˜ Π

= ~

k

εε

(71)

kk

Applying the substitution (B-34) or (B-35) of that chapter to both sides of (67), the two 3 coefficients (2 ) cancel each other, and these relations remain unchanged (aside from the fact that k is now a discrete index rather than a continuous variable). As for the relations (68), they become: ˆk

ˆk

=

2

( )

With the choice (69) for and ε = ε .

2} εε

kk

(72)

0

( ), we check that the commutator is equal to unity if k = k

1991



COMPLEMENT AXVIII

3.

Lagrangian of the global system field + interacting particles

We now study the Lagrangian of the total system, including the interactions between the particles and the electromagnetic field. 3-a.

Choice for the Lagrangian

We choose a Lagrangian expressed as: =

+

+

(73)

where depends only on the radiation variables, only on the particle variables, and on both types of variables as it describes the interactions between particles and radiation. For , we shall take the Lagrangian introduced above for the free field – see relations (51) and (52): =

d3 L¯ (k ) =

d3

0

˜˙

ε (k

) ˜˙

ε (k

2

)

˜

ε (k



ε (k

)

(74)

For the Lagrangian of the particles, labeled by the index , we shall use the usual Lagrangian for a system of particles, i.e. the difference between their kinetic energy and their potential interaction energy which comes from the Coulomb forces they exert on each others: 1 2

=

2

r˙ ( )

(75)

Coul

Finally, the interaction Lagrangian will be chosen as: =

d3 j(r ) A (r )

(76)

where j(r ) is the particle current density given by the expression: j(r ) =

r˙ ( ) (r

r ( ))

(77)

and is the charge of particle . This expression does not contain terms including A , the charge density (r ) or the scalar potential (r ). This is because of our choice of the Coulomb gauge in which A is zero, so that the energy of the longitudinal electric field only depends on the particle variables; the same is true for the scalar potential, which is at the origin of the term Coul included in the particle Lagrangian (75). For the following computations, it will be useful to give other equivalent expressions for ; depending on the problem we focus on, we shall use the most suitable expression. Inserting (77) into (76), we get: =

d3 j(r ) A (r ) =

r˙ ( ) A (r

)

(78)

Now, using the Parseval-Plancherel identity, we get: = 1992

d3 j(r ) A (r ) =

˜ (k ) d3 j˜ (k ) A

(79)

• In the integral over

LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

in (79) we get the sum:

˜ ˜ (k ) + j˜ ( k ) A ˜ ( k ) = j˜ (k ) A ˜ (k ) + j(k ˜ (k ) (80) j˜ (k ) A ) A Limiting the integral to half the reciprocal space, we can write: =

˜ (k ) d3 j˜ (k ) A

=

d3

3-b.

˜ ˜ (k ) + j(k ˜ (k ) j˜ (k ) A ) A

(81)

Lagrange’s equations

We now show that the Lagrange’s equations associated with (73) coincide with the Maxwell-Lorentz equations, which will be a justification for the choice of . .

Field Lagrange’s equations

The time derivative of the transverse potential vector ˜˙ ε only appears in the Lagrangian density L¯ (k ) written in (74). Writing L¯ the Lagrangian density of the total system, we can write: L¯ ˜˙

ε (k

= )

L¯ ˜˙

ε (k

= )

0

˜˙

ε (k

)

(82)

Now ˜ ε (k ) appears both in L¯ and in L¯ which is the function to be integrated in the last integral of (81). We get:

˜

L¯ = ε (k ))

˜

L¯ + ε (k ))

˜

L¯ = ε (k )

0

2 2

˜

ε (k

)+˜

ε (k

)

(83)

The field Lagrange’s equation is obtained by setting equal the time derivative of (82) and relation (83), which yields: ¨ ˜

ε (k

)+

2 2

˜

ε (k

)=



ε (k

)

(84)

0

We thus get the time evolution equation of the transverse potential vector in the presence of sources, which we already obtained in (A-31) of Chapter XVIII. .

Lagrange’s equations for the particles

The velocity r˙ of particle of r˙ on the axis, we get: ˙

=

˙

+

˙

=

˙ +

appears in both

(r ( ) )

and

. Noting ˙ the component

(85)

We now compute the time derivative of this expression. The time dependence of the second term of (85) is explicit via the time dependence of the vector potential, implicit 1993



COMPLEMENT AXVIII

via the time dependence of the point r ( ) where this potential is evaluated. We therefore have: d d

˙

(r

¨ +

=

)





+

(r

)

The partial derivative of the transverse vector potential with respect to transverse electric field: (r

)

=

(r

)

(86) leads to the

(87)

As for the last term of (86), it can be written: ∇



(r

)=

˙

+ ˙



(88)

We now compute the partial derivative of with respect to the component of r , which appears in the term Coul of as well as in (via the position dependence of A ). We obtain: =

+

Coul

=

A



+

(89)

The term is the Coulomb force exerted on the particle , i.e. the force due Coul to the electrostatic field created by the charge distribution and acting on the charge at point r where the particle is located. It can also be written as: Coul

=

(r

)

(90)

where E is the longitudinal electric field, since, as we saw in Chapter XVIII, the longitudinal electric field is equal to the electrostatic field created by the charge distribution. Finally, let us explicitly write the last term of (89): A



=

˙

+ ˙



Lagrange’s equation for particle d d

˙

(91)

:

=

(92)

can thus be written, using the previous results: ¨ =

(r

)





+



A

(93)

where the total electric field at point r is : E(r ) = E (r ) + E (r ) 1994

(94)



LAGRANGIAN FORMULATION OF ELECTRODYNAMICS

We can write explicitly the last term of (93) by regrouping equations (88) and (91). We then get: ∇



=

˙

=

[r˙

A



+

˙ A )] =

(∇

(r˙

B)

(95)

This yields the Lorentz magnetic force exerted by the magnetic field on the particle with velocity r˙ . To sum up, Lagrange’s equation for particle is written: r¨ =



E(r ) +

B(r )

(96)

and coincides with the Lorentz equation (A-3) of ChapterXVIII. The Lagrangian we chose above therefore leads to the right equations for the field and the particles. 3-c.

Conjugate momenta

The equation (82) established above can be used to compute the conjugate mo˜ ε (k ) of the field variables ˜ ε (k ): menta Π ˜ Π

ε (k



)=

˜˙

=

ε (k )

L¯ ˜˙

=

0

ε (k )

˜˙

ε (k

)) =

0

˜

ε (k

)

(97)

In a similar way, the computation of the conjugate momentum of the coordinate r of particle is the same as the one leading to equation (85): p = 3-d.



=



+



r˙ +

=

A (r )

(98)

Hamiltonian

˜ = 0 ˜˙ Since ˜˙ ε and Π ε appear only in the radiation Lagrangian L , which only depends on the radiation variables, the Hamiltonian of the global system must contain a term identical to the expression (56) found for the free field. To obtain the other terms coming from the particle conjugate momenta and from the subtraction of and , we first compute r˙ p . Equation (98) yields: r˙

p =

r˙ 2 +



A (r )

(99)

We must now subtract from (99) the values of and given by equations (75) and (78). The term coming from cancels the last term of the right-hand side of (99) and we are left with: Coul

+

1 2

r˙ 2 =

Coul

+

2

A (r )]

[p 2

(100) 1995

COMPLEMENT AXVIII



Finally, the global system Hamiltonian is given by: =

+

Coul

+

2

A (r )]

[p 2

(101)

where has the same form as (56) for the free radiation. This result is a justification for the expression of given in equation (A-41) of Chapter XVIII. 3-e.

Commutation relations

Since the radiation variables and their conjugate momenta are the same as for the free field, the commutation relations (59) established for the free field remain valid: ˆ ˜

ε (k)

ˆ ˜ Π

ε

(k ) = ~

εε

(k

k)

(102)

all the other commutators being equal to zero. As for the commutation relations for the positions and conjugate momenta of the particles, they are the usual relations: [ˆ

ˆ ]= ~

(103)

where the indices label the particles and the indices = the Cartesian components of r and p. As in § 2-g, one can extend these commutation relations to the case where the momenta are discrete.

1996

Chapter XIX

Quantization of electromagnetic radiation A

B

C

Quantization of the radiation in the Coulomb gauge . . . . 1999 A-1 Quantization rules . . . . . . . . . . . . . . . . . . . . . . . . 1999 A-2 Radiation contained in a box . . . . . . . . . . . . . . . . . . 2001 A-3 Heisenberg equations . . . . . . . . . . . . . . . . . . . . . . . 2002 Photons, elementary excitations of the free quantum field . 2004 B-1 Fock space of the free quantum field . . . . . . . . . . . . . . 2004 B-2 Corpuscular interpretation of states with fixed total energy and momentum . . . . . . . . . . . . . . . . . . . . . . . . . . 2005 B-3 Several examples of quantum radiation states . . . . . . . . . 2006 Description of the interactions . . . . . . . . . . . . . . . . . 2009 C-1 Interaction Hamiltonian . . . . . . . . . . . . . . . . . . . . . 2009 C-2 Interaction with an atom. External and internal variables . . 2010 C-3 Long wavelength approximation . . . . . . . . . . . . . . . . 2010 C-4 Electric dipole Hamiltonian . . . . . . . . . . . . . . . . . . . 2011 C-5 Matrix elements of the interaction Hamiltonian; selection rules 2014

Introduction This chapter presents a quantum description of the electromagnetic field and its interactions with an ensemble of charged particles. Such a description is necessary for interpreting certain physical phenomena such as the spontaneous emission of a photon by an excited atom, which cannot be carried out with the semiclassical treatments we have used previously1 (classical description for the field, and quantum description for the 1 See for example in Complement electromagnetic wave.

AXIII the study of the interaction between an atom and an

Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

particles). Imagine, for example, that a monochromatic field with angular frequency is described by a classical field E0 cos ; its interaction with an atom is then described by the Hamiltonian = D E0 cos , where D is an operator (the electric dipole moment) whereas E0 remains a classical quantity2 . Such a treatment is adequate for understanding how the field can excite the atom from its ground state with energy towards an excited state of energy ; the processus is resonant if is close to the atomic Bohr frequency 0 = ( ) }. Imagine now that the atom is initially in the excited state , in the absence of any incident radiation. The classical field E0 is then identically zero and, consequently, so is the interaction Hamiltonian . The Hamiltonian of the total system is then reduced to the atomic Hamiltonian . Since this operator is time-independent, its eigenstates are stationary, including, in particular, the excited state . The semiclassical theory predicts that an atom, initially excited in a state in the absence of incident radiation, will remain indefinitely in that state. But this is not what is experimentally observed: after a certain time, the atom spontaneously falls into a lower level , emitting a photon whose frequency is close to 0 = ( ) }. This process is called spontaneous emission and happens after an average time called the radiative lifetime of the excited state . This is a first example of a situation where a radiation quantum treatment is indispensable. It is far from being the only example: numerous experiments, more and more elaborate, have created situations where the quantum description of the electromagnetic field is necessary. This chapter presents the base of this quantum description, while following an approach that is as simple as possible – a more general presentation of the quantization of the electromagnetic field is possible with the Lagrangian formulation of electrodynamics (Complement AXVIII . In the previous chapter, we underlined the analogy between the eigenmodes of the radiation field vibrations and an ensemble of harmonic oscillators. We shall use this analogy in § A of this chapter, and proceed to a simple quantization of this ensemble of oscillators. With each eigenmode of the classical field, described by normal variables and , we shall associate annihilation and creation operators, obeying the well-known commutation relations = 1. We shall also propose a plausible form for the quantum Hamiltonian of the system “field + particles”, starting from the classical energy of that system established in the previous chapter. We will see that the equations of evolution3 for these various quantities in the Heisenberg picture (Complement GIII ) are the transposition of the Maxwell-Lorentz equations to operators describing fields and particles, properly symmetrized. This will yield an a posteriori justification for the simple quantization procedure we used. Several important properties of the free field (in the absence of sources) are described in § B. The state space of this field has the structure of a tensor product of Fock spaces, analogous to those studied in Chapter XV; the elementary excitations of the field are called photons. A few important states of the field will be described: the photon vacuum, where no photons are present (but where there exists, nonetheless, a fluctuating field throughout the entire space, with a zero average value), the one-photon states, and the quasi-classical states, which reproduce the properties of a given classical field. 
Finally, § C studies the interaction Hamiltonian between an electromagnetic field and particles, in particular when those are neutral atoms (such as the Hydrogen atom 2 For the sake of clarity, we use in the entire chapter and its complements the symbol “hat” to distinguish an operator from its corresponding classical quantity . 3 More concisely, we shall call them Heinsenberg equations.

1998

A. QUANTIZATION OF THE RADIATION IN THE COULOMB GAUGE

where the positive and negative charges of the atom’s constituents balance each other). It is then possible to distinguish between two types of atomic variables: the center of mass variables (external variables) and the “relative motion” variables in the center of mass frame (internal variables). We shall also study the electric dipole approximation, valid when the radiation wavelength is large compared to the atomic sizes, as well as the selection rules associated with the interaction Hamiltonian. A.

Quantization of the radiation in the Coulomb gauge

A-1.

Quantization rules

In the previous chapter, we established in relation (B-26) the following expression for the energy of the classical transverse field: 2 trans

=

d3

0

ε

2(

4

)

[

ε (k

)

ε (k

)+

ε (k

)

ε (k

)]

(A-1)

where ε (k ) and ε (k ) are the normal variables describing the transverse field, = , and ( ) a real normalization constant that appeared in the equations defining the normal variables in terms of the transverse potential vector and its time derivative: α(k ) = α (k ) =

˜ (k ) + ( ) A

˜˙ (k ) A

˜ (k ) ( ) A

˜˙ (k ) A

(A-2)

The analogy between the free transverse field and an ensemble of classical harmonic oscillators of frequency associated with the modes k ε is clearly seen in expression (A-1). To quantize the field, this analogy suggests replacing the normal variables ε (k ) and ε (k ) by annihilation and creations operators. We shall use in this § A the Schrödinger picture where these operators are time-independent and where the time dependence only appears in the evolution of the state vector. The quantization procedure will consist in replacing the ε (k = 0) by time-independent annihilation operators ˆε (k), and of course the ε (k = 0) by the adjoint creation operators ˆε (k). Once this operation is performed on (A-1), we obtain a quantum Hamiltonian identical to a sum of standard harmonic oscillator Hamiltonians, provided the factor 2 4 2 ( ) multiplying the bracket on the right-hand side of (A-1) is equal to ~ 2 0 . We therefore choose for ( ) the value: ( )=

0

2~

0

=

(A-3)

2~

This relation is the same as relation (69) of Complement AXVIII , obtained from the commutation relations. We now replace in (A-1) the classical normal variables ε (k ) and ε (k ) by the operators ˆε (k) and ˆε (k) obeying the commutation relations: ˆε (k) ˆε (k ) =

εε

(k

k)

[ˆε (k) ˆε (k )] = ˆε (k) ˆε (k ) = 0

(A-4a) (A-4b) 1999

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

This yields the Hamiltonian operator (as this operator will be frequently used, we simplify the notation and replace ˆ trans by ˆ ): ˆ

ˆ trans =

d3 ε

~ 2

ˆε (k)ˆε (k) + ˆε (k)ˆε (k)

(A-5)

which has the expected form for the quantum Hamiltonian of the transverse field. Extending this procedure, we now replace the classical normal variables by annihilation and creation operators in all the classical expressions established in the previous chapter for the various physical quantities. The transverse momentum – see equation (B-27) of Chapter XVIII – becomes: Pˆtrans =

~k ˆε (k)ˆε (k) + ˆε (k)ˆε (k) 2

d3 ε

(A-6)

As for the transverse fields, written in (B-29), (B-30) and (B-28) of Chapter XVIII, they become linear combinations of creation and annihilation operators: ˆ (r) = E ˆ B(r) = ˆ (r) = A

d3 (2 )3 d3 (2 )3 d3 (2 )3

2 ε

2 ε

2 ε

1 2

~ 2ε0

ˆε (k) ε

~ 2ε0

kr

ˆε (k) ε

kr

(A-7)

1 2

ˆε (k) κ

kr

ε

ˆε (k) κ

ε

kr

(A-8)

1 2

~ 2ε0

ˆε (k) ε

kr

+ ˆε (k) ε

kr

(A-9)

Comment: As in Chapter XVIII, these relations are written in the general case where the polarizations may be complex (elliptical or circular). Complex conjugate ε of the polarization vectors are therefore associated with the creation operators. It is of course necessary to check that the quantification procedure is independent of the arbitrary choice of the polarization basis. If a quantization is performed with a given basis of polarizations, by substitution one can calculate the operators multiplying ε and ε in the new basis, and check that the commutation relations of these operators are indeed those of standard creation and annihilation operators. This ensures the polarization basis independence.

Finally, relation (A-48) of Chapter XVIII for the total energy of the system “particles + fields” becomes: ˆ =

1 2



ˆ (ˆ A r )

2

+ ˆCoul +

d3 ε

~ 2

ˆε (k)ˆε (k) + ˆε (k)ˆε (k) (A-10)

which is a plausible form for the quantum Hamiltonian of the system “particles + fields”. The position rˆ and momentum pˆ operators defined using equation (A-47) of Chapter XVIII obey the usual commutation relations: [(ˆ r ) (pˆ ) ] = ~ [(ˆ r ) (ˆ r ) ] = [(pˆ ) (pˆ ) ] = 0 2000

(A-11a) (A-11b)

A. QUANTIZATION OF THE RADIATION IN THE COULOMB GAUGE

The quantization rules we just heuristically introduced have the advantage of simplicity. We are going to show in addition that the Heisenberg equations for the various operators describing the particles and the fields, deduced from the Hamiltonian (A-10) as well as from the commutation relations (A-4), (A-11a) and (A-11b), are indeed the Maxwell-Lorentz equations for operators. This result justifies a posteriori the quantization procedure exposed in this chapter. A-2.

Radiation contained in a box

If the real space is infinite, k is a continuous variable, and there exists a continuous infinity of modes. However, as we mentioned in § B-3 of Chapter XVIII, it is often more convenient to consider the field to be contained in a cube of edge length with periodic boundary conditions; the variable k is now discrete: =2

(A-12)

where are positive, negative or zero integers. All the physical predictions must be independent of when it is large enough. In such an approach, we replace the Fourier integrals by Fourier series and the integrals over k by discrete summations. For a classical field, the continuous variables ε (k ) then become discrete variables k ε ( ). If the field is zero outside the box, relation (B-35) of chapter XVIII indicates the multiplicative factor that must be used to go from one type of variable to the other. The system is then quantized as we just explained. In the Schrödinger picture, each classical coefficient k ε ( = 0) in a Fourier series becomes an annihilation operator ˆk ε ; each coefficient k ε ( = 0) becomes a creation operator ˆk ε . This latter operator creates a quantum in a field mode confined inside the box (instead of spreading over the entire space). The commutation relations (A-4) are then written: ˆk ε ˆk ε [ˆk ε ˆk

ε

=

εε

(A-13a)

kk

] = ˆk ε ˆk

ε

=0

(A-13b)

Relation (B-36) of Chapter XVIII indicates that once the discrete variables have been inserted in the expressions for the fields, the following rule must be applied to go from a continuous to a discrete summation: d3

2

=

3 2

(A-14) k

Expressions (A-7) to (A-9) must be modified. As an example, relation (A-7) becomes: ˆ (r) = E kε

~ 2ε0

1 2 3

ˆk ε ε

kr

ˆk ε ε

kr

(A-15)

This means that in addition to replacing the integral by a discrete summation, and multiplying by a factor (2 )3 2 , one must divide the field expansion by the square root of the volume 3 . Both relations (A-8) and (A-9) undergo the same changes. 2001

CHAPTER XIX

A-3.

QUANTIZATION OF ELECTROMAGNETIC RADIATION

Heisenberg equations

A-3-a.

Heisenberg equations for massive particles

We start with the equation for the evolution of rˆ ( ): 1 rˆ˙ ( ) = rˆ ( ) ˆ ~

(A-16)

The only term in Hamiltonian (A-10) that does not commute with rˆ is the first one. Using the commutation relation deduced from (A-11a) and (A-11b) : [(ˆ r )

((pˆ ) )] =

}

(A-17)

(pˆ )

we get: 1 1 ˆ (ˆ rˆ˙ ( ) = rˆ ( ) pˆ ( ) A r ~ 2 1 ˆ (ˆ = pˆ ( ) A r )

2

) (A-18)

This equality is simply the operator form: rˆ˙ ( ) +

pˆ ( ) =

ˆ (ˆ A r

)

(A-19)

of the classical equation relating the generalized (or canonical) momentum p and the mechanical momentum r˙ . We then define the velocity operator vˆ of particle by: vˆ ( ) =

1

ˆ (ˆ A r

pˆ ( )

)

(A-20)

Consider now the Heisenberg equation for the evolution of this operator. It yields the equation of motion of that particle: ˆ˙ ( ) = v

ˆ ()= r¨

vˆ ( ) ˆ

(A-21)

~ We shall compute below the commutator vˆ ( ) ˆ ; it leads to the quantum equation of motion for particle : ˆ = r¨

ˆ r )+ E(ˆ

2



ˆ r ) B(ˆ

ˆ r ) B(ˆ



(A-22)

which is simply the quantum Lorentz equation describing the motion of particles inˆ and the total electric field E ˆ = E ˆ +E ˆ . The teracting with the magnetic field B ˆ r ) B(ˆ ˆ r ) vˆ special form of the magnetic force vˆ B(ˆ 2 comes, as shown in the computation below, from using the Heisenberg equations, and from the fact that the ˆ r ) is not Hermitian. To make that operator Hermitian, we must add operator vˆ B(ˆ ˆ r )) , which is simply B(ˆ ˆ r ) vˆ , and divide the result by 2. its adjoint vˆ B(ˆ

2002

A. QUANTIZATION OF THE RADIATION IN THE COULOMB GAUGE

Demonstration of equation (A-22) To compute the commutator of vˆ calculate the following commutators:

2

~ with the first term of ˆ , it is useful to first

ˆ (ˆ (pˆ ) (A r ))

[(ˆ v ) (ˆ v )]=

ˆ (ˆ (A r ))

= ~

ˆ (ˆ (A r )) (pˆ ) ˆ (ˆ (A r ))

ˆ r )) (B(ˆ

= ~

(A-23)

where is the completely antisymmetric tensor that allows writing the cross product components of two vectors a and b in the form (a b) = . We then get: 2

(ˆ v )2 2 =

(ˆ v ) ~

=

(ˆ v ) [(ˆ v ) (ˆ v ) ] + (ˆ v ) (ˆ v )

2~

(ˆ v )

ˆ r )) + (B(ˆ ˆ r )) (ˆ (ˆ v ) (B(ˆ v )

2

(A-24) The last line in (A-24) can be rewritten in the form: ˆ r ) B(ˆ



2

ˆ r ) B(ˆ



(A-25)

and is thus the component along the axis of the symmmetrized magnetic force. The commutator of vˆ ~ with the second term of ˆ is written: [(ˆ v ) ~

Coul ]

=

1 [(pˆ ) ~

Coul ]

=

(ˆ r )

Coul

=

ˆ (ˆ (E r ))

(A-26)

It describes the interaction between particle and the longitudinal electric field. We finally have to compute the commutator of vˆ ~ with the last term of ˆ . Using ˆ and E ˆ , we get: the commutation relations (A-4) and expressions (A-9) and (A-7) for A (ˆ v )

d3

~

~

ˆε (k )ˆε (k ) + 1 2

ε

=

ˆ (ˆ (A r )) ˆε (k )ˆε (k )

d3 ε

=

ˆ (ˆ (E r ))

(A-27)

This term describes the interaction of particle with the transverse electric field. Finally grouping (A-25), (A-26) and (A-27) leads to (A-22). A-3-b.

Heisenberg equations for fields

As all the fields are linear combinations of the operators ˆε (k ) and ˆε (k ), we simply have to consider the Heisenberg equation for ˆε (k ): 1 ˆ˙ ε (k ) = ˆε (k ) ˆ ~

(A-28) 2003

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

We assume that the polarizations ε are real (linear polarizations). The commutator with the first term of ˆ yields, with the use of(A-4a) and (A-20): vˆ2 2

1 ˆε (k ) ~

=

2~

=

2~

A



2

ε (k)

0

A

+

} ε (2 )3

ε (k)



k rˆ



+

k rˆ



(A-29)

ˆ where A ε (k ) denotes the coefficient of ˆε (k) in the integral (A-9) of A (r), which is nothing but the coefficient of ε (k = 0) in the classical expression of A (r) . We introduce the current operator (symmetrized to make it Hermitian): 1 ˆ j(r) = 2

rˆ ) + (r

[ˆ v (r

rˆ )ˆ v ]

(A-30)

The right hand side term of equation (A-29) can then be rewritten in the form: 2 2 0 ~ (2

)3

k rˆ



ε

+

k rˆ



= =

d3

2 0 ~ (2 )3 2 0~

ˆ˜ ε j(k)

kr

ˆ ε j(r) (A-31)

The commutator with the second term of ˆ is zero, whereas the commutator with the third term yields, using (A-4): 1 ˆε (k ) ~

d3

~

ˆε (k

)ˆε (k

)+1 2

=

ˆε (k )

(A-32)

ε

Finally, regrouping (A-31) and (A-32) yields: ˆ˙ ε (k ) +

ˆε (k ) =

2 0~

ˆ ˜ ε j(k )

(A-33)

This equation is, for the operator ˆ (k ), an equation of motion of the same form as the equation of motion of the classical normal variables α(k ), which is given by equation (B-19) of Chapter XVIII. As this latter equation is equivalent to Maxwell’s equations for the transverse fields, we may conclude that the Heisenberg equations for the quantum transverse fields are simply the usual Maxwell’s equations applied to the field operators. B.

Photons, elementary excitations of the free quantum field

We now study a certain number of properties of the electromagnetic field we just quantized, starting with the simplest case: the field in the absence of charged particles. B-1.

Fock space of the free quantum field

The state space of the total system “field + particles” is the tensor product of the particle state space and the radiation field state space . This latter space is itself 2004

B. PHOTONS, ELEMENTARY EXCITATIONS OF THE FREE QUANTUM FIELD

the tensor product of the state spaces of the harmonic oscillators associated with the different modes k ε : =

k1

1

k2

(B-1)

k

2

where k is the state space of the harmonic oscillator associated with the mode k ε , with frequency . As in § A-2, we assume the radiation to be contained in a box of edge length . The operators ˆ (k) depending on the variables k are then transformed into operators ˆk ε depending only on discrete variables. We can even use a more compact notation ˆ , where the index labels4 the whole set of indices k ε ; the operators ˆε (k) are now simply written ˆ . In this section, it is convenient to use the Heisenberg picture; the time dependence of the ˆ and ˆ is then particularly simple, since we have: ˆ ( ) = exp( ˆ

ˆ

~) ˆ exp(

~) = ˆ

(B-2)

as well as the Hermitian conjugate relation. Once the discrete variables have been inserted in the continuous expressions of the fields, we must use rule (A-14) to transform the continuous integrals into discrete summations. The expansions of these fields in term of normal variables are then: ˆ (r ) = E

~ 2 0

ˆ B(r )=

~

ˆ (r ) = A

2

0

1 2

ˆε

3

0

)

ˆκ

ε

ˆε

(k r

(k r

)

)

(B-3)

ˆ κ

(k r

ε

3

)

+ˆ ε

~ 2

ˆ ˆ +ˆ ˆ

=

~

Pˆtrans =

~k 2

ˆ ˆ +ˆ ˆ

=

~k ˆ ˆ

ˆ ˆ +

(k r

)

(B-4)

(B-5)

1 2

Note that the last term in (B-7) does not contain the factor 1 2, since B-2.

)

1 2

=

ˆ

(k r

ˆ ε

1 2 3

~ 2

(k r

(B-6) (B-7) k = 0.

Corpuscular interpretation of states with fixed total energy and momentum

Consider first the mode . The eigenvalues of the operator ˆ ˆ appearing in expressions (B-6) and (B-7) for ˆ and Pˆtrans are all the positive or zero integers : ˆ ˆ

=

=0 1 2

(B-8)

4 For each k , there exists two polarization vectors ε 1 and ε 2 perpendicular to k and perpendicular to each other. The compact notation must be interpreted as a summation over k and, for each value of k , as a sum over ε 1 and ε 2 .

2005

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

Remember the well-known actions of operators ˆ and ˆ on the states ˆ

=

ˆ

=

:

+1 1

ˆ 0 =0

(B-9)

As ˆ ˆ commutes with ˆ ˆ , the eigenstates of ˆ and Pˆtrans are the tensor products of the eigenstates 1 = 1 of the creation and annihilation operators ˆ1 ˆ1 ,....ˆ ˆ ...: ˆ Pˆtrans

1

=

1

=

+ ~k

1 2

~

(B-10a)

1

(B-10b)

1

The field’s ground state corresponds to all the

equal to zero, and will be noted 0 :

0 = 01 0

(B-11)

the states 1 being obtained by the action of a certain number of creation operators on this 0 state:

1

=

(ˆ1 )

1

1!

(ˆ ) !

0

(B-12)

With respect to the field ground state, the state 1 has an energy ~ and a momentum ~k . It can be interpreted as describing an ensemble of 1 particles of energy ~ 1 and momentum ~k1 ,....., particles of energy ~ and momentum ~k . These particles characterize the elementary excitations of the quantum field and are called photons. The quantum number is therefore the number of photons occupying the mode , so that the ground state 0 , corresponding to all the equal to zero, can be called the photon vacuum. Whereas there exists for photons eigenstates of momentum and energy, there are no quantum states of the electromagnetic field where the position can be perfectly known; no position operator is associated with this field. This is a different situation from what we encounter with massive particles, which have both a position and a momentum operator; the wave functions in the two representations are related by a simple Fourier transform. This non-existence of a position operator is linked to the impossibility of building, by a linear superposition of transverse electromagnetic waves, a vector wave perfectly localized at a point in space. The relativistic and transverse character of the electromagnetic field yields commutation relations between its components that involve the transverse delta function (Complement AXVIII , § 2-e) instead of the usual delta function. B-3.

Several examples of quantum radiation states

We now study several examples of states of quantum radiation. 2006

B. PHOTONS, ELEMENTARY EXCITATIONS OF THE FREE QUANTUM FIELD

B-3-a.

Photon vacuum

The presence of the 1 2 term in the parenthesis on the right-hand side of equation (B-10a) shows that the vacuum state energy is not zero, but equal to ~ 2; this sum is an infinite quantity. We encounter here a first example of the difficulties linked to the divergences appearing in quantum electrodynamics. They can be resolved by renormalization techniques, whose presentation is outside the scope of this book. We shall avoid this difficulty by only considering energy differences with respect to the vacuum. If we consider a single mode of the field, the energy ~ 2 of the vacuum state for this mode is finite, and reminiscent of the zero-point energy of a harmonic oscillator of frequency . As you may recall, this zero-point energy is due to the impossibility of having simultaneously zero values for the position and momentum of that oscillator, because of the Heisenberg relations. The lowest energy state of the oscillator results from a compromise between the kinetic energy, proportional to 2 , and the potential energy, proportional to 2 (this problem is discussed in § D-2 of Chapter V). The same arguments can be presented for the contribution, at a given point r, of mode to the ˆ (r ) and magnetic B(r ˆ electric E ) fields; according to (B-3) and (B-4), those fields are represented by two different linear superpositions of operators ˆ and ˆ , which thus do not commute. Consequently, one cannot have simultaneously a zero value for the ˆ 2 , and for the magnetic energy proportional to B ˆ 2. electric energy proportional to E One can further calculate the average value and variance of the contribution of ˆ (r ) at point r. Since ˆ and ˆ change by 1, a simple mode to the electric field E calculation yields: ˆ (r ) 0 0E ˆ 2 (r ) 0 0E

mode i mode i

=0 =

(B-13a) ~

2

0

(B-13b)

3

Similar calculations can be done for the magnetic field. They show that in the photon vacuum state, the average value of both the electric and magnetic fields is zero, but not their variance. Since result (B-13b) is proportional to ~, the non-zero variance of the fields in the vacuum is a quantum effect. Comments (i) The summation over all the modes of expressions (B-13) yields, once we have replaced the discrete sum by an integral: ˆ (r ) 0 = 0 0E ˆ 2 (r ) 0 = 0E

(B-14a) ~ 2

0

3

=

~ 2

0

3 2

d

(B-14b)

0

This means that the variance of the electric field diverges as the fourth power of the upper boundary of the integral over appearing in the summation of the modes of frequency = . This divergence is the same as that mentioned above. (ii) To characterize the dynamics of these field fluctuations, it is possible to compute the field correlation functions in vacuum5 . This calculation shows that the electric and magnetic fields fluctuate very rapidly around their zero average value. These fluctuations are called the vacuum fluctuations. Certain radiative corrections, such as the “Lamb shift” 5 See

for example § III-C-3-c and Complement CIII of reference [16].

2007

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

in atoms, can be interpreted from a physical point of view, as resulting from the vibration of the atom’s electron caused by its interaction with this fluctuating electric field. This vibration leads the electron to explore the nucleus Coulomb potential over the range of its vibrational motion. The corresponding correction to its binding energy depends on the energy level it occupies; this explains why the degeneracy between the 2 1 2 and 2 1 2 states of the hydrogen atom, predicted by the Schrödinger and Dirac equations, can be lifted by the interaction with the vacuum fluctuations 6 . B-3-b.

Field quasi-classical states

The state and observables of a classical field are characterized by the normal variables introduced in § B-2-b of Chapter XVIII. The coherent states of a onedimensional harmonic oscillator studied in Complement GV , can be used to build the field quantum states whose properties are closest to those of the classical field . The coherent state, supposed to be normalized, of a one-dimensional harmonic oscillator is the eigenstate of the annihilation operator ˆ, with eigenvalue : ˆ

=

(B-15)

The eigenvalue may be a complex number since operator ˆ is not Hermitian. Equation (B-15) leads to: ˆ

=

ˆ

=

(B-16)

More generally, the average value of any function of ˆ and ˆ , once put in the normal order, i.e. where all the annihilation operators are positioned to the right of the creation operators (Complement BXVI , § 1-a- ), is equal to the expression obtained by replacing operator ˆ by and operator ˆ by . As an example: ˆ ˆ

=

(B-17)

Consider then the field quantum state: 1

2

=

1

2

(B-18)

where each mode is in the coherent state corresponding to the classical normal variable . Using equations (B-16) and (B-17), we can obtain the average values of the various field operators (B-3), (B-4) and (B-5) in the state (B-18); they coincide with the values of these various physical quantities for a classical field described by the normal variables . The same is true for the observables (B-6) and (B-7) corresponding to the energy and momentum of the transverse field. This is why the quantum state (B-18), which yields average values identical to all the properties of a classical field, is called a quasi-classical state 7 . We shall see later that the correlation functions of the quantum and classical fields involved in various photodetection signals also coincide when the field state is a quasi-classical state.

6 See 7 For

[16].

2008

for example [17]. more details on the properties of the radiation quasi-classical states, see § III-C-4 of reference

C. DESCRIPTION OF THE INTERACTIONS

B-3-c.

Single photon state

Consider the state vector: Ψ =

1

0

(B-19)

=

which is a linear superposition of kets where a mode contains one photon, whereas all the other modes = are empty. Such a ket is an eigenket of the operator total number of photons ˆ = ˆ ˆ with an eigenvalue equal to 1. It is therefore a single photon state. However, except in special cases, it is not a stationary state since it is not an eigenstate of the field energy ˆ . It describes a single photon propagating in space with velocity . We shall see later (Complement DXX ) that, when the field is in the state (B-19), a photodetector placed in a small region of space yields a signal corresponding to the passage, in that region, of a wave packet. C.

Description of the interactions

C-1.

Interaction Hamiltonian

The Hamiltonian ˆ of the system “particles + field” has been given above. In its expression (A-10), we now separate the terms that depend only on the particle variables or only on the field variables, and those that depend on both. We can then write ˆ = ˆ + ˆ + ˆ , where the particle Hamiltonian is: ˆ

pˆ2 + ˆCoul 2

=

(C-1)

whereas the radiation one is: ˆ

=

ˆ ˆ +

~

1 2

(C-2)

Finally, the interaction Hamiltonian is the sum: ˆ = ˆ

1

+ ˆ

(C-3)

2

with: ˆ

ˆ

1

=

2

=

2

2

2



ˆ (ˆ ˆ (ˆ A r )+A r ) pˆ

ˆ (ˆ A r )

(C-4)

2

(C-5)

(we have separated the linear and quadratic terms with respect to the fields). To that interaction Hamiltonian, we must further add the term: ˆ

1

=

ˆ M

ˆ r ) B(ˆ

(C-6) 2009

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

describing the interaction of the spin magnetic moments of the various particles with the magnetic field of the radiation (Complement AXIII , § 1-d): ˆ = M where



2

(C-7)

is the “Landé g-factor” of particle

whose spin is noted Sˆ .

Comment Even with this additional term, all the possible interactions are not contained in that Hamiltonian: missing for example are the electron spin-orbit coupling, the hyperfine interaction between the electron and the nucleus, etc. – see comment (iii) of § C-5. The Hamiltonian we wrote is however sufficient in a great number of cases. C-2.

Interaction with an atom. External and internal variables

Consider the case where the particle system is a single atom, assumed to be neutral, formed by an electron and a nucleus which have opposite charges ( = = ) and whose masses are noted and . This is the case for example of the hydrogen atom. ˆ It is standard practice (see for example § B of Chapter VII) to separate the variables R and Pˆ of the system’s center of mass and the variables rˆ and pˆ of the relative motion. These two types of variables commute with each other and are given by equations: r +

R= r=r

r

r

where we have noted =

P =p +p p p p = the total mass of the system, and

+

;

=

(C-8) its reduced mass: (C-9)

Expressed as a function of these new variables, the particle Hamiltonian is written: ˆ

=

Pˆ 2 pˆ2 + + ˆCoul (ˆ) 2 2

(C-10)

The center of mass variables, also called external variables, describe the global motion of the atom, whereas the variables of the relative motion, also called internal variables, describe the motion in the center of mass reference frame. C-3.

Long wavelength approximation

The interaction Hamiltonians (C-4), (C-5) and (C-6) contain fields evaluated at the electron r and nucleus r positions. These positions can be described with respect to the position of the center of mass and we can write for example: ˆ (ˆ ˆ (R ˆ + rˆ A r )=A

ˆ R)

(C-11)

In an atom, the distance between the position of the electron or the nucleus and the atom’s center of mass is of the order of the atom’s size, i.e. just a fraction of a nanometer. Now 2010

C. DESCRIPTION OF THE INTERACTIONS

the radiation wavelengths that can have a resonant interaction with the atom are of the order of a fraction of a micron, much larger than the atomic dimension. One can thus neglect the variation of the fields over distances of the order of r R (or r R) and write: ˆ (ˆ A r ) ˆ (ˆ A r )

ˆ (R) ˆ A ˆ (R) ˆ A

(C-12)

Such an approximation is called the long wavelength approximation (or dipole approximation). Using this approximation in the interaction Hamiltonian ˆ 1 , yields: ˆ

1



=

ˆ (ˆ A r )





ˆ (ˆ A r )

pˆ ˆ (R) ˆ A

ˆ (R) ˆ pˆ A

=

(C-13)

We used the relation = = as well as definition (C-8) for the relative momentum. As for Hamiltonian ˆ 2 , it becomes with this approximation: ˆ

2 2

=

2 2

2

ˆ2 (ˆ A r )+

2

2

ˆ2 (ˆ A r )

ˆ2 (R) ˆ A

(C-14)

Comment When we include the Hamiltonian describing the spin magnetic coupling ˆ 1 written in ˆ This is however insufficient: we must add other (C-6), we also replace all the rˆ by R. ˆ in ˆ 1 and terms of the same order, obtained by including first order terms in k (ˆ r R) ˆ 2 , and representing corrections to the long wavelength approximation. This is because a computation analogous to the one in § 1-d of Complement AXIII shows that these corrections yield new interaction terms of the same order as ˆ 1 : interaction between the atomic orbital momentum L and the radiation magnetic field; electric quadrupole interaction. C-4.

Electric dipole Hamiltonian

Using the long wavelength approximation, the global Hamiltonian for the system atom + field is written: ˆ2 ˆ = P + 1 2 2



ˆ (R) ˆ A

2

+ ˆCoul +

~ 2

ˆ ˆ +ˆ ˆ

(C-15)

We are going to perform a unitary transformation on this Hamiltonian, leading to a new ˆ E ˆ (R), ˆ where D ˆ interaction Hamiltonian, composed of a single term of the form D is the electric dipole moment of the atom: ˆ = D



(C-16) 2011

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

ˆ (R), ˆ the quantum field given by expression (B-3). This new interaction Hamiland E tonian is called the electric dipole Hamiltonian. To find this unitary transformation, it is useful to start with the simpler case where the radiation field is treated classically. C-4-a.

Electric dipole Hamiltonian for a classical field

When the radiation field is treated classically, as an external field whose dynamic is externally imposed and hence has a fixed time dependence, the last term of relation ˆ (R), ˆ which appears in the second term, must be (C-15) does not exist; operator A ˆ ). The system Hamiltonian is then written: replaced by the external field A (R ˆ2 ˆ = P + 1 2 2



A

ˆ ) (R

2

+ ˆCoul

(C-17)

We are looking for a unitary transformation that performs a translation of pˆ by ˆ ), so that the second term in (C-17) is reduced to pˆ2 2 . Such a a quantity A (R transformation reads: ˆ( ) = exp

rˆ A

ˆ ) (R

(C-18)

~ We can check this since, using [pˆ (ˆ r )] = ~ rˆ and the fact that the internal ˆ we have: variable rˆ commutes with the external variable R, ˆ( ) pˆ ˆ ( ) = pˆ + A

ˆ ) (R

(C-19)

ˆ the other terms of (C-17) are unchanged by the transforAs they do not depend on p, mation. On the other hand, since this transformation has an explicit time dependence ˆ ), the new Hamiltonian via the term A (R that governs the evolution of the new state vector: Ψ ( ) = ˆ( ) Ψ( )

(C-20)

is given by: ˆ ˆ ( ) = ˆ( ) ˆ ( ) ˆ ( ) + ~ d ( ) d

ˆ ()

(C-21)

As we have in addition: ~

d ˆ( ) d

ˆ ( ) = rˆ

A

ˆ ) (R

=

ˆ E D

ˆ ) (R

(C-22)

ˆ = rˆ is the electric dipole moment of the atom, we finally obtain: where D 2 ˆ2 ˆ ( ) = P + pˆ + 2 2

Coul

ˆ E D

ˆ ) (R

where the last term has the expected form for an electric dipole Hamiltonian. 2012

(C-23)

C. DESCRIPTION OF THE INTERACTIONS

C-4-b.

Electric dipole Hamiltonian for a quantum field

The results we just obtained suggest using the unitary transformation: ˆ (R) ˆ rˆ A

ˆ = exp

(C-24)

~ ˆ (R) ˆ that appears in the exponential. One can check where it is now the operator A ˆ so that the second term in (C-15) that this operator is still a translation operator for p, is now simply of the form pˆ2 2 . As the transformation (C-24) no longer has an explicit time dependence, the term analogous to (C-22) does not exist anymore. On the other hand, we must study the transformation of the last term of (C-15), which represents the energy ˆ of the transverse quantum field. We therefore rewrite expression (C-24) using the expansion (B-5) ˆ (R) ˆ as a function of ˆ and ˆ : of A ˆ = exp

ˆ

ˆ

(C-25)

with: =

3

2 0~

ε

ˆ D

ˆ k R

(C-26)

In this form, operator ˆ does appear as a translation operator (Complement GV , § 2-d); it obeys the equations: ˆˆ ˆ = ˆ +

ˆˆ ˆ = ˆ +

(C-27)

To prove relations (C-27), one can use (Complement BII BII , § 5-d) the identity: ( + )

[

=

] 2

(C-28)

valid if and commute with their commutator [ ], as well as the commutation relation ˆ (ˆ ) = ˆ . The transformation of the last term in (C-15) then yields: ˆˆ ˆ =

~ 2

(ˆ +

)(ˆ +

) + (ˆ +

)(ˆ +

)

(C-29)

The terms on the right-hand side of (C-29) that are independent of ˆ . The terms linear in and yield: ~

ˆ +

ˆ

=

~ 2 0

=

ˆ (R) ˆ D ˆ E

3

ˆε

ˆ k R

ˆ ε

ˆ k R

and

yield again

ˆ D (C-30)

where we have used (B-3). We thus get the expected electric dipole form for the interaction Hamiltonian: ˆ =

ˆ (R) ˆ D ˆ E

(C-31) 2013

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

Finally, the terms quadratic in ˆ dip =

1

=

~

and

2

0

3

introduce a term we shall note ˆ dip : ˆ D)(ε



ˆ D)

(C-32)

It represents a dipolar energy intrinsic to the atom. To sum up, regrouping all the previous terms, we get for the transformed Hamiltonian: 2 ˆ2 ˆ = P + pˆ + 2 2

Coul

ˆ E ˆ (R) ˆ + ˆ dip D

+ ˆ

(C-33)

This is a form similar to (C-23), with an additional term ˆ dip . Comments (i) The same mathematical operator does not describe the same physical quantity in two different representations, deduced from one another by a unitary transformation. As an ˆ (R) ˆ appearing in (C-31), does not represent the transverse example, the operator E ˆ (R) ˆ transformed by ˆ, written electric field in the new point of view, which should be E ˆ (R) ˆ ˆ , and hence different from E ˆ (R). ˆ Actually, one can show that the operator as ˆE ˆ R) ˆ R) ˆ (R) ˆ represents in the new point of view the physical quantity D( ˆ 0 where D( ˆ E is called the electric displacement field (see Complement AIV of [16]). (ii) The intrinsic dipolar energy ˆ dip is given by an integral over , which diverges at infinity. This integral must however be limited to values of for which the long wavelength approximation is still valid. C-5.

Matrix elements of the interaction Hamiltonian; selection rules

int Consider an initial state where the atom is described by in for its internal state, for its external state, and where the radiation is in the state in . The interaction Hamiltonian (C-31) couples this initial state to a final state where the atomic internal and int external variables, as well as the radiation variables are respectively in the states fin , ext ˆ ˆ , and . As the operator E ( R) appearing in (C-31) is a linear superposition of fin fin annihilation ˆ and creation ˆ operators, the matrix element of ˆ describes two types of processes: the absorption processes associated with operator ˆ where one photon disappears, and the emission processes associated with operator ˆ where a new photon appears. This matrix element can be factored into a product of three matrix elements concerning the three types of variables; they are written, for the absorption processes: ext in

~ 2

int fin

3

0

ˆ D

ε

int in

ext fin

exp( k

ˆ) R

ext in

fin

ˆ

(C-34)

in

and for the emission processes: ~ 2

0

3

int fin

ε

ˆ D

int in

ext fin

exp(

k

ˆ) R

ext in

fin

ˆ

in

(C-35)

The central term in these expressions is a matrix element concerning the external atomic variables; it expresses the conservation of the global momentum as we now 2014

C. DESCRIPTION OF THE INTERACTIONS

ˆ translates the momentum by a quantity ~k . If the show. Operator exp( k R) atom’s center of mass has an initial momentum ~Kin , once it absorbs a photon, its final momentum will be ~Kfin = ~Kin + ~k ; the momentum ~k of the absorbed photon is therefore transferred to the atom during the absorption process. In a similar way, one can show that the atom’s momentum decreases by the quantity ~k when a photon is emitted. In the first matrix element of (C-34), which concerns the internal atomic variables, ˆ is an odd operator. The matrix element will be different from zero only if operator D the initial and final internal atomic states have opposite parity, as for instance the 1 and 2 states of the hydrogen atom. We rediscover here a second conservation law, the ˆ is a vector operator, it leads to conservation of parity. In addition, as the operator D selection rules on the internal angular momentum which will be studied in Complement CXIX . Comments (i) The conservation of the total momentum comes from the central matrix elements in expressions (C-34) and (C-35). One may wonder whether this result is only valid for the approximate form (C-31) of the interaction Hamiltonian used to establish those equations. Actually it can be shown, using the commutation relations [p (r )] = ~ r and [ˆ ˆ ˆ ] = ˆ , that the interaction Hamiltonian ˆ 1 written is (C-4) (without the long wavelength approximation) commutes with the system total momentum pˆ + ~k ˆ ˆ . The same result is true for all the terms of the interaction Hamiltonian. Consequently, the exact (without approximation) interaction Hamiltonian has non-zero matrix elements only between states having the same total momentum. The fact that the total momentum commutes with all the terms in the Hamiltonian is related to the system invariance with respect to spatial translation. The properties of the system are unchanged upon the translation by the same quantity of the particles and the fields. Similar considerations apply to the rotational invariance and cause the interaction Hamiltonian to only connect states with the same total angular momentum. These results are important for understanding in a simple fashion the exchanges of linear and angular momenta between atoms and photons, which will be discussed in Complements AXIX and CXIX . (ii) Conservation of total momentum during the absorption process, combined with total energy conservation, shows that the energy of the absorbed photon is different from the energy separating the two internal levels involved in the transition. Two effects account for this difference: the Doppler effect, and the recoil effect (Complement AXIX ); they play an important role in laser cooling methods. (iii) If we continue the calculations beyond the long wavelength approximation, we find additional terms for the interaction Hamiltonian, describing the interaction between the radiation magnetic field and the atomic orbital or spin magnetic moments (Complement AXIII , § 1). Some of these terms have already been written in (C-6). Transitions, called magnetic dipole transitions, may occur between levels having the same parity, as opposed to the electric dipole transitions studied above. Other types of transitions may also be observed at higher orders, such as the quadrupole transitions.

Note finally that, if the initial radiation state already contains photons, the last two matrix elements of (C-34) and (C-35) are equal to 1ˆ = and +1ˆ = + 1. In the presence of incident photons, the probability of the absorption process is thus proportional to , whereas the emission probability is 2015

CHAPTER XIX

QUANTIZATION OF ELECTROMAGNETIC RADIATION

proportional to + 1. We shall see in Chapter XX that this difference is linked to the existence of two types of emission, the stimulated emission and the spontaneous emission. With the knowledge of the various Hamiltonians ˆ , ˆ and ˆ , as well as their matrix elements, we can now solve Schrödinger’s equation to compute the transition amplitude between an initial state and a final state of the system “atom + field”. This will be done in the next chapter, where we study various processes, such as the absorption or emission of photons for an incident radiation either monochromatic or having a large spectral band, the photoionization phenomenon, multiphoton processes and photon scattering.

2016

COMPLEMENTS OF CHAPTER XIX, READER’S GUIDE The processes of photon absorption and emission by atoms must obey conservation laws for the total linear or angular momentum. This has implications that are highlighted in the three complements of this chapter. When an atom absorbs (or emits) a photon, it gains (or loses) an energy, a momentum and an angular momentum equal to the energy, momentum and angular momentum of the photon. This allows “manipulating” several properties of the atoms. It is, for example, at the base of optical pumping and laser cooling methods.

AXIX : MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Momentum exchange between atoms and photons plays an important role in determining, for example, the Doppler effect and the shape of the spectral lines emitted or absorbed by gases. As an atom continually absorbs and re-emits photons, its momentum can be greatly affected. The atom can be decelerated, which allows slowing down and even bringing to rest an atomic beam over short distances. Other uses of the Doppler effect include the introduction of a friction force that slows down atoms to form ultra-cold gases. When the atoms are confined in a trap, the nature of the Doppler shift may be greatly changed and even completely disappear (Mössbauer effect). Mulitiphoton processes, in which the total momentum of the absorbed photons is zero, are also discussed.

BXIX : ANGULAR MOMENTUM OF RADIATION

This complement is more technical than the previous one. It shows how the photon can be seen as a spin 1 particle; this particle also has an orbital angular momentum. The photon thus possesses two types of angular momentum. These concepts are useful for reading the next complement.

CXIX : ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

This complement studies the exchanges between the photon spin angular momentum (related to the light beam polarization) and the internal degrees of freedom of the atoms. These exchanges obey selection rules for the atomic transitions. Such rules are essential for many experimental methods in atomic physics, such as “optical pumping” where a polarized light beam allows accumulating the atoms of a gas in specific Zeeman sublevels. A number of applications of these methods are briefly reviewed in this complement.

2017



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Complement AXIX Momentum exchange between atoms and photons

1

2

3

4

Recoil of a free atom absorbing or emitting a photon . . . . 2020 1-a Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . 2020 1-b Doppler effect, Doppler width . . . . . . . . . . . . . . . . . . 2022 1-c Recoil energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 2023 1-d Radiation pressure force in a plane wave . . . . . . . . . . . . 2024 Applications of the radiation pressure force: slowing and cooling atoms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2025 2-a Deceleration of an atomic beam . . . . . . . . . . . . . . . . . 2025 2-b Doppler laser cooling of free atoms . . . . . . . . . . . . . . . 2026 2-c Magneto-optical trap . . . . . . . . . . . . . . . . . . . . . . . 2034 Blocking recoil through spatial confinement . . . . . . . . . 2036 3-a Assumptions concerning the external trapping potential . . . 2036 3-b Intensities of the vibrational lines . . . . . . . . . . . . . . . . 2037 3-c Effect of the confinement on the absorption and emission spectra2038 3-d Case of a one-dimensional harmonic potential . . . . . . . . . 2039 3-e Mössbauer effect . . . . . . . . . . . . . . . . . . . . . . . . . 2040 Recoil suppression in certain multi-photon processes . . . . 2040

We studied, in § C of Chapter XIX, the matrix elements of the interaction Hamiltonian between an atom and a field. This led us to establish selection rules based on the conservation of the total momentum of the system “atom + field” during the absorption or emission of photons by the atom. The study in Chapter XX of the absorption and emission amplitudes will show that the global energy of the system is also conserved during these processes. The goal of this complement is to show how these conservation laws1 provide an interesting view on many aspects of momentum exchange between atoms and photons. We start in § 1 with the case of a free atom, whose center of mass is not submitted to any external potential. We shall establish the expressions for the Doppler shift and the recoil energy that appear in the equation yielding the frequency of the absorbed or emitted photons. In a gas containing a large number of atoms with different velocities, the dispersion of these velocities yields a Doppler broadening of the emission and absorption lines. This broadening, as well as the shift related to the recoil energy, introduce perturbations in the lines observed using high resolution spectroscopy, hence limiting its precision. In addition, when the atom constantly absorbs and re-emits lots of photons, its momentum change per unit time can become very large. This results in a force generated 1 The

first study along this line concerned the Compton effect (1922), where the scattering of a photon by an electron is considered as a collision between two particles. Assigning the photon an energy and a momentum }k, with k = 2 , one writes the conservation equations for the total momentum and energy during the collision. This yields the change of frequency of the photon as it is scattered in a given direction, in complete agreement with experimental observations.

2019

COMPLEMENT AXIX



by the radiation pressure. We shall calculate the order of magnitude of that force and show that it can produce an acceleration or deceleration of the atom a hundred thousand times larger than the one due to gravity. In § 2, we will show that this force is able to slow down and immobilize a beam of atoms propagating in the direction opposed to that of the light beam. The velocity dependence of the radiation pressure force, due to the Doppler effect, is also very interesting. It allows, using two light beams in opposite directions, but with the same intensity and frequency, to generate a friction force on the atom, provided the light frequency is lower than the atomic frequency: the radiation pressure force is zero when the atomic velocity is zero, but opposite to when it is different from zero, therefore producing a damping of that velocity. This is the principle of one of the first laser cooling mechanisms observed experimentally. The very low temperatures obtained, millions of times lower than room temperature, explain the increasing number of application of the ultra-cold atoms thus obtained. We also explain in § 2-c the principle of the magneto-optical trap, which involves a position dependence of the radiation pressure force. We describe in §§ 3 and 4 of this complement a number of methods developed to suppress or circumvent the effect related to the recoil. Confining the atom in a trap, as studied in § 3, may prevent the atom’s recoil if the trapping is strong enough. If the transition is multi-photonic, for example if two photons having the same energy but opposite momenta are absorbed in the transition, no recoil is experienced by the atom, and there is no Doppler shift. An important example of this method is the Doppler-free two-photon spectroscopy (§ 4). 1.

Recoil of a free atom absorbing or emitting a photon

Consider first an atom that is not subjected to any external potential (free atom). The Hamiltonian ˆ ext for the external variables is reduced to the kinetic energy term Pˆ 2 2 , where Pˆ is the momentum of the center of mass and the total mass of the atom. The eigenstates of ˆ ext can be chosen as states having well defined momentum P and energy 2 2 . 1-a.

Conservation laws

Let us express the radiation field as a function of plane waves of wave vector k and frequency = . The eigenstates of the radiation Hamiltonian ˆ can be described in terms of photons having an energy ~ and a momentum ~k. The interaction Hamiltonian ˆ studied in § C of Chapter XIX is invariant2 under spatial translation of the total system “atom + field”. Consequently, it commutes with the system’s total momentum, and can induce transitions only between states having the same total momentum. Furthermore, the transition amplitude associated with an interaction lasting a time can only connect states of the total system having the same total energy, within ~ (this point will be further discussed in the next chapter). These two conservation laws can be used to study the influence of the motion of the center of mass on the frequencies of the photons it can absorb or emit.

2 The interaction Hamiltonian involves field values at points where particles are located. It is therefore invariant when both fields and particles are shifted (by the same quantity).

2020



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Consider first an absorption process where the atom goes from an internal state to another internal state by absorbing a photon of energy ~ and momentum ~k. We shall note: ~

=

0

(1)

the energy difference between the two internal states. The initial (final) momentum of the center of mass before (after) the photon absorption is noted Pin (Pfin ). The conservation of the total momentum leads to: Pfin = Pin + ~k

(2)

The total energy in the initial state is the sum of the photon energy, the internal energy 2 of the atom and its translation energy in 2 : in

=~ +

+

2 in

(3)

2

The total energy in the final state reduces to the atom’s energy since the photon has been absorbed: fin

=

+

2 fin

(4)

2

The conservation of the total energy means that relation reads: ~ =~

0

+

~k Pin

+

~2 2

in

=

fin .

Using (1) and (2), this last

2

(5)

The last two terms in (5) represent the variation of the external energy of the atom during the transition – i.e. the variation of the atom’s center of mass kinetic energy between the final state where it is equal to (Pin + ~k)2 2 , and the initial state, where 2 it is equal to Pin 2 . This equation can be rewritten as: =

0

+ k vin +

where vin = Pin rec

=

~2 2

rec

(6)

~

is the initial velocity of the center of mass, and where:

2

=~

(7)

R

is the recoil energy; it is the energy an atom, initially at rest, will acquire upon the absorption of a photon having a momentum ~ . The same type of calculation holds for the emission process during which an atom, whose center of mass has an initial momentum Pin , goes from the internal state to the internal state by emitting a photon of energy ~ and momentum ~k. Equation (6) must now be replaced by: =

0

+ k vin

rec

(8)

~

where only one sign is changed with respect to (6). 2021

COMPLEMENT AXIX

1-b.



Doppler effect, Doppler width

The term k vin in equations (6) and (8) is simply the Doppler shift, due to the motion of the atom, of the frequencies it absorbs or emits. Setting: 0

=2 (

0)

=2 ∆

and

=2

(9)

we get for the frequency variation ∆ due to the Doppler effect: ∆

=

κ vin

(10)

where κ = k is the unit vector defining the propagation direction of the photon. Note that a radiation quantum theory is not needed to account for this frequency shift, which can be predicted by a classical theory. This was to be expected since, among the last two terms of (6) and (8), k vin is the only one that does not go to zero when ~ tends toward zero, as opposed to rec ~ = ~ 2 2 (which is proportional to ~). For an ensemble of atoms in a dilute gas at thermal equilibrium at temperature , the velocities are distributed according to the Maxwell-Boltzmann law, and the velocity dispersion ∆ is of the order of , where is the Boltzmann constant. The shifts ∆ of the frequencies emitted or absorbed by the atoms are distributed following a Gaussian curve3 whose width ∆ (standard deviation of the frequency distribution, equal to the square root of the variance), called the Doppler width, is given by: ∆

=

(11)

2

0

In general, in the optical domain and for temperatures around 300 K, the Doppler width ∆ is of the order of 1GHz=109 Hz, much smaller than the frequency 0 (of the 15 order of 10 Hz), but much larger than the natural width Γ, of the order of 107 Hz. In this domain, the resolution of spectroscopic measurements of line frequencies emitted by a dilute gas is generally limited by the Doppler broadening of the lines. Relativistic Doppler effect The previous calculations are only valid in the non-relativistic limit ( ). The Doppler shift expression can be generalized to any value of by noting that the four quantities are the four components of a four-vector. Let us assume the atom is at rest in a reference frame and emits a photon of frequency along the axis (we ignore here the recoil energy). An observer, in a reference frame moving with velocity along the axis, sees the atom moving away at velocity and measures a frequency for the photon emitted by the atom. According to the relativistic expressions for the transformations of four-vector components, we have: = 1

2

2

(12)

To first order in , replacing k vin by , we again find the Doppler shift of equation (8) – the relativistic correction being (in relative value) of the order of 2 2 for . 3 Actually, this distribution is the convolution of a Gaussian and a curve of width Γ, where Γ is the natural width (due to the spontaneous emission) of the line emitted or absorbed by the atoms. For a gas at room temperature, Γ is much smaller than the Doppler width.

2022

• 1-c.

MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Recoil energy

Imagine the atom is initially at rest, so that the terms in k vin in (6) and (8) are zero. When the atom absorbs a photon, its momentum increases by a quantity ~k equal to the momentum of the absorbed photon. Consequently, the atom recoils with a velocity rec = ~ in the direction of the incident photon, and its kinetic energy 2 becomes fin = 2 = rec . The energy ~ of the incident photon is used both to rec increase the internal energy of the atom by ~ 0 (since the atom goes from to ), and to increase its kinetic energy from 0 to rec . We then have ~ = ~ 0 + rec , which is a particular case of (6) for a zero initial velocity. However, for the emission process where the atom goes from to by emitting a photon, this relation is modified. As the photon leaves the atom with a momentum ~k, the atom recoils with the same momentum but in the opposite direction, and acquires a kinetic energy rec . The loss of internal energy of the atom, equal to ~ 0 , must now be used both to increase the radiation energy by ~ (the energy of the emitted photon), and to increase the kinetic energy of the atom from zero to rec . We then have ~ 0 = ~ + rec , that is ~ = ~ 0 rec , which coincides with (8) with the minus sign on the right-hand side.

Absorption

Emission

∆ωD

ω

ω0 ω0 − ωR

ω0 + ωR

Figure 1: Because of the recoil effect of the atom, the absorption and emission lines do not coincide, but form a doublet called the recoil doublet; they are centered at + for the absorption line and for the emission line. In a gas, their width is the Doppler width ∆ . The recoil of the atom with a kinetic energy rec causes the centers of the absorption and emission lines to be different (Figure 1); their position is 0 + for the absorption line and 0 for the emission line, with = rec ~. For a gas containing a large number of moving atoms, the velocity dispersion around a zero average gives each of these two lines a Doppler width ∆ . In the optical domain, the recoil frequency is of the order of a few kHz, much smaller than the Doppler width at room temperature, and also smaller than the natural width. Consequently the recoil doublet shown in Figure 1 is not resolved: the distance between the centers of the two lines is less than their width. However, when one studies the lines emitted by nuclei in the X-ray or -ray domains, the recoil frequency (which increases as 2 ) becomes comparable to or 2023

COMPLEMENT AXIX



even larger than the Doppler width (which only increases as ). The two lines in Figure 1 then only overlap far out on their wings. In this case, the photon emitted by a nucleus in an excited state has very little chance of being absorbed by another identical nucleus in the lower state . We shall see later how the recoil of a nucleus can be blocked when the atom having that nucleus is sufficiently strongly bound to other atoms in a crystal (Mössbauer effect). 1-d.

Radiation pressure force in a plane wave

Each time an atom absorbs a photon, it gains a momentum ~k. If ˙ abs is the number of absorptions per unit time, the atom gains a momentum ˙ abs ~k, per unit time. In a steady state, the number ˙ abs of absorptions per unit time is equal to the number of emissions per unit time ˙ em . This latter number is equal to Γ , where Γ is the natural width of the atom’s excited state, and where the diagonal element of the atom’s density operator in is the occupation probability of that state. This gain in momentum per unit time of the atom can be considered as coming from the action of a force, associated with the radiation pressure exerted by the light beam on the atom. One often calls this force the “radiation pressure force”. According to what we just saw, it is equal to4 : F = ˙ abs ~k = Γ

~k

(13)

To get an idea of the order of magnitude of this force, let us assume the light intensity is very high; the atomic transition is then saturated, meaning the occupation probabilities and of the higher level and the lower level are both equal to 1 2. We then have: F = ~k

Γ 2

Such a force can communicate an acceleration A equal to F Taking (14) into account, this acceleration is equal to: A=

~k Γ vrec = 2 2

(14) to the atom of mass

.

(15)

where vrec = ~k is the recoil velocity of an atom absorbing or emitting a photon, and = 1 Γ is the radiative lifetime of the excited state . Let us calculate the value of this acceleration for a sodium atom. The recoil velocity is of the order of 3 10 2 m s and the radiative lifetime of the order of 16 2 10 9 s, meaning the acceleration is of the order of 106 m s2 , which is 105 times larger (!) than the acceleration due to gravity (of the order of 10m s2 ). This high value for the acceleration arises because the velocity change of the atom rec for each absorptionemission cycle, though very small by itself, accumulates during the very large number of cycles, 1 2 , occurring in one second. 4 To compute the momentum change, we only considered here the photon absorption processes. The spontaneous emission processes also change the atom’s momentum, as the atom recoils when it emits a photon. However, the spontaneously emitted photons can have any direction in space, and the momentum change of the atom has a zero average. This phenomenon, however, gives rise to a diffusion of momentum, hence increasing the velocity dispersion of the atoms. We shall see in § 2-b- that this momentum diffusion must be taken into account when evaluating the limits of laser cooling.

2024

• 2.

MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Applications of the radiation pressure force: slowing and cooling atoms

We now study three applications of the radiation pressure force using either one laser beam, or two laser beams with the same intensity and same frequency, propagating in opposite directions along the axis. We shall see in § 2-a how, with just a single laser beam, the radiation pressure force exerted by the beam on the atoms in an atomic beam propagating in the opposite direction can be used to slow down, and even bring to rest, the atoms of the beam. With two laser beams propagating in opposite directions, interesting effects occur if one can introduce an imbalance between the two radiation pressure forces, depending either on the atom’s velocity along the axis, or on its position . We study in § 2-b how to get this velocity dependence, and how the sum of the two forces exerted by the two waves, zero for = 0, becomes, for = 0 but sufficiently small, a linear function of ; it can then be expressed as . For a proper detuning of the lasers’ frequency, the coefficient can be negative, so that the resulting force acts as a friction force that damps the velocities of all the atoms in the beam, and hence cools them down. This is the principle of laser cooling. In § 2-c, we shall see how a position dependence can be obtained, and how the resulting force, zero for = 0, becomes different from zero if = 0 and equal to if is sufficiently small. If is negative, this force becomes a restoring force that can trap the atoms around = 0. This is the principle of the magneto-optical trap. 2-a.

Deceleration of an atomic beam

Imagine that an atomic beam is irradiated by a resonant laser beam propagating in the direction opposite to that of the beam. Due to the radiation pressure force, the atoms of the beam will slow down. Is it possible to bring them to rest? It is important to notice that even if the laser beam is initially resonant, it will not be when the atoms’ velocities change, since the Doppler effect takes the atoms out of resonance; this will significantly lower the radiation pressure force, and hence the slowing down effect. For the sake of simplicity, we shall first ignore the Doppler effect following the change in the atoms’ velocities; we shall see later how it can actually be circumvented. We first assume the radiation pressure force does not change in the course of the deceleration process, and that the laser intensity is high enough for the atomic transition to be saturated; we can then use the orders of magnitude calculated in the previous paragraph. We found that for sodium atoms, the deceleration is of the order of 106 m s2 . If the atoms of the beam have an initial velocity of the order of 103 m s, their velocity will be zero after a time of the order of 10 3 s, after they traveled a distance 2 2 of the order of 0 5m, which shows that the size of such an experiment is not, a priori, excessive. To solve the problem of the atoms going out of resonance because of the Doppler effect which changes in the course of the slowing down, an ingenious method was proposed and demonstrated [18]. It is based on the propagation of the atoms in a spatially inhomogeneous magnetic field. More precisely, the atomic beam travels along the axis of a spatially varying solenoid coil (Figure 2). The magnetic field produced by the solenoid is parallel to the beam direction, and its intensity varies along the beam axis. As an atom propagates along this field, it undergoes a variable Zeeman shift of its resonance frequency. One can then adjust the profile of the field so that as the atom is slowed down, the Zeeman shift of the atomic frequency balances the Doppler shift of the laser 2025

COMPLEMENT AXIX



Laser

Atomic beam

Solenoid with varying diameter

Figure 2: Schematic diagram of a Zeeman slower. The atomic beam is cooled by a laser beam propagating in the opposite direction. It travels along the axis of a solenoid composed of a set of magnetic coils with decreasing diameters, whose cross sections are shown in the figure (the current in the coils flows perpendicularly to the figure). While they propagate, the atoms are submitted to a larger and larger magnetic field. The Zeeman shift of their resonance frequency can thus follow the Doppler shift of the apparent laser frequency in their own reference frame. Consequently, instead of going off resonance, they can be slowed down during their entire propagation through the solenoid, and even come to rest. frequency: at each point , the field is calculated so that both Doppler and Zeeman shifts exactly balance each other. This type of apparatus is called a “Zeeman slower”. 2-b.

.

Doppler laser cooling of free atoms

Doppler laser cooling principle

The “slower” described above concerns the mean velocity of atoms, which can be brought down to zero. However, the root mean square of the atomic velocities remains non-zero, as does the temperature which is characterized not by the mean velocity but by its root mean square. We now describe a method using the velocity dependence of the radiation pressure force, due to the Doppler effect, and which permits reducing the dispersion of the atomic velocities around their mean value, and hence really cooling down the atoms. As this method uses the Doppler effect, it is called “Doppler laser cooling”. It was proposed in 1975 for free atoms [19] and for trapped ions [20]. Here, we shall only study the case of free atoms. The idea is to use two laser waves 1 and 2, having the same angular frequency and the same intensity, counterpropagating along the axis, wave 1 toward negative , and wave 2 toward positive (Figure 3). Imagine an atom also propagating along the axis with, for example, a positive velocity . We call 0 the angular frequency of the atomic transition excited by the laser. We assume the lasers are “red-detuned”, meaning that: 0

(16)

In the reference frame where the atom is at rest, the apparent frequency of wave (with = 1 2) is Doppler shifted, and equal to ki v. As is positive, the atom and wave 1 propagate in opposite directions, so that k1 v is negative. The apparent frequency 2026



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Figure 3: Principle of Doppler laser cooling. An atom moving with velocity along the axis interacts with two laser waves 1 and 2 having the same angular frequency ; this frequency is “red-detuned”, meaning 0 . The two waves have the same intensity but propagate in opposite directions along the axis. The direction of the velocity is opposite to that of wave 1. In the atom’s frame of reference and because of the Doppler effect, the frequency of wave 1 is shifted closer to resonance, whereas the frequency of wave 2 is shifted away from it. Consequently, the modulus of force 1 exerted by wave 1 increases, whereas that of the force 2 exerted by wave 2 decreases. The resulting force, zero for = 0, is opposed to when = 0, and proportional to for small enough values of . It is therefore equivalent to a friction force.

k v of wave 1 is therefore increased. The Doppler effect brings the apparent frequency of wave 1 and the atomic resonance frequency closer together; consequently, the modulus of the radiation pressure force 1 exerted by wave 1 on the atom and directed, like wave 1, towards the negative , increases with respect to its value for = 0. The conclusions are just the opposite for wave 2, whose frequency is shifted away from resonance by the Doppler effect; the force 2 it creates along the positive is weaker than its value for = 0. The sum of the two forces5 is zero for = 0 (both forces have the same modulus and opposite directions), but no longer zero when = 0. For positive values of , it has the same direction as 1 since the modulus of 1 is larger than that of 2 (Figure 3); for negative values of , it is just the opposite since the two waves have switched their roles. The sum = 1 + 2 of the two radiation pressure forces exerted by the two waves is thus in the direction opposite to that of the velocity . For small values of , it is proportional to and can be written: =

(17)

where

is a positive friction coefficient. Under the effect of this friction force, the atomic velocities are constantly reduced. Their dispersion, however, does not tend towards zero for long interaction times, because of the unavoidable fluctuations in the emission and absorption processes. There is actually a competition between the friction effect we just described, which tends to cool down the atoms, and the momentum diffusion that tends to heat them up. We evaluate in the next paragraphs the effect of these two processes to estimate the order of magnitude of the temperatures that can be obtained by Doppler laser cooling. 5 We

shall see below that the effects of interference between the two waves can be neglected.

2027

COMPLEMENT AXIX

.



Estimation of the friction coefficient

We will now seek an estimation of the friction coefficient, in order to calculate the evolution of the momentum of the atom, as well as of its energy. In a first step and for the sake of simplicity, we will limit ourselves to a calculation of the average effect of the photon absorption and emission cycles by the atom. This will allow us to determine the evolution of its average momentum . Nevertheless, the absorption and emission processes of photons by an atom are actually fluctuating processes, as we will discuss below. Ssuch a calculation is theferore not sufficient to obtain the evolution of the average 2 of the square of the momentum, that is of the average kinetic energy, since the average value of a square differs in general from the square of the average value. In a second step, we will use a calculation that takes the fluctuations of the momentum transfers betwee photons and atoms into account. We set: =

0

+

(18)

where ~ is the recoil energy defined in (7); is the detuning between the laser frequency and the atomic frequency 0 . We assume from now on that the intensity of the two lasers is weak; the atomic transition is not saturated and consequently the population of the excited level remains low. Its variation as a function of the detuning follows a Lorentzian curve with a total width at half maximum equal to Γ: ( )=

(0)

(Γ 2)2 [(Γ 2)2 +

2]

(19)

We shall not need the expression for (0), since it cancels out in the expressions for the friction and diffusion coefficients we shall obtain; it can however be found in Chapter V of reference [21] (optical Bloch equations). In a perturbative approach to the problem, two types of terms must be considered: the “square” terms, coming from the interaction between the dipole induced by wave with that same wave ; the “cross” terms coming from the interaction of the dipole induced by wave with the other wave = . The cross terms correspond to interference between the effects of the two waves. However, as these two waves do not have the same spatial dependence (they propagate in opposite directions), these interference effects vary rapidly in space as exp( 2 ). They consequently vanish when the forces are averaged over distances of the order of the laser wavelengths, as we shall assume. It is then possible to consider that the radiation pressure force acting on the atom is simply the sum of the radiation pressure forces exerted by each wave, in the absence of the other. When the atom is at rest, the two waves have the same frequency in the atom’s reference frame, and hence the same detuning ; remember that is supposed to be negative in a laser cooling experiment – see (16). If the atom moves with a velocity 0, we just saw that the apparent frequency of wave 1 is increased by a quantity (where and are positive), so that the detuning for the interaction of the atom with wave 1 is equal to: 1

= +

(20)

while it is 2 = for wave 2. For this atom, the population of the excited state is the sum of two contributions: that of wave 1, obtained by replacing with + in 2028



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Figure 4: Populations bb ( 1 ) and bb ( 2 ) excited in the upper state by wave 1, with detuning 1 = + , and wave 2, with detuning 2 = . As the detuning for = 0 is assumed to be negative, the ordinate bb ( 1 ) of point is higher than the ordinate . bb ( 2 ) of point expression (19), and which is proportional to the ordinate of point whose abscissa is in Figure 4; that of wave 2, which requires to replace with in (19), 1 = + and which is therefore proportional to the the ordinate of point having the abscissa in that same figure. Computations similar to those of § 1-d allows computing the total force acting on the atom, as the sum of the two forces exerted by each wave, added independently of each other: =

Γ~

( +

) + Γ~

(

)

(21)

where

( ) is the function defined on the right-hand side of (19). When is small compared to the width Γ of the curve in Figure 4, one can expand ( ) to first order in and obtain: =

2

Γ~

d d

( )

(22)

The last factor on the right-hand side of (22), which is the slope at point (with abscissa ) of the curve representing ( ), can be computed from (19). For the point = Γ 2 where the slope is maximum, we find: d d

( )=

2

( ) Γ

(23)

Inserting (23) into (22), we get: =

(24)

where the friction coefficient = 4~

2

( )

is given by: (25) 2029

COMPLEMENT AXIX



Using equation (24), we can also compute the damping of the momentum its square. As = d d and = , we have: d = d

and of

(26)

We can also write: d d

2

=2

2

d = d

2

(27)

Remember, however, that the average value of the kinetic energy is proportional to 2 , the average value of the square of the momentum, and not to the square of its average value 2 . We should therefore not conclude from relation (27) that the ultimate temperature that can be reached by Doppler laser cooling when is zero. Equation (27) was obtained by considering only the average effect on the atom’s velocity of the light beams and of the successive spontaneous emission processes, which introduces a continuous average evolution. But, in reality, the absorption and emission processes fluctuate and yield photons emitted in random directions. Even though their total momentum has a zero average (meaning is not affected), these fluctuations will change 2 . This effect can be considered as a source of noise (also called momentum diffusion) that increases 2 , and acts in the direction opposite to the friction. It is therefore the competition between these two opposed mechanisms that leads to an equilibrium state, whose energy 2 2 determines the ultimate temperature that can be obtained. .

Momentum diffusion

We now study the diffusion of the momentum of one atom, due to the spontaneously emitted photons; the ensemble of particles discussed above then reduces to one single atom ( = ). Let us consider a time interval d whose value will be specified later. We call d 1 and d 2 the photon numbers from waves 1 and 2 that are absorbed during that time interval. As we assume the friction has acted long enough to cancel the average velocity, the detuning has become the same for the two beams. We then have, on the average: d

1

=d

2

=d

(28)

Each absorbed photon then yields a spontaneously emitted photon. We use here a simple one-dimension model: each photon is emitted spontaneously in a random way, either in the positive direction (the atom’s recoil is then negative), or in the negative direction (the recoil is then positive). The variation of the atom’s momentum is then } , with = 1 in the first case and = +1 in the second. There are no correlations in the directions of two consecutive photons. The total momentum d gained by the atom during the absorption and re-emission of d 1 + d 2 = 2 d photons is equal to the sum of the momentum gained by the absorption of photons from beam 1, of the momentum gained by the absorption from beam 2, and finally of the momentum coming from the spontaneous emissions: 2 d

d =~

d

2

d

1

+

(29) =1

2030



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

(i) To begin with, we neglect the fluctuations of the number of photons absorbed from each wave (we shall discuss this point later). The numbers d 1 and d 2 are then equal to their average values, and we have: 2 d

d =~

(30) =1

The summation over in (29) yields a zero contribution to d , since the sum of the is zero on the average, but this is not the case for the contribution to d 2 that we now compute. Taking the square of the summation over in (30), the cross terms with = are zero on the average, because there are no correlations between the signs of the and the . We are left with the square terms 2 which are all equal to 1. We obtain: d

2

~2

=2d

2

(31)

which is obviously non-zero. Let us specify the time interval d , which should be long enough for the number dN of absorptions and spontaneous emissions to be large during this time, but sufficiently small for the variations of to stay negligible. We can then write: = ˙d =Γ

d

( )d

(32)

where we used in the last equality the fact that the average number of photons absorbed per unit time in each wave is equal to Γ ( ), as we have seen above. Inserting this 2 result in (31), we get the increase of during the time interval d : d

2

= 2~2

2

= 2~2

d

2

Γ

( )d

(33)

We finally get: d 2 d

= 2~2

2

Γ

( )=

(34)

sp

sp

The subscript “sp” of the parenthesis reminds us that this increase of 2 per unit time is due to spontaneous emission processes. This expression is often called the “diffusion coefficient” sp . (ii) We now study the effect of fluctuations on the numbers d 1 and d 2 of absorbed photons in each wave during the time interval d . If these fluctuations are no longer neglected, we must write: d

=d

+

=d

+

(35)

with = 1 2. In this equation, is the fluctuation of the number of absorbed photons in wave (by definition, the average value of is zero). The total momentum d the atom receives during the absorption of these photons is equal to: d = ~ (d

2

d

1)

=~ (

2

1)

(36)

Since the average values of 1 and 2 are zero, the fluctuations in the absorption process do not change the average value of , but this is not true for the average value 2031



COMPLEMENT AXIX

2 . Taking the square of (36) and using the fact that there are no correlations between the fluctuations of 1 and 2 (and hence the average value of their product is zero), we get: 2

d

2 1

=

2,

To compute 2

d

2 2

+

2

=d

~2

2

(37)

we take the square of equation (35). This leads to:

+2 d

+

2

(38)

We now take the average value of each side of this equation. Using the fact that the average value of is zero, we get: 2

=d

2

2

d

(39)

The quantity d 2 d 2 is simply the variance of the number of photons absorbed from the wave. In general, for Poisson statistics, we have6 : 2

d

2

d

=d

We deduce that 2

d

~2

= 2d

2

=d

(40)

is simply equal to d , so that equation (37) is simply written:

2

(41)

We therefore obtain for the increase of 2 due to fluctuations in the absorption processes a result identical to (31). The computations leading from (31) to (34) can be repeated and we get the following result for the increase, per unit time, of 2 due to the absorption processes: d 2 d

= 2~2

2

Γ

( )=

abs

(42)

abs

where the diffusion coefficient abs due to the absorption processes has the same value as sp for the spontaneous emission processes: abs

=

sp

= 2~2

2

Γ

( )

(43)

To evaluate the global rate of variation of 2 , we must add to the rates of variation (34) and (42) the one due to the cooling process (d 2 d )cool . This variation is simply the variation of 2 in the absence of fluctuations during the momentum exchanges, so that it can be obtained by assuming that 2 and 2 are simply equal. Using (27), we can write this variation as: d 2 d

2

=

2

(44)

cool

6 One could evaluate the effects of the deviations from Poisson statistics, but it will not be done here to keep things simple; this is legitimate for the low laser intensities (unsaturated transition) we assumed here.

2032



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Adding (34), (42) and (44), we finally obtain a total rate of change equal to: d 2 d

2

=

2

+

sp

+

abs

(45)

tot

This rate of change goes to zero when : 2

=(

+

sp

abs )

2

(46)

Dividing both sides of this equation by 2 and using expressions (25) and (43) for the friction and diffusion coefficients, we finally get7 the expression for the average kinetic energy in the stationary regime: 2

=

2

~Γ 4

(47)

This result indicates that the ultimate temperature is directly related to the natural width of the excited level. .

Doppler temperature

In a Doppler laser cooling experiment, and for a one-dimensional treatment of the problem, the average kinetic energy of the velocity fluctuations around its average value is related to the Doppler temperature : 2

2

=

2

(48)

where is the Boltzmann constant. Using (47) and (48), we find that the equilibrium temperature that can be reached by Doppler laser cooling is given by: =

~Γ 2

(49)

For sodium atoms, this temperature is equal to 235 10 6 K, that is 106 times lower than room temperature (of the order of 300 K)! Our treatment of Doppler laser cooling is based on several approximations, as for example the one-dimensional treatment and the simplified description of spontaneous emission occurring only in two opposite directions. Nevertheless, more precise calculations lead to the conclusion that the average kinetic energy reached in a stationary regime is indeed of the order of ~Γ, within numerical coefficients of a few units. Finally, equation (47) yields an estimation of the velocity ¯ of the cold atoms thus obtained: ¯



(50)

This means that the Doppler shifts ¯ of the apparent frequencies of waves 1 and 2 are such that the separations of the dotted lines in Figure 4 is of the order of ~Γ . 7 As the diffusion and friction coefficients are all proportional to (46) and is no longer present in (47).

( ), this factor disappears from

2033

COMPLEMENT AXIX



We can compare these Doppler shifts with the width Γ of the curve plotted in Figure 4 and obtain: ~Γ Γ

~

2

Γ

rec

(51)



where rec is the recoil energy given in (7). The ratio rec ~Γ is in general small for the atomic resonance lines used in laser cooling; it is of the order of 1 100 for sodium, which means that ¯ remains small compared to Γ and shows the validity of the limited series expansion used in equation (22). Other laser cooling methods Until now we have described laser cooling methods using only the Doppler effect. Other methods have been proposed and demonstrated, such as “Sisyphus cooling” (§ 4 of Complement DXX and [22]), the “subrecoil cooling” and “evaporative cooling” [23]. The reader interested in the two latter methods may read for instance § 13.3 of [24]. 2-c.

Magneto-optical trap

We wish to introduce an imbalance, depending on the atom’s position , between two laser beams 1 and 2, with the same frequency but propagating in opposite directions along the axis. This requires achieving detunings (between the lasers’ frequency and the atomic frequency) that depend on the position of the atom along the axis; these detunings must be the same for both lasers when = 0, and different when = 0. We must then necessarily use an atom with several Zeeman sublevels and different polarizations for the two beams 1 and 2. The principle of the method, suggested for the first time by Jean Dalibard in 1986, is schematized in Figure 5. We assume the atomic transition is between a ground state with a zero angular momentum ( = 0) and an excited state with an angular momentum equal to 1 ( = 1). The solid lines in Figure 5 plot the energy of the three Zeeman sublevels +1 , 0 and 1 of the excited state, and of the sublevel 0 of ground state, in an inhomegeneous magnetic field applied along the axis. This field is zero at = 0 and varies linearly with around = 0; it can be created, for example, by two circular coils having the same axis , placed symmetrically with respect to = 0, and carrying currents of opposite directions. The quantization axis is chosen along and allows defining the magnetic quantum numbers and of the sublevels in the excited and ground states. The two laser waves 1 and 2 propagate in opposite directions and have opposite circular polarizations with respect to the quantization axis 8 . Wave 1, with polarization + , excites the transition 0 +1 whereas wave 2, with polarization , excites the transition 0 1 . The energy ~ 0 of the zero-field atomic transition is equal to the difference between the energies of states ~ 0 between the 0 and 0 (solid horizontal lines in the figure). The detuning ~ = ~ energy ~ of the laser photons and that of the zero-field atomic transition is shown as the difference in height between the dashed and solid horizontal lines in the figure. We 8 One generally defines the right-hand and left-hand circular polarization with respect to the propagation direction of the photon. In that case, the polarization of the photons of wave 1 and wave 2 in Figure 5 would be + (in both cases the electric field of the wave turns around the direction of propagation following the right-hand rule). When studying the selection rules of the various transitions resulting from the conservation of spin angular momentum (see Complements BXIX and CXIX ), it is best to define both the quantum numbers and the polarizations of all the beams with respect to the same axis, chosen here to be the quantization axis .

2034



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

assume here that the detuning is negative (the laser’s frequency is shifted towards the red of the atomic transition 0 ). At point = 0, the energies of states +1 and 1 are equal, as are the detunings of waves 1 and 2 which excite the transitions 0 +1 and 0 1 . The radiation pressure forces exerted by the two waves are equal in intensity and opposite in direction, so that the resulting force is zero. This balance does not hold as soon as we move away from = 0. For example, at = + , wave 1 which excites transition 0 +1 , is at resonance with this transition and the force it exerts is at its maximum. On the other hand, at that same point, wave 2, which excites transition 0 1 , is way off-resonance and hence exerts a much weaker force. Consequently, the balance between the two waves is broken in favor of wave 1 and the resulting force exerted by both waves is non-zero and directed towards the right. The conclusions are inverted at point = , where the total force is non-zero and directed towards the left. We obtain a restoring force, proportional to in the vicinity of = 0, which traps the atoms around = 0. Such a trap is called a “magneto-optical trap” or “MOT”. For the sake of simplicity, we only considered a one-dimensional model, but the extension to three dimensions is possible. Note in particular that the field created by

Figure 5: Principle of the magneto-optical trap. The transition used is a =0 =1 transition, excited by two laser waves 1 and 2 propagating in opposite directions along the axis, with polarizations ( + ) and ( ) with respect to the quantization axis . A magnetic field gradient is applied along the axis, the magnetic field being equal to zero at = 0. Wave 1, which excites the transition 0 +1 , is resonant for this transition at point = + and the radiation pressure force it exerts on the atoms is then maximal. At that point, wave 2, which excites the transition 0 , is off-resonance for the corresponding atomic frequency, and, consequently, exerts a much weaker force. The two forces are unbalanced, in favor of the wave 1 force. The resulting force is non-zero, directed towards 0. The conclusions are reversed at point = , where the resulting force is non-zero, directed towards 0. Finally at = 0 both waves are off-resonance by the same amount, and the resulting force is zero. We obtain a restoring force that traps the atoms around = 0. 2035

COMPLEMENT AXIX



two circular coils centered around the axis, placed on each side of point = 0 and carrying currents in opposite directions, is zero at = 0 and exhibits non-zero gradients along the and axis. The detuning towards the red of the laser frequency with respect to the atomic frequency also has the advantage of providing a Doppler laser cooling effect. The magneto-optical trap is nowadays a basic tool of cold atom physics9 . Other trapping methods Other methods for trapping atoms with light beams exist, for instance laser dipole trapping methods (Complement DXX , §§ 1, 2 and 3).

3.

Blocking recoil through spatial confinement

We now assume the atom or the ion under study is subjected to an external potential that traps it in a region of space. The energy spectrum of the external variables is no longer a continuous spectrum (as would be the case for a free atom), but a spectrum including a discrete part corresponding to the atomic bound states. Furthermore, because of this external potential, the atomic Hamiltonian is no longer translation invariant, and hence the total momentum is no longer a good quantum number. In this section we study how the absorption and emission spectra of the atom are modified by its confinement within the potential, and how the recoil of the atom can be blocked in certain cases. 3-a.

Assumptions concerning the external trapping potential

We will assume the external potential acts only on the external variables, and not on the internal variables. This is the case for example for an atomic ion trapped by electric and magnetic fields, which only act on the center of mass via the global ionic charge, but does not act on the internal variables 10 . Figure 6 plots the trapping potentials for an atom or an ion in the internal states or . The two potential curves are identical, deduced from one another by a vertical translation of amplitude ~ 0 where 0 is the frequency of the internal atomic transition . The spectrum of the vibrational levels, of energies , , , ... is the same for the two potentials. The atomic states are labeled by two quantum numbers: a quantum number or for the internal variables; a vibrational quantum number , , , ... for the external variables. The atomic transitions between the two internal states and now present a structure due to the vibrational motion of the center of mass. The frequency of the transition is given by: ~

=~

0

+

(52)

The quantity gives the variation of the external energy of the atom during the transition.

9 See § 14-7 of [24] for a description of the first experimental realizations of such a trap and for a more quantitative study of its performances. 10 The ion is generally confined in the center of the trap, a region where the electric and magnetic fields are very weak. It is then legitimate to neglect the Stark or Zeeman shifts of the internal states.

2036



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Figure 6: The external potential trapping the ion is the same when the ion is in either of its internal states or , separated by an energy difference ~ 0 . Consequently, the spectrum of the vibrational levels of the center of mass in the external potential is the same for both internal states.

3-b.

Intensities of the vibrational lines

We showed in § C-5 of Chapter XIX that the matrix elements of the interaction Hamiltonian could be factored into three terms pertaining to the three types of variables, the internal and external atomic variables, and the radiation variables – see relations (C34) and (C-35) of that chapter. The part relative to the external variables is equal to ext ext ext ext ˆ for an absorption process, where fin and in are the external fin exp( k R ) in final and initial states of the transition, equal here to and . This leads to an intensity of the vibrational line proportional to: ˆ exp( k R)

= The

2

(53)

obey the sum rule (obtained by the closure relation on the states =

exp(

ˆ k R)

ˆ exp( k R)

It follows that the relative weight of the transition transitions starting from is precisely equal to . Another sum rule The relative weights =

(

=1

): (54)

compared to all the

obey another sum rule: )=

~2 2

2

=

rec

(55)

2037



COMPLEMENT AXIX

which indicates that the average energy gained by an atom going from a given level to another level is equal to the recoil energy, whatever the value of . To prove relation (55), we rewrite the sum over in (55) in the form: ˆ ˆ ext k R)

exp(

ˆ exp( k R)

(56)

ˆ is the external variable Hamiltonian. The only term where ˆ ext = Pˆ 2 2 + (R) ˆ is the kinetic energy term. We can in ext that does not commute with exp( k R) ˆ Pˆ 2 2 replace the commutator appearing in (56) by exp( k R) . We now develop this commutator and use the closure relation on the states, as well as relation: exp(

ˆ2 ˆ = 1 (Pˆ + ~k)2 ˆ ( P ) exp(+ k R) k R) 2 2

(57)

This yields: exp(

ˆ2 ˆ P exp(+ k R) ˆ k R) 2 = rec +

Pˆ 2 2 ~k Pˆ

=

rec

(58)

since, taking Ehrenfest’s theorem into account (Chapter III, § D-1-d- ), the average value of operator Pˆ (equal to d R d ) in the stationary state is zero. 3-c.

Effect of the confinement on the absorption and emission spectra

The absorption and emission spectra of an atom are significantly modified by the confinement of its center of mass. As we saw in § 1, when a free atom has a well defined initial momentum Pin , the absorption of a photon with momentum ~k places it in a well defined momentum state Pfin = Pin + ~k. The conservation of the global momentum means that there is only one final state, Pfin , corresponding to an initial state Pin , and hence a single absorption line . On the other hand, when the global momentum is no longer conserved because of the external potential’s trapping of the atom, we get several lines going from an initial given state to several possible final states , whose frequencies are given by (52). One can then ask which of these lines is the strongest. To answer that question, we go back to expression (53) for the relative weight of ˆ represents a translation the transition . In this equation, operator exp( k R) operator, in momentum space, of the quantity ~k. The quantity is thus proportional to the squared modulus of the scalar product of the vibrational wave function (r) and the wave function (r) translated in momentum space by the quantity ~k. Let us assume the atom is trapped in a region of spatial extension ∆ , very small compared to the wavelength = 2 of the incident photon. The momentum spread ∆ of the wave function is then larger than ~ since: ∆

(59)

leads to: ∆ 2038

~ ∆

~

=

~ 2

(60)



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

This means that, as long as the excited levels’ energy is not too large, the translation in momentum space of the wave function (r) by a quantity ~k much smaller than its width leaves that wave function practically unchanged, and it consequently remains orthogonal to (r) for = . The strongest line in the absorption spectrum is therefore the line with no change in the vibrational quantum number (a line sometimes called the zerophonon line), whose frequency remains unchanged, equal to 0 . A strong confinement suppresses the Doppler shift and the atom’s recoil. Comment on momentum conservation One may wonder what happened to the momentum of the absorbed photon in the zerophonon transition. Remember that we have treated the trapping potential of the atom as an external potential, which breaks the translation invariance of the problem under study: the Hamiltonian no longer commutes with the total momentum, which is no longer conserved. It is thus not surprising that we cannot follow what becomes of the photon momentum. We can, however, describe the trapping potential, not as an externally given potential, but rather coming from the interaction of the atom with another physical object whose dynamics must be taken into account. A quantum treatment of that device and its interaction with the atom permits introducing for the global system, “atom + trapping device”, a Hamiltonian that commutes with the total momentum; it is the global system that absorbs the photon momentum. As this momentum is microscopic, whereas the mass of the device is macroscopic, the recoil velocity is so weak that the corresponding frequency shifts are totally undetectable. 3-d.

Case of a one-dimensional harmonic potential

We now assume the external trapping potential is harmonic, and we call osc the oscillation frequency of the atoms in this potential. The energies of the vibrational levels are equal to ( + 1 2)~ osc , where is an integer, positive or zero. The spatial extension ∆ 0 of the ground state wave function = 0 is equal to ~ 2 osc . To characterize the confinement, we introduce the dimensionless parameter11 : = ∆

0

=2



0

(61)

If 1, the atoms are confined in a region small compared to the radiation wavelength. The square of the parameter has a simple physical significance since: 2

=

~ 2

2

rec

= osc

~

(62)

osc

is the ratio between the recoil energy rec and ~ osc , which is the energy difference between vibrational levels in the potential well. It is instructive to compute, as a function of , the intensities 0 0 and 0 1 of the vibrational lines 0 0 and 0 1. Assume the photon wave vector k is parallel ˆ appearing in equation (53) can be replaced by to the axis. Exponential exp( k R) 11 This parameter is often called the Lamb Dicke parameter, after the names of the physicists who first introduced the idea of recoil-free absorption in a trapped system. To get a historical overview of the various studies on the suppression of recoil due to confinement, the interested reader may consult § 6-4-4 of [24] as well as the references cited in that §.

2039

COMPLEMENT AXIX



exp( ˆ ). We now use the expression of operator ˆ in terms of the annihilation and creation operators of the harmonic oscillator associated with the external potential: ˆ=

~ 2

osc

(ˆ + ˆ ) = ∆

We then get, in the limit ˆ ) = exp

exp(

0

(ˆ + ˆ )

(63)

1, using (61):

(ˆ + ˆ ) 2

=1+

(ˆ + ˆ )

2

(ˆ + ˆ )2 +

(64)

The series expansion (64) used in (53) yields, to order 2 in : 00 10

2

=1 =

2

(65a) (65b)

All the other transitions 0 with > 2 have relative intensities of a higher order, in 2 . The transition with no change in the vibrational state, and hence with no recoil, is predominant for a strong confinement. The transition 0 1 has a much lower probability, by a factor rec ~ osc ; when it occurs, it increases the atomic energy by a quantity ~ osc much larger than rec . The sum rule (55) shows that, on the average, the energy gained by the atom remains equal to rec . 3-e.

Mössbauer effect

In 1958, Rudolf Mössbauer observed very narrow lines in the resonant absorption spectrum of rays by the atomic nuclei in a crystal. Building on the previous work of Lamb [25] on the suppression of the recoil in the resonant absorption of slow neutrons (and not of photons) by the atomic nuclei in a crystalline network, he attributed the narrow spectral structures he observed to a suppression of the recoil. This suppression can occur if, in the crystal phonon spectrum, there are frequencies larger than the recoil frequency = rec ~. The interest of the Mössbauer effect comes from the high value of the frequency of the internal transition, which can reach 1018 Hz, or even much higher frequencies. If the Doppler width and the recoil shift are suppressed by the confinement, and if the natural width remains of the order of 106 to 107 Hz (as for an optical transition), the quality factor of the transition (ratio between frequency and the spectral width of the resonance) can reach values of the order of 1012 . Such a resolution allowed measuring, already in 1960 [26], and for the first time in a laboratory, the gravitational shift predicted by general relativity between the frequencies of an emitter and a receptor, both located in the earth’s gravitational field but separated by an altitude of roughly twenty meters. 4.

Recoil suppression in certain multi-photon processes

Until now, we only considered one-photon processes. We shall see in Chapter XX that there are multi-photon processes in which the atom goes from an internal state to another state by absorbing or emitting several photons. During such processes, the total energy and momentum must of course be conserved12 . 12 We

2040

now consider again free atoms, without an external potential.



MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Figure 7: Saturated absorption spectroscopy. This figure plots the absorption profile of the probe beam when scanning the frequency of the two laser beams. A narrow hole, of width Γ, appears in the middle of a much larger Doppler profile, of width ∆ . Imagine a two-photon process, where the two photons have the same frequency and opposite wave vectors +k and k. The total radiation momentum is zero in this case. If the atom absorbs those two photons, its momentum does not change. Its external energy is not modified, meaning there is neither a Doppler effect, nor a recoil energy. This possibility can be extended to -photon processes as long as the sum of the wave vectors of the photons is zero: k1 + k2

+k =0

(66)

This idea was proposed independently by two groups, in Russia [27] and in France [28]. It led to significant experimental advances in high resolution spectroscopy where the line width is no longer limited by the Doppler width, but rather by the often much smaller natural width. A particularly interesting example is the study of the transition 1 2 of the hydrogen atom by Doppler-free two-photon spectroscopy [29]. The upper state 2 of this transition is metastable, with a long lifetime (around 120 ms). Consequently, its natural width is very small and the two-photon line very narrow, which allows extremely precise measurements of fundamental constants such as the Rydberg constant. Note must be taken however that the interaction with the laser radiation inducing the two-photon transition leads to shifts of the energy levels13 , proportional to the light intensity; these must be taken precisely into account to determine the non-perturbed frequency of the two-photon transition. Doppler-free saturated absorption spectroscopy Nonlinear effects also appear in experimental set-ups where the atom interacts with two counter-propagating light beams, with the same frequency , one having a high intensity (pump beam) and the other a weaker one (probe beam). Contrary to the two-photon transitions considered in the previous paragraph, we assume here that the transitions induced by each beam are one-photon transitions between the two atomic internal states and ; the laser frequency is therefore close to the atomic transition 0 (and not close to 0 2). We neglect here the recoil energy, in general very small in the optical domain compared with the natural width Γ of the upper state . However, the Doppler shifts of the absorp13 See

Complement BXX , §2-b

2041

COMPLEMENT AXIX



tion lines of the various atoms play an important role, as they are different for the pump beam and the probe beam which propagate in opposite directions. The pump beam interacts with an atom of velocity pump if its apparent frequency pump for that atom coincides with the atomic frequency 0 (within Γ), i.e. if pump = 0 within Γ. In the same way, the probe beam interacts with atoms of velocity probe if + probe = 0 within Γ, i.e. if probe = ( is different from 0 , we have 0 ) within Γ. When pump = probe : the two beams do not interact with the same atoms in the velocity distribution, so that the absorption of the probe beam is not perturbed by the presence of the pump beam. However, this perturbation becomes important when = 0 (within Γ), since the two beams interact with the same sub-set of atoms (those belonging to the same “velocity group” along the beams’ axis). The high intensity pump beam lowers the difference in populations between the and levels of the atomic transition, and tends to equalize these populations. The absorption of the probe beam is thus diminished when the two beams interact with the same velocity group, i.e in the vicinity of = 0 . When scanning the frequency of the two laser beams, the absorption of the probe beam varies according to a Doppler profile centered around 0 , with width ∆ , in the middle of which (Figure 7) appears a hole with a much smaller width Γ. This method, called saturated absorption, allows the determination of the atomic frequency 0 with a much better resolution than when using a single beam.

Conclusion In this complement, we showed how the analysis of the momentum exchanges between atoms and photons allows introducing and interpreting several important physical phenomena. These phenomena include Doppler width, recoil energy, radiation pressure forces, Doppler laser slowing down and cooling of atoms, suppression of the Doppler effect due to confinement or in two-photon transitions, and the Mössbauer effect. Thanks to these various methods, spectacular improvements in the resolution of spectroscopic measurements have been obtained. This led to high precision measurements and improvements of atomic clocks, which now have a relative stability of the order of 10 16 . Placing such a clock in the international spatial station and comparing its frequency with that of a similar clock on the Earth, one hopes to be able to test the value of the gravitational shift predicted by general relativity with a precision better, by a factor close to 100, than all the other existing tests. Another conclusion we can draw is that atom-photon interactions are useful tools for controlling and manipulating atoms. We shall see in Complement CXIX how the exchanges of angular momentum between atoms and photons allows controlling the angular momentum of the atoms, polarizing them via optical pumping. Such achievements have opened new fields of research, such as atomic interferometry and the study of degenerate quantum gases.

2042



ANGULAR MOMENTUM OF RADIATION

Complement BXIX Angular momentum of radiation

1

2

3

Quantum average value of angular momentum for a spin 1 particle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2044 1-a Wave function, spin operator . . . . . . . . . . . . . . . . . . 2044 1-b Average value of the spin angular momentum . . . . . . . . . 2045 1-c Average value of the orbital angular momentum . . . . . . . . 2046 Angular momentum of free classical radiation as a function of normal variables . . . . . . . . . . . . . . . . . . . . . . . . 2047 2-a Calculation in position space . . . . . . . . . . . . . . . . . . 2047 2-b Reciprocal space . . . . . . . . . . . . . . . . . . . . . . . . . 2048 2-c Difference between the angular momenta of massive particles and of radiation . . . . . . . . . . . . . . . . . . . . . . . . . 2050 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2050 3-a Spin angular momentum of radiation . . . . . . . . . . . . . . 2050 3-b Experimental evidence of the radiation spin angular momentum2051 3-c Orbital angular momentum of radiation . . . . . . . . . . . . 2052

Introduction Radiation angular momentum plays an important role in many situations, in particular in atomic physics experiments. As will be explained in Complement CXIX , the exchange of angular momentum between atoms and photons is the base of many experimental methods, such as optical pumping, which illustrated for the first time the manipulation of atoms by light. In Chapter XVIII, a spatial Fourier transform of the classical fields led to the introduction of the field normal variables ε (k) and ε (k), which are the field components in a basis of transverse plane waves. Upon quantization, these normal variables became the annihilation ˆε (k) and creation ˆε (k) operators of a photon in a mode k ε. Such a plane wave basis is particularly useful for studying the radiation energy and momentum, since the photons of mode k ε have a well defined energy ~ = ~ and momentum ~k. On the other hand, the expansion of the field angular momentum in terms of the normal variables ε (k) and ε (k) is not as simple, since the photons of mode k ε do not have a well determined angular momentum. The aim of this complement is to find another expansion better adapted to the study of the radiation angular momentum, and establish a number of useful results. In the classical description of radiation, the normal variable α(k) = ε ε (k)ε is a vector function of k presenting a certain analogy with a wave function in the reciprocal space, and which could be seen as the wave function of the radiation (in momentum space). A physical quantity, such as the radiation total energy or the total momentum, does appear as the average value in that wave function of a one-particle operator 2043



COMPLEMENT BXIX

representing the energy or the momentum of a photon. We will see other examples of this analogy in this complement. As it is a vector function, this wave function can be regarded as the wave function of a particle of spin 1 whose total angular momentum J would be the sum of the orbital angular momentum L and the spin angular momentum S. We will present, in § 1, a quantum mechanical calculation of the average values, in the state of a spin 1 particle described by the vector wave function Ψ(k), of the orbital and spin angular momentum of that particle. Returning to classical physics, in § 2 we will establish the expression for the total angular momentum J of free radiation; we shall first write it as a function of the fields in reciprocal space, then as a function of the field normal variables α(k). The expression thus obtained has the same form as that obtained in § 1, provided we replace the Ψ(k) by the α(k). This will lead us to the expansion in terms of normal variables of not only the field total angular momentum, but also of its orbital and spin angular momenta. The physical interpretation of these results is discussed in § 3, which highlights, in particular, some important characteristics of these two types of angular momenta. 1.

Quantum average value of angular momentum for a spin 1 particle

We first study a spin 1 particle that has a mass, and is therefore not a photon. Our results will be useful as a point of comparison for the next paragraph’s computations, where we return to the electromagnetic field. 1-a.

Wave function, spin operator

The conclusions of this § 1 will be compared with those of the following § 2 concerning the radiation angular momentum expressed in terms of normal variables. As normal variables characterize the field in reciprocal space, it is important here to describe the state of the spin 1 particle in that same space. The particle state vector can be expanded on a basis k , where k represents the wave vector and the spin state: d3

Ψ =

(k) k

(1)

with: (k) = k

Ψ

(2)

In general, we choose for the states the eigenstates + 1 component. Here, we shall choose another basis: = (1

2)

1

=(

2)

1 + +1

1 of the

spin

+1

= 0 The action of the S components We must use =( ++ ) 2 and 2044

0

(3) on these basis vectors can easily be computed. = ( + ) 2, as well as the action of

• +

ANGULAR MOMENTUM OF RADIATION

on the states + 1

0

1 (Chapter VI, relations (C-50)). As an example,

1 ( 2

1

+1

we obtain: = =

( 2 1 = ( 2

+

+

+

+

+

+

1 2 1 ) 2 )

~ 2 2 ~ = 2 2 =

1 + +1

)0 =

~ 2

2 +1 +

2

1

These equations, and those similar for the action of compact way as:

20

20

2 0 +

2 0 = ~

= and

=0

(4)

~

, can be written in a more

= ~

(5)

where are the indices and is the three-dimensional completely antisymmetric tensor1 . Equation (5) also leads to: = ~ 1-b.

(6)

Average value of the spin angular momentum

Taking (1) into account, the average value of Ψ

d3

Ψ =

d3

(k ) k

k

is written: (k)

(7)

As does not act on the orbital degrees of freedom, described by the variable k, we get, taking (6) into account : k

k

= ~

(k

k)

(8)

Inserting this result in (7) then yields: Ψ

d3

Ψ = ~

(k)

(k) =

~

(using the fact that is antisymmetric). As the two vectors V and W is written: (V

W) =

d3

(k)

(k)

(9)

component of the cross product of (10)

we get: Ψ 1 By

Ψ =

~

d3 (Ψ

definition = +1 if = 1 if is deduced from (or all three) are equal.

Ψ)

(11)

are or can be deduced from by an even permutation; by an odd permutation; finally = 0 if two of the three indices

2045

COMPLEMENT BXIX

1-c.



Average value of the orbital angular momentum

The orbital angular momentum is written: L=R Its

P

(12)

component acts in position space as: P) =

= (R

with

~

=

(13)

Going to the reciprocal space amounts to performing a spatial Fourier transform. The operators which in r space correspond to a multiplication by and a derivation with respect to , respectively become, in k space, the operators derivation with respect to and multiplication by (multiplied by appropriate factors): = ˜

FT

=

(14)

FT

with the notation: ˜

(15)

In reciprocal space, the action of the therefore: ( ˜ )(

~ =

~(k

component of the orbital angular momentum is ˜

)= ~

∇k )

(16)

In the last equality of the first line, we have moved ˜ to the right of , which is allowed since the presence of means that all the terms with equal indices and must be zero. To obtain the equality in the second line of (16), we use the anstisymetry of under the exchange of and , and relation (10) of the vector product; ∇k is the gradient with respect of k. We finally compute the average value of in the state Ψ. Taking (1) into account, we get: Ψ

Ψ =

d3

d3

(k ) k

k

(k)

(17)

As does not act on the spin degrees of freedom, we must have = in the matrix element on the right-hand side, which yields, using the differential form (16) for : Ψ

2046

Ψ =

d3

d3

(k ) k

=

~

d3

(k)

=

~

d3

(k) (k

k ˜

(k)

∇k )

(k)

(k)

(18)



ANGULAR MOMENTUM OF RADIATION

Finally, adding (11) and (18), we obtain the following expression for the average value of the particle total angular momentum in the state Ψ: ΨJ Ψ =

~

d3

Ψ (k)

Ψ(k) +

(k) (k

spin

2.

∇k )

(k)

(19)

orbital

Angular momentum of free classical radiation as a function of normal variables

We now show that the classical calculation of the radiation angular momentum presents a certain analogy with the results of § 1. 2-a.

Calculation in position space

Relation (A-53) of Chapter XVIII yields for the total angular momentum of free radiation (in the absence of particles): J=

0

d3

r

B(r)]

[E (r)

(20)

Let us replace B by ∇ A and use the triple product expansion a (a b) c. We obtain, keeping the right order between ∇ and A: E

[∇



A] =

0

d3

(r

(21)

∇)

r

,

or

axis, labeled by the index .

(E ∇)A

Consider first the contribution J (1) of the second term in (22). Its is written: (1)

=

0

d3

=

0

d3

We now move

r

(1)

component

(23)

to the right of

, using: (24)

The contribution of the term d3

(22)

(E ∇)A

=

0

c) = (a c) b

(E ∇)A

where is the component of E (r) on the Inserting (21) in (20) leads to: J=

(b

=

to (23) yields: 0

d3 (E

A)

(25)

2047

COMPLEMENT BXIX



As for the contribution of the term to (23), we perform an integration by parts. The contribution of the integrated term yields a zero surface integral if the fields decrease fast enough at infinity. We obtain for the contribution of the term to (23) : d3

0

(

) =0+

d3

0

(26)

In the last term of (26), we note the quantity =∇ E

(27)

which is equal to zero since the electric field, in the absence of sources, is purely transverse (and hence of zero divergence). This term therefore disappears. Finally, the average value of J is the sum of (25) and of the first term of (22): J=

d3

0

E(r)

A(r) +

spin

2-b.

∇)

(r)(r

(r)

(28)

orbital

Reciprocal space

Expression (28) for J can be rewritten as a function of the Fourier transforms of the fields E(r) and A(r). Using the Parseval-Plancherel equality (Appendix I, § 2-c) and relations (14), we get: J=

˜ (k) E

d3

0

˜ A(k) +

˜ (k)(k

spin

∇k ) ˜ (k)

(29)

orbital

We now use expressions (B-22a) and (B-22b) of Chapter XVIII to obtain the fields ˜ ˜ E(k) and A(k) as a function of the normal variables: ˜ E(k) =

[α(k) α ( k)] 2 ( ) 1 ˜ A(k) = [α(k) + α ( k)] 2 ( ) where

(30a) (30b)

( ) was determined by relation (A-3) of Chapter XIX: 0

( )=

(31)

2~

Inserting (30) into (29), we get: ~ 2

= +

2048

d3 [

(k)

[

(k)

( k)]

( k)] ˜ [

[ (k) +

(k) + ( k)]

( k)] (32)



ANGULAR MOMENTUM OF RADIATION

Each line in (32) contains four terms: two of these terms include either α twice for the first one, or α twice for the second – both will be shown below to be equal to zero; the other two terms contain either α once or α once – we show below that they are equal. We finally obtain: J=

d3

~

α (k)

α(k) +

(k) (k

spin

∇k )

(k)

(33)

orbital

This expression has the same form as (19): the angular momentum is the sum of a spin term and an orbital term involving spatial derivatives. This result confirms that the normal variable α(k) can be regarded as the wave function in reciprocal space of the photon field, and that the photon is indeed a spin 1 particle. This result also gives the explicit expressions of the radiation spin angular momentum (first term in the bracket of (33)) and orbital angular momentum (second term). For a massive particle, we know (Chapitre VI, § D-1-a) that in spherical (or cylindrical) coordinates, the action of the angular momentum component corresponds to a derivation with respect to the azimuthal angle : }(

˜ )= }

˜

(34)

This result simply comes from a calculation of partial derivatives; it is thus also valid for a field. Computation of the various terms appearing in equation(32) Consider, in the first line of (32), the terms involving a product of two α or two α, for example: (~ 2)

d3

[

(k)

( k)]

(35)

Changing k into k, inverting the indices and , and using = , we can show that (35) is equal to its opposite, and hence must be zero. The same approach, followed for the term: + (~ 2)

d3

[

( k)

( k)]

(36)

shows that this term is equal to: (~ 2)

d3

[

(k)

(k)]

(37)

which is identical to the other term on the first line of (32) containing one α and one α, provided we change the relative order in which the term (k) and (k) appear. In classical theory, these quantities are numbers, and hence commute: their order does not matter. It is however useful to keep track of that order in order to obtain an expression still valid when, upon quantization, the α and α will be replaced by the non-commuting creation and annihilation operators. This computation can be extended to the terms on the second line of (32). In addition to changing k into k, we must also perform an integration by parts. The integrated term,

2049

COMPLEMENT BXIX



which yields a surface integral, is zero if the fields tend to zero fast enough at infinity. Added to it is a contribution that shows that the terms containing two α or two α are equal to their opposite, and hence equal to zero. On the other hand, the integration by parts shows that the two terms containing one α and one α are equal if the order of the α and α can be switched. In the case where the order of the α and α is not taken into account, we obtain expression (33). 2-c.

Difference between the angular momenta of massive particles and of radiation

In spite of the strong analogy between equations (19) and (33), we should not forget an important difference between the two angular momenta, arising from the fact that the normal variables α(k) are transverse. Maxwell’s equation ∇ E = 0 for the free field does require α(k) to always be perpendicular to the wave vector k: k α(k) = 0

k

(38)

while the wave function Ψ(k) of a massive particle in the reciprocal space is not necessarily perpendicular to k. Another difference, of course, is that the norm of this wave function does not have any particular physical meaning (it can arbitrarily be put equal to unity), while changing the norm of the normal variables of the field changes its amplitude. 3.

Discussion

3-a.

Spin angular momentum of radiation

The spin angular momentum, first term on the right-hand side of (33), can be written as: (

) =

~

d3

(k)

(k) =

~

d3

[α (k)

α(k)]

(39)

Instead of using the components α (k) and α(k) on a basis of three vectors e , e , e independent of the wave vector k direction, we can choose a basis of three vectors ε1 (k), ε2 (k), ε3 (k) = κ = k , including the unit vector κ along k and two other vectors ε1 (k) and ε2 (k), orthogonal to each other and to κ, and forming a right-handed reference frame. As the normal variables α (k) and α(k) are transverse, their components on κ are zero. In addition, we introduce the two complex linear combinations of e1 (k) and e2 (k): ε+ (k) =

[ε1 (k) + ε2 (k)]

ε (k) = + [ε1 (k)

ε2 (k)]

2 2

(40)

corresponding to right and left circular polarizations with respect to the k direction. The transverse normal variables α (k) and α(k) can be expanded on these two vectors: α(k) =

+ (k)e+ (k)

+

(k)e (k)

α (k) =

+ (k)e+ (k)

+

(k)e (k)

2050

(41)



ANGULAR MOMENTUM OF RADIATION

Using these two expansions, we compute the cross product α (k)

α(k). Since:

1 (ε1 ε2 ) (ε1 + ε2 ) = (ε1 ε2 ε2 ε1 ) = κ 2 2 1 ε = (ε1 + ε2 ) (ε1 ε2 ) = (ε1 ε2 ε2 ε1 ) = κ 2 2 1 ε = (ε1 ε2 ) (ε1 ε2 ) = 0 = ε ε+ 2

ε+

ε+ =

ε ε+

(42)

we get: S =

d3

+ (k) + (k)



(k)

(k) ~κ

(43)

The form of this expression, “diagonal with respect to the spin variables”, has a clear physical significance: to each plane wave with wave vector k and a right (left) polarization with respect to k, correspond photons of momentum ~k and spin angular momentum +~ ( ~) along the direction κ of k. Upon quantization, when the normal variables α (k) and α(k) are replaced by creation and annihilation operators, expression (43) becomes: Sˆ =

d3

ˆ+ (k)ˆ+ (k) ~κ

ˆ (k)ˆ (k) ~κ

(44)

Operator ˆ+ (k) creates a photon with momentum ~k and spin angular momentum +~ along the direction κ of k; operator ˆ+ (k) annihilates that photon, and operator ˆ+ (k)ˆ+ (k) corresponds to the number of photons in that mode. An analogous definition applies to the second term of (44) with a change of sign for the angular momentum. Helicity These results, which arise from the transverse character of free radiation, lead us to introduce the so-called “helicity”. It is the projection of the photon spin angular momentum onto the direction κ of the wave vector k, equal to +1 for photons with a right circular polarization with respect to κ, and 1 for photons with a left circular polarization. The transverse character of the free radiation field forbids the photons to have zero helicity. Note also that helicity is a pseudoscalar: upon reflection in space, the polar vector κ changes sign whereas the spin vector S, an axial vector, remains unchanged. Consequently, the scalar product of κ and S changes sign (as opposed to a scalar). 3-b.

Experimental evidence of the radiation spin angular momentum

Consider a plane wave of wave vector k and polarization ε. Using the expressions for the electric field E(r) and the vector potential A(r) given in Chapter XVIII, one can easily compute the two terms of equation (28) yielding, in position space, the radiation spin angular momentum and orbital angular momentum2 . The result is that the radiation orbital angular momentum – second term of (28) – is always zero, whatever the 2 These are the two terms that, transformed into the reciprocal space, and after the introduction of the normal variables, yield the two terms on the right-hand side of equation (33); by comparison with the expression of the angular momentum of a spin 1 particle, these terms have been interpreted as the two components of the radiation angular momentum.

2051

COMPLEMENT BXIX



polarization ε is. As for the spin angular momentum – first term of (28) – it is zero for a linear polarization, but different from zero for a circular polarization, with opposite signs for the right and left circular polarizations. This validates, in the simple plane wave case, the general conclusions of the previous paragraph. Such a result suggests sending a linearly polarized light beam through a quaterwave plate. Assuming the plate transforms the incident linear polarization into a right (left) circular polarization, the incident photons have a zero spin angular momentum before they go through the plate, and equal to +~ ( ~) as they come out of the plate. The radiation spin angular momentum thus changes as beam goes through the plate, and this must be accompanied by a change, in the opposite direction, of the plate’s angular momentum. Suspending the plate by a thin torsion fiber, one should observe a rotation of the plate induced by the incident radiation, in a direction opposite to that of the circular polarization of the outgoing beam. This experiment, suggested by A. Kastler [30] was performed by R. Beth [31], confirming the existence of angular momentum transfer. Comment A paradox arises when computing, again for a plane wave, the angular momentum of the radiation, given by equation (20). In a plane wave, the Poynting vector Π(r) = E(r) B(r) is always parallel to the wave vector k at any point r, and for any polarization. The integral over the entire space of r Π(r) must then be zero. As the orbital angular momentum is also zero, it seems that the spin angular momentum should also be zero, whatever the polarization is. This paradox arises because infinite plane waves do not exist in the physical world: any real light beam has a finite spatial extension. The authors of [32] (see also [33]) show that the circular polarization at the center of the beam changes when the field amplitude changes around the edge of the beam. Taking this effect into account quantitatively confirms the result obtained above, namely that the beam spin angular momentum is equal to the sum of the angular momenta ~ of that beam’s photons.

3-c.

Orbital angular momentum of radiation

An important difference between the radiation orbital and spin angular momenta is clearly seen in expression (28): the definition of orbital angular momentum involves a reference point O, since the vector r, defined with respect to that point, explicitly appears in the expression for the orbital angular momentum. This is not the case for the spin angular momentum which, for this reason, is sometimes called “intrinsic” angular momentum. There are actually at least two cases where the choice of the point O is obvious, cases that we now analyze. .

Multipolar waves

When studying the radiation emitted or absorbed by an atom or a nucleus between two discrete states, a natural choice for analyzing the exchanges of angular momentum between the system internal variables and the photons is the center of mass of that atom or that nucleus. In the next complement CXIX , we study for example the exchange of angular momentum between photons and the internal variables of an atom in a particular case: an electric dipole transition, in the long wavelength approximation. Consequently, the expressions describing the photon absorption only involve the radiation polarization variables, and hence only the photon spin angular momentum; the radiation orbital 2052



ANGULAR MOMENTUM OF RADIATION

angular momentum does not actually play any role3 . There are other transitions, especially for atomic nuclei, where the variation of the internal angular momentum between the two transition states is larger than or equal to 2; the photon spin angular momentum, equal to 1, can then no longer ensure the conservation of the angular momentum. Radiation states having a total angular momentum larger than 1 must come into play, which implies a contribution from the radiation orbital angular momentum. Waves corresponding to such states are called “multipolar waves”. The simplest way to build multipolar waves having a total angular momentum characterized by the quantum numbers and is to associate a spherical harmonic (κ) with a spin = 1; the spherical harmonic is an eigenfunction of L2 and with eigenvalues ( + 1)~2 and ~; the spin = 1 has three eigenstates , with = +1 0 1, isomorphic to the three polarization states e+ = (e + e ) 2, e , e = (e e ) 2. We therefore obtain a vector spherical harmonic: Y

1 (κ)

=

1

(κ)e

(45)

which is an eigenfunction of J 2 , L2 , , with eigenvalues ( + 1)~2 , ( + 1)~2 , ~. In this equation, the first term on the right-hand side is a Clebsch-Gordan coefficient (Chapter X, § C-4-c), can take one of the three values = 1 = = + 1 and = + . A difficulty is that the vector spherical harmonics are not all transverse functions and hence cannot be used as a basis of normal transverse functions for expanding the radiation field. For a given value of , one can nevertheless build linear superpositions of vector spherical harmonics Y 1 (κ) with = = 1 that are transverse and have, in addition, a well defined parity = 1. Each vector spherical harmonic, which only depends on the direction κ of k, can also be multiplied by ( 0 ), hence yielding a function that is also an eigenfunction of the energy, with eigenvalue ~ 0 . These functions form a possible basis of normal transverse variables for expanding the field; they are characterized by four quantum numbers: the energy ~ 0 , the total angular momentum , and the parity . They are called electric (for = 1) or magnetic (for = +1) multipolar waves. As this book is limited to the study of electric or magnetic dipole transitions, we do not give here the general expressions for multipolar waves. More details can be found in complement BI of [16] and in [34]. .

Beams with cylindrical symmetry around one axis

It often happens that the beams under study have a cylindrical symmetry. This is the case, for example, for Gaussian beams propagating along a axis, and whose transverse sections are circular. If the reference point O is taken on the axis, the beam symmetry causes the orbital angular momentum to be necessarily along the axis and to have the same value regardless of the position of O along this axis. If the reference point O is taken outside this axis, the orbital angular momentum will change, but not its component , which exhibits an intrinsic character. 3 This orbital angular momentum may, however, come into play during the angular momentum exchanges with the atom’s external variables for certain types of light beams, the Laguerre-Gaussian light beams (§ 3-b of complement CXIX ).

2053

COMPLEMENT BXIX



A particularly interesting case concerns the Laguerre-Gaussian beam (LG) whose field has an exp( ) dependence with respect to the azimuth angle that defines the direction of a point in the plane perpendicular to the beam axis. The cylindrical symmetry is preserved since a rotation of the beam of an angle 0 around the axis yields the same field to within a global phase factor exp( 0 ). Relation (34) then shows that the component of the orbital angular momentum of each photon of the LG beam is }. Consider an LG beam propagating along the axis, with a wave vector along that axis. The phase at a point with cylindrical coordinates is that of exp ( + ). For an ordinary Gaussian beam (for which = 0) the surfaces of constant phase are, in the vicinity of the focal point, planes perpendicular to the axis. When = 0, and must both vary for the phase to remain constant, following the relation d + d = 0. The surfaces of constant phase are therefore helicoidal surfaces spanned by a half-line perpendicular to the axis, starting from this axis, and which rotates of an angle 2 when increases by . It is not surprising that under such conditions the field has a non-zero orbital angular momentum. Note also that the field must be zero on the axis (otherwise its phase would vary discontinuously upon crossing that axis). In conclusion, we showed in this complement that there are two types of radiation angular momenta, the spin angular momentum and the orbital angular momentum, and we studied their properties. The photon can be viewed as a spin 1 particle, except for the fact that it only has two (instead of three) internal states, with respective heliticity +1 and 1. We shall see in the next complement how the interactions between radiation and atoms permit transferring angular momentum from the first to the others.

2054



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Complement CXIX Angular momentum exchange between atoms and photons

1

2

3

Transferring spin angular momentum to internal atomic variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2056 1-a

Electric dipole transitions . . . . . . . . . . . . . . . . . . . . 2056

1-b

Polarization selection rules . . . . . . . . . . . . . . . . . . . 2056

1-c

Conservation of total angular momentum . . . . . . . . . . . 2058

Optical methods . . . . . . . . . . . . . . . . . . . . . . . . . . 2058 2-a

Double resonance method . . . . . . . . . . . . . . . . . . . . 2059

2-b

Optical pumping . . . . . . . . . . . . . . . . . . . . . . . . . 2062

2-c

Original features of these methods . . . . . . . . . . . . . . . 2064

Transferring orbital angular momentum to external atomic variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2065 3-a

Laguerre-Gaussian beams . . . . . . . . . . . . . . . . . . . . 2065

3-b

Field expansion on Laguerre-Gaussian modes . . . . . . . . . 2065

Introduction Angular momentum exchanges between atoms and photons are at the base of several experimental methods that played an important role in atomic physics and laser spectroscopy. The aim of this complement is to analyze the selection rules appearing in the photon absorption and emission processes by an atom; they express the conservation of the total angular momentum of the system atom + radiation during these processes. We shall mainly focus on the transfer of angular momentum to the atomic internal (electronic) variables (§ 1). The polarization of the field, characterizing the spin angular momentum of that field (Complement BXIX ), then plays an essential role. For the absorption process, we will establish the selection rules relating the field polarization to the variation of the “magnetic quantum number” characterizing the projection of the total internal angular momentum of the atom on a given axis. Two important applications of these selection rules will be described in § 2: the double resonance method, and optical pumping. We shall see that the proper choice of the polarization of the exciting light beam, and of both the direction and polarization of the detected light emitted by the excited atoms, allows controlling the atomic Zeeman sublevels that can be populated and detected by light. We shall emphasize in § 2 the importance of this selectivity. The transfer of the radiation orbital angular momentum to the atomic external variables will be briefly addressed in § 3. 2055

COMPLEMENT CXIX

1.



Transferring spin angular momentum to internal atomic variables

1-a.

Electric dipole transitions

We shall limit ourselves to the case where the transitions between the atomic internal states are electric dipole transitions. As seen in Chapter XIX (§ C-4), the interaction Hamiltonian between atom and radiation can then be written in the form: ˆ E ˆ (R) ˆ D

ˆ =

(1)

ˆ is the operator associated with the atomic electric dipole moment and E ˆ (R) ˆ where D ˆ is the radiation transverse electric field operator at point R, the position of the atom’s center of mass. Note that all the results established in this complement are still valid ˆ by the atomic magnetic for magnetic dipole transitions; one must simply replace D ˆ ˆ ˆ operator. For the dipole moment operator M and E by the radiation magnetic field B transitions where one photon is absorbed, the expressions (B-3) and (B-4) of Chapter XIX ˆ and B ˆ can be replaced by the part containing only destruction operators, of the fields E ˆ (+) and called “positive frequency component” (cf. § A-3 of Chapiter XX) and denoted E ˆ (+) . For the transitions where one photon is emitted, these fields can be replaced B ˆ ( ) and B ˆ ( ) containing only creation operators, called “negative their components E frequency components”. ˆ (+) and B ˆ (+) by their plane wave expansions, the internal variables Replacing E ˆ or M ˆ , with the polarization vector ε of the plane only appear in the scalar products of D, wave k. We shall assume in § 1 and § 2 that all the incident radiation states are plane waves, or linear superpositions of plane waves with wave vectors having directions very close to an optical axis (paraxial approximation). These waves are supposed to have the same polarization ε, which means that the beam angular aperture must be sufficiently small. The transitions between the internal atomic states1 and we shall consider are ˆ . thus entirely characterized by the matrix elements ε D 1-b.

Polarization selection rules

We start with the simple case of an atom with a single electron, and a transition between a ground state with an orbital angular momentum = 0 and an excited state with an angular momentum = 1. We assume the radiation has a right circular polarization, noted + , with respect to an axis noted : this means that the radiation electric field rotates around that axis following the right-hand rule, at the angular frequency . We have: 1 (e + e ) 2

(2)

ˆ = rˆ, where Since D nucleus, we have:

is the electron charge and rˆ its position with respect to the

ε=

ˆ = ε D

2

( +

)=

2

sin exp( )

(3)

1 In this complement and as is usually done in the literature, we shall note the ground state and the excited state (instead of using our previous notation of and for the two atomic levels).

2056



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

where are the spherical coordinates of the electron with respect to the nucleus. Starting from a ground state with a magnetic quantum number = 0 (defined with respect to the axis) and with no dependence on , the excitation with + polarized light creates an excited state wave function that now varies with as exp( ). This wave function is an eigenfunction of the component ˆ = (~ ) of the orbital angular momentum, with eigenvalue +1 (it corresponds to the state = +1). In a similar way, an excitation with a polarization, for which ε = 12 (e e ), transfers the atom into the state = 1. Consider finally an excitation with a polarized light, for which ε = e : the electric field has a linear polarization parallel2 to the axis. We then have ˆ = ε D = cos , with no dependence; the excitation brings the atom to the state = 0. To sum up, the polarization selection rules for a transition =0 =1 are given by: = +1

+

=0

=

1

(4)

The previous results can easily be generalized to any transition going from a ground state with angular momentum to an excited state with angular momentum , for an atom with any number of electrons. One simply has to use the Wigner-Eckart ˆ theorem (Complement DX and exercise 8 in Complement GX ). The dipole operator D (sum of the dipole operators for each individual electron) is a vector operator whose three ˆ with = +1 0 1 are equal to : spherical components D 1 ˆ ( + ˆ ) 2

ˆ +1 =

ˆ0 = ˆ ˆ 1 = 1 (ˆ 2

(5a) (5b)

ˆ )

(5c)

The Wigner-Eckart theorem (Complement DX ) states that ˆ D

=

ˆ

1;

(6)

where the last term on the right-hand side is a Clebsch-Gordan coefficient (Chapter X, § C-4-c) and the first term is a “reduced matrix element” independent of , and . The Clebsch-Gordan coefficient is different from zero only if: a triangle can be formed with , 1 and , or to 1, the transition =0 =

, which means that is equal either to = 0 being forbidden;

+ .

As the three values of ( = +1 0 1) correspond to the three polarizations + respectively, the selection rules (4) for any given transition can be generalized to (see Figure 1): +

=

+1

=

=

1

(7)

2 Because of the transversality of the field, the light beam must then propagate in a direction perpendicular to the axis.

2057

COMPLEMENT CXIX

• me = mg − 1

me = mg

σ−

π

me = mg + 1

σ+

mg Figure 1: Selection rules for an electric or magnetic dipole transition. The magnetic quantum number increases by one unit for an excitation with a + polarization, remains unchanged for a polarization and decreases by one unit for a polarization.

1-c.

Conservation of total angular momentum

We saw that, when an atom absorbs a photon having a polarization + with respect to a axis, the component of the atom’s angular momentum along that axis increases by one (in ~ units). The conservation of the total angular momentum means that the absorbed + photon must have an angular momentum +~ along the axis. This result can also be obtained from the study of the radiation angular momentum presented in Complement BXIX , as we now show. Although the interaction Hamiltonian pertaining to the atomic internal variables does not depend on the photon wave vector3 k, one can always choose a k wave vector parallel to the atomic quantization axis (since these two directions must be perpendicular to the polarization vector of the circular wave). Now we saw in Complement BXIX that in a plane wave (or in a beam of plane waves with a small angular aperture) a photon with polarization + with respect to the wave vector has a total angular momentum (actually reduced to its spin angular momentum) equal to +~ and parallel to its wave vector. The total angular momentum is indeed conserved.

2.

Optical methods

The polarization selection rules show that it is possible to selectively excite a Zeeman sublevel of an atomic excited state. In a similar way, we shall see below that the observation of the emitted light in a given direction with a specific polarization allows determining from which excited sublevel the light was emitted. Such a selectivity in excitation and detection is the base of the optical methods for Hertzian spectroscopy, as will be illustrated below with two examples. 2058



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

Figure 2: Energy diagram and Zeeman structure of the 61 S0 63 P1 transition of the even isotopes of mercury at 253.7 nm (Fig.a). A static magnetic field, applied along the axis, lifts the degeneracy of the excited state, which is split into three equidistant Zeeman sublevels: = 1 0 +1. A resonant light beam, propagating along the axis, with a linear polarization parallel to the axis (Fig.b), selectively excites the atoms into the = 0 sublevel (upward arrow on Fig.a). When they are in the excited state, the atoms are transferred from the = 0 state to both states = 1 by a resonant radiofrequency field (oblique small double arrows on Fig.a). A detector D placed along the axis, just behind an analyser that only transmits light with + polarization along the axis, detects the light emitted by the atoms (Fig.b). Consequently, the detector is only sensitive to the light emitted from the = +1 sublevel (wiggly arrow on Fig.a); it yields a signal proportional to the population of that sublevel.

2-a.

Double resonance method

We now explain the principle of the method taking as an example the even isotopes of mercury, for which the first theoretical predictions were made by Brossel and Kastler [35]; the first experimental evidences were obtained by Brossel and Bitter [36]. Since for these isotopes the nuclear spin is zero, and as in a mercury atom all the electron shells are filled in the ground state, the energy diagram is particularly simple. The ground state has a zero angular momentum ( = 0), and the first accessible excited states, an angular momentum = 1. In the presence of a static magnetic field applied along the axis, the three Zeeman sublevels = 1 0 +1 of the excited state undergo Zeeman shifts proportional to and to the applied field, so that the energy of the state is written: =

0

+

(8)

3 In the long wavelength approximation, the radiation wave vector k no longer appears in the interaction Hamiltonian for the atomic internal variables. This wave vector only appears in the part of the ˆ pertaining to the external variables. interaction Hamiltonian, exp( k R),

2059

COMPLEMENT CXIX



where 0 is the excited state energy in the absence of the field, is the Landé g-factor of that state, and the Bohr magneton. The ground state = 0 = 0, whose energy is chosen to be zero, is not affected by the field (Figure 2a). In the double resonance method, the atoms are selectively excited by a resonant light beam with polarization into the sublevel = 0. For example, the exciting beam propagates along the axis, with a polarization ε perpendicular to the axis, and parallel to the axis, hence with a polarization (Figure 2b). If the atoms were left alone, with no perturbation while in the excited state during its radiative lifetime = 1 Γ (of the order of 1.5 10 7 sec), they would remain in that sublevel for the entire time they stay in the excited state. On the other hand, if they are subjected to a resonant radiofrequency field4 that is strong enough to make them go from = 0 to = 1 during the excited state lifetime , the two states = 1 will be equally populated. Is it possible, observing the light emitted by the atoms as they go back to the ground state by spontaneous emission of a photon, to determine the sublevel from which the light was emitted, and hence obtain a signal proportional to this sublevel population? This problem must be analyzed with more care than the absorption process for the following reason. Once the atom has reached the excited sublevel , it can emit in any direction with all possible polarizations, which are not necessarily the three basic polarizations + , or . We are going to show that one can place the detector in a specific direction, far from the atom, and put in front of it a polarization analyzer suitably chosen so as to be able to determine from which sublevel the detected photon was emitted. To demonstrate this result, it is important to first study the oscillations of the atomic dipole associated with a transition . Let us assume the detector is placed on the axis where the atom is located. (i) For the transition =0 = 0, as the dipole oscillates along the axis, it does not emit along that axis. The detector does not receive any fraction of the light emitted by an atom in the = 0 state. (ii) On the other hand, the dipole associated with the transition =0 = +1, which rotates around the axis following the right-hand rule, in a plane perpendicular to that axis, emits along that axis light with a + polarization; this light yields a signal on the detector equipped with an analyzer selecting right circular polarization. This analyzer will, however, block the light emitted by an atom in the = 1 sublevel, which has a left circular polarization. To sum up, placing a detector in a well chosen direction, preceded by an analyzer selecting a suitable polarization, one can block out all light except the one coming from a specific unique sublevel, and get a signal proportional to that sublevel population. The principle of this double resonance experiment is to selectively excite the atoms in the = 0 sublevel, and to detect, by observing the + light emitted along the axis, the variations of the number of atoms transferred into the = +1 sublevel by a radiofrequency field with angular frequency close to the Zeeman frequency = ~. Observing the variation of the emitted light as one scans for a fixed value of the magnetic field, or as one scans the magnetic field for a fixed , one 4 The atoms are thus submitted to two resonant excitations: an optical resonant excitation that brings them from = 0 to = 0; a radiofrequency resonant excitation that brings them from = 0 to = 1. 
This is why this method is called “double resonance method”.

2060



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

can optically detect the magnetic resonance in the excited state. Comment What happens if the light emitted by the atom is not observed along the axis, with a right circular analyzer, but with a detector placed along the axis, behind an analyzer selecting a linear polarization parallel to the axis, and hence perpendicular to the axis, a polarization called “sigma”5 ? The light emitted along the axis by the atom in the = 0 state has a linear polarization parallel to the oscillating dipole, and hence parallel to the axis. It is blocked by the analyzer that only lets through the orthogonal polarization. On the other hand, whether the atom is in the = +1 or = 1 sublevel, the rotating dipole in the plane (following the right-hand or left-hand rule) emits in that plane light with a linear polarization perpendicular to the axis; that light can be detected by the sigma polarization detector and it yields a signal proportional to the sum of the populations of both sublevels = +1 and = 1. Actually, the resonant radiofrequency field excites a linear superposition of these two Zeeman sublevels. It was later discovered that the waves emitted from these two = +1 and = 1 states gave rise to interference, hence modulating at the frequency 2 the detected sigma light intensity6 . The detector signal therefore contains a component modulated at the frequency 2 , on top of a continuous component (in steady state), proportional to the sum of the two populations of the Zeeman sublevels. This continuous component was the signal used in the first double resonance experiment.

The shape of the magnetic resonance line can be exactly computed; it leads to analytical expressions in excellent agreement with experimental observations. The center of the resonance line yields the Landé g-factor of the excited state, i.e. the magnetic moment of that state. The resonance width, extrapolated to zero radiofrequency intensity to eliminate radiative broadening, yields the natural Γ width of the excited state. Calculation of the line shape Broadband excitation with a polarization prepares, in a quasi-instantaneous way, the atom in the = 0 excited state. In the steady state, 0 atoms per unit time are excited into that state. Each atom then evolves, because of its interaction with the radiofrequency field B1 , and its state becomes a linear superposition of the three sublevels = 1 0 +1. Let us assume the radiofrequency field is a rotating field, which allows introducing the associated rotating reference frame (Complement FIV ) where the atom’s evolution actually becomes a simple rotation around B1 . Using the rotation matrix for a spin 1, one can find the expression for ( =0 = +1 ), the probability for the atom, initially in the = 0 state, to be found after a time t in the = +1 state [38]. Because of the radiative lifetime = 1 Γ of the excited sublevels, this probability is reduced by a factor Γ . Consequently, in steady state, the number of atoms transferred to the = +1 state is equal to7 : +1

=

(

0

=0

= +1 )

Γ

d

(9)

0

Using the expression for +1

=

0Ω

2

2Γ (

2

+

Γ2

(

=0

= +1 ), one finally obtains:

4 2 + Γ2 + Ω2 + Ω2 )(4 2 + Γ2 + 4Ω2 )

(10)

5 This

set-up was the one used in the first double resonance experiment [36]. modulations, called “light beats”, were observed in 1959 [37]. 7 A similar computation can be carried out for 1. 6 Such

2061

COMPLEMENT CXIX



Figure 3: Principle of optical pumping for a =1 2 = 1 2 transition. The resonant absorption of a photon with + polarization selectively excites the = 1 2 = 1 2 transition. Once it has reached its excited state, the atom falls back, through spontaneous emission, into the = 1 2 states. If it falls in the = +1 2 state, it can no longer absorb an incident photon, and remains in that state (since there is no transition + originating from the = +1 2 state where the atoms accumulate). In addition, any change in the population difference between the = 1 2 sublevels can be detected by a change in the incident beam absorption, since this absorption is only possible starting from the = 1 2 sublevel.

In this expression, Ω is the Rabi frequency associated with the radiofrequency field B1 , proportional to that field’s amplitude; = is the difference between the frequency of the RF field and the Zeeman frequency associated with the gap between the Zeeman sublevels, which is proportional to the static field 0 . The resonance is plotted by scanning either the frequency of the RF field, or the static field, which amounts to scanning . 2-b.

Optical pumping

Optical pumping, proposed by A. Kastler [39], extends to atomic ground states the essential ingredients of the double resonance method. It also opens the possibility of achieving, in a steady state, large population differences between Zeeman sublevels in the ground state. We shall explain this method’s principle in the simple case where the ground state (as well as the excited state ) has only two Zeeman sublevels = 1 2 (or = 1 2 for the excited state). The ideas introduced for that example remain valid for more complex transitions where the and states have a higher degeneracy. The principle of optical pumping is illustrated in Figure 3. Atoms are excited by a resonant beam with + polarization, propagating along the axis (left-hand side of the figure). A static magnetic field is also applied along that same axis. The absorption of a resonant photon with + polarization is selective, meaning it can only excite the = 1 2 = 1 2 transition, during which the magnetic quantum number increases by one unit. Once in the excited state, and after an average time = 1 Γ, the atom falls back, through spontaneous emission, to the = 1 2 states. The probabilities of the various possible transitions are proportional to the square of the Clebsch-Gordan coefficients. If the atom falls into the = 1 2 state, it can reabsorb a + photon. After a certain number of cycles, it will eventually fall in the = +1 2 state, where it can no longer absorb an incident photon. It will remain in that state (since no + transition originates from the = +1 2 state) and the atoms will accumulate in 2062



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

that sublevel. The cycles of absorption of a + photon starting from the = 1 2 state followed by a spontaneous emission a photon transferring the atom in the = +1 2 level can be considered as an “optical pump” that empties the = 1 2 sublevel to fill the = +1 2 sublevel, hence the name “optical pumping”. Balance of angular momentum exchanges During an optical pumping cycle, the atom gains an angular momentum +~ since it goes from the state = 1 2 to the state = +1 2 . The radiation looses one unit ~ of angular momentum as a + photon is absorbed. Since the total angular momentum of the system atom + radiation is conserved, the angular momentum of the radiation emitted by the atom going by spontaneous emission from the state = +1 2 to the state = +1 2 must be zero. In order to directly demonstrate that the radiation does not lose, on average, any angular momentum during this transition, one must take into account that, as it goes from the = +1 2 state to the = +1 2 state, the atom actually emits a spherical wave in all directions, with all possible transverse polarizations; it is therefore not correct to say that the atom emits a single photon with a polarization, which would imply the atom only emits photons with a wave vector perpendicular to the axis. The correct way is to calculate the total (orbital and spin) angular momentum of the spherical wave emitted by the atom going from = +1 2 to = +1 2. We give a brief outline of the computation8 as this involves expanding the field no longer onto plane waves but onto multipolar waves (Complement BXIX , § 3-c- ). Those waves are field modes characterized by four quantum numbers: wave number (or energy), parity equal to +1 or 1, total angular momentum (which must be an integer), and finally the component of the angular momentum on the quantization axis (which varies by unit steps between and + ). For an electric dipole transition such as the one studied here, the parity equals 1 and the total angular momentum equals 1. It can be shown that is equal to 0 when the atom goes from to = , and is equal to 1 when the atom goes from to = 1. The correct language to describe the spontaneous emission into the ground state is as follows: for a =1 2 = 1 2 transition, an atom in the = +1 2 state can fall back either into the state = 1 2 by spontaneous emission of an electric dipole photon = +1, or into the state = +1 2 by spontaneous emission of an electric dipole photon = +0. This correlation between and is due to conservation of total angular momentum, arising from the rotational invariance of the interaction Hamiltonian. The final state of the system after the spontaneous emission is thus an entangled state, a superposition of the = 1 2 = +1 state and the = +1 2 =0 state, with coefficients weighted by the Clebsch-Gordan coefficients of the two atomic transitions, coming from the dipole matrix elements. This computation also yields the speed with which the two ground state sublevels repopulate, the branching ratio being given by the square of the Clebsch-Gordan coefficients.

Light plays a double role in these experiments. As we just saw, it polarizes the atoms by accumulating them in a ground state sublevel; it also permits the optical detection of the atoms’ polarization. As the atoms can only absorb the incident + light if they are in the = 1 2 sublevel, the absorption of that light yields a signal proportional to the population of that sublevel. Any change in the population differences 8 The interested reader will find more details on multipolar waves properties in Complement B of I [16] and in [34].

2063

COMPLEMENT CXIX



between the = 1 2 and = +1 2 sublevels, induced by a resonant radiofrequency field or by collisions, can therefore be detected by a change in the absorbed light. 2-c.

Original features of these methods

We now review some of the original features of these optical methods, to understand their prominent role in the development of atomic physics. At the time it was suggested, the double resonance was among the first methods to extend the magnetic resonance techniques to atomic excited states. These techniques had been developed for ground states or metastables states with very long lifetimes, using essentially atomic or molecular beams: Stern-Gerlach type experimental set-ups were used to select atoms in given internal states; the flipping of the spins by a RF field would change the trajectories of the atoms or molecules in a detectable way, hence allowing, in most cases, the monitoring of the magnetic resonance (see for example reference [40]). These techniques could not be extended to the excited states because of their very short lifetimes. An interesting feature of these optical methods is their selectivity, both for the excitation and the detection. This selectivity comes from the light polarization, and not from the light frequency. The width of the spectral sources used at the time, and the Doppler width of the spectral lines of the atoms contained in a glass cell were considerably larger than the frequency differences between optical lines going from the ground state Zeeman sublevels to the excited states Zeeman sublevels; it was thus out of the question to try to excite or detect a single Zeeman component of the optical line. The measurements of the Zeeman or hyperfine structures in the excited states by the double resonance method are high resolution measurements. The structures under study are not determined by the measurement of the difference between two optical line frequencies, but by a direct measurement of the structure. In the radiofrequency or microwave domain, the Doppler width is negligible and the measurement resolution is only limited by the natural width. The optical methods are highly sensitive methods. A radiofrequency transition between two sublevels of the excited or ground state is not detected by the loss or gain in energy of the radiofrequency field, but via an absorbed or reemitted optical photon, which has a much higher energy than a RF photon, and whose polarization depends on which sublevel the atom is in. It is therefore possible to detect magnetic resonances in a very dilute medium, such as a vapor. Very high polarization ratios in the ground state, up to 90%, can be achieved by optical pumping. Such ratios are considerably larger than those expected at thermodynamic equilibrium: because of the very weak Zeeman shifts between the ground state sublevels, and the high temperature at which the experiments are conducted, the Boltzmann factors exp( ~ ) are all very close to 1. Note in addition that optical pumping may easily result, by a suitable choice of the polarization, in a larger population for the ground state Zeeman sublevel having the higher energy. This is one of the first examples of a method for achieving a population inversion, an essential condition for obtaining a maser or a laser effect. 2064



ANGULAR MOMENTUM EXCHANGE BETWEEN ATOMS AND PHOTONS

More details on the optical methods and their applications can be found in the well documented work [24] and the references suggested therein. 3.

Transferring orbital angular momentum to external atomic variables

3-a.

Laguerre-Gaussian beams

In optics or atomic physics experiments, one often uses Gaussian beams, linear superpositions of plane waves with wave vectors nearly parallel to an optical axis (paraxial approximation). If all the plane waves forming the Gaussian beam have the same polarization ε, the field phase in planes perpendicular to the beam axis does not depend on the azimuth angle that determines the direction around the beam axis. New types of Gaussian beams have recently been realized9 , called Laguerre-Gaussian (LG) beams, for which the field has an azimuthal dependence in exp , where = 1 2 , in planes perpendicular to the beam axis. We already mentioned the existence of such beams in § 3-c- of Complement BXIX . We now show how the absorption of photons from such beams can transfer to the atomic center of mass a non-zero orbital angular momentum with respect to the beam axis. 3-b.

Field expansion on Laguerre-Gaussian modes

The LG modes form a possible basis for expanding any field. They are characterized by three quantum numbers: the wave number , the number of nodes in the radial direction, and the integer number characterizing the phase dependence on the azimuth angle . We assume the polarization ε to be uniform in the beam. We now place an atom in that beam, and use the LG modes basis ε (r). Instead of being ext ext ˆ written fin exp( k R) , the matrix element pertaining to the external variables in of the interaction Hamiltonian in this basis is now written : ext fin

ˆ (R)

ext in

(11)

ext This relation shows that the initial wave function in (R) of the atomic center of mass is now multiplied by the function (R) characterizing the mode. The phase factor exp( ) is, in a manner of speaking, “imprinted” on the initial wave function. Equation (11) means that the transition amplitude, concerning the external variables and induced ext by the interaction Hamiltonian, is equal to the scalar product of (R) in (R) and the ext final center of mass wave function fin (R). Imagine that the initial external state of the atom has a zero angular momentum ext with respect to the axis, i.e. that in (R) does not depend on the angle . The absorption of a photon from such an LG beam, with quantum numbers , gives ext to the product (R) in (R) a dependence given by exp( ). This means that in its final state, the center of mass must have an orbital angular momentum ~ with respect to the axis, since = (~ ) . The LG beam has transferred to the atom’s center of mass an orbital angular momentum ~. It is important to note that the transfer’s efficiency, described by the matrix element (11), will only be significant if the 9 The main method used to achieve such beams is to numerically design and fabricate holograms, then use them to diffract a Gaussian beam. For a review of the properties and applications of these new types of beams, see reference [41].

2065

COMPLEMENT CXIX



spatial extent of the initial and final wave functions are well adapted to the geometrical characteristics of the LG beam. In the vicinity of the focal point, the width of the beam is of the order of its “waist” 0 , i.e. of the order of a few microns, much larger than the atomic wave packets, of the order of nanometers at normal temperatures ( 300 ). This explains why the orbital angular momentum transfer to the atoms’ centers of mass became operational only when atoms could be cooled down to very low temperatures, in the microkelvin, or even nanokelvin range. The matter waves thus obtained, in BoseEinstein condensates for example, can have spatial extensions of the order of a few microns. The transfer of orbital angular momentum by “phase imprint” can then be used to generate quantum vortices (see Complement DXV , § 3-b- ) in matter waves, where the atoms rotate in phase around an axis. This method was actually used to create such vortices in a Bose-Einstein condensate of trapped atoms [42].

2066

Chapter XX

Absorption, emission and scattering of photons by atoms A

B

C

D

E

A basic tool: the evolution operator . . . . . . . . . . . A-1 General properties . . . . . . . . . . . . . . . . . . . . . A-2 Interaction picture . . . . . . . . . . . . . . . . . . . . . A-3 Positive and negative frequency components of the field Photon absorption between two discrete atomic levels B-1 Monochromatic radiation . . . . . . . . . . . . . . . . . B-2 Non-monochromatic radiation . . . . . . . . . . . . . . . Stimulated and spontaneous emissions . . . . . . . . . . C-1 Emission rate . . . . . . . . . . . . . . . . . . . . . . . . C-2 Stimulated emission . . . . . . . . . . . . . . . . . . . . C-3 Spontaneous emission . . . . . . . . . . . . . . . . . . . C-4 Einstein coefficients and Planck’s law . . . . . . . . . . Role of correlation functions in one-photon processes D-1 Absorption process . . . . . . . . . . . . . . . . . . . . . D-2 Emission process . . . . . . . . . . . . . . . . . . . . . . Photon scattering by an atom . . . . . . . . . . . . . . . E-1 Elastic scattering . . . . . . . . . . . . . . . . . . . . . . E-2 Resonant scattering . . . . . . . . . . . . . . . . . . . . E-3 Inelastic scattering - Raman scattering . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

2068 2069 2070 2072 2073 2073 2075 2080 2080 2081 2081 2083 2084 2084 2085 2085 2086 2089 2091

Introduction In this chapter, we will use the results established in the previous chapter to study some elementary processes concerning the absorption or emission of photons by atoms. Knowing the Hamiltonians describing the atomic energy levels and the radiation, as well Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

as their interactions, we can now focus on solving Schrödinger’s equation that governs the evolution of the system atom plus field. Our objective is to compute the probability amplitude for that system to go from a given initial state at time to a certain final state at a later time . In quantum mechanics, the evolution of the system’s state vector between the instants and is controlled by the evolution operator ( ), which is the basic tool for computing the amplitudes of the various processes studied in this chapter. This is why we start in § A by reviewing a number of equations satisfied by ( ), which will be useful for the forthcoming computations. The first processes we shall study in § B concern photon absorption or emission by an atom undergoing a transition between two discrete states. We shall first consider monochromatic incident radiation, and then a broadband excitation. We will then show in § C that two types of emission can occur during these interaction processes: stimulated emission, which is also predicted in a semiclassical treatment, and spontaneous emission, which requires a quantum treatment of the radiation. We will make a connection with the method used by Einstein to reestablish Planck’s law (giving the spectral distribution of the black body radiation), and deduce the absorption and emission coefficients. The role of correlation functions (pertaining to the atomic dipole and to the incident field) in the computation of transition probabilities is discussed in § D. An important example that involves not one, but two photons, is the scattering of a photon by an atom: during that process, an incident photon is absorbed and a new one is created either by spontaneous or induced emission. This process is studied in § E. When the frequency of the incident photon is close to the atomic transition frequency, the scattering is said to be “quasi-resonant”. Its description requires a non-perturbative treatment that will be developed, based on the results of §§ 4 and 5 of Complement DXIII . In this entire chapter, we shall only consider cases where the atomic levels are discrete; a case where those levels include a continuum will be treated in Complement BXX . Notation: In Chapter XVIII, it was important to distinguish between the classical

quantities and the corresponding quantum operators, so that the latter were denoted with “hats”. In the present chapter, this distinction becomes less important, and we will come back to a more standard and simpler notation, without the hats; for instance, the annihilation and creation operators will be denoted and , instead of ˆ and ˆ . A.

A basic tool: the evolution operator

The unitary evolution operator ( 0 ) has been defined in Complement FIII ; it yields the state of a quantum system at instant knowing the state of that system at a previous time 0 : () =

(

0)

( 0)

(A-1)

It is a unitary operator: ( 2068

0)

(

0)

=

(A-2)

A. A BASIC TOOL: THE EVOLUTION OPERATOR

If

( ) is the system Hamiltonian, }

d ( dt

0)

=

() (

0)

(

=

obeys the differential equation: (A-3)

0

0)

= . The integral equation:

( ) ( }

0)

0)

with the initial condition (

(

0)

(A-4)

0

is equivalent to the differential equation together with its initial condition. If the Hamiltonian is time-independent, the evolution operator is simply: (

0)

A-1.

(

=

0)

}

(A-5)

General properties

In this entire chapter, we use the evolution operator to express the probability amplitude ( ) for the system, starting from the initial state at instant , to be found in the state at time : (

)=

(

)

(A-6)

Consider the total Hamiltonian =

+

+

=

0

:

+

(A-7)

where 0 = + is the non-perturbed Hamiltonian (sum of the isolated atom Hamiltonian and the free radiation Hamiltonian ) and is the interaction Hamiltonian (between the atom and the field). The evolution operators 0 and associated respectively with 0 and read: 0(

(

0) 0)

=

0(

=

(

0) 0)

~

(A-8a)

~

(A-8b)

These operators are related through the integral relation: (

0)

=

0(

0)

+

1 ~

d

0(

)

(

0)

(A-9)

0

To demonstrate this relation, we take the derivative of each side (taking into account the derivative with respect to the integral’s upper bound, which appears in addition to that of the function to be integrated): ~

(

0)

=

0

0(

0)

+

0(

)

(

0)

+

1 ~

d

0

0(

)

(

0)

0

(A-10) that is, taking into account the relation ~

(

0)

= =[

( +

0) 0]

+ (

0 0)

0(

=

0( 0)

(

+

)= 1 ~ 0)

and relation (A-9): d

0(

)

(

0)

0

(A-11) 2069

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

The operator defined in (A-9) therefore obeys relation (A-3) with the total Hamiltonian ; since, in addition, ( 0 0 ) = , this operator is an evolution operator. Inserting expression (A-9) for ( 0 ) into the integral on the right-hand side of that same expression (A-9), we obtain an expression of that operator containing a double integral in time. Reiterating this process several times, we get the series expansion of the evolution operator in powers of : (

0)

=

0(

1 ~

+

0)

+

1 ~

d

0(

)

0(

0 )+

0

2

d 0

d

0(

)

0(

)

0(

0)

+

(A-12)

0

The -order term of this series is a succession of + 1 non-perturbed evolutions, each described by 0 , separated by interactions . The evolution operator also obeys another integral equation, symmetric to (A-9), which will be useful for what follows: (

(A-13)

Its demonstration is similar to that of (A-9). Finally, we can also insert expression (A-13) for by and by . We then obtain:

in the integral of (A-9), replacing

0)

=

=

0(

0(

1 ~

+

0)

0)

+

1 ~

0)

(

0)

+

1 ~

d

(

)

0(

0

d

0(

)

0(

0 )+

0

2

d 0

d

0(

)

(

)

0(

0)

(A-14)

0

Contrary to (A-12), the right-hand side of (A-14) only has three terms, and not an infinity. It is, however, the perturbed evolution operator , and not 0 , that appears in the last term on the right-hand side of (A-14), in between the two interaction Hamiltonians . This form of the evolution operator will be used in § E-2. A-2.

Interaction picture

For the following computations, it will often be useful to write Schrödinger’s equation in the interaction picture. Let ( ) be the Schrödinger state vector. Setting: ¯( ) =

0(

0)

( ) = exp [ (

0)

0

~]

()

(A-15)

the new state vector ¯( ) obeys the time evolution equation: ~

d ¯ ( ) = ¯ ( ) ¯( ) d

(A-16)

where ¯ ( ) is defined as: ¯ ()= 2070

0(

0)

0(

0)

(A-17)

A. A BASIC TOOL: THE EVOLUTION OPERATOR

The evolution operator ¯ ( tion of ¯( ) : ¯( ) = ¯ (

0)

in the interaction representation yields the evolu-

¯( 0 )

0)

(A-18)

From equations (A-1) and (A-15), we can deduce the relation between evolution operators in the two points of view: ¯(

0)

=

0(

0)

(

0)

= exp [ (

0)

0

~] (

0)

(A-19)

In addition, insertion of (A-18) into (A-16) shows that: ~

¯(

0)

= ¯ ( ) ¯(

0)

(A-20)

which leads to the following series expansion for ¯ ( ¯(

0)

=

+

1 ~

d 0

(perturbation expansion):

2

1 ~

¯ ( )+

0)

d 0

d

¯ ( )¯ ( )+

(A-21)

0

The great advantage of the interaction picture is that the state vector only evolves under the effect of the interaction – since ¯ ( ) is the only operator appearing on the right-hand side of (A-16). We shall see that this point of view also allows expressing the transition probabilities in terms of time correlation functions of dipole and field operators, i.e. as average values of products of physical quantities taken at two different instants, and evolving freely (under the effect of only 0 ). Finally, when trying to calculate the transition probability between two eigenstates and of 0 , with respective energies and , it is often convenient to use the interaction picture since, according to (A-19), the transition amplitude is of the form: (

0)

= =

exp [ (

(

0)

¯(

0) }

0

~] ¯ ( 0)

0)

(A-22) (A-23)

( 0) } As the phase factor disappears from the probability (modulus squared of the amplitude), it can be ignored; this allows simply replacing the evolution operator by ¯ and directly using the more compact expansion (A-21). In this entire chapter and its complements, we describe the interaction between atom and field by the electric dipole Hamiltonian = D E (R) introduced in § C-4 of Chapter XIX. For the sake of simplicity, we assume that 0 = 0. In the interaction picture1 , this operator becomes:

¯ ()=

¯ ) E ¯ (R ) D(

(A-24)

where: ¯ ) = exp ( D( ¯ ( ) = exp ( E

0

~) D exp (

0

~) E exp (

~)

0 0

~)

(A-25a) (A-25b)

1 Here, the atom’s external degrees of freedom are treated classically. The atom is supposed to be at rest at point R, meaning R is not modified when going to the interaction picture.

2071

CHAPTER XX

A-3.

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

Positive and negative frequency components of the field

In the present chapter, we assume the system is contained in a cubic box of volume . The transverse electric field is given by relation (B-3) of Chapter XIX. It is a linear combination of annihilation and creation operators, and can be expressed as the sum of two terms: 3

E (R) = E

(+)

(R) + E

( )

(R)

(A-26)

where: E

(+)

E

( )

~ 2 0

(R) =

1 2

~ 2 0

(R) =

k R

ε

3 1 2

k R

ε

3

= E

(+)

(R)

(A-27)

(+)

Operator E (R), obtained by keeping only the annihilation operators in the expan¯ (R), is called 2 the electric field “positive frequency component”. As for the sion of E ( ) operator E (R), it is the “negative frequency component”. These two operators are not Hermitian, and do not commute. In a product of field operators, the order is said to be normal if the creation operators are to the left of the annihilation operators, as in ( ) (+) E E ; the order is said to be antinormal for a product in the inverse order, as in (+) ( ) E E . In the interaction picture, the annihilation and creation operators become: ¯ ( ) = exp (

0

~)

exp (

0

~) =

¯ ( ) = exp (

0

~)

exp (

0

~) =

+

(A-28)

(these equalities can be verified by computing the matrix elements in the Fock state basis, eigenvectors of 0 , and using the fact that the only action of operator is to annihilate a photon in the mode ). The positive and negative frequency components of the field are thus:

¯ (+) (R ) = E ¯ ( ) (R ) = E

~ 2 0

1 2

~ 2 0

(k R

ε

3

)

1 2 3

ε

(k R

)

¯ (+) (R ) = E

(A-29)

Suppose now that we wish to study the lowest order process of photon absorption by atoms. To compute the action of ¯ ( ) on the system’s initial state, we can keep 2 The positive frequency component annihilates a photon, the negative frequency component creates one. Furthermore, in the Heisenberg picture, we shall see that the free evolution of the positive component goes as , and that of the negative one, as + . This labeling as “positive frequency” may seem somewhat counter intuitive, but is widely accepted.

2072

B. PHOTON ABSORPTION BETWEEN TWO DISCRETE ATOMIC LEVELS

only the terms in ¯ ( ) that annihilate one photon, i.e. the Hamiltonian terms containing ¯ ( ). This amounts to keeping only the positive frequency component of the field operator, and using the simplified interaction Hamiltonian: ¯ ) E (+) (R ) D(

¯ ()

(absorption)

(A-30)

In a similar way, to study the lowest order photon emission process, we can use the expression: ¯ ) E ( ) (R ) D(

¯ () B.

(emission)

(A-31)

Photon absorption between two discrete atomic levels

We start with monochromatic radiation (§ B-1), and will study later the broadband radiation case (§ B-2). The base of the computation is the study of the transition rate between stationary states of the non-perturbed Hamiltonian 0 ; Complement DXX will present a more detailed study in terms of wave packets propagating in free space, built from coherent superpositions of stationary states. B-1.

Monochromatic radiation

B-1-a.

Probability amplitude (absorption)

We call =~

and

two discrete levels with respective energies

and

, and set: (B-1)

0

where 0 2 is the atomic eigenfrequency, assumed to be positive (the level has an energy higher than the level). For the sake of simplicity, we shall ignore the external variables, which amounts to considering the atom as infinitely heavy and at rest3 . We assume the radiation is at the initial time = 0 in a state in = containing photons with wave vector k , polarization ε and frequency 2 ; it is a monochromatic radiation. The initial state of the system atom + radiation is written: in

=

;

with energy

in

=

+

(B-2)

~

We are trying to compute the probability amplitude for the atom to absorb a photon and be in the excited state at instant = ∆ . The final state of the system must then be: fin

= ;

1

with energy

fin

=

+(

1)~

(B-3)

As mentioned above, when in and fin are eigenstates of 0 , it is easier to carry out the calculation in the interaction representation. In the expansion (A-21) of ¯ , the lowest order term that can link those two states is the first order term in . Calling 3 Complement A XIX shows how taking into account the external variables and momentum conservation allows introducing the Doppler effect and the recoil effect in the absorption and emission of a photon. It also shows how the confinement of the atom in a region of space by a trapping potential allows controlling those effects.

2073

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

∆ the duration of the interaction, the probability amplitude for the system, initially in the state in at time 0 = 0, to be at time = ∆ in the final state fin is: fin

¯(

)

in



1 ~

=

d

¯ (∆ 0) =

in

=

in

∆ fin

(

d

in

in )

fin

~

(B-4)

0

or else, taking into account the relation fin

~

0

0

1 ~

=

~

0

fin

fin

in

= ~(

):

0



1 ~

fin in

~(

)

0

(B-5)

0

1

fin

(

d

in

(

)∆

0

1

)

0

which leads to: fin

¯ (∆ 0)

in

=

fin

~

(

in

0

)∆

2

2 sin ( (

)∆ 2 )

0 0

(B-6)

The absorption probability is the squared modulus of that expression: fin

¯ (∆ 0)

2 in

=

1 }2

2 fin

4 sin2 (

in

(

)∆ 2

0

(B-7)

2

)

0

Taking relations (A-24) and (A-27) into account, the matrix element of in the amplitude (B-5), can be written: fin

in

~ 2 0

=

ε

3

k R

D

, appearing

(B-8)

so that the probability becomes: fin

¯ (∆ 0)

2 in

=

2~

0

3

ε

D

2

4 sin2 ( (

0

)∆ 2 2

0

)

(B-9)

It is proportional to the number of incident photons , i.e. to the incident intensity in the state in , as well as to the squared modulus of the atomic dipole matrix element between the states and (since D is an odd operator, the absorption of the photon can only occur between two states of different parity). This probability is an oscillating function of the time ∆ . B-1-b.

Energy conservation 2

The presence in the denominator of (B-9) of the factor ( 0 ) means that, the closer gets to 0 , the larger the absorption probability will be. A photon absorption is said to be resonant when the absorbed photon energy is exactly equal to the energy the atom must gain to go from to (energy conservation). The width of the resonance described by (B-9) is of the order of ∆ = 1 ∆ or, in energy, of the order of ∆ = ~ ∆ . This is consistent with the time-energy uncertainty principle: in a process extending over a length time ∆ , the energy is only conserved to within ~ ∆ . 2074

B. PHOTON ABSORPTION BETWEEN TWO DISCRETE ATOMIC LEVELS

Actually, when the variable is the angular frequency , one may consider the function within brackets in (B-6) to be an approximate delta function (∆ ) ( 0 ), having a non-zero width of the order of 1 ∆ . As relation (10) of Appendix II in Volume II shows that the integral over of sin[( 0 )∆ 2] ( 0 ) is equal to , and since ( 0 ) is proportional to the difference in total energy between the final and initial states, equation (B-6) can be written (ignoring the phase factor): fin

¯ (∆ 0)

B-1-c.

fin

2

fin

in

(∆ )

(

fin

in )

(B-10)

Limits of the perturbative treatment

The lowest order perturbative treatment in that we have used cannot remain valid for arbitrarily long times. To understand why, imagine for example that the resonance condition = 0 is satisfied. Since sin( ∆ ) tends toward ∆ when goes to 0, the absorption probability predicted by (B-9) becomes proportional to ∆ 2 , which gets very large as the time interval ∆ increases. Now a probability can never be larger than one; it is therefore obvious that expression (B-9) is no longer valid for long times. The same is true if gets close to 0 without being strictly equal to it: it is then quite possible for the oscillation amplitude of the right-hand side of (B-9) to be larger than one. The previous results can thus only be used for short enough times, ensuring the validity of the perturbative treatment. We shall use in Complement CXX the “dressed atom method” to get a more precise treatment of the coupling effects between a two-level atom and a single mode of the field. The coupling intensity will be characterized by a constant Ω1 called the “Rabi frequency”. At resonance, one shows that the probability amplitude presents a “Rabi oscillation”4 in sin Ω1 ∆ . The quadratic behavior in ∆ 2 found above for the absorption probability at resonance is simply the first term of the series expansion of sin2 (Ω1 ∆ ) in powers of Ω1 ∆ . We shall also discuss the extent to which relation (B-9) can be used far from resonance. B-2.

Non-monochromatic radiation

We now study the absorption and emission processes when the radiation is no longer monochromatic. The transitions are still between two discrete atomic levels and ; the case of a continuum of atomic levels is discussed in Complement BXX . B-2-a.

Absorption of a broadband radiation

We now assume the initial state of the system atom + radiation to be of the form:

in

=

;

in

(B-11)

The atom is still in the internal state of lower energy , but the radiation is now in a state in where photons occupy several modes with different frequencies. The radiation frequency distribution is characterized by a certain spectral band ∆ . We shall see below (very last part of § B-2-a- ) the condition ∆ must satisfy for the results obtained in 4 We shall also show how this Rabi oscillation is modified when taking into account the width of the excited level due to spontaneous emission.

2075

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

this section to be valid. The calculations will be carried out in the same way as in § B-1, while focusing more on the time correlation properties of the incident field. We introduce the notation: D

=

with:

=

(B-12)

e

where e is the unit vector5 (a priori complex) parallel to , and the (real) modulus of that vector. In the interaction picture (with respect to the free atom Hamiltonian), this equality becomes: ¯ ) D( .

=

0

D

=

0

(B-13)

0

Transition probability

Let us insert expression (A-24) for ¯ ( ) into (A-21). The first order term in ¯ ( ) yields the probability amplitude for the system, starting at = 0 from the state ; in , to be found at time = ∆ in the state ; fin . Taking (B-13) into account, we obtain: fin

¯(

)



1 ~

=

in

¯ ) D(

d

¯ (R E

fin

)

in

0 ∆

=

d ~

0

fin

0

(+) d (R

)

in

(B-14)

where we have included only the positive frequency component of the field, which is the only one involved since fin contains less photons than in (we are dealing with an atomic absorption process); in this equality and the following, we shall use the convenient notation: (+) d (R

¯ (+) (R ) ; E

)=e

(+)

( ) d (R

)=e

¯ ( ) (R ) E

(B-15)

( )

¯ (R ) and E ¯ (R ) are the field operators in the interaction picture defined where E in (A-29); the atom is supposed to be fixed at point R. The corresponding transition probability abs (∆ ) is obtained by squaring the modulus of amplitude (B-14), and then summing over all possible radiation final states. Replacing in (B-14) the integral variable by , we obtain: ∆

2 abs

where

(∆ ) = (

(

~2



d

0(

d

0

)

(

)

(B-16)

0

) is defined as: )=

in

( ) d (R

)

fin

fin

(+) d (R

)

in

fin

= 5 The

set = we have

2076

( ) d (R

in

matrix elements 2

=e

2

+ =

+ e

)

(+) d (R

)

in

of the three components of 2

and introduce the vector e = D .

(B-17) are three a priori complex numbers. We . It is a unit vector since e e = 1;

B. PHOTON ABSORPTION BETWEEN TWO DISCRETE ATOMIC LEVELS

This function characterizes the role of the incident beam in the absorption process under study. It is the average value in the initial state of the product of two field operators, arranged in normal order (§ A-3), and taken at two different times. Changing the integral variable for = (the jacobian of this change of variables is equal to unity), equation (B-16) becomes: ∆

2 abs

(∆ ) =



d

~2

d

(

0

+ )

(B-18)

0

The transition probability is therefore proportional to the time integral of the Fourier transform6 with respect to of the function ( + ), limited to the time interval [ ∆ ]. .

Excitation spectrum If the radiation initial state in is an eigenstate of the free radiation, the function ( ) depends only on the difference . For example, if in is a Fock state7 : in

=

(B-19)

1

where the populated modes have the polarization ε , inserting expansions (A-29) for the field operators leads to: (

~

)=

2

0

in

3

e

in

ε

2

(

)

=

=

d

(

( )

)

(B-20)

where: ( )=

~ 2

0

3

e

ε

2

(

)

(B-21)

If the radiation is initially in the Fock state (B-19), the value of is simply , the number of photons in that state. On the other hand, if the radiation is in a statistical mixture of such states (see note 7), represents the corresponding statistical average. Expression (B-20) thus appears as the value at of the Fourier transform of a function ( ) of , which depends on the initial photon populations . As ~ is the average energy of mode with frequency 2 , the function ( ) actually gives the variation of the energy density of the radiation as a function of frequency; this function is also referred to as the spectral distribution of the incident radiation (excitation spectrum). This distribution can have, a priori, any shape, but the energy density often presents a single peak of width ∆ , with no other particular structure. Its Fourier transform ( ) is then a function of width 1 ∆ ; as a result, when 6 The

(+)

components of the field d vary in e . This is why the Fourier component of at 0 appears in (B-18). 7 Instead of a Fock state as the initial state, one could choose a statistical mixture of such states with arbitrary weights; this would not significantly change the following calculations and conclusions.

2077

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

Figure 1: The function to be integrated in expression (B-16) yielding the absorption probability is significant only in a band along the first bisector, of width 1 ∆ , where ∆ is the width of the incident radiation spectrum. If ∆ 1 ∆ , the area of the domain actually contributing to that integral increases linearly with the diagonal of the square, and hence with ∆ (and not with its square). becomes large compared to 1 ∆ , the correlation function ( ) goes to zero and can be neglected. We shall see that in such a case the probability becomes proportional to ∆ , which naturally leads to introducing a probability per unit time. Our reasoning will be similar to that of Complement EXIII for a classical perturbation, but here the radiation is treated quantum mechanically. .

Probability per unit time

The integration domain of the double integral in (B-16) is plotted in Figure 1. As ( ) goes to zero as soon as is large compared to 1 ∆ , the portion of that domain where the function to be integrated is not negligible is a band of width 1 ∆ , along the first bisector; the width of this band is very small compared to the domain extension if ∆ 1 ∆ . To make use of that property, we again replace the integral variable by = (the associated Jacobian for this change of variables in the double integral is equal to 1): ∆



d 0



d 0

=



d

d

(B-22)

0

In the second integral, the values of that actually yield a non negligible contribution are of the order of the correlation time 1 ∆ of ( ). If we assume ∆ 1 ∆ , we can replace the limits of that second integral by and + . Inserting then (B-20) into (B-18) yields the integral: ∆

+

d 0

2078

+

d

d

(

0

)

( )=2 ∆

(

0)

(B-23)

B. PHOTON ABSORPTION BETWEEN TWO DISCRETE ATOMIC LEVELS

since the summation over d yields the function 2 ( 0 ), which allows integration over d ; we get a function independent of whose integration is proportional to ∆ . We finally obtain: abs

(∆ ) =

2 ∆ ~2

2

(

0)

=

2 ∆ ~2

~

2

2

0

3

e

ε

2

(

0)

(B-24)

This means that abs (∆ ) increases linearly with ∆ , which leads us to define an absorption probability per unit time (absorption rate): abs abs

2 (∆ ) =2 2 ∆ ~

=

(

0)

(B-25)

(remember that, for monochromatic radiation, we found an absorption probability increasing not linearly but as the square of ∆ ). This absorption rate is proportional to the radiation energy density at the atomic transition frequency 0 . Formulas (B-21) and (B-25) give the dependence of the absorption rate on the various parameters of the incident radiation (populations of the modes, polarization ε ). Our calculation is perturbative since it was carried out to the lowest order in . It is therefore only valid for times ∆ such that abs (∆ ) = abs ∆ 1, i.e. such that abs ∆ 1 . In addition, we saw above that the linear variation of abs (∆ ) with ∆ abs is obtained only if ∆ 1 ∆ . These two inequalities are compatible if ∆ . The approximation to lowest order is thus valid only if the broad band radiation has a spectral width large compared to the absorption rate. B-2-b.

A specific case: isotropic radiation

The previous calculations can be pushed a step further when the radiation is isotropic, meaning when , the average number of photons in mode , depends only on , and neither on the direction of the wave vector k nor on the polarization ε . The results we shall obtain for this specific case will be useful for later computations (§ C4) on the spontaneous emission rate, as well as for the isotropic radiation at thermal equilibrium. Consider the limit of (B-25) when the volume 3 containing the system goes to infinity. The summation over the index can be replaced by an integral: 3 2

(2 )3

d dΩ

(B-26) ε k

where, for isotropic radiation, a sum is taken over two linear polarizations perpendicular to à k. We assume that the vector e is real as well, and choose an axis that is parallel to it8 . We first calculate, for a given direction of k, the sum of the quantities 2 for the two polarizations. We take two polarization vectors, ε1 and ε2 , both perpendicular to k, and perpendicular to each other. Imagine the first one is in the plane containing k and the axis, so that 1 = sin , where is the angle between k and the axis; the

8 If e is complex, it is easy to see that the contributions of its real and imaginary part simply add in (B-24). The sum of these two contributions then gives (B-27) again.

2079

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

second is necessarily perpendicular to that plane, which means 2 = 0. We then have 2 = sin2 . The integral over all the directions of k is then trivial: ε k dΩ sin2 = 2

d sin3 = 0

8 3

(B-27) of k. Using (B-25), (B-26), (B-27) and , we finally obtain:

We are left with the integral of the modulus changing the variable into the variable = 2 abs

C.

=

3 0~

3 0

(

3

0)

(B-28)

Stimulated and spontaneous emissions

C-1.

Emission rate

We now assume that the atom is initially in the upper state , and that the radiation is in the initial Fock state (B-19); we study the emission processes where the atom falls back into the state while emitting a photon. We still assume that the spectrum of the incident radiation is broad, so that its correlation function tends to zero in a time that is much shorter than ∆ . The computations are then similar to the one we just did for the absorption process, but with a certain number of changes. First of all, we must now use expression (A-31) for the interaction Hamiltonian, the one that contains the negative frequency component of the field operator (with solely creation operators). Secondly, concerning the electric dipole operator, we must only keep the term connecting to , which amounts to replacing (B-13) by the complex conjugate expression: ˜ ) D(

=

=

0

e

(C-1)

0

The correlation function (B-17) of the field in normal order is now replaced by the correlation function in antinormal order ( ): (

)=

(+) d (R

in

)

( ) d (R

)

(C-2)

in

For the radiation state (B-19), this function is given by: (

)= =

~ 2 (

+ 1)~ 2

ε

e

3

0

0

3

e

ε

2

2

(

(

)

)

(C-3)

In the present case, it is now that is present in (C-3), and not as in (B-20). Since and do not commute, this leads to an important difference compared with expression (B-20) for ( ): the number of photons initially populating the mode is replaced by + 1. It is the quantum character of the field (non-commuting operators) that is responsible for these essential differences between the absorption and emission processes. 2080

C. STIMULATED AND SPONTANEOUS EMISSIONS

The emission rate is obtained by computations similar to those that led to the absorption rate (B-25) (with a change of sign for 0 and for the ). We obtain: em em

2 (∆ ) =2 2 ∆ ~

=

[

+ 1] ~ 2

0

3

e

ε

2

(

0)

(C-4)

This result differs from the absorption rate only by the replacement of by + 1. This formula gives the general expression of the emission rate as a function of the population of the modes and of their polarization ε . We shall now discuss its two components, one proportional to , and one that does not depend on . C-2.

Stimulated emission

We first consider the terms in (C-4) that contain , i.e. the contribution of the modes containing initially at least one photon. These terms correspond to an emission induced by the incident radiation; their rate is proportional to the incident light intensity. For this reason it is called stimulated emission. Its rate is the same as the absorption rate, since the terms depending on are identical in (B-24) and (C-4): stim em

=

abs

(C-5)

In particular, for isotropic radiation, we get a result identical to (B-28): 2 stim em

=

3 0~

3 0 3

(

0)

(C-6)

A photon resulting from stimulated emission is created in the same mode as the photons inducing that emission: the number of photons in that mode goes from to + 1. The added photon has the same energy, same direction and same polarization as the initial photons. If the incident radiation is coherent, one can show that radiation emitted by stimulated emission has the same phase as the incident one. This results in a constructive interference effect (in the direction of the incident radiation) between the radiation emitted by the induced dipole and the incident radiation, hence leading to an amplification effect. For this phenomenon to occur, the atomic populations must be inverted, meaning that the probability of occupation of the upper level must be larger than that of the lower level . However, if this is not the case, the interference is destructive, which explains the attenuation of an incident beam by the absorption process. The coherent amplification by stimulated emission of an incident beam propagating through atoms with an inverted population plays an essential role in laser systems. The word laser is an acronym of “Light Amplification by Stimulated Emission of Radiation”. C-3.

Spontaneous emission

If all the are equal to zero, the radiation is initially in the vacuum state; the absorption rate (B-25) is then equal to zero. On the other hand, the emission rate (C-4) is not, because of the term 1 in = + 1. It follows that an atom, initially in the upper state and placed in a vacuum of photons, has a non-zero probability per unit time of emitting a photon and falling back into the lower state . This is called the spontaneous emission process. 2081

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

The term 1 appearing in (C-4) is the same for all the modes, as opposed to the term that only exists for the modes already containing photons. As this term 1 does not favor any particular direction or polarization, the situation is similar to that where the radiation is isotropic; the calculations leading to (B-28) remain valid provided we replace ( 0 ) by 1. This leads to the rate of spontaneous emission: spont em

2

=

3 0

(C-7)

3

3 0~

This result could also be obtained by Fermi’s golden rule (see Chapter XIII, § C-3b) in the following way. The system atom + radiation starts from the initial state ; 0 (atom in state , photon vacuum). This discrete state is coupled to an infinity of final states ; 1 , atom in state with one photon in mode . The golden rule enables one to compute the probability per unit time of the transition from the discrete initial state towards the continuum of final states, which is simply the spontaneous emission rate: em spont

=

2 ~

;1

;0

2

(~

~

0)

(C-8)

Using ; 0 ; 1 = (e ε )(~ 2 0 3 )1 2 , for a real polarization, as well as equations (B-26) and (B-27), we get the same result as (C-7). This equation shows that the spontaneous emission rate increases as the cube of the atomic transition frequency; this explains why spontaneous emission is negligible in the radiofrequency domain, becomes important in the optical domain, and even more so in the ultraviolet or X ray domain. This 03 factor has two origins: on one hand, the square of the 1 2 factor appearing in the electric field expression, on the other, the 2 factor present in the density of final states. The spontaneous emission rate spont em is also called the “natural width” of the excited state , and noted Γ. The inverse of Γ is the “radiative lifetime” of the excited state, which is the average time necessary for the atom to undergo radiative decay: Γ=

spont em

=

1

It is instructive to compare the rate Γ 0

2

=

3 0~

2 0 3

(C-9) spont em

to

0.

It follows from (C-7) that: (C-10)

The dipole is of the order of 0 where is the electron charge and 0 the Bohr radius. This yields the quantity 2 (3 0 ~ ) which, within a factor 4 3, is the fine-structure constant 1 137 multiplied by 20 02 2 . Since 0 0 is of the order of the electron velocity in the first Bohr orbit, which is times smaller than the speed of light , 20 02 2 is of the order of 2 , so we finally have: Γ

3

(C-11)

0

The natural width of the excited level is therefore much smaller than the atomic transition frequency: the atomic dipole may oscillate a great number of times before these oscillations are damped. Typically, in the optical domain, Γ is of the order of 107 to 109 s 1 whereas 0 2 is much larger, of the order of 1014 to 1015 s 1 . 2082

C. STIMULATED AND SPONTANEOUS EMISSIONS

C-4.

Einstein coefficients and Planck’s law

Let us now assume the radiation is at thermal equilibrium at temperature (black body radiation). In this case, it is current practice to use another notation for the various absorption and spontaneous or stimulated emission rates: the Einstein and coefficients from his 1917 article. In that article, he introduces for the first time the concept of stimulated emission: abs

and

stim em

=

spont em

=

=

One can then write the change per unit time of the populations levels due to the various absorption and emission processes:

˙

=

˙

=

+

+

(C-12) and

of the

(C-13)

As an example, on the right-hand side of the first line of (C-13), the first term describes how level fills up when absorbs a photon, the second term how it is emptied by stimulated emission towards , the third one how it is emptied by spontaneous emission. Similar explanations can be given for the second line. In a steady state, there is a balance between the various processes, and we have: ˙

= ˙

=0

(C-14)

We then get from the first equation: =

(C-15)

+

Now, according to the Boltzmann distribution law, must be equal to (~ 0 Relations (B-28), (C-6) and (C-7) then show that, at equilibrium, the populations and obey the relation: =

(~

)

0

=

(

=

+

(

0)

0)

+1

)

.

(C-16)

which means that: (~

(

0)

=

)

0

(~

1

)

0

=

1 (~

0

)

1

(C-17)

Multiplying the average energy per mode ~ 0 ( 0 ) by the mode density 8 02 3 in the vicinity of 0 , yields Planck’s law for the energy density per unit volume of the black body radiation, as a function of the frequency 0 = 0 2 : ( 0) =

3 0

8 3

1 (

0

)

1

(C-18)

In other words, when an ensemble of two-level atoms reaches Maxwell-Boltzmann equilibrium through absorption and spontaneous or stimulated emission of radiation, that radiation must necessarily obey Planck’s law. This is the essence of the argument used by Einstein to establish this law9 . 9 Einstein could not reason in 1917 in terms of the quantum theory of radiation, which was not available. The heuristic introduction of the and coefficients illustrates his remarkable intuition.

2083

CHAPTER XX

D.

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

Role of correlation functions in one-photon processes

The probabilities associated with one-photon processes can be expressed in terms of atom and field correlation functions. D-1.

Absorption process

Let us write again the probability amplitude for the system, starting at = 0 from the state ; in , to end up at time = ∆ in the state ; fin after undergoing a one-photon transition – see (B-14): fin

¯(

)



1 ~

=

in

¯ ) D(

d

fin

¯ (R E

)

(D-1)

in

0

When dealing with an absorption of radiation by the atom, the number of photons in the final state fin is lower than in the initial state in ; only the positive frequency ¯ (+) , which can destroy photons, yields a contribution to the component of the field E ¯ (R ) matrix element fin E in . Let us start with a radiation initial state containing photons with a single given polarization εin ; only modes with this particular polarization are involved in the matrix element. If the polarization is not linear (circular for instance), εin is complex, and we must replace it by its complex conjugate εin is all negative frequency components of the field. Moreover, since εin εin = 1, the εin polarization components are obtained by a scalar product of εin by the field. We then have: fin

¯(

)

The probability abs

(∆ ) =



1 ~

=

in

abs

¯ ) εin D(

d

¯ (+) (R εin E

)

in

(D-2)

0

(∆ ) of the absorption process is then:



1 ~2

fin

¯ ) εin D(

d

in

¯ ( ) (R εin E

)

fin

0



¯ ) εin D(

d

fin

¯ (+) (R εin E

)

in

(D-3)

0

To obtain the probability of finding the atom in any final state other than , whatever final state fin the radiation is in, we must sum this result over all possible and fin states. This yields two closure relations, one in the atom state space10 , the other in the radiation state space. This leads to the following result: abs

(∆ ) =

1 ~2





d 0

d

a(

)

(

)

(D-4)

0

with the definitions: a(

)=

¯ ) εin D( ¯ ) εin D(

(D-5)

10 In the summation over , the initial atomic state can be included since the matrix element of the atomic dipole in that state is zero (because of a parity argument).

2084

E. PHOTON SCATTERING BY AN ATOM

and: (

)=

¯ ( ) (R εin E

in

)

¯ (+) (R εin E

)

(D-6)

in

The two functions we just defined correspond, respectively, to the correlation function of the atomic dipole and to that of the electric field expressed in normal order. In the more general case where the field initial state includes several polarizations, (D-1) must now include the matrix elements: ¯ ( )

fin

¯ (R

)

(D-7)

in

=

with: ¯ ( )=e

¯ ) and D(

¯ (R ) = e

¯ (R ) E

where e is the unit vector of each of the three becomes: abs

(∆ ) = =

1 ~2



(D-8)

=

axes. Probability (D-4) then



d 0

d

a

(

)

(

)

(D-9)

0

with the following definitions of the 9 dipole correlation functions, and the other 9 field correlation functions (the vectors e are real) : a

(

)=

(

)=

¯ ) e D(

e in

e

¯ ( ) (R E

¯ ) D( )

e

¯ (+) (R E

)

in

(D-10)

We thus get a correlation tensor, which slightly complicates the equations, but does not change the essence of the results. D-2.

Emission process

For the photon emission processes, spontaneous or stimulated, we can make sim¯ (+) and E ¯ ( ) operators must be ilar computations. The main difference is that the E interchanged, which yields antinormal instead of normal field correlation functions; furthermore, we must use the more general formula (D-9) instead of (D-4) since an emission process does not favor any specific polarization. E.

Photon scattering by an atom

We now consider a photon scattering process where the initial state includes an atom in state and an incident photon, and where in the final state the atom is still in state , but the incident photon has been replaced by another one. This is a two-photon process, since one photon disappears, and is replaced by another one. 2085

CHAPTER XX

E-1.

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

Elastic scattering

As in § B, we shall treat classically the center of mass of the atom, supposed to be fixed at the origin of the reference frame (see however the comment on page 2094). The initial state in of the system atom + radiation at time corresponds to an atom in state in the presence of a photon in mode , with wave vector k , polarization ε and frequency (we assume the radiation to be monochromatic): in

=

;k ε

with energy

in

=

+~

(E-1)

At the final time , the scattering process has replaced the initial photon k ε by a new photon k ε with wave vector k and polarization ε . The final state of the system is: fin

=

;k ε

with energy

fin

=

+~

(E-2)

Conservation of total energy requires = . This scattering thus occurs without frequency change and is called for that reason “elastic scattering”. We shall study in § E-3 the case where the final atomic state is different from the initial atomic state. This different process is called “Raman scattering”. As the electric dipole interaction Hamiltonian (A-24) can only change the photon number by one unit, the system must go through an intermediate state often called a “relay state” where the atom is in state and the radiation in a state different from its initial state. The lowest order term of the series expansion (A-12) that contributes to the scattering amplitude is of order two. E-1-a.

Two possible types of relay state

There are two possible types of relay states: those corresponding to processes we shall label ( ), where the photon k ε is absorbed before the photon k ε is emitted; and those corresponding to processes labeled ( ), where the photon k ε is emitted before the photon k ε is absorbed. In the first case, the relay state is the state rel = ; 0 , where is an atomic relay state and 0 is the radiation vacuum, since the photon k ε present in the initial state has been absorbed; the energy of this relay state is . In the second case, the relay state is rel = ; k ε ; k ε , since the rel = photon k ε has been emitted before the photon k ε was absorbed: the energy of this relay state is rel = + ~ + ~ . Figures 2 and 3 show two different diagrams representing these same two processes. In Figure 2, the horizontal lines represent the atomic levels; an upwards arrow represents an absorption, whereas a downwards arrow represents an emission. The advantage of this representation is to directly show the energy difference between the initial state and the relay state, equal to in +~ for the ( ) processes, and rel = to in ~ for the ( ) processes: this difference is simply the disrel = tance between the height of the dashed line and the height of the line representing the atomic relay state . In particular, these two lines coincide for the ( ) processes when +~ = , i.e. when the absorption of the incident photon is resonant for the transition (resonant scattering, which will be studied later on). In Figure 3, an incoming arrow represents an absorption, whereas an outgoing one represents an emission. Reading the diagram from bottom to top, one clearly sees which atomic state and photons are present in the initial state, the relay state and the final 2086

E. PHOTON SCATTERING BY AN ATOM

Figure 2: First diagram representation of the scattering processes labeled ( ) and ( ) in the text; these processes have a different chronological order for the absorption of the incident photon and the emission of the scattered photon. The full horizontal lines represent the atomic levels, and the upwards arrows represent absorption processes and downwards arrows, emission processes; the horizontal dashed lines clearly show the energy differences that will appear in the denominator of the transition amplitude expression.

state. For the ( ) processes, no photons are present in the relay state, whereas both incident and scattered photons are present in that state in the ( ) processes. E-1-b.

Computation of the scattering amplitude

The computation of the scattering amplitude is of the same type as the calculations already presented above. In addition, it is almost identical to the computation of the two-photon absorption amplitude explained in detail in § 1 of Complement AXX . Consequently, it will not be explicitly carried out here, but we shall merely highlight the differences with the computation of that complement. The reader interested in more details may want to read that complement before continuing with this paragraph. Relations (13) and (14) of that complement are written here: fin

¯(

)

in

=

2

(

in )

+

(

in )

(∆ )

(

=

2

(

in )

+

(

in )

(∆ )

(~

fin

in )

~ )

(E-3)

where we have introduced the probability amplitudes: (

in )

fin

=

D E(

) rel

rel

in

rel

D +~

ε

D E (+)

in

rel

=

~ 2

ε 0

3

D

(E-4)

2087

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

Figure 3: Another possible diagram representation of the scattering processes ( ) and ( ). The state of the atom is shown on the vertical full line. As for the photons, the incoming arrows (wiggly arrows pointing towards the vertical full line) represent absorption, whereas outgoing ones (wiggly arrows leaving the vertical full line) represent emissions. The bottom to top reading of each diagram shows the chronological succession of the states of the system atom + photons.

(

in )

fin

=

D E (+)

rel in

rel

D E(

)

in

rel

rel

=

~ 2

ε 0

3

D

ε

D

(E-5)

~

The two amplitudes ( in ) and ( in ) correspond to the two types of relay states considered above. In (E-3), the delta function expressing the energy conservation is proportional to (∆ ) ( ) since fin ). To write (E-4) and (Ein = ~( 5), we have replaced in relation (13) of Complement AXX the interaction Hamiltonian by D E (+) or D E ( ) , depending on whether it pertains to an absorption or an emission process. In the numerator of the fractions on the right-hand side of (E-4), operator E (+) (which absorbs the incident photon) acts before operator E ( ) (which creates the scattered photon); this is to be expected for an ( ) type process. The order of the two operators E (+) and E ( ) is reversed in (E-5), as expected for a ( ) type process. The coefficients on the second lines of these equalities come from the plane wave expansion (A-27) of the electric field; is the edge of the cubic cavity used to quantize the field. We could have added a factor , where and are the photon numbers in the initial and final states; for the sake of simplicity, we have assumed that = = 1. 2088

E. PHOTON SCATTERING BY AN ATOM

E-1-c.

Semiclassical interpretation

Elastic scattering can also be explained by a semiclassical treatment, where the quantum treatment only applies to the atom; the incident wave is described as a classical field of frequency . This wave induces, in the atom, an oscillating dipole at the same frequency. This dipole radiates into the entire space a field oscillating at that frequency. The semiclassical approach also enables a simple interpretation of the absorption of the incident beam as resulting from a destructive interference, in the direction of the incident field, between that field and the field scattered by the induced dipole. One can also use such a description to account for the amplification of an incident beam by an ensemble of atoms whose population has been “inverted”, meaning atoms for which the population of an excited level is larger than the population of a lower energy level. The scattered field then has the opposite phase of that it would have without population inversion, so that the interference becomes constructive. E-1-d.

Rayleigh scattering

Assume that the frequency of the radiation is much smaller than all the atomic frequencies ~. One can then ignore and in the denominators on the second lines of (E-4) and (E-5). The only dependence of the scattering amplitudes comes from the prefactor , equal to since = . The scattering cross section involves the product of the squared modulus of that amplitude, proportional to 2 , by the density of the radiation’s final states at frequency = , also proportional to 2 . 4 The scattering cross section therefore varies as , much higher for blue light than for red light. One usually calls “Rayleigh scattering” the elastic scattering when ~. It explains the scattering of the visible solar light by the atmospheric oxygen and nitrogen molecules, which have much higher resonant frequencies, in the ultraviolet domain. This rapid variation with frequency of the Rayleigh scattering cross section is a reason for the sky being blue. E-2.

Resonant scattering

Assume now that the frequency frequency: =(

) ~

of the incident photon is very close to the

(E-6)

of a transition between a state and a state having a higher energy. The absorption of the incident photon is then resonant for the transition and the amplitude (E-4) becomes very large when the state becomes the atomic relay state – it even diverges if the resonant condition is exactly satisfied. In that case, one can neglect all the ( ) type processes; in addition, even if there are other possible atomic relay states , ,.., one can keep only the term involving . To avoid the difficulties related to the divergence of (E-4) when = , it is convenient to use the exact expression (A-14), which only involves three terms, instead of an infinity as in expansion (A-12) for the evolution operator. Only the last of those three terms plays a role, since it can destroy a photon and create another one. A computation similar to that leading to relation (6) of Complement AXX , but using (A-14) instead of 2089

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

(A-12), then yields: (

fin

=

)

in 2

1 ~

d

d

fin

0(

)

(

)

0(

)

in

(E-7)

Note that it is now ( ) that appears in the middle of the matrix element in (E-7), and not 0 ( ). Starting from in = ; k ε , the first interaction Hamiltonian of (E-7) brings the system to the state ; 0 . In a similar way, if the system starts from ; 0 . Expression fin the second interaction Hamiltonian of (E-7) brings it to the state (E-7) can then be written: fin

(

)

in

1 ~

=

2

d ;0

(

d ) ;0

;k ε ;0

0( 0(

)

;0

) ;k ε

(E-8)

Had we used (A-12) instead of (A-14), the central matrix element of relation (E8) would be ; 0 0 ( ) ; 0 = exp( ( ). In our case, using (A-14) leads to ; 0 ( ) ; 0 which is the probability amplitude for the system, starting from the state ; 0 at time , to still be in that same state at time . The calculation of that amplitude appears in the study of the radiative decay of the excited state through spontaneous emission of a photon, hence the decay of a discrete state ; 0 coupled to a continuum of final states ; k ε . Those states represent the atom in state in the presence of a photon with any wave vector k and polarization ε. Now we showed in Complement DXIII (§ 4) that it was possible to obtain a solution of Schrödinger’s equation yielding the amplitude ; 0 ( ) ; 0 at long times (and not only at short times, as for the perturbative solution). This solution is written: ;0

(

) ; 0 = exp [

(

+

)(

) ~] exp [ Γ (

) 2]

(E-9)

where is the energy shift of the state ; 0 due to its coupling with the continuum of final states11 , and Γ the natural width of the excited state (the inverse of the radiative lifetime of that state). We shall assume from now on that the shift is included in the definition of the energy of the state . Starting from the more precise expression (A-14) instead of (A-12) thus leads to a very simple result: we just have to replace, in all the computations of the scattering amplitude, by ~Γ 2. ~Γ 2

(E-10)

Once this replacement has been made, we get, keeping only the amplitude (E-4) and a single relay state , the following scattering amplitude: ¯ (

in )

=

} 2

0

3

ε ~(

D

ε D + Γ 2)

This resonant scattering amplitude no longer diverges when = ; as around , it exhibits a resonant behavior over a range equal to Γ . 11 This

2090

shift is related to the “Lamb shift” of the excited atomic states.

(E-11) is scanned

E. PHOTON SCATTERING BY AN ATOM

E-3.

Inelastic scattering - Raman scattering

We now consider a scattering process where, as before, an incident photon is absorbed and another emitted, but the final atomic state is now supposed to be different from the initial atomic state . E-3-a.

Differences with elastic scattering

Figure 4 shows an example of such a process, called “Raman scattering”, where the energy of the scattered photon is different from that of the incident photon12 . The initial state of the scattering process is, as before, the state in = ; k ε , with energy + ~ ; the final state, however, is now fin = ; k ε where = , with in = energy fin = +~ .

Figure 4: Raman scattering: an atom in state absorbs an incident photon, with energy ~ ; a photon ~ is then spontaneously emitted by the atom, which ends up into a final state different from the initial state . Conservation of total energy requires: +~

=

+~

(E-12)

If , the Raman scattering is called “Raman Stokes scattering”; the energy of the scattered photon is lower than that of the incident photon. If , the Raman scattering is called “Raman anti-Stokes scattering”; the energy of the scattered photon is higher than that of the incident photon. As we assumed here that the mode (k ε ) was initially empty, the scattered photon is emitted spontaneously. The process is then called “spontaneous Raman scattering”. We shall study later the case where (k ε ) photons are initially present, a situation resulting in “stimulated Raman scattering”. Equation (E-12) shows that the angular frequency of the scattered light is different from that of the incident light by a quantity =( ) ~, equal to the frequency of the atomic transition . This means that Raman light spectrum provides information about the eigenfrequencies of the scattering system; this is the base of Raman 12 This

figure only shows a type ( ) process, where the incident photon is absorbed in the first place.

2091

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

spectroscopy. The systems under study are often molecules and the states , are vibrational or rotational sublevels of the ground state, which means that the frequencies belong to the microwave or infrared domain. Instead of measuring directly these frequencies using spectroscopic techniques in a frequency domain where detection might be a problem, it is sometimes more convenient to illuminate the medium with an optical or ultraviolet frequency beam, and to measure the frequency via the frequency shifts of the Raman scattered light. Raman spectroscopy developed considerably once laser sources became available, yielding much higher signal intensities. The detection conditions have also been greatly improved by focusing the incident laser light on very small volumes. The analysis of the scattered light spectrum is now a very powerful tool for analyzing the chemical composition of any scattering media, since each molecule can be identified by its specific vibration-rotation eigenfrequencies. To keep things simple, we have limited our discussion to Raman scattering by atoms or molecules in a dilute medium, where each scattering entity acts individually. Raman scattering is also used in condensed media such as liquids, crystals, surfaces, etc. and provides valuable information on the dynamics of these structures. E-3-b.

Scattering amplitude

The computation of the Raman scattering amplitude is very similar to the one leading to (E-3), (E-4) and (E-5), and yields: fin

¯(

)

in

=

2

(

in )

+

(

in )

(∆ )

(

=

2

(

in )

+

(

in )

(∆ )

(

in )

fin

+~

~ ) (E-13)

where: (

in )

fin

=

D E(

) rel in

rel

D E (+)

in

rel

rel

=

(

in )

~ 2

ε 0

3

fin

=

D E (+)

D +~

rel in

ε

rel

D

D E(

(E-14)

)

in

rel

rel

=

~ 2

ε 0

3

D

ε

D

(E-15)

~

When the photon frequency is close to the transition frequency, Raman scattering becomes resonant and amplitude (E-14) can become very large. As we did for the resonant elastic scattering, to avoid the divergence of (E-14), we just have to replace by Γ 2, where Γ is the natural width of the state. E-3-c.

Semiclassical interpretation

As in § E-1-c, let us consider the dipole induced by the incident field on the scattering object. When that object is a molecule which vibrates and rotates, its polarizability 2092

E. PHOTON SCATTERING BY AN ATOM

changes with time, and is modulated by its rotation and vibration frequencies. The dipole’s oscillations induced by the incident field have an amplitude modulated at the rotation and vibration frequencies of the molecule. The Fourier spectrum of the dipole’s motion contains lateral bands at frequencies shifted from the incident field frequency; these frequency shifts are equal to the molecule’s rotation and vibration frequencies. This semiclassical interpretation accounts for the essential properties of the Raman scattering spectrum. E-3-d.

Stimulated Raman scattering. Raman laser

We now assume the Raman photon appears in a mode that is not initially empty, as = 0 photons already occupy the mode (k ε ). Similarly, we assume several photons ( for example), initially occupy the mode (k ε ). To compute the Raman scattering amplitude ; ; 1 + 1 , we must include the factor ( + 1) in expressions (E-14) and (E-15). Stimulated emission now comes into play: the factor + 1 appearing in the probability expresses the fact that the initial presence of photons in mode (k ε ) stimulates the emission probability of a Raman photon in that mode. Consider now (right-hand side of Fig.5) the inverse scattering process symbolized by ; 1 +1 ; . The corresponding scattering amplitude is simply the complex conjugate of the previous one, meaning that the probability of these two processes are equal. If we start with the same number of atoms in state and state , the number of photons created in one of the processes is equal to the number of photons that

Figure 5: The left-hand side of the figure represents a stimulated Raman process where an atom in level absorbs a photon , and ends up in state after the stimulated emission of an photon. The right-hand side of the figure shows the inverse process, where an atom starting from state absorbs an photon, and falls back in state after the stimulated emission of an photon. These two processes have, a priori, the same probability. However, if the population of state (shown as a large dot on the left-hand side of the figure) is higher than that of the state (shown as a small dot on the righthand side of the figure), the number of processes resulting in the stimulated emission of an photon is bigger than the number of processes where that photon is absorbed. The radiation at frequency is thus amplified by stimulated emission, which allows creating a “Raman laser” at this frequency. 2093

CHAPTER XX

ABSORPTION, EMISSION AND SCATTERING OF PHOTONS BY ATOMS

disappear in the inverse process. What will happen now if we start with an ensemble of atoms where the populations of states and are not equal? If, for example, level has a lower energy than level , the relaxation mechanisms leading to thermal equilibrium tend to create a larger population in than in . The number of scattering processes ; ; 1 + 1 is then larger than the number of the inverse processes ; 1 +1 ; , leading to an amplification of the number of photons. This amplification mechanism is the basis of Raman laser operation. This type of laser is different in two major ways from lasers involving a transition between an upper level, populated by a pumping process, and a lower level (with no relay level). First of all, they do not require a population inversion; the atomic media can be at thermal equilibrium, since the stimulated Raman scattering at the origin of the amplification starts from the atomic state with the largest population. They do require, however, a high intensity radiation at frequency , furnished by another laser called the “pump laser”. Secondly, the frequency of the Raman laser oscillation can be scanned by changing the “pump frequency” , whereas lasers using a two-level system necessarily oscillate at a frequency very close to the atomic transition, and have thus a very small tuning range. Conservation of total momentum If the position R of the atomic center of mass is no longer treated classically, and placed at the origin as we have done until now, we must keep the exponential functions exp( k R) and exp( k R) in the interaction Hamiltonian describing the absorption of an photon and the emission of an photon. The matrix element of the product of those two operators must be taken between an initial state of the center of mass, with momentum ~Kin and a final state with momentum ~Kfin . This yields a (Kfin Kin k +k ) function expressing the global momentum conservation in a Raman process: the momentum of the atom increases by the quantity ~(k k ) during that process. It often happens that the two atomic states and are two sublevels of the same electronic ground state, so that the frequency = ( ) ~ falls in the microwave domain; it is then much smaller than the frequencies and , which are optical frequencies. The energy conservation equation (E-12) then shows that the moduli of the two wave vectors k and k are practically the same. If the two wave vectors k and k have opposite directions, the momentum gained by the atom during a Raman process is equal to ~(k k ) 2~k . The interest of such a Raman process is to couple two states and , energetically very close to each other, by transferring to the atom in one of the two states a very large momentum 2~k , equal to twice that of an optical photon. On the other hand, if the two states and were to be coupled directly by absorption of a single photon in the microwave domain, the momentum transfer would be much smaller. This possibility of coupling two sublevels of the ground state (hence having long lifetimes) by transferring a large momentum to the atom in one of these two states, has interesting applications, in particular in atomic interferometry.

To remain concise, in this chapter we have not treated a certain number of interesting related problems. Among them are multophotonic processes, photoionization, the dressed atom method that facilites the study of light shifts, or the use of photon wave packets. All these subjects are treated in the complements of this chapter.

2094

COMPLEMENTS OF CHAPTER XX, READER’S GUIDE AXX : A MULTIPHOTON PROCESS: TWOPHOTON ABSORPTION

In a multiphoton absorption process, an atom simultaneously absorbs several photons. This complement focuses on the simplest case where two photons are absorbed, while presenting general ideas that also apply to processes involving a larger number of photons. Monochromatic and broad band excitations are successively considered. The very short time the system spends in the relay state violating energy conservation is proportional to the inverse of that energy mismatch.

BXX : PHOTOIONIZATION

In a photoionization process, a photon can remove an electron from an atom, which then becomes an ion (photoelectric effect). This complement studies this process by using a quantum theory of radiation that no longer couples two discrete atomic states but rather a discrete (ground) state to a continuum of (excited) states. Two important cases are considered: a quasi-monochromatic incident radiation, and a broad band excitation. In that second case, this study provides a justification for Einstein’s equation of the photoelectric effect. Lastly, we consider the case where the radiation field is so very intense that the atomic ionization no longer occurs through the absorption of one or several photons, but rather by the tunnel effect.

CXX : TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED ATOM APPROACH

The dressed atom approach is a powerful tool for describing and interpreting higher order effects that appear as a two-level atom interacts with a quasi-resonant radiation. It is valid both in the weak coupling domain (low field intensity) and the strong coupling domain (very high field intensity). An essential parameter is the ratio of the Rabi oscillation (characterizing the coupling with the field) and the natural width of the atomic levels. This general approach allows, in particular, a full understanding of the various properties of light shifts.

DXX : LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

Using light shifts has become a basic tool in atomic physics, as it allows manipulating atoms and photons. A number of applications of such methods are briefly described in this complement: laser trapping of atoms by dipole forces, mirrors for atoms, optical lattices, “Sisyphus” cooling, and one by one detection of photons in a cavity.

2095

EXX : DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

2096

Just as for a massive particle, on can build wave packets for a photon by the coherent superposition of states, each having a different momentum; this leads to a description of radiation propagation in free space, and we have the possibility to model the arrival of a photon on an atom. We obtain a description of the photon absorption or scattering processes that is more realistic than that given in Chapter XX, where the incident radiation is described by a Fock state having a well defined number of photons (and hence without any spatial propagation). We introduce for the photon a function that is not its wave function, but rather yields the probability amplitude that it might be detected at a given point. Absorption and scattering of wave packets are studied, as well as the one- or two-photon detection signals; the case of two entangled photons (parametric down-conversion) is treated at the end of the complement.



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

Complement AXX A multiphoton process: two-photon absorption

1 2

Monochromatic radiation . . . . . . . . . . . . . . . . . . . . . 2097 Non-monochromatic radiation . . . . . . . . . . . . . . . . . . 2101 2-a Probability amplitude, probability . . . . . . . . . . . . . . . 2101 2-b Probability per unit time when the radiation is in a Fock state 2103 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2105 3-a Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . 2105 3-b Case where the relay state becomes resonant for one-photon absorption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2106

3

In the process studied in this complement the atom absorbs, not one, but two photons of energy ~ , to go from a discrete level to another discrete higher energy level 1 ; this process is schematized in Figure 1. For the moment we shall ignore the external degrees of freedom and suppose the atom to be infinitely heavy2 . Conservation of total energy then requires: =~

0

= 2~

(1)

The calculation of the transition amplitude will explicit the role played by this conservation law. 1.

Monochromatic radiation

Studying this process involves the computation of the transition amplitude: fin

(

)

(2)

in

The initial state at time in

=

;

of the system atom + radiation is:

with energy

in

=

+

(3)

~

It describes an atom in state in the presence of photons in mode , with wave vector k , polarization ε and frequency (we assume the radiation is monochromatic). After the absorption process, at time , the photon number is lowered by two units, going from to 2. The final state of the system is: fin

= ;

2

with energy

fin

=

+(

2)~

(4)

The atom-radiation interaction is described, as before, by the electric dipole Hamiltonian (A-30) of Chapter XX, which lowers the photon number by a single unit. This 1 In Complement B XX , we shall see how multiphoton processes can also make an atom go from a discrete state to a final state belonging to a continuum of states (photoionization). 2 When the atom’s mass is finite, there are interesting physical effects arising from the conservation of total momentum; these will be studied later on (cf. § 3; see also § 2 of Complement AXIX ).

2097

COMPLEMENT AXX



Figure 1: During a two-photon transition, the atom goes from state to state by absorbing two photons of energy ~ . The horizontal dashed line represents the energy half-way between the and levels. A third level, the relay level , is also involved in the transition; its energy is not necessarily between those of levels and and it is not shown in the figure. However, we assume that the energy of that relay atomic level is so far from the dashed line that no one-photon resonant transition can occur between and . means that in the expansion (A-21) of Chapter XX for ¯ ( ), the lowest term that gives a non-zero contribution to the transition amplitude (2) is of order two, hence containing two operators . The first brings the system atom + radiation from the initial state in = to an intermediate “relay state” rel

=

1

with energy

rel

=

+(

1)~

(5)

where is any atomic state; the second operator brings the system from this relay state rel to the final state fin = 2 . One must of course sum over all accessible intermediate states . Nevertheless, to keep the computation simple, we shall only take into account a single intermediate state (the summation of the probability amplitudes over several such states does not pose a serious problem). If we insert relation (A-21) of Chapter XX between the bra fin and the ket in , the first two terms on the right-hand side yield zero, and the third one becomes3 : fin

¯ (∆ 0)

in

1 ~

=

2 fin

rel

rel

in

rel



d 0

d

fin

~

rel (

) ~

in

~

(6)

0

Let us write explicitly the argument of the exponents appearing in the integral on 3 We

assumed that the two absorbed photons were identical. If the two absorbed photons 1 and 2 were different, either by their energies, their wave vector directions, or their polarizations (albeit satisfying the conservation of energy ~ 1 + ~ 2 = = ~ 0 ), this would lead to a situation similar to that of § E-1-a of Chapter XX. Two types of processes should then be considered, those where photon 1 is first absorbed, photon 2 next, and those where the photons are absorbed in the inverse order (cf. Figure 2).

2098



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

the second line: [

+(

2)~ ]

The terms in [

[

+(

and

)

[

+

~ ]

(7)

cancel out, leaving the expression:

2~ ]

Choosing and double integral:

1)~ ] (

[

+ ~ ](

=

)

(8)

as the integral variables, the second line of (6) becomes the



0 [

d

2~

]

~

[

d

~

]

~

(9)

0

or else: ∆ [

d

2~

]

~

~

[

0

+~

]

[

1

+~

]

~

(10)

We are dealing with situations where the frequency 2 of the photons is close to the two-photon resonance expressed by the conservation of energy (see relation (1)). In addition, we assume that the process is a real direct two-photon transition, and not successive one photon absorptions. In other words, we assume level cannot absorb a photon in a resonant way and get to the intermediate level ; this means that the energy of that intermediate level must be very different from the half-way energy ( + ) 2 shown as the horizontal dashed line in Figure 1. It is easy to see that the first term 1 in the bracket of (10) does correspond to a two-photon resonant absorption, since its probability amplitude is written: ∆

~ [

+~

]

[

d

2~

2 ~2 +~

[ (the sign (∆ )

]

~

0 (∆ )

]

(

2~ )

(11)

simply means we have ignored an irrelevant phase factor), with:

( )=

1 sin(

∆ 2~)

(12)

The function (∆ ) tends towards a delta function when ∆ , as shown by relation (10) in Appendix II; it expresses an energy conservation satisfying condition (1), within ~ ∆ . On the other hand, the second term in the bracket of (10) introduces, in the sum ~ ] ~ over , an exponential [ that oscillates rapidly as a function of when condition (1) is (exactly or approximately) satisfied4 . This term yields a non-resonant contribution, hence negligible. Its physical significance (sudden branching of the coupling between atom and field) will be discussed in comment (ii) below. We shall ignore it for the moment because of its non-resonant character. This leads us to: fin

¯ (∆ 0)

in

=

2

fin

rel

rel

in

+~

4 In that case, [ ~ ] ~ [ + 2 ] assumed that is very different from the half-sum of

2~ ,

(∆ )

(

2~ )

(13)

whose exponent is necessarily large since we and .

2099

COMPLEMENT AXX



Comparison with relation (B-10) of Chapter XX shows a great similarity between the probability amplitude of a one-photon absorption process and that of a two-photon transition. We go from the first to the second by substituting the variable in the function (∆ ) by the one relevant for the two-photon energy conservation written in (1), and by 5 replacing the matrix element fin in by : fin

rel

rel

in

rel

in

=

~ 2

0

3

(

1)

ε

D +~

ε

D

(14)

This means that we just have to replace, in relation (B-6) of Chapter XX for the transition amplitude, the matrix element by a product of matrix elements divided by an energy difference. Comments (i): Characteristic time of the intermediate transition The transit of the physical system through the intermediate relay state occurs without energy conservation, since it involves a difference = +} with the initial energy. Mathematically, this results in the presence, in the second time integral of (9), of an oscillating term; the larger the energy difference, the more rapid the oscillation. Once that integral is performed, we obtain the bracket appearing in (10), multiplied by a prefactor. This bracket starts from zero at time = 0, then oscillates as a function of the intermediate time . After a time = , which corresponds to one oscillation period, its average value over time equals one, precisely the value we have used for the computation of the probability amplitude. The transit through the relay state brings in a characteristic time = , after which the modulus of the integral over time no longer increases. The larger the departure from energy conservation, the shorter that time is (this short transit through such a relay state is sometimes referred to as a “virtual transition”). The integral over time thus behaves completely differently from the integral over , which, at resonance, increases linearly with time as shown from (11) and (12); this latter integral over may accumulate contributions over much larger times. The limitation of the times that actually contribute to the probability amplitude has a natural interpretation in the context of the Heisenberg time-energy uncertainly relation }. (ii): Physical meaning of the term left out of the transition amplitude We have left out the second term in the bracket of relation (10). Its origin is nevertheless interesting, as it arises from the sudden branching of the coupling between atom and field at time = 0, as assumed in the computation. To see this, we can use a model where the interaction Hamiltonian is replaced by an operator ( ) ; the time dependence of the function ( ) allows introducing an adiabatic turning on of the coupling. It can be shown that the term we had ignored does disappear when turning on the interaction very slowly. A more rigorous description can be obtained by describing the field as a wave packet propagating in space (Complement EXX ), and overlapping the atom only during a limited time. In that case, the interaction Hamiltonian only acts during that overlap time, even 5 We

2100

use for

expression (A-24) of Chapter XX, as well as expression (A-27) for the electric field.



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

though the operator itself is time independent. For a wave packet with a progressive wave front, this approach shows that the term we have ignored does not even come into play.

As for the transition probability, the computation is the same as for a one-photon (2) transition. We get for the transition probability ( ): (2)

()= 2

~ 2~ 0

3

[ (

ε

1)]

D +~

ε

2

D

4 sin2 ( (

0

2 )∆ 2

(15)

2

0

2 )

At short times, it is proportional to the square of (∆ )2 . Finally, if several relay states are involved in the two-photon transition, one must sum expressions (13) and (14) over all the possible intermediate states rel (taking into account the fact they each have different energies rel ); on the right-hand side of (14), this amounts to summing over all accessible intermediate atomic states with energy . Interference effects between amplitudes associated with different intermediate states may then appear in the probability (15). 2.

Non-monochromatic radiation

Consider now what happens when the initial state of the system in contains photons of different frequencies. We are going to show that, just as for one-photon transitions, the two-photon transitions can lead to a transition probability per unit time; however, this probability involves a higher order correlation function (order 4 instead of 2). 2-a.

Probability amplitude, probability

The computation of the probability amplitude is similar to that discussed in § 1; it is based, as before, on the expression of ¯ to second order in . We will carry out the calculation so as to highlight the properties of the time correlation functions of the incident electric field. The radiation initial state is in , its final state fin , and rel is its intermediate state when the atom is in the relay state . The two-photon transition is described by the sequence of the following states for the system atom + photon: ;

in

;

;

rel

(16)

fin

corresponding to the transition amplitude to the lowest order: ;

fin

¯ (∆ 0) ;

in

1 ~

=

2



d

¯ ) D(

fin

¯ (+) (R E

)

rel

0

d

¯ ) D(

rel

¯ (+) (R E

)

(17)

in

0

(R is the atom’s position). By analogy with relation (B-13) of Chapter XX, we set: ¯ ) D( ¯ ) D(

=

(

)

}

with:

=

=

(

)

}

with:

=

e e

(18) 2101



COMPLEMENT AXX

We then get: ; fin ¯ (∆ 0) ;

in

=

∆ (

d

~2

)

}

fin

¯ (+) (R E

e

)

rel

0 (

d

)

}

rel

¯ (+) (R E

e

)

(19)

in

0 (+) Expression (A-29) of Chapter XX for the electric field operator ¯ (R ) is a sum of modes’ contributions, each including the exponential associated with its ¯ (+) (R ) and mode in eigenfrequency. Let us focus on the contribution of mode in E ¯ (+) (R ). It involves the time integrals: E ∆ (

d

)

}

(

d

0

)

}

(20)

0

with an exponent containing: [

} ]

+[

]

}

=[

}

]

}

+[

}

](

)

(21)

reminiscent of result (8) obtained for monochromatic excitation. The computation is then very similar to that of § 1, assuming that the relay state is not half-way between levels and , and that the frequency distribution of the incident photons does not include any of the resonance frequencies for the one-photon transitions and . We make the usual change of variable = , and, in the integral over , we only keep the upper boundary contribution, as we did going from (10) to (13): 0 [

d

}

]

}

}

[

+~

(22)

]

(we discussed in comment (ii) at the end of § 1 the significance of the lower boundary contribution, and why it is justified to ignore it). In addition, we assume the width of the frequency spectrum of the incident photons to be small compared to the one-photon resonance detuning +~ ; consequently, the denominator of (22) does not vary significantly in relative value, and can be replaced by the constant value + ~ ex , where ex 2 is the central excitation frequency. We note ∆ the distance from the resonant absorption of a photon in the intermediate state: ∆

+~

ex

(23) } The replacement of the integral over by ∆ yields an exponential depending on the variable , with argument [ } } ] }. Each summation over the modes and with the exponential factors reconstructs the electric field (+) ¯ E (R ), which leads to: ;

=

fin

¯ (∆ 0) ;

in

=



d 0

2102

0

}

fin

e

~2 ∆ ¯ (+) (R E

)

rel

rel

e

¯ (+) (R E

)

in

(24)



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

where 0 was defined as 0 = ( ) }. Finally, the summation over all the radiation intermediate states rel yields a closure relation and we obtain: ;

fin

¯ (∆ 0) ;

=

in

~2 ∆

rel



d

}

0

e

fin

¯ (+) (R E

)

e

¯ (+) (R E

)

in

(25)

0

This result is similar to the probability amplitude for a one-photon transition, written on the second line of (B-14) in Chapter XX, provided we make the substitution: fin

e

¯ (+) (R E e

fin

~∆

)

in

¯ (+)

E

(R

)

e

¯ (+) (R E

)

in

(26)

We then follow the same line of reasoning as for a one-photon transition. In equality (B-17) of Chapter XX, we must now substitute: (

)

in

e

e

¯ ( ) (R E ¯ (+) (R E

)

)

e e

¯ ( ) (R E ¯ (+) (R E

) )

in

(27)

which is then inserted in (B-18) to yield the transition probability. This probability is given by the Fourier transform, at the angular frequency 0 of the atomic transition, of a correlation function of the field in the initial state, and which involves four field operators (4-point correlation function). This function is in general different from a product of correlation functions involving two field operators (those determining the absorption probability of a single photon). This means that measurements of two-photon transition probabilities yield access to characteristics of the quantum field that are different from those measured in single photon transitions. 2-b.

Probability per unit time when the radiation is in a Fock state

Let us assume now that the radiation is initially in a Fock state such as that described by (B-19) in Chapter XX. In (25), we replace the positive frequency components of the electric fields by their expressions (A-27) of Chapter XX. Only the occupied modes in state in now come into play, since each annihilation operator yields a factor equal to the square root of the mode’s initial population; the other modes give a zero result. We then consider two modes, 1 and 2 , initially occupied in the incident radiation. They yield two contributions (Figure 2) to relation (25): in one of them (term = 1 and = 2 ), the photon 1 is absorbed first and brings the atom from the ground state to the relay state, then photon 2 completes the two-photon transition and brings the atom to level ; in the other (term = 2 and = 1 ), the order of the two absorptions is inverted. These two contributions interfere in the probability: once the amplitude modulus is squared, four terms arise from the cross contributions of the two modes (to which we must add two non-crossed contributions = = 1 2 where only one mode is involved). 2103

COMPLEMENT AXX



Figure 2: Two diagrams schematizing a two-photon transition with a multimode source where two modes 1 and 2 are initially occupied. In the left-hand side diagram, photon 1 is absorbed first, bringing (in a non-resonant fashion) the atom from the initial sate to the relay state ; photon 2 then completes the (resonant) two-photon transition. In the right-hand side diagram, the order of absorption of photons 1 and 2 is inverted. These two diagrams describe probability amplitudes that interfere when computing the two-photon transition probability. The same line of reasoning as in § B-2-a of Chapter XX, and summarized in Figure 1 of that chapter, can be followed here. We assume that the 4-point correlation of the field goes rapidly to zero when is larger that a value 1 ∆ that is small compared to ∆ . One can then show that the transition probability becomes proportional to ∆ , and that the two-photon transition probability per unit time can be written: (2) (2)

(∆ ) ∆

= =

2 ~2

2

~∆

1 4

2 0

2

6

(e

( ε ) (e

ε )

2

(

+ 1)

(~ ) (~

+

0)

) (28)

The delta function at then end of this expression obviously expresses total energy conservation: for the atomic transition to occur, the sum of the energies of the absorbed photons must be equal to the energy of the transition. As expected, the probability includes the photon populations that satisfy this condition. A general property is that, for = (photons absorbed from two different modes), it is the average value of the product of the mode populations that appears in the two-photon transition probability, and not the product of the average values which different in general (they are nevertheless equal in the special case of a Fock state of the radiation). For = (two photons absorbed from the same mode, as in § 1), it is the average value ( 1) that comes into play; this value equals zero if only one photon is present in the mode, as obviously one single photon cannot induce a two-photon transition. 2104



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

Comment: There also exists 3- ,..., -photon transitions, corresponding to an energy conservation relation = ~ . The corresponding transition rates are proportional to field correlation functions of order 6, .., 2 .

3.

Discussion

Even though the transition amplitudes for one- and two-photon absorption processes are similar, the second type of processes has a number of specific features we now discuss. 3-a.

.

Conservation laws

Total energy conservation

As we just saw, the function (∆ ) ( 2~ ) appearing in (13) expresses the conservation of total energy. When the atom absorbs two photons to go from state to state , its gain in energy equals the sum 2~ of the energies of the two absorbed photons.

.

Total momentum conservation

In the computation leading to the two-photon transition amplitude, the external variables have been ignored. To take them into account, we must assume the atomic center of mass is at point R and keep the exponentials exp( k R) appearing in the operators E (+) (R) in the two interaction Hamiltonians ; as these two exponentials are multiplied by each other, they yield the operator exp(2 k R). We must also include in the initial and final states the quantum numbers Kin and Kfin characterizing the initial ~Kin and final ~Kfin momenta of the center of mass. We then get in the transition amplitude an additional term: Kfin exp(2 k

R) Kin = (Kfin

Kin

2k )

(29)

which shows that the atom’s momentum increases by 2~k when it absorbs two photons. Comment: Imagine the atom is excited by two light beams 1 and 2, having the same frequency 2 but propagating in opposite directions. The previous computations must then be generalized to the case where the two photons absorbed by the atom belong, one to beam 1, and the other to beam 2. The momenta +~k and ~k of these two photons are then opposite and the total momentum gained by the atom during the transition is zero. As the Doppler effect and the recoil effect are linked to the variation of the atomic momentum during the transition (see Complement AXIX ), it follows that the two-photon absorption line does not present any Doppler broadening nor any recoil shift. Such a situation presents many advantages for high resolution spectroscopy, and is used for example in the study of the two-photon transition between the 1 and 2 states of the hydrogen atom.

2105



COMPLEMENT AXX

.

Conservation of total angular momentum and parity

Expression (14) appearing in the two-photon absorption amplitude is a product of two matrix elements of a component of the atomic electric dipole – which is a vector operator – and an energy denominator. In a rotation of the atom, this expression will be transformed as the product of two vectors, since the energy denominator is rotation invariant. Vectors are irreducible tensor operators (Complement GX , exercise 8) of order = 1. Consequently, expression (14) may be expanded6 as a sum of components with total angular momentum = 0 1 2. Using the Wigner-Eckart theorem (Complement EX ), we can show that the two-photon absorption amplitude between two levels with quantum numbers and (where and are the components of F on the axis) is different from zero only if: =

2

1 0

=

2

1 0

(30)

In addition, the electric dipole operator appearing in (14) is an odd operator, as it is proportional to the electron position operator. Consequently, the initial and final states of the two-photon transition must have the same parity, and a parity inverse to that of the relay state. These selection rules can be applied to the 1 2 transition of the hydrogen atom, that occurs between two states having the same parity and a total angular momentum difference equal to 1 at most (the 1 2 spins of the electron and the proton are taken into account). For electric dipole transitions, a two-photon transition 1 2 is allowed, whereas it is forbidden for a one-photon transition. 3-b.

Case where the relay state becomes resonant for one-photon absorption

In the denominator of expression (14), we have the quantity: ~∆

=

+~

(31)

which is the difference between the energy of the atomic state , increased by ~ , and the energy of the relay sate . If ∆ goes to zero, we get a divergence and the expressions we have obtained become meaningless. In the computation, we did explicitly assume that the intermediate level was not resonant for a one-photon absorption, so that this divergence should not occur. Let us examine, however, what would be involved if ∆ were to go to zero. As the resonance condition for the two-photon transition is written = + 2~ , the condition ∆ = 0 means that the atomic relay level7 is exactly 6 The product of two irreducible tensor operators and can be decomposed as the product of two kets with angular momentum and , hence involving Clebsch-Gordan coefficients. This means that, according to the general results of Chapter X on the addition of angular momenta, a product of irreducible tensor operators can be decomposed as the sum of other irreducible tensor operators of order , where varies between and + . In the particular case where = = 1, we get three possible values = 0 1 2. 7 We assume here that the relay state is discrete. If it belongs to a continuum, the sum over this relay state in (14) becomes an integral over . An adiabatic branching calculation then introduces a fraction 1 ( +~ + ) with 0, which can then be expressed in terms of ( +~ ) and (1 ( +~ )), where is the Cauchy principal value. This calculation yields, after integration over , functions of in = + 2~ which have no reason to diverge in the vicinity of in fin .

2106



A MULTIPHOTON PROCESS: TWO-PHOTON ABSORPTION

half-way in energy between and , the initial and final atomic levels. This means that, starting from level , a resonance would occur for both a two-photon and a one-photon process. Is it possible to study the case where ∆ is zero or very small, while avoiding the divergence of the two-photon absorption? A method to overcome this difficulty is to note that state has a finite lifetime , because of spontaneous emission of photons from that state. As we did before in § E-2 of Chapter XX, when studying the resonant scattering of a photon by an atom and where similar divergences appeared, we can show that it is legitimate to replace the energy of state by ~(Γ 2), where Γ = 1 is the natural width of level . The denominator of the transition amplitude no longer goes to zero and the divergence disappears. The transition amplitude still varies significantly over an interval of width Γ when varies around +~ . Replacing by (~Γ 2) leads to valid results only if the matrix elements ; 1 ; and ; 2 ; 1 , characterizing the coupling of the field with the atom for the and transitions, are small compared to ~Γ . If this is not true, we cannot limit the computation to the lowest field order. We must then diagonalize the Hamiltonian of the global system atom + field within the subspace of the states which, in the absence of coupling, are very close to each other8 . When no relay state is resonant, the subspace is two-dimensional; it is spanned by the two states ; and ; 2 . When one relay state becomes resonant, we must include the state ; 1 in the subspace – which then becomes three-dimensional. To study the dynamics of the system, we must diagonalize the matrix: ;

+ ~ 1 ; 0

; ;

; +( 2

1 1)~ ;

0 ; 1

1 +(

; 2)~

2

(32)

This general treatment allows taking into account simultaneously the one- and twophoton transitions. Concluding this complement, let us emphasize that the two-photon transitions involve a physical process different from the mere succession of two one-photon absorptions. We stressed in the discussion of § 1, and in particular in its two comments, the difference between populating the final state, which is cumulative in time and conserves the energy, and a transit through an intermediate relay state, which can only last a very short time ∆ , limited by the non-conservation of energy. It is also noteworthy that the two-photon transition amplitude can take a form very similar to that of a one-photon transition; the only major change is the replacement of the matrix element to first order in the interaction, by a second order matrix element, divided by an energy defect factor in the relay state. These concepts can be generalized to higher order processes: similar techniques can be used to evaluate three-, four-, etc.. photon transition amplitudes.

8 Such a description of the atom + radiation interactions is called the “dressed atom method” (see for example Chapter VI of reference [21]). In Complement CXX , this method is applied to the problem of a two-level atom interacting with a strong field. The eigenstates of the total Hamiltonian restricted to the subspace are called the “dressed states”.

2107



PHOTOIONIZATION

Complement BXX Photoionization

1

2

3

4

5

Brief review of the photoelectric effect . . . . . . . . . . . . 1-a Interpretation in terms of photons . . . . . . . . . . . . . . . 1-b Photoionization of an atom . . . . . . . . . . . . . . . . . . . Computation of photoionization rates . . . . . . . . . . . . . 2-a A single atom in monochromatic radiation . . . . . . . . . . . 2-b Stationary non-monochromatic radiation . . . . . . . . . . . . 2-c Non-stationary and non-monochromatic radiation . . . . . . 2-d Correlations between photoionization rates of two detector atoms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Is a quantum treatment of radiation necessary to describe photoionization? . . . . . . . . . . . . . . . . . . . . . . . . . . 3-a Experiments with a single photodetector atom . . . . . . . . 3-b Experiments with two photodetector atoms . . . . . . . . . . Two-photon photoionization . . . . . . . . . . . . . . . . . . . 4-a Differences with the one-photon photoionization . . . . . . . 4-b Photoionization rate . . . . . . . . . . . . . . . . . . . . . . . 4-c Importance of fluctuations in the radiation intensity . . . . . Tunnel ionization by intense laser fields . . . . . . . . . . . .

2110 2110 2111 2112 2112 2113 2116 2117 2118 2118 2119 2123 2123 2124 2125 2126

All the atomic processes of absorption, emission or scattering of photons studied in Chapter XX involved transitions between two discrete states of the atom. In addition to discrete levels, atoms also have continuums of energy levels. The most directly accessible one is the simple ionization continuum, which corresponds to the loss of a single electron by the atom (ionization). This continuum starts at an energy threshold above the ground state energy, and extends over all energies larger than this threshold. This energy is called the “ionization energy”. The aim of this complement is to study the “photoionization” process where incident radiation takes the atom from its ground state to a state belonging to the ionization continuum. Once the atom’s electron has reached the ionization continuum, it can travel an arbitrary distance from the remaining ion; it has been ejected from the atom by the incident radiation. Such a process is reminiscent of the photoelectric effect where radiation ejects an electron from a metal. This is why we shall review in § 1 a few properties of the photoelectric effect to underline its analogies with photoionization. We shall then use quantum theory to compute, in § 2, the probability per unit time for the incident radiation to photoionize an atom. We shall assume that the incident radiation spectrum is entirely above the ionization threshold, so that no resonant absorption can bring the atom to a discrete excited state. Since the emitted electron can be amplified in a photomultiplier, the atom can play the role of a photodetector. In the case where only one photodetector D is used (§ 2-a), the computations are very similar to those exposed in Chapter XX; the only differences arise from the continuous character 2109

COMPLEMENT BXX



of the final atomic state. Another interesting situation occurs when two detectors D1 and D2 are placed in the radiation field at points R1 and R2 (§ 2-d) and when we focus on the correlations between their signals. For example, we shall compute the probability per unit time to observe a photoionization at R1 at time 1 and another one at R2 at time 2 . One may wonder if a quantized radiation theory is needed to quantitatively account for the photoionization processes. Could a semiclassical theory suffice to describe the ionization of one or several quantized atoms by a classical field? In other words, can the photoelectric effect be explained “without photons” [45]? This question will be discussed in § 3. The atom can also be photoionized by the absorption of a number of photons , larger than one. These processes are called “multiphoton ionization” and play an important role in experiments using high intensity laser sources. In § 4, we shall give an idea of how to compute the rates of those processes for = 2. We shall then briefly mention in § 5 another mechanism for atoms’ ionization, based on a tunnel effect, and occurring when the incident radiation electric field becomes of the order of the Coulomb field between an atom’s electron and the nucleus. 1.

Brief review of the photoelectric effect

In 1905, Albert Einstein [43] introduced for the first time in physics the concept of “light quanta”, which we now call photons. Considering the great analogy between certain statistical properties of black body radiation and those of an ideal gas of particles, Einstein proposed the idea that radiation was in fact composed of discrete quanta, each having an energy . In view of the successes of the wave theory of light, this return to a particle description seemed totally unrealistic for most physicists at the time. Energy quantization had indeed been introduced a few years earlier by Max Planck to account for the spectral distribution of black body radiation, but it was the exchanges of energy between matter and radiation that were quantized, not the radiation itself. 1-a.

Interpretation in terms of photons

In that same 1905 article, Einstein used the concept of light quanta to give a new description of the photoelectric effect. In this process, an electron is ejected from a metal irradiated by light. Einstein postulated that the energy of a light quantum from the incident beam was absorbed by an electron in the metal, hence allowing it to escape from the metal. This escape requires an energy at least equal to that of the binding energy of the electron in the metal. The frequency of the light beam must therefore be larger than a threshold value given by = . If , no electron can be ejected. If , the energy surplus provides the electron with a kinetic energy 2 2. This interpretation leads to Einstein’s equation: kin = 1 2

2

=

if

(1)

giving the kinetic energy of the ejected electron as a function of . It means, in particular, that the kinetic energy of each electron depends only on the frequency of the light beam, 2110



PHOTOIONIZATION

Figure 1: Photoionization of an atom. State is the ground state, state one of the discrete states. The continuum of states belonging to the continuous part of the spectrum (ionization continuum) starts at an energy above the ground state. Energy is called the “ionization energy”. As the atom in state absorbs a photon with frequency such that , it goes into a state , which is part of the ionization continuum. The electron is ejected, and its kinetic energy kin (when it is far enough from the ion formed as it left the atom) is equal to the difference between and . and not on its intensity1 (which, on the other hand, determines the number of ejected electrons per unit time). Equation (1) also tells us that if is varied and one plots 2 the variation of 2 as a function of , one should get a straight half-line with slope , starting from the abscissa axis at point . All these predictions generated a certain skepticism and it was not until several years later (1913) that an experimental confirmation of the predictions of equation (1) was obtained by the work of R. Millikan and H. Fletcher on the photoelectric effect [44]. 1-b.

Photoionization of an atom

Figure 1 represents the ground state of an atom, one of the states from the discrete part of the spectrum, and the continuum part of the spectrum which starts at a distance above the ground state. The origin of the energies is often chosen at the beginning of the continuum, and hence the discrete states have a negative energy. When it is in the positive energy states, belonging to the continuum and called “scattering states”, the electron is no longer bound to the nucleus, although it is still attracted to it. Consider an atom in the ground state . A photon with energy has different ways to bring energy to the atomic electron. If , this photon can be absorbed only if coincides with the frequency =( ) of a transition between state 1 A classical theory would tend to predict that the higher the light intensity, the more energy could be furnished to the electron, thereby increasing its acceleration.

2111



COMPLEMENT BXX

and state belonging to the discrete spectrum. This is the absorption process between two discrete states already studied in Chapter XX. On the other hand, if , the atom can always absorb the photon and end up in a state within the continuum. The electron is no longer bound and can move away from the remaining ion formed once the electron has left. When the electron is far enough from the ion for their Coulomb interaction energy to be neglected, its kinetic energy is given by: 1 2

2

=

if

(2)

This is the photoionization process of an atom. Equation (2) is the generalization for an atom of equation (1) introduced by Einstein for the photoelectric effect in a metal. 2.

Computation of photoionization rates

Let us now see how to adapt the computations of Chapter XX to the calculation of a photoionization rate. 2-a.

A single atom in monochromatic radiation

We start from expression (B-7) of Chapter XX for the probability that the system, leaving at = 0 the initial state in = ; (atom in state in the presence of photons ), ends up at time ∆ in the final state fin = ; 1 (atom in state with an energy ~ 0 above , and one photon less in mode ) : fin

¯ (∆ 0)

in

2

=

1 ~2

fin

in

2

4 sin2 [( [(

0

)∆ 2]

(3)

2

0

) 2]

In Chapter XX, we used this expression for studying the case where states and are both discrete states. It is nevertheless still valid when belongs to a continuum; its interpretation, however, is different. Whereas the probability of finding the atom in a discrete final state makes sense, from a physical point of viewn, when dealing with a continuum we must compute the probability of finding the atom within a non-zero energy interval. We must then sum probability (3) over states . As varies, 0 = ( ) ~ varies, and so does the matrix element of . However, for large enough ∆ , the variation of the matrix element is much slower than that of the ratio in the right-hand side of (3). This ratio is the square of a diffraction function, whose maximum equals ∆ 2 for 0 = and whose width is of the order of 1 ∆ . The area under this function is thus of the order of ∆ 2 (1 ∆ ) = ∆ . Compared to functions of 0 with slow variations over an interval of the order of 1 ∆ , this function behaves, within a proportionality factor, as the product ∆ ( 0 ). It follows that the sum over of (3) is proportional to ∆ , meaning we can define a probability per unit time for the atom to reach the continuum, i.e. a photoionization rate. The proportionality factor between a diffraction function and a delta function is given by relation (11) in Appendix II, which is written: lim



2112

sin2 [( [(

0

)∆ 2] 2

0

) 2]

=



(

0

2

)=2 ~∆

(

~ )

(4)



PHOTOIONIZATION

Inserting (4) into (3), summing over , and dividing by ∆ , finally yields the photoionization rate: 1 ∆

fin

¯ (∆ 0)

in

2

=

2 ~

=

2 ~

fin

fin =

in

in

2

2

(

(

~ ) +~ )

(5)

+~

where ( + ~ ) is the density of states in the continuum around the energy +~ . This expression is just a consequence of the Fermi golden rule (Chapter XIII, § C-3-b) applied to the coupling between the discrete level ; and the continuum ; 1. It is also reminiscent of expression (C-37) of Chapter XIII which yields the transition probability per unit time between a discrete atomic state and a continuum, with an excitation induced by a classical wave described by a time-dependent sinusoidal function. This point will be discussed further in § 3. 2-b.

Stationary non-monochromatic radiation

We now consider a single atom interacting with non-monochromatic radiation, described by a spectral distribution ( ). We first assume that the field statistical properties are time-invariant. We shall consider the case of non-stationary radiation in § 2-c. .

Field and atomic dipole correlation functions

abs In § D-1 of Chapter XX, we obtained the expression for the probability (∆ ) for the atom to go, through the absorption of a photon, from state to any state different from , after a time ∆ . This probability is given by relation (D-4) of that chapter as a double integral of a sum of products of field and atomic dipole correlation functions. We first examine the correlation functions of the atomic dipole. As in relation (B-13) of Chapter XX, we write the matrix elements of this dipole, in the Heisenberg picture (with respect to the Hamiltonian of the free atom), as:

¯ ) D(

=

=

e

D

=

(6)

with: (7a)

where e is the unit vector parallel to the vector =e

=

e

D

D

. Since e

e = 1, we have: (7b)

For the sake of simplicity, we shall assume the unit vector e to have the same direction for all the states . Comment: This vector e is, a priori, not the same for all the states related to by matrix elements of D. The direction of that vector actually depends on the rotation symmetry properties

2113

COMPLEMENT BXX



of the two states2 . One can sort the different states by categories having the same symmetry (hence the same direction for e ) and for which the computations presented hereafter are valid. One must then add all the ionization probabilities calculated for each category.

Since the field always appears in a scalar product with D, assuming that e has the (+) ( ) same direction for all states implies that only the scalar products (and ) of the fields with e (and with e ) appear in the correlation functions of the field. The calculation is then very similar to that of § D-1 of Chapter XX, and we obtain:

abs

(∆ ) =

1 ~2





d 0

d

(

)

(

)

(8)

0

where: 2

(

)=

(

)=

in

=

in

e

(

)

¯ ( ) (R E

¯ ( ) (R



(+)

(9)

) (R

¯ (+) (R E

e )

in

)

in

(10)

In this last equality, in is the radiation initial state. As this state is stationary, its properties are invariant under time translation; consequently, the correlation function depends only on the difference . The atomic correlation function (9) can also be rewritten as: (

)=

d

˜ ( )

(

)

(11)

where: ˜ ( )=

2

(

)

(12)

The quantity ˜ ( ) represents the spectral sensitivity of the “photodetector atom”, that is the variation with of the transition intensities from the ground state to a level in the ionization continuum, at an energy ~ above . We shall assume here that the width ∆ of the function ˜ ( ) is much larger than the bandwidth ∆ of the incident radiation. ∆ 2 The



(13)

ground state is an eigenvector of the angular momentum component along the quantization axis , with eigenvalue }; for the state , the eigenvalue is }. The to transition thus corresponds to a variation ∆ = . If ∆ = 0, symmetry arguments show that e is parallel to ; if ∆ = 1, e is in the plane perpendicular to : e = (e e ) 2; we note that the complex conjugate of this vector appears in the matrix element (7b).

2114



PHOTOIONIZATION

Such a condition defines what we shall call a “broadband photodetector”. The field correlation function (9) has already been calculated in § B-2-a- of Chapter XX assuming the radiation initial state is a Fock state 1 , or a statistical mixture of such states with weights ( 1 ) – see equations (B-20) and (B-21) of Chapter XX. For the problem we are studying now, i.e. stationary non-monochromatic radiation, we can use the same assumption for the radiation initial state. .

Photoionization rate

To transform equation (8), it is useful to study in more detail the dependence of ( ). We assume that the spectral distribution is centered around a non-zero value ex , and that this distribution is entirely above the ionization threshold. Since the (+) ( ) field ¯ varies in e , and the field ¯ in e , we can then write: (

ex (

)=

)

(

)

(14)

where ( ) is an “envelope” function whose Fourier transform is a function centered at = 0 and of width ∆ . This envelope function varies very slowly over time intervals short3 compared to 1 ∆ . For = , i.e. for = 0, equation (14) leads to: (0) = =

(0) in

¯ ( ) (R



(+)

(R

)

in

=

(15)

where is the radiation intensity (which is time-independent since the radiation state is supposed to be stationary). We shall see in the next section how to generalize our results to non-stationary radiation. Let us go back to the double integral of (8) and assume that the integration interval ∆ satisfies the condition ∆ where =1 ∆ is the detector correlation time. In the plane , the function to be integrated in (8) is different from zero only in a band along the first bisector (Figure 1 of Chapter XX), of width very narrow compared to ∆ . If we change the integral variables and to the variables and = , we can neglect the variation of ( ) since, according to (13), 1 ∆ , and use (15) to rewrite (14) as: (

)=

ex (

)

(

ex (

)

(0) =

) ex (

)

(16)

The double integral of (8) is easily performed with the new variables and . Using expression (3) for ( ), the integral over of a function that no longer depends on this variable introduces a simple factor ∆ . We are then left with the integral over which leads to: abs

(∆ ) =

1 ~2

+



d =2 ¯ (

3 This is not the case for ( time intervals of the order of 1 ∆

ex

( )

(17)

ex )

), because of the exponential , since ex ∆ .

ex (

)

that varies a lot over

2115



COMPLEMENT BXX

As this probability is proportional to ∆ , we can define a photoionization rate: phot

=

1 ∆

abs

(∆ ) =

2 ~2

¯ (

ex )

(18)

This rate is proportional to the incident intensity and to the spectral sensitivity ˜ ( of the photodetector, evaluated at the radiation central frequency ex . 2-c.

ex )

Non-stationary and non-monochromatic radiation

For non-stationary radiation, the initial radiation state is no longer a Fock state or a statistical mixture of Fock states; it is rather a linear superposition of such states, creating wave packets such as those described in Complement DXX . The radiation correlation function ( ) is no longer a function of the single variable , but depends on both and . One can still assume that the frequencies appearing in ( ) are centered around ex in an interval of width ∆ , which permits generalizing expressions (14) and (15) to: (

)=

(

)=

ex (

)

(

(R

)

)

(19)

4

and : ( ) in

(+)

(R

)

=

in

(

) = (R

)

(20)

where (R ) is the intensity at point R and at time . Using the expansions in and for the field operators appearing in (10), we get an expression for ( ) that generalizes equation (B-20) of Chapter XX to nonstationary fields: (

)= ~ 2 0

3

(e

)(e

)

in

in

(k

R

) (k

R

)

(21)

For fixed values of k and , the summation over of (21) represents a wave packet of central frequency ex , and whose envelope passes by a point R in a time interval of the order of 1 ∆ . If varies over a time interval ∆ 1 ∆ , the variation of the envelope can be neglected. Similar conclusions are valid for a summation over of relation (21), with fixed values for k and . Let us now go back to the double integral in (8). As the phenomenon now depends on , the integration interval will be taken between and + ∆ (instead of between 0 and ∆ ). We assume that ∆ satisfies the condition : 1 ∆

1 ∆



(22)

which is possible when (13) is taken into account. Since ∆ 1 ∆ , we can neglect in (19) the variation of ( ) when and vary in the integration domain; we therefore replace ( ) by: (

)=

( ) in

(R )

(+)

(R )

in

= (R )

(23)

4 From now on, we simplify the notation by omitting the bar over operators in the Heisenberg picture, since the explicit time dependence is sufficient to indicate this point of viewn.

2116

• Using this equality in (19) we get, in the integration interval between (

ex (

)

)

(R )

PHOTOIONIZATION

and + ∆ : (24)

The computations are then quite similar to those carried out above for a stationary field: the integral over leads to a ∆ term; the integral over = is equal to: +

d

( )=2 ˜ (

ex

ex )

(25)

We call (R )∆ the probability that a photodetector atom, placed at R, will undergo a photoionization between times and + ∆ . We get the result: ( )

(R ) =

in

(R )

(+)

(R )

(26)

in

where = 2 ¯ ( ex ) ~2 is a factor characterizing the photodetector sensitivity at the radiation central frequency ex . The atomic photoionization rate is thus a signal that constantly follows the time variations of the incident radiation intensity, written in (20). 2-d.

Correlations between photoionization rates of two detector atoms

The previous computations can be generalized to analyze other experiments where two detector atoms are placed at R1 and R2 , and where we study correlations between the photoionizations observed on those two atoms at times 1 and 2 . More precisely, let us call (R2 2 ; R1 1 )∆ 1 ∆ 2 the probability to detect a photoionization at R1 between 1 and 1 + ∆ 1 and another one at R2 between 2 and 2 + ∆ 2 . Computations very similar to those performed above, and which will not be explicited here (for more details, see Complement AII of [21]), lead to: (R2

2 ; R1

2 in

1)

=

( ) d (R1

1)

( ) d (R2

2)

(+) d (R2

2)

(+) d (R1 (+)

1)

in

(27) ( )

It is easy to understand why two operators E preceded by two operators E appear in (27). The double photoionization rate is computed from a probability that is the modulus squared of a probability amplitude for a photon to be absorbed at R1 1 (+) and another one at R2 2 . This amplitude must contain a product of two operators E . ( ) Its conjugate must contain two operators E arranged in inverse order. We therefore ( ) (+) should find in (27) two operators E followed by two operators E with different orders of R1 1 and R2 2 . There is a great analogy between the simple and double photoionization rates and given by equations (25) and (27) and the correlation functions 1 and 2 studied in Chapter XVI. The functions 1 and 2 give the probability densities of finding a particle at r1 1 for 1 , or a particle at r1 1 and another one at r2 2 for 2 . Note, however, that 1 and 2 give the probability of finding one or two particles at specific points, whereas we are now dealing with the probability of photoionization of atoms placed at specific points. The field operators appearing in (26) and (27) are the positive or negative frequency components of the electric field, since these are the operators describing the emission or absorption of photons. 2117

COMPLEMENT BXX

3.



Is a quantum treatment of radiation necessary to describe photoionization?

In a semiclassical treatment of photoionization, the radiation field is described as a classical field, while the atom follows a quantum treatment. The atom-field coupling is then a time-dependent perturbation that can induce transitions between a discrete atomic state, such as the ground sate , and a state , part of the ionization continuum. Does such a treatment yield the same results as those obtained in a quantum treatment? We are going to show that, while this is often the case, this is not always true. 3-a.

Experiments with a single photodetector atom

In the simple case of an oscillating monochromatic classical field, with frequency , the transition probability per unit time takes a form that is reminiscent of the Fermi golden rule, valid for a constant perturbation coupling a discrete state to a continuum5 . In the more general case where the classical field is non-monochromatic but stationary, one can follow the same line of computations that led to equation (8). This shows that the transition probability from a discrete state to any state of the continuum can still be expressed as the integral of the product of two correlation functions: one, ( ), for the atomic dipole, another, ( ), for the radiation. In both cases, the field appears in the transition probability only via a correlation function. For the classical case, the quantum average value (10) must be replaced by the product of the negative and positive frequency components of the classical field: (

)=

( ) d (R

)

(+) d (R

)

(28)

Note that in this relation, in order to distinguish the quantum fields from the classical fields, these latter fields are written with curly letters; the subscript means that the field has been projected onto the polarization unit vector defined in (7a). Expression (28) has been obtained for a perfectly known classical field. Another possibility is that the classical field is only known in a probabilistic sense, as is the case with a classical statistical mixture of fields, with given probabilities. The transition probability (8), where the correlation function ¯ ( ) is replaced by ( ), must then be averaged over all the states of the statistical mixture, which amounts to replacing (28) by: (

)=

( ) d (R

)

(+) d (R

)

(29)

where the bar above the product of the two fields symbolizes the statistical average. It seems that for all signals involving one single photodetector atom, the quantum predictions are identical to those of a semiclassical theory, using a classical field with the same correlation function as the quantum field. In particular, for a stationary field, the Fourier transform of the field correlation function is simply the spectral distribution ( ) of that field. For a quantum field, this property was established in Chapter XX – see relation (B-20). For a classical field, this property is a consequence of the Wiener-Khintchine theorem. It follows that the photoionization probability of an atom is the same, whether it is computed with a stationary quantum or classical field, as long as they both have the same spectral distribution. 5 See

2118

for example § C-3 of Chapter XIII, and relation (C-37) in particular.



PHOTOIONIZATION

Comment The equivalence of the predictions of the two theories is also valid for a non-stationary field. The semiclassical theory predicts that the photoionization rate at time is proportional to ( )

(+)

the classical field intensity d (R ) d (R ), which now depends on since the field is no longer stationary. We shall see in Complement DXX that a similar result is obtained in quantum theory: the photoionization probability at time of an atom receiving a one-photon wave packet is again given by the modulus squared of a function, which can be considered to be the photon wave function, evaluated at = . 3-b.

Experiments with two photodetector atoms

The same line of computations that led to equation (27) can be followed for a classical field. It leads to an expression similar to (27), where the field quantum correlation function is replaced by the statistical average of the product of two negative frequency components and two positive frequency components of the classical field: ( ) d (R1

( ) 1 ) d (R2

2)

(+)

(R2

2)

(+)

(R1

1)

(30)

As classical fields commute with each other, expression (30) can be rewritten as: (R1

1)

(R2

2)

(31)

where: (R1

1)

=

( ) d (R1

1)

(+)

(R1

1)

(32)

is the classical field intensity at point R1 at time 1 and with a similar equation for (R2 2 ). Correlations between the photoionization rates of the two photodetector atoms thus involve, in the semiclassical theory, the product of the intensities arriving on both photodetectors. The correlation function of the field amplitude is replaced by the correlation function of the intensity. .

Situations where a semiclassical treatment is adequate

A first situation where quantum and semiclassical predictions agree is the case where the field state is a coherent state described by the set of classical normal variables (Chapitre XIX, § B-3-b). Each mode is in a coherent state , meaning the state is an eigenket of the operator E (+) (R ) with eigenvalue (+) ( R ) equal to the classical field corresponding to the set of classical normal variables . In a similar way, the bra is an eigenbra of E ( ) (R ) with eigenvalue ( ) ( R ). For a coherent state of the field, the rate written in (27) therefore becomes: égal à: (R2 =

2

=

2 ( )

2 ; R1

1)

( ) d

(

(R1

R1

1)

1)

( ) d ( )

(

(R2

2)

R2

(+) d (R2 2 ) (+) ( 2)

(+) d (R1

R2

2)

1) (+)

(

R1

1)

(33)

The quantum result for does coincide with the semiclassical prediction. The same conclusion holds when the state of the quantum field is a statistical mixture of coherent states with statistical weights ( ). 2119

COMPLEMENT BXX



Another situation where the quantum and semiclassical predictions agree is the case of a thermal field. In the quantum description, Wick’s theorem (Complement CXVI ) allows expressing the four-point correlation function appearing in as the sum of products of two-point correlation functions. In a similar way, in the semiclassical theory, the thermal field is a Gaussian random field, and here again, the classical four-point correlation function is the sum of products of two-point correlation functions. Provided we use the same two-point correlation functions in both theories, their predictions agree. Comment The interferometric analysis of the electric field of the light emitted by the stars (to measure their angular diameter) is confronted with the problem of atmospheric fluctuations, which introduce a random phase shift between the two arms of the interferometer. The analysis of intensity correlations is much less sensitive to these fluctuations. As different parts of the stars emit incoherent waves, the total field received from the star is Gaussian, and the result we just established shows that analyzing intensity correlations allows obtaining two-point correlation functions and hence the same information as the one contained in a field correlation measurement. The validity of such a method, based on intensity correlations, was experimentally demonstrated in 1956 by Robert Hanbury Brown and Richard Twiss [46].

.

Situations requiring a quantum radiation treatment

An example of a situation where a quantum treatment of the radiation becomes essential is shown in Figure (2). A one-photon wave packet is emitted by atom ; this wave packet is described by (cf. Complement DXX ). It then goes through a beamsplitter LS that divides it into two wave packets: a transmitted wave packet and a reflected wave packet , which then arrive on two detectors 1 and 2 .

Figure 2: Atom A emits a photon described by a wave packet . This wave packet goes through a beamsplitter that divides it into a transmitted wave packet and a reflected wave packet , which then arrive on two detectors 1 and 2 . The quantum and semiclassical predictions concerning the correlations between the signals detected on 1 and 2 are significantly different (see text). 2120



PHOTOIONIZATION

In a quantum description of the radiation, the radiation state after crossing the beamsplitter is still a one-photon state, described by the linear superposition of the two one-photon wave packets and , that is: =

+

(34)

In the expression (27) for the rate giving the probability of a photoionization at time 1 of detector 1 at R1 , and a photoionization at time 2 of detector 2 at R2 , appears the squared norm of the ket: (+) d (R2

2)

(+) d (R1

1)

(35) (+)

where is given by (34). The first operator d (R1 1 ), which destroys a photon, yields the vacuum 0 when it acts on state that contains only one photon. The (+) second destruction operator d (R2 2 ) will then yield 0 when acting on the vacuum. The two detectors 1 and 2 cannot both undergo a photoionization. This result was, a priori, obvious: a single photon cannot produce two photoionizations. If, on the other hand, the radiation emitted by the atom is described classically, the two wave packets and are classical wave packets which can ionize the detectors and they encounter. 1 2 Single photon sources are not easy to fabricate. An experiment close to the situation in Figure 2 is described in reference [47]. Instead of the atom A in Figure 2, it uses as a light source atoms emitting pairs of photons in a radiative cascade: the atom emits a photon of frequency going from a state to a state , then a photon going from state to a state . If we call the radiative lifetime of state , photon is emitted after photon in a time window having a width of the order of . Imagine we add to the experimental set-up of Figure 2 a third detector (not shown in the figure) that detects the photon and can trigger a departure time: after each detection of a photon, detectors 1 and 2 are activated, but for a short time interval of the order of . The probability of detecting a single photon during that time window is much higher than in a time window of the same length, but not triggered by the detection of a photon. This trigger method provides an equivalent of a single photon source, and was used to observe that a single photon could not simultaneously excite both detectors 1 and 2 . .

Resonance fluorescence of a single atom. Photon antibunching

Another experiment clearly shows the need for a quantum description of radiation: the study of the second order correlation function of the fluorescent light emitted by a single atom or ion, and excited by a resonant laser beam. Imagine the emitting object A in Figure 2 is a single trapped ion6 . Submitted to the resonant laser excitation, the ion emits a series of photons which enter the set-up of Figure 2. The distances between the beamsplitter and the detectors 1 and 2 are equal, so that the two wave packets associated with each photon arrive at the same time on 1 and 2 . 6 Ion trapping is now a well mastered technique. The results presented in Figure 3 have been obtained on a single 24 Mg+ ion [48]. The first experimental evidence for photon antibunching in the fluorescent light from a single atom were obtained on a sodium atomic beam at very low intensity, with an observation volume small enough for the probability of its containing more than one atom to be negligible [49].

2121

COMPLEMENT BXX



For a continuous laser excitation with a constant intensity, the statistical properties of the fluorescent light are invariant under time translation; consequently, the quantum correlation function (R2 2 ; R1 1 ) characterizing the photoionizations detected on (2) and depends only on = 2 ( ). Figure 3 plots the 1 2 1 . It shall be noted (2) variations of ( ) as a function of , for increasing values (from bottom to top) of the laser intensity. This figure shows that (2) ( ) is zero for = 0; in other words, the detected photons are “antibunched” in time (one cannot detect simultaneously a photon at 1 and another one at 2 ). The quantum interpretation of that result is as follows. Each photon emitted by the ion is detected either by 1 , or by 2 . Right after the emission of a photon, the ion is “projected” into the ground state of the transition excited by the laser. This means it cannot immediately emit another photon as it must first be re-excited by the laser, and that takes a certain time. This is why (2) ( ) is zero for = 0. Actually, after it emits a photon, the atom starting from will oscillate between the state and the excited state at the Rabi frequency characterizing the atom-laser coupling, and proportional to the laser field amplitude. The Rabi oscillations explain the oscillations of (2) ( ) that appear in Figure 3 at frequencies higher and higher as the laser intensity increases.

Figure 3: Intensity correlations in the resonance fluorescence of a single ion excited by a laser. The figure shows the time correlations (2) ( ) between the signals from the two detectors 1 and 2 as a function of the delay between two detections. The three curves correspond to increasing intensities (from bottom to top) of the laser beam exciting the resonant fluorescence of the trapped ion. It shows that (2) ( ) is zero for = 0 and, for small positive values of , it increases with (figure adapted from [48]). Let us examine now the predictions of a theory that classically treats the field emitted by the ion. We established above that the correlations between the photoionization rates of the two detectors are described by the correlation function ( ) ( + ) of 2122



PHOTOIONIZATION

the classical intensity ( ). For a stationary field, this classical correlation function depends only on : () ( + )=

(2) cl (

)

(2) cl

(36)

In addition, writing that ( ( )

( + ))2 > 0, we get:

( ( ))2 + ( ( + ))2 > 2 ( ) ( + ) that is, taking into account the field stationarity and relation (36): (2) cl (

= 0) >

(2) cl (

)

(37) (2)

The semiclassical theory therefore predicts that cl ( ) should not be an increasing function of in the vicinity of = 0. This is contradicted by the experimental results shown in Figure 3, and hence proves that the fluorescent light emitted by a single ion excited by a resonant laser beam cannot be described as a classical field. The radiation quantum theory is thus essential to account for all photoionization experimental results. This remains true even though the simple photoelectric effect observed on a single photodetector can be described by a semiclassical theory (without photons). 4.

Two-photon photoionization

4-a.

Differences with the one-photon photoionization

We now consider a two-photon absorption process similar to those studied in Complement AXX , but where the final state of the two-photon absorption process is now part of the atomic ionization continuum. This continuum starts at an energy (ionization energy) above the energy of the ground state (Fig. 4). This process is called two-photon photoionization. The photoionization process transforms the atom into an ion and an electron, which moves away. When the distance between the electron and the ion is large enough, their Coulomb interaction energy becomes negligible and the electron energy is just its kinetic energy. Total energy conservation tells us that this kinetic energy is equal to: kin

= 2~

(38)

If we plot the variations of kin as a function of , we get a straight half-line with slope 2~, which starts from the abscissa axis at point 2~. This result is a generalization of the photoelectric law established in 1905 by Einstein. The previous result clearly shows that it is not necessary for the incident photon energy ~ to be larger than the ionization energy for the atom to undergo photoionization. Figure 4 shows that ~ is lower than whereas 2~ is larger than . This result can be generalized: if ~ , with = 1 2 1, but if ~ , we have a -photon photoionization. The kinetic energy of the photoelectron, once it is far enough from the ion, is equal to kin = ~ . 2123

COMPLEMENT BXX

4-b.



Photoionization rate

We first assume the radiation is monochromatic, and use expressions (13) and (14) of Complement AXX for the two-photon absorption probability amplitude, whose modulus squared yields the probability. As the final state now belongs to a continuum, we must sum this probability over and use Fermi’s golden rule to compute the photoionization probability per unit time. Since the modulus squared of equation (14) is proportional to ( 1), where is the number of incident photons, the photoionization rate increases as the square of the incident radiation intensity (for 1). In a similar way, it can be shown that a -photon ionization increases as the th power of the incident radiation (for 1). Consider now the case of non-monochromatic stationary radiation. As in section § 2-b, we assume the radiation spectral density is centered around a frequency ex with a width ∆ much smaller than the spectral bandwidth ∆ of the detector. The field correlation function ( ) that appears in the two-photon absorption probabil¯ (+) (R ) ity is still given by relation (27) of Complement AXX . The two operators E ex appearing in this equation have a predominantly exponential time dependence. ( ) ¯ Similarly, the two operators E (R ) have a predominantly exponential + ex time dependence. With the same reasoning as in § 2-b- , which led to (14), we set: (

)=

2

ex (

)

(

)

(39)

where ( ) is an “envelope” function with a much slower dependence in , and on a time scale of the order of 1 ∆ . In the double integral of (8), the correlation

Figure 4: Two-photon photoionization. The atom goes from state to state , which is part of the ionization continuum, through the absorption of two photons, with energy ~ . The unbound electron produced at the end of that process leaves the atom with a kinetic energy equal, when the electron is far enough from the atom, to kin = 2~ . 2124



PHOTOIONIZATION

function ( ) is different from zero only for where is the correlation time of the atomic dipole, much shorter than 1 ∆ . We can therefore, as in § 2-b- , take = in the envelope function ( ) defined in relation (39) which yields . We obtain for the field correlation function appearing in the two-photon ionization rate: (

)=

2

ex (

in

e

)

E

( )

e

(R

) E

(+)

e

E

(R

)

( )

e

(R

) E

(+)

(R

)

in

Note that the average value appearing in this equation is independent of radiation is supposed to be stationary. 4-c.

(40) since the

Importance of fluctuations in the radiation intensity

Even when the radiation is monochromatic, i.e. when only a single mode is populated, its intensity can take different values, spread out around an average value; the only case where the radiation intensity is well defined is when the radiation is in a Fock state . If we only consider stationary monochromatic radiation, the most general state is a statistical mixture of Fock states with weight ( ). As an example, if the mode is in thermal equilibrium at temperature , the probability of its containing photons is: ( )= where

1

(

+1 2)~

(41)

is the Boltzmann constant and (

the partition function given by:

+1 2)~

=

(42)

From these two equations one can easily compute the average value of the number of photons in this mode, as well as the average value 2 of 2 , for radiation at thermal equilibrium (see demonstration below). In particular, we can show that: 2

=

+2

2

(43)

According to the previous results, the two-photon photoionization rate is proportional to ( 1) = 2 . If the radiation has a well-defined intensity, i.e. if it is in a 2 2 Fock state , we have 2 = and the photoionization rate is proportional to if 1. On the other hand, if the radiation is in thermal equilibrium with the same average value , the photoionization rate is, according to (43), proportional to 2 2 , i.e. twice as large as for a state with no dispersion in intensity but same average value . An intensity fluctuation, keeping the average value constant, considerably increases the photoionization rate. This result is to be expected for a nonlinear phenomenon: the values of above the average value contribute much more than the values below. 2125



COMPLEMENT BXX

Demonstration of relation (43) This calculation has already been explained in §§ 2-b and 3-b of Complement BXV . We 2 briefly recall its principle. We note = ~ and eliminate the irrelevant factor from the partition function by setting: 2

( )= The function

( )

(44)

( ) is easily computed, since:

( )=1+

2

+

+

+

+

=

1

(45)

1

We also know that: = =

( )=

1 ( )

1 d ( ) ( ) d

(46)

This leads to: =

1

(47)

1

A similar calculation permits computing 2 using the second derivative d2 ( ) d which leads to relation (43), the equivalent of relation (42b) of Complement BXV .

2

,

If the radiation is no longer monochromatic, but still stationary and at thermal equilibrium, we can use the field correlation function (40) and Wick’s theorem (Complement CXVI ) to rewrite this function as the sum of products of second order correlation functions: (

)=

2

ex (

in

)

e

E in

+

in

e in

( )

e E e

(R E

( )

( )

(R

E

e

)

( )

(R ) (R

E ) e )

(+)

e E e

(R E

(+)

(+)

) (R

)

)

in

(R

)

(R

E

in

(+)

in

in

(48)

We thus get a sum of two terms, each proportional to the square of an average intensity. 5.

Tunnel ionization by intense laser fields

As high power lasers became available, studies of multi-photon ionization processes led to the discovery of many new physical phenomena. In particular, when the instantaneous laser field becomes of the order of the Coulomb field binding the electron to the nucleus, ionization no longer results from a multi-photon ionization process, but from a tunnel effect. The laser field, yielding a potential that varies linearly as a function of the electronnucleus distance, lowers the Coulomb potential sufficiently to allow the electron to escape 2126



PHOTOIONIZATION

Figure 5: Effective potential seen by an electron undergoing tunnel ionization. The electron leaves the ion by tunneling through the potential barrier, sum of the ionic Coulomb potential and the linear potential associated with the laser electric field, assumed to be linearly polarized along the axis.

via a tunnel effect (see Fig.5). Once the electron has left the ion, it is accelerated by the laser field. As the oscillating laser field changes sign, the acceleration produced by the laser field is inverted, the electron comes back toward the ion and emits, as it passes close to the ion, a “bremsstrahlung” radiation (braking radiation). It can be shown that the frequency of this radiation is an odd harmonic of the laser frequency. The order of this harmonic is very high and can reach several hundred. As the fraction of the period of the laser field where the electron can escape by the tunnel effect is very small, the electron wave packet that leaves the ion has a very short time extension. The bremsstrahlung radiation it emits when it comes back to the ion also extends over a very short time, expressed in tens of attoseconds (one attosecond is equal to 10 18 sec). The interested reader can find an up to date review of these developments in Chapters 10 and 27 of reference [24].

2127



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

Complement CXX Two-level atom in a monochromatic field. Dressed-atom method

1

2

3

4

Brief description of the dressed-atom method . . . . . . . . 1-a State energies of the atom + photon system in the absence of coupling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-b Coupling matrix elements . . . . . . . . . . . . . . . . . . . . 1-c Outline of the dressed-atom method . . . . . . . . . . . . . . 1-d Physical meaning of photon number . . . . . . . . . . . . . . 1-e Effects of spontaneous emission . . . . . . . . . . . . . . . . . Weak coupling domain . . . . . . . . . . . . . . . . . . . . . . 2-a Eigenvalues and eigenvectors of the effective Hamiltonian . . 2-b Light shifts and radiative broadening . . . . . . . . . . . . . . 2-c Dependence on incident intensity and detuning . . . . . . . . 2-d Semiclassical interpretation in the weak coupling domain . . 2-e Some extensions . . . . . . . . . . . . . . . . . . . . . . . . . Strong coupling domain . . . . . . . . . . . . . . . . . . . . . 3-a Eigenvalues and eigenvectors of the effective Hamiltonian . . 3-b Variation of dressed state energies with detuning . . . . . . . 3-c Fluorescence triplet . . . . . . . . . . . . . . . . . . . . . . . 3-d Temporal correlations between fluorescent photons . . . . . . Modifications of the field. Dispersion and absorption . . . . 4-a Atom in a cavity . . . . . . . . . . . . . . . . . . . . . . . . . 4-b Frequency shift of the field in the presence of the atom . . . . 4-c Field absorption . . . . . . . . . . . . . . . . . . . . . . . . .

2130 2131 2131 2133 2135 2135 2137 2137 2138 2139 2139 2140 2141 2141 2142 2144 2145 2147 2147 2148 2149

Introduction. The probability amplitude for an atom, subject to monochromatic radiation, to absorb a photon and go from a discrete state to another discrete state was calculated in § B of Chapter XX. We used, however, a perturbative treatment limited to lowest order with respect of the interaction Hamiltonian. The predictions of such an approximate calculation are, a priori, only valid for times that are sufficiently short for the higher order corrections to remain negligible. This complement presents another approach to atom-photon interactions, called the “dressed-atom approach”, which does not have those limitations. It considers the atom and the mode of the quantum field it interacts with as a single quantum system. As this unified system is described by a time-independent Hamiltonian1 , one can study its energy diagram to obtain very useful information. 1 This would obviously not be possible in a classical description of the radiation (even with a quantum treatment of the atom), since the field varies sinusoidally with time, and its coupling Hamiltonian with the atom is time-dependent.

2129

COMPLEMENT CXX



This will allow us to improve the results of Chapter XX. The dressed-atom method yields a non-perturbative description of the physical processes under study, and hence remains valid even for intense fields. It provides new insights into several important physical phenomena: atoms’ behavior in an intense electromagnetic field, spectral distribution of the light spontaneously emitted by an atom in such an intense field, time correlations between the emitted photons, origin of the forces exerted on an atom by radiation with a space-varying intensity. As is usual in the literature, and as we already did in Complement CXIX , we shall note the atomic ground state and the atomic excited state (instead of using our previous notation of and for the two atomic levels). For the same reason, rather than keeping the notation of Chapter XX for the angular frequency of the exciting beam, we shall use , that refers more explicitly to a laser beam, which can have a very high intensity. We start in § 1 with a brief description of the dressed-atom method2 . We assume the frequency to be close to resonance with the atomic frequency, noted 0 = ( ) ~, but far enough from all the other atomic transition frequencies. The radiationatom interaction will be characterized by a frequency called the “Rabi frequency”. It is the equivalent, in this radiation quantum treatment, of the precession frequency of a spin turning around a classical radiofrequency field in a magnetic resonance experiment (Complement FIV ). Establishing the energy diagram of the unified atom-photon system will enable us to study both the weak coupling regime (Rabi frequency small compared to the natural width Γ of state or to the detuning 0 between the field and atomic frequencies) and the strong coupling regime (Rabi frequency large compared to the natural width and to the detuning). The weak coupling regime is studied in § 2. We show that the ground state’s energy undergoes a “light shift”, by an amount that is proportional to the field intensity, and whose value as a function of the detuning follows a Lorentzian dispersion curve. The ground state also undergoes a radiative broadening, proportional to the field intensity, and which can be interpreted as a probability per unit time of leaving the ground state through the absorption of a photon. We focus in § 3 on the strong coupling domain, where the Rabi oscillation between sates and appears, although damped because of the radiative instability of . The energy diagram of the dressed-atom allows interpreting phenomena that are specific to the strong coupling regime, such as the fluorescence triplet and the temporal correlations between the photons emitted in the lateral components of that triplet. The atom-field coupling perturbs, not only the atom, but also the field. We show in § 4 that the real and imaginary parts of the refractive index, associated with the atom’s presence, are actually perturbations of the field, just as the light shifts and radiative broadening are perturbations of the atom. 1.

Brief description of the dressed-atom method

Let us call “laser mode ” the quantum field mode that is populated by photons with frequency . In the absence of coupling between the atom and this laser mode, the Hamiltonian of the total system + is equal to + , where is the atomic Hamiltonian, and the Hamiltonian of the radiation in laser mode . The energy levels 2A

2130

more detailed description can be found in Chapter VI of [21].



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

+ are labeled by two quantum numbers: or for the atomic internal state3 , for the photon number in the only radiation mode that is not empty and contains photons of frequency . of

1-a.

State energies of the atom + photon system in the absence of coupling

Consider the two states photon vacuum state are: =

+

and

~

1

=

1 , whose energies with respect to the +(

1)~

(1)

Their energy difference is: 1

=~ = ~(

(

) 0) = ~

(2)

where: =

(3)

0

is the frequency detuning between the field frequency and the atomic transition frequency 0 = ( ) ~. For a resonant field, i.e. when = 0 , the two states are degenerate. Here we consider very small detunings (in absolute value) compared to 0 : (4)

0

Consequently, even if the field is not exactly resonant, the two states 1 can be grouped in a two-dimensional multiplicity ( ): ( )=

1

and (5)

which is far from all the other states of the system atom + field. There is an infinity of other multiplicities, for values of going from 1 to infinity. As an example, Figure 1 shows the three multiplicities ( 1), ( ) and ( + 1); there are 1 others with lower energies, and an infinity with higher energies. Each multiplicity is separated from the next one by the distance ~ , and the spitting between the two levels inside one multiplicity equals ~ . The multiplicity (1) corresponding to = 1, includes the two states 1 and 0 ; on the other hand, the state 0 is isolated. 1-b.

Coupling matrix elements

The coupling between the atom and the mode is proportional to the product of the atomic dipole moment D and the mode component of the radiation electric field E. We can choose the origin of the coordinates so as to be able to write k R = where R is the position of the atomic center of mass. We then get for :

=

~ 2 0

3

D (ε



)

(6)

3 The atomic external degrees of freedom are treated classically by assuming that the atom is fixed at point R.

2131

COMPLEMENT CXX



Figure 1: Energy levels of the system atom + photon in the absence of coupling. Only three adjacent multiplicities ( 1), ( ) and ( + 1) are shown in the figure; many more exist below or above, corresponding respectively to smaller or larger values of . Each vertical arrow links a pair of states having an energy difference indicated next to it. The only non-zero matrix elements of the odd operator D are those between and . The annihilation and creation operators and change by 1. It follows that is coupled to 1 and + 1 , whereas 1 is coupled to and 2 . The two states and 1 of the multiplicity ( ) are thus coupled to each other by the matrix element: 1 where Ω Ω

=

~Ω 2

(7)

is the “Rabi frequency” defined as:

=

Ω1

(8)

with: Ω1 =

2 ~

~ 2 0

3

ε

D

(9)

We assume Ω1 is real and positive. If this is not the case, it suffices to change the relative phase4 of the kets and , which modifies the phase of the matrix element D ; a suitable choice of that phase will make Ω1 real and positive. As operator changes both the photon number by one unit and the atomic internal state, it does not have matrix elements between the kets of multiplicity ( ) and those of multiplicities ( 1). On the other hand, it can couple these kets with those of multiplicities ( 2) and, to higher order, to those of multiplicities even 4 Such a phase change will affect the non-diagonal elements of the density matrix, leaving the physical predictions unchanged.

2132



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

further away. However, the distances (in energy) between these multiplicities and ( ) are of the order of 2} (or of a multiple of that energy), whereas we have assumed that the interaction matrix elements are very small compared to } . The multiplicities other than ( ) have therefore energies too different from those of ( ) to play a significant role. We shall ignore their non-resonant coupling, which has a negligible effect for a quasi-resonant excitation. 1-c.

Outline of the dressed-atom method

At the beginning of section § 1, we have described the quantum states of the system + (atom + laser mode) in the absence of coupling; we showed they can be grouped into multiplicities ( ), with = 0, 1, 2, ... well separated from each other when condition (4) is satisfied. As an example, Figure 1 shows that the multiplicity ( ) for an atom with two levels and includes the states and 1 separated by an energy } . Relation (7) tells us that these two states are coupled by an operator describing the interaction between and , and that the corresponding matrix elements equal }Ω 2. We also discussed why, for a quasi-resonant excitation, the couplings between different multiplicities were negligible, so that one can separately study each multiplicity ( ). .

Dressed states and energies

The first step in the dressed-atom approach is to study the energy levels of the system + inside a multiplicity ( ), taking into account the coupling restricted to ( ). We must diagonalize the Hamiltonian + + inside this sub-space. For ( ) a two-level atom, the restriction of that Hamiltonian to ( ), noted , is represented in the base of the kets written in (5) by a Hermitian 2 2 matrix equal to: (

)

=

+ ~Ω

=[

+(

~ 2

2 1)~

~Ω +( 1)~

] +~

Ω Ω

2

2 0

(10)

where

is the identity operator in ( ). Because of the coupling created by the non-diagonal elements Ω 2, the states and 1 whose non-perturbed energies are separated by } (left-hand side of Figure 2) are transformed into two states + ( ) and ( ) with energies } (right-hand side of the figure): }

=[

+(

1)~

]+

~ 2

~ 2

Ω2 +

2

(11)

The new states + ( ) and ( ) are linear superpositions of the initial states; they are called “dressed states”, and their respective energies } the “dressed energies”. This complement will show that a great number of interesting physical phenomena occurring when an atom is coupled to a laser mode can be interpreted in terms of these dressed states and their energies. 2133



COMPLEMENT CXX

Figure 2: Energies of the states of the system + within ( ), in the absence of coupling (left-hand side of the figure), and in the presence of a coupling of intensity }Ω 2 between the two initial states (right-hand side of the figure). .

Rabi oscillation

Let us consider first a particularly simple application of the dressed-atom method. Imagine, for example, that the system is, at time = 0, in the state : ( = 0) =

in

=

(12)

and let us try to find the probability that it will be found at a later time fin

=

in the state:

1

(13)

We are dealing with the evolution of a system with two levels coupled by a static perturbation ~Ω 2. This problem was studied in detail in § C of Chapter IV. We must first expand the initial state on the states ( ) , and multiply each of them by an exponential whose argument is proportional to its energy } : () =

+(

+

)

+(

) +

( )

The probability amplitude of finding the system in state 1

2

()

+(

2

+

)

1

( )

1

+(

( )

(14)

1 is then: )

( )

(15)

where we introduced the Bohr frequency: =

Ω2 +

=

+

2

(16)

( ++ ) 2 In (15), the sign simply means that we omitted a global phase factor , with no physical significance. The probability of finding the system at time in the final state (13) is therefore:

()= = 2134

1 2 +

() 2

+

+

2 2

2

+

+

+

+ c.c.

(17)

• where the =

TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

are the scalar products: ( )

and

=

( )

1

(18)

and c.c. means the complex conjugate. We see that the probability ( ) is an oscillating function of time, with a frequency that is the only Bohr frequency Ω2 + 2 of the system within the multiplicity . This frequency can obviously be expanded to all orders of the perturbation Ω2 , but the result we obtained is not perturbative. The oscillation we found concerns the total system formed by a two-level atom placed in a monochromatic radiation field that can be intense and resonant: starting from state , the atom absorbs a photon and goes to state 1 ; it then comes back to state by stimulated emission of a photon, and so on. 1-d.

Physical meaning of photon number

The situation is different depending on whether the atom is placed in a real cavity or in free space. If the atom is placed in a real cavity, as in some experiments, the field modes are the cavity eigenmodes. Such a situation will be discussed in § 4; the photon number then has a perfectly clear physical meaning. The volume 3 , appearing in the modal expansion of the fields, and which is found in expressions (6) and (9) above, is simply the volume of the cavity containing the photons. If the atom is in free space, the volume 3 , introduced to obtain discrete modes, is simply used in the computation, without any precise physical meaning. On the other 3 hand, the energy density in the vicinity of the atom, proportional to ~ , does 3 have a physical meaning. Provided we keep constant, we can change and 3 arbitrarily without changing the coupling between the atom and the field; this is because 3 . In the coupling is characterized by the Rabi frequency Ω , which depends on that case, the photon number does not have an intrinsic physical meaning. Imagine, for example, that the field is in a coherent state (Complement GV ). The values of are then distributed around an average value in an interval of width ∆ = , very small in relative value compared to , but very large in 3 absolute value. If both and 3 go to infinity, keeping the ratio constant, the Rabi frequency Ω will barely change in relative value even when varies over a large interval around . The frequency Ω can thus be replaced in (10) by a constant Ω (which does not depend on ): Ω



(19)

This will be done in what follows and in § 3. 1-e.

Effects of spontaneous emission

We have ignored, until now, all the field modes others than the laser mode. However, when the atom is in the excited state , it can spontaneously emit a photon in another mode. This means that, in addition to the atom and the laser mode , we must take into account the system including all the modes that, initially, did not contain any photons. As is a very large system (sometimes called “reservoir” for that 2135



COMPLEMENT CXX

reason), the coupling effects between + and must be described by a so-called “master equation”; this equation describes the evolution of the density operator + of + under the effect of the coupling with (see part D of Chapter VI in [21]). Though we shall not introduce this master equation here, we shall merely discuss the physical interpretation of the results it leads to. As the frequency spectrum of the reservoir has a width ∆ of the order of the optical frequency, its associated correlation time 1 ∆ is much shorter than all the other characteristic times of the problem. It is, in particular, much shorter than the radiative lifetime : =

1 Γ

(20)

where Γ is the natural width of the excited state ; it is also shorter than the inverse of the Rabi frequency, which yields the characteristic time of the coupling to the laser mode. This means that, when the system + is in the state 1 and a spontaneous emission occurs, it lasts for a time interval too short for the atom to have sufficient time to couple with . The system then goes quasi-instantaneously from state 1 , which belongs to ( ), to state 1 in the lower multiplicity ( 1) – see Figure 1. As a consequence (see § C-3 of Chapter III in [21], as well as Complement DXIII ), the evolution within ( ) can still be described by the same equations as above, provided we simply add an imaginary term to the energy of the excited state: =

}

Γ 2

(21)

This means that, to describe the evolution of the system + within5 ( ) while taking into account spontaneous emission processes, we must replace the Hamiltonian ( ) written in (10) by the effective non-Hermitian Hamiltonian: ( ) eff

=[

+(

1)~

] +~

Ω Ω

2

2 Γ 2

(22)

Because of the imaginary term Γ 2 appearing in the matrix, the two eigenvalues of ( ) eff also have an imaginary part: the two dressed states are now unstable as a result of spontaneous photon emission, which can occur in any of these states. Within a constant factor, given by the term proportional to on the right-hand side of (22), the eigenvalues } of eff are obtained by diagonalizing the 2-dimensional matrix that follows on the right-hand side of (22). The exact solution is written6 : =

Γ + 4 2

1 2

Ω2 +

+

Γ 2

2

(23)

We now discuss the physical meaning of these results in two limiting cases. 5 The coupling with the reservoir induces other important effects leading to transitions between different multiplicities. This is what happens, for example, in the fluorescence phenomenon studied in § 3-c. 6 For brevity, we use a slighlty incorrect mathematical notation, since the square root sign must in principle be applied on a real and positive number. What we mean with the square root sign written on the right-hand side of (23), is either one of the two complex numbers whose square is equal to the complex number under the root sign.

2136

• 2.

TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

Weak coupling domain

We start with the weak coupling domain, which is more directly related with the results of Chapter XX. 2-a.

Eigenvalues and eigenvectors of the effective Hamiltonian

Consider first the case where the non-diagonal coupling ~Ω 2 between the two non-perturbed states of ( ) is small compared to the differences between the energies of these two states (including the imaginary term associated with the natural width of ). As this difference is complex, we must take its modulus: Ω

+

Γ 2

(24)

This inequality is satisfied if: Ω

Γ

or

ΩR

(25)

The weak coupling domain is thus obtained for low light intensities, or large frequency detunings. For weak coupling, we can apply perturbation theory to obtain the energy corrections for the states and 1 , to order 2 in Ω . Starting from (22), we obtain in this way the correction to the energy of state : =~

(Ω 2)2 =~ + Γ 2

~

(26)

2

where: =

4

2

+ Γ2

Ω2

and

=

4

2

Γ Ω2 + Γ2

A similar calculation yields for the correction to the energy of state

1:

(28) 2 We can write the approximate eigenvalues of the effective Hamiltonian (22) in the form: 1

+

=

(27)

+

~

+ ~

2

+

Γ + + (29) 2 2 which coincides with an expansion in powers of Ω Γ of the exact result (23). Perturbation theory also allows computing the eigenstates of eff to first order in Ω . The state , which tends towards when Ω goes to zero, is written: =

+

(Ω 2) + Γ 2

1

(30)

This means that state is “contaminated” by state 1 . A similar computation for state 1 , which tends towards 1 when Ω goes to zero, yields: 1 =

1

(Ω 2) + Γ 2

(31) 2137

COMPLEMENT CXX



Figure 3: Non-perturbed states (left-hand side of the picture) and perturbed states (righthand side) in the ( ) multiplicity. The coupling, characterized by the Rabi frequency Ω , shifts state by a quantity ~ (representing the light shift of the ground sate ); its wave function is “contaminated” by the unstable wave function of state 1, meaning that the ground state also becomes unstable as shown by its radiative broadening ~ . State 1 is shifted in the opposite way, compared to ; its width is reduced from ~Γ to ~(Γ ).

2-b.

Light shifts and radiative broadening

The real parts of and 1 represent shifts in the energy levels induced by the coupling with the light and called for that reason “light shifts”. The imaginary part of represents a radiative broadening of state , which becomes unstable under the coupling effect. The imaginary part of 1 describes a reduction of the radiative broadening ~Γ of state 1. Figure 3 shows, in its left-hand side, the non-perturbed states and 1 in the ( ) multiplicity. They are separated by the gap ~ ; if is positive, state is above state 1 ; conversely, if is negative, state 1 is now above state . The thickness of the line representing state 1 symbolizes its natural width ~Γ. The right-hand side of the figure represents the states perturbed by the interaction with light. The two states and 1 repel each other, meaning that they undergo light shifts of opposite signs. The ~ shift of is positive if is above , i.e. if is positive; it is negative if is negative. The stable state is also “contaminated” by the unstable state 1 , which makes it unstable as shown by its radiative broadening ~ . An atom in state cannot stay there indefinitely: it will leave that state with a probability per unit time equal to , which can be interpreted as the photon absorption rate of an atom in state . Conversely, due to the “contamination” of the unstable state 1 by the stable state , state 1 becomes less unstable and its width is reduced from ~Γ to ~(Γ ). 2138



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

γg δg

δ

Figure 4: Plots of the light shift ~ (dashed line curve) and of the radiative broadening ~ (solid line curve) as a function of the detuning between the laser frequency and the atomic frequency.

2-c.

Dependence on incident intensity and detuning

The shifts ~ and radiative broadenings ~ given by equation (27) are all proportional to Ω2 , hence to the number of incident photons, meaning to the light intensity. Their variations with the detuning follow respectively Lorentzian dispersion and absorption curves (Fig. 4). When the detuning is very large (in absolute value) compared to the natural width of ( Γ), we can neglect Γ2 compared to 4 2 in the denominators of expressions (27), which yields: =

Ω2 4

=

Ω2 Γ 4 2

(32)

This leads to: =

Γ

(33)

For large detunings, the light shifts are thus much larger than the radiative broadenings. 2-d.

Semiclassical interpretation in the weak coupling domain

In this weak coupling domain, the atom responds linearly to the incident field; the results we just discussed can be interpreted semiclassically, in terms of a dipole induced by the incident field (see for example [50]). This dipole has a component in phase with the field and a quadrature component, related to the field by a dynamic polarizability ( ). The quadrature component absorbs energy from the field. It varies with the detuning as an absorption curve; it is responsible for the absorption rate associated with the radiative broadening . The in-phase component of the dipole yields a polarization energy. Its variation with the detuning follows a dispersion curve. It is responsible for the light shift, just as the Stark shift results from the interaction of a static electric field with the static dipole it induces. This is why this light shift is often called a “dynamic Stark effect”. 2139

COMPLEMENT CXX

2-e.



Some extensions

We now discuss some direct, important extensions of the previous study. .

Non-monochromatic incident radiation

Imagine the radiation state is now a Fock state 1 2 , or a statistical mixture of such states; the radiation spectral distribution is then described by the function ( ). To second order perturbation theory (weak coupling domain), the processes that come into play in the light shifts and radiative broadenings are stimulated absorption and re-emission of photons. When several modes contain photons and the radiation state is a Fock state, the photon must be re-emitted by stimulated emission in the same mode it was absorbed from (otherwise the matrix element describing the second order coupling would be zero). This means that the effects of the different field modes can be added independently; we then get for and :

.

d

2

d

2

( (

)

+ Γ2 Γ ) 2 4 + Γ2 4

2

(34)

Degenerate ground state

Assume the ground state has a non-zero angular momentum and therefore contains several Zeeman sublevels ; one can then show [51] that the sublevels of having a well-defined light shift and radiative broadening are obtained by diagonalizing the Hermitian matrix whose elements are: ε

D

ε

D

(35)

where the states are the sublevels of . The eigenstates of this matrix, with eigenvalues , undergo light shifts proportional to and radiative broadenings proportional to (where and are the shifts and broadenings for a two-level atom). Reference [52] studies the symmetry properties of matrix (35), and discusses the equivalence between the light shifts and the effect of fictitious magnetic and electric fields acting on the ground multiplicity of the atom. We shall simply focus here on the simple case of a =1 2 = 1 2 transition such as, for example, the hyperfine component =1 2 = 1 2 of the 61 0 63 1 transition of the 199-isotope of mercury ( = 253.7 nm). It is on such a transition that light shifts were observed for the first time [53]. The left-hand side of Figure 5 shows the components + and of this transition, that link respectively = 1 2 to = +1 2 and = +1 2 to = 1 2. If the beam polarization is + , level = 1 2 has a non-zero and well defined light shift, since the absorption and re-emission of a + photon can link sublevel = 1 2 only to itself. On the other hand, level = +1 2 is not shifted because there is no = +1 2. We get opposite conclusions for a + optical transition starting from polarization of the light beam: the light shift of sublevel = +1 2 is well-defined and sublevel = 1 2 is not shifted. Now, by symmetry, the Clebsch-Gordan coefficients (Complement BX ) for the + and transitions are equal; the light shifts have the same 2140



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

me = −1/2

me = +1/2

σ+ σ−

σ+

mg = −1/2

σ− mg = −1/2

mg = +1/2

mg = +1/2

Figure 5: The left-hand side of the figure represents the =1 2 = 1 2 transition, and the light beam polarizations that can induce transitions between its Zeeman sublevels. The diagram on the right-hand side plots in its center the ground state energy levels in the absence of any light beam (Zeeman levels in a static magnetic field); the two lateral extensions depict their light shifts by a non-resonant light beam with polarization + (on the right), or (on the left). In the first case, the light selectively shifts the sublevel = 1 2, in the second case, it shifts the sublevel = +1 2. This is why, depending on whether the beam polarization is + or , the variation in the gap between the two Zeeman sublevels changes sign. value for a + excitation of sublevel = 1 2 and for a excitation with same intensity of sublevel = +1 2. In the presence of a static magnetic field, there is an energy gap between the two atomic sublevels = 1 2 (Zeeman effect). The right-hand side of Figure 5 shows that a non-resonant light excitation changes this gap by the same amount, but in the opposite directions7 depending on whether it has a + or polarization. The ground state magnetic resonance line, detected by optical methods using a resonant beam (Complement CXIX , § 2-b), is thus shifted when a second non-resonant beam is applied; this shift has opposite directions, depending on whether that beam has a + or polarization. As relaxation times can be very long in the ground state, its magnetic resonance line is very narrow, which allows detecting very small light shifts, of the order of a few Hz. This is how the existence of light shifts were demonstrated in 1961, when laser sources were not yet available in laboratories [53]. With laser sources, one routinely observes shifts of the order of 106 Hz, and even more. 3.

Strong coupling domain

We now examine how the previous results are modified in the strong coupling regime. 3-a.

Eigenvalues and eigenvectors of the effective Hamiltonian

A strong coupling regime means that the non-diagonal element ~Ω 2 of the effective Hamiltonian written in (22) is large compared to the difference between two diagonal elements: Ω 7 We

and



assume the detuning

Γ

(36)

is large compared to the Zeeman splitting.

2141



COMPLEMENT CXX

For the sake of simplicity, we shall only consider the resonant case ( = 0). Equation (23), which yields the eigenvalues of the 2 2 matrix of (22) for any value of Ω , then becomes: Γ 4

=

1 2

Ω2

Γ2 4

(37)

where, as we did above, we use the concise notation for the square root of a number that is not always positive (see note 6). As long as Ω Γ 2, the last term on the right-hand side of (37) is purely imaginary. The same is true for the two eigenvalues , which are equal to: Γ 4

=

1 2

Γ2 4

Ω2

(38)

If, in addition, Ω Γ, a limited expansion of (38) in powers of Ω Γ yields + = 2 and = (Γ ) 2; as expected, we confirm the results of the previous § for the weak coupling regime. As Ω increases, while remaining lower than Γ 2, the eigenvalue + increases whereas decreases, but their sum ( + + ) remains constant and equal to Γ 2. When Ω reaches the value Γ 2, both eigenvalues are equal to Γ 4. As soon as Ω goes beyond Γ 2, the last term in (37) becomes real. The two eigenvalues have opposite real parts and the same imaginary part, equal to Γ 4; the two dressed levels now always have the same width Γ 4. As the coupling becomes strong (Ω Γ ), the energies are equal to: Γ 4

Ω 2

(39)

and the eigenvectors tend toward symmetric and antisymmetric linear combinations of and 1: ( )

1 [ 2

1]

(40)

Such states can no longer be considered to be a result of a light mutual contamination of the non-perturbed states of multiplicity ( ). They are actually entangled, and hence impossible to consider as products of an atomic state and a field state. These states of the global atom + field system are commonly called atom-field dressed states. 3-b.

Variation of dressed state energies with detuning

The solid lines in Figure 6 show the energies of the dressed states ( ) as a function of } . The energies are defined with respect to the energy of the non-perturbed state 1 , chosen to be equal to ~ 0 , and represented by a horizontal line with ordinate ~ 0 . Compared to this energy, state has an energy equal to ~ which varies, as a function of ~ , as a straight line of slope unity. This line intersects the horizontal line representing the energy of 1 at a point with abscissa ~ 0 . At resonance ( = 0 ), the two dressed states ( ) are separated by an energy ~Ω , since we assume we are in the strong coupling domain where Ω Γ. Let 2142



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

Energy

Figure 6: Energies of the dressed states ( ) (solid lines) and of the non-perturbed states and 1 (dashed lines) as a function of ~ . The energies are defined with respect to the energy of the non-perturbed state 1 , chosen to be equal to ~ 0 . As ~ varies, the energies of the dressed states follow a hyperbola whose asymptotes are the straight lines representing the energies of the non-perturbed states (anticrossing).

us first completely neglect Γ. Leaving resonance, and as the detuning becomes larger and larger (in absolute value), one finally reaches regions where is larger than Ω , which corresponds to a weak coupling regime. Varying the detuning, one then continuously goes from a strong coupling to a weak coupling region. The energies of the dressed levels ( ) follow a hyperbola whose asymptotes are the energies of the non-perturbed states and 1 (Figure 6). As they come close to their asymptotes, the dressed states become very close to the non-perturbed corresponding states, and the distance between the hyperbola and its asymptote is simply the light shift defined in (27). To take into account the natural width Γ of the excited level , one should add a width to the dressed levels shown in Figure 6. Far away from the anticrossing center, close to the asymptotes, the width would be Γ for the dressed states that are close to the horizontal asymptote, and for the dressed states that are close to the asymptote with slope one. Following one hyperbola branch continuously, the width will progressively change from one of these values to the other, and take the value Γ 4 at the center of the anticrossing. Another interesting phenomenon occurs when the system continuously follows one of the hyperbola branches. Imagine it follows the lower branch, from left to right, for instance because the excitation frequency is slowly varied. If the transit is slow enough to neglect any non-adiabatic transition to the other dressed state, i.e. to the other hyperbola branch, one continuously goes from state to state 1 . This is 2143

COMPLEMENT CXX



another convenient way to go from to : instead of applying a resonant field during the time necessary for the Rabi oscillation to bring the system from to ( pulse), one slowly scans the field frequency through resonance, from a lower to a higher value. Note however that this scanning cannot be too slow, since it must occur on a time scale that is too rapid for the dissipative processes to be able to change the atomic internal state. Such a transit is often referred to as an “adiabatic fast passage”, as it must be slow enough to remain adiabatic and fast enough to avoid any dissipation during the transit time. The dressed-atom approach allows clearly specifying the conditions for transferring the atom from one level to another. 3-c.

Fluorescence triplet

With the dressed-atom approach, we can also simply explain the spectrum of the lines spontaneously emitted by an atom subjected to intense radiation. When studying elastic scattering in § E-1 of Chapter XX, we showed that, when the exciting radiation had an intensity low enough to allow a perturbation treatment, the radiation emitted spontaneously by the atom had the same frequency as the exciting radiation. We now show that the situation is different in the case of an intense excitation radiation: new frequencies appear in the light emitted by the atom8 . We assume the exciting radiation to be resonant and intense, so that the two dressed states ( ) of multiplicity ( ) are separated by an energy interval ~Ω (Figure 7). These two states are linear superpositions of the states and 1; consequently, they both have a non-zero projection onto 1 . Similarly, the two states ( 1) of multiplicity ( 1) are linear superpositions of the states 1 and 2 ; they both have a non-zero projection onto 1 . The lines emitted spontaneously by the atom are those that link two energy levels between which the atomic dipole operator D has a non-zero matrix element. Since D does not change the photon number and can link to , the matrix element 1D 1 is non-zero; each of the two states ( ) can be linked via D to each of the two states ( 1) . The four radiative transitions represented by the curly arrows in Figure 7 are thus possible and correspond to three distinct frequencies: frequency + Ω for the + ( ) ( 1) transition; frequency for both the + ( ) 1) and the +( ( ) ( 1) transitions; frequency Ω for the ( ) 1) +( transition. We get a frequency triplet for the spontaneously emitted light, which was first predicted by Mollow [54] by using a semiclassical treatment. Autler-Townes doublet Imagine that one of the two atomic states we considered until now, for example , is connected via an allowed transition to a third state , meaning that D is nonzero. Let us also assume that the radiation frequency that is resonant for the transition, is completely off-resonance for the transition; consequently, it does not perturb state , so that the sates can be considered as eigenstates of the total Hamiltonian, even if they are slightly shifted. The two states ( ) , which both have a non-zero projection onto , can therefore be connected via D to . This means that, because of the presence of an intense radiation exciting the transition, the transition is split into two lines separated by ~Ω , called the “Autler-Townes doublet” [55]. 8 This

2144

is not Raman scattering as we assume there are no other atomic states except

and .



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

Figure 7: Radiative transitions between one of the two states ( ) of multiplicity ( ) to one of the two states ( 1) of multiplicity ( 1), whose energy is lower than that of ( ) by the quantity ~ . We assume the exciting radiation to be exactly at resonance, so that the energy interval between the two states of each multiplicity is equal to ~Ω .

3-d.

Temporal correlations between fluorescent photons

We now study the characteristics of the radiation spontaneously emitted by an atom that interacts continuously with the electromagnetic field of a laser.

.

Radiative cascade of the dressed atom

We saw that a dressed atom, spontaneously emitting a photon, goes from the multiplicity ( ) to the one just below, ( 1), located at an energy distance } . We shall not study here the precise evolution of the physical system as it leaves multiplicity ( ), which requires the master equation, already mentioned in § 1-e. Our discussion will remain qualitative, but the interested reader will find a more detailed approach in § D of Chapter VI in [21]. Once it reaches ( 1), the atom can spontaneously emit a new photon, which brings the dressed atom to ( 2), and so on. The series of photons spontaneously emitted by the atom in continuous interaction with the laser radiation can be viewed as a “radiative cascade” of the dressed atom descending its energy diagram. This image of a radiative cascade permits studying the time correlations between the photons emitted by the atom. As we shall see, the observed correlations depend on the spectral resolution of the photodetectors used. 2145

COMPLEMENT CXX

.



High spectral resolution photodetection

With a high enough spectral resolution, one can observe the time-correlations between photons emitted in the two side-bands of the triplet. Suppose we place filters in front of the photodetectors, so that each can receive only one of the components of the fluorescent triplet, centered at frequencies , + Ω and Ω . This means that the spectral resolution of the apparatus is better than the splitting frequency Ω of this triplet, but it does not imply that it is lower than the natural width Γ of each components. If we call this spectral resolution, we then have: Γ



(41)

Imagine that at a given time, a detector registers a photon emitted for example in the lateral band centered at +Ω , as the system undergoes the transition Ψ+ ( ) Ψ ( 1) (curly arrow on the left-hand side of the Figure 7). The next photon is emitted as the system, starting from Ψ ( 1) , undergoes either the Ψ ( 1) Ψ ( 2) transition, emitting a photon of frequency , or the Ψ ( 1) Ψ+ ( 2) transition, emitting a photon of frequency Ω . This means that a second photon with the same frequency + Ω as the first one, cannot be emitted right after the first one. If the frequency of that second photon is , the system ends up in state Ψ ( 2) ; from that state, it can emit either a third photon with frequency , or a third photon with a lower frequency Ω . If, on the other hand, the frequency of that second photon is Ω , the system ends up in state Ψ+ ( 2) ; from that state, it can emit either a photon with a higher frequency + Ω , or a photon of frequency . As opposed to the second photon, the third photon may thus have the same frequency + Ω as the first one. Following the same line of reasoning one can argue that, if the first photon has a frequency Ω , this cannot be the case for the second photon; one must wait until the third photon to eventually obtain the same frequency Ω . This means that, if photons with only the two extreme frequencies Ω are selectively observed, the detected emission processes will necessarily alternate in time (but these events may be separated by any number of photon emissions at the central frequency ). Comment: Taking a Fourier transform to return to the time domain, relation (41) imposes a limit to the temporal resolution of the detection system: 1 Ω . This means that it is not possible to measure the exact time at which a photon is emitted with a precision of the order of the Rabi precession period.

.

Photodetection with high temporal resolution

We now study the opposite case where the detectors have a temporal resolution better than the Rabi precession period. This allows a precise determination of the time at which the photon is emitted, but one can no longer distinguish the frequencies of the three triplet components. We have seen above (§ 1-e) that the elementary spontaneous emission processes have a correlation time that is very short compared to all the other characteristic times 2146



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

of the problem (because of the large spectral width ∆ of the empty modes’ reservoir). A spontaneous emission from ( ) thus corresponds to a very short “quantum jump” taking the system from state 1 in ( ) into state 1 in ( 1). Once the atom has reached this second state, it cannot emit a second photon right away, since no spontaneous emission can occur from a ground state . A certain time must elapse for the atom-laser interaction to bring the system from state 1 to the state 2 it is coupled with, and from which another photon can be spontaneously emitted. The system then falls back to state 2 , and the previous process repeats itself (with an value lowered by one unit). It therefore becomes clear why one observes a “temporal antibunching” of the photons emitted by a single atom, as they must be separated by a time interval at least of the order of 1 Ω ; this antibunching was already referred to in § 3-b- of Complement BXX . 4.

Modifications of the field. Dispersion and absorption

We now study how the field is modified by its interaction with the atom. 4-a.

Atom in a cavity

The atom-field interaction does not solely perturb the atom; it also changes the field. In order to study this effect, it is convenient to imagine the atom being placed in a real cavity assumed to be perfect, meaning that its losses can be ignored (they occur on a time scale much longer than all the other relevant times of the experiment). As opposed to what we did before, we shall keep the dependence of the Rabi frequencies Ω given by equations (8) and (9), since in a cavity the photon number has a physical meaning (§ 1-d). Figure 8 shows the first multiplicities ( ) of the system atom + field for low and increasing values of the photon number , starting at = 0. Multiplicity (0)

Figure 8: Energy levels of the system atom + field for low values of the photon number (in angular frequency units, meaning the energies are divided by ~). States and 1 of ( ) undergo opposite shifts, proportional to . State 0 is not shifted. 2147

COMPLEMENT CXX



contains a single state 0 . Multiplicity (1) contains the two states 1 and 0. Multiplicity (2) contains the two states 2 and 1 , and so on. We shall assume we are in the weak coupling regime, so that we can use the perturbative results of § 2 for the light shifts of the different energy levels. State 0 is not shifted as it is not coupled to any other state9 . States 1 and 0 of (1) undergo opposite light shifts, respectively +~ and ~ , where is given by equation (27), where we have replaced Ω by Ω1 , the Rabi frequency for = 1 (see (8)). Setting:

0

= Ω21

4

2

+ Γ2

(42)

the light shifts of states 1 and 0 are, respectively, +~ 0 and ~ 0 . According to (8), the squares of the Rabi frequencies Ω characterizing the atom-field coupling in the multiplicities ( ), are proportional to ; this means that states 2 and 1 of (2) undergo light shifts respectively equal to +2~ 0 and 2~ 0 . More generally, states and 1 of ( ) undergo shifts respectively equal to + ~ 0 and ~ 0. A similar reasoning can be applied to the radiative broadening. It shows that the radiative broadenings of states and 1 of ( ) are respectively equal10 to + 0 and Γ 0 , where 0 is given, according to (27), by: 0

4-b.

= Ω21

4

2

Γ + Γ2

(43)

Frequency shift of the field in the presence of the atom

Consider the left column in Figure 8. The gap between the perturbed energies of states 1 and 0 is equal to ~( + 0 ); the gap between the perturbed energies of states 2 and 1 is equal to ~( + 2 0 + 0 ), and so on. As the 0 ) = ~( light shifts of the states are proportional to , increasing linearly with , the 0 perturbed levels in the left column of Figure 8 have a constant gap between them, even in the presence of the coupling; the distance between consecutive levels simply goes from ~ to ~( + 0 ). In other words, the presence in the cavity of an atom in its ground state changes the field frequency from to + 0 . As the light shifts of the levels in the right column of Figure 8 have an opposite sign, a similar argument shows that the presence in the cavity of an atom in the excited state changes the field frequency from to 0. The atom-field interaction thus shifts the field frequency inside the cavity by a quantity that changes sign, depending on whether the atom is in the internal state or . Let us assume this interaction lasts a time , as will be the case if an atom is introduced into the cavity and takes that time to traverse it. Compared to the free oscillation in the absence of the atom, the field oscillation will be out of phase by an amount ; this phase shift is equal to = + 0 if the atom is in state , and to = if the atom 0 is in state . This change in the field frequency is a phenomenon similar to that described by the real part of the refractive index: a light beam going through an atomic media changes 9 There is actually a coupling between state 0 and state 1 , but is highly non-resonant; we shall ignore it since, as stipulated above, our computation is to zeroth order in Ω . 10 State 0 does not undergo any radiative broadening.

2148



TWO-LEVEL ATOM IN A MONOCHROMATIC FIELD. DRESSED-ATOM METHOD

its propagation velocity without any changes in its frequency. In a cavity, the field wavelength cannot change as it is fixed by the boundary conditions on the cavity walls, and hence by the cavity size. The phase shift compared to the free evolution cannot accumulate in space but must accumulate in time (resulting in a frequency change of the field). Note that if one varies the frequency of the field around the atomic frequency , the sign change of the light shift 0 0 is reminiscent of the sign change of the real part of the refractive index in the vicinity of an atomic resonance11 . 4-c.

Field absorption

Consider an atom in its ground state in the presence, at time = 0, of an mode of the field in a quasi-classical coherent state (Complement GV ). The state of the total system reads: (0) =

2

=

2

(44)

!

=0

The time evolution in the presence of coupling changes the energies of the states to: ˜

=

~(

+

0

0

2)

(45)

and the state of the system at time

becomes: 2

() = =0

! exp[

(

2

exp[ +

( 0

+ 0

0

2) ]

0

2) ] (46)

The atom is still in the presence of a quasi-classical coherent state. However, compared to the free field evolution of that state in the absence of coupling, the atom-field interaction has introduced a phase shift 0 (as already discussed above) as well as a decrease in 2 0 amplitude , resulting in an attenuation of the amplitude of the field. This is reminiscent of the radiation absorption described by the imaginary part of the refractive index. To sum up, we showed that the atom-field coupling produces light shifts and radiative broadening of the atomic levels, corresponding to the well-know field dispersion and absorption phenomena in optics. Conclusion. In conclusion, we showed for many various situations that the dressed-atom approach brings strong clarifications while keeping the calculations simple. Considering the atom and the field mode with which it interacts as a quantum system described by a timeindependent Hamiltonian allows introducing true energy levels for the global system; this leads to a new, broad overview of the stimulated absorption and emission of photons. As an example, this approach makes it very clear how the atom-photon coupling changes the energy diagram of the dressed atom at high field intensity; this leads to a 11 This

effect is sometimes referred to as “anomalous dispersion.”

2149

COMPLEMENT CXX



very simple interpretation of the new frequencies appearing in the atomic fluorescence spectrum in the strong coupling domain. As the energy diagram of the dressed atom is a succession of multiplicities separated by an energy equal to a photon energy, the spontaneous emission of a photon is viewed in this approach as a quantum jump of the dressed atom from one multiplicity to the one just below (radiative cascade). This approach allows a simple calculation of the delay function yielding the distribution of the time intervals between two successive quantum jumps; this permits studying the time correlations between fluorescent photons. Let us also mention that this delay function allows simulating the temporal evolution of an atom, hence obtaining individual quantum trajectories, which can be used to get an averaged atomic evolution. Several experimental applications of the dressed-atom method are presented in the next complement.

2150



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

Complement DXX Light shifts: a tool for manipulating atoms and fields

1 2 3 4

5

Dipole forces and laser trapping . . . . . . . . . . . . . Mirrors for atoms . . . . . . . . . . . . . . . . . . . . . . Optical lattices . . . . . . . . . . . . . . . . . . . . . . . . Sub-Doppler cooling. Sisyphus effect . . . . . . . . . . . 4-a Laser configurations with space-dependent polarization 4-b Atomic transition . . . . . . . . . . . . . . . . . . . . . . 4-c Light shifts . . . . . . . . . . . . . . . . . . . . . . . . . 4-d Optical pumping . . . . . . . . . . . . . . . . . . . . . . 4-e Sisyphus effect . . . . . . . . . . . . . . . . . . . . . . . Non-destructive detection of a photon . . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

2151 2153 2153 2155 2156 2156 2156 2157 2158 2159

Light shifts studied in § 2-b of Complement CXX exhibit a number of important properties leading to numerous applications; these will be briefly discussed in this complement. As these light shifts are proportional to the laser intensity, their magnitude can be space-dependent if the laser intensity is not homogeneous in space. These shifts can be used to create either potential wells (§ 1) to trap atoms once they are cold enough (laser trapping), or potential barriers (§ 2) reflecting atoms (mirrors for atoms). A particularly interesting example involves periodic optical potential wells created at the nodes and antinodes of a laser standing wave in an off-resonant condition (§ 3). This situation is reminiscent of that encountered by electrons trapped in the periodic potential of a crystal lattice. Neutral atoms trapped in optical lattices can thus serve as models for condensed matter problems. For low enough values of the detuning between the laser frequency and the atomic frequency, and if the ground state has several Zeeman sublevels, non-dissipative effects, such as light shifts, can coexist with dissipative effects, such as optical pumping between Zeeman sublevels. We explain in § 4 how correlations between these two types of effects can lead to new cooling mechanisms, such as Sisyphus cooling, allowing the atoms to reach temperatures much lower than with Doppler cooling. Finally, we show in § 5 how the light shifts undergone by an atom crossing a highly detuned cavity allows determining the number of photons present in the cavity, by performing measurements on the atoms at the cavity exit, without absorbing any of the cavity photons. 1.

Dipole forces and laser trapping

When the light intensity varies in space, as with a focalized laser beam or a standing wave, the light shifts also become space-dependent. If the detuning between the laser frequency and the atomic frequency is large compared to the natural width Γ of the excited level, it is then justified to ignore the dissipation due to spontaneous emission, on 2151



COMPLEMENT DXX

the characteristic time scales of the experiment. The light shift ~ (R) of ground state depends, as does the light intensity, on the position R of the atomic center of mass; it can therefore be considered as a potential energy (R) = ~ (R) that affects the atomic motion. This potential has the same sign as the light shift, and hence depends on the sign of the frequency detuning . The potential (r) gives rise to a force: Fdip (R) =

∇R

(R)

(1)

called the “dipole force”, or sometimes the “reactive force” (§ 11-4 in [24]). It is different from the radiation pressure forces studied in § 1-d of Complement AXIX , which come from momentum exchanges as the atom absorbs photons that are spontaneously reemitted. The dipole forces introduced here arise from the spatial variations of the light shifts undergone by the dressed-atom levels. One could say they are caused by the redistribution of photons between the different plane waves composing the laser beam1 : the atom absorbs a photon from one plane wave and re-emits it, by stimulated emission, in another plane wave; this process changes the atom’s momentum, and hence giving rise to a force. Comment: As is the case for light shifts, the intensity of the dipole forces, as a function of the frequency detuning between the laser frequency and the atomic frequency, follows a dispersion curve. In addition, the light shifts of the two dressed levels in multiplicity ( ) have an opposite sign for a given detuning ; the dipole force thus changes sign from one dressed state Ψ+ ( ) to its associated state Ψ ( ) . When the detuning is not too large, and if spontaneous emission processes can occur, the dressed-atom radiative cascade can lead to sign changes of the dipole force, as the atom goes from states Ψ ( ) to Ψ ( 1) ; this is the origin of the fluctuations of the dipole forces.

An important application of dipole forces is the implementation of laser traps. Consider first a laser beam detuned toward the red ( 0 ) and focalized at point O. The light shift, zero outside the laser beam, is negative inside the laser beam; it increases in absolute value as one gets closer to the focal point, where it reaches its maximum value. This creates a potential well that could trap a neutral atom; this will indeed happen if the atom’s kinetic energy, of the order of , is lower than the depth 0 of the potential well. This is why these laser traps have been built only since the 1980’s, once atomic cooling techniques (Complement AXIX , § 2) allowed slowing down atoms to temperatures of the order of a microkelvin [56].

Comment: The trapping forces involved in laser traps are of the order of an atomic dipole multiplied by the gradient of a laser field. They are much weaker than the forces exerted by a static electric field on a charged particle. This explains why laser traps for neutral atoms are much shallower than ion or electron traps. There exist, however, other types of traps for 1 A single plane wave does not have an intensity gradient, and cannot exert a dipole force. These forces, due to intensity gradients, require the presence of several plane waves with different wave vectors.

2152



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

neutral atoms, using different physical mechanisms (for a short review, see for example § 2-c of Complement AXIX , and Chapter 14 of reference [24] ).

2.

Mirrors for atoms

A laser detuned toward the blue ( 0 ) gives rise to repulsive potentials. Imagine for example that the laser wave propagates within a bloc of glass, and undergoes total internal reflection at the boundary between the glass and the vacuum (Figure 1-a). An evanescent wave appears outside the glass, with an amplitude decaying exponentially in a direction perpendicular to the boundary, becoming negligible over a distance of the order of the laser wavelength. This evanescent wave creates a potential barrier of height 2 0 , which reflects atoms arriving with an energy 0 (Figure 1-b). This set-up can be used as a mirror for neutral atoms [57].

Figure 1: (a) A laser beam traveling within a block of glass (shaded in grey in the figure) undergoes total internal reflection at the boundary. Outside the glass, an evanescent wave appears. (b) If the laser is detuned toward the blue ( 0 ), this evanescent wave creates a potential barrier of height 0 . Atoms falling on this barrier with an energy lower than 0 are reflected by the barrier and turn around.

3.

Optical lattices

When laser beams form a standing wave, the light intensity is modulated in space, with a periodicity 2: the intensity is zero at the nodes, and maximal at the antinodes. This creates periodic potential wells, located at the antinodes of the wave for a negative detuning ( 0 ), and at the nodes for a positive detuning ( 0 ). Figure 2 shows a two-dimensional optical lattice created by two standing waves, along two orthogonal axes3 . The study of optical lattices is interesting for several reasons, in particular because the motion of a neutral atom in an optical lattice is reminiscent of that of an electron in 2 Atoms

falling on a solid surface would stick to it, rather than being reflected. frequencies 1 and 2 of the two standing waves are in general sufficiently far apart for the interference terms between the two waves to have a negligible effect on the atom’s motion; the potentials created by the two waves can then be independently added. This requires 1 2 to be large compared to all the characteristic frequencies of the atom’s motion, such as it’s vibrational motion inside a well; in that case, the interference terms oscillate too fast to have a significant effect. 3 The

2153

COMPLEMENT DXX



Figure 2: Schematic representation of a two-dimensional optical lattice: placed in a superposition of two standing laser waves along two orthogonal axes, the atom is subjected to a potential periodic in space, represented by the undulating surface in the figure. These periodic potential wells, located at the antinodes of the standing waves for 0 , and at the nodes for 0 , form an optical lattice. The spheres above the surface indicate the positions where the atoms can be trapped.

a crystal lattice. Granted the order of magnitudes involved are quite different, since the spatial period of an optical lattice is of the order of a micron, whereas the period of a crystal lattice is of the order of a fraction of a nanometer. Nevertheless, optical lattices offer a large number of possibilities not available to crystalline lattices: One can easily change the intensity of the laser waves forming the standing waves, hence modifying the depth of the potential wells; this allows controlling the tunnel effect between adjacent wells. This method was used to explore the transition between a deep well regime where the atoms are localized at the bottom of the wells, and a shallow well regime where the atoms’ wave functions are delocalized over the entire lattice [58]. One can abruptly switch off the trapping laser beams (which obviously cannot be done for a crystal lattice) and study the resulting behavior of the liberated atoms. Studying the expansion velocity of the clouds of atoms yields information on their velocity distribution, and hence on their temperature (time-of-flight method). Studying their spatial distribution and the possible appearance of a diffraction pattern allows determining whether the matter waves trapped in distinct potential wells of the optical lattice were coherent or not. One can use two different frequencies 1 and 2 for the two laser waves counterpropagating to form the one-dimensional standing laser wave. This leads to a “standing” laser wave, moving with constant velocity if 1 2 is fixed, or with an acceleration if 1 2 increases linearly with time. In this latter case, the atom experiences a constant inertial force in the rest frame of the standing wave; its motion is then similar to that of an electron in a crystal lattice periodic potential, subjected in addition to a static electric field. The motion of such a particle, in a periodic lattice and subjected to a constant force, is predicted to be oscillatory, 2154



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

following the so-called Bloch oscillations; the experimental observation of such oscillations is facilitated in an optical lattice, as the atom’s relaxation time can be much longer than the oscillations’ period [59]. Cold atoms trapped in an optical lattice are a model system for “simulating” a number of situations encountered in solid state physics. Cold atom studies involve interactions between atoms much weaker than the Coulomb interactions between electrons. Furthermore, they can be controlled thanks to resonance effects occurring as atoms collide with each other. Note finally that optical lattices are a good example highlighting the importance of light shifts. One may wonder if it might not be simpler to shift atomic levels by Zeeman or Stark effects in static magnetic or electric fields, rather than using an off-resonance light beam to produce a light shift. The advantage of the light shifts is that they can be used to form potentials varying over very short distances, of the order of an optical wavelength, which is much more difficult to attain with static fields. 4.

Sub-Doppler cooling. Sisyphus effect

We described, in § 2-b of Complement AXIX , a cooling mechanism for the atoms, based on the Doppler effect, and called for that reason “Doppler cooling”. We computed the friction and diffusion coefficients associated with that mechanism and showed that the lowest temperature that could be reached by Doppler cooling was of the order of ~Γ (where Γ is the natural width of the atoms’ excited states, and the Boltzmann constant). Actually, the first measurements of the temperatures reached by laser cooling, and based on the time-of-flight method [60], showed that temperatures much lower than could be obtained; furthermore, their dependence on the detuning between the laser beams’ frequency and the atomic frequency did not follow the prediction of the Doppler cooling theory. This implied the existence of other cooling mechanisms for the atoms, leading to temperatures lower than the Doppler limit ; as expected, these mechanisms were called “sub-Doppler” mechanisms. One of them, called the “Sisyphus effect”, will be described in this section. The theory of Doppler laser cooling, exposed in § 2-b of Complement AXIX , does not take into account several important characteristics of laser cooling experiments. In most experiments performed in three-dimensional space, the polarization of the laser field cannot be uniform. The spatial variations of this polarization should not be ignored. The atoms under study have several sublevels, in the lower state and in the excited state. The two-level atom approximation of § 2-b in Complement AXIX is therefore not sufficient. As there are several sublevels in the lower state , one should include the effects of the optical pumping between these sublevels, effects whose characteristic time constants (pumping times) are longer than the lifetime 1 Γ of the excited state. As the detuning between the laser beams’ frequency and the atomic frequency is different from zero, one must take into account the light shifts of the lower level , which can take on different values for the different sublevels. 2155

COMPLEMENT DXX



Before describing the Sisyphus effect, we first show on a simple example how these different effects come into play. 4-a.

Laser configurations with space-dependent polarization

Laser configurations with a space-dependent polarization do not necessarily involve three pairs of laser beams counterpropagating along the , and axes. They can be achieved in one-dimension, and are easier to study, as long as the two counterpropagating laser waves have different polarizations. As an example, Figure 3 represents two laser waves propagating in opposite directions along the axis and having linear orthogonal polarizations e and e . The polarization of the total field changes from right-hand circularly polarized ( + with respect to the quantization axis ) to left-hand circularly polarized ( ) in planes separated by a distance 4, and is linear at 45 of the and axes, half-way between these planes.

Figure 3: Laser configuration with a space-dependent polarization: two laser waves propagate in opposite directions along the axis, having linear orthogonal polarizations e and e .

4-b.

Atomic transition

Many of the laser cooling experiments use transitions between a lower state with angular momentum and an excited state with an angular momentum equal to = + 1. Here, we shall consider the simplest possible case = 1 2, where the lower state contains only 2 sublevels 1 2 . We then have = 3 2 and the excited state has 4 sublevels 1 2 and 3 2 . 4-c.

Light shifts

Consider first a point in space where the laser field polarization is + (with respect to the quantization axis ). We saw in § 1-b of Complement CXIX that photons with a + ( ) polarization have a spin angular momentum +~ ( ~) along the axis. Conservation of total angular momentum in the photon absorption process leads to the selection rule = +1( 1) for the absorption of a + ( ) photon, where and are the magnetic quantum number of the states involved in the transition. Figure 4 represents the 2 transitions 1 2 = +1) that can +1 2 and +1 2 +3 2 ( be excited by the laser field. The numbers 1 3 and 1 shown next to these 2 transitions are the squares of the Clebsch-Gordan coefficients of these transitions (Complement BX ); they indicate that the +1 2 +3 2 transition is 3 times more intense than the 1 2 2156



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

Figure 4: Transition 1 2 3 2. The oblique upwards arrows show the transitions excited at a point where the laser field polarization is + , the vertical downward arrow indicates the spontaneous emission transition from sublevel +1 2 toward sublevel +1 2 . The light shifts of states 1 2 and +1 2 are noted } 3 and } . At a point where the laser polarization becomes instead of + , the shifts of the two sublevels are interchanged, by symmetry.

between the laser beams’ frequency and the atomic +1 2 transition. As the detuning frequency is negative in a laser cooling experiment, both states 1 2 have a negative light shift, but with a modulus 3 times larger for state +1 2 than for state 1 2 . These light shifts are written in the figure as ~ and ~ 3, where is positive. At a point in space where the polarization is , the previous results are interchanged. It is now the 1 2 3 2 transition that is 3 times more intense than the transition, yielding light shifts equal to ~ and ~ 3 for the states +1 2 1 2 and , respectively. 1 2 +1 2 Finally, at a point where the polarization is linear, the two light shifts are identical for symmetry reasons, and proportional to the square of the Clebsch-Gordan coefficient (equal to 2 3), indicated in the figure for the +1 2 +1 2 transition. Consequently, they are both equal to 2~ 3. This means that as one moves along axis, the positions of the 2 Zeeman sublevels ~ and ~ 3 1 2 and +1 2 oscillate, with opposing phases, between the values (taking the energy of the unperturbed ground state equal to zero). 4-d.

Optical pumping

Let us focus on a point where the laser field polarization is + and there is an atom in state +1 2 . The atom can absorb a + photon and end up in state +3 2 . From this state, it can only fall, by spontaneous emission, back to its initial state +1 2 ; optical pumping (§ 1-b of Complement CXIX ) does not lead, in this case, to any population change. On the other hand, if the atom is initially in state 1 2 and absorbs a photon + that brings it to state +1 2 , it can then fall back, by spontaneous emission, into state +1 2 ; optical pumping takes place from the least shifted sublevel 1 2 towards the most shifted sublevel +1 2 . A comparable situation is found at a point where the laser field polarization is . Optical pumping can only occur from the least shifted sublevel +1 2 toward the most shifted sublevel 1 2 . As for a point where the laser field polarization 2157

COMPLEMENT DXX



is linear, since the Clebsch-Gordan coefficients of the 1 2 1 2 and +1 2 +1 2 transitions are equal, as are those of the 1 2 and +1 2 +1 2 1 2 transitions, optical pumping cannot favor one of the the populations of the 2 sublevels 1 2 and . To sum up, optical pumping can only transfer population from the least shifted +1 2 sublevel to the most shifted sublevel, with a maximum efficiency at points where the laser field polarization is circular. 4-e.

Sisyphus effect

We now show how the correlations between the light shifts and the optical pumping effects studied in the last two sections can reduce the atom’s kinetic energy, and hence cool it down. Figure 5 shows, for an atom moving along the z axis, the energies of its 2 sublevels 1 2 and +1 2 , shifted by the light. Let us assume the atom starts from the bottom of a potential valley, at a point where the laser field polarization is + , and is initially in its most shifted state +1 2 . As it moves toward the right, it climbs a potential well, and looses some kinetic energy. If the optical pumping time is long enough, it will have time to reach the top of the hill, where the laser field polarization is ; it then has a high probability to undergo an optical pumping cycle and be transferred to the most shifted sublevel, which is now sublevel 1 2 . The whole cycle we just described can repeat itself, and each time the kinetic energy of the atom is lowered by a quantity of the order of the maximum energy difference between the two sublevels in Figure 5, equal to (2 3)~ . The atom is facing a situation similar to that of the hero of Greek mythology, Sisyphus: it must endlessly climb a potential hill since it is sent back to the bottom as soon as it reaches the top, hence the name Sisyphus effect given to this mechanism. The temperature reached by such a mechanism can be estimated by a simple

Figure 5: Principle of Sisyphus cooling: an atom in state +1 2 moving from a point where the laser field polarization is + must climb a potential hill of height 2~ 3, which decreases its kinetic energy. When it reaches the top of the hill, where the laser field polarization is , it has a strong probability to fall back, by spontaneous emission, to the state 1 2 . As the cycle repeats itself, the atom is for ever climbing potential hills, like the hero Sisyphus in Greek mythology. Its kinetic energy diminishes constantly. 2158



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

argument. The atom’s kinetic energy decreases during each Sisyphus cycle, until it is low enough for the atom to be trapped at the bottom of a potential well. At that point its kinetic energy must be of the order of ~ . The ultimate temperature that can be reached by Sisyphus cooling is expected to be ~ . In laser cooling experiments, the laser intensities are generally low and the atomic transitions are not saturated; consequently, ~ ~Γ and hence . This explains why the measured temperatures can be two orders of magnitude lower than the Doppler temperature, and reach values of the order of 10 6 , opening the way to numerous applications. All these qualitative predictions have been confirmed by more quantitative models, see ([61]) and ([62]). Experiments have confirmed the theoretical predictions, in particular those concerning the dependence of on the various experimental parameters, such as laser intensity and detuning [63]. 5.

Non-destructive detection of a photon

Consider now an experiment where the atoms of a beam cross, one after the other, a cavity containing radiation whose quantum state is described by a Fock state; the number of photons in the cavity mode is fixed, equal for example to 0 (radiation vacuum) or 1 (single photon). The atoms are prepared in a coherent superposition of the two states and : in

=

1 ( 2

+

)

(2)

While each atom interacts with the photons, its levels undergo light shifts resulting in different phases for the two atomic states as the atom crosses the cavity; note, however, that if the detuning between the laser frequency and the atomic frequency is sufficiently large, no photon will be absorbed or emitted. As the atom exits the cavity, the radiation state is the same initial Fock state, whereas the atomic state is modified by this phase factor. The final atomic state can be written (within a global phase factor of no physical significance): fin

=

1 ( 2

+

)

(3)

The phase is simply the integral over time of the energy difference between the dressedatom levels that come into play as the atom crosses the cavity. It is given by the energy diagram of the dressed-atom. Figure 8 of Complement CXX shows that the gap between states 0 and 0 is reduced from ~ 0 to ~( 0 ) by the light shifts. For a cavity with no photons 0 ( = 0), when the atom exits the cavity, the coherence between its states and has been dephased by: 0

=

0(

)d

(4)

where 0 ( ) is obtained by replacing in (42) of Complement CXX the Rabi frequency Ω1 by a function of time that accounts for the motion of the atom in the cavity mode, where it is subjected to a time-dependent light intensity; remember that we assumed the 2159

COMPLEMENT DXX



detuning between the atomic frequency and the laser frequency to be large enough for no real photon absorption by the atom to occur4 . If now the cavity contains one photon ( = 1), Figure 8 indicates that the gap between states 1 and 1 is reduced by the light shifts to ~( 2 0 ), i.e. to ~( 3 ). When the atom exits the cavity, the 0 0 coherence between its states and is now shifted by three times the amount obtained in (4). This means that an atom, traversing the cavity in a superposition of states and , keeps in the phase of that coherent superposition a trace of the number of photons present in the cavity; this occurs without any photon absorption (since the detuning is too large). To sum up, if = 0, the state of the atom at the cavity exit is: fin (

= 0) =

whereas if fin (

1 ( 2

+

0

)

(5)

1

)

(6)

= 1, this state is: = 1) =

1 ( 2

+

with

1 = 3 0. How can we make use of this trace left on the atom by the possible presence of a photon in the cavity, and determine if this cavity contains zero or one photon? The time taken by the atom to cross the cavity can be adjusted by changing the atom’s speed. Imagine that this time is tuned so that 0 1 = ; this means that the two states (5) and (6) are now orthogonal. As the atom exits the cavity, we can apply to it a 2 laser pulse adjusted to transform ( = 0) into . That same pulse will transform ( = 1) into the state orthogonal to , that is to . This means that measuring the atomic state after this 2 laser pulse allows concluding that = 0 if the atom is found in state , and that = 1 if the atom is found in state . The measurement can be repeated several times by sending a stream of atoms, one after the other, and applying to each of them the same procedure; one can measure several times in a row the same value , which proves the number of photons in the cavity did not change during the measurements. As opposed to photoionization where a photon is absorbed giving rise to a photoelectron (Complement AXX ), this method is non-destructive: the presence of the photon is detected without it being absorbed. This experiment, generalized to the case where the photon number is larger than one, is described in more detail in reference [64].

Conclusion. For a long time, light shifts have been considered as an interesting physical phenomenon without specific applications, and even as an undesired perturbation for high resolution spectroscopy, since they modify the atomic transition frequencies one is trying to measure with the highest possible precision. These shifts must be taken into account to extract from the measurements the non-perturbed frequencies of atomic transitions; most of the 4 We

also assume that the field variation encountered by the atom as it crosses the cavity is slow enough for non-adiabatic transitions from to , or from to , to be highly improbable. We also suppose that the natural width Γ of the excited state , and the time the atom takes to cross the cavity, are small enough for Γ 1, meaning spontaneous emission from state does not have time to occur while the atom crosses the cavity.

2160



LIGHT SHIFTS: A TOOL FOR MANIPULATING ATOMS AND FIELDS

time, several measurements at different light intensities must be performed to extrapolate the results to zero light intensity. This complement clearly shows how much the situation has changed, by presenting the large variety of experimental methods using light shifts of atomic energy levels, and their great number of applications. These methods were implemented more than 20 years after these shifts were theoretically predicted and experimentally demonstrated; this illustrates the long term practical impact of fundamental research. These methods allow acting both on the internal and external atomic variables; they also permit using atoms as a very sensitive non-destructive probe for the properties of a field composed of only a few photons. These methods made it possible to trap atoms in a standing laser wave, or to obtain periodic lattices of neutral atoms trapped in such a wave. It also led to laser cooling methods that allowed reaching temperatures previously totally inaccessible for atomic gases, millions of times lower than the lowest temperatures found in the interstellar or intergalactic space of the Universe.

2161



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

Complement EXX Detection of one- or two-photon wave packets, interference

1

2

3

4

5

One-photon wave packet, photodetection probability . . . . 1-a Photoionization of a broadband detector . . . . . . . . . . . 1-b Detection probability amplitude . . . . . . . . . . . . . . . . 1-c Temporal variation of the signal . . . . . . . . . . . . . . . . One- or two-photon interference signals . . . . . . . . . . . . 2-a How should one compute photon interference? . . . . . . . . 2-b Interference signal for a one-photon wave packet in two modes 2-c Interference signals for a product of two one-photon wave packets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Absorption amplitude of a photon by an atom . . . . . . . . 3-a Computation of the amplitude . . . . . . . . . . . . . . . . . 3-b Properties of that amplitude . . . . . . . . . . . . . . . . . . Scattering of a wave packet . . . . . . . . . . . . . . . . . . . 4-a Absorption amplitude by atom B of the photon scattered by atom A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-b Wave packet scattered by atom A . . . . . . . . . . . . . . . Example of wave packets with two entangled photons . . . 5-a Parametric down-conversion . . . . . . . . . . . . . . . . . . . 5-b Temporal correlations between the two photons generated in parametric down-conversion . . . . . . . . . . . . . . . . . . .

2165 2165 2166 2167 2167 2168 2168 2170 2174 2174 2175 2176 2176 2177 2181 2181 2183

Introduction In Chapter XX the initial and final states of the atom + photon(s) system were chosen as states that, in the absence of interaction, had a well defined energy; before the interaction, such states do not evolve in time, as if the photon were not propagating in space. As an example, in the scattering process of a photon by an atom, the chosen initial radiation state is a photon with momentum ~k and energy ~ = ~ , which spreads over the entire space; similarly, the final state is also a photon with momentum ~k and energy ~ = ~ . The interaction was “turned on” at time , which allowed computing the probability amplitude for the atom + photon(s) system to go from one state to the other between and . This is clearly a phenomenological approach: what actually happens is that the interaction operator remains constant but only comes into play to change the state vector when the atom is in the presence of radiation. A more realistic description of the process should involve the propagation of wave packets, with the incident radiation being described by a wave packet initially very far away from the atom, but going towards it. Their interaction then gives rise to a scattered wave packet moving away towards infinity, while the incident wave packet, modified by the interaction, also continues on its way. 2163

COMPLEMENT EXX



Note however that introducing a wave packet for a photon cannot be done by the standard method used for a massive particle. As already pointed out at the end of § B-2 in Chapter XIX, a photon does not have a position operator. One cannot obtain its spatial wave function by projecting its state vector onto the eigenvectors of that operator, and then squaring this wave function’s modulus to get the probability of finding the photon in any given region of space. One could then imagine using the spatial variations of the electric and magnetic fields to infer the photon localization. But for radiation states with exactly one, two, etc. photons, the average value of theses fields at any point in space is zero (it is the sum of zero average value creation and annihilation operators in each mode). Consequently, for a single photon, this average value cannot be directly used for building a wave packet localized in space. This is why we shall use another approach: we shall assume the photon interacts with detectors, well localized in space, and compute the probability of its detection by these apparatus. This will lead us to introduce an amplitude for the photon detection (by photoionization) at a given point, which presents close analogies with the spatial wave function of a massive particle in non-relativistic quantum mechanics. We start in § 1 by exposing the general idea of this approach; we introduce a function (r ) that allows localizing a single photon in space through its probability of being absorbed by a broadband detector. This leads to the concept of wave packets, even though the average value of the electric field remains zero throughout the entire space. In the perturbative computations, one can also introduce initial and final radiation states that are wave packets, described by linear superpositions of photon states with different momenta and energies. In § 2, we show how the detection amplitude (r ) allows studying light interference phenomena involving one or two photons. These phenomena are interpreted in terms of interference between the transition amplitudes associated with different paths leading the quantum field from a given initial state to a given final state. Starting first (§ 2-a) with a general discussion of the interference signals, we then focus in § 2-b on the interference involving one photon being simultaneously in two modes of the field. Finally, we examine interference involving two photons in the simple case where the system is described by the product of two one-photon wave packets (§ 2-c). In § 3, we replace the broadband detector by an atom with discrete energy levels. Without having to assume that the coupling between the atom and the field is turned on abruptly (which is hard to justify from a physical point of view), a number of results of Chapter XX are confirmed with, in addition, the possibility of studying the temporal aspect of the absorption phenomenon. In § 4, we extend this method to study the scattering of photons by an atom. Here again, we will confirm the results of Chapter XX, while enriching our understanding of the temporal aspects of the physical process. Finally, in § 5, we consider “real” two-photon wave packets that are entangled wave packets. Parametric down-conversion is an example of a situation leading to strong temporal correlations between two entangled photons. Such correlations are impossible to understand in terms of a classical treatment of the radiation. 
In this entire complement, we have limited our studies to one- or two-photon wave packets, but the computations can be extended to wave packets containing a larger number of photons1 . 1 Or

2164

even an undetermined number, as in a coherent state.

• 1.

DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

One-photon wave packet, photodetection probability

A one-photon wave packet is described by a state , where the photon number is precisely equal to one (eigenstate of the photon number operator, with eigenvalue equal to 1). We build this wave packet2 as a linear superposition of states with different momenta ~k: =

d3

(k)

k

d3

0 =

(k) k

(1)

State is not stationary (it is not an energy eigenstate). It is of the type considered in § B-3-c of Chapter XIX, an eigenket of operator ˆ (total number of photons) with an eigenvalue equal to one. It is assumed to be normalized: d3

(k) 2 = 1

(2)

which allows interpreting be equal to ~k. 1-a.

(k) 2 as the probability density for the photon momentum to

Photoionization of a broadband detector

Imagine we place an atom playing the role of a photodetector at point r in the radiation field described by state (1). According to relation (26) of Complement BXX , the probability (r )d for observing a photoelectron emission between times and + d is given (to the interaction’s lowest order) by: (r ) =

( )

(r )

(+)

(r )

(3)

where is a constant depending on the photodetector sensitivity. ( ) (r ) and (+) (r ) are the negative and positive frequency components of the electric field operator appearing in its plane wave expansion as given by (A-7) in Chapter XIX: (+)

(r ) =

( )

(r )

=

d3 (2 )3

2

~ 2

(k)

(k r

)

(4)

0

with: =

k =

(5)

where is the speed of light. Remember that expression (3) was established in the interaction representation, where the state vector evolves only under the effect of the atom-radiation interaction; the operators evolve freely only under the effect of the atomic or radiation Hamiltonians (i.e. without mutual interaction). However, as we are performing a computation to lowest order, we can consider in (3) that is actually

2 For the sake of simplicity, we ignore in this complement the degrees of freedom of the radiation polarization, which do not play a significant role in the effects under study. This amounts to assuming that all the vectors k appearing in (1) have almost the same directions and that they are all associated with the same polarization ε.

2165

COMPLEMENT EXX



constant. The annihilation operator (+) (r ) in (3) acting on the one-photon state yields the vacuum, and we can rewrite (3) as: ( )

(r ) =

(r ) 0 0

(+)

(r )

=

0

(+)

(r )

2

(6)

that is: (r ) =

d3 (2 )3

2

~ 2

2

(k)

(k r

)

(7)

0

For a massive particle of mass , the probability to find it at point r and at time 2 is given by the squared modulus Ψ(r ) of its wave function Ψ(r ). This wave function is the Fourier transform of the probability amplitude (k) that a measurement of the particle’s momentum gives the value ~k. For a free particle, this probability amplitude (k) has a time variation in . Equality (7) is thus reminiscent of this probability for a massive free particle; however, in view of the factor (proportional to ) in front of (k) in the integral of (7), (r ) is not proportional (at a given instant ) to the modulus squared of the spatial Fourier transform of (k ) = (k) . This means that, limiting ourselves to one-photon states, we can indeed consider the function (k) appearing in (1) as a wave function in momentum space, since (k) 2 is a probability density for the photon momentum. However, the probability to detect a photon at point r and at time with a photodetector is not simply the modulus squared of the Fourier transform of that “wave function in momentum space” (k) . This confirms that it is not possible, for a photon, to introduce a spatial wave function that is exactly equivalent to that of a massive particle. 1-b.

Detection probability amplitude

The right-hand side of (6) is the squared modulus of the function: (r ) = 0

(+)

(r )

(8)

which plays an important role in all the computations to follow; it has the dimensions of an electric field. For the wave packet written in (1), its expression is: (r ) =

d3 (2 )3

2

~ 2

(k)

(k r

)

(9)

0

It should not be confused with the average value in state of the operator (+) (r ) written in (4), since that average value is zero, as we mentioned above. Nor is it, as already pointed out, a wave function for the photon in position space; it is a probability amplitude for the detection (and not the presence) of the photon at point r and time . When we mention, in this complement, the space time wave packet associated with the photon in state (1), we will always be referring to the amplitude (8). Note, however, that in the particular case where the function (k) in momentum space is well centered around a value with a dispersion ∆ very small compared to , one can neglect in (7) the variation of and replace by ; the integral in (7) will therefore involve the spatial Fourier transform of (k) . This approximation will often be used in what follows. 2166



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

Comments: (i) In this section, we have ignored the radiation polarization, assuming that all the plane waves in relation (1) are associated with the same polarization vector ε. When this is not the case, the detection amplitude becomes a three-component vector function. Its component along any given axis yields the detection probability amplitude by a photodetector preceded by a polarization analyser letting through only the light polarized linearly along that axis. This vector detection amplitude is similar to the wave function of a spin 1 particle, which also has three components. (ii) In this complement, we shall study only wave packets containing a well defined photon number, 1 or 2, which allows directly generalizing the computations of Chapter XX. Wave packets can, however, be built many different ways, without exactly fixing the photon number. It often happens, for example, that one wishes to reproduce a classical field for which each field mode k has a given amplitude (k); it is then natural to use a state where each quantum mode is in a coherent state with eigenvalue (k). In that case, only the average photon number is fixed, not its exact value. 1-c.

Temporal variation of the signal

When the photodetector is placed at r = 0, it delivers a signal that is given, according to (7), by: 2

(r = 0 )

d3

(k)

(10)

This signal is proportional to the squared modulus of the Fourier transform of (k). 2 Let us assume that ( ) is a real positive function of , and that ( ) is a function of centered at = , with a width ∆ . If this width ∆ of the wave packet is very small compared with the average wave number , one can replace by , and the signal becomes proportional to the modulus squared of the Fourier transform of (k). At time = 0, and since we assumed all the ( ) to be positive, all the waves forming the wave packets are in phase and the signal observed on the photodetector is maximum; it is zero for = , and takes on significant values only during a time interval ∆ 1 ∆ around = 0. This signal describes, in a way, the passage of the wave packet at the detector’s position. To study the detection probability at a point r = 0, we just have to replace ) by (k r in the integral of relation (10). As an example, imagine we have a one-dimensional wave packet, all the wave vectors k being parallel to the axis ( ) ( = = 0); the exponential reduces to . The phenomena observed at a point in space of coordinate = 0 are thus deduced from those observed at = 0 by a simple time shift equal to : the wave packet moves along the direction with velocity and without any deformation. 2.

One- or two-photon interference signals

We now discuss in terms of wave packets what happens in light interference experiments involving one or two photons. 2167

COMPLEMENT EXX

2-a.



How should one compute photon interference?

In non-relativistic quantum mechanics, a particle with a non-zero mass is described by a wave function (r ) whose squared modulus (r ) 2 yields the probability density of finding the particle at point r and time . In a Young’s type interference experiment, the wave function, after going through the two slits pierced into a screen, is a linear superposition of two wave functions 1 (r ) and 2 (r ) originating from the two slits. These two waves overlap in a region of space where the probability density of finding the particle at point r and time , which is equal to 1 (r ) + 2 (r ) 2 , contains a term 2Re 1 (r ) 2 (r ) oscillating in space and time; this results in interference fringes. However, we recalled in § 1 why we cannot, in general, introduce a spatial wave function for a photon that would be strictly analogous to (r ), and whose squared modulus would yield the probability density for the presence of the photon at a given point. This led us to define an amplitude (r ) in (8), whose squared modulus yields the probability density for photodetecting the photon at point r and time . We are going to show in § 2-b that such amplitudes can actually be used to interpret interference fringes; as an example, we shall study the fringes appearing in the single photodetection signal (r ) observed on a one-photon wave packet after it goes through a screen pierced with two slits. As already underlined, it is important not to confuse (r ) with the average value of the electric field in the quantum state under study – which in any case is zero in a one-photon state. In classical electromagnetism, the electric (or magnetic) fields directly interfere; in quantum electromagnetism, one must reason in terms of probability amplitudes. For field states containing at least two photons, the double photodetection signal (r r ) is different from zero. To interpret it in the simplest possible case, we assume (in § 2-c) that the radiation is described by a tensor product of two onephoton wave packets3 . We will show that interference fringes observable on can also be interpreted in terms of products of detection probability amplitudes; these fringes result from interference between transition amplitudes associated with two different paths leading the field from its initial state (where it contains two photons) to the vacuum. Here again, one should reason in terms of interference not directly between average values of electric or magnetic fields, but between paths. 2-b.

Interference signal for a one-photon wave packet in two modes

We start with the simplest photon interference experiment, the well-know Young’s double slit experiment, but in a case when only one photon at a time passes through the screen pierced with the two slits. The state vector of this single photon is then the sum of two components associated with the passage through one or the other of the two slits. When the photon reaches the interference region, these two components are associated with two different radiation modes.

3 A simple example of a two-photon entangled state, which is not a product of two one-photon states, will be studied in § 5.

2168

• .

DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

One-photon wave packets, after passage through a two-slit screen

We now focus on the radiation state after it passed the two-slit screen. This state is described by a one-photon wave packet, which, in the interaction representation, is of the form: =

1

+

1

2

with:

2

1

2

+

2

2

=1

(11)

In this expression, the ket 1 describes the wave packet emerging from the first slit; as in (1), it can therefore be written with a function 1 (k) that is peaked around the value k1 . The ket 2 describes the wave packet emerging from the second slit, and its function 2 (k) is peaked around the value k2 . Since before going through the slits the two wave packets came from the same source, they must be centered around a common frequency = 1 = 2 ; consequently k1 and k2 have the same modulus, but their directions can be different. We shall finally assume that the wave packets emerging from the two slits arrive at the same time in the interference region (meaning the optical paths along the two trajectories are equal) and that each wave packet is sufficiently long for the frequency to be well defined. As in (8), for each wave packet we introduce a detection amplitude (r ): 0

(+)

(r )

=

(r )

where:

=1 2

(12)

In the interference region, we assume the two modes4 to be close to plane waves with wave vectors k1 and k2 . We then set: (r ) =

(k r

(r )

where the function ) nential (k r . .

)

(13)

(r ) has a much slower space and time variation than the expo-

Calculation of the single photodetection signal

We assume that the field is contained in a box of volume 3 ; we use a complete orthonormal set of field modes, with wave vectors k , which includes both k1 and k2 . Relation (B-3) of Chapter XIX indicates that the positive frequency component of the electric field can be written 5 as: (+)

}

(r ) =

2

0

(k r

)

(14)

3

with = . When this operator acts on the ket (11), all the terms lead to a zero result, except for the = 1 and = 2 terms. For these two terms, we have 12 = 0 , so that (11) and (12) lead to: (+)

(r )

=

1

1 (r

)+

2

2 (r

) 0

(15)

4 Another possibility would be to use Gaussian wave packets with the same focal point, having in the vicinity of that point plane wave structures with wave vectors k1 and k2 , and lateral extensions very large compared to the wavelengths 2 1 and 2 2. 5 For the sake of simplicity, we ignore polarization variables of the field.

2169



COMPLEMENT EXX

The probability for detecting the photon at point r and time square of norm of this ket, written as: ( )

(+)

(r )

is proportional to the

2

(r )

=

1 1 (r

)+

2 2 (r

)

(16)

The equality includes square and cross terms. The square terms can be written, taking (13) into account: (r ) 2 =

(r ) 2

(17)

and they vary slowly as a function of r and . The crossed terms are expressed as: 1 2

1 (r

)

2 (r

) + c.c. =

1 (r

1 2

)

2 (r

) exp [(k1

k2 ) r)] + c.c.

(18)

and exhibit spatial modulations characteristic of interference phenomena (c.c. stands for complex conjugate). .

Discussion

Relation (16) shows that the photodetection signal is the squared modulus of the sum of two amplitudes, 1 1 (r ) and 2 2 (r ), which interfere. Amplitude 1 1 (r ) is the amplitude for detecting at point r and time the photon in mode 1 ; it is equal to the amplitude 1 of finding the field in state 1 , multiplied by the amplitude 1 (r ) for detecting the photon at point r and time when the field is in state 1 . The amplitude 2 2 (r ) is interpreted in a similar way. During the detection process, the field goes from state written in (11) to the vacuum state 0 following two possible paths: the photon is absorbed either while in mode 1 , or while in mode 2 . As nothing allows deciding which path the system followed, the two corresponding amplitudes interfere. This confirms what we stated above: in the quantum theory of radiation, the interference fringes observed on a photodetector signal are associated with the interference, not between two classical electromagnetic waves, but rather between two transition amplitudes corresponding to different paths (leading the system from the same initial state to the same final state). 2-c.

Interference signals for a product of two one-photon wave packets

Let us generalize this type of interpretation, in terms of transition amplitudes, to interference experiments involving two photons and where one measures correlations between signals coming from two photodetectors. .

State vector for the two photons

We now assume the field contains two photons, and can be described as the product of two wave packets such as the one written in (1): 12

=

d3

1 (k)

d3

2 (k

)

k k

0

(19)

Is it possible to observe spatial and temporal modulations on the signals and coming from one or two detectors placed in that field? We are going to show that the 2170



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

answer to that question is “no” if one considers only single photodetections, but “yes” if one takes into account the correlations between double photon detections. We assume the two wave packets in (19) to be well separated: there exists no overlap between the domain 1 where the function 1 (k) is different from zero and the domain 2 where the function 2 (k) is non-zero. Let us introduce the matrix element that generalizes relation (8) to the two-photon case: 0

(+)

(+)

(r )

(r

)

(20)

12

We now insert in that expression the plane mode expansion (14) for both electric field operators. To yield a non-zero result, the annihilation operators appearing in these fields must act on a mode that, according to (19), contains one photon. This means that either the mode selected in (+) (r ) belongs to the 1 domain and the one selected in (+) (r ) to the 2 domain, or the inverse. In the first case, the scalar product of the vacuum bra and the modes that came into play yields the detection amplitude 2 (r ) associated with the second wave packet, multiplied by the detection amplitude 1 (r ) associated with the first one. In the second case, the wave packets are inverted. The final result is: 0

(+)

(+)

(r )

(r

)

12

=

1 (r

)

2 (r

)+

2 (r

)

1 (r

)

(21)

where the functions 1 and 2 are the detection amplitudes associated, as defined in (8), with the two individual wave packets included in 12 . .

Single photodetection signal

(r )

To get the single photodetection signal, we first compute the result of the action on state (19) of the field positive frequency component (+) (r ). As we argued above, to yield a non-zero result, this operator must destroy a photon, either in a mode for which 1 (k) is non-zero, or in a mode for which 2 (k) is non-zero. In the first case, the summation over all the modes involved reconstructs the function 1 (r ) multiplied by the vacuum ket associated with these modes; the modes of the other wave packet remain unchanged. In the second case, the two wave packets exchange roles and it is now the function 2 (r ) that is reconstructed. This leads to: (+)

(r

) 1 (r

12

)

= d3

2 (k

)

k

0 +

2 (r

)

d3

1 (k) k

0

(22)

The probability per unit time to detect a photon at point r and time is the square of this ket’s norm. The terms in 1 (r ) 2 and 2 (r ) 2 contain the square of the two wave packets’ norms, each equal to one; these terms do not oscillate, neither in space, nor in time. The cross terms are the only ones that could yield spatial and temporal modulations; they contain, however, the scalar product of the two wave packets, which is zero since we assumed the wave packets were orthogonal (there is no overlap between the two 1 and 2 domains). This means that, when the field is described by state (19), no interference fringes are observable in the signal of a single photodetector. The interpretation of this result is similar to the one we gave before. The system can follow two paths: either an absorption of a photon from the first wave packet, or an 2171

COMPLEMENT EXX



absorption of a photon from the second. However, as opposed to what happens when the system started from the initial state (11), the final state of the field is not the same for these two paths: if a photon from the first wave packet has been absorbed, the final state includes a photon from the second wave packet. Consequently, the two final states associated with the two paths are orthogonal, and observing the field’s final state one could (in principle) determine which path the system has followed; this is why the two amplitudes cannot interfere. Comment: One could consider other states for the two modes, each containing several photons, as for example states 1 2 where each mode is in a coherent state, characterized by a classical normal variable, 1 for mode k1 , 2 for mode k2 . We then know that state (+) (+) (r ) with an eigenvalue value cl ( 1 r ) equal 1 is an eigenstate of operator to the positive frequency component of the classical field in mode k1 , corresponding to the classical normal variable 1 (Chapter XVIII, § B-2). A similar result is valid for state 2 : (+)

(r )

=

(+) cl (

r )

=1 2

(23)

This leads to: (+)

(r )

1

2

=

(+) cl ( 1

r )+

(+) cl ( 2

r )

1

The probability of detecting a photon at point r and time norm of ket (24). It is proportional to: (+) cl ( 1

r )+

(+) cl ( 2

(24)

2

is equal to the square of the

2

r )

As this is the squared modulus of the sum of two classical fields, it is the usual interference signal of classical fields. As opposed to what we found before for the radiation state (19), when the two modes are in coherent states, the one-photon detection signal exhibit interference. This is an illustration of the quasi-classical character of coherent states.

.

Double photodetection signal

r

(r

)

Assuming, as above, the field initial state is given by (19), we now focus on the probability (r r ) (per double unit time) that a detector, placed at r , detects a photon at time and that another detector, placed at r , detects a photon at time . This probability is proportional to the correlation function (Complement BXX , § 2-d): 12

( )

(r

( )

)

(r

)

(+)

(r

)

(+)

(r

)

(25)

12

Since 12 contains only two photons, we can insert in the middle of this expression the projector onto the vacuum state, which leads to the squared modulus of expression (21). We obtain: (r 2172

;r

)

2 (r

) 1 (r

)+

1 (r

) 2 (r

)

2

(26)



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

In addition to the square terms 2 (r have a slow variation with r , r , , 1 (r

)

1 (r

) 2 (r

)

2 (r

2

) 1 (r ) and 1 (r , we get cross terms:

) 2 (r

2

) , which

) + c.c.

which do have spatial and temporal modulations. If, for example, the first wave packet is centered around the values k1 and 1 , and the second one around the values k2 and 2 , these modulations are of the form: exp

k2 ) (r

(k1

r )

(

1

2 )(

)

+ c.c.

(27)

This result is not in contradiction with the fact that the probability of detecting a photon at r , , or at r , varies slowly with these variables: once a first photon has been detected at r , the probability to detect another one at r varies sinusoidally with r r and . (i) Discussion The amplitude whose squared modulus appears on the right-hand side of (26) is the sum of two amplitudes associated with two possible paths leading the system from the initial state 12 (containing two photons) to the same final state 0 (where all the modes are empty). Along the first path, with amplitude 2 (r ) 1 (r ), the k1 mode photon is absorbed at r and the k2 mode photon is absorbed at r . Along the second path, with amplitude 1 (r ) 2 (r ), the opposite happens: the k2 mode photon is absorbed at r and the k1 mode photon is absorbed at r . As explained before (§ 2-b), interference occurs between the different transition amplitudes associated with two possible paths leading the system from the same initial state to the same final state, as long as there is no way one can determine which path is actually followed. (ii) Another interpretation The photodetection signal (25) can also be written in the form: (+)

(r

)

(+)

(r

)

12

)

(28)

with: =

(+)

=

1 (r

(r )

d3

2 (k

)

k

0 +

2 (r

)

d3

1 (k) k

0

(29)

where, in the second line, we used relation (22). Signal (28) can be interpreted as the probability of detecting a photon when the field is that state where the photon has a probability amplitude 1 (r ) to be in the wave packet with amplitude 2 (k ), and a probability amplitude 2 (r ) to be in the other wave packet with amplitude 1 (k). This situation is quite similar to that encountered in § 2-b- , where we showed that the photodetection probability of a photon in state (11) exhibits modulations. In other words, we started from a state 12 with no coherence. It is the detection of a first photon that introduces the state (29) where a second photon is now in a coherent superposition, the coherence arising from the fact that the detected photon can come either from the first wave packet, or from the second. The coefficients of the 2173



COMPLEMENT EXX

superposition (29) depend on the point r and the instant where the detection of the first photon occurred. In this description of the phenomena, it is the first detection that introduces quantum correlations between the two modes, and the dependence of these correlations on r and explains why the probability of the second detection oscillates as a function of r r and . 3.

Absorption amplitude of a photon by an atom

We now replace the broadband photodetector by an atom with two discrete levels, a ground level and an excited level . This atom is placed at r = 0, and interacts with the same wave packet as that written in (1). We propose to compute the probability amplitude for the atom, initially in state , to absorb the incident photon and be found in state at time . 3-a.

Computation of the amplitude

The initial and final states of the process under study are: in

=

;

fin

= ;0

(30)

since the absorption of the photon transfers the radiation from state to the vacuum 0 . According to relation (B-4) in Chapter XX, the amplitude we are looking for is, to first order in : fin

¯(

)

in

=

1 ~

d

¯ ( )

fin

(31)

in

where the bar above the operators indicates they are expressed in the interaction picture, with respect to the non-perturbed Hamiltonian. The interaction Hamiltonian is given by6 : (+)

=

(r = 0)

(32)

The matrix element of ¯ ( ) appearing in (31) equals: fin

¯ ( )

in

=

0

0

(+)

(r = 0

)

(33)

In this equality, = , 0=( ) ~ is the frequency of the atomic transition, and (+) (r = 0 ) the electric field positive frequency component in the interaction representation. Using notation (8) for the matrix element on the right-hand side of (33), we get: fin

¯ ( )

in

=

0

(r = 0

)

(34)

which allows rewriting the absorption amplitude (31) as: fin

¯(

)

in

=

d

0

(r = 0

)

(35)

~

6 As we have ignored the radiation polarization degrees of freedom, we also ignore here the vector character of the atomic dipole D. Operator appearing in (32) is actually the projection of D onto the radiation polarization vector.

2174



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

The quantity (r = 0 ) that appears in this expression can be considered to be the interaction Hamiltonian between the atomic dipole and a classical field (r = 0 ). The field (r = 0 ) thus appears as the classical field that would yield the same transition amplitude between the two atomic states and as a quantum field, when the radiation is in a one-photon wave packet described by (k). 3-b.

Properties of that amplitude

Let us return to the wave packet in infinite space (1) and use relation (8) to get the probability amplitude for the detection of the photon at point r = 0. Describing the field operator (4) with an integral over d3 instead of d3 , and using it in (1), the commutation of operators (k ) and (k) leads to a (k k ) function. This leads to: (+)

0

(r = 0

)

= (r = 0

)=

(2 )3

d3

2

} 2

( )

(36)

0

where = . As ( ) is centered around an average wave vector , assumed very large compared to the width ∆ of ( ), amplitude (36) can be written as: (r = 0

)=

( )

(37)

which is the product of a carrier wave of frequency = by an envelope ( ). This latter function has a variation that is slower than the preceding exponential; it can, for example, follow a bell-shaped curve, centered at = 0 and with width ∆ 1 ∆ . Inserting (37) into (35), we get: fin

¯(

)

in

=

d

(

)

0

Ω1 ( )

(38)

where Ω1 ( ) is the instantaneous Rabi frequency defined as: ~Ω1 ( ) =

( )

(39)

Equation (38) allows understanding the behavior of the absorption amplitude of the photon when increases from to + . As long as ∆ , both functions ( ) and Ω1 ( ) are zero; the incident wave packet has not yet reached the atom’s vicinity and no photon absorption can occur. As increases from ∆ 2 to +∆ 2, the wave packet crosses the atom, and the integral in (38) becomes larger. When +∆ , the wave packet has left the atom; the absorption amplitude remains constant and equal to: + fin

¯ (+

)

in

=

d

(

0

)

Ω1 ( )

(40)

This expression yields the probability amplitude for a photon to have been absorbed once the wave packet has crossed the atom. This confirms the results of Chapter XX, but in the present approach we did not have to artificially introduce any initial or final times for the process. Let us evaluate an order of magnitude for the amplitude (40). Assume first that = 0 (resonant wave packet). The integral in (40) is then of the order of Ωmax ∆, 1 2175

COMPLEMENT EXX



where Ωmax is the maximum value reached by the Rabi frequency when the atom is at 1 the center of the wave packet, and the envelope ( ) takes on its largest value. When = 0 (off-resonance wave packet), the absorption amplitude is weaker. According to (40), this amplitude is actually the Fourier transform of Ω1 ( ) at frequency 0 . This result simply expresses energy conservation: for the incident photon to be absorbed, its frequency must be equal to the atomic transition frequency. However, as the field envelope varies over time intervals of the order of ∆ , the photon average frequency does not have to be strictly equal to the atomic frequency; the two frequencies must be equal to within ∆ 1 ∆. 4.

Scattering of a wave packet

We now study a process involving two atoms: a wave packet impinges on an atom placed at r on the axis; after interacting with it, the wave packet is scattered in all directions, and then interacts with a second atom placed at r . The incident wave packet, propagating along the direction, is described by the function (k) . As before, we have two main goals. The first one is, while assimilating atom with a device for measuring the photon scattered by , to confirm the interpretation of (r ) as a detection amplitude of a photon at point r . The second goal is to study the time dependence of the scattering process itself. We shall first study the spatial and temporal dependence of the scattered wave packet, in particular when the central frequency of the incident wave packet is close to the resonant frequency 0 of the scattering atom. We shall then compute the probability amplitude for the scattered wave packet to have excited at time the atom from its ground state to its excited state . As in § 1, we will associate with this amplitude a spatial wave packet describing the passage of the scattered wave packet by point r . 4-a.

Absorption amplitude by atom B of the photon scattered by atom A

We first consider a photon with a wave vector k parallel to the axis, and a frequency = . We are looking for the probability amplitude k k for this photon to be scattered by atom A located at r from state k to state k . This amplitude is given by relation (E-3) of Chapter XX, where we only take into account the resonant processes (we assume the incident photon frequency to be fairly close to the atomic resonant frequency): k

k = =

fin

¯ (∆ 0)

2

(

in )

in (∆ )

(

fin

(∆ )

in )

(

)

(41)

( in ) is obtained from relation (E-4) in Chapter XX (here again we In this relation, assume that the radiation is contained in a box of volume 3 ): (

in )

=

} 2

ε 0

3

D +~

ε

D

(k

k )r

(42)

where we have assumed that only one level contributes, which explains why the sum over has been suppressed; this is correct if the frequency of the radiation is close to the resonance frequency of one transition, but far from all the other resonances. Note 2176



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

that we have added to the right-hand side an exponential factor that comes from the spatial dependence of the electric field: in this complement, as in Chapter XX, we treat the atom’s position r classically, but we no longer assume the atom to be placed at the coordinate origin. Expression (42) is a product of two matrix elements of the interaction Hamiltonian, one for the absorption of the k photon, the other for the emission of the k photon, divided by a common energy denominator. The function (∆ ) ( ) simply expresses energy conservation, within } ∆ , for the elastic scattering process, as was the case, for example, in §§ B-1-b and E-1-b of Chapter XX. We assume that the interaction time ∆ is sufficiently long for this function to be assimilated to a real delta function ( ). The coefficient introduced in the second equality (41) is proportional to expression (42); it contains the product of two matrix elements, which depends on the polar angles and of the vector k with respect to the direction of k . We characterize this dependence by a function ( ), with: = k = k

(43)

As we assumed the frequency of the incident photon to be close to resonance, we can use the results of § C-2 in Chapter XX concerning resonant scattering. As in relation (E-11) of that chapter, we write the energy denominator in the form 0 + Γ 2, where Γ is the natural width of the excited state of the scattering atom . Amplitude (41) then becomes: k

(

k =

0

) + Γ 2

[(k

k )r ]

(

)

(44)

where

is a coefficient proportional to . We now move to the next stage, the interaction of atom B with the k photon. As in (33), it is described by a matrix element (here again, we must add an exponential factor to account for the fact that atom B is not at the coordinate origin, but at point r ): 0

0

(+)

(r

) ;k

k r

(

0

)

(45)

We are now looking for the amplitude at time of the complete process, scattering by A of the k photon, with an amplitude given by (44), and absorption by B, with an amplitude given by (45). Consequently, we multiply these two amplitudes and sum the product over all the possible k vectors for the scattered photon, and over the linear combination of states k forming the incident wave packet. 4-b.

Wave packet scattered by atom A

To study the properties of the wave packet scattered by atom A, we successively carry out the two summations. .

Summation over all possible directions of the scattered photon Let us start with the summation over k . Taking into account the function ( ) appearing in (44), the summation over the modulus of k leads to: =

=

(46) 2177

COMPLEMENT EXX



Regrouping the k dependent terms in (44) and (45), we find that the summation over the directions of k introduces the angular integral: dΩ (

)

k (r

r )

(47)

The summation over the polar angles of the exponentials describing the phase shift between r and r of all the plane waves k yields a spherical wave centered at r : dΩ (

)

k (r

r )

(

)

with

= r

r

(48)

where and are the polar angles of vector r r with respect to the direction k of the incident photon. The right-hand side of (48) is reminiscent of a classical result in collision theory – see for example relation (B-12) of Chapter VIII. This means that the sum of all the plane waves scattered by atom A located at r has the structure of an outgoing spherical wave with the same wave number as the k waves it is composed of. The amplitude of this spherical wave varies as 1 , which ensures that the outgoing energy across a sphere of radius and surface 4 2 does not depend on . The fact that the polar angles and appearing on the right-hand side of (48) are those of vector r r can be understood by stationary phase arguments. The phase cos factor k (r r ) associated with the scattered wave k is equal to , where is the angle between k and r r . Since 1, this phase factor has a very rapid variation with , except in the vicinity of points where cos is stationary with respect to , i.e. when = 0 for the outgoing wave. The angular integral (47) therefore gets most of its contribution from values of the angles that are close to the polar angles and of vector r r . Taking (48) into account, we deduce from (44) and(45): ;0 ¯ ( ) ;k

k

k

(

1 0+ Γ 2

)

k

k r

(

0

)

(49) where we have replaced .

and

by .

Summation over the energies

The initial state of the scattering process is a superposition of states k multiplied by (k ). We consider a one-dimensional wave packet, propagating along the axis. With a proper choice of the coordinate origin, we can assume atom A is located at r = 0, which amounts to replacing r by 0. As before, we assume ( ) is real, so that at = 0, the wave packet is centered at the position r = 0 of the scattering atom A. We now multiply (49) by ( ), integrate over , and over the time from to . The amplitude of the absorption by atom B, at time , of the photon scattered by atom A is therefore proportional to: d

0

d

( )

(

)

1 + Γ 2

0

2178

(50)



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

where we have replaced the integral variable by = . Let us compare (50) and (35). The integral over appearing in the integral over in (50) can be interpreted as the classical field scattered at time and at a distance from point O along an axis with polar angles : diff (

)

d

( )

(

1 0+ Γ 2

)

(51)

It will be useful for what follows to regroup the two exponentials of (50) and to set: ˜=

(52)

We then get: diff (

)=

d

( )

(

)

1

1 + Γ 2

˜

(53)

0

which means that the wave packet scattered along the direction , and that its amplitude decreases as 1 . .

moves at velocity

Spatial and temporal dependence of the scattered wave packet

We now assume that the frequency width ∆ of the incident wave packet is much smaller than its average frequency : ∆

(54)

but we do not make any hypothesis concerning the relative values of ∆ and Γ. The factor appearing in the previous relations can now be replaced by , and comes out of the integral. We can also neglect the variations with of ( ) over the 1 Γ interval where the function ( + Γ 2) varies significantly. The scattered 0 field diff ( ˜) can then been seen as the temporal Fourier transform of the product of 1 two functions ( ) and ( . This field is the convolution of the Fourier 0 + Γ 2) transforms of these two functions of . Taking into account (36) and (37), the Fourier transform of the first function is: (˜) =

( )

˜

(˜)

(55)

For the second function, we get: 1 0+ Γ 2



(˜)

Γ˜ 2

(56)

where (˜) is the Heaviside function, equal to 1 for ˜ to: diff (

˜)

1

+

d

( )

0 (˜

)



0, and to 0 for ˜

)

Γ(˜

) 2

0. This leads

(57) 2179

COMPLEMENT EXX

.



Study of two limiting cases

Two interesting cases occur when the width ∆ of the incident wave packet is either very large or very small compared to the natural width Γ of the excited state of atom A. Γ limit The incident wave packet passes through a given point in a time 1 ∆ that is very short compared to the radiative lifetime 1 Γ of the excited state. The envelope ( ) of the incident wave packet is different from zero only during a time interval much shorter 1 than the characteristic times of the Fourier transform of ( . We can thus 0 + Γ 2) set = 0 in the last two terms of (57), which yields: ∆

diff (

˜)

+

1

(

d

)

0



( )

(˜)

Γ˜ 2)

(58)

The term in the first bracket is proportional to the excitation amplitude of the scattering atom by the incident wave packet. The second bracket describes a free oscillation at the atomic frequency 0 , starting at time ˜ = 0 and damped over a time 2 Γ. The physical meaning of this result is as follows. The incident wave packet spends a very short time near atom A, and hence excites it in a percussive manner before moving away with velocity . Once the incident wave packet is gone, the atomic dipole thus excited oscillates freely at frequency 0 , until it is damped by spontaneous emission. This situation is the analog of the percussive excitation of an oscillator in classical mechanics. Γ limit We can now replace in (57) the function ( ) by (˜) as ˜ cannot be larger, in modulus, than 1 Γ. This is because of the presence of the last exponential term in (57) and the fact that, when ∆ Γ, the envelope of ˜(˜) varies very slowly over that time interval. One can then rewrite (57) in the form: ∆

diff (

˜)

(˜)

+

˜



d

)

0 (˜

)



)

Γ(˜

) 2)

(59)

Let us make the change of variable =˜ in the integral over . Taking (56) into 0 account, we see that this integral is actually the Fourier transform of ( ) Γ 2 1 calculated at , which is ( . This leads to: 0 + Γ 2) diff (

˜)

(˜)

˜

1 0

+ Γ 2

(60)

The physical meaning of this result is as follows. When ∆ Γ, the wave packet takes a long time passing atom A, whose dipole undergoes forced oscillation at the frequency . It thus emits radiation at the same frequency, with an amplitude that follows adiabatically the slow variation of the envelope (˜) of the incident wave packet; this explains the first term in (60). The second term describes the linear response of the dipole with eigenfrequency 0 and damping time 2 Γ to an excitation of frequency . In this case, the oscillator’s amplitude follows adiabatically the excitation’s amplitude. 2180

• 5.

DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

Example of wave packets with two entangled photons

In § 2-c, we considered two-photon states that were tensor products of two one-photon wave packets; the two photons described by these states were not entangled. There obviously exist a number of two-photon states that cannot be described in the form of a product of two one-photon states, and which thus describe entangled photons. In this last section, we shall focus on such an example where the entangled photons appear in an optical nonlinear process, called parametric down-conversion. This process has the advantage of producing pairs of photons bunched in time. Detecting one photon of the pair at a given time allows predicting the second photon will be detected to within a very short time interval. 5-a.

Parametric down-conversion

Computations involved in parametric down-conversion are similar to computations we already discussed. We shall simply outline the general ideas allowing a physical understanding of the process, without going into details that would unnecessarily lengthen the present complement. .

Description of the process

In § E-1 of Chapter XX, we studied the elastic scattering of a photon by an atom. Figures 2 and 3 of Chapter XX show two possible diagrams representing such a process: an incident photon with angular frequency is absorbed and an photon emitted while the atom goes back to its initial level. Energy conservation of the total system atom + photon requires that = . In the present complement, we study a nonlinear scattering process during which, as before, an incident photon of angular frequency 0 is absorbed by an atom in state , but where there are now two photons with angular frequencies 1 and 2 that are emitted. At the end of the scattering process the atom returns to state ; energy conservation now requires that 0 = 1 + 2 . Figure 1 gives two possible representations of such a process, analogous to those of Figure 2. and 3. in Chapter XX. Several temporal orders are possible for the absorption and emission processes. For example, Figures 3. and 3. of Chapter XX do not have the same temporal order for the absorption and emission occurring in the scattering of one photon. For the three-photon process considered here, including one absorption and two emissions, 3! = 6 possible temporal orders should, a priori, be considered; Figure 1 represents only one of these six possible orders. .

Scattering amplitude

The principle for calculating the scattering amplitude of parametric down-conversion is similar to the one that led us to formulas (E-3) to (E-5) of Chapter XX, but involves now three, instead of two, interactions with the field; two relay states (instead of one) come into play. As an example, for the process represented in Figure 1, we must consider the following states: initial state:

;

first relay state:

0

, with energy ; 0 , with energy

in

= rel 1

+~

0

= 2181

COMPLEMENT EXX



Figure 1: An incident photon, with angular frequency 0 is scattered by an atomic system in the initial state . At the end of the scattering process, the atomic system has returned to state , while two new photons have appeared with angular frequencies 1 and 2 . Energy conservation requires that 0 = 1 + 2 . In the left hand side of the figure, the absorption (emission) processes are represented with upwards (downwards) arrows; in the right-hand side, these processes are shown with incoming (outgoing) wiggly arrows that also symbolize the photon propagation.

second relay state: final state:

;

1

; 2

1

, with energy

, with energy

fin

rel 2

=

= +~

+~ 1

+~

1 2

The probability amplitude associated with this process is obtained by generalizing relation (E-4) of Chapter XX. Within a constant and non-significant factor, it is the product 7 of a function ( 1 + 2 0 ), imposing energy conservation , by the following expression: 3 2

} 2

0

3

0 1 2

ε2 D ( +~

2

ε1 D ε0 D )( + ~ 0 )

(61)

where ε0 , ε1 and ε2 are, respectively, the polarizations of the photons of frequencies 0 , 1 and 2 . Compared to relation (E-4) of Chapter XX, expression (61) now contains three (instead of two) matrix elements in the numerator, and two (instead of one) energy denominators containing the differences in energy between the initial state and either the relay 1 or the relay 2 state. Six similar amplitudes can be written, generalizing equations (E-3) to (E-5) of Chapter XX and corresponding to different temporal orders for the absorption and emission processes. Once they have been added, one must also sum these amplitudes over all 7 As in § 4-a, we assume the total interaction time ∆ to be sufficiently long for the function to be assimilated to a real delta function.

2182

(∆ )



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

the atomic relay states and . All the contributions to the total amplitude contain the same function ( 1 + 2 0 ). The final state of the system “atom + radiation” at the end of the scattering process is the sum over 1 and 2 of the components thus obtained, with the condition 1 + 2 = 0 . It can be written: Ψ =

(62)

where: =

(

1

+

2

0)

(k1 k2 ) k1 k2

(63)

k1 k2

with 1 2 = k1 2 . This state cannot be written as the product of two field states; it is therefore entangled (see Chapter XXI). The function (k1 k2 ) characterizing the field state is the result of the 1 and 2 dependence of the scattering amplitudes, as well as of the density of final states appearing in the summations over the continuums8 1 and 2 (summation over the moduli of the two vectors k1 and k2 ). We assume here that the energies of all the relay states are far away from any resonance, so that the (k1 k2 ) dependence on k1 and k2 does not present any kind of narrow structure. In other words, all the energy differences ∆ = in rel in the denominators of the scattering amplitudes are of the order of a fraction of an optical frequency. We mentioned in comment (i) at the end of § 1 in Complement AXX that the time spent in a relay state during the scattering process is, according to the time-energy uncertainty relation, of the order of ~ ∆ . This means that the times separating the emission of the two photons 1 and 2 cannot differ by more than a few optical periods, i.e. a few tens of femtoseconds. This qualitative argument shows that the two photons 1 and 2 are emitted quasi-simultaneously. Comment: If the interaction Hamiltonian appearing in the three matrix elements of the scattering amplitude is the electric dipole Hamiltonian, and if the atomic states have a well defined parity, this atomic parity changes with each interaction. After the three interactions, the parity is therefore changed, which forbids the final atomic state to be the same as the initial state. Consequently, the parametric down-conversion process we just studied can occur only when the atomic states do not have a well defined parity. Such a situation is encountered when the atomic Hamiltonian is not invariant upon reflection. This happens for example when the atom is inserted in a crystal where the local crystalline field, which has the symmetry of an external electric field, is not invariant upon spatial reflection. 5-b.

Temporal correlations between the two photons generated in parametric down-conversion

We now compute the double detection signal from the two photons generated in parametric down-conversion. The experiment we analyze is schematized in Figure 2. An incident pump beam, with frequency 0 , propagates along a direction with unit vector 8 Two continuums of final states come into play in this problem, but the condition reduces it to one.

1

+

2

=

0

2183

COMPLEMENT EXX



Figure 2: A pump beam with angular frequency 0 , propagating along a direction with unit vector u0 , impinges on a nonlinear crystal placed in O. The parametric down-conversion process generates two beams of frequencies 1 and 2 , with 1 + 2 = 0 . Diaphragms allow fixing the directions u1 and u2 of these two beams. The two detectors 1 and 2 register the arrivals of the photons and permit studying their temporal correlations.

u0 , and impinges onto a nonlinear crystal O containing atoms performing the conversion. Two diaphragms placed in front of the two detectors 1 and 2 allow selecting two directions, with unit vectors u1 and u2 , for the two beams generated by parametric down-conversion. We focus on the temporal, rather than spatial, aspect of the phenomenon. For the sake of simplicity, we assume that the three field states appearing in Figure 2 are plane waves, infinite in the two transverse directions. The only variable characterizing the modes involved is therefore the longitudinal component of the vector k or, equivalently, the frequency . The incident photon is described by a wave packet, characterized in 0 frequency space by a real function ( 0 ), centered at 0 and of width ∆ 0 . The center of the incident wave packet arrives at crystal O at time = 0, and passes through it in a time of the order of: ∆

1 ∆ =1 ∆

(64)

0

The two-photon wave packet generated by parametric down-conversion is described by an expression similar to (63), in which we now use the variables 1 2 = k1 2 ; this wave packet depends, to a certain extent, on 1 and 2 via the function ( 1 2 ). .

(r1 ; r2

Double photodetection signal

+ )

We first compute the absorption amplitude of the two photons9 , one at time detector 1 located at r1 , the other at time + by detector 2 located at r2 : 0

(+)

(r2

+ )

(+)

(r1 )

[k2 r2

9 The

2184

signal

= 2(

+ )]

} 2

0

(

3

[k1 r1

0 1

1

2

]

(

is the squared modulus of this amplitude.

1

0)

+

1 2

2

0)

(

1

by

2)

(65)



DETECTION OF ONE- OR TWO-PHOTON WAVE PACKETS, INTERFERENCE

In this equation, k1 and k2 are the wave vectors of the two photons propagating freely along u1 and u2 . We note 1 and 2 the distances between O and 1 , O and 2 , and define 1 = 1 and 2 = 2 as the times taken by the photons to travel these two distances. We have: k1 r1 =

1

k2 r2 =

2

1

=

1 1

2

=

2 2

(66)

so that (65) can be rewritten in the form: 0

(+)

(r2

+ )

(+)

(r1 )

=

} 2

(

3

0

0

1

2[ 2

We now replace the two variables 1

=

2

=

0

2

1

]

and

0)

1[ 1

2

1 2

(

1

2)

2

]

(

1

+

2

by a single variable

0)

(67)

by setting:

+

0

(68)

2

Condition 1 + 2 = 0 is then automatically satisfied, so that the delta function appearing on the second line of (67) is no longer necessary. The summation over 1 and , and the function ( 1 2 ) is replaced by a function 2 becomes a summation over ( ). If we assume, for the sake of simplicity, that 1 = 2 = , we finally obtain: 0

(+)

(r2

+ )

(+)

(r1 )

} 2

0

0[

0)

] 2

0[

] 2

0

(

.

(

3

0

2

+

)(

0

2

) (

)

(69)

Discussion

The dependence of the double photodetection signal is given by the summation over on the second line of (69). Going to the continuous limit, it is therefore the Fourier transform of ( 20 + )( 20 ) ( ). Nevertheless, the product 0 0 ( 2 + )( 2 ) varies very slowly with and can be taken as constant, as can be the state densities introduced when replacing the discrete summation by an integral. We also saw above (§ 5-a- ) that the variation of as a function of 1 and 2 , hence the variation of ( ), is very slow as long as no resonant (or quasi-resonant) relay states are involved in the scattering process. We must thus take the Fourier transform of a function of that has a very large width, of the order of a fraction of the optical frequency. This means that the double photodetection signal is different from zero only if the two photodetections are separated by a time interval of the order of a few optical periods. In other words, the two detections are always quasi-simultaneous. Consider now the summation over 0 in the first line of (69). We are going to see that the dependence of the signal involves time scales much longer than those 2185

COMPLEMENT EXX



characterizing the variation with of the signal . To show this, we replace the right-hand side; after going to the continuous limit, we get: d

0

(

0)

0(

)

by 0 on

(70)

which is the Fourier transform of the incident wave packet. This packet arrives at point O at = 0, and for the entire packet to pass that point, it takes a certain time ∆ ; this time interval ∆ is much longer, in general, than the time characterizing the dependence of signal . Relation (70) thus indicates that both detectors yield (almost simultaneously) a signal at any time within a time interval ∆ centered around = ; this time corresponds to the arrivals at 1 and 2 of the photons generated at O by the incident wave packet. But once a photon is detected by one of the detectors, the other photon is detected practically at the same instant by the other detector. Such a temporal correlation could not be predicted by a semiclassical treatment. These results remain valid when the parametric down-conversion process is produced, not by a single incident photon described by a wave packet, but rather by a continuous laser excitation. The two beams generated by the parametric down-conversion process then contain a series of pairs of photons, that are detected at the same instant; they are referred to as twin photons. Such twin beams can excite two-photon transitions in a much more efficient way than ordinary beams. This is because, in the absence of resonant relay states in the twophoton absorption process, an argument similar to that presented above shows that the two absorptions must be separated by a very short time interval (the two photons must interact quasi-simultaneously with the absorbing atom). The two incident photons must impinge on the atom at exactly the same time, which can be the case for twin beams (with ordinary beams, one can only observe two-photon absorptions due to accidental coincidences) In practice, radiation parametric down-conversion is often performed, not on an isolated atom, but rather on atoms or molecules in a solid. It is then imperative to take into account the interference between beams generated in different parts of the solid, and identify the conditions for getting a constructive interference. The refractive optical index of the medium in which the beams propagate then plays an important role, which leads to the so called phase matching condition. This discussion, outside the scope of the present complement, is treated in detail in quantum optics books [65] [66].

2186

Chapter XXI

Quantum entanglement, measurements, Bell’s inequalities A B

C

D

E

F

Introducing entanglement, goals of this chapter . . . . . . . 2188 Entangled states of two spin-1 2 systems . . . . . . . . . . . 2190 B-1 Singlet state, reduced density matrices . . . . . . . . . . . . . 2191 B-2 Correlations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2191 Entanglement between more general systems . . . . . . . . 2193 C-1 Pure entangled states, notation . . . . . . . . . . . . . . . . . 2193 C-2 Presence (or absence) of entanglement: Schmidt decomposition2193 C-3 Characterization of entanglement: Schmidt rank . . . . . . . 2196 Ideal measurement and entangled states . . . . . . . . . . . 2196 D-1 Ideal measurement scheme (von Neumann) . . . . . . . . . . 2196 D-2 Coupling with the environment, decoherence; “pointer states” 2199 D-3 Uniqueness of the measurement result . . . . . . . . . . . . . 2201 “Which path” experiment: can one determine the path followed by the photon in Young’s double slit experiment? 2202 E-1 Entanglement between the photon states and the plate states 2203 E-2 Prediction of measurements performed on the photon . . . . 2204 Entanglement, non-locality, Bell’s theorem . . . . . . . . . . 2204 F-1 The EPR argument . . . . . . . . . . . . . . . . . . . . . . . 2205 F-2 Bohr’s reply, non-separability . . . . . . . . . . . . . . . . . . 2207 F-3 Bell’s inequality . . . . . . . . . . . . . . . . . . . . . . . . . 2208

Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

We discuss in this last chapter an essential concept of quantum mechanics, entanglement. This will highlight a number of aspects of quantum mechanics that have no equivalent in classical physics. A.

Introducing entanglement, goals of this chapter

Consider two physical systems and , each having a state space and ; they can be grouped into a total system + , whose state space is the tensor product . If we assume system is described by a normalized quantum state belonging to , system by a normalized quantum state belonging to its space state , the ket Φ describing the total system is the tensor product: Φ =

(A-1)

In this case, each of the three physical systems , , and + is described by a state vector, which is the most precise possible description in quantum mechanics. The situation is different when the state vector of the global system is no longer a simple product. Let us note , , .., orthonormalized quantum states belonging to the state space of the first system, and , ,..., orthonormalized quantum states belonging to the state space of the second. We can then build products different from (A-1) for the state of the global system, for example: Φ =

(A-2)

Now, we can also, in view of the superposition principle, form any linear combination Ψ of Φ and Φ , which will no longer be a simple product: Ψ =

+

In this relation, the complex coefficients obey the normalization condition: 2

2

+

=1

(A-3) and

can take on any value, as long as they (A-4)

We shall assume, however, that neither of these coefficients is zero so that (A-3) is not reduced to a simple product: =0

(A-5)

A state such as (A-3), which contains a coherent superposition of two (or more) components, each component being a product, is called an “entangled state”. The general property associated with these states is called “quantum entanglement”. It expresses the fact that the quantum state of each subsystem is, in a way, conditioned by the state of the other. In Complement EIII , we introduced the concept of a density operator, which provides a more general description of a physical system than a state vector. The density operator of the total physical system + , whose state vector is known, is simply the projector onto Ψ : +

2188

= Ψ Ψ

(A-6)

A. INTRODUCING ENTANGLEMENT, GOALS OF THIS CHAPTER

whose trace is equal to one: Tr

= Ψ Ψ =1

+

(A-7)

When a physical system can be described by a state vector, it is said to be in a “pure state”. Its density operator obeys the relation: [

2

+

] =

(A-8)

+

and thus: Tr [

+

]

2

=1

(A-9)

Under such conditions, we can choose to describe the total physical system either by its state vector Ψ , or by the density operator + . We are going to show that this is no longer the case for the two subsystems and , for which only the density operator can be used. Imagine, for example, that we are only interested in measurements performed on subsystem . We saw in § 5-b of Complement EIII that, when the total system is entangled as in (A-3), there generally does not exist any state vector belonging to that allows computing the probabilities of measurements performed solely on . Instead, one must necessarily use a density operator obtained by taking a partial trace (taken over the state space of the non-observed system – we recall in § B-1 below how to compute the matrix elements of a partial trace): = Tr

(A-10)

+

Like operator (A-6), this operator is Hermitian, non-negative, and its trace is equal to one; it is however not the projector onto a single state vector. When the system is described by the entangled state (A-3), this density operator is given by: 2

=

+

2

(A-11)

which is actually the sum of two projectors. Consequently, subsystem is in state 2 2 with a probability , and in state with a probability : as opposed to the state of + , the quantum state of is not known with certainty, but only with a certain probability. The density operator can be called a “statistical mixture”, underlying the fact that the results of measurements performed on are predicted by computing averages on the (non-observed) properties of . We then get the inequality: Tr [

]

2

1

(A-12)

where the equality occurs only if one of the two coefficients or is zero; the equivalent of relation (A-9) is, in general, not satisfied by . Inequality (A-12) expresses the fact that, as the quantum state of is known only in a statistical way, the quantum description of is less precise than the description of the total system + . This discussion can be easily generalized to the case where Ψ is the superposition of, not only two components as in (A-3), but of three or more. We find ourselves in a situation that might look rather surprising, as it does not have any equivalent in classical physics. We do know that a perfect classical description 2189

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

of the total system + automatically implies that each of its two subsystems is also perfectly described. This is because a complete description of the state of the total system is simply the collection of all the complete descriptions of its subsystems. As an example, a perfect classical description of the solar system is simply the knowledge of all the positions and velocities of the planets, satellites, and all their constituent particles. In quantum mechanics, things are drastically different: the most precise description of the total system by a state vector (pure state) does not imply, in general, that its subsystems can be described with the same precision. This difference radically changes the usual relation between the parts and the whole of a physical system. Schrödinger, who first introduced in 1935 the words “quantum entanglement” commented on this new concept [67]: “As far as I am concerned, I would not call this property one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought. By the interaction, the two representatives [the quantum states] have become entangled... Another way of expressing the peculiar situation is: the best possible knowledge of a whole does not necessarily include the best possible knowledge of all its parts, even though they may be entirely separate and therefore virtually capable of being ‘best possibly known’ i.e., of possessing, each of them, a representative (state vector) of its own.” In a general way, when the state vector of a global system is not a simple product, and quantum entanglement occurs, the quantum predictions for observations on part of the system can become rather unexpected. This chapter will discuss a number of special physical effects related to quantum entanglement. As a general introduction, § B studies the simple case of two spin-1/2 systems entangled in a singlet state. This example is generalized, in § C, to any physical system, and we present some of the properties of entangled quantum states. The relations between entanglement and quantum measurements is discussed in § D, using in particular the ideal measurement scheme proposed by von Neumann. In § E, we describe an experiment where one tries to observe interference fringes of a particle going through a two-slit plate while determining at the same time which slit the particle went through; if this were possible, one would face a contradiction. However, a partial trace operation on an entangled state of the particle and the plate allows proving the coherence of the quantum formalism and illustrates an aspect of complementarity. Finally, § F discusses the relations between entanglement and quantum non-locality, in the framework of the general Einstein, Podolsky and Rosen argument, and of Bell’s theorem. B.

Entangled states of two spin-1 2 systems

We first discuss a very simple case, which will prove useful for the rest of the chapter: each of the two systems and is a spin 1 2; each state space is then spanned by the two eigenstates of the spin component on the axis. We assume these two states are entangled in a singlet state, such as the one written in relation (B-22) of Chapter X: 1 2 1 = 2

Ψ =

2190

:+ +

: +

:

:+ (B-1)

B. ENTANGLED STATES OF TWO SPIN-1 2 SYSTEMS

(on the second line, we have simplified the notation, assuming that the first index in the ket refers to spin , and the second to spin ). B-1.

Singlet state, reduced density matrices

In the basis of the 4 kets + + , + matrix representing the density operator

(

1 )= 2

+

0 0 0 0

0 1 1 0

0 1 1 0

, +

+, is written:

taken in that order, the

0 0 0 0

(B-2)

2

It is easy to check, performing a matrix product, that ( + ) = ( + ), hence that relation (A-9) is verified: this means the total system is in a pure state. As indicated in § 5-b of Complement EIII , the matrix representing the density operator is obtained by taking a partial trace, i.e. by adding the matrix elements of ( + ) that are diagonal with respect to the quantum numbers of the second spin (this amounts to summing over the states of the non-observed spin): =

+

(B-3)

This leads to: (

)=

1 2

10 01

(B-4)

We then get: (

2

) =

1 4

and thus Tr (

1 0 0 1 2

)

(B-5) = 1 2; this means that spin

is not in a pure state. By symmetry,

the same result would obviously be found for ( ). Note that after taking the partial trace, all the non-diagonal elements (coherences) of (B-2) have completely disappeared. When they are considered as an isolated system, each of the two spins is in a “completely depolarized” state, and measuring its spin component (or any of its spin component, for that matter) will yield the results + or with the same probability 1 2. At the level of each individual spin, the minus sign that characterizes the entanglement of the state vector (B-1) becomes irrelevant; on the other hand, we are going to show that this entanglement yields very strong correlations between the results of measurements pertaining to both spins. B-2.

Correlations

Imagine now that we perform simultaneously measurements on both spins, the first one along a direction in the plane , making an angle with the axis, the second along a direction in that same plane, making an angle with . The results 2191

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

we are about to obtain will be important for the discussion of Bell’s theorem in § F-3-a. Relations (A-22) of Chapter IV (taking the angle equal to zero and = ) yield the expressions for the eigenvectors of the measurements in the state spaces and : +

= cos =

2

sin

+ + sin

2

+ + cos

2

(B-6)

2

In the space of the two-spin states, the ket corresponding to a double + result of the measurement is written: +

+

= cos

2

+ sin

2

cos cos

2 2

+ + + cos + + sin

sin

2 2

sin

2

+ (B-7)

2

This means that, when the system is in the singlet state (B-1), the probability amplitude for obtaining this double result is: +

1 cos sin 2 2 2 1 = sin 2 2

+

Ψ =

sin

2

cos

2 (B-8)

The probability of the result (+ +) when measuring the components of both spins along the and directions is therefore: ++

(

)=

1 sin2 2

(B-9)

2

One can redo the same calculations for the three other possible pairs of results, (+ ), ( +) and ( ). This does not present any difficulty but it is easier to note that changing into + exchanges the two eigenkets of (B-6), and hence the results + and for the first spin ; the same operation can be done with the second spin . We then make these changes in (B-9) and get the probabilities for the 4 possible results in the form: ++

(

)=

+

(

)=

( +

(

1 sin2 2 1 ) = cos2 2 )=

2 2

(B-10)

When both spins are measured, strong correlations between the results appear1 . These correlations are the direct consequence of the entanglement present in the singlet state vector (B-1). 1 The

probabilities cannot, in general, be factored. As an example, relation (B-10) shows that is different from + . This means that the ratio of the probabilities of obtaining for the first spin the result + or the result depends on the state of the second spin, clearly showing correlations. ++

2192

+

C. ENTANGLEMENT BETWEEN MORE GENERAL SYSTEMS

C.

Entanglement between more general systems

The concept of entanglement is obviously not limited to the singlet state of two spin-1 2 particles. We now study how to characterize the presence of entanglement when the total system is in a pure state. C-1.

Pure entangled states, notation

We consider two quantum systems and belonging, respectively, to state spaces (with dimension ) and (with dimension ). The normalized state vector Ψ describing the total system + belongs to the tensor product space , with dimension . Some of the states Ψ can be written as a simple product: Ψ =

(C-1)

where and are any normalized kets of and , respectively. In such a case, the two physical subsystems and are not entangled; each of them, as well as the total state, can be described by a state vector (pure state). On the other hand, the majority of the states Ψ cannot be factored this way, and must necessarily be written as a sum of products (the singlet state studied above is such an example); the two subsystems and are then entangled. It is not always obvious to guess from the expression of any given state vector Ψ if it can actually be written as a simple tensor product. This ket has, in general, components, and is expressed as: Ψ =

(C-2) =1 =1

where the and

as well as the are orthonormalized kets. Now if we expand the kets , appearing in the tensor product (C-1), onto the kets and as

=

and =1

=

(C-3) =1

we obtain for a ket Ψ of the type (C-1): Ψ =

(C-4) =1 =1

It is not obvious at all, just from the knowledge of the coefficients of Ψ , to know if they can be factored into an expression of this type, leading to a product as in (C-1). We present in the next section a systematic method for asserting if this factorization is possible and actually performing it. C-2.

Presence (or absence) of entanglement: Schmidt decomposition

It can be shown (see the demonstration below) that any pure state Ψ describing the ensemble of the two physical systems and can be written in the form: Ψ =

(C-5) 2193

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

where the are a set of orthonormal vectors in the state space of the first system, and the another set of orthonormal vectors in the second state space. This expression is the Schmidt decomposition of a pure state, also called the “biorthonormal decomposition”. Whether the state of the total system Ψ is entangled or not, it always has a corresponding density operator, written in (A-6). By taking partial traces, each of the subsystems can be described by the density operators: = Tr

;

+

= Tr

(C-6)

+

Performing these partial traces using (C-5), we get two symmetric expressions: =

(C-7)

=

(C-8)

and:

This means that, when the total system is in a pure state, the two partial density operators always have the same eigenvalues2 . In the particular case where they are all zero except one, each of the two subsystems is in a pure state and state Ψ can be factored: no entanglement exists in the total system. Most of the time, however, several eigenval2 ues are non-zero, in which case ( ) is obviously not equal to , and the same is true for ; entanglement is then present in the pure state Ψ . Demonstration of relation (C-5): As the two operators and are Hermitian, non-negative and have a trace equal to unity, their corresponding matrices can be diagonalized to yield real eigenvalues included between 0 and 1. We call the normalized eigenvectors of (the index takes on different values, where is the dimension of subsystem ) and the corresponding eigenvalues, all positive or zero (but not necessarily different. Similarly, the eigenvectors of are noted (where takes on different values, being the dimension of the second subsystem ), and the corresponding eigenvalues. The two partial density operators can then be expanded as: =

and =1

with 0

=

(C-9) =1

,

1.

State Ψ can then be expanded on the basis of the tensor products : that we shall simply note assuming the first ket represents a state of second a state of ; we call the components of Ψ in this basis and get: Ψ =

: , and the

(C-10) =1

=1

2 Note that this is not necessarily the case if the total system is described by a statistical mixture rather than a pure state. As an example, we can assume that equals a tensor product , + where and can be chosen arbitrarily, and hence have different eigenvalues.

2194

C. ENTANGLEMENT BETWEEN MORE GENERAL SYSTEMS

We now introduce the ket normalized), as:

, belonging to the state space

(this ket is not necessarily

=

(C-11) =1

Expansion (C-10) for Ψ now simply becomes:

Ψ =

(C-12) =1

We also know that the matrix elements of the partial trace =

are:

Ψ Ψ

(C-13)

Now expression (C-12) for Ψ leads to: Ψ Ψ =

(C-14)

Inserting this result into (C-13), we are only left with terms for which which yields: =

=

= and

= ,

(C-15)

This means that: =

=

(C-16)

Now in the basis we have used, we know that is diagonal and given by expression (C-9); the comparison with (C-16) shows that we must necessarily have: =

(C-17)

For non-zero eigenvalues , this relation shows that one can define a set of orthonormal vectors belonging to the state space of system as: =

1

For all the values of the index relation shows that the kets

(C-18) associated with eigenvalues are zero.

equal to zero, that same

Replacing in (C-12) the by , we complete the demonstration of equality (C-5), and of relations (C-7) and (C-8) which follow directly.

2195

CHAPTER XXI

C-3.

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

Characterization of entanglement: Schmidt rank

The number of non-zero eigenvalues, i.e. the number of non-zero terms in (C-5), is called the “Schmidt rank” of Ψ and noted . When = 1, the state of the total system is not entangled, the two subsystems each being in a pure state. When = 2, we are in the case studied in § B where the two subsystems are entangled, when = 3 we get a more complex entanglement, etc. This gives us a criterion for deciding whether a general ket (C-2) is entangled or not. We just have to compute a partial density operator for one of the subsystems and the number of its non-zero eigenvalues (for the sake of simplicity, we shall obviously choose the subsystem whose state space’s dimension is the smallest). If that number equals 1, the eigenvector associated with that non-zero eigenvalue becomes one of the factors of the decomposition, which makes it easy to find the other. If that number is greater than 1, however, the decomposition into a single tensor product is no longer possible. In a way, entanglement is symmetrically shared between and . For example, it is not possible for one of the two subsystems to be in a pure state and the other in a statistical mixture. The rank must be lower than the smallest of the dimensions and of the state spaces of and ; to allow a high rank entanglement, the two subsystems must thus have state spaces with high enough dimensions. Comment: When all the eigenvalues of (and of ) are distinct, the Schmidt decomposition is unique. This is because decompositions (C-9) and (C-8) of on the projectors onto its eigenvectors must then necessarily coincide; the set of eigenvectors is identical to that of the . In this case, the eigenvectors of the partial density operators directly yield the unique Schmidt decomposition. On the other hand, when certain eigenvalues are degenerate, this is no longer true. As an example, we saw that for the singlet state (B1), the two partial density matrices have two eigenvalues, both equal to 1 2; this singlet state, decomposed in (B-1) into products of eigenvectors of the spin components, can equally well be decomposed into products of eigenvectors of the spin components on any spatial direction. There are an infinity of possible Schmidt decompositions for that state.

D.

Ideal measurement and entangled states

Entanglement also plays an essential role in any quantum measurement process, as it generally appears while the measured system and the measuring apparatus interact. Furthermore, we shall see that it even propagates further and brings the environment of the measuring apparatus into play. D-1.

Ideal measurement scheme (von Neumann)

Von Neumann’s quantum measurement scheme proposes a general framework that allows characterizing the quantum measurement process in terms of entanglement appearing (or disappearing) in the state vector describing the total system + . The two systems and are initially described by a factored state Ψ0 ; however, as they 2196

D. IDEAL MEASUREMENT AND ENTANGLED STATES

interact during a certain time, they reach an entangled state Ψ . After the measurement, we assume they no longer interact, imagining for example they have moved far away from each other. In the state space of , with dimension , the physical quantity measured on is described by an operator whose normalized eigenvectors are the kets with eigenvalues (that we shall assume non-degenerate, to simplify the notation): = Initially, state

0

(D-1) 0

of

is any linear combination of the

=

: (D-2)

=1

with complex coefficients , having only the constraint that the sum of their squared moduli be equal to 1 (normalization condition). As for the measuring apparatus , we assume it is, initially, always in the same normalized quantum state Φ0 . The initial state of the total system is then: Ψ0 = D-1-a.

0

Φ0

(D-3)

Basic process

We start with the particular case where the measurement result is certain and where the system is initially in one of the eigenstates associated with the measurement: . In that case: 0 = Ψ0 =

Φ0

(D-4)

Once the measurement is done, stays in the same state , but the measuring apparatus reaches a state Φ different from Φ0 and which depends on : this is a necessary condition for the result to be experimentally accessible. This is because the position of the “pointer” used for the reading of the result (a needle in a macroscopic apparatus, the recording of the result in a memory, etc.) must necessarily depend on to allow for the acquisition of the data. It is also natural to assume that the different states Φ are orthogonal to each other, since the pointer necessarily involves a large number of atoms whose different states will allow a macroscopic observer to read the result. The measurement process for the total system can be summed up, in this simple case, as follows: Ψ0 =

Φ0

=

Ψ =

Φ

(D-5)

where Φ is a normalized state of . At this stage, no correlation or entanglement has appeared between the measuring apparatus and the measured system. This is what happens in the simple case where the measurement result is certain. In the general case, the initial state of system is a superposition (D-2) of eigenstates . In this case, state (D-4) must be replaced by the linear combination, with the same coefficients: Ψ0 =

Φ0

(D-6) 2197

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

As Schrödinger’s equation is linear, we get: Ψ0 =

Ψ =

Φ

(D-7)

which now becomes a state in which the measuring apparatus is entangled with the measured system . The states of and are therefore strongly correlated: when the “pointer’s position” is associated with one of the state vectors Φ , the state of the system must be described by the ket associated with a definite eigenvalue of . After the measurement, one can no longer attribute a state vector (pure state) to the system , which can only be described by a partial density operator. As the states Φ are normalized and orthogonal to each other, this density operator is given by: = Tr

Ψ

Ψ

2

=

(D-8)

This relation was to be expected: it simply states that the system has a probability 2 of being in the state associated with the measurement result , which is in agreement with the Born rule for the usual probabilities. This useful formula sums up in a simple way a number of characteristics of the quantum measurement postulate. Note, however, that at this stage, all the possible results are still present in the partial trace, as they are considered possible, even after the measurement. Nothing at this point tells us that only one result is actually measured when the experiment is performed, nor that the squares of the coefficients can be interpreted as classical probabilities associated with mutually exclusive observations. The evolution predicted by Schrödinger’s equation cannot, by itself alone, explain the uniqueness of the results observed at the macroscopic level. This is why von Neumann introduced the postulate of the state vector’s reduction (also called the “wave packet reduction” or “wave packet collapse”, cf. Chapter III, § B3-c); more detail on this point will be given in § D-3. D-1-b.

Dynamics of the entanglement process

A simple interaction Hamiltonian between systems and can explain the appearance of entanglement between these systems, and lead to relations (D-5) or (D-7). As an example, imagine this interaction Hamiltonian can be written as: =

(D-9)

where is the operator (acting only on ) already introduced above, an operator acting only on , and a coupling constant. We shall also assume that, in the state space of , there exists a Hermitian conjugate operator of the operator : [

]= }

(D-10)

This commutation relation means that generates the translation operators with re∆ } spect to . In other words, the action of on any eigenvector of : =

(D-11)

leads to a translation by ∆ of the eigenvalue ∆

2198

}

=

+∆

: (D-12)

D. IDEAL MEASUREMENT AND ENTANGLED STATES

where ∆ is any real number – see relation (13) of Complement EII . We assume that Φ0 (state of the measuring apparatus before the measurement) is a normalized eigenstate of with eigenvalue 0 , and we ignore3 any other source for the evolution of the total system other than the interaction between and . The evolution operator between the time = 0 before the measurement, and the time = when the interaction is over, is: }

( 0) =

(D-13)

Its action on the ket (D-4) yields: ( 0)

Φ0 (

0)

=

Φ0 (

0

+

)

(D-14)

where the variables in parentheses in the states of the measuring apparatus4 refer to the eigenvalues of . This means that the states Φ introduced in (D-5) are the kets: Φ

= Φ(

0

+

)

(D-15)

These relations show that, as far as the measuring apparatus is concerned, the eigenvalue of has been shifted by a quantity that depends on the eigenvalue of . The observable therefore plays the role of a “pointer’s position” in the measuring apparatus (measuring needle), which yields the measurement result once the two systems have interacted. As for the observable , it pertains to the system being measured by the pointer’s position. If now the initial state is in a coherent superposition as in (D-6), the state after the interaction (D-7) is written: Ψ =

Φ(

0

+

)

(D-16)

which is a biorthonormal decomposition such as the one obtained in § C-2. If, initially, the system is not in an eigenstate of , the interaction with the measuring apparatus changes its state into a statistical mixture (D-8). On the other hand, if the system is initially in an eigenstate of , it will stay in the same eigenstate after the measurement: the measurement process does not change its state. The measurement is then said to be a “quantum non-demolition” measurement, or QND measurement. D-2.

Coupling with the environment, decoherence; “pointer states”

We now examine under which conditions the interaction and entanglement process we have considered constitutes a good measurement. A first obvious condition to be satisfied is for the states Φ of the measuring apparatus to store the information about 3 To avoid this hypothesis, the computation could be performed within the interaction point of view (exercise 15 in Complement LIII ) with respect to free evolutions of both and ; this would somewhat complicate the results. However, as we focus here on the dynamics induced by their mutual interaction, we shall keep the computations simple and assume that these free evolutions have a negligible effect during the duration of the interaction. 4 Needless to say, a measuring apparatus is macroscopic and has many other degrees of freedom apart from the pointer’s position. For the sake of simplicity, these other degrees of freedom have not been introduced in the notation.

2199

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

the measurement result in a robust way, and prevent it from being destroyed as continues to evolve on its own. This condition will be fulfilled if is a constant of the motion of or, in other words, if commutes with the Hamiltonian of . In addition, the measuring apparatus cannot remain isolated from its environment, even at a microscopic level. In view of its function, the apparatus must be able to interact and be correlated with a measurement recording device, and even with the observer collecting the measurement result; it is, by definition, an “open” apparatus, which can interact with the outside world. In any case, for this apparatus to be completely isolated, would require that none of its atoms, electrons, etc., be in interaction and correlated with any of the environment particles, which is obviously impossible to achieve with a macroscopic apparatus. This means that, as far as the coupling between and the environment is concerned, an entanglement phenomenon occurs, reminiscent of the one discussed for and . We must determine which basis, in the state space of , will lead for the entangled state + to a biorthonormal decomposition similar to (D-16). The computation of the partial density operator of is similar to the computation that yielded (D-8) for the entanglement between and : it is in the basis of this biorthonormal decomposition that the partial density operator of (which plays the role of ) remains diagonal; with another basis, the density matrix will have in general non-diagonal elements. As, furthermore, the entanglement continues to propagate further and further into the environment, it is necessary that the relevant basis of remains constant in time. It is thus important to find this privileged basis. Depending on the circumstances, the coupling between a measuring apparatus and its environment can take on various forms, in general complex due to the large number of degrees of freedom involved; several time constants come into play. Different models have been proposed to account for this coupling and the dynamics it produces. Without going into any details we shall make a few general remarks. The measurement process involves a whole chain of amplification between and the macroscopic pointer, which can be composed of mesoscopic or macroscopic objects sensitive to the environment. Entanglement propagates along that chain via local interactions: the interaction potentials are diagonal in the position representation, and have a microscopic range. Consequently, they cannot couple quantum states corresponding to macroscopically different positions of the objects concerned; the branches of the state vector corresponding to different spatial positions propagate independently. This means that the coupling with the environment tends to favor the basis of states where the positions of the different elements of the measuring apparatus, including in particular its pointer, occupy well defined positions in space. The corresponding preferred basis in the state space of the measuring apparatus , in which its density matrix remains diagonal over time, is called the basis of the “pointer states”. In this basis, and only in this one, defined by pointer localization criteria, the entanglement with is prone to destroy the coherences (non-diagonal elements of the density matrix), without changing the diagonal elements (meaning the positions of the pointer’s particles). 
To sum up, several conditions are necessary for a device to be considered as an acceptable measuring apparatus for a physical quantity of . In the first place, the coupling between and must be capable of transferring the right information from one to the other. The transferred information must then be conserved over time, while continues its own evolution, and is coupled with the environment . Obviously, these are 2200

D. IDEAL MEASUREMENT AND ENTANGLED STATES

necessary conditions. In practice, an effective measuring apparatus must be conceived taking into account many other imperatives, such as high sensitivity, or strong protection against unavoidable external perturbation. D-3.

Uniqueness of the measurement result

As mentioned above, nothing in the dynamics associated with Schrödinger’s equation can explain the uniqueness of the results observed at the macroscopic level. This is not surprising as (D-8) is a direct consequence of Schrödinger’s equation, which is incapable of stopping on its own the endless propagation of the “von Neumann chain”, as we shall now discuss. D-3-a.

The infinite von Neumann chain

Let us go back to the ideal measurement scheme of § D-1. After the measurement, the state of + is the entangled state (D-7), a superposition of components associated with all the possible measurement results. One may wonder if, using a second measuring apparatus 2 to observe , one might be able to resolve this superposition and obtain a unique result. In fact, the same entanglement process that occurred between and will occur again, leading to a final state: Ψ

=

Φ

Ξ

(D-17)

where the kets Ξ represent the states of the second measuring apparatus 2 , orthogonal to each other for different values of . Adding a third measuring apparatus 3 will, obviously, only continue further the entanglement’s progression, each additional apparatus playing the role of an environment for the previous one. This chain of measuring apparatus may continue all the way to infinity without permitting at any stage the resolution of the superposition, and the demonstration of the uniqueness of the measurement result. This is called the von Neumann chain (and the logical problem it poses is called the “von Neumann’s infinite regress”). The well-known “Schrödinger’s cat paradox” involves a similar situation. The system is supposed to be a radioactive nucleus in a superposition of two states, 1 where the nucleus is still in the excited state, and 2 where it has disintegrated, emitting a particle. The kets Φ , Ξ , etc. represent the states of the measuring apparatus that can detect this particle, and then trigger a mechanical system killing the cat in the case of positive detection. The last of these kets characterizes the cat, which can therefore be in state 1 where it is still alive, or in state 2 where it is dead. Schrödinger points out the absurdity of a physical description involving a cat that can be at the same time both in an alive and a dead state. As we just discussed, the uniqueness of the measurement results cannot be proven with Schrödinger’s equation; this equation merely predicts that the pointer of a measuring apparatus, and any other macroscopic object, can become superpositions of states located at points very far away in space. Because of the linearity of Schrödinger’s equation, nothing prevents the different components of the state vector from propagating further and further away, without this infinite chain of entanglements ever reducing into a single one of its components. It is precisely to solve this problem that von Neumann introduced a specific postulate: the postulate of the reduction of the state vector (Chapter III (§ B3-c) which “forces” the uniqueness of the measurement result. 2201

CHAPTER XXI

D-3-b.

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

Postulate of reduction of the state vector

The postulate of reduction (or collapse) of the state vector is also called the “projection postulate”, or the “postulate of collapse of the wave packet”. As we saw in Chapter III (§ B-3-c, this postulate states that, once the measurement has been performed, one must suppress the summations appearing in (D-7), (D-8), and (D-17): one only keeps, among all the terms, the component = corresponding to the measurement result actually observed. After the measurement, the state vector becomes again a simple product from which entanglement has disappeared; is once again in a pure state. This means that the entanglement, initially created by the measurement, disappears once the result has been recorded. This postulate, as efficient as it may be, is somewhat difficult to interpret. Using this postulate amounts to considering that the state vector can evolve under the influence of two different processes: a “normal” continuous evolution, obeying Schrödinger’s differential equation, and a sudden discontinuous evolution upon measurement, governed by the von Neumann reduction postulate. Obviously, this duality immediately introduces the question of the limit between these two evolutions: from which time on, exactly, should we consider that the measurement has been performed? In other words, how far does the coherent superposition (D-17) propagate? Which physical processes constitute a measurement, as opposed to those leading to a continuous Schrödinger evolution? These difficult questions were the motivation for introducing other interpretations of quantum mechanics. As an example, there are “non-standard” interpretations where Schrödinger’s equation is modified by the adjunction of a small stochastic term. This term is chosen so as to be totally negligible at the microscopic level, while coming into play at a certain macroscopic level; its role is to suppress all the macroscopically different components of the state vector, except one. Both Schrödinger and von Neumann dynamics are then unified into a single equation for the evolution of the state vector. Many other interpretations have been proposed: additional variables, modal interpretation, Everett, all suggesting different solutions for the problem. The interested reader may consult reference [68]. E.

“Which path” experiment: can one determine the path followed by the photon in Young’s double slit experiment?

Let us now return to a question already discussed in Complement DI . In Young’s double slit experiment where the photon may follow two different paths to reach the detection screen, is it possible to observe interference fringes between these paths and simultaneously obtain information as to which path the photon followed? Figure 1 of Complement DI , reproduced here in the above Figure 1, shows an interference experimental set-up using a plate pierced with two slits 1 and 2 ; this plate is mobile in a direction perpendicular to the incident photon. As it receives momentum transfers ∆ 1 and ∆ 2 that will be different, depending on whether the photon goes through 1 or 2 , one could naively imagine observing interference while knowing through which slit the particle went through. However, using the momentum-position uncertainty relations applied to this mobile plate, we showed that the interference fringes were blurred as soon as the momentum transfers ∆ 1 and ∆ 2 were sufficiently different to provide this information. The reason is that if we want to be able to distinguish these two momentum transfers, the momentum uncertainty of the mobile plate must be less than the modulus of ∆ 1 ∆ 2 . A simple calculation then shows that when this condition is met, the uncertainty in the 2202

E. “WHICH PATH” EXPERIMENT: CAN ONE DETERMINE THE PATH FOLLOWED BY THE PHOTON IN YOUNG’S DOUBLE SLIT EXPERIMENT?

Figure 1: Young’s double slit experiment using a plate , mobile along the axis, and pierced with two slits 1 and 2 . A photon, emitted by a source assumed to be far away at infinity, reaches the detection screen at point . The component of the momentum transferred by the photon to the plate depends on whether it goes through slit 1 or slit 2.

position of the plate must necessarily be larger than the fringe spacing, which blurs out the fringes. It is impossible to know which of the slits the photon went through without destroying at the same time the interference pattern. We shall take the analysis a step further and consider the entanglement between the plate and the paths followed by the photon. This should allow us to envisage intermediary situations where partial information on the particle’s path can be obtained. E-1.

Entanglement between the photon states and the plate states

Consider the path 1 followed by the photon if it goes through 1 and arrives at on the detection screen (Figure 1). We call 1 the photon state when it follows that path and transfers a momentum ∆ 1 to the mobile plate as it goes through 1 . In that case, after the photon’s transit, the state of the plate is: 1

= exp( ∆

1

~)

(E-1)

0

where 0 is the initial state of the plate and exp( ∆ 1 ) the momentum space translation operator, by a quantity ∆ 1 . The state of the global system photon + plate, along the path 1 , is therefore 1 1 . A similar reasoning would yield the result along the path . As Schrödinger’s equation is linear, the state of the 2 2 2 global system after the photon has crossed the plate is: Ψ =

1

1

+

2

2

(E-2) 2203

CHAPTER XXI

QUANTUM ENTANGLEMENT, MEASUREMENTS, BELL’S INEQUALITIES

which clearly shows the entanglement between the photon5 and the plate. E-2.

Prediction of measurements performed on the photon

Measurements performed only on the photon after its crossing the plate can be predicted from the reduced density operator , which is the partial trace over the plate variables of the global system density operator Ψ Ψ . The matrix elements of are obtained via the standard calculation of a partial trace (Complement EIII , § 5-b) and lead to the operator expression: =

1

1

+

2

2

+

1

2

2

1

+

2

1

1

2

(E-3)

(in this equation we did not write the factors multiplying |φ₁⟩⟨φ₁| and |φ₂⟩⟨φ₂|, which are the scalar products ⟨χ₁|χ₁⟩ and ⟨χ₂|χ₂⟩, both equal to 1 if the state |χ₀⟩ is normalized). The interference between the two paths is described by the terms in |φ₁⟩⟨φ₂| and |φ₂⟩⟨φ₁|, which are multiplied by the scalar products ⟨χ₂|χ₁⟩ and ⟨χ₁|χ₂⟩. Two extreme cases then appear. If the two states |χ₁⟩ and |χ₂⟩ are very close to each other, the two scalar products are practically equal to 1 and the interference terms in (E-3) are barely modified by the presence of the factors ⟨χ₂|χ₁⟩ and ⟨χ₁|χ₂⟩: the interference is then clearly visible on the detection screen. In that case, however, the states |χ₁⟩ and |χ₂⟩ are too close to give any information as to whether the photon went through slit 1 or slit 2. In the other extreme case, where |χ₁⟩ and |χ₂⟩ are very different from each other, their scalar product is practically zero: the interference terms disappear from (E-3), but one can, in principle, determine which of the two slits the photon went through. The present calculation also allows studying intermediate situations, where the scalar products ⟨χ₂|χ₁⟩ and ⟨χ₁|χ₂⟩ take on values between 0 and 1; they describe how the contrast of the fringes diminishes as these scalar products decrease continuously from 1 to 0. These scalar products can easily be computed from (E-1) and the equivalent relation for |χ₂⟩. This leads to:

⟨χ₂|χ₁⟩ = ⟨χ₀| exp[i(Δp₁ − Δp₂)X/ħ] |χ₀⟩     (E-4)

Using the expression of Δp₁ − Δp₂ (noted p₁ − p₂ in Complement DI) and equations (6) and (7) of that complement, one can show that ⟨χ₂|χ₁⟩ and ⟨χ₁|χ₂⟩ are equal to the overlap integrals between the plate’s initial wave function and that same wave function translated in momentum space by the amount h/i, where i is the fringe spacing.
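As an illustration of this last point, here is a minimal numerical sketch (not taken from the text, and assuming for concreteness a Gaussian initial state of the plate, of momentum-space width sigma_p): the fringe contrast |⟨χ₂|χ₁⟩| is the overlap of the plate’s momentum wave function with its copy translated by δ = Δp₁ − Δp₂, and it drops from 1 to 0 as δ becomes large compared to sigma_p.

import numpy as np

# Sketch: fringe contrast |<chi2|chi1>| for a plate whose initial state is assumed
# Gaussian in the momentum representation, of width sigma_p. The contrast is the
# overlap of chi0(p) with its copy translated by delta = Delta_p1 - Delta_p2,
# i.e. equation (E-4) written in the momentum representation.
def contrast(delta, sigma_p=1.0, n=4001, span=12.0):
    p = np.linspace(-span * sigma_p, span * sigma_p, n)
    chi0 = (2 * np.pi * sigma_p**2) ** -0.25 * np.exp(-p**2 / (4 * sigma_p**2))
    chi_shift = (2 * np.pi * sigma_p**2) ** -0.25 * np.exp(-(p - delta)**2 / (4 * sigma_p**2))
    return abs(np.trapz(chi0 * chi_shift, p))

for ratio in [0.1, 0.5, 1.0, 2.0, 5.0]:
    print(f"delta = {ratio} sigma_p  ->  contrast = {contrast(ratio):.3f}")
# The contrast goes from about 1 (paths indistinguishable, full fringes)
# to about 0 (which-path information available, fringes washed out).

For a Gaussian the overlap is exp(−δ²/(8 sigma_p²)), so the fringes disappear precisely when the two momentum kicks become distinguishable.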

F. Entanglement, non-locality, Bell’s theorem

We now present two important theorems, the EPR (for Einstein, Podolsky and Rosen) theorem and Bell’s theorem. They are related, the second being a logical continuation of the first. The EPR theorem was presented in an article published in 1935 by these three authors [69], and is one of the episodes of the famous discussion between Einstein and Bohr concerning the foundations of quantum mechanics (in particular during the Solvay conferences). Einstein’s position was that the entire physical world had to be expressed in the general framework of relativity, where the concept of space-time events is fundamental. Bohr had a different point of view, and considered that quantum theory demanded abandoning a description of microscopic events in space-time terms, while of course conforming to the actual predictions of relativity.

⁵ All the conclusions of this section remain valid for Young’s interference type experiments performed with a massive particle instead of a photon.

F-1. The EPR argument

The EPR theorem can be stated as follows: “If all the predictions of quantum mechanics are correct (even for systems made of several remote particles) and if physical reality can be described in a local (or separable) way, then quantum mechanics is necessarily incomplete: some ‘elements of reality’ exist in Nature that are ignored by this theory”. To demonstrate their theorem, Einstein, Podolsky and Rosen imagined an experiment in which two physical systems, originating for example from a common source S and described by an entangled quantum state, are then measured in remote regions of space. Historically, EPR developed their argument for correlated particles whose positions and momenta are measured. It is, however, simpler to present an equivalent version of the argument concerning spins and discrete results, a version initially proposed by Bohm (and often called, for that reason, EPRB).

F-1-a. Exposing the argument

Imagine that two spin-1/2 particles are emitted by a source S in a singlet state (B-1), which is an entangled state where the spins are strongly correlated. The particles then move towards two remote regions of space, without their spins interacting with the outside world; the initial spin entanglement remains unchanged. In these remote regions of space, the particle spin components are measured along a direction defined by the angle a for the region on the left, and by the angle b for the region on the right (Fig. 2). One often calls Alice and Bob the two observers who perform the measurements in the two laboratories, which can be very far away from each other. Alice chooses the direction a freely, which defines her “measurement type”. With a spin 1/2, she can only obtain two results, which we will note +1 or −1, whatever measurement type was chosen. In a similar way, Bob chooses the direction b arbitrarily and obtains one of the two results +1 or −1. In the EPRB thought experiment, one assumes for simplicity that the two spins, once they have been emitted by the source, only interact with the measuring apparatus (without any free evolution, as was the case above). Standard quantum mechanics then predicts (§ B) that the distances and instants at which the measurements are performed do not play any role in the probabilities of obtaining the different possible double results. To keep things simple, let us assume Alice and Bob limit their choices to a finite number of directions a and b for their respective measurements. It may then happen, by chance, that their chosen directions are parallel. Now if the angles a and b are chosen to be equal (parallel measurement directions), relations (B-10) indicate that the results of the two measurements are always opposite: each time Alice observes +1, Bob observes the opposite value −1, and vice versa. This remains valid even if the measurements are performed at points greatly separated in space, whatever the choice a = b, and even if the two observers operate in totally independent ways, in their own regions of space; for example, they could make their choice at the last moment, even after the emission and


Figure 2: In an EPRB experiment, a source S emits pairs of particles in a singlet spin state (an entangled quantum state). These particles then propagate along a common axis towards two remote regions of space, where Stern-Gerlach apparatuses are used by the observers Alice and Bob to measure the components of their spins along directions perpendicular to that axis. For the first particle, the measurement direction is defined by the angle a, and for the second, by the angle b. Each measurement yields a result +1 or −1, and one looks for correlations between these results when the experiment is repeated a great number of times.

propagation of the pair of particles. Let us assume Alice performs her measurement along a direction a before Bob starts his own measurement. Once Alice has finished her measurement, it becomes certain that, should Bob decide to choose a direction b parallel to a, he will observe the opposite result; the result is certain in that particular case. Such a certainty can only come from the fact that the particle measured by Bob possesses a physical property that determines this certain result; this property (called an “element of reality” by EPR) will influence the way that particle interacts with its measuring apparatus and will determine the result. On the other hand, the particle propagating towards Bob cannot be influenced by events occurring in Alice’s laboratory. This means that the physical property we are discussing existed before the measurement performed by Alice. The reasoning is obviously symmetrical, and establishes that, before any measurement, the particles already possessed physical properties that determined the outcomes of the future measurements. As the direction chosen by Alice was arbitrary, the particles must possess enough properties to determine the results for any analysis directions chosen by the observers. Now quantum mechanics does not predict the existence of such properties, as it only gives a description of the particles via the singlet state vector, which always predicts a totally random result for the first measurement. Furthermore, there exists no quantum state for which all the spin components along arbitrary directions are simultaneously determined (the corresponding operators do not commute with each other). This means that quantum mechanics accounts only partially for the physical properties of the system; it is therefore incomplete.

F-1-b. Assumptions and conclusions

Let us discuss in more detail the logical structure of the EPR argument.
(i) It starts by assuming that the predictions of quantum mechanics for the probabilities of measurement results are correct. The argument thus assumes that the perfect correlations predicted by this theory are always observed, whatever the distance between the two measuring apparatuses.
(ii) Another essential ingredient of the EPR argument is the concept of “elements of reality”, defined with the following criterion [69]: “if, without in any way disturbing a system, we can predict with certainty the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity”. In other words, a certainty cannot be built on nothing: an experimental result known beforehand can only be the consequence of a preexisting physical quantity.
(iii) Last, but not least, the EPR argument brings in the notions of space-time and locality: the elements of reality they discuss are attached to the regions of space where the experiments take place, and they cannot suddenly vary (and certainly not appear) under the influence of events occurring in a very distant region of space. Einstein wrote in 1948 [70]: “Physical objects are thought of as arranged in a space-time continuum. An essential aspect of this arrangement of things in physics is that they lay claim, at a certain time, to an existence independent of one another, provided these objects are situated in different parts of space”.
To sum up, one can say that the basic conviction of EPR is that regions of space contain their own elements of reality (attributing distinct elements of reality to separated regions of space is sometimes called “separability”), and that their time evolution is local – one often refers to “local realism” in the literature to qualify the ensemble of the EPR hypotheses. Basing their argument on these hypotheses, EPR show that, for any chosen values of a and b, the measurement results are functions:
(i) of the individual properties of the spins that the particles carry with them (the EPR elements of reality);
(ii) of the orientations a, b of the Stern-Gerlach analyzers (which is obvious).
It follows that the results are given by well defined functions of these variables, meaning that no non-deterministic process occurs: a particle with its spin carries along all the information necessary to yield the result of any future measurement, whatever the choice of the orientation a (for the first particle) or b (for the second). This implies that all the components of each spin simultaneously have well determined values.

F-2. Bohr’s reply, non-separability

Bohr rapidly replied [71] to the EPR article. In Bohr’s view, the only physical system to be considered is the entire experimental setup, including the measured quantum system and all the measuring apparatuses, which are treated classically. It is thus meaningless to try to single out, within this ensemble, subsystems having individual physical properties. The physical system Bohr considers is a whole that one should not attempt to separate into parts; this is often called the “non-separability” rule. In other words, Bohr considers that spatial separation does not imply separability. It is not the EPR reasoning itself that Bohr criticizes; rather, he considers that their starting assumptions are not relevant in the framework of quantum physics. From Bohr’s point of view, the EPR criterion for elements of reality “contains an essential ambiguity when applied to quantum phenomena”. Along the same line, more than ten years later (in 1948), Bohr made his point of view explicit [72]: “Recapitulating, the impossibility of subdividing the individual quantum effects and of separating a behavior of the objects from their interactions with the measuring instrument serving to define the conditions under which the phenomena appear implies an ambiguity in assigning conventional attributes to atomic objects, which calls for a reconsideration of our attitude towards the problem of physical explanation”. It is thus the very need for a physical explanation involving such a subdivision that Bohr questions. Bohr refutes Einstein’s basic idea, namely that one can attribute distinct physical properties to two objects located in very remote space-time regions; he believes that “quantum non-separability” applies even in such a situation. It is understandable that Einstein was unwilling to abandon concepts that are the pillars of special and general relativity (gravitation).

F-3. Bell’s inequality

In 1964, almost thirty years after the publication of the EPR argument, an article by Bell shed an entirely new light on the question [73]. This article, in a way, took up the EPR argument at the point where its authors had left it. Taking at face value the existence of the EPR elements of reality, and using the same local realism considerations, Bell showed that there is actually no way to complete quantum mechanics in this manner without changing its predictions, at least in some cases. This means that one must either accept that certain predictions of quantum mechanics are sometimes incorrect, or abandon certain EPR hypotheses, however natural they may seem.

F-3-a. Bell’s theorem

Following Bell’s idea, let us assume that λ represents the “elements of reality” associated with the spins; λ is, actually, just a concise notation that could stand for a vector with any number of components, so that the number of elements of reality contained in λ is totally arbitrary. One can even include in λ components that play no particular role in the problem; the only important hypothesis is that λ must contain enough information to determine the results of all the possible spin measurements. For each pair of spins emitted in the course of the experiment, λ is fixed. A commonly used notation for the two measurement results is A and B, not to be confused with the small letters a and b used for the parameters of the two measuring apparatuses. As expected, A and B are functions not only of λ, but also of the measurement parameters a and b. However, locality requires that b have no influence on the result A (since the distance between the two measurement locations is arbitrarily large); conversely, a has no influence on the result B. We therefore write A(a, λ) and B(b, λ) for the corresponding functions, which can take only two values, +1 or −1. Figure 3 schematizes the experiment we are discussing.

To establish Bell’s theorem, it is sufficient to consider only two directions for each individual measurement; we then use the simpler notation:

A = A(a, λ)     A′ = A(a′, λ)     (F-1)

and:

B = B(b, λ)     B′ = B(b′, λ)     (F-2)

For each emitted pair of particles, λ is fixed, and the four above numbers have well defined values, which can only be ±1.

Figure 3: Source S emits particles toward two measuring apparatuses located far away from each other, each being set up with its own measurement parameter, respectively a and b; each apparatus yields a result ±1. The oval under the source symbolizes a fluctuating random process that controls the particles’ emission, and hence their properties. Correlations between the measured results are observed; they are due to the common random properties the particles acquired upon their emission by the fluctuating process.

Consider then the combination of products:

M(λ) = AB − AB′ + A′B + A′B′     (F-3)

which can also be written as:

M(λ) = A (B − B′) + A′ (B + B′)     (F-4)

If B′ = B, the above expression reduces to 2A′B = ±2; if B′ = −B, it reduces to 2AB = ±2. In both cases, we see that M(λ) = ±2. If we now take the average value of M(λ) over a large number of emitted pairs (average over λ), we get:

⟨M⟩ = ⟨AB⟩ − ⟨AB′⟩ + ⟨A′B⟩ + ⟨A′B′⟩     (F-5)

where ⟨AB⟩ denotes the average value over λ of the product A(a, λ) B(b, λ), and a similar notation has been used for the three other terms. As each M(λ) value can only be ±2, we necessarily have:

−2 ≤ ⟨M⟩ ≤ +2     (F-6)

This result is the so-called BCHSH (Bell, Clauser, Horne, Shimony and Holt) form of Bell’s theorem. This inequality must be satisfied by any kind of measurements yielding random results, whatever mechanism creates the correlations, as long as the locality condition is obeyed: A does not depend on the measurement parameter b, and B does not depend on a. This means that any theory that fits in the framework of “local realism” must lead to predictions satisfying relation (F-6). Realism is necessary, since we used in the demonstration the concept of EPR elements of reality to introduce the functions A and B; locality is also essential, as it forbids A to depend on b and, conversely, B to depend on a.

The simplicity of this demonstration is such that the inequality may be expected to remain valid in many situations. This is actually the case whenever the observed correlations can be explained by fluctuations having a common past cause; they are then referred to as “classical correlations”. In such cases, each time the experiment is performed, this common cause λ is fixed, and the four numbers A, A′, B and B′ take on well-defined values (even though they are, a priori, unknown), all equal to ±1; the number M is therefore also well defined, equal to −2 or +2. Whatever the values found in a random series of N repetitions of the experiment, it is mathematically impossible for the sum of these N values of M to be greater than +2N or smaller than −2N. Consequently, the average value obtained by dividing this sum by N necessarily obeys (F-6): the mere existence of these four numbers is sufficient to obtain the inequality.
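As a concrete illustration (a sketch, not from the text), consider a local hidden-variable model in which each pair carries a random angle λ and the results are deterministic local functions A(a, λ) = ±1 and B(b, λ) = ±1; the response functions below are one arbitrary, hypothetical choice. A Monte Carlo estimate of ⟨M⟩ for the four settings of Figure 4 then stays within the BCHSH bound (F-6), as it must for any such model.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local realistic model: lambda is a random angle carried by each pair;
# each result depends only on the local setting (a or b) and on lambda.
def A(a, lam):
    return np.sign(np.cos(a - lam))          # +1 or -1
def B(b, lam):
    return -np.sign(np.cos(b - lam))         # opposite results for equal settings

a, b, ap, bp = np.deg2rad([0.0, 45.0, 90.0, 135.0])   # settings at successive 45 degree angles
lam = rng.uniform(0.0, 2 * np.pi, size=1_000_000)

def E(x, y):
    # average over lambda of the product of the two results
    return float(np.mean(A(x, lam) * B(y, lam)))

M = E(a, b) - E(a, bp) + E(ap, b) + E(ap, bp)
print(f"<M> for this local model: {M:+.3f}")   # stays between -2 and +2

This particular model even saturates the bound (⟨M⟩ ≈ −2), but no choice of local response functions can exceed it, whereas the quantum prediction (F-8) below reaches −2√2.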

F-3-b. Contradictions

It would seem natural for any reasonable physical theory to automatically lead to predictions satisfying this inequality. Surprisingly enough, this is not the case for quantum mechanics; furthermore, this contradiction has been confirmed experimentally.

Contradictions with quantum mechanics predictions

Relations (B-10) allow computing the average value of the product of the ±1 results obtained in the measurements of the two spins along directions making an angle (a − b) with each other. This average value is given by (we write Â and B̂ to emphasize that these letters now denote operators, not numbers):

⟨Â(a) B̂(b)⟩ = P₊₊ + P₋₋ − P₊₋ − P₋₊ = −cos(a − b)     (F-7)

where P₊₋, for example, denotes the probability of obtaining the result +1 on the left and −1 on the right.

This expression is the quantum equivalent of the average value, over the variable λ, of the product of results A(a, λ) B(b, λ) in a local realistic theory. To get the quantum equivalent of the combination (F-3) of four products of results, we must compute the same combination of average values of these products, which yields:

⟨M⟩ = ⟨Â(a) B̂(b)⟩ − ⟨Â(a) B̂(b′)⟩ + ⟨Â(a′) B̂(b)⟩ + ⟨Â(a′) B̂(b′)⟩
    = −cos(a − b) + cos(a − b′) − cos(a′ − b) − cos(a′ − b′)     (F-8)

Imagine now that the four directions are in the same plane, and that the vectors, taken in the order a, b, a′ and b′, each make a 45° angle with the preceding one (Fig. 4); all the cosines are then equal to 1/√2, except for cos(a − b′), which is equal to −1/√2. We then get ⟨M⟩ = −2√2; exchanging the directions of b and b′, we get ⟨M⟩ = +2√2. In both cases, the BCHSH inequality (F-6) is violated by a factor √2, i.e. by more than 40%. In spite of the seemingly simple cosine variation of expression (F-7), we have just shown that no local realistic theory is able to account for it, as this would violate inequality (F-6). The EPR-Bell argument thus leads to an important quantitative contradiction with quantum mechanics, proving that quantum mechanics does not comply with local realism in the EPR sense.


Figure 4: Position of the four vectors a, b, a′ and b′ leading to a maximal violation of the BCHSH inequality for two spin-1/2 particles in a singlet state. These vectors define the spin components to be measured, along a or a′ for the spin on the left, and along b or b′ for the spin on the right; the complete experiment therefore requires four different set-ups. The pair (a, b′) is the only one for which the angle between the two measurement directions is larger than 90°, so that the sign of the corresponding correlation differs from that of the three other pairs.
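To check (F-7) and (F-8) directly from the formalism, here is a short sketch (written independently of the text) that builds the singlet state of two spin-1/2 particles, takes Â(a) and B̂(b) as the spin components (in units of ħ/2) along directions at angles a and b in a common plane, and evaluates the four correlations; it reproduces −cos(a − b) and the value −2√2 for the settings of Figure 4.

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)        # Pauli matrices
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_op(theta):
    # Spin component (in units of hbar/2) along the direction at angle theta in the zx-plane
    return np.cos(theta) * sz + np.sin(theta) * sx

# Singlet state (|+,-> - |-,+>)/sqrt(2) in the basis {|++>, |+->, |-+>, |-->}
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def correlation(theta_a, theta_b):
    # <A(a) B(b)> in the singlet state; expected value: -cos(a - b), cf. (F-7)
    op = np.kron(spin_op(theta_a), spin_op(theta_b))
    return float(np.real(singlet.conj() @ op @ singlet))

a, b, ap, bp = np.deg2rad([0.0, 45.0, 90.0, 135.0])
M = correlation(a, b) - correlation(a, bp) + correlation(ap, b) + correlation(ap, bp)
print(correlation(a, b))   # -> -0.707...  = -cos(45 degrees)
print(M)                   # -> -2.828...  = -2*sqrt(2), violating (F-6)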

How is it possible to encounter such a contradiction, and why does such an apparently flawless argument not apply to quantum mechanics? Several answers can be given:
(i) Had Bohr been aware of Bell’s theorem, he would very likely have rejected the existence of the four preexisting numbers A, A′, B, B′. If these numbers do not exist, the argument of § F-3-a is no longer possible and the BCHSH inequality disappears. Bohr would have considered Bell’s theorem as mathematically correct in probability theory, but totally irrelevant in quantum mechanics, as being improper for the quantum description of the experiment under study. Even if he had accepted to reason about these numbers, as unknown variables to be determined later, as is often done in algebra, would the inequality have survived? The answer is no, still reasoning with Bohr’s logic. As already mentioned in § F-2, Bohr’s point of view is that the entire experiment must be considered as a whole. One cannot distinguish two separate measurements that would be performed, each on one of the particles: the only true measurement process concerns the ensemble of both particles together. A fundamentally indeterministic and delocalized process occurs in the whole region of space containing the entire experiment. The functions A and B then both depend on both measurement parameters, and must be written as A(a, b) and B(a, b); this immediately forces us to abandon locality. Instead of the two numbers A and A′, we now have the four numbers A(a, b), A(a, b′), A(a′, b) and A(a′, b′); the same is true for B and B′, which must also be replaced by four numbers. We must therefore deal with a total of 8 numbers instead of 4. The demonstration of the BCHSH inequality is then no longer possible, and the contradiction disappears.
(ii) One may prefer a more local point of view on the measurement process, which allows keeping the concept of a measurement performed on a single particle. To avoid the contradiction with quantum mechanical predictions, one must then consider that it is meaningless to attribute four well defined values A, A′, B, B′ to each pair. Since at most two of them can be measured in any given experiment, we should not be able to talk about these four numbers, or reason about them even as unknown quantities. A well-known phrase of Peres [75] sums up this point of view very clearly: “unperformed experiments have no results”. Wheeler [76] expresses the same idea when he writes: “no phenomenon is a real phenomenon until it is an observed phenomenon”.

Contradictions with experimental results

The question was then: does Bell’s theorem allow pointing out particular situations where quantum mechanics is no longer valid? Or, on the contrary, are the predictions of quantum mechanics always valid, which immediately entails that some of the hypotheses leading to the inequalities must be abandoned? A great number of experiments have been performed from 1972 on; they all confirmed the predictions of quantum mechanics, measuring, sometimes with great precision [77], the violation of Bell’s inequalities. After a period of doubt, it now seems well established that quantum mechanics yields perfectly correct predictions, even in situations where it implies a violation of Bell’s inequalities. However plausible they might look, at least one of the hypotheses that led to these inequalities must therefore be abandoned.

Conclusion

The concept of quantum entanglement is quite essential; it leads to situations where certain types of correlations, totally impossible in classical physics, can be produced and observed. These situations can occur even when the observations are performed in regions of space arbitrarily remote from one another. A fundamental idea of quantum mechanics, without any classical counterpart, is that the most precise description of a whole does not necessarily entail an equally precise description of its parts. It follows that no theory that is both local and realistic can describe a system containing two remote and entangled particles (such a theory would contradict quantum mechanics). Entanglement also plays an essential role in measurement processes, and comes into play at different levels: entanglement between the measured system and the measuring apparatus, between the apparatus and its environment, between two successive environments, and so forth. We also discussed how entanglement determines the contrast of the fringes observed in an interference experiment where a particle has to cross a plate pierced with two holes. In addition to these important aspects, entanglement plays an essential role in quantum computing: one seeks to take advantage of the parallel evolution of the various entangled branches of the state vector to perform computations. This domain of research has undergone intense development in recent years, but is too extensive to be treated in the present volume. The reader may want to consult specialized books on the subject, for example that of D. Mermin [78]. Entanglement also plays a central role in quantum cryptography, whose aim is to build devices for secure quantum key distribution that


cannot be spied upon, since any eavesdropping is detectable; a review on this subject can be found in the article by N. Gisin, G. Ribordy, W. Tittel and H. Zbinden [79]. There remains the fact that, in the presence of entanglement, and in particular during a measurement process, the standard interpretation of quantum mechanics may present some difficulties. Schrödinger’s evolution equation does not predict the uniqueness of the measurement result observed in the macroscopic world. To obtain this uniqueness within the theory, one can introduce an ad hoc postulate, such as the von Neumann postulate of reduction of the state vector. This raises the question of where to set the border: when exactly should one stop using the continuous evolution of Schrödinger’s equation and impose the reduction of the state vector? How can one reconcile the intrinsic irreversibility of this ad hoc postulate with the reversibility of Schrödinger’s equation? Another open question concerns the status of the state vector. We have used it throughout this book as a mathematical tool, convenient for computing probabilities, but what does it really represent? Does it directly describe physical reality? Or does it simply encode information about physical reality? A number of interpretations of quantum mechanics have been proposed (see reference [68]) that address this fundamental difficulty.

COMPLEMENTS OF CHAPTER XXI, READER’S GUIDE

AXXI : DENSITY OPERATOR AND CORRELATIONS; SEPARABILITY

This complement introduces the von Neumann statistical entropy associated with a density operator, discussing its properties and establishing some important inequalities it must satisfy. Also discussed are the differences between classical correlations and quantum correlations (the latter arising from entanglement). The concept of “quantum non-separability” is introduced.

BXXI : GHZ STATES; ENTANGLEMENT SWAPPING

GHZ states provide an example of conflict between quantum mechanics and the usual concept of local realism. The contradiction is even stronger than for Bell’s inequalities, as it is expressed as an opposition in signs. Entanglement swapping allows entangling particles without them ever having to interact with each other.

CXXI : MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

When two Bose-Einstein condensates overlap, their relative phase is a priori totally undetermined. However, such a phase may appear, induced by a measurement process sensitive to that phase. As measurements proceed, this relative phase will progressively acquire a more precise value.

DXXI : EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES; MACROSCOPIC NONLOCALITY AND THE EPR ARGUMENT

This complement is an extension of the previous one, studying the case where the two condensates are formed of particles with spins. The same phenomenon occurs: the emergence of a relative phase, but in a context where the EPR argument is harder to refute because of the macroscopic character of the measured quantities. Furthermore, situations may arise where Bell’s inequalities are violated, which proves that the measurement induced phase between the two condensates is of a non-classical nature.





Complement AXXI Density operator and correlations; separability

1. Von Neumann statistical entropy
   1-a. General definition
   1-b. Physical system composed of two subsystems
2. Differences between classical and quantum correlations
   2-a. Two levels of correlations
   2-b. Quantum monogamy
3. Separability
   3-a. Separable density operator
   3-b. Two spins in a singlet state

In Chapter XXI, we mainly considered global systems A + B described by a state vector (pure state). This complement examines what happens when these global systems are described by a density operator (statistical mixture); we shall study the correlations quantum mechanics predicts, in that case, between the two subsystems A and B. We start by introducing, in § 1, the concept of statistical entropy, which yields a useful measure of their degree of correlation. We then analyze, in § 2, the differences between classical correlations (introduced at the probability level) and quantum correlations (which can arise from the coherent superposition of state vectors). Finally, in § 3, we come back to the important concept of separability, already introduced in § F-2 of Chapter XXI.

1. Von Neumann statistical entropy

The statistical entropy introduced by von Neumann allows one, in a straightforward way, to distinguish between a pure state and a statistical mixture; in the latter case, it also yields a measure of the statistical character of the information available about the physical system. It is also a useful tool for studying, in a quantitative way, the degree of correlation between two physical systems.

1-a. General definition

With any density operator ρ, we associate a statistical entropy S defined by the relation:

S = −k_B Tr{ρ ln ρ}     (1)

where k_B is the Boltzmann constant. As ρ is Hermitian, this operator can be diagonalized. Noting p_n its eigenvalues, we get:

S = −k_B Σ_n p_n ln p_n     (2)

Since all the p_n are included between 0 and 1, we necessarily have:

S ≥ 0     (3)

where the equality occurs only if ρ has one eigenvalue equal to 1, all the others being equal to zero. The entropy associated with ρ is therefore zero only if this operator is a projector, and hence corresponds to a pure state. On the other hand, whenever ρ describes a statistical mixture, S is different from zero. It takes its maximum value when the density operator has equal populations in all the system’s accessible states, i.e. when it is proportional to the identity operator in the state space. To prove this, let us vary each p_n by an amount dp_n and look for an extremum of (2), while keeping the sum of all the p_n constant by means of a Lagrange multiplier μ. We then get:

dS + k_B μ Σ_n dp_n = −k_B Σ_n [1 + ln p_n − μ] dp_n = 0     (4)

For this expression to vanish for arbitrary dp_n, all the ln p_n, and hence all the p_n themselves, must be equal. One can associate a concept of information, or rather of lack of information, with the entropy S. When the physical system is in a pure state, that state provides the maximum information on the system compatible with quantum mechanics. In this situation there is no lack of information, and S = 0. On the other hand, when the system is spread over several pure states with comparable probabilities, a large value of S means that a lot of information about the system is lacking.

Comment: The statistical entropy characterizes the populations of the density matrix (§ 4-c of Complement EIII ), but not the corresponding eigenvectors. Moreover, the same density operator can in general be obtained from several different statistical mixtures of pure states (cf. comment at the end of §4-a of Complement EIII ); the value of the entropy does not distinguish between these different mixtures.

A statistical mixture of several density operators can only increase the entropy of the system. Imagine, for example, that the density operator ρ is actually a combination of several density operators ρ_q with probabilities π_q (all positive, and whose sum over q is equal to 1), written as:

ρ = Σ_q π_q ρ_q     (5)

Noting S_q the entropies associated with the ρ_q:

S_q = −k_B Tr{ρ_q ln ρ_q}     (6)

we can write¹:

S ≥ Σ_q π_q S_q     (7)

¹ This property is often called “entropy concavity”.




Demonstration: In § 1-b of Complement GXV, we showed that, when ρ and ρ′ are two density operators with traces equal to 1, one always has the following inequality:

−Tr{ρ ln ρ} ≤ −Tr{ρ ln ρ′}     (8)

(the equality occurring only if ρ = ρ′). Applying (8) to each pair (ρ_q, ρ), we can then write:

S = −k_B Tr{ρ ln ρ} = −k_B Σ_q π_q Tr{ρ_q ln ρ} ≥ −k_B Σ_q π_q Tr{ρ_q ln ρ_q} = Σ_q π_q S_q     (9)

which establishes relation (7).
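A quick numerical check of (7) (a sketch, not part of the text): mixing two pure states of a spin 1/2 with equal weights produces a density operator whose entropy is strictly larger than the weighted sum of the individual entropies, which are both zero.

import numpy as np

def S(rho):
    # Statistical entropy in units of k_B: -Tr(rho ln rho), from the eigenvalues of rho
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

rho1 = np.array([[1.0, 0.0], [0.0, 0.0]])          # projector onto |+>
rho2 = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])    # projector onto (|+> + |->)/sqrt(2)
mix = 0.5 * rho1 + 0.5 * rho2

print(S(rho1), S(rho2))   # -> 0.0 0.0 (pure states)
print(S(mix))             # -> 0.416..., larger than 0.5*S(rho1) + 0.5*S(rho2) = 0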

1-b. Physical system composed of two subsystems

We now compare the entropies S_{A+B}, S_A and S_B associated, respectively, with the total density operator ρ_{A+B} and with the partial density operators ρ_A and ρ_B. We are going to show that S_{A+B} = S_A + S_B when the two subsystems A and B are uncorrelated, and that S_{A+B} < S_A + S_B otherwise.

Pure state

Imagine first that the total system is in a pure entangled state. We have seen that, in that case, the two subsystems A and B are not described by pure states but by statistical mixtures, so that:

S_A = −k_B Tr{ρ_A ln ρ_A} > 0
S_B = −k_B Tr{ρ_B ln ρ_B} > 0     (10)

As the entropy S_{A+B} associated with a pure state is zero, it follows that:

S_{A+B} ≤ S_A + S_B     (11)

(the equality corresponds to the special case where the pure state |Ψ⟩ is a product, without entanglement, and where the Schmidt rank is equal to 1; see Chapter XXI, § C-3). We can also use the Schmidt decomposition of |Ψ⟩, which yields relations (C-7) and (C-8) of Chapter XXI, to get:

S_A = −k_B Σ_i q_i ln q_i = S_B     (12)

where the q_i are the weights appearing in that decomposition. Both entropies of the two subsystems are thus always equal whenever the total system is in a pure state.
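The following sketch (not from the text) illustrates relations (10) to (12) numerically: for a pure entangled state of two spin-1/2 systems, the global entropy vanishes while the two reduced density operators, obtained by partial traces, have equal non-zero entropies.

import numpy as np

def entropy(rho):
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))      # in units of k_B

def partial_trace(rho, keep):
    # Two-qubit density matrix; keep = 0 traces out B, keep = 1 traces out A
    rho = rho.reshape(2, 2, 2, 2)
    return np.trace(rho, axis1=1, axis2=3) if keep == 0 else np.trace(rho, axis1=0, axis2=2)

t = 0.3                                                     # entanglement parameter (arbitrary example)
psi = np.zeros(4); psi[0], psi[3] = np.cos(t), np.sin(t)    # cos t |++> + sin t |-->
rho_AB = np.outer(psi, psi)

rho_A, rho_B = partial_trace(rho_AB, 0), partial_trace(rho_AB, 1)
print(entropy(rho_AB))                 # -> 0 (pure state)
print(entropy(rho_A), entropy(rho_B))  # equal and > 0, as in (12)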


Statistical mixture

When the total system is described by a density operator ρ_{A+B} that does not necessarily correspond to a pure state, its entropy S_{A+B} may differ from zero. We are going to show, however, that this entropy always remains lower than or equal to the sum of the entropies of the two subsystems, meaning that relation (11) remains valid in this more general case; this property is referred to as “entropy subadditivity”. The equality in (11) is obtained only in the case where ρ_{A+B} is a product:

ρ_{A+B} = ρ_A ⊗ ρ_B     (13)

which corresponds to the case of two subsystems, separately described by statistical mixtures, while remaining uncorrelated. The difference S_A + S_B − S_{A+B} gives an estimate of the loss of information incurred when the quantum description of the total system is replaced by the separate quantum descriptions of the two subsystems.

Demonstration: According to inequality (8), we can write:

−Tr{ρ_{A+B} ln ρ_{A+B}} ≤ −Tr{ρ_{A+B} ln(ρ′_A ⊗ ρ′_B)}     (14)

where ρ′_A and ρ′_B are, for the moment, arbitrary density operators of the two subsystems. We note |u_i⟩ the eigenvectors of ρ′_A with eigenvalues a_i, and |v_j⟩ the eigenvectors of ρ′_B with eigenvalues b_j. Let us compute the trace on the right-hand side of (14) in the basis of the tensor products |u_i⟩ ⊗ |v_j⟩, which are eigenvectors of ρ′_A ⊗ ρ′_B with eigenvalues a_i b_j; we get:

Tr{ρ_{A+B} ln(ρ′_A ⊗ ρ′_B)} = Σ_{i,j} ⟨u_i, v_j| ρ_{A+B} |u_i, v_j⟩ [ln a_i + ln b_j]     (15)

Let us now choose ρ′_A = ρ_A and ρ′_B = ρ_B. Since the sum over j of ⟨u_i, v_j| ρ_{A+B} |u_i, v_j⟩ is the diagonal element ⟨u_i| ρ_A |u_i⟩ = a_i of the partial trace ρ_A, the first term on the right-hand side can be written as:

Σ_i a_i ln a_i = Tr{ρ_A ln ρ_A}     (16)

The second term on the right-hand side of (15) yields a similar expression, where B replaces A. Finally, inequality (14) can be written as:

−Tr{ρ_{A+B} ln ρ_{A+B}} ≤ −Tr{ρ_A ln ρ_A} − Tr{ρ_B ln ρ_B}     (17)

and leads to (11). The equality occurs if and only if (14) becomes an equality, i.e. if ρ_{A+B} is equal to the product (13).


2. Differences between classical and quantum correlations

Quantum mechanics offers more possibilities than classical physics for describing correlations between physical systems. We now briefly discuss some examples.

2-a. Two levels of correlations

The concept of correlation is not, intrinsically, a quantum notion; it is well known in classical physics, where it is based on probabilistic calculations involving the linear weighting of a certain number of possibilities. In this classical context, one introduces a distribution yielding the probability for the first system to occupy a certain state, and the second system, another state; the two systems are correlated when this distribution is not a simple product. If, on the other hand, the distribution turns out to be a product, the two systems are not correlated: a measurement on one of the systems does not change the information about the other. In particular, this is what happens if the states of the two systems, and consequently that of the total system, are perfectly well defined, a case where the notion of correlation between the two systems becomes irrelevant. This means that the notion of correlation between two classical systems is closely linked to an imperfect definition of the state of the total system.

In quantum mechanics, things are totally different. To begin with, even if a physical system is perfectly well defined by a state vector, many of its physical properties are not precisely defined: over several realizations of the experiment, their measurement can provide fluctuating results. These results can nevertheless be correlated: as an example, we saw in § B of Chapter XXI that the spin components of each of the two spin-1/2 particles are completely indeterminate but perfectly correlated. Such correlations appear directly at the level of the state vector itself, which can be written as a linear superposition of states where the spins have various orientations. The correlations are therefore related to the quantum mechanical superposition principle; this is totally different from combinations of probabilities, which are quadratic functions of that state vector. Letting correlations appear directly at the probability amplitude level gives access to a level that is, in a way, “a step ahead” of the linear weighting of classical probabilities, and maintains the possibility of quantum interference effects. Note, however, that the existence of this level of combinations does not exclude classical probabilities from coming into play: one can also assume, in quantum mechanics, that the state of the total system is only known in a probabilistic way, so that the two probability levels may coexist. To sum up, it is clear that the concept of quantum correlations covers many more possibilities than correlations in classical physics².

2-b. Quantum monogamy

Another purely quantum property is that, if a physical system A is strongly entangled with a physical system B, it cannot be strongly entangled with a third system C. Such a property has no equivalent in classical physics, where, obviously, nothing prevents correlating a third system C with two others A and B, all the while keeping

2 We shall introduce, in § 3, a criterion (negativity of the coefficients of the total density operator expansion into a sum of products) for confirming the quantum nature of the correlations between two subsystems.





their initial correlation. This quantum property is often referred to as “entanglement monogamy”. Let us assume, for example, that two spins are in a state of the same type as the singlet state (B-1) of Chapter XXI: Ψ

1 2

=

: +;

:

+

:

;

:+

(18)

(the singlet state is obtained for = ). How can we add an additional spin without destroying the correlation between the first two? One could imagine the three spin state to be written as: Ψ

= Ψ

:

1 2

=

: +;

:

;

:

+

:

;

: +;

:

(19)

where is any normalized state for the third spin. This ket obviously conserves the same entanglement between spins and as in state (18), but the third spin is then totally uncorrelated with the first two. Another possibility is to choose as a state vector: Ψ

1 2

=

: +;

:

;

:

1

+

:

The density operator describing spins trace (Complement EIII , § 5-b): = Tr

Ψ

;

: +;

and

:

(20)

2

is obtained by taking the partial

Ψ

(21)

Computing the matrix elements of this partial trace shows that: =

1 2

1

: +;

1

+

2

+

:

: +; :

2 1

+

; :

2 2

:

:+ ;

:

:+

: +;

1

;

:+

: +;

:

:

: ;

:+

(22)

One can then distinguish several cases: – If 1 = 2 , we find again (19), and the third spin is not entangled with the first two. The density operator is then written: =

: +;

+

:

: ;

: +; :+

:

: +;

+ :

:

;

+

:+

:

: +;

:

;

:+ :

;

:+

(23)

which is simply the projector onto state (18); it conserves all the entanglement of spins and . – The opposite case is when 1 and 2 are orthogonal, so that Ψ becomes a so called GHZ state (Greenberger, Horne and Zeilinger; cf. Complement BXXI : Ψ 2222

=

1 2

: +;

:

;

:+ +

:

;

: +;

:

(24)



DENSITY OPERATOR AND CORRELATIONS; SEPARABILITY

where, when one goes from the first component to the second, the three spins switch from one state to the orthogonal one. The second line in (22) then cancels out and the partial trace becomes: =

1 2

: +;

:

: +;

:

+

:

;

:+

:

;

:+

(25)

which is a statistical mixture of two possibilities, with probabilities 1/2: the first two spins are either in the state |+ , −⟩ or in the state |− , +⟩. The quantum coherences between these two states (the terms depending on the relative phase) have totally disappeared. The correlation between the two spins is then of a classical nature³, and no entanglement comes into play. – In the intermediate situation, where the two possible states of the third spin are neither parallel nor orthogonal, we see from (22) that a certain coherence remains (non-diagonal elements). The more parallel these two states are, the more the partial density operator resembles that of the two initial spins, which remain entangled, while the third spin becomes less and less entangled with the first two; conversely, the more orthogonal they are, the more the initial spins lose their correlation, which becomes entirely transferred to the three-spin level. This is actually a general property: when two physical systems are maximally entangled, a principle of mutual exclusion makes it impossible to entangle either of them with a third system. Mathematically, this property is expressed by the Coffman-Kundu-Wootters inequality [80].
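The trade-off discussed above can be made quantitative with a small sketch (the notation and the particular family of states below are assumptions, not those of the text): a three-spin state interpolating between a state of type (19) and a GHZ-type state, in which the coherence between |+ , −⟩ and |− , +⟩ in the reduced two-spin operator is proportional to the overlap of the two possible states of the third spin.

import numpy as np

def ptrace_last(rho, d_keep, d_last):
    # Partial trace over the last subsystem
    rho = rho.reshape(d_keep, d_last, d_keep, d_last)
    return np.trace(rho, axis1=1, axis2=3)

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def rho_12(alpha):
    # Hypothetical three-spin state (|+ -> |chi1> - |- +> |chi2>)/sqrt(2) with
    # chi1 = |+> and chi2 = cos(alpha)|+> + sin(alpha)|->.
    chi1 = up
    chi2 = np.cos(alpha) * up + np.sin(alpha) * down
    psi = (np.kron(np.kron(up, down), chi1) - np.kron(np.kron(down, up), chi2)) / np.sqrt(2)
    return ptrace_last(np.outer(psi, psi), 4, 2)

for alpha in [0.0, np.pi / 4, np.pi / 2]:
    coh = rho_12(alpha)[1, 2].real       # coherence <+ -|rho_12|- +>, equal to -cos(alpha)/2
    print(f"alpha = {alpha:.2f}   coherence between |+ -> and |- +> : {coh:+.3f}")
# alpha = 0: the first two spins stay maximally entangled (third spin factorized);
# alpha = pi/2 (GHZ-type state): the coherence vanishes and only classical
# correlations between the first two spins remain.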

3. Separability

From Bohr’s point of view (§ F-2 of Chapter XXI), one must give up the notion of separability. Even when two physical subsystems are well separated in space, it does not follow that each of them separately possesses its own physical properties (as EPR assumed); only the total system, including the measuring apparatuses, can have such properties. On the other hand, we saw in § 2 that, in quantum mechanics, there are two ways of introducing correlations between two systems: a classical way (by assuming they have given probabilities of being found in such or such correlated individual states), and a quantum way (by assuming entanglement directly at the level of a common state vector). We also know that, even though there are situations where quantum mechanics predicts violations of Bell’s inequalities, and hence of local realism, there are many others where its predictions obey these inequalities; a violation is, in a way, the signature of an ultra-quantum situation. It is thus interesting to look for a criterion allowing a distinction between these two types of correlations.

3-a. Separable density operator

Consider a total system described by a density operator and composed of + two subsystems and . Let us assume that can be expanded as a series of + density operators and pertaining to each of the two subsystems, with real and 3 The density operator is separable in a sense that will be defined in § 3, and therefore cannot lead to violations of Bell’s inequalities.

2223

COMPLEMENT AXXI

positive +



coefficients whose sum equals to one, and can be assimilated to probabilities: =

(26)

with: 0

1

and

=1

(27)

Intuitively, one can guess that the correlations contained in must then be + of a classical nature. The total system is, with a probability , described by a density operator that is a product, without correlations, of density operators each describing one of the subsystems. The correlations between these subsystems are therefore introduced in a classical way, even if nothing prevents each subsystem from exhibiting strongly quantum individual properties. Any density operator that can be decomposed as in (26) with positive coefficients is, by definition, said to be “separable” [81, 82]. On the other hand, if any decomposition of + such as (26) necessarily includes coefficients that are not real and positive, the density operator + contains quantum entanglement and is said to be non-separable. When the total system + is separable, correlation measurements between physical properties of the two subsystems and can never lead to violations of Bell’s inequalities. These violations are thus a sure sign of the non-separability of the density operator. Demonstration To show this, let us assume we perform two simultaneous measurements on the systems and , the first one depending on the measurement parameter , and the second, on the measurement parameter . We note ( ) the projector acting in the state space of and corresponding to the measurement result (this projector is the sum of the projectors onto the eigenvectors associated with that measurement). In a similar way, we note ( ) the projector in the state space of corresponding to the measurement result . When the total system is described by the density operator (26), the joint probability of obtaining both results and is: (

)=

(

)

(

)

(28)

with: (

) = Tr

(

)

(

) = Tr

(

)

(29)

As all the numbers appearing on the right-hand side of (28) are positive, this equality has a natural interpretation in classical physics, which is the framework of our present argument. We are dealing with two levels of probabilities. At one level, the total system is prepared, with probability , in a state where the two subsystems are uncorrelated. At a second level, for each value of , the individual states of the subsystems are only known in a statistical way via the probability ( ) of a result , and the probability ( ) of a result .

2224



DENSITY OPERATOR AND CORRELATIONS; SEPARABILITY

We now show that if a relation of the type (28) is verified for all the measurement parameters and , with any positive probabilities ( ) and ( ), and with positive values for all the , Bell’s inequalities are always satisfied. Let’s assume both results and can take on the values 1. To start with, we assume that the physical properties of each pair of systems and depend on a classical random variable ; this variable takes on a series of values , corresponding to each term in the summation over appearing in (28), each with the probability . This means that this summation over can be interpreted as an average value over the random variable . We then assume that the properties of the classical system depend on a different random variable , which determines the result of the measurement performed on . As an example, one can imagine that is regularly distributed on the segment [0 1]; result is a function of , and takes the value +1 on a fraction of the segment of length ( = +1), and the value 1 on the rest of the segment. This function models the probability written on the first line of (29); the measurement result is thus a function of , of the measurement parameter , and of (which replaces ). Finally, we introduce the random variable , which determines the result of the measurement performed on , with a distribution modeling in a similar way the probability written on the second line of (29), for any value of and any choice of the measurement parameter . If we now regroup the three variables , and , as being the three components of a single variable , we reproduce the exact same hypotheses stated at the beginning of § F-3-a in Chapter XXI: the measurement results are functions, the first one of and and the other one of and . The same reasoning then leads to Bell’s inequalities. Note that, at no point in this classical reasoning, did we have to consider the ensemble + as a whole; it was thus to be expected that Bell’s inequality would be established in this case. 3-b.

Two spins in a singlet state

Let’s go back to the example of two spin-1 2 particles in a singlet state: Ψ =

1 2

+

+

(30)

In the basis of the 4 kets + + , + , +, representing the density operator + is written:

(

+

1 )= Ψ Ψ = 2

This matrix density ( as, for example: 1= +

+

+

taken in that order, the matrix

0 0 0 0 0 1 10 0 1 1 0 0 0 0 0 ) has non-diagonal elements between states +

+

(31)

and

+, (32)

To obtain such a non-diagonal term by a sum of products such as (26), will require: 1=

+

+

(33) 2225

COMPLEMENT AXXI



This demands introducing at least one term that contains partial density operators and , both having non-diagonal elements. Now each of these two operators is a positive-definite operator. This means, for for example, that it must have populations (diagonal matrix elements) + + and in the two individual spin states, and the same is true for . The corresponding term will necessarily introduce in ( + ) populations in the 4 states + + , + , +, ; it will then be impossible to cancel those populations by adding other products of density operators (whose populations are positive) with positive coefficients. Consequently, this density operator ( + ) is non-separable, and this is why it can lead to violations of Bell’s inequalities.

2226



GHZ STATES, ENTANGLEMENT SWAPPING

Complement BXXI GHZ states, entanglement swapping

1

2

Sign contradiction in a GHZ state . . . . . . 1-a Quantum calculation . . . . . . . . . . . . . . 1-b Reasoning in the local realism framework . . 1-c Discussion; contextuality . . . . . . . . . . . Entanglement swapping . . . . . . . . . . . . . 2-a General scheme . . . . . . . . . . . . . . . . . 2-b Discussion . . . . . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

2227 2227 2230 2231 2232 2232 2235

Greenberger, Horne and Zeilinger (GHZ) showed in 1989 [83, 84] that violations of local realism even more spectacular than violations of Bell’s inequalities could be observed on systems containing more than two correlated particles. These violations involve a contradiction in sign (and hence a violation of 100 %) for perfect correlations between measurement results, as opposed to inequalities violated by 40% for imperfect correlations. Observation of these violations requires creating an initial entanglement between three particles or more, as will be discussed in § 1. Another example, involving the entanglement of more than two particles, is the “entanglement swapping” method, explained in § 2. This method highlights a surprising property of entanglement: the possibility of entangling together two quantum systems, without them ever having to interact with each other. 1.

Sign contradiction in a GHZ state

We consider a system composed of three spin-1 2 particles, as it is the simplest case for explaining how the GHZ contradiction can appear. 1-a.

Quantum calculation

The three-spin system is described by the normalized quantum state: Ψ =

1 2

+ + + +

(1)

In this equality, the states symbolize the eigenstates of the spin components along the axis in an reference frame; to simplify the notation of the three particle ket, the spins are not numbered: the first sign corresponds to the state of the first spin, the second to that of the second, and similarly for the third spin. The number stands for either +1, or 1. We now look for the quantum probabilities of measurement results of the components of each of the spins 1 2 3 of the three particles along two possible directions: either along the direction, or along the perpendicular direction (Fig. 1). We start with the measurement of the product 1 2 3 . As we now show, Ψ is an eigenvector of this operator product, with eigenvalue , which means that the 2227

COMPLEMENT BXXI



measurement result is certain. The action of the first operator1 is written: 1 [ + (3) + (3)] Ψ 2 1 = [ + (3) + (3) + + + ] 2 2 1 = [ + + + + ] 2

Ψ =

3

(2)

The second operator then yields: 2

3

1 [ + (2) (2)] 3 Ψ 2 1 = [ + + + ] 2

Ψ =

(3)

Finally, the product of the three operators yields: 1

2

3

1 [ + (1) (1)] 2 1 = [ + + + + 2 = Ψ

Ψ =

2

1

2

3

=

Ψ ] (4)

The probability of observing the result (

3

is:

)=1

(5)

whereas the probability ( 1 2 3 = + ) of observing the other result is zero. As the three spins play the same role, it is clear that Ψ is also an eigenvector of the two operator products 1 2 3 and 1 2 3 , with eigenvalues . The corresponding probabilities are therefore: ( (

1

2

3

1

2

3

= =

) =1 ) =1

(6)

It is thus certain that the three products take on the value . We now consider the measurement result of the product of the three spin components along the same axis. We use again (2), but (3) is now replaced by: 2

3

1 [ + (2) + (2)] 3 Ψ 2 1 = [ + + + + ] 2

Ψ =

(7)

and (4) by: 1

2

1 The

3

1 [ + (1) + (1)] 2 1 = [ + + + + 2

Ψ =

2

3

Ψ ]

Pauli matrices are defined in Complement AIV ; the operators =2 .

2228

(8) =

obey the relation



GHZ STATES, ENTANGLEMENT SWAPPING

Figure 1: Schematic set-up for observing the GHZ contradictions. Three spins, initially in state Ψ written in (1), are measured in three different regions of space. In each of these regions a measuring apparatus is placed, with a setting enabling the local observer to choose between two possible spin component measurements, either along , or along . Whatever choices the three observers make, the results given by the three apparatus are = 1, = 1 and = 1.

This shows that Ψ is also an eigenvector of the operator product time with the eigenvalue + . It follows that: (

1

2

=

3

1

2

3

, but this

+ )=1

(9)

One can conclude, with certainty, that the measurement result of this product will be equal to + . Quantum mechanical measurement of a product of commuting operators: Three operators such as 1 , 2 and 3 , acting on different spins, commute with each other; they form a CSCO (Complete Set of Commuting Observables) in the state space of the three spins. One can thus build a basis of eigenvectors 1 2 3 common to the three operators, labeled by the eigenvalue 1 = 1 of 1 , the eigenvalue 2 = 1 of 2 and the eigenvalue 3 = 1 of 3 . Any vector Ψ can be decomposed onto this basis as: Ψ =

( 1

2

1

2

3)

1

2

3

(10)

3

The action of the operator product 1 2 3 on any ket is therefore to simply multiply each of its component ( 1 2 3 ) by the product 1 2 3 . Now the vector Ψ written in (1) is an eigenvector of that operator product, with the eigenvalue . The uniqueness of decomposition (10) then means that the only non-zero ( 1 2 3 ) coefficients are those

2229



COMPLEMENT BXXI

for which: 1

2

3

=

(11)

Suppose we measure, in a first experiment, the component 1 of the first spin. The result 1 = 1 is random. After the measurement, the projection postulate leads to a state that depends on this result, obtained by keeping in (1) only half of the components – those that correspond to the observed 1 value. The components of the projected state vector still obey relation (11), where 1 is now fixed. Similarly, if we continue the experiment and measure 2 for the second spin, the result 2 = 1 is also random, but the components of the new projected state vector still obey that same relation. As now 1 and 2 are both known, the same is true for 3 , whose value is determined by the first two measurements. To sum up, the results observed for each spin component measurement fluctuate from one experiment to another, but these fluctuations are correlated and the product of the three results remain constant. One can obviously do the same analysis for the other sets of operators considered above, 1 , 2 and 3 for example. 1-b.

Reasoning in the local realism framework

Let us leave, for a moment, standard quantum mechanics and examine what a local realistic theory (in the EPR sense of these words) would predict in such a situation. As we are in a particularly simple case where the initial quantum state is an eigenvector of all the observables coming into play (all the results are certain), one could expect nothing particular to happen. On the contrary, we now show that a complete contradiction appears between local realism and the predictions of quantum mechanics. The local realism argument we present is a direct generalization of that used to obtain Bell’s inequalities in § F-3-a of Chapter XXI. We first notice that the perfect correlations imply that the measurement result of a spin component along (or ) of any particle can be deduced from the results of measurements performed on other particles, at arbitrarily large distances. The EPR argument then requires the existence of elements of reality corresponding to these two component directions, that we shall note = 1 for the first spin, = 1 for the second, and finally = 1 for the third. According to the EPR argument, for each experiment (i.e. for each emission of a group of three particles), these six numbers have well determined values, even though they are a priori unknown. These numbers are simply the results that shall be obtained, should measurements be performed later on. As an example, a measurement on the first spin will necessarily yield if the chosen analysis direction is along , or if it is along , independently of the type of measurements performed on the other two spins. To have an agreement with the three equalities (5) and (6) imposes that: = = =

(12)

Now, in the logic of local realism, the same values of , and can also be used for an experiment where the three spin components are measured along the same direction: the result should simply be the product . As the squares of the numbers 2 , 2230



GHZ STATES, ENTANGLEMENT SWAPPING

etc., are always equal to +1, we can obtain that product by multiplying the lines of (12), which yields: =

(13)

That is where the contradiction shows up: equality (9) predicts that the measurement of 1 2 3 must always yield the result + , which has the opposite sign! There could not be a greater contradiction between local realism and quantum mechanical predictions. 1-c.

Discussion; contextuality

Compared with the violations of Bell’s inequality, the GHZ contradiction seems far more spectacular, since a 100 % contradiction is obtained with 100 % certainty. From an experimental point of view, however, the necessity to bring into play three remote and entangled particles is a complex challenge. To easily identify the three spins (deciding which measurement pertains to spin noted , to spin noted , and to spin noted ), and to be sure the three measurements are performed far from each other, let us assume the spins each occupy a different region of space. When the spatial variables are taken into account, the ket (1) can be rewritten more explicitly in the form: Ψ =

1 1: 2

2:

3:

1 : +; 2 : +; 3 : + +

1:

;2 :

;3 :

(14)

where are three orbital states whose wave functions do not overlap. They can for example be entirely localized in separate boxes where the measurements are performed. One then assumes that none of the particles will be left unmeasured and that each of them is separately observed. The experimental procedure is to first choose, for each box, a component or , then perform the three corresponding measurements in each box, obtain the three results , and , and finally compute their product. A first necessary verification is to perform a large number of experiments and measure successively the three products , and , to be sure that the perfect correlations predicted by quantum mechanics are indeed observed (it is an essential step for the EPR argument, which infers from it the existence of 6 separate elements of reality). One then measures the product and, if quantum mechanics is right, one will observe a sign opposite to the EPR prediction. This means that the value obtained in a measurement of 1 (for example) depends on the or components measured on the other spins; this remains true even if the corresponding operators commute with 1 . This leads us to the general concept of “quantum contextuality”: in an experiment where several commuting observables are measured, one must take into account, according to Bohr’s prescription, the ensemble of the experimental set-up (the whole context of the system to be measured); it would not be correct to reason as if these measurements were independent processes. Experimental tests of GHZ equalities have been performed [85, 86]. These experiments require three particles to be placed in the quantum state (14), which is not an easy task. Nevertheless, using elaborate quantum optics techniques, the correctness of quantum mechanical predictions has been verified in such a case, with experiments involving 3 or 4 entangled photons, as well as with NMR (Nuclear Magnetic Resonance) techniques. 2231



COMPLEMENT BXXI

2.

Entanglement swapping

We now describe the “entanglement swapping” method, which enables entangling particles coming from independent sources (i.e. having no common past) through a quantum measurement process. 2-a.

General scheme

Consider two sources 12 and 34 each creating a pair of entangled photons (Fig. 2). The first one creates a photon with momentum k1 and another one with momentum k2 , whose polarizations are entangled in states (horizontal polarization, in the plane of the figure) and (vertical polarization, perpendicular to the plane of the figure). In a similar way, the second source creates a photon with momentum k3 and another one with momentum k4 , whose polarizations are entangled in the same way. The initial state describing the two pairs is the tensor product of two states, each describing two particles: Ψ =

1 [ k1 2

; k2

+ k1

; k2

]

[ k3

; k4

+ k3

; k4

]

(15)

While the two photons emitted by a given source are strongly entangled, no entanglement exists between the two pairs of photons, emitted by each of the two sources. It is useful to introduce the four different states pertaining to the wave vectors k , k : Φ

( )

Θ

( )

1 [k 2 1 = [k 2 =

;k

+

k

;k

]

;k

+

k

;k

]

(16)

with, here again, = 1. These states (often called “Bell states” in the literature, hence the superscript ) form an orthonormal basis of the state space associated with particles and . One can show that (the computation is straightforward but a bit tedious and will not be detailed here): Φ1 4

(+1)

Φ2 3

(+1)

Φ1 4

( 1)

Φ2 3

Θ2 3

(+1)

Θ1 4

( 1)

Θ2 3

( 1)

=

; ; ;

=

; ;

+

;

;

;

(17)

+

;

; ;

(18)

and that: Θ1 4

(+1)

( 1)

;

(to simplify the notation, it is implicitly assumed, on the right-hand side of both equations, that the order of the particle’s momenta is always k1 , k2 , k3 and k4 ). We can then write state (15) in the form: Ψ =

1 2

Φ1 4

+ Θ1 4

(+1)

(+1)

Φ2 3 Θ2 3

(+1)

(+1)

Φ1 4 Θ1 4

( 1)

( 1)

Φ2 3 Θ2 3

( 1)

( 1)

+ (19)

Figure 2 schematizes the experiment to be performed. After they are emitted, the particles with momenta k2 and k3 undergo a measurement in which they interfere. This is achieved by sending these two particles to a beam splitter BS, followed by two 2232



GHZ STATES, ENTANGLEMENT SWAPPING

BS 2

1

4

3

Figure 2: Schematic diagram of the “entanglement swapping” method. Two sources S12 and S34 each emit a pair of entangled particles, with wave vectors k1 and k2 for the first one, k3 and k4 for the second. These sources are independent. A beam splitter BS is inserted in the path of particles k2 and k3 ; it is followed by two detectors D and D that measure the particle number in each of the exit channels and . This measurement has the effect of projecting the state vector, hence bringing the two particles k1 and k4 into a totally entangled state, even though these particles have never interacted. detectors D and D measuring which exit channel were followed by the particles. If the two particles exit through two different channels, the corresponding eigenvector for this measurement result is the state Θ2 3 ( 1) ; this is because, as we show below, the three other states Θ2 3 (+1) , Φ2 3 (+1) and Φ2 3 ( 1) correspond to situations where the two particles exit through the same channel. The measurement thus projects state (19) onto the last of its four components. The net result is that if the two particles with momenta k2 and k3 exit through different channels (which happens one out of four times), the two particles with momenta k1 and k4 reach the state Θ1 4 ( 1) . This means that the two non-observed particles reach a totally entangled state though they can be arbitrarily far from each other. It is worth noting that the initial entanglement concerns the two particles k1 and k2 , and, separately, the two particles k3 and k4 . Performing a suitable measurement on a particle of each pair, one projects the two remaining particles into a strongly entangled state, even though they never interacted at any stage of the process. Demonstration: Let us show that Θ2 3 ( 1) is an initial state of two interfering particles that will lead to their exiting through different channels. We introduce for that purpose the two creation operators k2 and k2 in the individual state with wave vector k2 and polarization or , as well as the two operators k3 and k3 in the individual state with wave vector k3 and polarization or . The state Θ2 3 ( 1) can be written: Θ2 3

( 1)

=

1 2

k2

k3

k2

k3

0

(20)

As the particles go through the beam splitter, their polarizations are not modified, but

2233



COMPLEMENT BXXI

their wave vectors are. In terms of creation operators, this leads to the unitary transformations: 1 2 1 2

k2

k3

+

k2

k3

+

k2

(21)

k3

where the factors come from the phase change in a light beam as it undergoes internal reflection. Similar equalities are obtained for the polarization, so that: k2

k3

1 2

k2

+

k2

k3 k3

+

k2

k3

+

k2

k3

+

k2

(22)

k3

As creation operators in different modes commute with each other, this operator is equal to: k2

k3

k2

so that state Θ2 3 Θ2 3

( 1)

1 2

( 1)

(23)

k3

is transformed, after the beam splitter, into:

k2

k3

k2

0

k3

(24)

This shows that if the state before crossing the beam splitter is Θ2 3 photons are still in two different exit channels after the crossing. If now the state before crossing the beam splitter is Θ2 3

k2

k3

+

k2

1 2

k3

+ =

+

k2

+

k2

k2

k3

k3

, the two

, we must replace (22) by:

+

k2 k3

+

k2

(+1)

( 1)

k3

+

k2

k3

(25)

k3

which means that the two photons always exit the beam splitter through the same channel. In the same way, for the state Φ2 3 ( 1) , we get the operator:

k2

k3

k2

k3

1 2

k2 k2

=

2 k2

+

+ +

k3 k3

+

k2

2 k3

+

k2

k3 k3

2 k2

2 k3

(26)

It shows again that for each term the photons exit through the same channel. The state Θ2 3 ( 1) is therefore the only one that will lead to the photons exiting through different channels.

2234

• 2-b.

GHZ STATES, ENTANGLEMENT SWAPPING

Discussion

In classical physics, it is also possible to obtain correlations between two objects initially totally independent, by sorting objects with which each of them is correlated. To underline the fundamental difference with entanglement swapping, we now discuss such a classical experiment. Imagine that two independent sources emit pairs of correlated objects, numbered 1 and 2 for the first source, 3 and 4 for the second, as in Figure 2. Each time the experiment is performed, each source emits two classical objects sharing a common property (such as, for example, the same color, or opposite angular momenta, etc.). The two sources are nevertheless totally uncorrelated (the objects emitted by two different sources present no correlations between their colors, their angular momenta, etc.). If, however, one selects particular experiments where particles 2 and 3 present a certain correlation (for example identical colors, or else parallel or antiparallel angular momenta, etc.), it is clear that the particles 1 and 4 will also be correlated, even if they never interacted in the past and if they are very far apart. It is a mere consequence of the selection performed in a classical probability distribution, and could be called “classical correlation swapping”. Note, however, that this selection remains purely classical; no entanglement can be produced by this method. Should a Bell experiment be performed on the objects 1 and 4, the correlations obtained will necessarily obey Bell’s inequalities since we are in a classical physics context. The entanglement swapping method, however, allows creating by selection a true entanglement leading to strong violations of Bell’s inequalities. This method is a way of producing stronger correlations than classical correlation swapping, and has been demonstrated in several experiments [87, 88]. Conclusion The two examples we discussed illustrate the variety of situations where quantum entanglement produces significant physical effects, even when the entangled quantum systems are arbitrarily far from each other. In each situation, it is essential to follow the basic rules of quantum mechanics, and perform the computations with a global state vector, including all the physical systems under study. Any attempt to perform separate computations in different regions of space, and then add correlations using classical probability calculations, will necessarily lead to predictions ignoring numerous non-local quantum effects, in contradiction with experimental results.

2235



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

Complement CXXI Measurement induced relative phase between two condensates

1

Probabilities of single, double, etc. position measurements 1-a Single measurement (one particle) . . . . . . . . . . . . . . . 1-b Double measurement (two particles) . . . . . . . . . . . . . . 1-c Generalization: measurement of any number of positions . . . Measurement induced enhancement of entanglement . . . . 2-a Measuring the single density ( 1 ) . . . . . . . . . . . . . . . 2-b Entanglement between the two modes after the first detection 2-c Measuring the double density ( 2 1 ) . . . . . . . . . . . . 2-d Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Detection of a large number of particles . . . . . . . . . . 3-a Probability of a multiple detection sequence . . . . . . . . . . 3-b Discussion; emergence of a relative phase . . . . . . . . . . .

2

3

2239 2239 2240 2241 2242 2243 2243 2244 2244 2245 2245 2247

Introduction Referring to Bose-Einstein condensation (Complement CXV , a system of identical bosons, all occupying the same individual state is called a “condensate”. It is described by a Fock state such as the one given by (A-17) of Chapter XV, where all the occupation numbers are zero, except for , whose value can be very large: :

1

=

0

!

(1)

The operator is the creation operator of a particle in the individual state , and 0 is the vacuum state (for which all the occupation numbers are zero). In a similar way, a “double condensate” is described by a Fock state where 1 particles are in the individual state and 2 particles in the individual state ; its normalized state is written as: Φ0 =

:

1;

:

2

1 1!

=

2

1

0

2!

(2)

We shall focus on the case where the individual states defined but opposite wave vectors : Φ0 = +

:

1;

:

2

=

1 1!

1

2!

+

and

are states with well

2

0

(3)

In such a state, while the occupation numbers are perfectly well defined, the relative phase between the two condensates is completely undetermined; we will confirm this 2237

COMPLEMENT CXXI



Figure 1: The left-hand side of the figure represents two groups of particles prepared independently. The first one is composed of a large number of particles, 1 , all in the same individual state with momentum + along the axis, and propagating towards the right; the second group includes 2 particles, in the other individual state with opposite momentum, propagating towards the left. Each of these groups of independent particles is in a “condensate”. The right-hand side of the figure shows that, after a certain time, the two condensates overlap in space; this allows measuring the positions of the particles in the overlap region. For clarity, the computations are limited to one dimension, taking into account only the coordinate. The first position measurement is totally random, but as measurements continue, there appear a periodic bunching of the observed positions, progressively forming a sharper fringe pattern. These fringes result from the emergence of a relative phase between the two condensates, which can only be a consequence of the position measurements, as it was totally absent at the beginning of the experiment. If the whole process is repeated from the beginning, fringes appear with a position generally different from the first experiment: the phase appearing in each new experiment is totally independent of the one observed in previous experiments.

later (§ 2-a) by showing that measuring the position of a single particle with such a state does not lead to any observable interference fringes. Now, recent experiments [89] have shown that when the positions of many particles are measured, interference fringes can indeed be observed in the region where the two condensates overlap (Figure 1). This remains true even if the condensates have been created in a totally independent way. This fringe pattern corresponds to a well defined value of the relative phase of the two condensates; one may then wonder about the origin of this observed phase. The object of this complement is to study the mechanism responsible for the emergence of this relative phase. We will show that it results from the successive detections of particles, which progressively modifies the initial state: as more and more particles are detected, it produces a progressively increasing entanglement between the two condensates, defining their relative phase in a more and more precise way. 2238



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

During the course of one experiment, the position of the fringes is determined. However, should one repeat the experiment, preparing the condensates in exactly the same way, a new relative phase will progressively appear during the successive particle detections; its value is, in general, completely different from the one previously obtained. This means that if one averages observations over a large number of independent successive experiments, the fringes will be blurred and eventually completely disappear. The emergence of the phase is clearly observable only in the course of one specific experiment. We first compute, in § 1, the probability of measurements concerning the positions of one, two, and more particles; we will show that these probabilities are proportional to spatio-temporal correlation functions of field operators of the various particles. Starting, in § 2, from two condensates in an initial state described by a simple juxtaposition of two Fock states, we will see how the successive particles’ position measurements create an increasing entanglement between the two condensates. A more general study of the system’s evolution is presented in § 3, showing, in particular, how this growing entanglement leads to a better and better definition of the relative phase between the two condensates. The computations presented in this complement are limited to the case where the number of measured positions remains small compared to the total number of particles in the condensates. Complement DXXI , will go a step further and relax this hypothesis. 1.

Probabilities of single, double, etc. position measurements

As we start the successive measurements of the particles’ positions, we begin by computing the probability of finding a first particle in an interval1 of infinitesimal width ∆ around position = 1 , then a second particle in the interval of width ∆ around position = 2 , etc. The computations we present here are valid for any state Φ0 of the identical particle system. They are, actually, the equivalent of those encountered in the general study of correlation functions in § B-3 of Chapter XVI; nevertheless, we will go through them again in the specific context of the present complement. The results will be applied, in § 2, to the particular case where Φ0 is a double Fock state. 1-a.

1

Single measurement (one particle)

=

With a measurement of the position yielding a result included in the interval ∆ ∆ 1 1 + 2 , we can associate the Hermitian operator: 2 ∆ 1+ 2



(

1)

=

d Ψ ( )Ψ( ) 1

(4)

∆ 2

where Ψ( ) is the field operator destroying a particle at point , and Ψ ( ) its Hermitian conjugate, creating a particle. The average value of ∆ ( 1 ) yields the average particle number in the interval 1 . In what follows, we shall, most of the time, assume that ∆ is small enough compared to the other dimensions of the problem to justify the approximation: ∆

(

1)



Ψ (

1) Ψ ( 1)

(5)

1 To keep the notation simple, we consider a one-dimensional problem and note , the 1 , 2 ,.., particles’ positions. Generalizing to three dimensions only requires replacing all the by the vectors r .

2239



COMPLEMENT CXXI

Operator ∆ ( 1 ) is a symmetric one-particle operator of the type described in relation (B-1) in Chapter XV. It can also be written as: ∆ 1+ 2

∆(

1) =

d

:

:

(6)

∆ 2

1

=1

which is the sum over all the = 1, 2, .., particles of the projectors into the interval of the positions of each of them. As all these projectors commute with each other, 1 and since they each have eigenvalues 1 and 0, the eigenvalues of ∆ ( 1 ) are equal to 0, 1, 2, .. . Now, if ∆ is small enough, there can be no more than one particle in the interval 1 ; this means that the only accessible eigenvalues are 0 and 1, so that ∆ ( 1 ) becomes the projector associated with the measurement of a particle’s presence in the interval 1 : [



2 1 )]

(

=



(

1)

if ∆

0

(7)

Suppose now the system is in state Φ0 . The probability in the infinitesimal interval 1 of length ∆ is: 1

(

1)

= Φ0



(

1)

= ∆ Φ0 Ψ (

1

of finding a particle

Φ0 1) Ψ ( 1)

Φ0

(8)

Right after the detection of this first particle, the system is now, according to relation (E-39) of Chapter III (postulate of wave packet reduction), in the normalized state: 1

Φ0 =

1

1-b.

(

1)



(

1)

Φ0

(9)

Double measurement (two particles)

Let us now focus on the probability ( 2 1 ) of detecting a first particle in 1 2 an interval of width ∆ around point 1 , then a second one in an interval ∆ around point 2 ; we assume the system does not have time to evolve in between the two measurements. We start by computing the conditional probability2 ( 2 1 ) of detecting a 1 2 ∆ particle in the interval 2 ∆ noted 2 , knowing that a particle has been 2+ 2 2 ∆ detected in the interval 1 ∆ 1 + 2 already noted 1 . This probability equals: 2 1

2

(

2

1)

= Φ0 =

1 1(



(

1)

2)

Φ0

Φ0 ∆

(

1)



(

2)



(

1)

Φ0

(10)

where, in the second line, we have used (9); the projector ∆ ( 2 ) may be obtained by replacing 1 by 2 in expression (4). We assume that the two detection intervals do 2 Note the different notation used for the conditional probability 1 between the variables, and the simple probability (a priori probability) two results. These two probabilities are related by expression (14).

2240

(

2

1

2

2

1 ),

(

2

with a fraction bar 1 ) of obtaining the



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

not overlap in space, so that all the operators appearing in ∆ ( appearing in ∆ ( 2 ). We then have, taking (7) into account: ∆

(

1)



(

2)



(

1)

=[ =

∆ ∆

2 1 )]

(

(

1)



If ∆ is small enough, we can replace leads to: ∆

(

1)



(

2)

= ∆2 Ψ ( 2

=∆ Ψ (



(

(

2)

(



1) Ψ ( 1) Ψ 1) Ψ

(

commute with those

2)

(11)

1)

(

1)

and



(

2)

by their expressions (5); this

2) Ψ ( 2)

2) Ψ ( 2) Ψ ( 1)

(12)

where, in the second line, we again used the fact that field operators defined in nonoverlapping regions of space commute with each other. Inserting this result in (10), we get: 1

2

(

2

1)

∆2 Φ0 Ψ ( 1 ( 1)

=

1) Ψ

(

2) Ψ ( 2) Ψ ( 1)

Φ0

(13)

Now, the probability of detecting a particle at point 1 , then a particle at point , is the product of the probability 2 1 and the 1 ( 1 ) of detecting a particle at point conditional probability ( ) of detecting a particle at point knowing that a 2 1 2 1 2 particle has been detected at point 1 : 1

2

(

2

1)

=

1

(

1)

1

2

(

2

1)

(14)

Taking (13) into account, this leads to: 1

2

(

2

1)

= ∆2 Φ0 Ψ (

1) Ψ

(

2) Ψ ( 2) Ψ ( 1)

Φ0

= Φ21 Φ21

(15)

where Φ21 is the non-normalized state: Φ21 = ∆

Ψ(

2) Ψ ( 1)

Φ0

(16)

The probability we are looking for is simply the squared norm of the ket obtained by destroying in the initial state a particle at point 1 , and a second one at point 2 , multiplied by the width ∆ of the infinitesimal measurement interval. 1-c.

Generalization: measurement of any number of positions

The previous computations deal with simple and double density measurements; we now generalize them to measurements of higher order densities. From now on and to simplify the notation, we shall omit the subscript in the probabilities . To compute the probability associated with a triple measurement, we start from the expression of the state vector right after the detection of the second particle at 2 . Taking (10) into account, and similarly as for (9), this normalized state is written: Φ0 =

1 Φ0



(

2)

Φ0



(

2)

Φ0 =

1 (

2

1)



(

2)

Φ0

(17) 2241



COMPLEMENT CXXI

or else, if we insert (9) and use (14): 1

Φ0 =

(

1)

2

(

1

=

(

(



1)

2



1)

2)

(

2)



(



1)

(

1)

Φ0

Φ0

(18)

The probability of the third measurement at surements gave results at 1 and 2 , is thus: (

3

1)

2

= Φ0



(

3)

1

=

(

2

1)

3,

knowing that the first two mea-

Φ0

Φ0



(

1)



(

2)



(

3)



(

2)



(

1)

Φ0

(19)

As before, we consider that the position measurement zones do not overlap, so that all the projection operators commute with each other: (

3

=

1)

2

∆3 (

1)

2

1

=

(

2

1)

Φ0 Ψ (

Φ0



1) Ψ

(

(

3)

2) Ψ



(

(

2)



(

1)

Φ0

3) Ψ ( 3) Ψ ( 2) Ψ ( 1)

Φ0

(20)

In the second line, we assumed ∆ was small enough to use the approximate relation (5). As the law of conditional probabilities indicates that the probability of the three measurements at 1 , 2 and 3 is given by: (

3

2

1)

=

(

1)

2

(

3

2

1)

(21)

we simply get: (

3

2

1)

= ∆3 Φ0 Ψ (

1) Ψ

(

2) Ψ

(

3) Ψ ( 3) Ψ ( 2) Ψ ( 1)

Φ0

(22)

which is a direct generalization of (15). The same line of reasoning allows showing that the probability associated with the measurement of positions is proportional to the average value in the system’s state of a product of 2 field operators Ψ and Ψ arranged in normal order, and evaluated at 1 , 2 ,... . The probabilities are therefore equal to the spatio-temporal correlation functions of the field operators arranged in normal order (and multiplied by ∆ ). 2.

Measurement induced enhancement of entanglement

We have reasoned until now in a general way, without specifying the initial state Φ0 of the system under study. We now assume we are dealing with a double condensate, as in (3), and for simplicity we shall take 1 = 2 = (actually the computation that follows only requires the hypothesis 1 particles occupying the 2 ). We thus have individual state with a well-defined momentum } , and an equal number of particles occupying the state with opposite momentum: Φ0 = + 2242

:

;

:

=

1 !

+

0

(23)



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

We propose studying the interference signals that may occur in the single and double counting rates measured on such a state. We shall need the probabilities calculated above as well as expressions (A-3) and (A-6) of Chapter XVI for the field operators: 1

Ψ( ) =

+

1 2

1

Ψ ( )=

+

+

1 2

+

(24)

where (or ) and (or ) are the annihilation and creation operators of a particle in mode (or ), and where is the edge of the box used to normalize the plane waves. The dots on the right-hand side of these formulas stand for the other terms present in the field operator expansions of these operators. Because of the choice of the initial state (23), these additional terms do not play any role in the following calculations, as will be shown below. 2-a.

Measuring the single density

(

1)

Relation (8) now becomes: (

1)

=∆ + :

;

:

Ψ (

1 )Ψ( 1 )

+ :

;

:

(25)

Using expressions (24) for the field operators, and the fact that the cross terms and have a zero average value in the double Fock state (23), we get: (

1)

=



+ :

;

:

+

+ :

;

:

=

2 ∆

(26)

This means that there is no interference in the single density measurement signal. This was to be expected since the initial double Fock state includes no phase that could help determine the eventual position of such fringes. 2-b.

Entanglement between the two modes after the first detection

Relation (9) yields the ket Φ0 , right after the first measurement. It can be written as:

=

∆ 1+ 2

1

Φ0 =

(

d Ψ ( ) Ψ ( ) Φ0

1)

1

∆ 2

∆ 1+ 2

1 (

1)

d 1

+

+

+

+

Φ0

(27)

∆ 2

Taking (23) into account, and for an infinitesimal ∆, we get: Φ0

2

Φ0 +

( +

2

+ 1) 1

:

2

1;

:

1

:

+ 1; +1

: +

1 (28) 2243

COMPLEMENT CXXI



where + stands for the components of Φ0 where a particle occupies an individual state other than and ; these components do not play any role in what follows. Relation (28) shows that the entanglement of state Φ0 has increased as a result of the detection of the first particle. This state now contains a linear superposition of the initial state and two additional states of the global system, 1:+ ; : and :+ ; 1: ; the coefficients of this superposition, and in particular their relative phase, depend on the point 1 where the first particle has been detected. 2-c.

Measuring the double density

(

1)

2

We now compute the probability ( 2 1 ) associated with a double density measurement. Relations (15) and (16) show that the probability is the squared norm of the ket: Φ21 = ∆

Ψ(

2) Ψ ( 1)

:+ ;

:

(29)

Inserting in this equality the first relation (24), the terms symbolized by the dots disappear (as they involve annihilation operators yielding zero when acting on the + and states, the only initially populated states). We obtain: Φ21 =



+

2

2

1

+

:+ ;

1

:

(30)

or else: Φ21 =



(

(

+

1)

2

+

1)

(

1+ 2)

+

(

1+ 2)

(

1

2)

2:+ ;

:

:+ ;

2:

1:+ ;

1:

(31)

The squared norm of this state vector yields the probability: (

2

1)

=

∆2 2

2

(

1) + 4

2

cos2 [ (

2

1 )]

(32)

The presence of the cosine of ( 2 1 ) reveals the existence of a spatial dependence, contrary to what happened for ( 1 ): once a first particle is detected at 1 , the most probable positions 2 for the second detection are those for which ( 2 1 ) is a multiple of . In other words, fringes appear in the double density measurement. 2-d.

Discussion

One may wonder which objects interfere in the double counting signal. They are not waves but transition amplitudes associated with two different paths leading the system from the initial state (23) to the same final state 1 + ; 1 , where each of the two modes has lost one particle. In the above computation, the first path 2 1 corresponds to the term , where one particle with momentum + disappears as it is detected at 1 and the particle disappears as it is detected at 2 1 where it is now the 2 ; the second path corresponds to the term 2244



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

particle with momentum that disappears as it is detected at 1 and the + particle that disappears as it is detected at 2 . The double counting signal observed on a double condensate is very similar to the double photodetection signal obtained, in § 2-c- of Complement EXX , in the study of a product of two one-photon wave packets. In both cases, the signal spatial dependence comes from a quantum interference between the amplitudes of two different paths between the same initial and final states. The difference between the paths comes from a different “switching” between one of the two components of the initial state and one of the two components of the final state. 3.

Detection of a large number

of particles

We now extend the previous reasoning to the case where any number of particles’ position measurements are performed; we shall limit ourselves to the case where remains much smaller than the total particle number of each condensate. 3-a.

Probability of a multiple detection sequence

Generalizing relation (22) allows writing the probability ( tecting a particle at 1 , a particle at 2 , .. a particle at , in the form: (

2

1)

=∆

:+ ; Ψ(

:

Ψ (

) Ψ(

1 )Ψ

2 )Ψ( 1 )

(

2)

Ψ (

:+ ;

2

1)

for de-

)

:

(33)

As before, we use relations (24) to replace the field operators and their adjoints by linear combinations of annihilation operators , and creation operators , . We then get: (

2

1)

=



:+ ; (

: +

( .

(

1

+

)( 1

+

1

)

+ 1

)

) :+ ;

:

(34)

Simplifying hypothesis

When several annihilation operators act successively on the right on the initial ket, each of them introduces a varying factor ; this factor depends on the number of particles already annihilated by the other operators. In the same way, when the creation operators act on the left on the initial bra, they also introduce varying factors. To keep things simple, we shall ignore these variations, assuming that the total detection number is always very small compared to the total particle number in each individual state:

(35) One can then replace all the factors

by

: (36) 2245



COMPLEMENT CXXI

Apart from multiplication by a fixed factor , the only effect of each operator is to vary by one unit the occupation number; this result does not depend on the previous actions undergone by the state vector (all the operators commute, once the above approximation has been made). One can then freely move the annihilation and creation operators in the product of operators appearing in (34). Regrouping all the operators associated with the same values of , we get the operator: ( )=(

+

=

+

)(

+ 2

+

) 2

+

(37)

and expression (34) becomes: (

1)

2

=



:+ ;

:

( )

:+ ;

:

(38)

=1

We are then left with the computation of the average value in the initial state of a product of operators ( ). Expanding each of them according to the second line of (37), we get the sum of 4 products, most of them having, nevertheless, a zero average value in the double Fock state Φ0 . This is because the only products having a non-zero average value are those for which the repeated effect of the annihilation operators is exactly balanced by the effect of an equal number of operators (the particle number in the individual state is then also constant, since the total number of particles must be conserved). Consider then one of those non-zero products. Still using approximation (36), the contribution of each operator ( ) will be one of the three factors ( ), with = 0 1 and: 0(

)=2

1(

)=

2

(39)

The contribution of 0 leaves the particle numbers unchanged, the contribution of +1 replaces a + particle by a particle, and finally that of 1 performs the opposite substitution. Relation (38) then becomes: (

2

1)

=



( ) =1

=0

with

=0

(40)

1

where, when we expand the product in the right hand side of the equation, we retain only the terms for which the sum of all vanishes: =

=0

(41)

=1

This constraint simply expresses the conservation of the particle number in each individual state. 2246

• .

MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

Simple expression for the multiple detection probability

delta

An easy way to impose constraint (41) on all the 0 , and write:

(

1)

2

=



0

d 2

=

2

is now free of constraint. We can then use relation: (43)

0

which amounts to multiplying each term d . We then get: (

1)

2

=

2



d 2

0

Each summation over 0(

)+ =

(42)

=1

where the summation over the 2

( )

0 1

is to introduce a Kronecker

+(

( ) in (40) by exp(

( )

) and summing over

(44)

=1

then yields the quantity:

)

+

( )

[2 + exp(2

+

) + exp( 2

)]

(45)

which can be written as: 2

[1 + cos(2

+ )] = 4

cos2

+

(46)

2

Finally, we obtain the following simple analytical expression for the multiple detection rate: (

1) =

2

2

4∆ 0

3-b.

d 2

cos2 =1

+

(47)

2

Discussion; emergence of a relative phase

Relation (47) allows understanding how the successive measurements enable the progressive emergence of a relative phase. .

Detecting the first particles

Let us start with the very first detection at probability of such an event: 2

(

1) 0

d cos2 ( 2

1

+ ) 2

=

1.

Equation (47) yields the

(48)

The term in cosine appearing in the integral yields the fringes that one expects from the interference between two waves, with wave vectors + and along the axis, and a 2247



COMPLEMENT CXXI

phase shift . However, the summation over indicates that the interference pattern must be averaged over all the possible values of , uniformly distributed between 0 and 2 : this means that the fringes are completely blurred out. The double detection rate at 1 and 2 is obtained by keeping the terms = 1 and = 2 in (47). It is equal to: 2

(

2

1) 0

d cos2 ( 2

2

+ ) cos2 ( 2

1

+ ) 2

(49)

As the first detection has already occurred, 1 is fixed in this equation, and the product of the two cosines yields the probability of finding the second particle at 2 . But the integral over d , which yields the 2 dependence of the probability, is no longer over a phase uniformly distributed between 0 and 2 , because of the presence of the cos2 ( 1 + 2) associated with the first detection; the blurring of the fringes is not as radical as before. For this second detection, the function cos2 ( 1 + 2) actually plays the role of an 1 dependent phase distribution; the two detections are no longer independent. This confirms the qualitative discussion of § 2. This mechanism can be generalized to higher order measurements. As an example, the triple detection rate at 1 , 2 and 3 is equal to: 2

(

3

2

1) 0

d 2

cos2 (

3

+ ) cos2 ( 2

2

+ ) cos2 ( 2

1

+ ) 2

(50)

Once the first two detections have been made at 1 and 2 , the relative phase distribution that comes into play for the third detection is the product of two cosine functions cos2 ( 2 + 2) cos2 ( 1 + 2) – and no longer a single one as was the case before. As the product of two cosine functions yields a sharper curve than a single cosine function, the relative phase is better defined for the third detection than for the second. The process continues in the same way with the following detections, and the phase is more and more precisely defined. This means that it is the first detections that determine the positions of the fringes appearing in the following detections, each of them contributing to a more and more precise definition of the relative phase distribution. This argument is of course only valid for a given experiment. If one performs a new experiment with the same experimental conditions, the first detections will not, in general, happen at the same places as in the first experiment. Consequently, after a large number of detections, a fringe pattern will appear, shifted with respect to the pattern observed in the first experiment. Finally, if one adds up the positions measured in a large number of successive experiments, the fringes average to practically zero, and one gets a quasi-uniform position distribution. .

Emergence of a well-defined relative phase after a large number of detections

After a large number of detections, , the relative phase distribution for the ( + 1) detection is given by the product of a large number of cosine functions, yielding a very narrow phase distribution, centered at a value . One can then replace in (47) all the [1 + cos(2 + )] by [1 + cos(2 + )], so that the probability becomes a product: the detections are now independent, the interference pattern becomes stable with a sharper and sharper contrast. These predictions have been confirmed by numerical simulations based on equation (47). 2248



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

Narrowing of the relative phase distribution The narrowing of the relative phase distribution can be explained by an analytical calculation. Let us assume that after detections this distribution can be approximated by a Gaussian curve centered at , and with a width :

After the ( +1 (

)2

(

( )

+ 1)

(51)

detection, the new distribution will be given by: )2

(

)

2

2

[1 + cos(2

+1

+ )]

(52)

As the function 1 + cos(2 +1 + ) is much broader than +1 ( ), it can be expanded in powers of in the vicinity of = where the distribution ( ) takes on significant values:

1+ cos(2

+1

(

+ ) = 1 + cos(2

) sin(2

One can also expand )2

(

2

+1

+

+1

+

)

1 ( 2

)

)2 cos(2

( ) in the vicinity of =

)

(53)

(54)

2

1 + cos(2

+1

+

)

1 cos(2 2

+1

+

)+

)2

(

)

:

We then multiply (53) and (54) and obtain an expansion for +1 (

+

)2

(

=1

+1

( 1 2

+1 (

):

) sin(2

+1

+

)

1 + cos(2

+1

+

)

(55)

We note that + 1) detection. We +1 ( ) depends on the position +1 of the ( can obtain an average value for +1 ( ) by weighting +1 ( ) by the probability [1 + cos(2 )] for the ( + 1) detection to occur at = +1 + +1 , and integrate +1 over a spatial period of the interference pattern: 2 +1 (

)=

2

d

+1

[1 + cos(2

+1

+

)]

+1 (

)

(56)

0

Since:

cos(2

+1

+

) = sin(2

+1

+

)

= cos(2

+1

+

) sin(2

+1

+

)=0

(57)

and: cos2 (2

+1

+

)=

1 2

(58)

we finally obtain: +1 (

)

3 1 2

(

)2

1 2

+

1 6

3 2

(

)2 )

2 +1

(59)

2249

COMPLEMENT CXXI



where: 1 2

=

1 2

+

+1

1 6

(60)

Equation (60) shows that , meaning that the distribution curve becomes +1 narrower after each detection. One can easily iterate equation (60) to obtain: 1 2 +

=

1 2

+

(61)

6

where is a positive integer. This shows that if phase distribution decreases as 1 .

1

2

, the width of the relative

A similar computation can be made to study the position of +1 ( )’s maximum when increases. One finds that the center of the relative phase distribution is shifted by a quantity proportional to 1 .

Finally, it is interesting to note the link between the uncertainty on the relative phase (which decreases as the detection number increases) and the uncertainty on the difference + between the numbers of particles in the condensates (which, on the contrary, increases). At the beginning of the experiment (before the first detection), we have + = = . After the first detection, we saw in § 2 that the state of the system contains a linear superposition of states + = 1 and = 1: the difference + between + and is no longer fixed and equal to zero, but can take on several values 0, 2. After the second detection, the state of the system contains a superposition of states having always the same value of + + , but values of + that can be equal to 0, 2, .. and so on. After detections, the values of spread out between 2 and +2 . This result is an illustration of the fact + that the relative phase and the difference between the particle numbers between the two condensates are conjugate quantities. Conclusion This complement illustrates how successive measurements on a system having components on two individual states, each with particles, can build up (from zero) a relative phase between these components; for this to happen, the measurements must depend on the relative phase between those two components. Mathematically, relation (47) shows that the results obtained for an ensemble of position measurements (with ) are exactly the same as if an initial well defined phase had existed from the beginning of the experiment, even though it was totally unknown (and could have taken on any value uniformly distributed between 0 and 2 ). The measurements did indeed introduce entanglement and its associated relative phase, but the quantum predictions are equivalent to those obtained by assuming that the measurements only reveal a preexisting phase (as in quantum theories with so-called “additional variables”). The process we have discussed is, however, of an essentially different nature: it is indeed each individual measurement that contributes to a better and better definition of the relative phase for the measurements to come; these will occur at points whose probability distributions depend on the results of all the previously performed measurements. We shall see in Complement DXXI that if, instead of measuring a fraction of 2250



MEASUREMENT INDUCED RELATIVE PHASE BETWEEN TWO CONDENSATES

all the particles, they each undergo a measurement, the phase properties can no longer be understood as that of a classical preexisting (but unknown) phase; these properties clearly become quantum, as shown by the possibility of violations of Bell’s inequalities.

2251

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES

Complement DXXI Emergence of a relative phase with spin condensates; macroscopic non-locality and the EPR argument

1

2

3

Two condensates with spins . . . . . . . . . . . . . . . . . . . 2254 1-a

Spin 1 2: a brief review . . . . . . . . . . . . . . . . . . . . . 2254

1-b

Projectors associated with the measurements . . . . . . . . . 2255

Probabilities of the different measurement results . . . . . . 2255 2-a

A first expression for the probability . . . . . . . . . . . . . . 2256

2-b

Introduction of the phase and the quantum angle . . . . . . . 2258

Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2259 3-a

Measurement number

2

. . . . . . . . . . . . . . . . . 2259

3-b

Macroscopic EPR argument . . . . . . . . . . . . . . . . . . . 2261

3-c

Measuring all the spins, violation of Bell’s inequalities . . . . 2263

This complement continues the discussion of Complement CXXI on the measurement induced emergence of a relative phase between condensates, but in a more general case. We established in CXXI that as more and more measurement results are obtained, their number still remaining smaller than the total particle number, the relative phase of the two condensates becomes better defined. It soon reaches a classical regime where it is (almost) perfectly determined. This necessarily comes with large fluctuations of the numbers of particles occupying the two individual states (or more precisely of their difference), as required by the uncertainty relation between phase and occupation numbers. In the present complement, a first important difference is that we no longer assume that the number of measurements remains small compared to the total particle number. This will enable us to follow the evolution of the phase properties during the whole series of measurements, including the last moments when the number of particles remaining to be measured is just a few units. For these few remaining particles, the fluctuations in the difference of the occupation numbers is necessarily limited to a few units, meaning that the phase can no longer be precisely determined. The phase then comes back to a quantum regime, where one can no longer interpret the measurement results in a classical context (preexisting but totally unknown phase). Another difference with Complement CXXI is that we now assume the two condensates correspond to different individual spin states. Instead of position measurements yielding continuous results, we can now perform measurements on the spin directions, which yield discrete results. This will make it easier to discuss the quantum effects, which can lead to violations of Bell’s inequalities (Chapter XXI, § F-3). Another advantage of dealing with spins is that we can go back to the EPR argument (Chapter XXI, § F-1) in a case where the elements of reality, introduced by EPR, are macroscopic and have, in addition, a simple physical interpretation (spin angular momentum). 2253

COMPLEMENT DXXI

1.



Two condensates with spins

We now assume that the two individual states populated in the condensates are the two states corresponding to two different internal states noted , but to the same orbital state : =

(1)

If ( + ) and ( ) are the creation operators associated with these states, the state Φ0 of the system formed by the juxtaposition of the two condensates can be written as: Φ0 =

1 !

(

+)

(

)

0

(2)

which replaces relation (23) of Complement CXXI ; the total particle number is 2 . By commodity, we will often call “spin states” the two states , and reason as if they were indeed the two accessible states of a spin-1 2 particle. This is just a manner of speaking: according to the spin-statistic theorem (Chapter XIV, § C-1), bosons cannot be half-integer spin particles. The system we consider is actually an ensemble of bosons that have access to only two internal states; these can be, for example, the two =0 and = 1 states of a spin equal to 1, or not necessarily spin states. 1-a.

Spin 1 2: a brief review

For the reasoning that follows, it may be useful to recall a few relations (Chapter IV, § A-2) concerning a spin 1 2 (with no orbital variables). As pointed out above, we are dealing with a fictitious spin, whose operators act on any two internal states, noted by pure commodity. Operator , associated with the first Pauli matrix (Complement AIV ), is the difference between the projector onto the state + and the projector onto the state : = + +

(3)

whereas operators and + and + as: =

+

+

=

+

+

are expressed as a function of the two non-diagonal operators + +

(4)

As for the fictitious spin component along a direction in the angle with the axis, the corresponding operator is written: = cos

+ sin

Its eigenvalues are =+1

= 1

2254

1 2 1 = 2 =

=

=

+

+

+

plane, making an (5)

1, and its eigenvectors can be expressed as: 2

+ + 2

+ +

2

2

(6)

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES as can be easily checked by applying operator (5) to these relations. The projector onto the ket with eigenvalue can thus be written: 1 [1 + 2 1 = 1+ 2

( )=

1-b.

] +

+

+

(7)

Projectors associated with the measurements

For an ensemble of identical bosons having orbital variables, we note Ψ (r) the two field operators associated with the two internal states . The operator associated with the total particle density at point r is the sum of the local densities, Ψ+ (r)Ψ+ (r) corresponding to the + spin state, and Ψ (r)Ψ (r) corresponding to the state: (r) = Ψ+ (r)Ψ+ (r) + Ψ (r)Ψ (r)

(8)

As for the operator associated with the spin component along the axis, relation (3) indicates that it is the difference: (r) = Ψ+ (r)Ψ+ (r)

Ψ (r)Ψ (r)

quantization

(9)

According to (5), the operator associated with a measurement performed along a direction of the plane making an angle with the axis is written as: (r) =

Ψ+ (r)Ψ (r) +

Ψ (r)Ψ+ (r)

(10)

The measurements we are interested in pertain both to the position of the particles and their spin: for each measurement, the position is measured in an infinitesimal volume ∆ centered at point r and, when measuring the direction of the spin along the direction, we obtain the result = 1. By analogy with (7), the projector associated with such a measurement can be written: (r

)=

1 2 ∆ 2

d3

(r ) +

(r )



(r ) +

(r )

(11)

where (r) and (r) are given by (8) and (10). Operator (r ) projects both the orbital variables onto this small domain and the spins onto the eigenstate of the component along the axis, with eigenvalue = 1. 2.

Probabilities of the different measurement results

Consider now a system of 2 bosons in the state (2). Spin measurements are performed in a series of regions of space, which cover the whole extension of the orbital wave function (r) without overlapping. The measurements are supposed to be ideal, so that every particle is detected. The regions are supposed to be sufficiently small to obtain a negligible probability of double detection in any of them. Those where a particle is actually detected are centered at r (with = 1, 2, .., 2 ), as illustrated in Figure 1. 2255

COMPLEMENT DXXI



Figure 1: Two condensates, each having particles, the first one with + spins, and the other with spins, share the same orbital wave function (r), represented by the oval in the figure. Measurements of the transverse direction of the spins are performed in 2 non overlapping regions of space, centered at points r (with = 1, 2, .., 2 ). In each region, the measurement is performed along a transverse direction (perpendicular to the quantization axis), defined by an angle , which may depend on ; the corresponding result is = 1.

In each of these regions, one performs a measurement of the spin component along a transverse direction (perpendicular to the quantization axis), defined by the angle , and the measurement result is = 1. We now calculate the probability of getting a series of results = 1 in those 2 regions. 2-a.

A first expression for the probability

The associated projectors (r ) all commute with each other, since they contain field operators (and their adjoints) at different points in space (we assume that all measurements are done simultanously, or separated by a very short time). The probability 2 of a result is therefore the average value in the state Φ0 of the product of projectors: 2 2

(

) = Φ0

(r

) Φ0

=1

=

∆ 2

2

2

Φ0

Ψ+ (r )Ψ+ (r ) + Ψ (r )Ψ (r ) +

Ψ+ (r )Ψ (r )

=1

+

Ψ (r )Ψ+ (r )

Φ0

(12)

where symbolizes the ensemble of the variables ( 1 1 ). As the operators commute, we can also move all the field operators Ψ (r ) towards the right, and their adjoints Ψ (r ) towards the left. We now introduce a basis (r) for the wave 2256

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES functions, its first element being the wave function of the populated states (1). This basis allows expanding the field operators according to relation (A-14) of Chapter XVI, and we can write: Ψ (r ) =

(r )

(13)

where is the annihilation operator of a particle in the individual state . Now the only term that actually plays a role in this expansion is the = 1 term: all the other = 1 terms yield zero when acting on the state Φ0 , which only contains particles in the orbital state 1 (r) = (r). One can simply replace Ψ (r ) by (r ) . The same is true for the adjoint of the field operators which, acting on the bra placed on their left, can only destroy particles in a state previously populated; consequently, we can also replace Ψ (r ) by (r ) . Once these replacements have been performed, we get an expression that can be written, in a symbolic way, as: 2

∆ 2

(

)= 2

2

Φ0 :

(r )

2 +

+

+

=1

+

+

+

+

: Φ0

(14)

The two dots surrounding the product over in this symbolic writing express the following convention, which originates from the rearrangement of the operators Ψ (r ) and Ψ (r ) mentioned above: in each of the 42 terms of the product of sums, all the annihilation operators are regrouped towards the right, and all the creation operators towards the left. To obtain the probability we are looking for, we have to compute the average values in the state Φ0 of 42 products, in normal order (Complement BXVI , § 1-a- ), of creation and annihilation operators in the two states =1 . The situation now becomes very similar to that leading to relation (37) in Complement CXXI . The computation that follows is indeed similar, except for the fact that we no longer use the approximation (35) of that complement (number of measurements small compared to the particle number): we now assume that all the particles are measured. Actually, most of the terms of the product over appearing in (14) have a zero average value in the state Φ0 . The only relevant terms are those that contain exactly annihilation operators other annihilation operators , in which case + and their action yields the vacuum, hence a normalized ket; if this is not the case, the result is zero. For a similar reason, they must also contain exactly creation operators + and other , otherwise the result is zero. All these non-zero terms have the same average values, since the product of operators in the normal order introduces each time 2

2

the same factor ! ! , that is ( !) ; we also get the product of 2 , which can take one of these 4 values:

+1 +1

=

1

1

=1

coefficients

(15)

and: +1

1

=

1 +1

=

(16) 2257



COMPLEMENT DXXI

Now +1 +1 corresponds to a term associated with a particle destruction in +, followed by a particle creation in that same state. In the same way, corresponds 1 1 to an annihilation-creation in the individual state . Finally, +1 1 corresponds to an annihilation in state followed by a creation in state + , and the opposite for . All the non-zero terms therefore correspond to products of 2 numbers 1 +1 such that the sum of the and the sum of the are both zero; this condition automatically ensures the conservation of the particle number in each individual state. The final result is: (

2

2

2

∆ 2

)=

2

( !)

(r )

2

with

=1

=

=0

(17)

= 1

where, in the right hand side, we retain only the terms satisfying the double condition: =

=0

=

=0

2-b.

(18)

Introduction of the phase and the quantum angle

Because of the summation constraints, expression (17) is not easy to handle. This is why we introduce two delta functions 0 and 0 , which obey:

0

=

0

=

+

d 2

+

d 2

(19)

This amounts to multiplying in (17) each by ( + ) and integrating over the two angles and . This enables us to write the probability in the form: 2

(

)= ∆ 2

2 2

+

( !)

d 2

+

d 2

2

(r )

(

2

+

)

(20)

=1

where the summations over and are now independent, thanks to relations (19) that automatically ensure the constraints are obeyed. For each value of , each sum contributes the factor: +1 +1

(

+

)+

1

1

(

+

)+

= 2 cos ( + ) + 2 2258

+1

cos (

1

(

)+ )

1 +1

(

)

(21)

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES We finally make the change of variables1 : Λ= + =

(22)

This leads to the simpler expression: 2

(

)= 2

(∆)

2

+

dΛ 2

( !)

+

d 2

2 2

(r ) [cos Λ +

cos (

)]

(23)

=1

The following discussion is entirely based on this result. For reasons that will be explained below, is called the phase, whereas Λ is called the “quantum angle”. Comment: It will be useful in what follows to note that the right-hand side of the above equality stays the same if we make the change: +

dΛ 2

+

2

2 2

dΛ 2

(24)

To show this, we can decompose the integral over dΛ in a sum = 1 + 2 , where 1 is the integral between 2 and + 2, and 2 , the integral between 2 and 3 2 (as the period of the function to be integrated is 2 , any integration domain covering the entire circle is equivalent). Now 2 is just equal to 1 . This is because the function to be integrated is multiplied by ( 1)2 = 1 when one changes Λ to Λ = Λ as well as to = . Consequently, changing the integration variables Λ, to Λ , allows giving to 2 the same integration domain as 1 .

3.

Discussion

Let us examine first the case where the number of measurements is negligible compared to the particle number in each condensate; this will enable us to compare the results obtained with those of Complement CXXI . 3-a.

Measurement number

2

We first recall a general property of quantum mechanics concerning compatible observables (Chapter III, § C-6-a). When several operators , , , etc. commute with each other, one can build a basis with their common eigenvectors. The scalar product of these eigenvectors and the system state vector yields the probability amplitude for finding the system in each of these eigenvectors. If the eigenvalues are non-degenerate, the squared modulus of this amplitude yields the probability of finding the corresponding eigenvalues upon a series of simultaneous measurements associated with all the operators 1 The Jacobian of this change of variables is equal to 2, and this factor should be introduced in the denominator. Nevertheless, since the integrated function is periodic, this factor can be taken into account by reducing the integration domain of the two variables and Λ to the interval , + , which reduces the area of the integration domain by a factor 2.

2259

COMPLEMENT DXXI



, , , etc. If they are degenerate, we just have to sum the probabilities over all the orthogonal eigenkets. This is the rule we have followed until now in this complement. Imagine now that we ignore the measurement results associated with one or several operators of the series, for example and ; the probabilities of the measurement results we still consider relevant are then simply the sum over the possible results associated with the ignored measurements (sum of the probabilities of exclusive events). One can also imagine another situation where the quantities associated with and are actually never measured. The possible series of eigenvalues of the measured operators are then less numerous than in the previous case (since a smaller number of measurements is performed), which increases the degree of degeneracy of these eigenvalues. As for the eigenvalues of and , even though they do not correspond to actual measurements, they can still be used as indices to distinguish between the different orthogonal eigenvectors associated with measurement results of operators , , etc. Consequently we still have to sum the probabilities on these different eigenvalues, just as we did in the case where these measurements were ignored. This means that quantum mechanics yields the same probabilities whether we assume that the measurement results of and are ignored or have never been measured. Let us now compute the probability of obtaining results 1 when performing measurements on the spins. As we already know the probability (23) corresponding to the case = 2 , we can consider that all the 2 measurements have been performed, but that we ignore the results of 2 among them. As we just discussed, this amounts to summing in (23) the probabilities of the two possible results for each of these 2 ignored measurements, i.e. the probabilities associated with two opposite values of the . It follows that in the product over in (23), cos ( ) will disappear from all the 2 terms, leaving only cos Λ. We get for the following expression (omitting from now on the numerical factors, which are not relevant for our discussion): (

) +

2 2

+

dΛ 2 [cos Λ] 2

d 2

2

(r ) [cos Λ +

cos (

)]

(25)

=1

In this expression, the notation ( ) now stands for pairs of variables ( ), instead of 2 as before. The integral over dΛ contains the function cos Λ to the power 2 ; if 2 , this power is very high, and the function becomes a very narrow peak centered at Λ = 0. This allows us to write: +

(

)

d 2

2

(r ) [1 +

cos (

)]

(26)

=1

This result is very similar to the one obtained in relation (47) of Complement CXXI , namely a product of two positive individual probabilities2 : (

)=

1 2

2

(r ) [1 +

cos (

)]

2 Thanks to the factor 1 2, the sum of the two probabilities ( of presence of a particle in the detection volume.

2260

(27) =

1) is normalized to the probability

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES This product is then averaged over the angle , which can take on any value between and + . As we show below, probability (27) is actually the probability of finding the result when measuring the component of a single spin along an axis with direction , assuming that spin was initially polarized along a direction defined by the angle . Demonstration: We call the spin quantization axis, and consider a spin polarized in the (transverse) direction in the plane , making an angle with the axis. Relation (6) indicates that its state is then: =

1 2

2

+ +

+

2

(28)

When measuring the spin component along an axis defined by the angle associated with the = +1 measurement result is: =

1 2

2

+ +

+

2

, the state

(29)

Consequently, the probability of that result is: =+1

=

2

1 ( ) 2 ( + 4 1 = [1 + cos( )] 2

=

) 2 2

= cos2

2

(30) (31)

As for the probability of the = 1 result, it is simply the complementary probability, obtained by changing the sign in front of cos( ). In both cases, we get the probability given by (27).

This means that can be interpreted as the relative phase between the two condensates. In this case, the predictions of quantum mechanics are identical to the predictions of a theory where the phase would be considered as a classical quantity, perfectly determined but as yet unknown at the beginning of the experiment. From such a point of view, this phase would be revealed more and more precisely by the successive measurements, instead of being created as assumed in the standard quantum mechanics interpretation. Therein lies a link to the heart of the EPR argument. 3-b.

Macroscopic EPR argument

The EPR argument was presented in § F-1 of Chapter XXI. It is based on the double hypothesis of reality and locality, as well as on the assumption that all quantum mechanical predictions are correct. The conclusion of the argument is that quantum mechanics is necessarily incomplete; to render it complete, “elements of reality” must be added to it. In an EPRB experiment, involving two spins in a singlet state, these elements of reality can be spin directions, well defined even before any measurements has been performed. Such an addition necessarily falls outside the framework of standard quantum mechanics. Bohr was opposed to it; he argued that the concept of elements of reality proposed by EPR could not be relevant for microscopic systems, since it was meaningless to try and dissociate them, conceptually, from their experimental surrounding. As we 2261

COMPLEMENT DXXI



discussed in Chapter XXI, this position is logically sound; it allows invalidating the conclusions of the EPR argument. However, we are going to show that the double condensates offer another context for applying the EPR reasoning, particularly interesting as it involves macroscopic quantities (as well as the conservation of angular momentum). These physical quantities can, in principle, be on our scale, thereby making it more difficult to deny them an independent physical reality. We consider a physical system in a quantum state similar to (2), where the two internal states of the particles are eigenstates of the spin components along the quantization axis, for example the = 0 and = 1 states of a spin = 1. For the clarity of the discussion, we will assume that the orbital wave functions of each condensate are distinct but overlap in two regions of space, as schematized in Figure 2. In each of these two regions (which may be separated by an arbitrarily large distance), two observers, Alice and Bob, perform measurements of the spin components along transverse directions3 , measurements for Alice, for Bob. For each measurement performed, each of the observers chooses an arbitrary direction defined by an angle ; Alice’s choices are completely independent of those of Bob, and vice versa. A first series of measurements (1 ) is performed by Alice in a first region of space; right after that, Bob performs another series ( +1 + ) in his own laboratory, located very far away. Now we saw that, as soon as Alice has measured the spins of a few particles, the relative phase of the two condensates in the entire space is fixed with a fairly good precision (the larger the number of measurements, the better the precision). These measurements also fix the transverse direction of the spins. Alice cannot, however, decide what this direction will be, as it is fixed in a totally random way in the measurement process. Standard quantum mechanics then predicts that when Bob will perform his own measurements, it is practically certain (within a negligible error) that he will find the same relative phase. As he can perform a large number of measurements, he can find out, practically instantaneously, the spin direction created and observed by Alice. The EPR argument underlines that, as no interaction had time to propagate from Alice to Bob, it is not possible for this transverse orientation to have been created by Alice’s measurements: it necessarily existed prior any measurements. What is new in our case, compared to the two-spin case, is that Bob’s observations may concern an arbitrarily large number of spins; his experiment then amounts to measuring the angular momentum direction of a macroscopic spin system, which has an arbitrarily large angular momentum. As we are now dealing with macroscopic quantities, one can no longer argue, as Bohr did, that the microscopic world is accessible neither to human experimentation nor to human language description. In our present case, it seems more artificial to refuse, as suggested by Bohr, to consider separately the physical properties of systems located in distinct regions of space. The EPR argument becomes harder to refute. Reference [90] contains a discussion of this unexpected situation in terms of conservation of angular momentum.

3 The longitudinal direction is the direction of the spin polarization in the initial state (2), the transverse directions are all the perpendicular directions.

2262

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES

Figure 2: Scheme of an experiment on a double spin condensate, one in a + internal state, the other in a state. The two condensates have distinct orbital wave functions, overlapping in two regions of space where two observers, Alice and Bob, perform measurements. These two regions can be separated by an arbitrarily large distance. Alice and Bob measure the spins one by one, choosing each time a transverse component (perpendicular to the quantization axis) defined by an angle . Whereas, initially, the two condensates have no relative phase, the first measurements performed by Alice create one. The paradox is that this phase propagates instantaneously to the remote region where Bob performs his measurements. This is reminiscent of the EPR argument, but in a more striking case where the EPR “elements of reality” can be macroscopic.

3-c.

Measuring all the spins, violation of Bell’s inequalities

Imagine now that we measure all the spins; the selective effect around the Λ = 0 value does not occur any longer. The interpretation in terms of probabilities of individual events is no longer possible: the factor [cos Λ + cos ( )] in (23) can sometimes take on negative values, which rules out its possibility to be considered as a standard probability. Actually, it is not unusual in quantum mechanics that purely quantum effects arise through “negative probabilities ”, as in this case. The angle Λ is called the “quantum angle ”, which underlines that its role is to introduce such quantum effects, such as non-locality effects and violations of the Bell’s inequalities. To prove that such violations occur for any value of requires using relation (23), which involves many parameters (all the measuring angles, which are arbitrary); it is easier to perform a numerical calculation as explained in the second reference of [90]. Our objective here is to simply show that the phase does not always behave in a classical way. This is why, without presenting the numerical calculation, we shall study the behavior of expression (23) for the simple case of two measurements on two spins ( = 2 and = 1). This will enable us to show that this expression does predict the existence of violations of the inequalities, for certain cases (more general cases are treated in the above reference). Clearly, in the case of two spins we could have carried out this computation in a simpler and more direct way. Using definition (A-7) of Chapter XV for the Fock states, the state (2) can be 2263



COMPLEMENT DXXI

written in the form: Ψ =(

+)

(

) 0 =

1 [1: 2

1 [ 1 : +; 2 : 2

= 1 : ;2 :

+; 2 :

+ 1:

+ 1:

;2 :

+]

;2 : + ]

(32)

This leads to an entangled spin state, very similar to the one considered in § B of Chapter XXI. The only difference is the + sign in the present spin state, instead of the in the singlet state considered in that chapter, but this difference is of no great consequence (we shall come back to this point more precisely – see note 4). Such a state can be expected to lead to significant quantum effects, as for example to situations violating Bell’s inequalities. We can also go back to the general relation (23) to show that it indeed leads to violations of Bell’s inequalities in this simple case. For = 1, this relation becomes: 2

(

1

1; 2 +

2)

dΛ + d [cos Λ + 1 cos ( 2 2 1 = [1 + 1 2 cos 1 cos 2 + 2

1 )] [cos Λ

1 2

sin

1

+

sin

2

cos (

2 )]

2]

(33)

where we have used the fact that the average value on the circle of cosine squared or sine squared is equal to 1 2, whereas the average value of the product of cosine and sine is zero. We obtain: 2

(

1

1; 2

2)

1 [1 + 2

1 2

cos (

2 )]

1

Normalizing to unity the sum of the 4 probabilities obtained for we finally get: 1 cos2 2 1 1) = 2 ( 1 +1) = sin2 2

2

(+1 +1) =

2

(+1

2

( 1

1) =

1

(34) 1

=

1 and

2

=

1,

2

2 1

2

2

(35)

These relations are very similar to equalities (B-10) of Chapter XXI, if we make the change4 : 1

+ (36)

2

The angles 1 and 2 now play the role of the analyzer orientation angles in Figure 2 of that chapter. Now we know (Chapter XXI, § F-3) that these equalities lead to significant violations of Bell’s inequalities (by a factor 2), hence to marked non-local quantum effects; such effects should therefore be expected in our present case. 4 This

2264

change results from the + sign in the spin state (32), instead of the

sign in the singlet state.

• EMERGENCE OF A RELATIVE PHASE WITH SPIN CONDENSATES In the general case where can take on any value, measuring the totality of the spins may lead to strong violations of Bell’s inequalities, provided the measurement angles are judiciously chosen (for large , the angular domain leading to such violations decreases as the inverse of the square root of that number [90]). Note however that these violations will disappear as soon as certain spins are no longer measured (or their corresponding results no longer taken into account, which amounts to the same thing). Conclusion This complement is an illustration of the limits of the phenomenon studied in the previous complement: the process of successive measurements builds up a phase that has all the properties of a classical phase, but only up to a certain point. If all the particles are measured, and for certain particular choices of the measuring angles, in an ideal experiment the phase should exhibit some distinctly quantum properties. Furthermore, one could have expected that the extreme quantum properties, as for example their non-local aspects discussed in §§ F-1 and F-3 of Chapter XXI, would only concern systems with a small particle number, or in singlet spin states (these having a special status among all the states accessible to a physical system). The present complement shows that this is not at all the case: in principle, the same properties should exist for systems composed of a very large number of particles, in a fairly simple quantum state (a double condensate).

2265

FEYNMAN PATH INTEGRAL

Appendix IV Feynman path integral

1

2

3

4

Quantum propagator of a particle . . . . . . . . . . . . . . 1-a Expressing the propagator as a sum of products . . . . . . 1-b Calculation of the matrix elements . . . . . . . . . . . . . . Interpretation in terms of classical histories . . . . . . . . 2-a Expressing the propagator as a function of classical actions 2-b Generalization: several particles interacting via a potential Discussion; a new quantization rule . . . . . . . . . . . . . 3-a Analogy with classical optics . . . . . . . . . . . . . . . . . 3-b A new quantization rule . . . . . . . . . . . . . . . . . . . . Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-a One single operator . . . . . . . . . . . . . . . . . . . . . . 4-b Several operators . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . .

2267 2268 2268 2272 2272 2274 2274 2274 2275 2276 2276 2279

In Chapter III, we introduced the postulates of quantum mechanics using the Hamiltonian approach, with quantization rules applied to conjugate Hamiltonian variables. It is however possible to introduce quantum mechanics and its quantization rules in an entirely different way, starting from a classical Lagrangian, and using Feynman path integrals. This approach provides an interesting insight to the relationship between classical and quantum physics, reminiscent of the connections between geometric and wave optics. Furthermore, this approach is preferable in a certain number of cases, in particular for situations where we know the classical Lagrangian but not the conjugate variables necessary to define a Hamiltonian1 . This appendix is an elementary introduction to Feynman path integrals, and some of their properties, without too much concern for mathematical rigor. We first study in § 1 the quantum propagator of a particle, and then show in § 2 how to express it as the sum of contributions coming from different classical histories (possible evolutions) of the physical system. Once these results have been established, we discuss in § 3 how to take the inverse point of view, and start from these classical histories to derive the usual form of quantum mechanics, its quantization rules, its propagators and, in § 4, its operators. For the sake of simplicity, we shall only consider an ensemble of particles interacting via a position dependent potential; for the study of more general cases (vector potential, commutative or non-commutative gauge invariance), the reader may consult references [91], [92], or [93]. 1.

Quantum propagator of a particle

Consider a spinless particle. The propagator (r

;r ) = r

(

) r

(r

; r ) of this particle is defined as: (1)

1 This happens, for example, if the Lagrangian does not depend on the time derivative of a coordinate , in which case one cannot define the conjugate momentum .

2267

Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.

APPENDIX IV

where the kets r are the eigenkets of the position operator, and ( ) the evolution operator between the initial time and the final time (Complement FIII ). This propagator gives the probability amplitude for the particle, starting at time from a state localized at point r, to be found at a later time at point r . 1-a.

Expressing the propagator as a sum of products

We can split the time interval [

] into

equal smaller segments by setting:

=

(2)

We therefore introduce which are equal to: = +

1 intermediate times

=1 2

)=

(

2,

..,

, ..,

1

(3)

1)

(

1)

=

0

(

and

1)

2

(

=

1

. We can then express

0)

(4)

Between each evolution operators, we introduce a closure relation in the (

d3

)=

1

(

d3

2

1)

r

(see Figure 1),

1

It will also be useful for what follows to set ( ) as a product of terms: (

1,

d3 r

1

d3

1

(

1

1

(

r

2)

1) r

r

basis:

2

(

r1

1

r

0)

1

(5)

Inserting this equality in (1), the propagator is expressed as: (r

d3

;r ) =

1

(

r r

d3

1)

(

d3

2

1)

r r

1 1

d3 r

1

r1

1

( (

2)

1 1

0)

r

r

2

(6)

We now let tend towards zero (or, equivalently, towards infinity). The number of integrations over the d3 will then tend towards infinity but the matrix elements are now those of the evolution operator over an infinitesimal time . 1-b.

Calculation of the matrix elements

We now compute the matrix elements r ( ators, with = 1 + . The particle’s Hamiltonian and the energy associated with an external potential: =

P2 + 2

(R)

1) r 1 of the evolution operis the sum of the kinetic energy

(7)

where P and R are, respectively, the momentum and position operators of the particle; is its mass. 2268

FEYNMAN PATH INTEGRAL

Figure 1: To express the evolution operator as a product of operators, one introduces between the initial time and the final time a whole series of intermediate times 1 , , .., time intervals. 2 , .., 1 ; an evolution operator is associated with each of the Introducing at each time a closure relation then leads to a summation over all the possible positions r of the particle at that time. .

Free particle

For a free particle, the Hamiltonian is simply the kinetic energy. Introducing a closure relation on the momentum eigenstates, we can write: r

P2

2

}

r

1

=

1 (2 )

3

d3

k (r

r

1)

}

2

2

(8)

We show below that this propagator can be computed exactly, whether the time interval is infinitesimal or finite; its expression is: P2

2

3 2

}

r

1

=

3

4

(r

r

2 1)

2} (9) 2 } Imagine for a moment that in the argument of the last exponential function the factor is replaced by a simple minus sign (this amounts to switching to an imaginary time). Within a numerical factor, the propagator of the free particle becomes a Gaussian function, decreasing rapidly as soon as the distance r r 1 becomes large compared to its width 2} .

r

Comment: This result is not surprising since if one replaces, in Schrödinger’s equation, the real time by an imaginary time , one gets the diffusion equation whose propagator is indeed a Gaussian. The width of that Gaussian goes to zero as 0: as expected, the shorter the time , the smaller the distance covered by the particle. It is worth noting, however, that this distance is not proportional to , but to the square root of this time; in other words, during very short times, the particle may propagate much further than if it had a constant velocity. This situation is characteristic of a random motion whose correlation time and mean free path tend simultaneously to zero, in a way where the diffusion coefficient remains constant: such a process lead to the classical diffusion equation, and its propagator therefore has the same property.

As for Schrödinger’s equation itself (without switching to an imaginary time), when the distance r r 1 increases, instead of a decrease of the propagator’s modulus, we 2269

APPENDIX IV

get an oscillation that gets faster as the distance gets greater; these oscillations are also faster as gets shorter. We discuss below how all these phases interfere, as well as the particular role played by the phases associated with classical paths. Demonstration of relation (9): We modify the exponent of the function to be integrated in (8) to introduce a perfect square: } 2

k2 + k (r

1)

r

=

2

} 2

k+

(r

r

}

1)

(r

2}

r

2 1)

(10)

Making the change of variables: q=k+

(r

1)

r

}

(11)

leads us to: P2

r

2

}

r

=

1

1 (2 )

} 2

d3

3

2

(r

r

1

)2

2}

(12)

The integral that still remains between the brackets on the right-hand side is now a number independent of r and r 1 . As the three components of the vector q yield the same contribution, this number is the cube of the integral: +

=

} 2

d

2

(13)

Following a classical procedure, we compute the square of that integral 2 coordinates where = + 2: + 2

+

=

d

}( 2 + 2 )

d } 2

=2 0

2

=

2 }

using the polar

2

1

(14)

Now, writing the number is mathematically meaningless: as , the imaginary exponential oscillates indefinitely between +1 and 1, around a zero average value. We shall simply take this number equal to its zero average value2 , so that: =

2 }

Replacing by

4

3

(15)

the integral over d3 in (12), we get relation (9).

2 One can reach this same result by adding a small imaginary part to the coefficient }2 2 of in the exponent. Thanks to this imaginary part, which can be arbitrarily small, the oscillating term does disappear. 2

2270

FEYNMAN PATH INTEGRAL

.

Effects of the external potential

} We must now compute the matrix elements of the exponential operator when is the sum of two non-commuting operators. We shall only consider the case where is infinitesimal. In that case, the exponential of a sum of non-commuting operators and can be written: 2 ( + )

=1

[ +

2

]

+

+

2

+

2 We can also expand the following product of exponentials as:

3

+0

(16)

2 2

2

2

= 1

2

+

8 2

2

2

1

+

2

2

1

2

+

8

(17)

This leads to: 2

2 2

=1

[ + ( + )

=

] +0

2

2

+

2

(

1)

r

1

2

2

2

+

+

2

+0

3

(18) (R) } we get, assuming that

(R)

= r =

2

+

3

Setting = P2 2 } and = so that we can neglect the 3 terms: r

2

+

P2

2}

[ (r )+ (r

1 )]

2}

2

(R)

} 2

P

r

2

}

is infinitesimal

2}

r

r

1

1

(19)

This result is simply the product of an exponential including the potential and a term that is the propagator of the free particle. Taking (9) into account, we can write: (

r

1

) r

1 3 2

2

3 4 (r r 1 ) 2} [ (r )+ (r 1 )] 2} (20) 2 } The effect of the external potential is straightforward: it adds an exponential including half the sum of the two potential energies associated, one with the position of the bra, the other with the position of the ket.

=

.

Final expression

Inserting (20) into (5), we get the following expression for the propagator: 2

(r

;r ) =

3

2

d3

2 } 1

exp =1

1

(r

r

d3 1)

2

2

2}

This equality is valid in the limit where neglected the 3 terms.

0 (or

d3

d3

(r ) + (r 2}

1 1)

(21)

) since, to establish (19), we

2271

APPENDIX IV

Figure 2: A Feynman path is obtained by associating a position r with each time . The path thus obtained is continuous, but looks in general like a zigzag (the velocity is discontinuous at each time ). Nevertheless, one can associate a classical action with each of these paths, and get the quantum propagator by summing exponentials of the actions along all these classical paths: while keeping the two end positions r and r fixed, the summation is performed on all the intermediate positions r1 , r2 ,...,r ,...r 1 . 2.

Interpretation in terms of classical histories

The probability amplitude (6) is obtained by inserting, as many times as necessary, the propagator matrix elements (20). It thus contains as many sums over intermediate positions as there are intermediate times , and this number goes to infinity as 0. This expression, somewhat complicated at first sight, actually has a simple interpretation in terms of classical paths the particles can follow between the initial and final times. 2-a.

Expressing the propagator as a function of classical actions

Let us go back to classical physics and consider for a moment all the r as being fixed, and a particle that goes through these successive positions at the different times . In between two consecutive times, we assume the particle keeps a constant velocity equal to: v =

r

r

1

(22)

This defines a classical path Γ, with a linear interpolation for times between the discrete instants (cf. Figure 2). The particle’s Lagrangian is written: =

1 v2 2

(r)

(23)

For any classical path Γ followed by the particle between the initial time and the final time , the position and velocity are both functions of time, and so is the Lagrangian ( ). The corresponding action is written (Appendix III, § 5-b): Γ

2272

=

d

( )

(24)

FEYNMAN PATH INTEGRAL

Let us compute this integral using Riemann’s method by introducing, between the times and , time intervals of length . We consider that during an infinitesimal time interval , the potential energy can be approximated by half the sum of its values at each end of the interval. We then get: (r Γ

r

(r

1)

+ 2

2

2

=1

2 1)

(r )

(25)

This approximate equality becomes exact in the limit 0. On the right-hand side of this equality, we find (to within a factor }) the argument of the exponential appearing in expression (21) for the quantum propagator. This means that this quantum propagator contains the exponential of an approximate value of the classical action, multiplied by }. When 0 (and hence ), the approximate value becomes exact3 , and the sums over d3 1 , d3 2 ,.., d3 1 on the right-hand side of (21) introduce a summation over all the paths going from r to r : Γ

exp

(26)

}

paths Γ

For this summation over the paths to be meaningful, we must now choose a “path density”; we therefore assume that the number of paths in a “path interval”, determined by the set of d3 , is given by the product 3 2 d3 1 d3 2 d3 d3 is the 1 , where constant (inverse of a length): 4

=

(27)

2 }

This allows us to write: (r

;r ) = r

(

) r =

Γ

exp

(28)

}

paths Γ

In the limit 0, the sum over the paths is no longer discrete. This is why, instead of (28), one often writes: r

(

) r =

Γ

[r ( )] exp

(29)

} where the notation

[r ( )] symbolizes the limit of a sum over the paths: 3

[r ( )] = lim

4

2 }

2

d3

1

d3

2

d3

d3

1

(30)

3 Remember that this complement does not pretend to be mathematically rigorous. This would entail a more careful study of the classical as well as quantum expressions, and of the effects of the simultaneous limits goes to zero and (number of terms in the products) goes to infinity, keeping in mind that these approximations are done on functions that are arguments of exponentials.

2273

APPENDIX IV

2-b.

Generalization: several particles interacting via a potential

The previous considerations can be directly generalized to a system with several particles interacting via a potential depending on their positions. Since the system Hamiltonian is, as above, the sum of two non-commuting terms, we again use the approximate expression (18) for the evolution operator. But rather than inserting the closure relations for a single position, one must now use a basis involving the positions of all the particles: each integral over d3 is then demultiplied into as many integrals as there are particles in the system. We will not include here the case where the particles are charged and subjected to a magnetic field, which would include terms such as v A (r), where A (r) is the vector potential. Even though it is an interesting case, in particular for its relation to gauge invariance, it will not be considered here for the sake of brevity. The interested reader should consult the references given in the introduction. 3.

Discussion; a new quantization rule

The path integral approach is particularly fit for developing an analogy with classical optics. In addition, it allows building new quantization rules. These two points will be discussed successively. 3-a.

Analogy with classical optics

Relations (28) and (29) allow making a link between two a priori unrelated quantities: the classical mechanics paths with their associated actions, and the quantum propagator. Knowing the wave function Ψ (r ) at a given time , it is this quantum propagator that enables computing that wave function at a later time as follows: Ψ (r

) = r Ψ( ) = r =

d3

r

(

(

) Ψ( )

) r r Ψ( )

(31)

Taking into account definition (1) of the propagator, the above expression can be rewritten as: Ψ (r

)=

d3

(r

; r ) Ψ (r )

(32)

The propagator is thus the kernel of the integral equation expressing the temporal propagation of the wave function. Now, equality (28) shows that this propagator is equal to a sum of exponentials of the actions corresponding to all the classical paths. Therefore, whereas in classical mechanics a single path (or in certain cases a finite number of paths) is selected by the stationarity condition of the action, in quantum mechanics all the paths come into play to determine the propagation amplitude, each one with its particular phase. In a manner of speaking, one can say that, in quantum mechanics, the particle goes through all the possible intermediate positions r and hence follows all the possible histories between the two end points. It is worth noting that all these histories (even highly unlikely histories involving totally arbitrary positions) contribute with the same amplitude. On the other hand, the 2274

FEYNMAN PATH INTEGRAL

phases associated with the histories are different from each other and allow understanding how the different histories come into play. This situation can be analyzed in terms of stationary phase conditions. It is easy to understand that in the summation, the histories corresponding to a stationary action will play a particular role since all the neighboring histories will add their contribution in a coherent way. On the other hand, in the vicinity of histories for which the action varies rapidly, the phase oscillates quickly and the corresponding contributions will cancel out through destructive interference. Consequently, classical histories play a privileged role, which becomes more prominent as the phase oscillations become more rapid. In the limit where } 0, these oscillations become infinitely rapid and only the classical histories prevail. Finally, this situation reminds us of classical optics and the Huyghens-Fresnel principle, where a light wave is computed as the sum of waves radiated from each point of an intermediate surface, taking into account the phases linked to the propagation along each path. In the geometrical approximation, where the wavelength tends towards zero, the trajectories of the light rays correspond to paths having a stationary phase, i.e. which does not change for infinitely close paths. Geometrical optics is the analog of classical mechanics, whereas Huyghens wave optics is the analog of quantum mechanics. The Feynman integral path is therefore a useful tool to study the link between classical and quantum mechanics, and in particular the semiclassical limit of quantum mechanics (WKB approximation, etc.). Comment: The preceding analogy is well founded for a single particle. For a system with particles, the histories no longer propagate in the ordinary three-dimensional space, but in a 3 dimensional configuration space. The analogy with optics described above is no longer as adequate, since in classical optics, electromagnetic waves propagate in ordinary 3 space.

3-b.

A new quantization rule

At this stage, we can invert the approach. Until now, starting from the rules of Hamiltonian quantum mechanics, we deduced an equivalent expression for the propagator, i.e. another way for finding the solutions of Schrödinger’s equation. It is also possible to consider this equivalent expression as the starting point and postulate that the propagator is defined ab initio by a sum over all the classical paths Γ, each contributing an exponential exp( Γ }). This yields another method for the quantization of a physical system, which offers several advantages. First of all, as we just saw, it highlights the relation of quantum mechanics with classical mechanics, where only a single classical path exists (or sometimes a finite number of paths), as opposed to an infinite number of possible paths in quantum mechanics. Furthermore, it is remarkable that the probability amplitudes thus computed only depend on classical functions (involving only numbers and not operators), the only explicit quantum component being the presence of } in the denominator of the phase. We shall see in § 4 how the concept of operators can be introduced in this approach. In addition, the expressions involving path integrals are symmetric with respect to time and space, since both type of coordinates are integrated 2275

APPENDIX IV

in a similar way4 . Reasoning directly in space-time makes it easier to include Einstein’s relativity, since one can replace the time differential by a proper time differential . If now the Lagrangian is a space-time scalar, so is the action, and the theory acquires relativistic invariance. Finally, Feynman’s quantization method only requires the existence of a Lagrangian, with its associated variational principle. Now all the physical systems that have a Lagrangian do not necessarily have the conjugate variables permitting the definition of a Hamiltonian. For such systems, the Feynman path integral method is powerful and this is why it is so important in quantum field theory. 4.

Operators

Feynman paths also permit computing the matrix elements of operators in the Heisenberg picture, where they are time-dependent. We shall mostly consider the simplest case where the operators are functions of the position operator R. 4-a.

One single operator

Let us insert any operator “in the middle” of the evolution operator (1) by splitting the time interval [ ] into two adjacent intervals [ ] and [ ], with . We get the expression: (

r

)

(

) r =

(

) r

(r

)

(r

)

(33)

with: (r

) =

(r

) =

(

) r =

(

) r

(34)

In this matrix element of , the ket (r ) is obtained by the evolution until the time of a state localized at r at time ; the bra (r ) corresponds to the ket (r ) which, as it evolves between and , becomes a ket localized at r : ( .

)

(r

) =

(

)

(

) r = r

(35)

Operator function of the position

We now assume operator is a function (R) of the particle’s position operator. Inserting a closure relation on the positions, we can write the left-hand side of (33) as: r

(

)

(R)

(

) r =

d3

r

(

) r

(r )

r

(

) r (36)

4 The sum over all the paths introduces an integral over all the positions r in Figure 2, hence differentials of the three space coordinates; in addition, the integral over the times introduces a differential d = 1 . The product of three space differentials by a time differential thus allows one to introduce a differential of space-time volume.

2276

FEYNMAN PATH INTEGRAL

Using the general relation (28), we can then write the two propagators appearing under the integral as the sum over all the paths Γ1 or Γ2 : (

r

) r =

Γ1

exp

}

paths Γ1

r

(

) r

=

Γ2

exp

(37)

}

paths Γ2

where Γ1 is a path linking the initial position r at time to the intermediate position r at time , and Γ2 the path linking thereafter the intermediate position r at time to the final position r at time . If coincides with the intermediate time , relation (36) becomes: r

(

)

(R)

(

) r

d3

=

exp

Γ2

Γ1

(r ) exp

}

paths Γ1 and Γ2

(38)

}

Now the product of the two exponentials yields a single exponential exp[ Γ }] associated with the action Γ of a path Γ consisting of the two paths Γ1 and Γ2 joined together end to end at r . The sum over d3 reconstitutes the ensemble of all paths going from the initial position r at time to the final position r at time , the only difference being that now each exponential exp[ Γ }] is multiplied by the value (r ) taken by the function at the intermediate point at time . We finally obtain: r

(

)

(R)

(

) r =

Γ

(r ) exp

(39)

}

paths Γ

where (r ) is the value of (r) at position r which the path Γ traverses at time . The matrix elements of the operator in the Heisenberg picture (special case = ) are thus given by the same summation over the histories as for the propagator, the only difference being that the contribution of each path is now multiplied by the value taken by the operator at the position r at the intermediate time . As before, we can now invert the approach and consider relation (39) as the definition of an operator in the framework of the Feynman path quantization method. Here again, it is remarkable that this relation involves only classical functions, without any operator. .

Velocity operator; canonical commutation relations

In order to define an operator W associated with the particle’s velocity at time (we use the notation W to avoid any confusion with the potential ), and taking (22) into account, a natural extension of (38) leads to setting: r

(

) W =

(

d3

) r exp

paths Γ2

r

Γ2

}

paths Γ1

r

1

exp

Γ1

(40)

} 2277

APPENDIX IV

where the paths Γ1 are all those going from the initial position r to the intermediate position r (the preceding intermediate position r 1 does depend on the path), whereas the paths Γ2 are all those going from r to the final position r . Introducing in the middle of the left-hand side a closure relation on the kets r , this relation becomes: d3

(

r

) r

(

r W d3

=

) r (

r

) r

Ψ

(r

r

r

)

(41)

with: Ψ

(r

)= r W

(

) r =

1

exp

Γ1

(42)

}

paths Γ1

This wave function is the result of the action of operator W on the wave function at time , equal to: Ψ (r

)= r

(

) r

(43)

Let us compare the sum over the paths Γ1 in relation (42) and in the relation (28) used to build the propagator between the times and . In these two equalities, the actions are given by the summations (25) over the intermediate positions. Concerning the contributions of the paths between the initial time and the intermediate time 1 (the last time over which the summation runs), the two sums in (42) and (28) are identical. Actually, their only difference concerns the very last time interval (between 1 and = ), which in (42) is multiplied by the factor (r r 1) . Now this multiplicative factor can also be found by taking the derivative of (28) with respect to r since, using (25), we have: ∇r

r

(

r

) r = }

r

1

2 paths Γ1

1 + ∇r 2

exp

Γ1

(44)

}

As r is a final fixed point, the term in ∇r on the right-hand side can be taken out of the summation over the paths. It yields a contribution in ∇r which goes to zero in the limit 0. We are left with: ∇r

r

(

r

) r = }

r

1

exp

Γ1

(45)

}

paths Γ1

so that relation (42) becomes: Ψ

(r

)=

}

∇r

r

(

) r =

}

∇r Ψ (r

)

(46)

This means that the action of the velocity operator W is simply proportional to a derivative5 with respect to the position r , which is the variable of the wave function at the 5 The demonstration pertains to a wave function at the instant that is issued from a wave function localized at point r at time . By linear superposition, it can be generalized to any wave function at time , hence confirming the same derivation property.

2278

FEYNMAN PATH INTEGRAL

instant . In other words, if P = W is the particle’s momentum operator, its action on the wave function is (} ) times the gradient with respect to the position. We have established a basic result of the usual quantum mechanics, starting from operators introduced in the path integral approach. The canonical commutation relations between R and P are easily derived, since: [ Ψ ( )] =

[Ψ ( )] + Ψ ( )

(47)

These commutation relations can also be considered as consequences of the path quantization rules. 4-b.

Several operators

Feynman postulates also permit introducing products of several operators, acting at the same instant or at different times. .

Several operators at different times

The previous argument can be generalized to several operators (R), (R), etc. acting at intermediate times , , etc. As before, we can split the evolution operator into several parts corresponding to the successive time intervals, and insert position closure relations at the intermediate times. Each operator introduces a factor dependent on the corresponding intermediate position, and the time propagation is a sum over histories between successive time intervals. For instance, for two operators, the same reasoning followed above leads to: r

(

)

(R)

(

)

(R)

=

(

) r

(r )

(r ) exp

Γ

(48)

}

paths Γ

where (r ) is the value of for the position r crossed by the path Γ at time , (r ) the value of for the position r crossed by the path Γ at a later time . The result is easily generalized to any number of operators. Note the order in which the operators are arranged in the matrix element on the left-hand side: it corresponds to the order in which the times , , etc. are arranged in the classical histories used to calculate the actions. The quantum operators are automatically arranged in decreasing times from left to right, even if (r ) and (r ) are numbers that commute in the right hand side of relation (48). .

Position and velocity operators, symmetrization

Imagine now we want to introduce, for example, the operator corresponding to the product R P of the position and momentum. We will then proceed as in (41) and (42), however with an added precaution: should we multiply (r r 1) by r or by r 1 (the order does not matter, since we are dealing with numbers). For the sake of symmetry, we multiply by half their sum, so that in (42) (r r 1) is replaced by: 1 r +r 2

1

(r

r

1)

=

1 [r 2

(r

r

1)

+ (r

r

1)

r

1]

(49) 2279

APPENDIX IV

On the right-hand side, we wrote the two terms so that the times are always decreasing (or stationary) from left to right. Because of the order of the indices, the first term on the right-hand side introduces the operator R W, whereas the second introduces W R, i.e. the same operators but in the inverse order. This is an example of how the path integral method leads quite naturally to a symmetrization of the operator order, which automatically ensures the hermiticity of their product. Conclusion To be able to use two complementary approaches, the Hamiltonian method and the path integral method, is often quite valuable in the study of numerous physical problems. As an example, path integrals play a fundamental role in field theory. They are for a large part at the base of the implementation of symmetry groups (Abelian or noncommutative) in this theory, which allows building a theory for elementary particles and their interactions. There are, however, other cases where the path integral formalism is very useful, as for example, in the computations of quantum interference with cold atoms. Conceptually, the path integral approach can shed new light on the relations between quantum mechanics and classical mechanics, as well as classical optics as we saw in § 3-a. In this appendix, we only used path integrals as a method to compute the time propagator of a quantum physical system, which involves imaginary exponentials of the Hamiltonian. Path integrals can also be used in quantum statistical mechanics (Appendix VI), and involve real exponentials of the Hamiltonian (multiplied by the inverse of the temperature). It is the basic tool for many numerical calculations; the interested reader can consult Zinn-Justin’s book [92], or reference [94] where, in particular, the PIMC (Path Integral Quantum Monte Carlo) methods are described.

2280

LAGRANGE MULTIPLIERS

Appendix V Lagrange multipliers

1 2

Function of two variables . . . . . . . . . . . . . . . . . . . . . 2281 Function of variables . . . . . . . . . . . . . . . . . . . . . . 2283

When a function depends on non-independent variables (i.e. which are related by constraints), its extrema (maxima or minima) can be found by the Lagrange multiplier method. A brief summary of this method is proposed in this appendix. The first part concerns functions of two variables, and the second part will generalize the concept to any number of variables. 1.

Function of two variables

Consider first a real function ( 1 2 ) of two independent variables 1 and 2 . We assume the fonction to be regular, continuous, differentiable with continuous derivatives. The extrema of correspond to values of the variables for which the two partial derivatives are zero: (

2)

1

=0

(

;

1

1

2)

=0

(1)

2

These two relations amount to stating that the gradient of =0

must be zero: (2)

Two equations with two unknowns 1 and 2 generally admit a finite number of solutions (pairs of values for 1 and 2 ); this number can even be zero if the function does not present any extrema. Let us now look for the extrema of when the variables are no longer independent, but must obey a constraint: (

1

2)

=

(3)

where is a constant and ( 1 2 ) a regular function (continuous, differentiable, etc.). When this constraint is satisfied, the point with coordinates 1 and 2 is forced to follow a curve in the plane (solid line in Figure 1). Imagine we place the point close to an arbitrary point of the curve, and move it by varying slightly its coordinates by d 1 and d 2 . For to remain constant, d 1 and d 2 must necessarily obey: d

=

(

1 1

2)

d

1

+

(

1

2)

d

2

=

d

=0

(4)

2

The point therefore necessarily moves along the tangent to the curve, i.e. perpendicularly to its gradient , as shown in Figure 1. As for the variation of , it is given by: d

=

d

(5) 2281

APPENDIX V

Figure 1: When the constraint ( 1 2 ) = is satisfied, the point , with coordinates 1 and 2 , is forced to move along a curve in the plane, shown as a solid line. The tangent to this curve is perpendicular to the gradient of the function , meaning that any small displacement of point along the curve must be perpendicular to this gradient. When the displacement starts from an arbitrary point , the vectors and are not parallel, and the function varies to first order in d = d . However, if the variation starts from a point, such as 0 , where the two gradients are parallel, the function is stationary. Geometrically, this parallelism means that the solid line is tangent to a contour line of the surface representing the function ( 1 2 ). As in general the vectors and are not parallel, this scalar product is not zero. The function thus varies to first order in d , meaning it is not stationary at that point. If, however, we start from a point 0 on the curve where the two gradients are parallel (or antiparallel), condition (4) means that the variation (5) is zero, and stationarity is attained. In such a case, moves (at constant ) along a curve that is tangent at 0 to a contour line of the surface representing the function . Geometrically, it is easy to understand that a displacement along a contour line keeps constant to first order. Algebraically, imposing the gradients to be parallel amounts to writing that there exists a constant , called “Lagrange multiplier”, such that: =

(6)

which is equivalent to saying that the differential of the function d(

)= =

(

) d

[

] 1

=0 2282

is zero:

d

1

+

[

]

d

2

2

(7)

LAGRANGE MULTIPLIERS

This means that one must simply replace the function by the fonction with an arbitrary Lagrange multiplier to obtain the stationarity of when its variables obey the constraint (3). When is fixed, we get as before two equations with two unknowns, so that the variables 1 and 2 are determined. Inserting them into (3) yields a value for , which is thus fixed. If, however, the Lagrange multiplier is allowed to vary, the constant becomes a function of , and can be adjusted by changing . As an example, when studying the canonical equilibrium (Appendix VI, § 1-b), one maximizes the value of the entropy (which plays the role of the function ) while keeping the average energy value constant. A Lagrange multiplier is then introduced to impose the stationarity of ; changing allows controlling the value of . 2.

Function of

variables

We now consider a function ( 1 2 ) of supposedly independent variables , ,... , . The extrema of are obtained by annulling the components of the 1 2 gradient of (each component being the partial derivative of with respect to one of the variables): =0

(8)

We get equations to determine unknown variables, yielding a finite number of extrema. Imagine now that the variables are no longer independent, but linked by conditions: (

1

)=

2

with

Consider a point in an coordinates satisfy the relations: d

=0

=1 2

(9)

-dimensional space, with coordinates 1 2 . If these conditions (9), their infinitesimal variations obey the

with

=1 2

(10)

For all the functions to remain constant, the displacement d of point in the -dimensional space must be orthogonal to all the gradients . Two cases are then possible: (i) either the gradient belongs to the sub-space generated by the , in which case the orthogonality condition (10) implies that d is also orthogonal to . Consequently the variation d = d is zero and the stationarity is ensured. (ii) or the gradient is not contained in that subspace, and it possesses a nonzero component orthogonal to that subspace. One can then choose d parallel to and obtain a first order variation of , while satisfying the constraints. In conclusion, the stationarity of is equivalent to the condition that the gradient be contained in the subspace generated by the . This amounts to stating that there exist Lagrange multipliers (with = 1 2 ) such that: =

1

1

+

2

2

+

+

(11) 2283

APPENDIX V

In an equivalent way, the stationarity condition can be obtained by annulling the differential: d(

1

1

2

2

)=0

(12)

and then treating the variables 1 2 as if they were independent. When the Lagrange multipliers are fixed, each component of relation (11) yields an equation, so that we have as many equations as variables 1 2 . One therefore obtains for the function a finite number of extrema linked by the constraints, yielding fixed values for the functions 1 , 2 , ..., . A variation in the Lagrange multipliers will change the values of these functions, which can therefore be adjusted to a value that has been chosen in advance.

2284

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

Appendix VI Brief review of Quantum Statistical Mechanics

1

Statistical ensembles . . . . . . . . . . . . . . 1-a Microcanonical ensemble . . . . . . . . . . . . 1-b Canonical ensemble . . . . . . . . . . . . . . 1-c Grand canonical ensemble . . . . . . . . . . . Intensive or extensive physical quantities . . 2-a Microcanonical ensemble . . . . . . . . . . . . 2-b Canonical ensemble . . . . . . . . . . . . . . 2-c Grand canonical ensemble . . . . . . . . . . . 2-d Other ensembles . . . . . . . . . . . . . . . .

2

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

. . . . . . . . .

2285 2285 2289 2291 2292 2293 2294 2295 2295

In quantum mechanics, as in classical mechanics, it is not possible to describe a system having a very large number of degrees of freedom (for example a system containing a number of particles that is of the order of the Avogadro number) with highest precision. Such a description would in particular include the value of many quantities, for instance many particle correlations, which fluctuate rapidly and are not necessarily of interest. A less detailed and more probabilistic description must be used, in which the state of the system in known only statistically. The system occupies one on a series of possible states, with a certain probability. One then says that the system is described by a “statistical ensemble”. The use of a density operator (Complement EIII ) to describe the physical system is particularly convenient in this case. We do not attempt in this appendix to give a general introduction to statistical mechanics and its postulates. We simply summarize a number of quantum statistical mechanics results used in several complements. For example, most of the complements of Chapter XV, as well as BXVII and DXVII , use the concept of chemical potential or of “grand potential” Φ; their interpretation in the framework of the different statistical ensembles will be given in this appendix. 1.

Statistical ensembles

Several “statistical ensembles” are commonly used to describe physical systems at equilibrium. We shall focus here on the three main ones: the microcanonical, the canonical and the grand canonical ensembles. The first of these ensembles provides the general setting for introducing the two others. 1-a.

Microcanonical ensemble

Consider a physical system containing energy of the system lies within an interval: ∆

2

+∆

2

particles in a box of volume

. The

(1) 2285

APPENDIX VI

with ∆ . The system is isolated from its surroundings preventing any exchange of particles or energy. We note the eigenstates of its Hamiltonian , where is an index reflecting the possible degeneracy of each eigenvalue of this Hamiltonian. .

Density operator, entropy

The system is supposed to have the same probability of being in any state whose energy falls within the interval (1); no state is favored over any other. The microcanonical density operator of the system at equilibrium is then: eq

=

where

1 ∆

( )

(2)

( ) is the projector onto the subspace containing all the accessible states:



+ ∆2

( )=



(3) ∆ 2

=

and where

is the microcanonical partition function defined as:

= Tr



( )

(4)

What relation (2) means is that the occupation probabilities of the states are all equal to 1 . Relation (3) shows that each of the projectors onto a state contributes one unit to the trace of relation (4); the partition function is simply the number of terms in the summation (3), i.e. the number of levels in the energy interval (1). If ( ) is the density of states, we can write: =

( )



(5)

As in § 1 of Complement AXXI , we define the entropy =

Tr

eq

ln

as: (6)

eq

where is the Boltzmann constant. The are eigenvectors of eq , with an eigen∆ value equal to 1 if belongs to the interval [ + ∆2 ], and equal to zero 2 otherwise. If belongs to the interval, we have: eq

If

ln

ln

=

eq

does not belong to the interval, since lim eq

ln

(7) 0

=0

eq

ln = 0, we get: (8)

We now multiply the two previous relations by the bra and sum over and to get a trace. Only the bras whose energy falls within the interval will yield a non-zero contribution. As there are of them, we obtain: Tr

eq

ln

eq

=

ln

(9)

The equilibrium value of the entropy is therefore: = 2286

ln

(10)

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

.

Temperature, chemical potential

Suppose we now change the equilibrium energy by an infinitesimal amount d , keeping the volume and the particle number constant. Since no work is exchanged with the outside (the external walls are fixed), this amounts solely to a heat change: d

=d

=

d

(11)

where we have used the usual thermodynamic definition of the entropy d = d the microcanonical ensemble, the temperature is thus defined as: 1

=

. In

(12)

where we have used the partial derivative notation to emphasize that the changes are made keeping both and constant. Let us now change the particle number , keeping the volume and the energy constant. We then define the “chemical potential” (which has the dimension of an energy) as: =

(13)

At fixed temperature, the faster the entropy (which depends on the number of accessible levels in the energy band ∆ ) grows with the particle number , the larger the absolute value of . The chemical potential plays an essential role in the grand canonical equilibrium as we shall see (§ 1-c). The third partial derivative of (with respect to the volume) will be determined in § 2-a. Comment: Let us insert (5) in relation (10), but first multiplying ( ) by and dividing ∆ by this same quantity (this has the advantage of providing dimensionless arguments for the logarithmic functions). This yields: = ln [

( )

] + ln



(14)

In a macroscopic system, the particle number is very large, of the order of the Avogadro number. Let us see then what happens when the particle number goes to infinity. We assume that the energy as well as the volume are proportional to (thermodynamic limit). We then expect the entropy to also be proportional to . This linear variation of the entropy cannot come from the second term in (14): even if the energy interval ∆ is proportional to , it will only yield a much slower logarithmic variation. Most of the variation of actually comes from the first term of (14), and from the fact that the density of states increases with in an exponential way: as the exponent of ( ) contains , this variation is phenomenally rapid. In the limit of large systems, the first term in (14) largely dominates the second. This is why it is often said that the entropy characterizes the density of states of a physical system (or more precisely the number of its quantum energy levels in a microscopic energy interval, chosen here to be equal to ).

2287

APPENDIX VI

.

Entropy maximization

We now choose for an arbitrary Hermitian density operator, with positive or zero eigenvalues whose sum is equal to 1. We denote its eigenvectors and its eigenvalues (0 1) which obey: =1

(15)

We assume that is restricted to the energy band (1): all the for which arbitrary linear combinations of the eigenvectors obeying (1). An entropy can be associated with : =

Tr

ln

= 0 are

(16)

where is the Boltzmann constant. We are going to show that among all possible operators, the equilibrium one, eq , maximizes this entropy. We can write: =

ln =

ln

(17)

and therefore: =

Tr

ln

Any variation of the d =

[1 + ln

=

results in a variation of ] d

ln

(18)

written as: (19)

However relation (15) requires the sum of the variations d to be zero. To write that relation (19) is zero while taking into account this constraint, we use a Lagrange multiplier (Appendix V) and obtain the equation: [ + 1 + ln

] d

=0

which must be satisfied for any d ln

=

1

(20) . Canceling the corresponding coefficients leads to: (21)

This means that all the non-zero must be equal. Operator is therefore proportional to the projector (3). Once we normalize its trace to 1, we get (2): the microcanonical density operator corresponds to an entropy extremum. As all the are between 0 and 1, relation (18) shows that this extremum is positive. To find out if it is a maximum or a minimum, we consider another operator, whose eigenvalues are all zero except one, equal to 1; its associated entropy is zero. Consequently the extremum of obtained for the microcanonical equilibrium is an absolute maximum. This result proves an important theorem: the density operator that maximizes the entropy is the sum of the projectors onto all the accessible states, with equal eigenvalues (the probabilities of finding the physical system in each of these states). 2288

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

1-b.

Canonical ensemble

We now consider a physical system S no longer isolated but in contact with a reservoir with which it exchanges energy; for example S and could be coupled through a wall conducting heat but remaining fixed so that no work can be exchanged between them. Let us call , and the particle number, the energy and the volume of the reservoir . .

Density operator

Assuming the reservoir to be much larger than the system S , its temperature remains constant as it exchanges energy with S . According to relation (12), this implies that its temperature , defined as: 1

=

(22)

is a constant. It will be characterized by the constant : =

1

=

1

(23)

As relation (10) showed that = ln , where is the number of the reservoir’s accessible levels in an energy band ∆ around , we deduce: ln

=

(24)

This means that this number of levels varies exponentially as a function of the energy (keeping and constant): (25) The total system S + is described by an equilibrium microcanonical density operator. Its energy eigenvectors are the tensor product1 of the energy eigenvectors of the system S and the energy eigenvectors of the system : (26) The microcanonical density operator of S + given by:

S+

=

= Tr

=

+

is

(27) +

=

We get the density operator eq

tot

∆ tot + 2

1 S+

, with a total energy

tot

eq

∆ 2

of the system S by taking a trace over the reservoir:

S+

1 We assume that the coupling between S and is negligible.

(28) is weak, so that its contribution to the total energy

2289

APPENDIX VI

In (27), the trace over of each projector is just equal to one. The density operator eq is simply a sum of projectors onto the energy eigenstates , multiplied by the number of levels of with an energy tot within an energy band ∆ . Relation (25) shows that this number of levels varies exponentially as ) . Omitting the proportionality factors 1 tot = ( tot and , we get: S+ =

eq

(29)

is the Hamiltonian of the system S . Normalizing the trace of , we obtain:

where =

eq

where

1

(30)

is the “canonical partition function” defined as: = Tr

(31)

These two relations define the density operator of S in the canonical thermal equilibrium. Contrary to what happened in the microcanonical equilibrium, the energy of the system S is no longer restricted to a small interval ∆ , but may spontaneously fluctuate outside this energy band under the effect of the coupling with the reservoir. The thermodynamic potential of the canonical equilibrium is defined by the function called the “free energy”: =

(32)

At equilibrium, when =

+

Tr

is given by (30), this free energy is equal to: eq

ln

=

ln

(33)

and we obtain: = .

ln

(34)

Minimization of the free energy

Starting from an arbitrary density operator of unit trace, let us show that its associated free energy will be minimal when is equal to its value at the canonical equilibrium (30). We first compute the variation of : d

= Tr

+

(1 + ln ) d

(35)

This variation is zero for any d only if the operator between the inner brackets is zero, which means: ln

=

1

(36)

This indicates that , which is the canonical equilibrium operator. Finally, if we choose for the projector onto a state having a large positive energy, will be zero, arbitrarily very large, and consequently will be very large as well. It is thus clear that the extremum of , which occurs when takes the equilibrium value, is a minimum. 2290

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

1-c.

Grand canonical ensemble

We now assume that the physical system S can exchange not only energy but also particles with the reservoir : S and must be coupled through an interface the particles can cross. As above, this reservoir is supposed large enough for its temperature to remain constant when it exchanges energy with S . We also assume it contains a very large number of particles, which barely changes in relative value during the particle exchanges with S . Its chemical potential therefore remains constant: =

.

= constant

(37)

Density operator

As the particle numbers of system S and of the reservoir are no longer constant, their state spaces are now Fock spaces (Chapter XV). We must add indices and respectively to the kets and and there are now two summations over these indices in the expression for the microcanonical density operator S + written in (27). The microcanonical density operator of the total system is then:

S+

=

∆ tot + 2

1 S+

+

=

tot

∆ 2

+

=

tot

(38)

where tot and tot are respectively the energy and the particle number of the total system S + . The argument then follows the same lines as in § 1-b- . A partial trace over the reservoir leads to the density operator of S , which is a linear combination of projectors: (39) with weights corresponding to the number of states of the reservoir in an energy band centered around = tot , the number of particles in the reservoir being = . Two reservoir variables change simultaneously, instead of one for the canonical tot equilibrium. As and remain constant in (24) and (37), the entropy varies linearly with respect to these variables: =

0

+

1

=

0

+

(

)

(40)

where 0 is a constant that is of no importance in what follows. Using again relation (10) to relate the reservoir entropy to the number of states accessible to this reservoir, we get: =

(

)=

(

tot

tot )

(

)

(41) 2291

APPENDIX VI

where and remain constant. The same argument as above shows that the trace over the reservoir variables lead to the following density operator for the system S :

eq

=

1

(42)

with: gc

= Tr is total particle number operator of S .

where .

(43)

Grand potential

The thermodynamic potential for the grand canonical ensemble is the “grand potential” Φ defined as: Φ=

(44)

Following the same demonstration as for the equilibrium value of this potential is: Φ=

in the canonical ensemble, we can show that

ln

(45)

If we let vary, we can show, as above, that this potential reaches a minimum when equal to (42); a detailed demonstration is given in § 1 of Complement GXV . 2.

is

Intensive or extensive physical quantities

Take a macroscopic physical system S at equilibrium, and divide it into two subsystems of equal sizes S and S ; one can imagine that a wall separates S from S . Certain physical quantities associated with S or , taken separately, are half of what they were for S : the volumes, the energies, the particle numbers, the entropies, etc. Such quantities are said to be “extensive”. Inversely, other physical quantities do not change upon this division: the particle number per unit volume, the temperature, the chemical potential, etc. Such quantities are said to be “intensive”. In a general way, when a macroscopic physical system of volume is divided into several macroscopic parts of volumes 1 , 2 , etc., the physical quantities measured in each part and which are proportional to their respective volume are said to be extensive, and those which remain constant are said to be intensive. As for the ensembles studied above, their description involves a mixture2 of extensive and intensive variables: (i) In the microcanonical ensemble, the three independent variables describing the physical system at equilibrium are the three extensive variables , and the system’s energy ; the other physical quantities (temperature, entropy, chemical potential, etc.) are 2 including

2292

at least one extensive variable, otherwise the system’s size would not be determined.

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

considered functions of these variables. The thermodynamic potential is the entropy , extensive and directly related to the logarithm of the microcanonical partition function. (ii) In the canonical ensemble, the three independent variables include two extensive variables and as well as an intensive variable (or ). The thermodynamic potential is the free energy , an extensive function directly related to the logarithm of the canonical partition function. (iii) In the grand canonical ensemble, there is only one independent extensive variable, the volume , and two intensive variables and . The thermodynamic potential is the function Φ, extensive and directly related to the logarithm of the grand canonical partition function. For a macroscopic system, the three ensembles are generally considered equivalent. The statistical descriptions are, however, different. In the canonical equilibrium, for example, the energy is not restricted to an interval ∆ but can fluctuate and take on values outside this interval. However, for a macroscopic system, the fluctuations in energy are very small compared to its average value. Assuming that the system’s energy is confined within a fixed band ∆ is a valid approximation and allows taking for ˆ the microcanonical energy : ˆ =

(46)

Another example, in the grand canonical ensemble, is the particle number, which fluctuates around its average value ˆ . For a macroscopic system, the relative value of these fluctuations is in general3 very small, and the average value ˆ is practically equal to the particle number of the microcanonical or canonical ensembles: ˆ = 2-a.

(47) Microcanonical ensemble

Relations (12) and (13) give the partial derivatives of the entropy with respect to the variables and ; we now compute that derivative with respect to the volume . Let us change the physical system volume by a small quantity d , keeping the particle number constant, and without any heat exchange (the system is surrounded by isolating walls). The system, having an internal pressure , is doing the work d , which means that its internal energy varies as:

d

=

d

(48)

As there is no heat exchange, d = 0, and the thermodynamic relation d that the entropy does not change either: d =

d +

d

=0

=

d means

(49)

3 There are exceptions to this rule: for a Bose-condensed ideal gas, the grand canonical fluctuations of the particles’ number remain large for a macroscopic system. This is a very special system for which the canonical and grand canonical ensembles are not equivalent for certain physical properties.

2293

APPENDIX VI

Inserting relation (12) in this result and multiplying by d +d

=0

(50)

As relation (48) shows that the pressure account, we finally get: =

, we obtain:

is given by

d

d

and taking (10) into

ln

=

(51)

which defines the pressure in the microcanonical ensemble. We already studied, in § 1-a- , the entropy changes due to variations of either (keeping and constant) or (keeping and constant). The present calculation is the last step for obtaining the three partial derivatives of the microcanonical thermodynamic potential, and we can express its total derivative as: d = 2-b.

1

d

+

d

d

(52)

Canonical ensemble

For a macroscopic system, we just saw that could be replaced by the microcanonical energy in the definition (32) of the free energy. Taking the differential of (32) then leads to: d

=d

d

Using (52) in the d

=

d

d d

term of this equation, the d

(53) terms cancel out and we get:

d + d

(54)

This is the total differential of the thermodynamic potential in the canonical ensemble. This relation allows a physical interpretation of the chemical potential: it is the gain in free energy when one particle is added to the system4 , keeping constant the temperature and the volume of the system. As for the pressure , it is given by: =

(55)

or, using (34): =

ln

(56)

which is similar to (51). We have obtained the pressure of the physical system as a function of its volume and its temperature, i.e. its “equation of state”. 4 When the temperature is zero, the free energy is just the energy when one particle is added.

2294

, and

is the increase of energy

BRIEF REVIEW OF QUANTUM STATISTICAL MECHANICS

To compute the average energy yields: =

1

Tr

=

1

, we can use relations (30) and (31), which

Tr

(57)

where the partial derivative is taken keeping =

1

and

constant. We then have:

ln

=

(58)

. 2-c.

Grand canonical ensemble

In a macroscopic system, as the particle number generally fluctuates very little in relative value, we can replace by in the definition (44) of the thermodynamic grand potential Φ. This leads to: dΦ = d

d

d

(59)

Using (54) in this relation, the d dΦ =

d

d

terms cancel out and we are left with:

d

(60)

In this ensemble, the volume is the only extensive variable. For a fixed temperature and chemical potential, and for a large volume, we get a macroscopic system whose energy, entropy and particle number are proportional to . We simply get: Φ=

(61)

The grand potential divided by yields the pressure directly, without any partial derivative. Taking (45) into account, the average particle number and the pressure obey: Φ

= =

Φ

=

= ln

ln

(62)

Using these two equalities to eliminate the chemical potential , we get the particle number in a given volume as a function of the pressure and the temperature (equation of state for the physical system). 2-d.

Other ensembles

We have studied the three most commonly used statistical ensembles, but there are others, as for example the isothermal-isobaric ensemble. In this ensemble, the system S is coupled with a reservoir allowing exchanges of energy and volume, but not particles; the number remains fixed. The only extensive variable is precisely this variable , 2295

APPENDIX VI

the other two, the temperature and the pressure , being intensive. The thermodynamic potential associated with this isothermal-isobaric ensemble is the Gibbs function defined as: =

+

(63)

As before, we take the differential of this function and note, using (52), that the terms d and d cancel out. We then get: d

= d

d + d

(64)

The function is extensive. It increases as the particle number gets larger (for fixed pressure and temperature), and for a macroscopic system it is proportional to the system size: =

(65)

Varying both the temperature and the pressure of an ensemble of particles, we can get the resulting variation of the chemical potential by dividing (64) by and then setting d = 0. This yields the Gibbs-Duhem relation: d =

d

d

(66)

This ensemble is particularly useful in the study of a two-phase equilibrium such as a liquid and its vapor, both at the same pressure and temperature. We have presented a brief review of the general principles of statistical mechanics. For more details, the reader may consult, for example, the following references [95, 96, 97, 3].

2296

WIGNER TRANSFORM

Appendix VII Wigner transform

1 2

3

4

5

Delta function of an operator . . . . . . . . . . . . . . . . . . Wigner distribution of the density operator (spinless particle) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-a Definition of the distribution, Weyl operators . . . . . . . . . 2-b Expressions for the Wigner transform . . . . . . . . . . . . . 2-c Reality, normalization, operator form . . . . . . . . . . . . . . 2-d Gaussian wave packet . . . . . . . . . . . . . . . . . . . . . . 2-e Semiclassical situations . . . . . . . . . . . . . . . . . . . . . 2-f Quantum situations where the Wigner distribution is not a probability distribution . . . . . . . . . . . . . . . . . . . . . Wigner transform of an operator . . . . . . . . . . . . . . . . 3-a Average value of a Hermitian operator (observable) . . . . . . 3-b Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-c Wigner transform of an operator product . . . . . . . . . . . 3-d Evolution of the density operator . . . . . . . . . . . . . . . . Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-a Particle with spin . . . . . . . . . . . . . . . . . . . . . . . . . 4-b Several particles . . . . . . . . . . . . . . . . . . . . . . . . . Discussion: Wigner distribution and quantum effects . . . . 5-a An interference experiment . . . . . . . . . . . . . . . . . . . 5-b General discussion; “ghost” component . . . . . . . . . . . . .

2299 2299 2300 2301 2304 2305 2306 2309 2310 2311 2312 2313 2316 2318 2318 2319 2319 2319 2322

Introduction In classical mechanics, it is possible to specify with an arbitrary precision both the position r and the momentum p (and hence the velocity) of a particle. If the state of the particle is defined in a statistical way, one describes the classical particle by a distribution cl (r p) in the phase space (Appendix III, § 3-a) , which can be any positive function, normalized to unity. This distribution can, for example, include correlations between the particle’s position and velocity. In quantum mechanics, the situation is different. It is true that one often uses two representations, one in position space, the other in momentum space, and that we go from one to the other via a Fourier transformation. But in quantum mechanics these two representations are exclusive: in the position representation, one loses all information on the particle’s momentum, and conversely, in the momentum representation one loses all information on the particle’s position; consequently, no information on an eventual correlation between position and momentum can be obtained. It is interesting to introduce a quantum point of view intermediate between these two extremes, to keep at the same time information about position and momentum, 2297

APPENDIX VII

while obeying the general rules of quantum mechanics; these rules impose a limitation on the information precision. The Wigner transformation offers this intermediate point of view as it introduces a quantum mechanical function (r p) that allows computing average values in the same way as with a classical distribution cl (r p). Historically, this transformation was introduced1 in 1932 by Wigner [98, 99] as he was working on the quantum corrections to thermal equilibrium, but it turned out to be a much more general tool. It yields very naturally semiclassical expansions in powers of } while studying, for instance, the temporal evolution of a quantum system. We shall also show that it provides a quantization method, leading in particular to correctly symmetrized expressions for quantum operators starting from classical functions of the positions and momenta. There are numerous domains in physics where the Wigner transform has proven useful, and sometimes indispensable. This appendix will show how to associate with any quantum density operator a Wigner distribution (r p), sometimes called a “semiclassical distribution”, and we will discuss a certain number of its properties. In a similar way, to any operator (observable) acting in the state space, we can associate a Wigner transform (r p) that is a simple function of r and p. In classical or semiclassical situations (i.e. when the spatial variations of physical quantities occur over large enough distances), the function (r p) possesses all the properties of a classical distribution: it is a positive function that, multiplied by (r p) and then integrated over the variables r and p, does yield the average value of the operator . In this case, we will show that the Wigner distribution simply describes the flow of the probability fluid (Chapter III, § D-1-c- ). As (r p) allows keeping track of the correlations between position and momentum, this function is particularly useful in a number of cases, such as in the theory of quantum transport, with its numerous applications. In the general case where the quantum effects are important (rapid spatial variations), the classical relations between the distribution and the average values are still valid, meaning the Wigner distribution continues to be quite useful. This distribution shows, however, a significant difference with a probability distribution: the function (r p) can sometimes take on negative values. Furthermore, as we shall point out later, it can sometimes present “ghost” components where it is different from zero at points where the probability of finding the particle is zero. It is thus not possible to interpret the product (r p)d3 d3 as the probability for the particle to occupy an 3 3 infinitesimal cell d d of the phase space centered at (r p); moreover, when dealing with infinitesimal volumes such an interpretation would be in clear contradiction with Heisenberg’s uncertainty relations. Therefore, one should consider (r p) to be a “quasi-distribution”, a tool for computing all the average values without being a real probability distribution, even though we shall use the word distribution, following common usage. This appendix introduces the tools necessary for the study of these different situations. We shall obtain, in particular, gradient expansions directly yielding expansions in powers of }. We first introduce in § 1 a convenient form for the delta function of an operator. 
This form will be useful, in § 2, for defining the Wigner distribution of a spinless particle density operator. Several of its characteristics will be studied, in particular 1 Wigner does mention in his article that the same transformation had already been used by L. Szilard, but in another context.

2298

WIGNER TRANSFORM

when dealing with a Gaussian wave packet. We then focus in § 3 on the Wigner transform of operators and show how it can be associated with the Wigner distribution of the density operator, leading to computations similar to those corresponding to a classical distribution. An important step will be the computation of the Wigner transform associated with a product of operators. Generalization of these concepts to take into account the spin, as well as the possible presence of several interacting particles, is discussed in § 4. Finally, we focus in § 5 on the physical properties of the Wigner transformation, using it to analyze a quantum interference experiment. We will show, in particular, how “ghost components” of the Wigner transform can appear, rapidly change sign, and are the signature of quantum effects. 1.

Delta function of an operator

Consider a Hermitian operator having a continuous spectrum, and whose eigenvalues are noted . We define the operator ( ), depending on a real continuous parameter , as: ( )=

1 2

(

d

)

(1)

It is, in a way, a “delta function of an operator” associated with the difference between operator and the constant . We note the eigenvectors of ; the index (assumed to be discrete) accounts for the possible degeneracy of the eigenvalue . In a quantum state defined by the density operator , the average value of this operator is: 1 2 1 = 2

( ) =

=

1 2

d

d

d

d

The integral over d yields 2 ( ) =

(

d Tr

)

(

(

( =

)

)

), and we obtain: ( )

where ( ) d is the probability, in a measurement associated with operator a result in the interval [ + d ]. 2.

(2)

(3) , of finding

Wigner distribution of the density operator (spinless particle)

Imagine now that the physical system under study is a spinless particle, described by a density operator . Our purpose is to introduce a function (r p) yielding simultaneously information on the probability of measurement results on either the position operator R, or the momentum operator P. As these are incompatible observables, we should not expect to make highly accurate predictions concerning both types of measurements: the precision is limited by Heisenberg’s uncertainty principle. This function 2299

APPENDIX VII

will have to make a compromise between the two types of information, resulting from an unavoidable quantum uncertainty. As already mentioned in the introduction, this function is actually a “quasi-distribution”, even though it is commonly called the “Wigner distribution”. 2-a.

Definition of the distribution, Weyl operators

By analogy with (1), we can introduce the “Weyl operator”, which will be noted (r p). It is, in a way, a “delta function” of an operator, associated at the same time with the operators R r and P p. We set: (r p) =

1 }3

d3

d3 3

(2 )

(2 )

3

e [κ (R

r)+x (P p) }]

(4)

where the integration variable κ has the dimension of a wave vector (the inverse of a length), and x the dimension of a length. This operator is Hermitian, as can be shown by changing the sign of the two integration variables. For each quantum state, the Wigner distribution (r p; ) is defined as the average value in that state of the Weyl operator: (r p; ) =

(r p)

(5)

If the system under study is characterized by a density operator ( ) whose trace equals 1, this definition amounts to: (r p; ) = Tr

()

(r p)

(6)

On the other hand, if the system is described by a normalized state vector Ψ , this definition becomes: (r p; ) = Ψ ( )

(r p) Ψ ( )

(7)

Note that the density operator as well as the state vector are dimensionless. Taking into account the factors } introduced in (4), (r p) and (r p) have the dimensions of } 3 ; the product (r p)d3 d3 is thus dimensionless, as a probability should be. We are now going to show that this distribution has numerous useful properties. Later, we shall need the matrix elements r1 (r p) r2 of the operator (r p) in the position representation. As demonstrated below, they are written: r1

(r p) r2 =

1

p (r2 r1 ) } 3

(2 })

r1 + r2 2

r

(8)

Demonstration: The demonstration uses relation (63) of Complement BII , which expresses the exponential of the sum of two operators and as a product of exponentials, provided both operators commute with their commutator [ ]: +

2300

=

1[ 2

]

(9)

WIGNER TRANSFORM

We choose: r) κ

= (R

and

= (P

(10)

p) x }

The commutator of these two operators: [

κ x

]=

(11)

is a number, so that both (9) into (4), we get: r1

(r p) r2 =

and

commute with their commutator. Inserting relation

d3 (2 )3

1 }3

d3 (2 )3

κx 2

(R r) κ

r1

(P p) x }

Now relation (13) of Complement EII tells us that the action of operator simple translation of the position eigenvalue by x, which means that: (P p) x }

px }

r2 =

r2

(12)

r2 Px }

is a

(13)

x

The function to be integrated in relation (12) then becomes: κx 2

r1

(R r) κ

(P p) x }

κx 2

r2 =

x r) κ

(r2

px }

(r1

r2 + x) (14)

The delta function, integrated over d3 in (12), leads to replacing in the exponent x by r2 r1 , and r2 x by r1 . We then get: r1

(r p) r2 =

d3 (2 )3

1 (2 })3

κ

r1 +r2 2

r

p

r2

r1

(15)

}

The integration over d3 then yields a delta function, and we obtain relation (8). 2-b.

Expressions for the Wigner transform

Definition (5) of the Wigner transformation may lead to various expressions for the Wigner transform, depending on the representation used in the state space. .

Position representation

Using the position representation to calculate the trace appearing in (6), we get: (r p; ) = =

d3

d3

2

d3

d3

1

R+

y 2

r2

( ) r1 r1

() R

y 2

(r p) r2 R

y 2

(r p) R +

y 2

(16)

where, on the second line, we used the integration variables R = (r1 + r2 ) 2 and y = r2 r1 . Using relation (8) in this expression, we get the function (R r), to be integrated over d3 , which leads to: (r p; ) =

1 3

(2 })

d3

py }

r+

y 2

() r

y 2

(17)

2301

APPENDIX VII

This relation is often used as a definition of the Wigner distribution. Its integration over 3 d3 (2 }) yields a function (y), which leads to: d3

(r p; ) = r

() r =

(r )

(18)

We confirm a property of the classical distributions in phase space: the integral of the distribution over the momenta yields the probability density (r ) of finding the particle at point r. In the particular case where the particle is described by a pure state Ψ – see relation (7) – the definition of the Wigner distribution becomes: (r p; ) =

1

d3

3

(2 })

py }

Ψ r+

y ; 2

Ψ

r

y ; 2

(19)

where Ψ (r; ) is the wave function in the position representation: Ψ (r; ) = r Ψ ( ) .

(20)

Momentum representation

As position and momentum play symmetrical roles in the argument, we expect to find a similar relation for the Wigner distribution, involving now the matrix elements of the density operator in the momentum representation. This is indeed the case, and we are going to show that: (r p; ) =

1

d3

3

(2 })

qr }

p+

q 2

q 2

() p

(21)

This expression is the exact analog of (17); it can be considered as an alternative definition of the Wigner distribution. As for the analogy of the property expressed by (18), it is easy to show that: d3

(r p; ) = p

() p

(22)

Just as for a classical distribution, the integral over the positions of the Wigner distribution yields the probability density of finding a given momentum p. Demonstration: Inserting in the matrix element of (17) two closure relations on normalized momentum plane waves q and q , yields the two scalar products: r+

y q 2

1

=

(2 })3

q

(r+ y2 )

}

q

2

y 2

r

=

1 (2 })3

q

(r

y 2

)

}

2

(23) and we can write: (r p; ) =

1 (2 })6

d3

py }

q

2302

(r+ y2 )

d3 }

d3 q

(r

y 2

)

}

q

() q

(24)

WIGNER TRANSFORM

The summation over d3 yields a delta function: 1 (2 })3

q +q 2

d3

p

y }

q +q 2

=

(25)

p

so that if we take Q = (q + q ) 2 and q = q (21).

q as integration variables, we obtain

In the particular case where the particle is described by a pure state Ψ , relation (21) becomes: (r p; ) =

1

d3

3

(2 })

qr

Ψ p+

q ; 2

Ψ

p

q ; 2

(26)

where Ψ (p; ) is the wave function in the momentum representation: Ψ (p; ) = p Ψ ( )

(27)

If the wave function is factored2 : Ψ (p; ) =

(

; )

( ; )

( ; )

(28)

relation (26) shows that the Wigner transform is also factored: (r p; ) =

(

; )

(

; )

(

; )

(29)

with: (

; )=

1 2 }

d

}

+ ; 2

2

;

(30)

The other two components ( ; ) and ( ; ) are defined in a similar way. In this particular case, one can reason independently for the three dimensions. .

Inverting the relations

We just saw that to each density operator corresponds a unique and well defined Wigner distribution. Inversely, starting from this distribution, one can reconstruct the corresponding density operator via its matrix elements. To take the inverse Fourier transform of (17), we multiply this relation by p z } and integrate over d3 : d3

pz }

(r p; ) =

1 (2 })

3

d3

On the right-hand side, the integral over this equality becomes: d3

pz }

2 Whether

(r p; ) = r +

z 2

d3 3

() r

p (z y) }

r+

y 2

yields a function (2 }) z 2

() r 3

(z

y 2

(31)

y), so that

(32)

the wave function is factored in the momentum or position representation is equivalent.

2303

APPENDIX VII

Setting r1 = r + z 2 and r2 = r representation: d3

( ) r2 =

r1

z 2, we obtain the matrix elements of

r1 + r2 p; 2

p (r1 r2 ) }

in the position

(33)

Knowing the Wigner distribution (r p) thus defines the operator in a unique way. p r } In a similar way, multiplying (21) by , integrating over d3 , then setting p = p1 p2 and p = (p1 + p2 ) 2, yields the inversion relation in the momentum representation: d3

( ) p2 =

p1

2-c.

(p2 p1 ) r }

r

p1 + p2 ; 2

(34)

Reality, normalization, operator form

Let us take the Hermitian conjugate of relation (17). As the density operator is Hermitian, the matrix element on the right-hand side becomes: r+

y 2

() r

y 2

= r

y 2

( ) r+

y 2

(35)

Changing the sign of the integration variable y then yields again relation (17); the distribution (r p) is therefore equal to its complex conjugate, meaning it is real. We now compute the integral of (r p) over the entire phase space. Summing (18) over d3 , we get: d3

d3

(r p; ) =

d3

r

( ) r = Tr

() =1

(36)

where the last equality comes from the fact that the density operator has a trace equal to one. The Wigner distribution of the density operator is thus a real function, normalized to one in phase space, as is the case for a classical distribution. Comment: The density operator must obey a stronger constraint than having its trace equal to unity. It is defined as positive definite, meaning that for any ket , we must have: ()

0

(37)

Now this condition is not merely equivalent to a positivity condition for the Wigner transform. In fact, we shall see below that the Wigner distribution of a density operator can become negative at certain points of phase space. However, the only Wigner distributions (r p; ) acceptable for describing quantum systems are those that lead to a density operator obeying constraint (37). To know if a distribution in phase space is acceptable or not, is thus more difficult to decide in quantum mechanics than in classical mechanics.

2304

WIGNER TRANSFORM

We now show that relations (33) and (34) can be written in a simple operator form: 3

d3

( ) = (2 })

d3

(r p; )

(r p)

(38)

In this relation, (r p; ) is the Wigner distribution, hence a function of position and momentum, but (r p) is the Weyl operator defined in (4). To prove relation (38), we calculate the matrix element of this equality between the bra r1 and the ket r2 to verify that we indeed obtain relation (33). Taking (8) into account, the matrix elements of the right-hand side are: 3

d3

(2 })

d3

(r p; )

=

d3

=

d3

(r p) r2

r1

d3

r1 + r2 2

p (r2 r1 ) }

(r p; ) r1 + r2 p; 2

r

p (r2 r1 ) }

(39)

which is equivalent to the right-hand side of (33). 2-d.

Gaussian wave packet

A particular case where the computation can be completed is the one-dimensional Gaussian wave packet, studied in Complement GI . Relation (1) of this complement yields the normalized wave function ( ) of such a wave packet in the position representation. We slightly modify it to center the wave packet at an arbitrary non-zero position 0 , and replace the variable by = } (setting in particular 0 = } 0 ). We obtain: ( )=

d }

3 4

(2 ) =

2

(

0)

2

4}2

(

0)

}

1 4

2

0(

0)

(

}

0)

2

2

(40)

2

where the second equality corresponds to relation (9) of Complement GI (within an 0 1 2 } translation along ). The wave functions (2 }) correspond to plane waves with momentum (normalized with respect to that momentum); looking at the first equality in (40) we recognize the wave function in the momentum representation: ( )=

1

2

1 4

2 0)

(

4}2

0

}

(41)

}

(2 )

The Wigner distribution (30) can then be written (to simplify, we temporarily ignore the time dependence): 2

(

)=

3 2

(2 )

}2

d 2

=

3 2

(2 )

2}2

}2

4}2

(

(

2 0)

0+ 2

d

)

2

+(

2 2 8}2

2

0

2

(

)

0) }

(

0) }

(42)

2305

APPENDIX VII

The integral over d is a Fourier transform whose value can be obtained by replacing by 2 in relation (50) de l’Appendice I. We get: 2

(

)=

3 2

(2 ) 1 = }

2}2

2 0)

(

}2

2 2}2

(

2 0)

2

(

2 0) 2

2 }

2} 2

2(

0)

2

2

(43)

Taking (40) and (41) into account, we see that the Wigner distribution is simply the product of the probability densities in the position and momentum spaces: (

( )2

)=

( )2

(44)

This result is particularly simple and shows that the Wigner transform of a Gaussian wave packet (40) contains no correlations between the variables and . It can be factored into two Gaussian functions, one concerning the momentum, the other the position. The first one is centered on the average momentum 0 , and has a width of the order of } ; the second, on the average position 0 , with a width of the order of . These two widths are within the boundaries imposed by Heisenberg relations. Note that, in this case, the Wigner transform remains positive for all the values of its variables, as will also be the case for the semiclassical situations we consider in § 2-e. Comment: In the preceding paragraph, we ignored the time dependence of the wave packet. To take it into account and assuming we are dealing with a free particle, one can multiply, in (41), ( ) by , with: 2

=

2 }

(45)

where is the mass of the particle. This introduces, in the integral on the second line ( }) of (42), an additional exponential whose effect on the Fourier transform is to make the following substitution: (46) This simply corresponds to the motion of the particle with a velocity . Making this substitution in (43), we find that the Wigner distribution is still a product of two Gaussian functions, but no longer the product of a function of momentum by a function of position: correlations have appeared between the momentum and position variables. 2-e.

Semiclassical situations

To what extent is it possible to consider the Wigner transform to be a true probability distribution? Relations (18) and (22) seem to be in favor of it, as they show that integrating that distribution over the momenta (or over the positions) actually yields a probability distribution of finding the particle at a given point (or with a given momentum). These two “marginal distributions” thus obtained by integration are both probability distributions. But this is not sufficient to ensure that the function (r p) 2306

WIGNER TRANSFORM

itself (before integration) has the same property. Actually, we already mentioned in the introduction that it is not possible, in general, to interpret the product (r p) d3 d3 as yielding directly the probability for a particle to occupy an infinitesimal cell d3 d3 of phase space, centered at (r p). Such a probability distribution is meaningless in quantum mechanics, as Heisenberg’s relations forbid the existence of a quantum state defined with an arbitrary precision both in position and momentum spaces. There are, however, some simple cases, that we shall call “semiclassical”, where the Wigner transform is very similar to a classical probability distribution. They correspond to situations we will now study, where the physical quantities vary sufficiently slowly in space compared to a scale we shall define explicitly. In the following section, we will consider more general situations, where the properties of the Wigner transform are radically different. In particular, the Wigner transform can become negative, which immediately excludes any interpretation in terms of probability density. .

Wave packet with slow spatial variations

Consider the wave function: (r) =

(r)

(r)

(47)

where (r) is the modulus of the wave function and (r) its phase. The probability den2 sity of presence is then [ (r)] , while relation (D-17) of Chapter III yields the probability current J (r): J (r) =

}

2

[ (r)] ∇ (r)

(48)

The matrix elements of the corresponding density operator are written: r

() r

=

(r )

(r )

[ (r )

(r )]

(49)

We assume that, in the vicinity of each point r, the wave function behaves locally as a plane wave: (r)

(r)

[K(r) r+ (r)]

and that the two functions in space: their variations are wavelength 2 (r). When r of the exponential in (49); the then written: r

() r

(r )

(r )

in the vicinity of each point r

(50)

(r) and K (r), as well as the phase (r), vary slowly negligible over distances of the order of the de Broglie and r are close enough, one can expand the argument matrix elements r ( ) r of the density operator are K (r

r

)

(51a)

where K is defined as: K = ∇ (r =

r +r ) 2

(51b) 2307

APPENDIX VII

.

Density operator; link with the probability fluid

To characterize a semiclassical situation in a more general way, we will now reason in terms of a density operator, without restricting our study to a pure state as we did earlier. To start, we assume there is no long-range non-diagonal order3 : () r

r

0

if

r

(52)

r

where is a macroscopic coherence length. For a pure state, would be determined by the size of the domain where the wave function has a non-zero modulus ( ). For a statistical mixture of states, we have a different situation: the phases of the various wave functions may interfere destructively at shorter distances, so that can be much smaller. Nevertheless, we shall assume that remains larger than a few de Broglie wavelengths 2 (r ) and that when r r . , the non-diagonal matrix elements of the density operator vary in a similar way as (51a): () r

r

K (r

(r r )

r

)

if r

r

.

(53)

This expression is simply the generalization of (51a), only valid for a pure state; the real function (r r ) replaces the modulus product (r ) (r ). Both functions (r r ) and K are supposed to remain practically constant as the variables r and r vary by a quantity of the order of . With these assumptions, the values of the integration variable giving a significant contribution to the integral in relation (17) correspond to y . , so that we can write: 1

(r p)

(2 })

d3

3

py }

r+

y r 2

y 2

K(r) y

(54)

where the integration domain is centered at y = 0 and extends over a few coherence lengths . As the function is practically constant in this domain, and since (r r) = r ( ) r , we get: 1

(r p)

r

() r

() r

[p

(2 })

3

d3

[K(r) y p y }]

(55)

or else: (r p)

r

p0 (r)]

(56)

To write this expression, we have used the following definitions: (p) =

1 (2 })

3

d3

py }

(57)

and: p0 (r) = } K (r)

(58)

3 The concept of long-range non-diagonal order is introduced in Complement A XVI , §§ 2-a and 3-c, where, in particular, its relation with Bose-Einstein condensation is established. The present hypothesis concerning the absence of long-range order prevents ( ) from being the one-body density operator of a system of condensed bosons.

2308

WIGNER TRANSFORM

The function (p) is a momentum distribution centered at p = 0, with a width ∆ } . It is normalized to unity as the integral of (p) over the momenta yields a function (y), which integrated over d3 is equal to one. Note that in (56) this function takes on its value for a momentum equal to p p0 (r), which means the momentum distribution is centered at the value p0 (r). As this momentum value depends on r, correlations between position and momenta are now introduced in (r p). Expression (56) for the distribution (r p) can be interpreted as a classical distribution in the probability fluid phase space: it is the product of the local probability density r ( ) r by a function of momentum [p p0 (r)] centered around the value p0 (r) defined in (58). Now this p0 (r) value is precisely the momentum value that, divided by (to go from momentum to velocity) and multiplied by the probability density, yields the fluid probability current J (r). Note that the distribution keeps a certain width around p0 (r), of the order of } , as required by Heisenberg’s uncertainty relation. To sum up, in such semiclassical situations, the Wigner distribution directly reflects the spatial variation of the probability, and of its associated local current. It simply describes the flow of a “probability fluid” (III, § D-1-c- ), as does the distribution in phase space of an ensemble of classical particles forming a moving fluid. 2-f.

Quantum situations where the Wigner distribution is not a probability distribution

In the previous examples, the properties of the Wigner distribution are very similar to those of a classical distribution. This is, however, not always the case: as surprising as it may seem, the Wigner transform can, in general, become negative. .

Odd wave function

A very simple case offers such an example. In a one-dimension problem, imagine that the system has an odd wave function, as is the case for example for the first excited state of the harmonic oscillator. We then have, according to relation (19): ( =0

= 0) =

1

2 } 1 = 2 }

d

2

2 2

d

2

(59)

which is obviously negative. As odd wave functions often occur in quantum mechanics, we see that there exist numerous situations where the Wigner distribution has some properties unexpected for a distribution. Strictly speaking, the term “quasi-distribution” should always be used. .

Two-peak wave function

Imagine now the particle wave function is the sum of two wave packets, one localized around = + , the other around = : ( )=

1 2

(

)+

( + )

(60)

where the wave function ( ) is normalized; the relative phase factor is arbitrary. For the sake of simplicity, we assume that ( ) is zero when and that it is even. We 2309

APPENDIX VII

also suppose that in our case , meaning that the two wave packets forming the total wave function are well separated. Let us compute the Wigner distribution at point = 0, therefore at a point where the wave function ( ) is zero. In one dimension, relation (19) is written as: ( =0

)=

1 4 }

}

d

+

2 2

2

+

+ 2

+

(61)

In this expression, the functions are zero if their argument’s modulus is larger than . As an example, is different from zero only if 2 , whereas is 2 2 different from zero only if 2 ; consequently their product is always zero. Actually, in the product of the two brackets, only the “crossed” terms are non-zero, and we obtain (with our assumption that the function is even): ( =0

)=

2

1

}

d

4 }

2

Changing the sign of the integration variable can write: ( =0

)=

2

+

+

2

(62)

for the second term in the bracket, we 2

1

d

2 }

cos

+

2

}

(63)

In the limit where the width becomes very narrow, the squared modulus of the wave function in the integral behaves as a delta function ( 2), and we get: ( =0

1 cos }

)

2

+

(64)

}

This result illustrates two properties of the Wigner distributions that both seem quite surprising. The first is that the distribution is non-zero at point = 0, whereas the probability of finding the particle at this position is strictly zero. The second is that the distribution is an oscillating function of momentum, taking successively positive and negative values, whereas a classical distribution always remain positive or zero. These two properties are actually related: integrating the distribution over all possible momenta yields zero, which is in agreement with relation (18) stating that the integral of the Wigner distribution over the momenta yields the probability of the particle’s presence at each point. More details on the properties of a two-peak wave function will be given in § 5-a. 3.

Wigner transform of an operator

Consider now any operator acting in the particle state space. We define its Wigner transform (r p) in the same way as for a density operator, but without the prefactor 3 1 (2 }) that appears in front of the integrals in (17) and (21): (r p) = = 2310

d3 d3

py } qr }

r+ p+

q 2

y 2

r p

y 2 q 2

(65)

WIGNER TRANSFORM

To simplify the notation, this definition does not include a time dependence; one can, however, directly replace by ( ) and (r p) by (r p; ), without any other change. The inversion relations (33) and (34) now become: r1

r2 =

p1

p2 =

1

d3

p (r1 r2 ) }

3

r1 + r2 p 2

d3

(p2 p1 ) r }

3

r

(2 }) 1 (2 })

p1 + p2 2

(66)

Taking the complex conjugate of relation (65) shows that the Wigner transform of a Hermitian operator is necessarily a real function. Similarly, the fact that the complex conjugate of (66) is real indicates that it is a sufficient condition for hermiticity. 3 As the prefactor 1 (2 }) is no longer included in the definition (65), the equivalent of relation (38) is now: d3

=

d3

(r p)

(r p)

(67)

We saw previously that the operator (r p) is Hermitian. The above relation then allows building a Hermitian operator from any real function (r p) of position and momentum. In other words, we found a quantization procedure for any classical function, often called “Weyl quantization” or “phase space quantization” [100, 101, 102]. Starting from two functions (r p) and (r p), whose product obviously commutes, this procedure yields two operators and that, in general, do not commute. Such an operation, which introduces in phase space a non-commutative structure, is sometimes referred to as “geometric quantization”. 3-a.

Average value of a Hermitian operator (observable)

We can now compute the average value of operator by the density operator ( ): = Tr

()

=

d3

1

d3

2

r1

( ) r2 r2

in the quantum state defined

(68)

r1

We are going to show that: =

d3

d3

(r p; )

(r p)

(69)

This relation is the exact analog of the relation one would obtain with a classical distribution. It is the reason the Wigner transform of the density operator is referred to as a “quasi-classical distribution”, or more simply as a “distribution”. Demonstration: Inserting in (68) the equalities (33) and (66) leads to: =

1 (2 })3

d3

1

d3

2

d3

p (r1

r1 + r2 p; 2

r2 ) }

d3 r1 + r2 p 2

p (r1

r2 ) }

(70)

2311

APPENDIX VII

We now replace the integration variables r1 and r2 by the following variables: r=

r1 + r2 2

;

The summation over d3 (2 )3

p

p }

r = r1

(71)

r2

introduces a delta function:

= (2 })3

p

(72)

p

which takes care of the integration over d3 . We then finally obtain (69). 3-b.

Special cases

In the special case in which the operator =

(R)

and hence:

r1

r2 =

depends only on the position operator: (r1 )

(r1

r2 )

(73)

the first line of (65) leads to: (r p) =

(r)

(74)

The Wigner transform of the operator is then simply the function (r), which does not depend on the momentum p. In a similar way, if depends only on the momentum operator: =

(P)

and hence:

p1

p2 =

(p1 )

(p1

p2 )

(75)

the second line of (65) leads to: (r p) =

(p)

(76)

As a further illustration, let us find an operator both position and momentum, for example: (r p) = r p

whose Wigner transform involves (77)

Relation (66) yields its matrix elements: r1

r2 =

1 3

d3

p (r1 r2 ) }

r1 + r2 p 2

(2 }) } r1 + r2 = ∇r1 (r1 r2 ) (78) 2 We recognize in this expression the matrix elements of the operator P, equal to the gradient of a delta function of the positions, multiplied by } . Note, in addition, that r1 is the result of the action of the position operator on the bra, whereas r2 is the result of the action of the position operator on the ket. This means that: 1 [R P + P R] (79) 2 We thus get a Hermitian operator, as expected since its Wigner transform is real. It is however remarkable that building a quantum operator via the Wigner transforms spontaneously introduces an arrangement of the operators’ order leading to the necessary symmetry. This property is quite general: starting from real classical functions, the Wigner transform allows building operators symmetrized with respect to position and momentum. This method is a real quantization procedure. =

2312

WIGNER TRANSFORM

3-c. Wigner transform of an operator product

We are going to show that, in general, the Wigner transform associated with the product of two operators $A$ and $B$ is not simply the product of the Wigner transforms of each operator.

α. General expression

Let us apply relation (65) to obtain the Wigner transform of the product of two operators $A$ and $B$. Inserting a closure relation on the kets $|\mathbf{z}\rangle$ leads to:
\[
[AB]_W(\mathbf{r},\mathbf{p}) =
\int d^3y \int d^3z\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\,
\left\langle \mathbf{r}+\frac{\mathbf{y}}{2}\right|A\left|\mathbf{z}\right\rangle
\left\langle \mathbf{z}\right|B\left|\mathbf{r}-\frac{\mathbf{y}}{2}\right\rangle
\tag{80}
\]
We can then replace the matrix elements of $A$ and $B$ by their expressions (66), which leads to:
\[
[AB]_W(\mathbf{r},\mathbf{p}) =
\frac{1}{(2\pi\hbar)^6}\int d^3z \int d^3y \int d^3p_1 \int d^3p_2\;
e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\,
A_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}+\frac{\mathbf{y}}{4},\,\mathbf{p}_1\right)
e^{i\mathbf{p}_1\cdot(\mathbf{r}+\frac{\mathbf{y}}{2}-\mathbf{z})/\hbar}\;
B_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}-\frac{\mathbf{y}}{4},\,\mathbf{p}_2\right)
e^{i\mathbf{p}_2\cdot(-\mathbf{r}+\frac{\mathbf{y}}{2}+\mathbf{z})/\hbar}
\tag{81}
\]
Instead of using the position representation, one can use the momentum representation; we then must use the relations on the second lines of (65) and (66). A reasoning similar to that used before leads to:
\[
[AB]_W(\mathbf{r},\mathbf{p}) =
\frac{1}{(2\pi\hbar)^6}\int d^3q \int d^3q' \int d^3x \int d^3y\;
e^{i\mathbf{q}\cdot\mathbf{r}/\hbar}\,
A_W\!\left(\mathbf{x},\,\frac{\mathbf{q}'+\mathbf{p}}{2}+\frac{\mathbf{q}}{4}\right)
e^{i(\mathbf{q}'-\mathbf{p}-\frac{\mathbf{q}}{2})\cdot\mathbf{x}/\hbar}\;
B_W\!\left(\mathbf{y},\,\frac{\mathbf{q}'+\mathbf{p}}{2}-\frac{\mathbf{q}}{4}\right)
e^{i(\mathbf{p}-\mathbf{q}'-\frac{\mathbf{q}}{2})\cdot\mathbf{y}/\hbar}
\tag{82}
\]
Depending on the case, it will be easier to use either (81) or (82). These two expressions are exact, but fairly complicated. They can be simplified, however, in a certain number of cases.

β. A few simple cases

As a first example, imagine that the operator $A$ is simply the position operator $\mathbf{R}$, while $B$ can be any operator. As its Wigner transform $A_W = \mathbf{r}$ no longer depends on $\mathbf{p}_1$, the integration over $d^3p_1/(2\pi\hbar)^3$ in (81) yields a delta function $\delta(\mathbf{r}+\frac{\mathbf{y}}{2}-\mathbf{z})$; this allows integrating over $d^3z$ to obtain:
\[
[\mathbf{R}B]_W(\mathbf{r},\mathbf{p}) =
\frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}
\int d^3p_2\; e^{i\mathbf{p}_2\cdot\mathbf{y}/\hbar}
\left(\mathbf{r}+\frac{\mathbf{y}}{2}\right) B_W(\mathbf{r},\mathbf{p}_2)
\tag{83}
\]
For the term in $\mathbf{r}$, the integral over $d^3y$ of the exponential $e^{i(\mathbf{p}_2-\mathbf{p})\cdot\mathbf{y}/\hbar}$ introduces a function $\delta(\mathbf{p}_2-\mathbf{p})$ with the coefficient $(2\pi\hbar)^3$. As for the term in $\mathbf{y}/2$, it yields $\nabla_{\mathbf{p}_2}\,\delta(\mathbf{p}_2-\mathbf{p})$, with the coefficient $(\hbar/2i)(2\pi\hbar)^3$. After integrating over $d^3p_2$, we get:
\[
[\mathbf{R}B]_W(\mathbf{r},\mathbf{p}) =
\left[\mathbf{r} + \frac{i\hbar}{2}\,\nabla_{\mathbf{p}}\right] B_W(\mathbf{r},\mathbf{p})
\tag{84}
\]
If we now reverse the order of the operators $\mathbf{R}$ and $B$, the roles of $\mathbf{p}_1$ and $\mathbf{p}_2$ are interchanged in (81); the integration over $d^3p_2/(2\pi\hbar)^3$ yields a function $\delta(-\mathbf{r}+\frac{\mathbf{y}}{2}+\mathbf{z})$ and the integration over $d^3z$ leads to:
\[
[B\mathbf{R}]_W(\mathbf{r},\mathbf{p}) =
\frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}
\int d^3p_1\; e^{i\mathbf{p}_1\cdot\mathbf{y}/\hbar}\,
B_W(\mathbf{r},\mathbf{p}_1)\left(\mathbf{r}-\frac{\mathbf{y}}{2}\right)
\tag{85}
\]
Compared to (83), the only change is the sign of $\mathbf{y}$ in the final bracket, so that we simply obtain the final result by changing the sign of the gradient on the right-hand side of (84). This means that the Wigner transform of the commutator is:
\[
[\mathbf{R},B]_W(\mathbf{r},\mathbf{p}) = i\hbar\,\nabla_{\mathbf{p}}\, B_W(\mathbf{r},\mathbf{p})
\tag{86}
\]
Starting from (82), the same reasoning leads to:
\[
[\mathbf{P}B]_W(\mathbf{r},\mathbf{p}) =
\left[\mathbf{p} - \frac{i\hbar}{2}\,\nabla_{\mathbf{r}}\right] B_W(\mathbf{r},\mathbf{p})
\tag{87}
\]
This relation can now be iterated to obtain:
\[
[\mathbf{P}^2 B]_W(\mathbf{r},\mathbf{p}) =
\mathbf{p}^2\, B_W(\mathbf{r},\mathbf{p})
- i\hbar\,\mathbf{p}\cdot\nabla_{\mathbf{r}}\, B_W(\mathbf{r},\mathbf{p})
- \frac{\hbar^2}{4}\,\Delta_{\mathbf{r}}\, B_W(\mathbf{r},\mathbf{p})
\tag{88}
\]
We then get the expression for the Wigner transform of the commutator of the momentum squared and any operator $B$:
\[
[\mathbf{P}^2, B]_W(\mathbf{r},\mathbf{p}) =
-\,2i\hbar\,\mathbf{p}\cdot\nabla_{\mathbf{r}}\, B_W(\mathbf{r},\mathbf{p})
\tag{89}
\]
This relation will be useful for what follows.

γ. Gradient expansions

We now show that relation (81) can be expressed as a series expansion in higher order derivatives of the two functions $A_W$ and $B_W$, of the form:
\[
[AB]_W(\mathbf{r},\mathbf{p}) = A_W(\mathbf{r},\mathbf{p})\, B_W(\mathbf{r},\mathbf{p})
+ \frac{i\hbar}{2}\,\{A_W, B_W\}(\mathbf{r},\mathbf{p}) + \dots
\tag{90}
\]
where we have used the classical definition of the "Poisson bracket" [103, 104] of classical Lagrangian mechanics:
\[
\{A_W, B_W\}(\mathbf{r},\mathbf{p}) =
\nabla_{\mathbf{r}} A_W(\mathbf{r},\mathbf{p})\cdot\nabla_{\mathbf{p}} B_W(\mathbf{r},\mathbf{p})
- \nabla_{\mathbf{r}} B_W(\mathbf{r},\mathbf{p})\cdot\nabla_{\mathbf{p}} A_W(\mathbf{r},\mathbf{p})
\tag{91}
\]
This shows that, to lowest order in $\hbar$, the Wigner transform of an operator product is simply the product of the Wigner transforms of these operators. To first order, a correction must be added, which contains the Poisson bracket of the two Wigner transforms. It is remarkable that purely quantum considerations bring in this classical Poisson bracket definition; this explains why these results are well suited to the study of the classical limit of quantum mechanics. In (90), the expansion is limited to the contribution of the first order derivatives of the two functions. The following terms involve higher order derivatives and, consequently, higher powers of $\hbar$ (the corresponding result is called "Groenewold's formula"; see for example [99]).

Demonstration: Let us make in (81) the following change of momentum integration variables:
\[
\mathbf{P} = \frac{\mathbf{p}_1+\mathbf{p}_2}{2}
\qquad ; \qquad
\mathbf{q} = \mathbf{p}_1-\mathbf{p}_2
\tag{92}
\]
(despite the notation with a capital letter, $\mathbf{P}$ is here a classical variable, not an operator). This leads to the new expression:
\[
[AB]_W(\mathbf{r},\mathbf{p}) =
\frac{1}{(2\pi\hbar)^6}\int d^3z \int d^3y \int d^3P \int d^3q\;
A_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}+\frac{\mathbf{y}}{4},\,\mathbf{P}+\frac{\mathbf{q}}{2}\right)
B_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}-\frac{\mathbf{y}}{4},\,\mathbf{P}-\frac{\mathbf{q}}{2}\right)
e^{i(\mathbf{P}-\mathbf{p})\cdot\mathbf{y}/\hbar}\,
e^{i\mathbf{q}\cdot(\mathbf{r}-\mathbf{z})/\hbar}
\tag{93}
\]
If the two Wigner transforms $A_W$ and $B_W$ vary slowly with position and momentum, we can use the expansions:
\[
A_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}+\frac{\mathbf{y}}{4},\,\mathbf{P}+\frac{\mathbf{q}}{2}\right)
= A_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2},\,\mathbf{P}\right)
+ \frac{\mathbf{y}}{4}\cdot\nabla_{\mathbf{r}} A_W
+ \frac{\mathbf{q}}{2}\cdot\nabla_{\mathbf{p}} A_W + \dots
\]
\[
B_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2}-\frac{\mathbf{y}}{4},\,\mathbf{P}-\frac{\mathbf{q}}{2}\right)
= B_W\!\left(\frac{\mathbf{r}+\mathbf{z}}{2},\,\mathbf{P}\right)
- \frac{\mathbf{y}}{4}\cdot\nabla_{\mathbf{r}} B_W
- \frac{\mathbf{q}}{2}\cdot\nabla_{\mathbf{p}} B_W + \dots
\tag{94}
\]
Keeping only the first term in each of these two expansions (zero-order term in the gradient expansion), the integrals over $d^3y$ and $d^3q$ introduce the delta functions $\delta(\mathbf{P}-\mathbf{p})$ and $\delta(\mathbf{r}-\mathbf{z})$ respectively, each with a coefficient $(2\pi\hbar)^3$. We then get:
\[
[AB]_W(\mathbf{r},\mathbf{p}) = A_W(\mathbf{r},\mathbf{p})\, B_W(\mathbf{r},\mathbf{p}) + \dots
\tag{95}
\]
In this approximation, the Wigner transform of the product of two operators is thus simply the product of the Wigner transforms. We now take into account the first order terms in the gradient expansion (94). The $\nabla_{\mathbf{r}}$ term on the first line contains a summation over $d^3y$ modified by the presence of $\mathbf{y}$ in the integral:
\[
\frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{i(\mathbf{P}-\mathbf{p})\cdot\mathbf{y}/\hbar}\;\mathbf{y}
= \frac{\hbar}{i}\,\nabla_{\mathbf{P}}\,\delta(\mathbf{P}-\mathbf{p})
\tag{96}
\]
The integral over $d^3P$ in (93) is now modified and leads to a derivation with respect to $\mathbf{P}$ of the function to be integrated, a multiplication by the coefficient $\hbar/4i$, and finally the replacement of $\mathbf{P}$ by $\mathbf{p}$. On the other hand, the integral over $d^3q/(2\pi\hbar)^3$ is unchanged and leads to the replacement of $\mathbf{z}$ by $\mathbf{r}$. The corresponding term is therefore written:
\[
\frac{i\hbar}{4}\,\nabla_{\mathbf{p}}\cdot\left[B_W(\mathbf{r},\mathbf{p})\,\nabla_{\mathbf{r}} A_W(\mathbf{r},\mathbf{p})\right]
\tag{97}
\]
As for the $\nabla_{\mathbf{p}}$ term on the first line of (94), it can be handled in the same way. The presence of the variable $\mathbf{q}$ transforms $\delta(\mathbf{r}-\mathbf{z})$ into $\nabla_{\mathbf{z}}\,\delta(\mathbf{r}-\mathbf{z})$, with a coefficient $-\hbar/2i$, where the sign change of this coefficient comes from the $-\mathbf{z}$ in the exponent $\mathbf{q}\cdot(\mathbf{r}-\mathbf{z})$; the integral over $d^3y$ is unchanged. This yields the term:
\[
-\,\frac{i\hbar}{4}\,\nabla_{\mathbf{r}}\cdot\left[B_W(\mathbf{r},\mathbf{p})\,\nabla_{\mathbf{p}} A_W(\mathbf{r},\mathbf{p})\right]
\tag{98}
\]
which, added to (97), leads to the contribution (the terms involving a double derivation of $A_W$ cancel each other):
\[
\frac{i\hbar}{4}\left[
\nabla_{\mathbf{r}} A_W(\mathbf{r},\mathbf{p})\cdot\nabla_{\mathbf{p}} B_W(\mathbf{r},\mathbf{p})
- \nabla_{\mathbf{r}} B_W(\mathbf{r},\mathbf{p})\cdot\nabla_{\mathbf{p}} A_W(\mathbf{r},\mathbf{p})
\right]
\tag{99}
\]
Finally, the terms coming from the second line of (94) are obtained by exchanging the roles of $A_W$ and $B_W$, and changing the signs because of the opposite signs of $\mathbf{y}$ and $\mathbf{q}$ in the second line of relation (94). We thus double the result (99), and finally obtain expression (90) to first order in the gradients.
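The lowest-order terms of (90) are easy to manipulate explicitly. As a small symbolic illustration (one dimension; the potential and the distribution are left arbitrary), the sketch below evaluates the Poisson bracket (91) of $H_W = p^2/2m + V(x)$ with a distribution $W(x,p)$; the result, $V'(x)\,\partial_p W - (p/m)\,\partial_x W$, is the right-hand side of the approximate evolution equation (102) and, once the drift term is moved to the left-hand side, reproduces the classical Liouville equation (111) obtained below.

```python
# Illustrative sketch: first-order (Poisson bracket) term of the gradient
# expansion (90)-(91), evaluated symbolically in one dimension with sympy.
import sympy as sp

x, p, m = sp.symbols("x p m", positive=True)
V = sp.Function("V")(x)            # arbitrary external potential V(x)
W = sp.Function("W")(x, p)         # arbitrary Wigner distribution W(x,p)

def poisson(A, B):
    # one-dimensional version of relation (91)
    return sp.diff(A, x) * sp.diff(B, p) - sp.diff(B, x) * sp.diff(A, p)

H = p**2 / (2 * m) + V             # Wigner transform of a Hamiltonian of type (103)
print(sp.expand(poisson(H, W)))
# -> V'(x)*dW/dp - (p/m)*dW/dx : force and drift terms of the Liouville equation
```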

3-d. Evolution of the density operator

The Schrödinger evolution of the density operator obeys the von Neumann equation:
\[
i\hbar\,\frac{d}{dt}\,\rho(t) = \left[H(t),\,\rho(t)\right]
\tag{100}
\]
Taking its Wigner transform, this equation becomes:
\[
i\hbar\,\frac{\partial}{\partial t}\,W(\mathbf{r},\mathbf{p};t)
= \frac{1}{(2\pi\hbar)^3}\,\bigl[H(t),\rho(t)\bigr]_W(\mathbf{r},\mathbf{p};t)
\tag{101}
\]
where, on the right-hand side, is written the Wigner transform associated with the commutator of $H(t)$ and $\rho(t)$; the factor $1/(2\pi\hbar)^3$ comes from the definition of the Wigner distribution of the density operator, remembering that no such coefficient appears in the transform of an arbitrary operator. We already saw that the general expression of the Wigner transform of an operator product is somewhat complex, and the same is of course true for their commutator.

α. Classical limit

If we only keep, as in (90), the first order terms in the gradients, we see that the zero-order terms disappear, and that the terms in $\{H_W, W\}$ and $\{W, H_W\}$ double up; in addition, the factors $i\hbar$ on both sides of the equation cancel out. Using this approximation, we get:
\[
\frac{\partial}{\partial t}\,W(\mathbf{r},\mathbf{p};t)
= \bigl\{H_W(\mathbf{r},\mathbf{p};t),\,W(\mathbf{r},\mathbf{p};t)\bigr\} + \dots
\tag{102}
\]
where the Poisson bracket of $H_W(\mathbf{r},\mathbf{p};t)$ and $W(\mathbf{r},\mathbf{p};t)$ is defined in (91). As noticed earlier in § 3-c-γ, the neglected terms are proportional to $\hbar$, and vanish in the classical limit $\hbar \to 0$. We find in this limit, where the gradients of the Wigner transforms with respect to position and momentum are small, the usual equations of classical dynamics.

β. Particle in an external potential

An exact calculation can be made if the particle's Hamiltonian is simply the sum of a kinetic energy and an external potential energy:
\[
H = \frac{\mathbf{P}^2}{2m} + V(\mathbf{R};t)
\tag{103}
\]
where $m$ is the mass of the particle. The contribution of the kinetic energy to the right-hand side of (101) comes directly from relation (89):
\[
\left.\frac{\partial}{\partial t}\,W(\mathbf{r},\mathbf{p};t)\right|_{\text{kinetic}}
= -\,\frac{\mathbf{p}}{m}\cdot\nabla_{\mathbf{r}}\,W(\mathbf{r},\mathbf{p};t)
\tag{104}
\]

The evolution of the Wigner distribution induced by the kinetic energy operator is thus given by a "drift term", just as in classical physics. As for the contribution of the potential energy, the computation is very similar to the one conducted at the beginning of § 3-c-β, except that instead of dealing with the operator $\mathbf{R}$ itself, we are now dealing with a function $V(\mathbf{R})$ of that operator. Taking $A = V(\mathbf{R})$ in relations (83) and (85), and including the factor $1/(2\pi\hbar)^3$ appearing in (101), they become:
\[
\frac{1}{(2\pi\hbar)^3}\,\bigl[V(\mathbf{R})\,\rho(t)\bigr]_W(\mathbf{r},\mathbf{p})
= \frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\;
V\!\left(\mathbf{r}+\frac{\mathbf{y}}{2};t\right)
\int d^3p_2\; e^{i\mathbf{p}_2\cdot\mathbf{y}/\hbar}\; W(\mathbf{r},\mathbf{p}_2;t)
\tag{105}
\]
and:
\[
\frac{1}{(2\pi\hbar)^3}\,\bigl[\rho(t)\,V(\mathbf{R})\bigr]_W(\mathbf{r},\mathbf{p})
= \frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\;
V\!\left(\mathbf{r}-\frac{\mathbf{y}}{2};t\right)
\int d^3p_1\; e^{i\mathbf{p}_1\cdot\mathbf{y}/\hbar}\; W(\mathbf{r},\mathbf{p}_1;t)
\tag{106}
\]
Finally, the evolution of the Wigner distribution $W(\mathbf{r},\mathbf{p};t)$ obeys the following equation:
\[
\frac{\partial}{\partial t}\,W(\mathbf{r},\mathbf{p};t)
+ \frac{\mathbf{p}}{m}\cdot\nabla_{\mathbf{r}}\,W(\mathbf{r},\mathbf{p};t)
= \frac{1}{i\hbar}\,\frac{1}{(2\pi\hbar)^3}\int d^3y
\left[V\!\left(\mathbf{r}+\frac{\mathbf{y}}{2};t\right)-V\!\left(\mathbf{r}-\frac{\mathbf{y}}{2};t\right)\right]
\int d^3p'\; W(\mathbf{r},\mathbf{p}';t)\; e^{i(\mathbf{p}'-\mathbf{p})\cdot\mathbf{y}/\hbar}
\tag{107}
\]

This is an exact equation. It contains all the quantum effects that play a role in the particle's evolution. It obeys a local conservation law for the probability:
\[
\frac{\partial}{\partial t}\,\rho(\mathbf{r},t) + \nabla_{\mathbf{r}}\cdot\mathbf{J}(\mathbf{r},t) = 0
\tag{108}
\]
where the local probability density $\rho(\mathbf{r},t)$ is defined in (18), and its associated current is:
\[
\mathbf{J}(\mathbf{r},t) = \int d^3p\; \frac{\mathbf{p}}{m}\; W(\mathbf{r},\mathbf{p};t)
\tag{109}
\]
This can be shown by integrating (107) over $d^3p$, as the left-hand side then becomes identical to the left-hand side of (108), just as in classical mechanics; as for the right-hand side, the integration over $d^3p$ introduces a function $\delta(\mathbf{y})$ that cancels the bracket in the remaining integral. When the external potential varies slowly enough, one can use in (107) the following approximation:
\[
V\!\left(\mathbf{r}+\frac{\mathbf{y}}{2};t\right) - V\!\left(\mathbf{r}-\frac{\mathbf{y}}{2};t\right)
= \mathbf{y}\cdot\nabla_{\mathbf{r}}V(\mathbf{r};t) + \dots
\tag{110}
\]

The integration over $d^3y/(2\pi\hbar)^3$ then leads to a function $(\hbar/i)\,\nabla_{\mathbf{p}'}\,\delta(\mathbf{p}'-\mathbf{p})$, and we get:
\[
\frac{\partial}{\partial t}\,W(\mathbf{r},\mathbf{p};t)
+ \frac{\mathbf{p}}{m}\cdot\nabla_{\mathbf{r}}\,W(\mathbf{r},\mathbf{p};t)
= \nabla_{\mathbf{r}}V(\mathbf{r};t)\cdot\nabla_{\mathbf{p}}\,W(\mathbf{r},\mathbf{p};t) + \dots
\tag{111}
\]
One recognizes here the Liouville equation of classical mechanics. The dots at the end of the equation symbolize the possible contributions of terms containing higher order spatial derivatives of the potential $V(\mathbf{r};t)$. They come with a power of $\hbar$ increasing with the order of the derivative. This means that they correspond to quantum corrections: the faster the potential varies in space, the more terms need to be taken into account. On the other hand, when the potential varies slowly, keeping only the classical evolution term is a good approximation.

4. Generalizations

The above considerations can be directly generalized to particles with spin, or to an $N$-particle system.

4-a. Particle with spin

For a particle with spin, a basis in state space is formed by the kets $|\mathbf{r},\nu\rangle$, where $\mathbf{r}$ is the eigenvalue of the position operator, and $\nu$ the eigenvalue of the spin component along the quantization axis. The matrix elements of the density operator are then written:
\[
\langle \mathbf{r},\nu|\,\rho(t)\,|\mathbf{r}',\nu'\rangle
\tag{112}
\]
For each value of $\nu$ and $\nu'$, we can perform a Wigner transformation and define, as in (17), the functions:
\[
W_{\nu\nu'}(\mathbf{r},\mathbf{p};t) = \frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\,
\left\langle \mathbf{r}+\frac{\mathbf{y}}{2},\nu\right|\rho(t)\left|\mathbf{r}-\frac{\mathbf{y}}{2},\nu'\right\rangle
\tag{113}
\]
As an example, for a spin $1/2$ the two indices $\nu$ and $\nu'$ can take on two different values, noted $\pm$. We thus define four Wigner functions, which can be arranged in a $2\times 2$ spin matrix:
\[
\begin{pmatrix}
W_{++}(\mathbf{r},\mathbf{p};t) & W_{+-}(\mathbf{r},\mathbf{p};t)\\[2pt]
W_{-+}(\mathbf{r},\mathbf{p};t) & W_{--}(\mathbf{r},\mathbf{p};t)
\end{pmatrix}
\tag{114}
\]
It is easy to show that this matrix is Hermitian:
\[
\left[W_{+-}(\mathbf{r},\mathbf{p};t)\right]^* = W_{-+}(\mathbf{r},\mathbf{p};t)
\tag{115}
\]
Such a matrix is frequently used when studying the quantum properties of spin polarization transport in fluids (spin waves for example).

4-b. Several particles

For two spinless particles, relation (17) is easily generalized to:
\[
W(\mathbf{r}_1,\mathbf{p}_1;\mathbf{r}_2,\mathbf{p}_2;t) =
\frac{1}{(2\pi\hbar)^6}\int d^3y_1 \int d^3y_2\;
e^{-i\mathbf{p}_1\cdot\mathbf{y}_1/\hbar}\, e^{-i\mathbf{p}_2\cdot\mathbf{y}_2/\hbar}\,
\left\langle \mathbf{r}_1+\frac{\mathbf{y}_1}{2},\,\mathbf{r}_2+\frac{\mathbf{y}_2}{2}\right|\rho(t)
\left|\mathbf{r}_1-\frac{\mathbf{y}_1}{2},\,\mathbf{r}_2-\frac{\mathbf{y}_2}{2}\right\rangle
\tag{116}
\]
Actually, any number $N$ of particles can be treated this way. Including the spin can be done as in the previous section, but it rapidly leads to a great number of Wigner functions ($4^N$ for $N$ particles each having a spin $1/2$). The Wigner distribution for a system including a large particle number $N$ therefore depends on $6N$ variables when the particles have no spin; when the particles have a spin $1/2$, it is no longer a single distribution that must be studied, but rather $4^N$ distributions, which are the matrix elements of a spin operator. In practice, one usually uses the Wigner distribution of the one-particle density operator, resulting from the partial trace over the $N-1$ other particles, or sometimes the Wigner distribution of the two-particle density operator.

5. Discussion: Wigner distribution and quantum effects

Knowledge of the Wigner distribution allows computing the average values of observables, as seen from relation (69). It can be used to obtain the probability of any measurement result, since this probability is simply the average value of the projector onto the eigensubspace associated with this result. We simply have to compute the Wigner transform of this projector, multiply it by $W(\mathbf{r},\mathbf{p};t)$, and integrate the result over the two variables. From a practical point of view, all the information is contained in $W(\mathbf{r},\mathbf{p};t)$. However, and as already underlined with the examples given in § 2-f, that does not mean we should attribute too much physical content to the Wigner distribution itself. Strictly speaking, the Wigner distribution is more a useful and powerful computation tool than a direct representation of the physical properties of the system. To highlight the behavior of the Wigner transform in a situation where quantum effects are predominant, we now study an interference experiment.

5-a. An interference experiment

When the wave function of a particle goes through a screen pierced with two holes, it is split into two coherent wave packets propagating in space, and interfering when they overlap. Figure 1 represents these two wave packets after the screen, as they both propagate towards the region where they will interfere. As they propagate in free space, the Wigner distribution associated with the particle simply obeys relation (104), which is just a classical equation of motion. What then causes the interference effects in the region where the two wave packets overlap? To answer this question, we shall use relation (19), or its equivalent (26), which allow computing the Wigner transform associated with the particle's wave function. This wave function is now the sum of two components, $\Psi_1(\mathbf{r},t)$ for the wave packet emerging from the first hole in the screen, and $\Psi_2(\mathbf{r},t)$ for the wave packet emerging from the second hole:
\[
\Psi(\mathbf{r},t) = \Psi_1(\mathbf{r},t) + \Psi_2(\mathbf{r},t)
\tag{117}
\]

Figure 1: The wave function of a quantum particle can be split into two coherent components 1 and 2, after passing, for example, through a screen pierced with two holes, or through an interferometer. As long as the two wave packets do not overlap, the Wigner distribution is the sum of three components, schematically drawn in ordinary space in the figure: a first one localized with wave packet 1, a second with wave packet 2, and finally a third one (circled with dashed lines) remaining at mid-distance from the two wave packets. This third component is called the "ghost component": when measuring its position, the particle can never be found in this component. The value of the ghost Wigner distribution oscillates rapidly as a function of the momentum p. Later on, as the two wave packets 1 and 2 overlap, the three components are different from zero in the same region of space; in addition, the momentum oscillations of the ghost component slow down and even vanish. This component now plays an essential role: as it is added to the terms 1 and 2, it is responsible for introducing the density oscillations producing the fringe pattern (schematized as horizontal lines in the overlap region). It plays a virtual role as long as the wave packets are well separated, but an essential one when they overlap, as it leads to quantum interference effects.

A similar situation has already been studied in § 2-f. Inserting (117) into relation (19), which is quadratic in $\Psi$, four contributions will come into play:
\[
W(\mathbf{r},\mathbf{p};t) =
W_1(\mathbf{r},\mathbf{p};t) + W_2(\mathbf{r},\mathbf{p};t)
+ W_{12}(\mathbf{r},\mathbf{p};t) + W_{21}(\mathbf{r},\mathbf{p};t)
\tag{118}
\]

In this equality, $W_1(\mathbf{r},\mathbf{p};t)$ is obtained when we replace in (19) the functions $\Psi(\mathbf{r},t)$ and $\Psi^*(\mathbf{r},t)$ by $\Psi_1(\mathbf{r},t)$ and $\Psi_1^*(\mathbf{r},t)$ respectively. The contribution $W_2(\mathbf{r},\mathbf{p};t)$ is obtained by replacing them by $\Psi_2(\mathbf{r},t)$ and $\Psi_2^*(\mathbf{r},t)$ respectively. Finally, the "crossed" contributions $W_{12}(\mathbf{r},\mathbf{p};t)$ and $W_{21}(\mathbf{r},\mathbf{p};t)$ come from replacing $\Psi(\mathbf{r},t)$ by $\Psi_1(\mathbf{r},t)$ and $\Psi^*(\mathbf{r},t)$ by $\Psi_2^*(\mathbf{r},t)$, and conversely. For example, relation (19) leads to:
\[
W_{12}(\mathbf{r},\mathbf{p};t) = \frac{1}{(2\pi\hbar)^3}\int d^3y\; e^{-i\mathbf{p}\cdot\mathbf{y}/\hbar}\,
\Psi_1\!\left(\mathbf{r}+\frac{\mathbf{y}}{2};t\right)
\Psi_2^*\!\left(\mathbf{r}-\frac{\mathbf{y}}{2};t\right)
\tag{119}
\]
whereas the equivalent relation (26) yields another expression as a function of the Fourier transforms $\bar{\Psi}_1$ and $\bar{\Psi}_2$. It can easily be shown that the two distributions $W_{12}(\mathbf{r},\mathbf{p};t)$ and $W_{21}(\mathbf{r},\mathbf{p};t)$ are complex conjugates of each other. Their sum is real, as is, consequently, $W(\mathbf{r},\mathbf{p};t)$.

As an example, imagine that the two wave packets are Gaussian, as were the wave packets studied in § 2-d. We saw in Complement G_I that a Gaussian wave packet, as it propagates in free space, remains Gaussian at all times; its momentum dispersion remains constant, while its spatial width changes with time. For the sake of simplicity, we shall consider a one-dimensional problem and will not explicitly write the time dependence. We assume one of the wave packets to be centered at $+x_0$, and the other at $-x_0$. Relation (41) then leads to:
\[
\bar{\Psi}_1(p) = \frac{1}{\sqrt{2}}\left(\frac{a^2}{2\pi\hbar^2}\right)^{1/4}
e^{-a^2 p^2/4\hbar^2}\; e^{-i p x_0/\hbar}
\]
\[
\bar{\Psi}_2(p) = \frac{1}{\sqrt{2}}\left(\frac{a^2}{2\pi\hbar^2}\right)^{1/4}
e^{-a^2 p^2/4\hbar^2}\; e^{+i p x_0/\hbar}
\tag{120}
\]

(a factor $1/\sqrt{2}$ has been added to ensure the normalization of the total wave function; we assume $x_0 \gg a$, so that the spatial overlap of the two wave packets is negligible, and the squared norm of the sum is the sum of the squared norms). The same computation as in § 2-d then yields:
\[
W_1(x,p) = \frac{1}{2\pi\hbar}\; e^{-2(x-x_0)^2/a^2}\; e^{-a^2 p^2/2\hbar^2}
\]
\[
W_2(x,p) = \frac{1}{2\pi\hbar}\; e^{-2(x+x_0)^2/a^2}\; e^{-a^2 p^2/2\hbar^2}
\tag{121}
\]
As for the crossed contributions, the computation is slightly different. Since the two lines of relation (120) have different signs in front of $x_0$, the product $\bar{\Psi}_1(p+q/2)\,\bar{\Psi}_2^*(p-q/2)$ contains the exponential $e^{-2ipx_0/\hbar}$, whereas the product $\bar{\Psi}_2(p+q/2)\,\bar{\Psi}_1^*(p-q/2)$ contains $e^{+2ipx_0/\hbar}$. The computation of (42) then becomes:
\[
W_{12}(x,p) + W_{21}(x,p)
= \frac{a}{(2\pi)^{3/2}\hbar^2}\,\frac{1}{2}\int dq\; e^{iqx/\hbar}\,
e^{-a^2\left[(p+q/2)^2+(p-q/2)^2\right]/4\hbar^2}
\left[e^{-2ipx_0/\hbar} + e^{+2ipx_0/\hbar}\right]
\]
\[
= \frac{a}{(2\pi)^{3/2}\hbar^2}\,\cos\!\left(\frac{2px_0}{\hbar}\right)
e^{-a^2p^2/2\hbar^2}\int dq\; e^{iqx/\hbar}\, e^{-a^2q^2/8\hbar^2}
\tag{122}
\]
or else:
\[
W_{12}(x,p) + W_{21}(x,p)
= \frac{1}{\pi\hbar}\,\cos\!\left(\frac{2px_0}{\hbar}\right)
e^{-a^2p^2/2\hbar^2}\; e^{-2x^2/a^2}
\tag{123}
\]
Finally, the total Wigner transform is:
\[
W(x,p) = \frac{1}{2\pi\hbar}\; e^{-a^2p^2/2\hbar^2}
\left[ e^{-2(x-x_0)^2/a^2} + e^{-2(x+x_0)^2/a^2}
+ 2\cos\!\left(\frac{2px_0}{\hbar}\right) e^{-2x^2/a^2} \right]
\tag{124}
\]
The first two terms in the bracket are easy to understand: they are simply half the sum of the Wigner transforms associated with each of the wave packets. Each of these two terms is centered on the wave packet it corresponds to, that is at $x = \pm x_0$. The third term is the crossed term, which corresponds to an interference between the two wave packets, and is centered at $x = 0$, half way between them. In addition, this term oscillates as a function of $p$ with a frequency proportional to the distance between the two wave packets.
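Expression (124) is easy to explore numerically. The sketch below (parameter values are arbitrary, chosen with $x_0 \gg a$) evaluates it directly and shows that at mid-distance, $x = 0$, only the crossed term survives: it oscillates in $p$, takes negative values, and gives a negligible contribution once integrated over the momentum.

```python
# Illustrative sketch: the crossed ("ghost") term of the total Wigner transform
# (124) for two well separated Gaussian wave packets centered at +x0 and -x0.
import numpy as np

hbar, a, x0 = 1.0, 1.0, 8.0                   # width a and half-separation x0 >> a

def W_total(x, p):
    gp = np.exp(-a**2 * p**2 / (2 * hbar**2)) / (2 * np.pi * hbar)
    w1 = np.exp(-2 * (x - x0)**2 / a**2)      # component localized on packet 1
    w2 = np.exp(-2 * (x + x0)**2 / a**2)      # component localized on packet 2
    ghost = 2 * np.cos(2 * p * x0 / hbar) * np.exp(-2 * x**2 / a**2)
    return gp * (w1 + w2 + ghost)

p = np.linspace(-4, 4, 4001)
dp = p[1] - p[0]
print(W_total(0.0, p[::500]))                 # oscillating, partly negative values
print(np.sum(W_total(0.0, p)) * dp)           # ~ 0: no probability of presence at x = 0
print(np.sum(W_total(x0, p)) * dp)            # > 0: probability density on top of packet 1
```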

5-b. General discussion; "ghost" component

The distribution $W_1(\mathbf{r},\mathbf{p};t)$ propagates as if it were the distribution of a free particle described by the single wave packet $\Psi_1(\mathbf{r},t)$; the distribution $W_2(\mathbf{r},\mathbf{p};t)$ corresponds to the second wave packet, here again as if it were isolated. If these were the only contributions, when the two wave packets overlap these two Wigner distributions would simply add to each other, since they follow a classical evolution; no quantum interference effects would result from this addition. However, we saw that in (118) we must also include crossed terms (interference terms) whose properties are radically different from the first two terms. A first significant difference comes from their oscillations as a function of momentum, which necessarily involve positive and negative values of the distribution. This is definitely a quantum effect, since a classical distribution must always be positive or zero. Another difference is that this crossed term in the Wigner transform propagates in a region of space where the wave function is zero, and consequently cannot correspond to any probability of the particle's presence; the integral over momentum of the last term on the right-hand side of (124) is indeed zero (in the limit $x_0 \gg a$ of well separated wave packets corresponding to the assumption made for our computation). The sum $W_{12}(x,p) + W_{21}(x,p)$ is sometimes called the "ghost component" of the Wigner distribution (or sometimes, in quantum optics, the "tamasic component"); when measuring the particle's position, it can never be found in this component⁴. Its value is always real, but not necessarily positive, because of its oscillations.

This means that, as long as the two wave packets 1 and 2 are well separated, the Wigner transform associated with the particle is the sum of three independent components: two components separately associated with each wave packet and propagating with them; one "ghost component", also propagating but remaining at mid-distance from the two wave packets. However, when the wave packets meet in the overlap region, the three components of the Wigner transform overlap in space. The ghost component, which has a changing sign, combines with the other two components to modulate the particle's probability of presence, hence producing the interference pattern predicted by quantum mechanics. In a certain sense, one can say that the ghost component carries the quantum effects associated with the particle.

⁴ It is also known as the "empty component", stressing the fact that this component contains no particle.

Conclusion

Quantum mechanics and classical mechanics are two very different theories. It was not obvious that, using the Wigner transforms, one could write the quantum equations in a form so akin to the classical equations of a distribution in phase space. Furthermore, we showed that any real classical function of position and momentum could be used in this formalism to generate a Hermitian operator acting in state space. In the limit where $\hbar \to 0$, the quantum equations of motion lead to the same Poisson brackets as the classical equations; quantum and classical theories then show strong similarities. Quantum effects, however, can manifest themselves in several ways:

- the evolution of the Wigner distribution can be significantly different from the classical evolution when the potentials vary rapidly on a scale on the order of the de Broglie wavelength, as higher order terms in the gradient expansion become essential.

- the Wigner transform is not always positive. We saw an example of this with the ghost component in an interference experiment, which, in a manner of speaking, carries the quantum effects to the usual components.

- whereas in classical physics any distribution in phase space, as long as it is positive and normalized, can be accepted, this is no longer the case in quantum mechanics. The only acceptable Wigner distributions are those which correspond to a density operator that is positive definite, a condition that is not expressed simply in terms of the distribution.

The Wigner transformation is frequently used in quantum physics. We already mentioned that it was introduced in 1932, while studying quantum corrections to thermal equilibrium [98]. It probably plays an even more important role in the study of transport properties, where Boltzmann type equations contain simultaneous information on particles' positions and momenta. Furthermore, the Wigner transform is also useful for understanding and characterizing quantum effects, as its negativity in certain regions of phase space is a sensitive indicator of the existence of such effects. One can even use the Wigner transforms to introduce a "phase space formulation of quantum mechanics" [100, 101], totally equivalent to the usual formalism in terms of state space and operators, and which is a real quantization procedure. In a general way, the Wigner transformation belongs to the class of the so-called Liouville formulations of quantum mechanics [105], which have many uses.

Finally, there are many domains of physics (such as signal processing, in particular) in which the Wigner transformation is part of a larger class of mixed time-frequency transformations. Numerous types of such transformations exist (such as sliding window or envelope transforms, wavelets, etc.), chosen to best fit the problem at hand. Even in quantum mechanics there are other quasi-classical transforms, beside the Wigner transform, as for example the Husimi or the Kirkwood transforms, or the Glauber transform expressed in terms of creation and annihilation operators of the electromagnetic field; a review on that subject can be found in [99]. The Wigner transform still remains one of the most useful transforms, allowing, in particular, analytical calculations for many interesting cases.
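The negativity mentioned above is easy to exhibit on the simplest example. As a small numerical illustration (units $m=\omega=\hbar=1$, quadrature grid arbitrary), for the first excited state of a one-dimensional harmonic oscillator, an odd wave function, the value of the Wigner transform at the phase space origin, $W(0,0) = (1/2\pi\hbar)\int dy\;\psi(y/2)\,\psi^*(-y/2)$, equals $-1/\pi\hbar$ and is therefore negative.

```python
# Illustrative sketch: the Wigner transform of the first excited state of a
# harmonic oscillator takes the negative value -1/(pi*hbar) at the origin.
import numpy as np

hbar = 1.0
y = np.linspace(-20.0, 20.0, 20001)
dy = y[1] - y[0]

def psi1(u):
    # first excited harmonic oscillator state, m = omega = hbar = 1
    return (1 / np.pi) ** 0.25 * np.sqrt(2.0) * u * np.exp(-u**2 / 2)

W00 = np.sum(psi1(y / 2) * np.conj(psi1(-y / 2))) * dy / (2 * np.pi * hbar)
print(W00.real, "  expected:", -1 / (np.pi * hbar))     # both ~ -0.318
```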


Bibliography of volume III [1] C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum mechanics, Volume I, Wiley (1977). [2] C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum mechanics, Volume II, Wiley (1977). 1591 [3] R.K. Pathria, Statistical mechanics, Pergamon press (1972). 2296 [4] E.J. Mueller, Tin-Lun Ho, M. Ueda and G. Baym, “Fragmentation of Bose-Einstein condensates”, Phys. Rev. A 74, 033612 (2006). 1656 [5] J.P. Blaizot et G. Ripka, Quantum theory of finite systems, the MIT Press (1986). 1678, 1701, 1809 [6] Wikipedia, “Density functional theory”, https://en.wikipedia.org/wiki/Density_functional_theory 1699 [7] L.P. Kadanoff et G. Baym, Quantum statistical mechanics, Benjamin (1976). 1798 [8] A.J. Leggett, Quantum liquids, Oxford University Press, 2006. 1820, 1890, 1926 [9] J. Bardeen, L.N. Cooper and J.R. Schrieffer, “Theory of superconductivity”, Phys. Rev. 108, 1175-1204 (1957). 1889 [10] W. Ketterle and M. Zwierlein, “Making, probing and understanding ultracold Fermi gases” in Proceedings of the international school of physics Enrico Fermi, Course CLXIV, Varenna, edited by M. Inguscio, W. Ketterle and C. Salomon, IOS Press (Amsterdam), 2008; arXiv:0801.2500v1. 1926 [11] W. Zwerger, “The BCS-BEC crossover and the unitary Fermi gas”, Springer, 2012. 1926 [12] M. Tinkham, “Introduction to superconductivity”, Dover books on physics, 2004. 1926 [13] R.D. Parks, “Superconductivity”, Volume 1 and 2, Dekker, 1969. 1926 [14] M. Combescot and S-Y Shiau, “Excitons and Cooper pairs”, Oxford Univedrsity Press, 2016. 1926 [15] L. Pitaevskii and S. Stringari, “Bose-Einstein condensation and superfluidity”, Oxford University Press, 2016. 1944 Quantum Mechanics, Volume III, First Edition. C. Cohen-Tannoudji, B. Diu, and F. Laloë. © 2020 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2020 by Wiley-VCH Verlag GmbH & Co. KGaA.


[16] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Photons and atoms, introduction to quantum electrodynamics, Wiley (1997). 1960, 1963, 1968, 1980, 1990, 2007, 2008, 2014, 2053, 2063 [17] T.A. Welton, “Some observable effects of the quantum-mechanical fluctuations of the electromagnetic field”, Phys. Rev. 74, 1157-1167 (1948). 2008 [18] J.V. Prodan, W. D. Phillips and H. Metcalf, “Laser production of a very slow monoenergetic atomic beam”, Phys. Rev. Lett. 49, 1149-1153 (1982). 2025 [19] T.W. Hänsch and A. Schawlow, “Cooling of gases by laser radiation”Opt. Comm. 13, 68-69 (1975). 2026 [20] D.J. Wineland and H. Dehmelt, “Proposed 1014 laser fluorescence spectroscopy on + mono-ion oscillator III (sideband cooling)”, Bull. Am. Phys. Soc. 20, 637 (1975). 2026 [21] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Atom-Photon Interactions. Basic Processes and Applications, Wiley-Interscience (1992). 2028, 2107, 2117, 2130, 2136, 2145 [22] Special issue on laser cooling and trapping, JOSA B, Optical Physics, 6, Number 11 (1989). 2034 [23] W. Ketterle and N.L Van Druten, “Evaporative cooling of trapped atoms”, Advances in atomic, molecular, and optical physics, 37, 181-236 (1996). 2034 [24] C. Cohen-Tannoudji and D. Guéry-Odelin, Advances in atomic physics. An Overview, World Scientific, Singapour (2011). 1779, 2034, 2036, 2039, 2065, 2127, 2152, 2153 [25] W.E. Lamb, “Capture of neutrons by atoms in a crystal”, Phys. Rev. 55, 190-197 (1939). 2040 [26] R.V. Pound and G. A. Rebka Jr., “Apparent weight of photons”, Phys. Rev. Lett. 4, 337-341 (1960). 2040 [27] L.S. Vasilenko,V. P. Chebotayev and A. V. Shishaev, “Line shape of two-photon absorption in a standing-wave field in a gas”, JETP Lett. 12, 113-116 (1970). 2041 [28] B. Cagnac, G. Grynberg and F. Biraben, “Spectroscopie d’absorption multiphotonique sans effet Doppler”, J. Phys. (Paris) 34, 845-858 (1973). 2041 [29] T.W. Hänsch, “Passion for precision”, Rev. Mod. Phys. 78, 1297-1309 (2006). 2041 [30] A. Kastler, “Projet d’expérience sur le moment cinétique de la lumière”, Société des Sciences physiques et naturelles de Bordeaux, Jan. 28 (1932). 2052 [31] R.A. Beth, “Mechanical Detection and Measurement of the Angular Momentum of Light”, Phys. Rev. 50, 115-125 (1936). 2052 [32] J.W. Simmons and M.J. Guttmann, States, waves and photons: a modern introduction to light, Addison-Wesley (1970), Chap. 9. 2052 2326


[33] A.M. Yao and M. J. Padgett, “Orbital angular momentum: origins, behavior and applications”, Advances in Optics and Photonics, IOP Publishing 3, 161-204 (2011). 2052 [34] J.D. Jackson, Classical electrodynamics, 3rd ed., Wiley (1999). 2053, 2063 [35] J. Brossel et A. Kastler, “La détection de la résonance magnétique des niveaux excités ”, C. R. Acad. Sci. 229, 1213 (1949). 2059 [36] J. Brossel, and F. Bitter, “A new ‘Double Resonance’ method for investigating atomic energy levels. Application to Hg 3 1 ”, Phys. Rev. 86, 308 (1952). 2059, 2061 [37] J.N. Dodd, W. N. Fox, G. W. Series and M. J. Taylor, “Light beats as indicators of structure of atomic energy levels”, Proc. Phys. Soc. 74, 789 (1959). 2061 [38] E. Majorana, “Atomi orientati in campo magnetico variabile”, Nuovo Cimento 9, 43-50 (1932). 2061 [39] A. Kastler, “Quelques suggestions concernant la production optique et la détection optique d’une inégalité de population des niveaux de quantifigation spatiale des atomes. Application à l’expérience de Stern et Gerlach et à la résonance magnétique”, J. Phys. Radium 11, 255-265 (1950). 2062 [40] N. F. Ramsey, Molecular beams, Oxford University Press (1956). 2064 [41] L. Allen, S.M. Barnett and M.J. Padgett, Optical angular momentum, IOP Publishing (2003). 2065 [42] M.F. Andersen, C. Ryu, P. Cladé, V. Natarajan, A. Vaziri, K. Helmerson, and W.D. Phillips, “Quantized rotation of atoms from photons with orbital angular momentum” Phys. Rev. Lett. 97, 170406 (2006). 2066 [43] A. Einstein, “Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt”, Annalen der Physik 17, 132-149 (1905). 2110 [44] R.A. Millikan, “On the elementary electric charge and the Avogadro constant”, Physical Review 2, 109-143 (1913). 2111 [45] W.E. Lamb and M.O. Scully, “The photoelectric effect without photons”, in Polarisation, Matière et Rayonnement, (Presses Universitaires de France), Jubilee volume in honour of Alfred Kastler, p. 363 (1969). 2110 [46] E. Hanbury Brown and R.Q. Twiss, “A test of a new type of stellar interferometer on Sirius”, Nature, 178, 1046-1048 (1956). 2120 [47] P. Grangier, G. Roger and A. Aspect, “Experimental evidence for a photon anticorrelation effect on a beam splitter: a new light on single-photon interferences”, Europhys. Lett. 1, 173-179 (1986). 2121 [48] J.T. Höffges, H.W. Baldauf, W. Lange and H. Walther “Heterodyne measurements of the resonance fluorescence of a single ion”, J. of Mod. Optics 44, 1999-2010 (1997). 2121, 2122 2327


[49] H.J. Kimble, M. Dagenais and L. Mandel, “Photon antibunching in resonance fluorescence”, Phys. Rev. Lett. 39, 691 (1977). 2121 [50] S. Pancharatnam, “Light shifts in semiclassical dispersion theory”, J. Opt. Soc. Am. 56, 1636 (1966). 2139 [51] C. Cohen-Tannoudji, “Théorie quantique du cycle de pompage optique. Vérification expérimentale des nouveaux effets prévus”, Ann. Phys. 13, 423-461 and 469504(1962). 2140 [52] C. Cohen-Tannoudji and J. Dupont-Roc, “Experimental study of light shifts in weak magnetic fields”, Phys. Rev. A 5, 968-984 (1972). 2140 [53] C. Cohen-Tannoudji, “Observation d’un déplacement de raie de résonance magnétique causé par l’excitation optique”, C. R. Acad. Sci. 252, 394-396 (1961). 2140, 2141 [54] B.R. Mollow, “Power spectrum of light scattered by two-level systems”, Phys. Rev. 188, 1969-1975 (1969). 2144 [55] S.H. Autler and C. H. Townes, “Stark effect in rapidly varying fields”, Phys. Rev. 100, 703-722 (1955). 2144 [56] S. Chu, J. Bjorkholm, A. Ashkin, and A. Cable, “Experimental observation of optically trapped atoms”, Phys. Rev. Lett. 57, 314-317 (1986). 2152 [57] R.J. Cook and R.K. Hill, “An electromagnetic mirror for neutral atoms”, Opt. Commun. 43, 258-260 (1982). 2153 [58] M.Greiner, O. Mandel, T. Esslinger, T.W. Hänsch and I. Bloch, I. Nature 415, “Quantum phase transition from a superfluid to a Mott insulator in a gas of ultracold atoms”, 39-44 (2002). 2154 [59] M. Ben Dahan, E. Peik, J. Reichel, Y. Castin and C. Salomon, “Bloch oscillations of atoms in an optical potential”, Phys. Rev. Lett. 76, 4508-4511 (1996). 2155 [60] P.D. Lett, R.N. Watts, C.I. Westbrook, W.D. Phillips, P.L. Gould and H.J. Metcalf, “Observation of atoms laser coooled below the Doppler limit”, Phys. Rev. Lett. 61, 169-172 (1988). 2155 [61] J. Dalibard and C. Cohen-Tannoudji, “Laser cooling below the Doppler limit by polarization gradients: simple theoretical models”, J. Opt. Soc. Am. B 6, 2023-2045 (1989). 2159 [62] P.J. Ungar, D.S.Weiss, E. Riis and S. Chu, “Optical molasses and multilevel atoms: theory”, J. Opt. Soc. Am. B 6, 2058-2071 (1989). 2159 [63] C. Salomon, J. Dalibard, W.D. Phillips, A. Clairon and S. Guellati, Europhys. Lett., “Laser cooling of Cesium atoms below 3 K”, 12, 683-688 (1990). 2159 [64] S. Gleyzes, S. Kuhr, C. Guerlin, J. Bernu, S. Deléglise, U.B. Hoff, M. Brune, J-M. Raimond and S. Haroche, “Quantum jumps of light recording the birth and death of a photon in a cavity”, Nature 446, 297-300 (2007). 2160 2328


[65] G. Grynberg, A. Aspect et C. Fabre, Introduction to quantum optics, with contributions from F. Bretenaker and A. Browaeys, Cambridge University Press (2010). 2186 [66] D.F. Walls and G.J. Milburn, Quantum optics, Springer (1994). 2186 [67] E. Schrödinger, “Discussion of probability relations between separated systems”, Proc. Cambridge Phil. Soc. 31, 555 (1935); “Probability relations between separated systems”, Proc. Cambridge Phil. Soc. 32, 446 (1936). 2190 [68] F. Laloë, Do we really understand quantum mechanics? , Cambridge University Press (2012); second expanded edition (2019). 2202, 2213, 2330 [69] A. Einstein, B. Podolsky and N. Rosen, “Can quantum-mechanical description of physical reality be considered complete?”, Phys. Rev. 47, 777–780 (1935); Quantum Theory of Measurement, J.A. Wheeler and W.H. Zurek eds., Princeton University Press (1983), pp. 138–141. 2204, 2207 [70] A. Einstein, “Quantenmechanik und Wirklichkeit”, Dialectica 2, 320–324 (1948). 2207 [71] N. Bohr, “Can quantum-mechanical description of physical reality be considered complete?”, Phys. Rev. 48, 696–702 (1935). 2207 [72] N. Bohr, “On the notions of causality and complementarity”, Dialectica 2, 312–319 (1948); Science, New Series 111, 51-54 (Jan20, 1950). 2207 [73] J.S. Bell, “On the problem of hidden variables in quantum mechanics”, Rev. Mod. Phys. 38, 447–452 (1966); reprinted in Quantum Theory and Measurement, J.A. Wheeler and W.H. Zurek editors, Princeton University Press (1983), 396–402 and in chapter 1 of [74]. 2208 [74] J.S. Bell, Speakable and Unspeakable in Quantum Mechanics, Cambridge University Press (1987); second augmented edition (2004), which contains the complete set of J. Bell’s articles on quantum mechanics. 2329 [75] A. Peres, “Unperformed experiments have no results”, Am. J. Phys. 46, 745–747 (1978). 2212 [76] J.A. Wheeler, “Niels Bohr in today’s words” in Quantum Theory and Measurement, J.A. Wheeler and W.H. Zurek editors, Princeton University Press (1983), pp. 182– 213. 2212 [77] A list of references describing Bell’s experiments can be found in: A. Aspect, “Closing the door on Einstein and Bohr’s quantum debate”, Physics 8, 123 (2015). 2212 [78] D. Mermin, Quantum computer science, Cambridge University Press (2007). 2212 [79] N. Gisin, G. Ribordy, W. Tittel and H. Zbinden, “Quantum cryptography”, Rev. Mod. Phys. 74, 145-195 (2002). 2213 [80] V. Coffman, J. Kundu and W.K. Wootters, “Distributed entanglement”, Phys. Rev. A 61, 052306 (2000). 2223 2329


[81] R.F. Werner, “Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden variable model”, Phys. Rev. A 40, 4277-4281 (1989). 2224 [82] A. Peres, “Separability criterion for density matrices”, Phys. Rev. Lett. 77, 14131415 (1996). 2224 [83] D.M. Greenberger, M.A. Horne and A. Zeilinger, “Going beyond Bell’s theorem”, pp. 69-72 in Bell’s theorem, quantum theory and conceptions of the Universe, M. Kafatos (ed.), Kluwer Academic Publishers (1989). 2227 [84] D.M. Greenberger, M.A. Horne, A. Shimony, and A. Zeilinger, “Bell’s theorem without inequalities”, Am. J. Phys. 58, 1131-1143 (1990). 2227 [85] D. Bouwmeester, J.W. Pan, M. Daniell, H. Weinfurter, and A. Zeilinger, “Observation of three-photon Greenberger-Horne-Zeilinger entanglement”, Phys. Rev. Lett. 82, 1345-1349 (1999). 2231 [86] see for example § 5-1-2 of [68]. 2231 [87] J.W. Pan, D. Bouwmeester, H. Weinfurter, and A. Zeilinger, “Experimental entanglement swapping: entangling photons that never interacted”, Phys. Rev. Lett. 80, 3891-3894 (1998). 2235 [88] B. Hensen, H. Bernien, A.E. Dréau, A. Reiserer, N. Kalb, M.S. Blok, J. Ruitenberg, R.F. Vermeulen, R.N. Schouten, C. Abellan, W. Amaya, V. Pruneri, M.W. Mitchell, M. Markham, D.J. Twitchen, D. Elkouss, S. Wehner, T.H. Taminiau and R. Hanson, “Experimental loophole-free violation of a Bell inequality using electron spins separated by 1.3 km”, Nature 526, 682-686 (2015). 2235 [89] M.R. Andrews, C.G. Townsend, H.J. Miesner, D.S. Durfee, D.M. Kurn and W. Ketterle, “Observation of interference between two Bose condensates”, Science 275, 637-641 (1997). 2238 [90] F. Laloë, “The hidden phase of Fock states, quantum non-local effects”, Eur. Phys. J. D 33, 87-97 (2005); F. Laloë and W.J. Mullin, “Non-local quantum effects with Bose-Einstein condensates”, Phys. Rev. Lett. 99, 150401 (2007). 2262, 2263, 2265 [91] R.P. Feynman and A.R. Hibbs, Quantum mechanics and path integrals, McGraw Hill (1965). 2267 [92] J. Zinn-Justin, Intégrale de chemin en mécanique quantique: Introduction, CNRS Editions et EDP Sciences (2003). 2267, 2280 [93] M. Le Bellac, Physique quantique, EDP Sciences et CNRS Editions (2013), tome II, chapitre 12. 2267 [94] D.M. Ceperley, “Path integrals in the theory of condensed helium”, Rev. Modern Physics, 67, 279-355 (1995). 2280 [95] B. Diu, C. Guthmann, D. Lederer et B. Roulet, Physique statistique, Hermann (1989). 2296 [96] K. Huang, Statistical mechanics, Wiley (1963). 2296 2330


[97] F. Reif, Fundamental of statistical and thermal physics, McGraw-Hill (1965). 2296 [98] E. Wigner, “On the quantum correction for thermodynamic equilibrium”, Phys. Rev. 40, 749-759 (1932). 2298, 2323 [99] M. Hillery, R.F. O’Connell, M.O. Scully and E.P. Wigner, “Distribution functions in physics; fundamentals”, Physics Reports, 106, 121-167 (1984). 2298, 2315, 2323 [100] A. Perelomov, “Generalized coherent states and their applications”, Springer (1986); see in particular Chap. 16. 2311, 2323 [101] C.K. Zachos, D. Fairlie and T.L. Cutright, Quantum mechanics in phase space, World Scientific, Singapore (2005); with the same title, Asia Pacific Newsletters, 01, 37-46 (2012) or ArXiv:1104.5269v2. 2311, 2323 [102] G.G. Athanasiu and E.G. Fioratos, “Coherent states in finite quantum mechanics”, Nuclear Physics B 425, 343-364 (1994). 2311 [103] L. Landau and E. Lifchitz, Mechanics, Course of theoretical physics, Vol. I, C § 42, Pergamon Press (1960) and Elsevier Butterworth-Heinemann (1976). 2314 [104] H. Goldstein, C.P. Poole et J.L. Safko, Classical mechanics, Addison-Wesley (2001). 2314 [105] N.L. Balazs and B.K. Jennings, “Wigner’s function and other distribution functions in mock phase spaces”, Physics Reports, 104, 347-391 (1984). 2323


Index

[The notation (ex.) refers to an exercise]

Absorption and emission of photons, 2073 collision with, 971 of a quantum, a photon, 1311, 1353 of field, 2149 of several photons, 1368 rates, 1334 Acceptor (electron acceptor), 1495 Acetylene (molecule), 878 Action, 341, 1539, 1980 Addition of angular momenta, 1015, 1043 of spherical harmonics, 1059 of two spins 1/2, 1019 Adiabatic branching of the potential, 932 Adjoint matrix, 123 operator, 112 Algebra (commutators), 165 Allowed energy band, 381, 1481, 1491 Ammonia (molecule), 469, 873 Amplitude scattering amplitude, 929, 953 Angle (quantum), 2258 Angular momentum addition of momenta, 1015, 1043 and rotations, 717 classical, 1529 commutation relations, 669, 725 conservation, 668, 736, 1016 coupling, 1016 electromagnetic field, 1968, 2043 half-integral, 987 of identical particles, 1497(ex.) of photons, 1370 orbital, 667, 669, 685 quantization, 394 quantum, 667 spin, 987, 991 standard representation, 677, 691 two coupled momenta, 1091 Anharmonic oscillator, 502, 1135 Annihilation operator, 504, 513, 514, 1597

Annihilation-creation (pair), 1831, 1878 Anomalous average value, 1828, 1852 dispersion, 2149 Zeeman effect, 987 Anti-normal correlation function, 1782, 1789 Anti-resonant term, 1312 Anti-Stokes (Raman line), 532, 752 Antibunching (photon), 2121 Anticommutation, 1599 field operator, 1754 Anticrossing of levels, 415, 482 Antisymmetric ket, state, 1428, 1431 Antisymmetrizer, 1428, 1431 Applications of the perturbation theory, 1231 Approximation central field approximation, 1459 secular approximation, 1374 Argument (EPR), 2205 Atom(s), see helium, hydrogenoid donor, 837 dressed, 2129, 2133 many-electron atoms, 1459, 1467 mirrors for atoms, 2153 muonic atom, 541 single atom fluorescence, 2121 Atomic beam (deceleration), 2025 orbital, 869, 1496(ex.) parameters, 41 Attractive bosons, 1747 Autler-Townes doublet, 2144 effect, 1410 Autoionization, 1468 Average value (anomalous), 1828 Azimuthal quantum number, 811 Band (energy), 381 Bardeen-Cooper-Schrieffer, 1889 Barrier (potential barrier), 68, 367, 373 Basis 2333




change of bases, 174 characteristic relations, 101, 119 continuous basis in the space of states, 99 mixed basis in the space of states, 99 BCHSH inequalities, 2209, 2210 BCS, 1889 broken pairs and excited pairs, 1920 coherent length, 1909 distribution functions, 1899 elementary excitations, 1923 excited states, 1919 gap, 1894, 1896, 1923 pairs (wave function of), 1901 phase locking, 1893, 1914, 1916 physical mechanism, 1914 two-particle distribution, 1901 Bell’s inequality, 2208 theorem, 2204, 2208 Benzene (molecule), 417, 495 Bessel Bessel-Parseval relation, 1507 spherical Bessel function, 944 spherical equation, 961 spherical function, 966 Biorthonormal decomposition, 2194 Bitter, 2059 Blackbody radiation, 651 Bloch equations, 463, 1358, 1361 theorem, 659 Bogolubov excitations, 1661 Hamiltonian, 1952 operator method, 1950 phonons, spectrum, 1660 transformation, 1950 Bogolubov-Valatin transformation, 1836, 1919 Bohr, 2207 electronic magneton, 856 frequencies, 249 magneton, see front cover pages model, 40, 819 nuclear magneton, 1237 radius, 820 2334

Boltzmann constant, see front cover pages distribution, 1630 Born approximation, 938, 977, 1320 Born-Oppenheimer approximation, 528, 1177, 1190 Born-von Karman conditions, 1490 Bose-Einstein condensation, 1446, 1638, 1940 condensation (repulsive bosons), 1933 condensation of pairs, 1857 distribution, 652, 1630 statistics, 1446 Bosons, 1434 at non-zero temperature, 1745 attractive, 1747 attractive instability, 1745 condensed, 1638 in a Fock state, 1775 paired, 1881 Boundary conditions (periodic), 1489 Bra, 103, 104, 119 Bragg reflection, 382 Brillouin formula, 452 zone, 614 Broadband detector, 2165 optical excitation, 1332 Broadening (radiative), 2138 Broken pairs and excited pairs (BCS), 1920 Brossel, 2059 Bunching of bosons, 1777 C.S.C.O., 133, 137, 153, 236 Canonical commutation relations, 142, 223, 1984 ensemble, 2289 Hamilton-Jacobi canonical equations, 214 Hamilton-Jacobi equations, 1532 Cauchy principal part, 1517 Center of mass, 812, 1528 Center of mass frame, 814 Central



field approximation, 1459 Commutation, 1599 canonical relations, 142, 223 potential, 1533 field operator, 1754 Central potential, 803, 841 of pair field operators, 1861 scattering, 941 relations, 1984 stationary states, 804 Commutation relations Centrifugal potential, 809, 888, 893 angular momentum, 669, 725 Chain (von Neumann), 2201 field, 1989, 1996 Chain of coupled harmonic oscillators, 611 Commutator algebra, 165 Change Commutator(s), 91, 167, 171, 187 of bases, 124, 174, 1601 of functions of operators, 168 of representation, 124 Compatibility of observables, 232 Characteristic equation, 129 Complementarity, 45 Characteristic relation of an orthonormal basis, 116 Complete set of commuting observables (C.S.C.O.), 133, 137, 236 Charged harmonic oscillator in an electric field, 575 Complex variables (Lagrangian), 1982 Charged particle Compton wavelength of the electron, 825, 1235 in an electromagnetic field, 1536 Condensates Charged particle in a magnetic field, 240, 321, 771 relative phase, 2237 Chemical bond, 417, 869, 1189, 1210 with spins, 2254 Chemical potential, 1486, 2287 Condensation Circular quanta, 761, 783 BCS condensation energy, 1917 Classical Bose-Einstein, 1446, 1857, 1933 electrodynamics, 1957 Condensed bosons, 1638 histories, 2272 Conduction band, 1492 Clebsch-Gordan coefficients, 1038, 1051 Conductivity (solid), 1492 Closure relation, 93, 117 Configurations, 1467 Coefficients Conjugate momentum, 214, 323, 1531, 1983, 1987, 1995 Clebsch-Gordan, 1038 Conjugation (Hermitian), 111 Einstein, 1334, 2083 Conservation Coherences (of the density matrix), 307 local conservation of probability, 238 Coherent length (BCS), 1909 of angular momentum, 668, 736, 1016 Coherent state (field), 2008 of energy, 248 Coherent superposition of states, 253, 301, 307 of probability, 237 Collision, 923 Conservative systems, 245, 315 between identical particles, 1454, 1497(ex.) Constants of the motion, 248, 317 between identical particles in classiContact term, 1273 cal mechanics, 1420 Contact term (Fermi), 1238, 1247 between two identical particles, 1450 Contextuality, 2231 cross section, 926 Continuous scattering states, 928 spectrum, 133, 219, 264, 1316 total scattering cross section, 926 variables (in a Lagrangian), 1984 with absorption, 971 Continuum of final states, 1316, 1378, Combination 1380 of atomic orbitals, 1172 Contractions, 1802 2335



Convolution product of two functions, 1510 Cooling Doppler, 2026 down atoms, 2025 evaporative, 2034 Sisyphus, 2034 sub-Doppler, 2155 subrecoil, 2034 Cooper model, 1927 Cooper pairs, 1927 Cooperative effects (BCS), 1916 Correlation functions, 1781, 1804 anti-normal, 1782, 1789 dipole and field, 2113 for one-photon processes, 2084 normal, 1782, 1787 of the field, spatial, 1758 Correlations, 2231 between two dipoles, 1157 between two physical systems, 296 classical and quantum, 2221 introduced by a collision, 1104 Coulomb field, 1962 gauge, 1965 Coulomb potential cross section, 979 Coupling between angular momenta, 1016 between two angular momenta, 1091 between two states, 412 effect on the eigenvalues, 438 spin-orbit coupling, 1234, 1241 Creation and annihilation operators, 504, 513, 514, 1596, 1990 Creation operator (pair of particles), 1813, 1846 Critical velocity, 1671 Cross section and phase shifts, 951 scattering cross section, 926, 933, 953, 972 Current metastable current in superfluid, 1667 of particles, 1758 of probability, 240 probability current in hydrogen atom, 2336

851 Cylindrical symmetry, 899(ex.) Darwin term, 1235, 1279 De Broglie relation, 10 wavelength, see front cover pages, 11, 35 Decay of a discrete state, 1378 Deceleration of an atomic beam, 2025 Decoherence, 2199 Decomposition (Schmidt), 2193 Decoupling (fine or hyperfine structure), 1262, 1291 Degeneracy essential, 811, 825, 845 exchange degeneracy, 1423 exchange degeneracy removal, 1435 lifted by a perturbation, 1125 rotation invariance, 1072 systematic and accidental, 203 Degenerate eigenvalue, 127, 203, 217, 260 Degereracy lifted by a perturbation, 1117 parity, 199 Delta Dirac function, 1515 potential well and barriers, 83–85(ex.) use in quantum mechanics, 97, 106, 280 Density Lagrangian, 1986 of probability, 264 of states, 389, 1316, 1484, 1488 operator, 449, 1391 operator and matrix, 299 particle density operator, 1756 Density functions one and two-particle, 1502(ex.) Depletion (quantum), 1940 Derivative of an operator, 169 Detection probability amplitude (photon), 2166 Detectors (photon), 2165 Determinant Slater determinant, 1438, 1679 Deuterium, 834, 1107(ex.) Diagonalization

INDEX

of a 2 2 matrix, 429 of an operator, 128 Diagram (dressed-atom), 2133 Diamagnetism, 855 Diatomic molecules rotation, 739 Diffusion (momentum), 2030 Dipole -dipole interaction, 1142, 1153 -dipole magnetic interaction, 1237 electric dipole transition, 863 electric moment, 1080 Hamiltonian, 2011 magnetic dipole moment, 1084 magnetic term, 1272 trap, 2151 Dirac, see Fermi delta function, 97, 106, 280, 1515 equation, 1233 notation, 102 Direct and exchange terms, 1613, 1632, 1634, 1646, 1650 term, 1447, 1453 Discrete bases of the state space, 91 spectrum, 132, 217 Dispersion (anomalous), 2149 Dispersion and absorption (field), 2147 Distribution Boltzmann, 1630 Bose-Einstein, 1630 Fermi-Dirac, 1630 function (bosons), 1629 function (fermions), 1629 functions, 1625, 1733 functions (BCS), 1899 Distribution law Bose-Einstein, 652 Divergence (energy), 2007 Donor atom, 837, 1495 Doppler cooling, 2026 effect, 2022 effect (relativistic), 2022 free spectroscopy, 2105 temperature, 2033

[The notation (ex.) refers to an exercise]

Double condensate, 2237 resonance method, 2059 spin condensate, 2254 Doublet (Autler-Townes), 2144 Down-conversion (parametric), 2181 Dressed states and energies, 2133 Dressed-atom, 2129, 2133 diagram, 2133 strong coupling, 2141 weak coupling, 2137 E.P.R., 1225(ex.) Eckart (Wigner-Eckart theorem), see Wigner Effect Autler-Townes, 2144 Mössbauer, 2040 photoelectric, 2110 Effective Hamiltonian, 2141 Ehrenfest theorem, 242, 319, 522 Eigenresult, 9 Eigenstate, 217, 232 Eigenvalue, 11, 25, 176, 216 degenerate, 217, 260 equation, 126, 429 of an operator, 126 Eigenvector, 176 of an operator, 126 Einstein, 2110 coefficients, 1334, 1356, 2083 EPR argument, 297, 1104 model, 534, 653 Planck-Einstein relations, 3 temperature, 659 Einstein-Podolsky-Rosen, 2204, 2261 Elastic scattering, 925 scattering (photon), 2086 scattering, form factor, 1411(ex.) total cross section, 972 Elastically bound electron model, 1350 Electric conductivity of a solid, 1492 Electric dipole Hamiltonian, 2011 interaction, 1342 2337

INDEX

[The notation (ex.) refers to an exercise]

matrix elements, 1344 moment, 1080 selection rules, 1345 transition and selection rules, 863 transitions, 2056 Electric field (quantized), 2000, 2005 Electric polarisability NH3 , 484 Electric polarizability of the 1 state in Hydrogen, 1299 Electric quadrupole Hamiltonian, 1347 moment, 1082 transitions, 1348 Electric susceptibility bound electron, 577 of an atom, 1351 Electrical susceptibility, 1223(ex.) Electrodynamics classical, 1957 quantum, 1997 Electromagnetic field and harmonic oscillators, 1968 and potentials, 321 angular momentum, 1968, 2043 energy, 1966 Lagrangian, 1986, 1992 momentum, 1967, 2019 polarization, 1970 quantization, 631, 637 Electromagnetic interaction of an atom with a wave, 1340 Electromagnetism fields and potentials, 1536 Electron spin, 393, 985 Electron(s) configurations, 1463 gas in solids, 1491 in solids, 1177, 1481 mass and charge, see front cover pages Electronic configuration, 1459 paramagnetic resonance, 1225(ex.) shell, 827 Elements of reality, 2205 Emergence of a relative phase, 2248, 2253 2338

Emission of a quantum, 1311 photon, 2080 spontaneous, 2081, 2135 stimulated (or induced), 2081 Energy, see Conservation, Uncertainty and momentum of the transverse electromagnetic field, 1973 band, 381 bands in solids, 1177, 1481 conservation, 248 electromagnetic field, 1966 Fermi energy, 1772 fine structure energy levels, 986 free energy, 2290 levels, 359 levels of harmonic oscillator, 509 levels of hydrogen, 823 of a paired state, 1869 recoil energy, 2023 Ensemble canonical, 2289 grand canonical, 2291 microcanonical, 2285 statistical ensembles, 2295 Entanglement quantum, 2187, 2193, 2203, 2242 swapping, 2232 Entropy, 2286 EPR, 2204, 2261 elements of reality, 2205 EPRB, 2205 paradox/argument, 1104 Equation of state ideal quantum gas, 1640 repulsive bosons, 1745 Equation(s) Bloch, 1361 Hamilton-Jacobi, 1982, 1983, 1988 Lagrange, 1982, 1993 Lorentz, 1959 Maxwell, 1959 Schrödinger, 11, 12, 306 von Neumann, 306 Essential degeneracy, 811, 825 Ethane (molecule), 1223 Ethylene (molecule), 536, 881

INDEX

Evanescent wave, 29, 67, 70, 78, 285 Evaporative cooling, 2034 Even operators, 196 Evolution field operator, 1765 of quantum systems, 223 of the mean value, 241 operator, 313, 2069 operator (expansion), 2070 operator (integral equation), 2069 Exchange, 1611 degeneracy, 1423 degeneracy removal, 1435 energy, 1469 hole, 1774 integral, 1474 term, 1447, 1451, 1453 Excitations BCS, 1923 Bogolubov, 1661 vacuum, 1623 Excited states (BCS), 1919 Exciton, 838 Exclusion principle (Pauli), 1437, 1444, 1463, 1484 Extensive (or intensive) variables, 2292 Fermi contact term, 1238 energy, 1445, 1481, 1486, 1772 gas, 1481 golden rule, 1318 level, 1486, 1621 radius, 1621 surface (modified), 1914 , see Fermi-Dirac Fermi level and electric conductivity, 1492 Fermi-Dirac distribution, 1486, 1630, 1717 statistics, 1446 Fermions, 1434 in a Fock state, 1771 paired, 1874 Ferromagnetism, 1477 Feynman path, 2267

[The notation (ex.) refers to an exercise]

postulates, 341 Fictitious spin, 435, 1359 Field absorption, 2149 commutation relations, 1989, 1996 dispersion and absorption, 2147 intense laser, 2126 interaction energy, 1764 kinetic energy, 1763 normal variables, 1971 operator, 1752 operator (evolution), 1763, 1765 pair field operator, 1861 potential energy, 1764 quantization, 1765, 1999 quasi-classical state, 2008 spatial correlation functions, 1758 Final states continuum, 1378, 1380 Fine and hyperfine structure, 1231 Fine structure constant, see front cover pages, 825 energy levels, 1478 Hamiltonian, 1233, 1276, 1478 Helium atom, 1478 Hydrogen, 1238 of spectral lines, 986 of the states 1 , 2 et 2 , 1276 Fletcher, 2111 Fluctuations boson occupation number, 1633 intensity, 2125 vacuum, 644, 2007 Fluorescence (single atom), 2121 Fluorescence triplet, 2144 Fock space, 1593, 2004 state, 1593, 1614, 1769, 2103 Forbidden, see Band energy band, 381, 390, 1481 transition, 1345 Forces van der Waals, 1151 Form factor elastic scattering, 1411(ex.) Forward scattering (direct and exchange), 1874 Fourier 2339

INDEX

[The notation (ex.) refers to an exercise]

series and transforms, 1505 Fragmentation (condensate), 1654, 1776 Free electrons in a box, 1481 energy, 2290 particle, 14 quantum field (Fock space), 2004 spherical wave, 941, 944, 961 spherical waves and plane waves, 967 Free particle stationary states with well-defined angular momentum, 959 stationary states with well-defined momentum, 19 wave packet, 14, 57, 347 Frequency Bohr, 249 components of the field (positive and negative), 2072 Rabi’s frequency, 1325 Friction (coefficient), 2028 Function of operators, 166 periodic functions, 1505 step functions, 1521 Fundamental state, 41 Gap (BCS), 1894, 1896, 1923 Gauge, 1343, 1536, 1960, 1963 Coulomb, 1965 invariance, 321 Lorenz, 1965 Gaussian wave packet, 57, 292, 2305 Generalized velocities, 214, 1530 Geometric quantization, 2311 Gerlach, see Stern GHZ state, 2222, 2227 Gibbs-Duhem relation, 2296 Golden rule (Fermi), 1318 Good quantum numbers, 248 Grand canonical, 1626, 2291 Grand potential, 1627, 1721, 2292 Green’s function, 337, 936, 1781, 1786, 1789 evolution, 1785 Greenberger-Horne-Zeilinger, 2227 2340

Groenewold’s formula, 2315 Gross-Pitaevskii equation, 1643, 1657 Ground state, 363 harmonic oscillator, 509, 520 Hydrogen atom, 1228(ex.) Group velocity, 55, 60, 614 Gyromagnetic ratio, 396, 455 orbital, 860 spin, 988 H+ 2 molecular ion, 85(ex.), 417, 1189 Hadronic atoms, 840 Hall effect, 1493 Hamilton function, 1532 function and equations, 1531 Hamilton-Jacobi canonical equations, 214, 1532, 1982, 1983, 1988 Hamiltonian, 223, 245, 1527, 1983, 1988, 1995 classical, 1531 effective, 2141 electric dipole, 1342, 2011 electric quadrupole, 1347 fine structure, 1233, 1276 hyperfine, 1237, 1267 magnetic dipolar, 1347 of a charged particle in a vector potential, 1539 of a particle in a central potential, 806, 1533 of a particle in a scalar potential, 225 of a particle in a vector potential, 225, 323, 328 Hanbury Brown and Twiss, 2120 Hanle effect, 1372(ex.) Hard sphere scattering, 980, 981(ex.) Harmonic oscillator, 497 in an electric field, 575 in one dimension, 527, 1131 in three dimensions, 569 in two dimensions, 755 infinite chain of coupled oscillators, 611 quasiclassical states, 583 thermodynamic equilibrium, 647
    three-dimensional, 841, 899(ex.)
    two coupled oscillators, 599
Hartree-Fock
    approximation, 1677, 1701
    density operator (one-particle), 1691
    equations, 1686, 1731
    for electrons, 1695
    mean field, 1677, 1693
    potential, 1706
    thermal equilibrium, 1711, 1733
    time-dependent, 1701, 1708
Healing length, 1652
Heaviside step function, 1521
Heisenberg
    picture, 317, 1763
    relations, 19, 39, 41, 45, 55, 232, 290
Helicity (photon), 2051
Helium
    energy levels, 1467
    ion, 838
    isotopes, 1480
    isotopes ³He and ⁴He, 1435, 1446
    solidification, 535
Hermite polynomials, 516, 547, 561
Hermitian
    conjugation, 111
    matrix, 124
    operator, 115, 124, 130
Histories (classical), 2272
Hole
    creation and annihilation, 1622
    exchange, 1774
Holes, 1621
Hybridization of atomic orbitals, 869
Hydrogen, 645
    atom, 803
    atom in a magnetic field, 853, 855, 862
    atom, relativistic energies, 1245
    Bohr model, 40, 819
    energy levels, 823
    fine and hyperfine structure, 1231
    ionization energy, 820, see front cover pages
    maser, 1251
    molecular ion, 85(ex.), 417, 1189
    quantum theory, 41
    radial equation, 821
    Stark effect, 1298
    stationary states, 851
    stationary wave functions, 830
Hydrogen-like systems in solid state physics, 837
Hydrogenoid systems, 833
Hyperfine
    decoupling, 1262
    Hamiltonian, 1237, 1267
Hyperfine structure, see Hydrogen, Muonium, Positronium, Zeeman effect, 1231
    Muonium, 1281
Ideal gas, 1625, 1787, 1791, 1804
    correlations, 1769
Identical particles, 1419, 1591
Induced
    emission, 1334, 1366, 2081
    emission of a quantum, 1311
    emission of photons, 1355
Inequality (Bell's), 2208
Infinite one-dimensional well, 271
Infinite potential well, 74
    in two dimensions, 201
Infinitesimal unitary operator, 178
Insulator, 1492
Integral
    exchange integral, 1474
    scattering equation, 935
Intense laser fields, 2126
Intensive (or extensive) variables, 2292
Interaction
    between magnetic dipoles, 1141
    dipole-dipole interaction, 1141, 1153
    electromagnetic interaction of an atom with a wave, 1340
    field and particles, 2009
    field and atom, 2010
    magnetic dipole-dipole interaction, 1237
    picture, 353, 1393, 2070
    tensor interaction, 1141
Interference
    photons, 2167
    two-photon, 2170, 2183
Ion H₂⁺, 1189
Ionization
    photo-ionization, 2109
    tunnel ionization, 2126
Isotropic radiation, 2079
Jacobi, see Hamilton
Kastler, 2059, 2062
Ket, see state, 103, 119
    for identical particles, 1436
Kuhn, see Thomas
Lagrange
    equations, 1530, 1982, 1993
    function and equations, 214
    multipliers, 2281
Lagrangian, 1530, 1980
    densities, 1986
    electromagnetic field, 1986, 1992
    formulation of quantum mechanics, 339
    of a charged particle in an electromagnetic field, 1538
    particle in an electromagnetic field, 323
Laguerre-Gaussian beams, 2065
Lamb shift, 645, 1245, 1388, 2008
Landau levels, 771
Landé factor, 1072, 1107(ex.), 1256, 1292
Laplacian, 1527
    of 1/r, 1524
    of Yₗᵐ(θ, φ)/rˡ⁺¹, 1526
Larmor
    angular frequency, 857
    precession, 394, 396, 410, 455, 857, 1071
Laser, 1359, 1365
    Raman laser, 2093
    saturation, 1370
    trap, 2151
Lattices (optical), 2153
Least action, principle of, 1539
Legendre
    associated function, 714
    polynomial, 713
Length (healing), 1652
Level
    anticrossing, 415, 482
    Fermi level, 1621
Lifetime, 343, 485, 645
    of a discrete state, 1386
    radiative, 2081
Lifting of degeneracy by a perturbation, 1125
Light
    quanta, 3
    shifts, 1334, 2138, 2151, 2156
Linear, see operator
    combination of atomic orbitals, 1172
    operators, 90, 108, 163
    response, 1350, 1357, 1364
    superposition of states, 253
    susceptibility, 1365
Local conservation of probability, 238
Local realism, 2209, 2230
Longitudinal
    fields, 1961
    relaxation, 1400
    relaxation time, 1401
Lorentz equations, 1959
Lorenz (gauge), 1965
Magnetic
    dipole term, 1272
    dipole-dipole interaction, 1237
    effect of a magnetic field on the levels of the Hydrogen atom, 1251
    hyperfine Hamiltonian, 1267
    interactions, 1232, 1237
    quantum number, 811
    resonance, 455
    susceptibility, 1224, 1487
Magnetic dipole
    Hamiltonian, 1347
    transitions and selection rules, 1084, 1098, 1348
Magnetic dipoles
    interactions between two dipoles, 1141
Magnetic field
    and vector potential, 321
    charged particle in a, 240, 771
    effects on hydrogen atom, 853, 855
    harmonic oscillator in a, 899(ex.)
    Hydrogen atom in a magnetic field, 1263, 1289
    multiplets, 1074
    quantized, 2000, 2005
Magnetism (spontaneous), 1737
Many-electron atoms, 1459
Maser, 477, 1359, 1365
    hydrogen, 1251
Mass correction (relativistic), 1234
Master equation, 1358
Matrices, 119, 121
    diagonalization of a 2 × 2 matrix, 429
    Pauli matrices, 425
    unitary matrix, 176
Maxwell's equations, 1959
Mean field (Hartree-Fock), 1693, 1708, 1725
Mean value of an observable, 228
    evolution, 241
Measurement
    general postulates, 216, 226
    ideal von Neumann measurement, 2196
    of a spin 1/2, 394
    of observables, 216
    on a part of a physical system, 293
    state after measurement, 221, 227
Mendeleev's table, 1463
Metastable superfluid flow, 1671
Methane (molecule), 883
Microcanonical ensemble, 2285
Millikan, 2111
Minimal wave packet, 290, 520, 591
Mirrors for atoms, 2153
Mixing of states, 1121, 1137
Model
    Cooper model, 1927
    Einstein model, 534
    elastically bound electron, 1350
    vector model of atom, 1071
Modes
    vibrational modes, 599, 611
Modes (radiation), 1974, 1975
Molecular ion, 417
Molecule(s)
    chemical bond, 417, 869, 873, 878, 883, 1189
    rotation, 796
    vibration, 527, 1137
    vibration-rotation, 885
Mollow, 2144
Moment
    quadrupole electric moment, 1225(ex.)
Momentum, 1539
    conjugate, 214, 323, 1983, 1987, 1995
    diffusion, 2030
    electromagnetic field, 1967, 2019
    mechanical momentum, 328
Monogamy (quantum), 2221
Mössbauer effect, 1415, 2040
Motional narrowing, 1323
    condition, 1323, 1398, 1408
Multiphoton transition, 1368, 2040, 2097
Multiplets, 1072, 1074, 1467
Multipliers (Lagrange), 2281
Multipolar waves, 2052
Multipole moments, 1077
Multipole operators
    introduction, 1077, 1083
    parity, 1082
Muon, 527, 541, 1281
Muonic atom, 541, 839
Muonium, 835
    hyperfine structure, 1281
    Zeeman effect, 1281
Narrowing (motional), 1323, 1408
    condition, 1398
Natural width, 345, 1388
Need for a quantum treatment, 2118, 2120
Neumann spherical function, 967
Neutron mass, see front cover pages
Non-destructive detection of a photon, 2159
Non-diagonal order (BCS), 1912
Non-locality, 2204
Non-resonant excitation, 1350
Non-separability, 2207
Nonlinear
    response, 1357, 1368
    susceptibility, 1369
Norm
    conservation, 238
    of a state vector, 104, 237
    of a wave function, 13, 90, 99
Normal
    correlation function, 1782, 1787
    variables, 602, 616, 631, 633
    variables (field), 1971
Nuclear
    multipole moments, 1088
    Bohr magneton, 1237
Nucleus
    spin, 1088
    volume effect, 1162, 1268
Number
    occupation number, 1439, 1593
    photon number, 2135
    total number of particles in an ideal gas, 1635
Observable(s), 130
    C.S.C.O., 133, 137
    commutation, 232
    compatibility, 232
    for identical particles, 1429, 1441
    mean value, 228
    measurement of, 216, 226
    quantization rules, 223
    symmetric observables, 1441
    transformation by permutation, 1434
    whose commutator is iħ, 187, 289
Occupation number, 1439, 1593
    operator, 1598
Odd operators, 196
One-particle
    Hartree-Fock density operator, 1691
    operators, 1603, 1605, 1628, 1756
Operator(s)
    adjoint operator, 112
    annihilation operator, 504, 513, 514, 1597
    creation and annihilation, 1990
    creation operator, 504, 513, 514, 1596
    derivative of an operator, 169
    diagonalization, 126, 128
    even and odd operators, 196
    evolution operator, 313, 2069
    field, 1752
    function of, 166
    Hermitian operators, 115
    linear operators, 90, 108, 163
    occupation number, 1598
    one-particle operator, 1603, 1605, 1628, 1756
    parity operator, 193
    particle density operator, 1756
    permutation operators, 1425, 1430
    potential, 168
    product of, 90
    reduced to a single particle, 1607
    representation, 121
    restriction, 165
    restriction of, 1125
    rotation operator, 1001
    symmetric, 1628, 1755
    translation operator, 190
    two-particle operator, 1608, 1610, 1631, 1756
    unitary operators, 173
    Weyl operator, 2300
Oppenheimer, see Born, 1177, 1190
Optical
    excitation (broadband), 1332
    lattices, 2153
    pumping, 2062, 2140
Orbital
    angular momentum (of radiation), 2052
    atomic orbital, 1496(ex.)
    hybridization, 869
    linear combination of atomic orbitals, 1172
    quantum number, 1463
    state space, 988
Order parameter for pairs, 1851
Orthonormal
    basis, 91, 99, 101, 133
    characteristic relation, 116
Orthonormalization
    and closure relations, 101, 140
    relation, 116
Oscillation(s)
    between two discrete states, 1374
    between two quantum states, 418
    Rabi, 2134
Oscillator
    anharmonic, 502
    harmonic, 497
    strength, 1352
Pair(s)
    annihilation-creation of pairs, 1831, 1874, 1887
    BCS, wave function, 1909
    Cooper, 1927
    of particles (creation operator), 1813, 1846
    pair field (commutation), 1861
    pair field operator, 1845
    pair wave function, 1851
Paired
    bosons, 1881
    fermions, 1874
    state energy, 1869
    states, 1811
    states (building), 1818
Pairing term, 1878
Paramagnetism, 855
Parametric down-conversion, 2181
Parity, 2106
    degeneracy, 199
    of a permutation operator, 1431
    of multipole operators, 1082
    operator, 193
Parseval
    Parseval-Plancherel equality, 20
    Parseval-Plancherel formula, 1511, 1521
Partial
    reflection, 79
    trace of an operator, 309
    waves in the potential, 948
    waves method, 941
Particle (current), 1758
Particles and holes, 1621
Partition function, 1626, 1627, 1717
Path
    integral, 2267
    space-time path, 339
Pauli
    exclusion principle, 1437, 1444, 1463, 1481
    Hamiltonian, 1009(ex.)
    matrices, 425, 991
    spin theory, 986
    spinor, 993
Penetrating orbit, 1463
Penrose-Onsager criterion, 1776, 1860, 1947
Peres, 2212
Periodic
    boundary conditions, 1489
    classification of elements, 1463
    functions, 1505
    potential (one-dimensional), 375
Permutation operators, 1425, 1430
Perturbation
    applications of the perturbation theory, 1231
    lifting of a degeneracy, 1125
    one-dimensional harmonic oscillator, 1131
    random perturbation, 1320, 1325, 1390
    sinusoidal, 1311
    stationary perturbation theory, 1115
Perturbation theory
    time dependent, 1303
Phase
    locking (BCS), 1893, 1916
    locking (bosons), 1938, 1944
    relative phase between condensates, 2237, 2248
    velocity, 37
Phase shift (collision), 951, 1497(ex.)
    with imaginary part, 971
Phase velocity, 21
Phonons, 611, 626
    Bogolubov phonons, 1660
Photodetection
    double, 2172, 2184
    single, 2169, 2171
Photoelectric effect, 1412(ex.), 2110
Photoionization, 2109, 2165
    rate, 2115, 2124
    two-photon, 2123
Photon, 3, 631, 651, 2004, 2005, 2110
    absorption and emission, 2067
    angular momentum, 1370
    antibunching, 2121
    detectors, 2165
    non-destructive detection, 2159
    number, 2135
    scattering (elastic), 2086
    scattering by an atom, 2085
    vacuum, 2007
    see Absorption, Emission
Picture
    Heisenberg, 317, 1763
    interaction, 1393, 2070
Pitaevskii (Gross-Pitaevskii equation), 1643, 1657
Plancherel, see Parseval
Planck
    constant, see front cover pages, 3
    law, 2083
Planck-Einstein relations, 3, 10
Plane wave, 14, 19, 95, 943
Podolsky (EPR argument), 297, 1104
Pointer states, 2199
Polarizability of the 1s state in Hydrogen, 1299
Polarization
    electromagnetic field, 1970
    of Zeeman components, 1295
    space-dependent, 2156
Polynomial method (harmonic oscillator), 555, 842
Polynomials
    Hermite polynomials, 516, 547, 561
Position and momentum representations, 181
Positive and negative frequency components, 2072
Positron, 1281
Positronium, 836
    hyperfine structure, 1281
    Zeeman effect, 1281
Postulate (von Neumann projection), 2202
Postulates of quantum mechanics, 215
Potential
    adiabatic branching, 932
    barrier, 26, 68, 367, 373
    centrifugal potential, 809, 888, 893
    Coulomb potential, cross section, 979
    cylindrically symmetric, 899(ex.)
    Hartree-Fock, 1706
    infinite one-dimensional well, 74
    operator, 168
    scalar and vector potentials, 1536, 1960, 1963
    scattering by a, 923
    self-consistent potential, 1461
    square potential, 63
    square well, 29
    step, 28, 65, 75, 284
    well, 71, 367
    well (arbitrary shape), 359
    well (infinite one-dimensional), 271
    well (infinite two-dimensional), 201
    Yukawa potential, 977
Precession
    Larmor precession, 396, 1071
    Thomas precession, 1235
Preparation of a state, 235
Pressure (ideal quantum gas), 1640
Principal part, 1517
Principal quantum number, 827
Principle
    of least action, 1539, 1980
    of spectral decomposition, 11, 216
    of superposition, 237
Probability
    amplitude, 11, 253, 259
    conservation, 237
    current, 240, 283, 333, 349, 932
    current in hydrogen atom, 851
    density, 11, 264
    fluid, 932
    of photon absorption, 2076
    of the measurement results, 9, 11
    transition probability, 439
Process (pair annihilation-creation), 1878, 1887
Product
    convolution product of functions, 1510
    of matrices, 122
    of operators, 90
    scalar product, 101, 141, 149, 161
    state (tensor product), 311
    tensor product, 147
    tensor product, applications, 441
Projection theorem, 1070
Projector, 109, 133, 165, 218, 222, 1108(ex.)
Propagator
    for the Schrödinger equation, 335
    of a particle, 2267, 2272
Proper result, 9
Proton
    mass, see front cover pages
    spin and magnetic moment, 1237, 1274
Pumping, 1358
Pure (state or case), 301

Quadrupolar electric moment, 1082, 1225(ex.)
Quanta (circular), 761, 783
Quantization
    electrodynamics, 1997
    electromagnetic field, 631, 637, 1997
    of a field, 1765
    of angular momentum, 394, 677
    of energy, 3, 11, 71, 359
    of measurement results, 9, 216, 398, 405
    rules, 11, 223, 226, 2274
Quantum
    angle, 2258
    electrodynamics, 1245, 1282, 1997
    entanglement, 2187, 2193
    monogamy, 2221
    number
        orbital, 1463
        principal quantum number, 827
    numbers (good), 248
    resonance, 417
    treatment needed, 2118, 2120
Quasi-classical
    field states, 2008
    states, 765, 791, 801
    states of the harmonic oscillator, 583
Quasi-particles, 1736, 1840
    Bogolubov phonons, 1954
Quasi-particle vacuum, 1836
Rabi
    formula, 419, 440, 460, 1324, 1376
    frequency, 1325
    oscillation, 2134
Radial
    equation, 842
    equation (Hydrogen), 821
    equation in a central potential, 808
    integral, 1277
    quantum number, 811
Radiation
    isotropic, 2079
    pressure, 2024
Radiative
    broadening, 2138
    cascade of the dressed atom, 2145
Raman
    effect, 532, 740, 1373(ex.)
    laser, 2093
    scattering, 2091
    scattering (stimulated), 2093
Random perturbation, 1320, 1325, 1390
Rank (Schmidt), 2196
Rate (photoionization), 2115, 2124
Rayleigh
    line, 752
    scattering, 532, 2089
Realism (local), 2205, 2209
Recoil
    blocking, 2036
    effect of the nucleus, 834
    energy, 1415, 2023
    free atom, 2020
    suppression, 2040
Reduced
    density operator, 1607
    mass, 813
Reduction of the wave packet, 221, 279
Reflection on a potential step, 285
Refractive index, 2149
Reiche, see Thomas
Relation (Gibbs-Duhem), 2296
Relative
    motion, 814
    particle, 814
    phase between condensates, 2248, 2258
    phase between spin condensates, 2253
Relativistic
    corrections, 1233, 1478
    Doppler effect, 2022
    mass correction, 1234
Relaxation, 465, 1358, 1390, 1413, 1414(ex.)
    general equations, 1397
    longitudinal, 1400
    longitudinal relaxation time, 1401
    transverse, 1403
    transverse relaxation time, 1406
Relay state, 2086, 2098, 2106
Renormalization, 2007
Representation(s)
    change of, 124
    in the state space, 116
    of operators, 121
    position and momentum, 139, 181
    Schrödinger equation, 183–185
Repulsion between electrons, 1469
Resonance
    magnetic resonance, 455
    quantum resonance, 417, 1158
    scattering resonance, 69, 954, 983(ex.)
    two resonances with a sinusoidal excitation, 1365
    width, 1312
    with sinusoidal perturbation, 1311
Restriction of an operator, 165, 1125
Rigid rotator, 740, 1222(ex.)
Ritz theorem, 1170
Root mean square deviation
    general definition, 230
Rosen (EPR argument), 297, 1104
Rotating frame, 459
Rotation(s)
    and angular momentum, 717
    invariance and degeneracy, 734
    of diatomic molecules, 739
    of molecules, 796, 885
    operator(s), 720, 1001
    rotation invariance, 1478
    rotation invariance and degeneracy, 1072
Rotator
    rigid rotator, 740, 1222(ex.)
Rules
    quantization rules, 2274
    selection rules, 197
Rutherford's formula, 979
Rydberg constant, see front cover pages
Saturation
    of linear response, 1368
    of the susceptibility, 1369
Scalar
    and vector potentials, 321, 1536
    interaction between two angular momenta, 1091
    observable, operator, 732, 737
    potential, 225
    product, 89, 92, 101, 141, 149, 161
    product of two coherent states, 593

Scattering
    amplitude, 929, 953
    by a central potential, 941
    by a hard sphere, 980, 981(ex.)
    by a potential, 923
    cross section, 933, 953, 972
    cross section and phase shifts, 951
    inelastic, 2091
    integral equation, 935
    of particles with spin, 1102
    of spin 1/2 particles, 1108(ex.)
    photon, 2086
    Raman, 2091
    Rayleigh, 532, 2089
    resonance, 954, 983(ex.)
    resonant, 2089
    stationary scattering states, 951
    stationary states, 928
    stimulated Raman, 2093
Schmidt
    decomposition, 2193
    rank, 2196
Schottky anomaly, 654
Schrödinger, 2190
    equation, 11, 12, 223, 306
    equation in momentum representation, 184
    equation in position representation, 183
    equation, physical implications, 237
    equation, resolution for conservative systems, 245
    picture, 317
Schwarz inequality, 161
Second
    quantization, 1766
    harmonic generation, 1368
Secular approximation, 1316, 1374
Selection rules, 197, 863, 2014, 2056
    electric quadrupolar, 1348
    magnetic dipolar, 1098, 1348
Self-consistent potential, 1461
Semiconductor, 837, 1493
Separability, 2207, 2223
Separable density operator, 2223
Shell (electronic), 827
Shift
    light shift, 2138
    of a discrete state, 1387
Singlet, 1024, 1474
Sinusoidal perturbation, 1311, 1374
Sisyphus
    cooling, 2034
    effect, 2155
Slater determinant, 1438, 1679
Slowing down atoms, 2025
Solids
    electronic bands, 1177
    energy bands of electrons, 1491
    energy bands of electrons in solids, 381
    hydrogen-like systems in solid state physics, 837
Space (Fock), 1593
Space-dependent polarization, 2156
Space-time path, 339, 1539
Spatial correlations (ideal gas), 1769
Specific heat
    of an electron gas, 1484
    of metals, 1487
    of solids, 653
    two level system, 654
Spectral
    decomposition principle, 7, 11, 216
    function, 1795
    terms, 1469
Spectroscopy (Doppler free), 2105
Spectrum
    BCS elementary excitation, 1923
    continuous, 219, 264
    discrete, 132, 217
    of an observable, 126, 216
Spherical
    Bessel equation, 961
    Bessel function, 944, 966
    free spherical waves, 961
    free wave, 944
    Neumann function, 967
    wave, 941
    waves and plane waves, 967
Spherical harmonics, 689, 705
    addition of, 1059
    expression for ℓ = 0, 1, 2, 709
    general expression, 707
Spin
    and magnetic moment of the proton, 1237
    angular momentum, 987
    electron, 985, 1289
    fictitious, 435
    gyromagnetic ratio, 396, 455, 988
    nuclear, 1088
    of the electron, 393
    Pauli theory, 986, 988
    quantum description, 985, 991
    rotation operator, 1001
    scattering of particles with spin, 1102
    spin 1 and radiation, 2044, 2049, 2050
    system of two spins, 441
Spin 1/2
    density operator, 449
    ensemble of, 1358
    fictitious, 1359
    interaction between two spins, 1141
    preparation and measurement, 401
    scattering of spin 1/2 particles, 1108(ex.)
Spin-orbit coupling, 1018, 1234, 1241, 1279
Spin-statistics theorem, 1434
Spinor, 993
    rotation, 1005
Spontaneous
    emission, 343, 645, 1301, 2081, 2135
    emission of photons, 1356
    magnetism of fermions, 1737
Spreading of a wave packet, 59, 348
Square
    barrier of potential, 26, 68
    potential, 26, 63, 75, 283
    potential well, 71, 271
    spherical well, 982(ex.)
Standard representation (angular momentum), 677, 691
Stark effect in Hydrogen atom, 1298
State(s), see Density operator
    density of, 389, 1316, 1484, 1488
    Fock, 1593, 1614, 1769, 2103
    ground state, 363
    mixing of states by a perturbation, 1121
    orbital state space, 988
    paired, 1811
    pointer states, 2199
    quasi-classical states, 583, 765, 791, 801
    relay state, 2086, 2098, 2106
    stable and unstable states, 485
    state after measurement, 221
    state preparation, 235
    stationary, 63, 359, 375
    stationary state, 24, 246
    stationary states in a central potential, 804
    unstable, 343
    vacuum state, 1595
    vector, 102, 215
Stationary
    perturbation theory, 1115
    phase condition, 18, 54
    scattering states, 928, 951
    states, 24, 63, 246, 359
    states in a periodic potential, 375
    states with well-defined angular momentum, 944, 959
    states with well-defined momentum, 943
Statistical
    entropy, 2217
    mechanics (review of), 2285
    mixture of states, 253, 299, 304, 450
Statistics
    Bose-Einstein, 1446
    Fermi-Dirac, 1446
Step
    function, 1521
    potential, 28, 65, 75, 284
Stern-Gerlach experiment, 394
Stimulated (or induced)
    emission, 1334, 1366, 2081
    Raman scattering, 2093
Stokes Raman line, 532, 752
Stoner (spontaneous magnetism), 1737
Strong coupling (dressed-atom), 2141
Subrecoil cooling, 2034
Sum rule (Thomas-Reiche-Kuhn), 1352
Superfluidity, 1667, 1674
Superposition
    of states, 253
    principle, 7, 237
    principle and physical predictions, 253
Surface (modified Fermi surface), 1914
Susceptibility, see Linear, nonlinear, tensor
    electric susceptibility of an atom, 1351
    electrical susceptibility, 577, 1223(ex.)
    electrical susceptibility of NH₃, 484
    magnetic susceptibility, 1224
    tensor, 1224, 1410(ex.)
Swapping (entanglement), 2232
Symmetric
    ket, state, 1428, 1431
    observables, 1429, 1441
    operators, 1603, 1605, 1608, 1610, 1628, 1631, 1755
Symmetrization
    of observables, 224
    postulate, 1434
Symmetrizer, 1428, 1431
System
    time evolution of a quantum system, 223
    two-level system, 435
Systematic
    and accidental degeneracies, 203
    degeneracy, 845
Temperature (Doppler), 2033
Tensor
    interaction, 1141
    product, 147, 441
    product of operators, 149
    product state, 295, 311
    product, applications, 201
    susceptibility tensor, 1224
Term
    direct and exchange terms, 1613, 1632, 1634, 1646, 1650
    pairing, 1878
    spectral terms, 1467, 1469
Theorem
    Bell, 2204, 2208
    Bloch, 659
    projection, 1070
    Ritz, 1170
    Wick, 1799, 1804
    Wigner-Eckart, 1065, 1085, 1254
Thermal wavelength, 1635
Thermodynamic equilibrium, 308
    harmonic oscillator, 647
    ideal quantum gas, 1625
    spin 1/2, 452
Thermodynamic potential (minimization), 1715
Thomas precession, 1235
Thomas-Reiche-Kuhn sum rule, 1352
Three-dimensional harmonic oscillator, 569, 841, 899(ex.)
Three-level system, 1409(ex.)
Three-photon transition, 1370
Time evolution of quantum systems, 223
Time-correlations (fluorescent photons), 2145
Time-dependent
    Gross-Pitaevskii equation, 1657
    perturbation theory, 1303
Time-energy uncertainty relation, 250, 279, 345, 1312, 1389
Torsional oscillations, 536
Torus (flow in a), 1667
Total
    elastic scattering cross section, 972
    reflection, 67, 75
    scattering cross section (collision), 926
Townes
    Autler-Townes effect, 1410
Trace of an operator, 163
    partial trace of an operator, 309
Transform (Wigner), 2297
Transformation
    Bogolubov, 1950
    Bogolubov-Valatin, 1836, 1919
    gauge, 1960
    of observables by permutation, 1434
Transition, see Probability, Forbidden, Electric dipole, Magnetic dipole, Quadrupole
    electric dipole, 2056
    magnetic dipole transition, 1098
    probability, 439, 1308, 1321, 1355
    probability per unit time, 1319
    probability, spin 1/2, 460
    three-photon transition, 1370
    two-photon, 2097
    virtual, 2100
Translation operator, 190, 579, 791
Transpositions, 1431
Transverse
    fields, 1961
    relaxation, 1403
    relaxation time, 1406
Trap
    dipolar, 2151
    laser, 2151
Triplet, 1024, 1474
    fluorescence triplet, 2144
Tunnel
    effect, 29, 70, 365, 476, 540, 1177
    ionization, 2126
Two coupled harmonic oscillators, 599
Two-dimensional
    harmonic oscillator, 755
    infinite potential well, 201
    wave packets, 49
Two-level system, 393, 411, 435, 1357
Two-particle operators, 1608, 1610, 1631, 1756
Two-photon
    absorption, 1373(ex.)
    interference, 2170, 2183
    transition, 1409(ex.), 2097
Uncertainty relation, 19, 39, 41, 45, 232, 290
    time-energy uncertainty relation, 1312
Uniqueness of the measurement result, 2201
Unitary
    matrix, 125, 176
    operator, 173, 314
    transformation of operators, 177
Unstable states, 343
Vacuum
    electromagnetism, 644, 2007
    excitations, 1623
    fluctuations, 2007
    photon vacuum, 2007
    quasi-particle vacuum, 1836
    state, 1595
Valence band, 1493
Van der Waals forces, 1151
Variables
    intensive or extensive, 2292
    normal variables, 602, 616, 631, 633
Variational method, 1169, 1190, 1228(ex.)
Vector
    model, 1091
    model of the atom, 1071, 1256
    observable, operator, 732
    operator, 1065
    potential, 225
    potential of a magnetic dipole, 1268
Velocity
    critical, 1671
    generalized velocities, 214, 1530
    group velocity, 23, 614
    phase velocity, 21, 37
Vibration(s)
    modes, 599, 611
    modes of a continuous system, 631
    of molecules, 885, 1137
    of nuclei in a crystal, 534, 611, 653
    of the nuclei in a molecule, 527
Violations of Bell's inequalities, 2210, 2265
Virial theorem, 350, 1210
Virtual transition, 2100
Volume effect, 544, 840, 1162, 1268
Von Neumann
    chain, 2201
    equation, 306
    ideal measurement, 2196
    reduction postulate, 2202
    statistical entropy, 2217
Vortex in a superfluid, 1667
Water (molecule), 873, 874
Wave (evanescent), 67
Wave function, 88, 140, 226
    BCS pairs, 1901, 1909
    Hydrogen, 830
    norm, 90
    pair wave functions, 1851
    particle, 11
Wave packet(s)
    Gaussian, 57, 2305
    in a potential step, 75
    in three dimensions, 53
    minimal, 290, 520, 591
    motion in a harmonic potential, 596
    one-photon, 2168
    particle, 13
    photon, 2163
    propagation, 20, 57, 242, 398
    reduction, 221, 227, 265, 279
    spreading, 57, 59, 347, 348(ex.)
    two-dimensional, 49
    two-photon, 2181
Wave(s)
    de Broglie wavelength, 10, 35
    evanescent, 29
    free spherical waves, 961
    multipolar, 2052
    partial waves, 948
    plane, 14, 19, 943
    wave function, 11, 88, 140, 226
Wave-particle duality, 3, 45
Wavelength
    Compton wavelength, 1235
    de Broglie, 10
Weak coupling (dressed-atom), 2137
Well
    potential square well, 29
    potential well, 367
Weyl
    operator, 2300
    quantization, 2311
Which path type of experiments, 2202
Wick's theorem, 1799, 1804
Wigner transform, 2297
Wigner-Eckart theorem, 1065, 1085, 1254
Young (double slit experiment), 4
Yukawa potential, 977
Zeeman
    components, polarizations, 865
    effect, 855, 862, 987, 1251, 1253, 1257, 1261, 1281
    polarization of the components, 1295
    slower, 2025
Zeeman effect
    Hydrogen, 1289
    in muonium, 1281
    in positronium, 1281
    Muonium, 1284
Zone (Brillouin zone), 614
