Stochastic Processes in Classical and Quantum Physics and Engineering 9781032405391, 9781032405414, 9781003353553

263 121 9MB

English Pages [275] Year 2023

Report DMCA / Copyright

DOWNLOAD FILE

Polecaj historie

Stochastic Processes in Classical and Quantum Physics and Engineering
 9781032405391, 9781032405414, 9781003353553

Table of contents :
Cover
Half Title
Title Page
Copyright Page
Preface
Brief Contents
Table of Contents
1. Stability Theory of Nonlinear Systems Based on Taylor Approximations of the Nonlinearity
1.1 Stochastic Perturbation of Nonlinear Dynamical Systems: Linearized Analysis
1.2 Large Deviation Analysis of the Stability Problem
1.3 LDP for Yang-Mills Gauge Fields and LDP for the Quantized Metric Tensor of Spacetime
1.4 String in Background Fields
2. String Theory and Large Deviations
2.1 Evaluating Quantum String Amplitudes Using Functional Integration
2.2 LDP for Poisson Fields
2.3 LDP for Strings Perturbed by Mixture of Gaussian and Poisson Noise
3. Ordinary, Partial and Stochastic Differential Equation Models for Biological Systems, Quantum Gravity, Hawking Radiation,
Quantum Fluids, Statistical Signal Processing,
Group Representations, Large Deviations in Statistical Physics
3.1 A lecture on Biological Aspects of Virus Attacking the Immune System
3.2 More on Quantum Gravity
3.3 Large Deviation Properties of Virus Combat with Antibodies
3.4 Large Deviations Problems in Quantum Fluid Dynamics
3.5 Lecture Schedule for Statistical Signal Processing
3.6 Basic Concepts in Group Theory and Group Representation Theory
3.7 Some Remarks on Quantum Gravity Using the ADM Action
3.8 Diffusion Exit Problem
3.9 Large Deviations for the Boltzmann Kinetic Transport Equation
3.10 Orbital Integrals for SL(2,R) with Applications to Statistical Image Processing
3.11 Loop Quantum Gravity (LQG)
3.12 Group Representations in Relativistic Image Processing
3.13 Approximate Solution to the Wave Equation in the Vicinity of the Event Horizon of a Schwarchild Black-hole
3.14 More on Group Theoretic Image Processing
3.15 A Problem in Quantum Information Theory
3.16 Hawking Radiation
4. Research Topics on Representations
4.1 Remarks on Induced Representations and Haar Integrals
4.2 On the Dynamics of Breathing in the Presence of a Mask
4.3 Plancherel Formula for SL(2,R)
4.4 Some Applications of the Plancherel Formula for a Lie Group to Image Processing
5. Some Problems in Quantum Information Theory
5.1 Quantum Information Theory and Quantum Blackhole Physics
5.2 Harmonic Analysis on the Schwarz Space of G = SL(2,R)
5.3 Monotonicity of Quantum Entropy Under Pinching
6. Lectures on Statistical Signal Processing
6.1 Large Deviation Analysis of the RLS Algorithm
6.2 Problem Suggested
6.3 Fidelity Between Two Quantum States
6.4 Stinespring’s Representation of a TPCP Map, i.e, of a Quantum Operation
6.5 RLS Lattice Algorithm, Statistical Performance Analysis
6.6 Approximate Computation of the Matrix Elements of the Principal and Discrete Series for G = SL(2,R)
6.7 Proof of the Converse Part of the Cq Coding Theorem
6.8 Proof of the Direct part of the Cq Coding Theorem
6.9 Induced Representations
6.10 Estimating Non-uniform Transmission Line Parameters and Line Voltage and Current Using the Extended Kalman Filter
6.11 Multipole Expansion of the Magnetic Field Produced by a Time Varying Current Distribution
7. Some Aspects of Stochastic Processes
7.1 Problem Suggested by Prof. Dhananjay
7.2 Converse of the Shannon Coding Theorem
8. Problems in Electromagnetics and Gravitation
9. Some More Aspects of Quantum Information Theory
9.1 An Expression for State Detection Error
9.2 Operator Convex Functions and Operator Monotone Functions
9.3 A Selection of Matrix Inequalities
9.4 Matrix Holder Inequality
10. Lecture Plan for Statistical Signal Processing
10.1 End-Semester Exam, Duration: Four Hours, Pattern Recognition
10.2 Quantum Stein’s Theorem
10.3 Question Paper, Short Test, Statistical Signal Processing
10.4 Question Paper on Statistical Signal Processing, Long Test
10.5 Quantum Neural Networks for Estimating EEG pdf from Speech pdf Based on Schrodinger’s Equation
10.6 Statistical Signal Processing
10.7 Monotonocity of Quantum Relative Renyi Entropy
11. My Commentary on the Ashtavakra Gita
12 LDP-UKF Based Controller
13. Abstracts for a Book on Quantum Signal Processing Using Quantum Field Theory
14. Quantum Neural Networks and Learning Using Mixed State Dynamics for Open Quantum Systems
15. Quantum Gate Design, Magnet Motion, Quantum Filtering, Wigner Distributions
15.1 Designing a Quantum Gate Using QED with a Control c-number Four Potential
15.2 On the Motion of a Magnet Through a Pipe with a Coil Under Gravity and the Reaction Force Exerted by the Induced Current
Through the Coil on the Magnet
15.3 Quantum Filtering Theory Applied to Quantum Field Theory for Tracking a Given Probability Density Function
15.4 Lecture on the Role of Statistical Signal Processing in General Relativity and Quantum Mechanics
15.5 Problems in Classical and Quantum Stochastic Calculus
15.6 Some Remarks on the Equilibrium Fokker-Planck Equation for the Gibbs Distribution
15.7 Wigner-distributions in Quantum Mechanics
15.8 Motion of An Element having Both an Electric and a Magnetic Dipole Moment in an External Electromagnetic Field Plus a Coil
16. Large Deviations for Markov Chains, Quantum Master Equation, Wigner Distribution for the Belavkin Filter
16.1 Large Deviation Rate Function for the Empirical Distribution of a Markov Chain
16.2 Quantum Master Equation in General Relativity
16.3 Belavkin Filter for the Wigner Distirbution Function
17. String and Field Theory with Large Deviations, Stochastic Algorithm Convergence
17.1 String Theoretic Corrections to Field Theories
17.2 Large Deviations Problems in String Theory
17.3 Convergence Analysis of the LMS Algorithm
17.4 String Interacting with Scalar Field, Gauge Field and a Gravitational Field

Citation preview

Stochastic Processes in Classical and

Quantum Physics and Engineering

Stochastic Processes in Classical and

Quantum Physics and Engineering

Harish Parthasarathy Professor

Electronics & Communication Engineering

Netaji Subhas Institute of Technology (NSIT)

New Delhi, Delhi-110078

First published 2023 by CRC Press 4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN and by CRC Press 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742 © 2023 Manakin Press

CRC Press is an imprint of Informa UK Limited The right of Harish Parthasarathy to be identified as the author of this work has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. For permission to photocopy or use material electronically from this work, access www. copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Print edition not for sale in South Asia (India, Sri Lanka, Nepal, Bangladesh, Pakistan or Bhutan).

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library ISBN: 9781032405391 (hbk) ISBN: 9781032405414 (pbk) ISBN: 9781003353553 (ebk) DOI: 10.4324/9781003353553 Typeset in Arial, BookManOldStyle, CourierNewPS, MS-Mincho, Symbol, TimesNewRoman and Wingdings by Manakin Press, Delhi

2

Preface

Preface This book covers a wide range of problems involving the applications of stochastic processes, stochastic calculus, large deviation theory, group represen­ tation theory and quantum statistics to diverse fields in dynamical systems, elec­ tromagnetics, statistical signal processing, quantum information theory, quan­ tum neural network theory, quantum filtering theory, quantum electrodynam­ ics, quantum general relativity, string theory, problems in biology and classical and quantum fluid dynamics. The selection of the problems has been based on courses taught by the author to undergraduates and postgraduates in elec­ tronics and communication engineering. The theory of stochastic processes has been applied to problems of system identification based on time series models of the process concerned to derive time and order recursive parameter estimation followed by statistical performance analysis. The theory of matrices has been applied to problems such as quantum hypothesis testing between two states in­ cluding a proof of the quantum Stein Lemma involving asymptotic formulas for the error probabilities involved in discriminating between a sequence of ordered pairs of tensor product states. This problem is the quantum generalization of the large deviation result regarding discriminating between to probability dis­ tributions from iid measurements. The error rates are obtained in terms of the quantum relative entropy between the two states. Classical neural networks are used to match an input­output signal pair. This means that the weights of the network are trained so that it takes as input the true system input and repro­ duces a good approximaition of the system output. But if the input and output signals are stochastic processes characterized by probability densities and not by signal values, then the network must take as input the input probability distribution and reproduce a good approximation of the output probability dis­ tribution. This is a very hard task since a probability distribution is specified by an infinite number of values. A quantum neural network solves this problem easily since Schrodinger’s equation evolves a wave function whose magnitude square is always a probability density and hence by controlling the potential in this equation using the input pdf (probability distribution function) and the error between the desired output pdf and that generated by Schrodinger’s equa­ tion, we are able to track a time varying output pdf successfully during the training process. This has applications to the problem of transforming speech data to EEG data. Speech is characterized by a pdf and so is EEG. Further the speech pdf is correlated to the EEG pdf because if the brain acquires a disease, the EEG signal will get distorted and correspondingly the speech signal will also get slurred. Thus, the quantum neural network based on Schrodinger’s equation can exploit this correlation to get a fairly good approximation of the EEG pdf given the speech pdf. These issues have been discussed in this book. Other prob­ lems include a proof of the celebrated Lieb’s inequality which is used in proving convexity and concavity properties of various kinds of quantum entropies and quantum relative entropies that are used in establishing coding theorems in quantum information theory like what is the maximum rate at which classical information can be transmitted over a Cq channel so that a decoder with error probability converging to zero can be constructed. Some of the inequalities dis­

v

3 cussed in this book find applications to more advanced problems in quantum information theory that have not been discussed here like how to generate a se­ quence of maximally entangled states having maximal dimension asymptotically from a sequence of tensor product states using local quantum operations with one way or two way classical communication using a sequence of TPCP maps or conversely, how to generate a sequence tensor product states optimally from a sequence of maximally entangled states by a sequence of TPCP maps with the maximally entangled states having minimal dimension. Or how to gener­ ate a sequence of maximal shared randomness between two users a sequence of quantum operations on a sequence of maximally entangled states. The perfor­ mance of such algorithms are characterized by the limit of the logarithm of the length divided by the number of times that the tensor product is taken. These quantities are related to the notion of information rate. Or, the maximal rate at which information can be transmitted from one user to another when they share a partially entangled state. This book also covers problems involving quantiza­ tion of the action functionals in general relativity and expressing the resulting approximate Hamiltonian as perturbations to harmonic oscillator Hamiltonians. It also covers topics such as writing down the Schrodinger equation for a mixed state of a particle moving in a potential coupled to a noisy bath in terms of the Wigner distribution function for position and momentum. The utility of this lies in the fact that the position and momentum observables in quantum mechanics do not commute and hence by the Heisenberg uncertainty principle, they do not have a joint probability density unlike the classical case of a particle moving in a potential with fluctuation and dissipation forces present where we have a Fokker­Planck equation for the joint pdf of the position and momentum. The Wigner distribution being complex, is not a joint probability distribution function but its marginals are and it is the closest that one can come in quan­ tum mechanics for making an analogy with classical joint probabilities and we show in this book that its dynamics is the same a that of a classical Fokker­ Planck equation but with small quantum correction terms that are represented by a power series in Planck’s constant. We then also apply this formalism to Belavkin’s quantum filtering theory based on the Hudson­Parthasarathy quan­ tum stochastic calculus by showing that when this equation for a particle moving in a potential with damping and noise is described in terms of the Wigner dis­ tribution function, then it just the same as the Kushner­Kallianpur stochastic filter but with quantum correction terms expressed as a power series in Planck’s constant. This book also covers some aspects of classical large deviation theory especially in the derivation of the rate function of the empirical distribution of a Markov process and applied to the Brownian motion and the Poisson process. This problem has applications to stochastic control where we desire to control the probability of deviation of the empirical distribution of the output process of a stochastic system from a desired probability distribution by incorporating control terms in the stochastic differential equation. This book also discusses some aspects of string theory especially the problem of how field equations of physics get modified when string theory is taken into account and conformal invariance is imposed. How supersymmetry breaking stochastic terms in the

vi

4 Lagrangian can be reduced by feedback stochastic control etc. Some problem deal with modeling blood flow, breathing mechanisms etc using stochastic or­ dinary and partial differential equations and how to control them by applying large deviation theory. One of the problems deals with analysis of the motion of a magnet through a tube with a coil wound around it in a graviational field taking into account electromagnetic induction and back reaction. This problem is result of discussions with one of my colleagues. The book discusses quan­ tization of this motion when the electron and electromagnetic fields are also quantized. In short, this book will be useful to research workers in applied mathematics, physics and quantum information theory who wish to explore the range of applications of classical and quantum stochastics to problems of physics and engineering. Table of Contents Author chapter 1 Stability theory of nonlinear systems based on Taylor approximations of the nonlinearity 1.1 Stochastic perturbation of nonlinear dynamical systems:Linearized anal­ ysis 1.2 Large deviation analysis of the stability problem 1.3 LDP for Yang­Mills gauge fields and LDP for the quantized metric tensor of space­time 1.4 Strings in background fields chapter 2 2.1 String theory and large deviations 2.2 Evaluating quantum string amplitudes using functional integration 2.3 LDP for Poisson fields 2.4 LDP for strings perturbed by mixture of Gaussian and Poisson noise chapter 3 ordinary, partial and stochastic differential equation models for biological systems,quantum gravity, Hawking radiation, quantum fluids,statistical signal processing, group representations, large deviations in statistical physics 3.1 A lecture on biological aspects of virus attacking the immune system 3.2 More on quantum gravity 3.3 Large deviation properties of virus combat with antibodies 3.4 Large deviations problems in quantum fluid dynamics 3.5 Lecture schedule for statistical signal processing 3.6 Basic concepts in group theory and group representation theory 3.7 Some remarks on quantum gravity using the ADM action 3.8 Diffusion exit problem 3.9 Large deviations for the Boltzmann kinetic transport equation 3.10 Orbital integrals for SL(2, R) with applications to statistical image pro­ cessing. 3.11 Loop Quantum Gravity (LQG)

vii

Brief Contents

1. Stability Theory of Nonlinear Systems Based on Taylor Approximations of the Nonlinearity

1–10

2. String Theory and Large Deviations

11–14

3. Ordinary, Partial and Stochastic Differential Equation Models for Biological Systems, Quantum Gravity, Hawking Radiation, Quantum Fluids, Statistical Signal Processing, Group Representations, Large Deviations in Statistical Physics

15–60

4. Research Topics on Representations

61–74

5. Some Problems in Quantum Information Theory

75–86

6. Lectures on Statistical Signal Processing

87–106

7. Some Aspects of Stochastic Processes

107–126

8. Problems in Electromagnetics and Gravitation

127–128

9. Some More Aspects of Quantum Information Theory

129–140

10. Lecture Plan for Statistical Signal Processing

141–182

11. My Commentary on the Ashtavakra Gita

183–186

12 LDP-UKF Based Controller

187–192

13. Abstracts for a Book on Quantum Signal Processing Using Quantum Field Theory

193–200

14. Quantum Neural Networks and Learning Using Mixed State Dynamics for Open Quantum Systems

201–208

15. Quantum Gate Design, Magnet Motion, Quantum Filtering, Wigner Distributions

209–238

16. Large Deviations for Markov Chains, Quantum Master Equation, Wigner Distribution for the Belavkin Filter

239–248

17. String and Field Theory with Large Deviations, Stochastic Algorithm Convergence

249–260

ix

Detailed Contents

1. Stability Theory of Nonlinear Systems Based on Taylor Approximations of the Nonlinearity 1.1 Stochastic Perturbation of Nonlinear Dynamical Systems:

Linearized Analysis 1.2 Large Deviation Analysis of the Stability Problem 1.3 LDP for Yang-Mills Gauge Fields and LDP for the Quantized

Metric Tensor of Spacetime 1.4 String in Background Fields 2. String Theory and Large Deviations 2.1 Evaluating Quantum String Amplitudes Using

Functional Integration 2.2 LDP for Poisson Fields 2.3 LDP for Strings Perturbed by Mixture of Gaussian and

Poisson Noise

1–10 1

3

4

6

11–14 11

12

13

3. Ordinary, Partial and Stochastic Differential Equation Models for Biological Systems, Quantum Gravity, Hawking Radiation, Quantum Fluids, Statistical Signal Processing, Group Representations, Large Deviations in Statistical Physics 15–60 3.1 A lecture on Biological Aspects of Virus Attacking

the Immune System 15

3.2 More on Quantum Gravity 23

3.3 Large Deviation Properties of Virus Combat with Antibodies 25

3.4 Large Deviations Problems in Quantum Fluid Dynamics 31

3.5 Lecture Schedule for Statistical Signal Processing 34

3.6 Basic Concepts in Group Theory and Group Representation Theory 37

3.7 Some Remarks on Quantum Gravity Using the ADM Action 38

3.8 Diffusion Exit Problem 39

3.9 Large Deviations for the Boltzmann Kinetic Transport Equation 41

3.10 Orbital Integrals for SL(2,R) with Applications to

Statistical Image Processing 42

3.11 Loop Quantum Gravity (LQG) 44

3.12 Group Representations in Relativistic Image Processing 45

3.13 Approximate Solution to the Wave Equation in the Vicinity of

the Event Horizon of a Schwarchild Black-hole 48

3.14 More on Group Theoretic Image Processing 51

3.15 A Problem in Quantum Information Theory 58

3.16 Hawking Radiation 59

xi

4. Research Topics on Representations 61–74 4.1 Remarks on Induced Representations and Haar Integrals 61

4.2 On the Dynamics of Breathing in the Presence of a Mask 66

4.3 Plancherel Formula for SL(2,R) 69

4.4 Some Applications of the Plancherel Formula for a Lie Group to

Image Processing 71

5. Some Problems in Quantum Information Theory 75–86 5.1 Quantum Information Theory and Quantum Blackhole Physics 81

5.2 Harmonic Analysis on the Schwarz Space of G = SL(2,R) 83

5.3 Monotonicity of Quantum Entropy Under Pinching 84

6. Lectures on Statistical Signal Processing 6.1 Large Deviation Analysis of the RLS Algorithm 6.2 Problem Suggested 6.3 Fidelity Between Two Quantum States 6.4 Stinespring’s Representation of a TPCP Map, i.e,

of a Quantum Operation 6.5 RLS Lattice Algorithm, Statistical Performance Analysis 6.6 Approximate Computation of the Matrix Elements of

the Principal and Discrete Series for G = SL(2,R) 6.7 Proof of the Converse Part of the Cq Coding Theorem 6.8 Proof of the Direct part of the Cq Coding Theorem 6.9 Induced Representations 6.10 Estimating Non-uniform Transmission Line Parameters and

Line Voltage and Current Using the Extended Kalman Filter 6.11 Multipole Expansion of the Magnetic Field Produced

by a Time Varying Current Distribution

87–106 88

90

92

93

94

95

100

102

104

105

7. Some Aspects of Stochastic Processes 7.1 Problem Suggested by Prof. Dhananjay 7.2 Converse of the Shannon Coding Theorem

107–126 121

122

8. Problems in Electromagnetics and Gravitation

127–128

9. Some More Aspects of Quantum Information Theory 129–140 9.1 An Expression for State Detection Error 129

9.2 Operator Convex Functions and Operator Monotone Functions 131

9.3 A Selection of Matrix Inequalities 133

9.4 Matrix Holder Inequality 138

10. Lecture Plan for Statistical Signal Processing 141–182 10.1 End-Semester Exam, Duration: Four Hours, Pattern Recognition 143

xii

10.2 10.3 10.4 10.5

Quantum Stein’s Theorem Question Paper, Short Test, Statistical Signal Processing Question Paper on Statistical Signal Processing, Long Test Quantum Neural Networks for Estimating EEG pdf from

Speech pdf Based on Schrodinger’s Equation 10.6 Statistical Signal Processing 10.7 Monotonocity of Quantum Relative Renyi Entropy

153

154

157

166

173

178

11. My Commentary on the Ashtavakra Gita

183–186

12 LDP-UKF Based Controller

187–192

13. Abstracts for a Book on Quantum Signal Processing Using Quantum Field Theory

193–200

14. Quantum Neural Networks and Learning Using Mixed State Dynamics for Open Quantum Systems

201–208

15. Quantum Gate Design, Magnet Motion, Quantum Filtering, Wigner Distributions 209–238 15.1 Designing a Quantum Gate Using QED with a Control

c-number Four Potential 209

15.2 On the Motion of a Magnet Through a Pipe with a Coil Under

Gravity and the Reaction Force Exerted by the Induced Current

Through the Coil on the Magnet 211

15.3 Quantum Filtering Theory Applied to Quantum Field Theory for

Tracking a Given Probability Density Function 219

15.4 Lecture on the Role of Statistical Signal Processing in General

Relativity and Quantum Mechanics 221

15.5 Problems in Classical and Quantum Stochastic Calculus 223

15.6 Some Remarks on the Equilibrium Fokker-Planck Equation for

the Gibbs Distribution 226

15.7 Wigner-distributions in Quantum Mechanics 228

15.8 Motion of An Element having Both an Electric and a Magnetic

Dipole Moment in an External Electromagnetic Field Plus a Coil 236

16. Large Deviations for Markov Chains, Quantum Master Equation, Wigner Distribution for the Belavkin Filter 16.1 Large Deviation Rate Function for the Empirical

Distribution of a Markov Chain 16.2 Quantum Master Equation in General Relativity 16.3 Belavkin Filter for the Wigner Distirbution Function

xiii

239–248 239

241

246

17. String and Field Theory with Large Deviations, Stochastic Algorithm Convergence 17.1 String Theoretic Corrections to Field Theories 17.2 Large Deviations Problems in String Theory 17.3 Convergence Analysis of the LMS Algorithm 17.4 String Interacting with Scalar Field, Gauge Field and

a Gravitational Field

xiv

249–260 249

250

252

253

Chapter 1

Chapter 1

Stability Theory Stability theory of ofNonlinear Systems based Based on on nonlinear systems Taylor Approximations Taylor approximations of of the Nonlinearity the nonlinearity 1.1 Stochastic perturbation of nonlinear dynam­ ical systems:Linearized analysis Consider the nonlinear system defined by the differential equation dx(t)/dt = f (x(t)), t ≥ 0 where f : Rn → Rn is a continuously differentiable mapping. Let x0 be a fixed point of this system, ie f (x0 ) = 0 Denoting δx(t) to be a small fluctuation of the state x(t) around the stability point x0 , we get on first order Taylor expansion, ie, linearization, dδx(t)/dt = f � (x0 )δx(t) + O(|δx(t)|2 ) If we neglect O(|δx|2 ) terms, then we obtain the approximate solution to this differential equation as δx(t) = exp(tA)δx(0), t ≥ 0 where

A = f � (x0 )

is the Jacobian matrix of f at the equilibrium point x0 and hence it follows that this linearized system is asymptotically stable under small perturbations in all 9

1

2

Stochastic Processes in Classical and Quantum Physics and Engineering 10CHAPTER 1. STABILITY THEORY OF NONLINEAR SYSTEMS BASED ON TAYLO directions iff all the eigenvalues of A have negative real parts. By asymptoti­ cally stable, we meant that limt→∞ δx(t) = 0. The linearized system is critically stable at x0 if all the eigenvalues of A have non­positive real part, ie, an eigen­ value being purely imaginary is also allowed. By critically stable, we mean that δx(t), t ≥ 0 remains bounded in time. Stochastic perturbations: Consider now the stochastic dynamical system obtained by adding a source term on the rhs of the above dynamical system dx(t)/dt = f (x(t)) + w(t) where w(t) is stationary Gaussian noise with mean zero and autcorrelation Rw (τ ) = E(w(t + τ )w(t)) Now it is meaningless to speak of a fixed point, so we choose a trajectory x0 (t) that solves the noiseless problem dx0 (t)/dt = f (x0 (t)) Let δx(t) be a small perturbation around x0 (t) so that x(t) = x0 (t) + δx(t) solves the noisy problem. Then, assuming w(t) to be small and δx(t) to be of the ”same order or magnitude” as w(t), we get approximately upto linear orders in w(t), the linearized equation dδx(t)/dt = Aδx(t) + w(t), A = f ' (x0 ) which has solution δx(t) = Φ(t)δx(0) +

f

t 0

Φ(t − τ )w(τ )dτ, t ≥ 0,

where Φ(t) = exp(tA) Assuming δx(0) to be a nonrandom initial perturbation, we get E(δx(t)) = Φ(t)δx(0) which converges to zero as t → ∞ provided that all the eigenvalues of A have negative real part. Moreover, in this case, f tf t Φ(t − t1 )Rw (t1 − t2 )Φ(t − t2 )T dt1 dt2 E(|δx(t)|2 ) = T r 0

= Tr

f tf 0

0

t 0

which converges as t → ∞ to f ∞f Px = T r 0

Φ(t1 )Rw (t1 − t2 )Φ(t2 )T dt1 dt2 ∞ 0

Φ(t1 )Rw (t1 − t2 )Φ(t2 )T dt1 dt2

Stochastic Processes in Classical and Quantum Physics and Engineering 3 1.2. LARGE DEVIATION ANALYSIS OF THE STABILITY PROBLEM 11 Suppose now that A is diagonalizable with spectral representation A=− Then Φ(t) =

r r

r r

λ(k)Ek

k=1

exp(−tλ(k))Ek

k=1

and we get , Px =

f r r

k,m=1

∞ 0

f

∞ 0

T r(Ek .Rw (t1 − t2 ).Em )exp(−λ(k)t1 − λ(m)t2 )dt1 dt2

A sufficient condition for this to be finite is that |Rw (t)| remain bounded in which case we get when supt |Rw (t)| ≤ B < ∞∀t that Px ≤ B

r r

k,m=1

(Re(λ(k) + λ(m)))−1 ≤ 2Br2 /λR (min)

where λR (min) = mink Re(λ(k))

1.2 Large deviation analysis of the stability prob­ lem Let x(t) be a diffusion process starting at some point x0 within an open set V with boundary S = ∂V . If the noise level is low say parametrized by the parameter E, then the diffusion equation can be expressed as √ dx(t) = f (x(t))dt + Eg(x(t))dB(t) Let PE (x) denote the probability of a sample path of the process that starts at x0 end ends at the stop time τE = t at which the process first hits the boundary S. Then, from basic LDP, we know that the approximate probability of such a path is given by exp(−E−1 infy∈S V (x0 , y, t)) where V (x, t) = infu:dx(s)/ds=f (x(s))+g(x(s))u(s),0≤s≤t,x(0)=x0

f

t

u(s)2 ds/2 0

4 12CHAPTERStochastic Processes in Classical Quantum Physics and Engineering 1. STABILITY THEORY OF and NONLINEAR SYSTEMS BASED ON TAYLO and V (x0 , y, t) = infx:x(0)=x0 ,x(t)=y V (x0 , x, t) The greater the probability of the process hitting S, the lesser the mean value of the hitting time. This is valid with an inverse proportion. Thus, we can say that τE ≈ exp(V /E), E → 0 where V = infy∈S,t≥0 V (x0 , y, t) More precisely, we have the LDP limE→0 P (exp((V − δ)/E) < τE < exp((V + δ)/E)) = 1 Note that τE = exp(V /E) for a given path x implies that the probability of x PE (x) = exp(−V /E). This is because of the relative frequency interpretation of probability given by Von Mises or equivalently because of ergodic theory: If the path hits the boundary N (E) times in a total of N0 trials, with the time taken to hit being τE for each path, then the probability of hitting the boundary is N (E)/N . However, since it takes a time τE for each trial hit, it follows that the number of hits in time T is T /τE . Thus if each trial run in the set of N trials is of duration T , then the probability of hit is PE = N (E)/N = T /(N.τE ) and if we assume that that T, N are fixed, or equivalently that T, N → ∞ with N/T → to a finite nonzero constant, then we deduce from the above, τE ≈ exp(V /E) in the sense that E.log(τE ) converges to V as E → 0.

1.3 LDP for Yang­Mills gauge fields and LDP for the quantized metric tensor of space­ time Let L(qab , qab , N, N a ) denote the ADM Lagrangian of space­time. Here, N defines the diffeomorphism constraint, N the Hamiltonian constraint. The canonical momenta are P ab = ∂L/∂qab a

The Hamiltonian is then H = P ab qab − L

Stochastic Processes in Classical and Quantum PhysicsAND and Engineering 5 1.3. LDP FOR YANG­MILLS GAUGE FIELDS LDP FOR THE QUANTIZED MET 1

Note that the time derivatives N ' , N a of N and N a do not appear in L. QGTR is therefore described by a constrained Hamiltonian with the constraints being P N = ∂L/∂N ' = 0, leading to the equation of motion H = ∂L/∂N = 0 which is the Wheeler­De­Witt equation, the analog of Schrodinger’s equation in particle non­relativistic mechanics. It should be interpreted as H(N )ψ = 0∀N where H(N ) =

N.Hd3 x

The Large deviation problem: If there is a small random perturbation in the initial conditions of the Einstein field equations, then what will be the statistics of the perturbed spatial metric qab after time t ? This is the classical LDP. The quantum LDP is that if there is a small random perturbation in the initial wave function of a system and also small random Hamiltonian perturbations as well as small random time varying perturbations in the potentials appearing in the Hamiltonian, then what will be the statistics of quantum averaged observables ? The Yang­Mills Lagrangian in curved space­time: Let Aµ = Aaµ τa denote the Lie algebra valued Yang­Mills connection potential field. Let Dµ = ∂µ + Γµ denote the spinor gravitational covariant derivative. Then the Yang­Mills field tensor is given by Fµν = Dµ Aν − Dν Aµ + [Aµ , Aν ] The action functional of the Yang­Mills field in the background gravitational field is given by SY M (A) =

√ g µα g νβ T r(Fµν Fαβ ) −gd4 x

Problem: Derive the canonical momentum fields for this curved space­time Yang­Mills Lagrangian and hence obtain a formula for the canonical Hamilto­ nian density.

6

Stochastic Processes in Classical and Quantum Physics and Engineering 14CHAPTER 1. STABILITY THEORY OF NONLINEAR SYSTEMS BASED ON TAYLO

1.4

String in background fields

The Lagrangian is µ ν L(X, X,α , α = 1, 2) = (1/2)gµν (X)hαβ X,α X,β µ ν +Φ(X) + Hµν (X)Eαβ X,α X,β

where a conformal transformation has been applied to bring the string world sheet metric hαβ to the canonical form diag[1, −1]. Here, Φ(X) is a scalar field. The string action functional is J S[X] = Ldτ dσ The string field equations

µ ∂α ∂L/∂X,α − ∂L/∂X µ = 0

are ν ν ∂α (gµν (X)hαβ X,β )+Eαβ (Hµν (X)X,β ),α σ ρ σ ρ +Φ,µ (X)+Hρσ,µ (X)Eab X,a X,b = (1/2)gρσ,µ (X)hab X,a X,b

or equivalently, ν ν ν ν ρ ρ X,b gµν,ρ (X)hab X,a X,b + Eab Hµν (X)X,ab + gµν (X)hab X,ab + Eab Hµν,ρ (X)X,a σ ρ σ ρ + Φ,µ (X) + Hρσ,µ (X)Eab X,a X,b = (1/2)gρσ,µ (X)hab X,a X,b

These differential equations can be put in the form µ ρ ρ η ab X,ab , X,ab ) = δ.F µ (X ρ , X,a

where δ is a small perturbation parameter and ((η ab )) = diag[1, −1] The prob­ lem is to derive the correction to string propagator caused by the nonlinear perturbation term. The propagator is Dµν (τ, σ|τ 1 , σ 1 ) =< 0|T (X µ (τ, σ).X ν (τ 1 , σ 1 ))|0 >

and we see that it satisfies the following fundamental propagator equations: let

x = (τ, σ), x1 = (τ 1 , σ 1 ), ∂0 = ∂τ , ∂1 = ∂σ

Then,

1

∂0 Dµν (x|x1 ) = δ(x0 − x 0 ) < [X µ (x), X ν (x1 )] > µ + < T (X,0 (x).X ν (x1 )) > µ (x).X ν (x1 ) > =< T (X,0

and likewise 1

µ ∂02 Dµν (x|x1 ) = δ(x0 − x 0 ) < [X,0 (x), X ν (x1 )] >

Stochastic Processes in Classical and Quantum Physics and Engineering 1.4. STRING IN BACKGROUND FIELDS

15

µ + < T (X,00 (x).X ν (x� )) > �

µ = δ(x0 − x 0 ) < [X,0 (x), X ν (x� )] >

+∂12 Dµν (x|x� ) + δ. < T (F µ (X, X,a X,b )(x).X ν (x� )) > or equivalently

η ab ∂a ∂b D(x|x� ) = δ. < T (F µ (X, X,a X,b )(x).X ν (x� )) > �

µ (x), X ν (x� )] > − − −(1) +δ(x0 − x 0 ) < [X,0

Now consider the canonical momenta µ ν = gµν (X(x))X,0 (x) Pµ (x) = ∂L/∂X,0 ν (x) −2Hµν (X(x))X,1

Using the canonical equal time commutation relations [X µ (τ, σ), X ν (τ, σ � )] = 0 so that

ν (τ, σ � )] = 0 [X µ (τ, σ), X,1

and

[X µ (τ, σ), Pν (τ, σ � )] = iδµν δ(σ − σ � )

we therefore deduce that ρ [X µ (τ, σ), gνρ (X(τ, σ � ))X,0 (τ, σ � )] = iδµν δ(σ − σ � )

or equivalently, ν [X µ (τ, σ), X,0 (τ, σ � )] = ig µν (X(τ, σ))δ(σ − σ � )

and therefore (1) can be expressed as �Dµν (x|x� ) = η ab ∂a ∂b Dµν (x|x� ) = δ. < T (F µ (X, X,a X,b )(x).X ν (x� )) >

−i < g µν (X(x)) > δ 2 (x − x� ) − − − (2)

Now suppose we interpret the quantum string field X µ as being the sum of a

large classical component X0µ (τ, σ) and a small quantum fluctuating component δX µ (τ, σ), then we can express the above exact relation as Dµν (x|x� ) = X0µ (x)X0ν (x� )+ < T (δX µ (x).δX ν (x� )) > and further we have approximately upto first order of smallness µ (X0 , X0� , X0�� )δX ρ F µ (X, X � , X �� ) = F µ (X0 , X0,a , X0,b )+F,1ρ ρ µ µ ρ +F,2ρa (X, X � , X �� )δX,a +F,3ρab (X, X � , X �� )δX,ab

7

8

Stochastic Processes in Classical and Quantum Physics and Engineering 16CHAPTER 1. STABILITY THEORY OF NONLINEAR SYSTEMS BASED ON TAYLO and hence upto second orders of smallness, (2) becomes D(X0µ (x)X0ν (x' )) + D < T (δX µ (x).δX ν (x' )) > = δF µ (X0 , X0,a , X0,b )(x)X0ν (x' )

µ (X0 , X0' , X0'' )(x) < T (δX ρ (x)δX ν (x' )) >

+δ.F,1ρ µ ρ +δ.F,2ρa (X, X ' , X '' )(x) < T (δX,a (x).δX ν (x' )) > µ ρ +δ.F,3ρab (X, X ' , X '' )(x) < T (δX,ab (x).δX ν (x' )) >

−i < g µν (X(x)) > δ 2 (x − x' ) To proceed further, we require the dynamics of the quantum fluctuation part δX µ . To get this, we apply first order perturbation theory to the system µ ρ ρ η ab X,ab = δ.F µ (X ρ , X,a , X,ab )

to get

µ η ab X0,ab =0 µ = δ.F µ (X0 , X0' , X0'' ) η ab δX,ab

and solve this to express δX µ as the sum of a classical part (ie a particular solution to the inhomogeneous equation) and a quantum part (ie the general solution to the homogeneous equation). The quantum component is 2 δXqµ (τ, σ) = −i (aµ (n)/n)exp(in(τ − σ)) n=0

where [aµ (n), aν (m)] = nη µν δ[n + m] Note that in particular, DX0µ (x) = 0 and hence the approximate corrected propagator equation becomes D < T (δX µ (x).δX ν (x' )) > = δF µ (X0 , X0,a , X0,b )(x)X0ν (x' )

µ (X0 , X0' , X0'' )(x) < T (δX ρ (x)δX ν (x' )) >

+δ.F,1ρ µ ρ +δ.F,2ρa (X, X ' , X '' )(x) < T (δX,a (x).δX ν (x' )) > µ ρ +δ.F,3ρab (X, X ' , X '' )(x) < T (δX,ab (x).δX ν (x' )) >

−ig µν (X0 (x))δ 2 (x − x' ) provided that we neglect the infinite term gµν,αβ (X0 ) < δX α (x)δX β (x) >

Stochastic Processes in Classical and Quantum Physics and Engineering 1.4. STRING IN BACKGROUND FIELDS

17

if we do not attach the perturbation tag δ to the F ­term, ie, we do not regard F as small, then the unperturbed Bosonic string X0 satisfies DX0µ = F µ (X0 , X0' , X0'' ) and then the corrected propagator Dµν satisfies the approximate equation DDµν (x|x' ) = µ +δ.F,1ρ (X0 , X0' , X0'' )(x) < T (δX ρ (x)δX ν (x' )) >0 µ ρ +δ.F,2ρa (X, X ' , X '' )(x) < T (δX,a (x).δX ν (x' )) >0 ρ µ +δ.F,3ρab (X, X ' , X '' )(x) < T (δX,ab (x).δX ν (x' )) >0

−iη µν δ 2 (x − x' ) provided that we assume that the metric is a very small perturbation of that of Minkowski flat space­time. Here, < . >0 denotes the unperturbed quantum average, ie, the quantum average in the absence of the external field Hµν and also in the absence of metric perturbations of flat space­time. We write D0µν (x|x' ) =< T (δX µ (x).δX ν (x' )) >0 Now observe that in the absence of such perturbations, L (aµ (n)/n)exp(in(τ − σ)) δX µ (x) = −i n=0

so − −



L

n>0

L

n>0

L

n>0



< T (δX µ (x).δX ν (x' )) >0 = (θ(τ − τ ' )/n)exp(−in(τ − τ ' − σ + σ ' ))

(θ(τ ' − τ )/n).exp(−in(τ ' − τ − σ ' + σ)) µ < T (δX,0 (x).δX ν (x' )) >0 =

(θ(τ − τ ' )/n)(−in)exp(−in(τ − τ ' − σ + σ ' ))

L

n>0

(θ(τ ' − τ )/n).(in)exp(−in(τ ' − τ − σ ' + σ))

=i

L

n>0

−i

L

n>0

(θ(τ − τ ' )exp(−in(τ − τ ' − σ + σ ' ))

(θ(τ ' − τ )exp(−in(τ ' − τ − σ ' + σ)) µ < T (δX,1 (x).δX ν (x' )) >0 =

9

10

Stochastic Processes in Classical and Quantum Physics and Engineering 18CHAPTER 1. STABILITY THEORY OF NONLINEAR SYSTEMS BASED ON TAYLO − −



(θ(τ − τ � )/n)(in)exp(−in(τ − τ � − σ + σ � ))

n>0



n>0

(θ(τ � − τ )/n).(−in)exp(−in(τ � − τ − σ � + σ))

= −i +i



n>0



n>0

(θ(τ − τ � )exp(−in(τ − τ � − σ + σ � ))

(θ(τ � − τ )exp(−in(τ � − τ − σ � + σ))

µ (x).δX ν (x� )) >0 = − < T (δX,0

Note that we have the alternate expression µ < T (δX,1 (x).δX ν (x� )) >0 =

∂1 D0µν (x|x� ) because differentiation w.r.t the spatial variable σ commutes with the time or­ dering operation T . Further, µ < T (δX,00 (x).δX ν (x� )) >0 µ =< T (δX,11 (x).δX ν (x� )) >0

= ∂12 D0µν (x|x� ) µ < T (δX,01 (x).δX ν (x� )) >0 µ = ∂1 < T (δX,0 (x).δX ν (x� )) >0 � = ∂1 [i (θ(τ − τ � )exp(−in(τ − τ � − σ + σ � )) n>0

−i i



n>0



n>0

−i

(θ(τ � − τ )exp(−in(τ � − τ − σ � + σ))] =

(θ(τ − τ � )(in)exp(−in(τ − τ � − σ + σ � ))



(θ(τ � − τ )(−in)exp(−in(τ � − τ − σ � + σ)) =



(θ(τ − τ � )n.exp(−in(τ − τ � − σ + σ � ))

n>0





n>0



n>0

(θ(τ � − τ )n.exp(−in(τ � − τ − σ � + σ))

Chapter 2 Chapter 2

String Theory and Large String theory and Deviations large deviations 2.1 Evaluating quantum string amplitudes using functional integration Assume that external lines are attached to a given string at space­time points (τr , σr ), r = 1, 2, ..., M . Note that the spatial points σr are fixed while the interaction times τr vary. Let Pr (σ) denote the momentum of the rth external line at σ. At the point σ = σr where it is attached, this momentum is specified as Pr (σr ). Thus noting that X(τ, σ) = −i

L

n=0

a(n)exp(in(τ − σ))/n

we have Pr (σr ) = X,τ (σr ) =

L

a(n)exp(−inσr )

n=0

This equation constrains the values of a(n). We thus define the numbers Prn by L Prn exp(−inσr ) − − − (1) Pr (σr ) = n

and the numbers {Prn } characterize the momentum of the external line attached at the point σr . Note that for reality of the momentum, we must have Prn = ; s are real and therefore Pr,−n if we assume that the Prn Pr (σr ) = 2

L

Prn cos(nσr )

n>0

19

11

12

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 2. STRING THEORY AND LARGE DEVIATIONS

20

Note that Pr (σ) is the momentum field corresponding to the rth external line ; s are determined by the equation and this is given to us. Thus, the Prn L Pr (σ) = 2 Prn cos(nσ) n>0

Equivalently, we fix the numbers {Prn }r,n corresponding to the momenta carried by the external lines and then obtain the momentum Pr (σr ) of the line attached at σr using (1). The large deviation problem: Express the Hamiltonian density of the string having Lagrangian density (1/2)∂α X µ ∂ α Xµ in terms of X µ , Pµ = ∂0 X µ . Show that it is given by H0 = (1/2)P µ Pµ + (1/2)∂1 X µ ∂1 Xµ In the presence of a vector potential field Aµ (σ), the charged string field has the Hamiltonian density H with Pµ replaced by Pµ + eAµ . Write down the string field equations for this Hamiltonian and assuming that the vector potential is a weak Gaussian field, evaluate the rate function of the string field X µ . LDP in general relativistic fluid dynamics.

2.2

LDP for Poisson fields

Small random perturbations in the metric field having a statistical model of the sum of a Gaussian and a Poisson field affect the fluid velocity field. Let X1 (x), x ∈ Rn be a zero mean Gaussian random field with correlation R(x, y) = E(X1 (x)X1 (y)) and let N (x), x ∈ Rn be a Poisson random field with mean E(N (dx)) = λ(x)dx. Consider the family of random Poisson measure NE (E) = Ea N (E−1 E) for any Borel subset E of Rn . We have J E[exp( f (x)N (dx))] = exp( Thus,

J

Rn

Rn

λ(x)(exp(f (x)) − 1)dx)

ME (f ) = Eexp( = exp(

J

a

J

f (x)NE (dx))

E f (x)N (dx/E))) = exp(

= exp( Thus, log(ME (E

J

−a

J

Ea f (Ex)N (dx))

λ(x)(exp(Ea f (Ex)) − 1)dx)

f )) =

J

λ(x)(exp(f (Ex)) − 1)dx

Stochastic Processes in Classical and Quantum Physics and Engineering 13 2.3. LDP FOR STRINGS PERTURBED BY MIXTURE OF GAUSSIAN AND POISSON N = �−n Then,

assuming that



λ(�−1 x)(exp(f (x)) − 1)dx

lim�→0 �n .log(M� (�−a f )) � = λ(∞) (exp(f (x)) − 1)dx λ(∞) = lim�→0 λ(x/�)

exists. More generally, suppose we assume that this limit exists and depends only on the direction of the vector x ∈ Rn so that we can write λ(∞, x ˆ) = lim�→0 λ(x/�) then we get for the Gartner­Ellis limiting logarithmic moment generating func­ tion lim�→0 �n .log(M� (�−a f )) � ¯ ) = λ(∞, x ˆ)(exp(f (x)) − 1)dx = Λ(f

2.3 LDP for strings perturbed by mixture of Gaussian and Poisson noise Consider the string field equations √ �X µ (σ) = fk µ (σ) �W k (τ ) + gkµ �dNk (τ )/dτ where W k are independent white Gaussian noise processes and Nk are indepen­ dent Poisson processes. Assume that the rate of Nk is λk /�. Calculate then the LDP rate functional of the string process X µ . Consider now a superstring with Lagrangian density ¯ α ∂α ψ L = (1/2)∂α X µ ∂ α Xµ + ψρ Add to this an interaction Lagrangian µ ν X,β �αβ Bµν (X)X,α

an another interaction Lagrangian ¯ αψ Jα (X)ψρ Assume that Bµν (X) and Jα (X) are small Gaussian fields. Calculate the joint rate function of the Bosonic and Fermionic string fields X µ , ψ.

Chapter 3 Chapter 3

Ordinary, Partial and

ordinary, partial and Stochastic Differential

stochastic differential Equation Models for equation forQuantum Biologicalmodels Systems, biological systems,quantum Gravity, Hawking Radiation, gravity, Hawking Quantum Fluids,radiation, Statistical quantum fluids,statistical Signal Processing, Group signal processing, group Representations, Large representations, large Deviations in Statistical deviations in statistical Physics physics 3.1 A lecture on biological aspects of virus at­ tacking the immune system We discuss here a mathematical model based on elementary fluid dynamics and quantum mechanics, how a virus attacks the immune system and how antibodies generated by the immune system of the body and the vaccination can wage a counter attack against this virus thereby saving the life. According to recent 23

15

16 24CHAPTERStochastic ProcessesPARTIAL in Classical andSTOCHASTIC Quantum Physics and Engineering EQUATI 3. ORDINARY, AND DIFFERENTIAL researches, a virus consists of a spherical body covered with dots/rods/globs of protein. When the virus enters into the body, as such it is harmless but becomes dangerous when it lodges itself in the vicinity of a cell in the lung where the protein blobs are released. The protein blobs thus penetrate into the cell fluid where the ribosome gives the command to replicate in the same way that normal cell replication occurs. Thus a single protein blob is replicated several times and hence the entire cell is filled with protein and thus gets destroyed completely. When this happens to every cell in the lung, breathing is hampered and death results. However, if our immune system is strong enough, then the blood generates antibodies the moment it recognizes that a stranger in the form of protein blobs has entered into the cell. These antibodies fight against the protein blobs and in fact the antibodies lodge themselves on the boundary of the cell body at precisely those points where the protein from the virus has penetrated thus preventing further protein blobs from entering into the cell. Whenever this fight between the antibodies and the virus goes on, the body develops fever because an increase in the body temperature disables the virus and enables the antibodies to fight against the proteins generated by the virus. If however the immune system is not powerful enough, sufficient concentration of antibodies is not generated and the viral protein dominates over the antibodies in their fight finally resulting in death of the body. The Covid vaccine is prepared by generating a sample virus in the laboratory within a fluid like that present within the body and hence antibodies are produced within the fluid by this artifical process in the same way that the small­pox vaccine was invented by Edward Jenner by injecting the small­pox virus into a cow to generate the required antibodies. Now we discuss the mathematics of the virus, protein, cells and antibodies. Assume that the virus particles have masses m1 , ..., mN and charges e1 , ..., eN . The blood fluid has a velocity field v(t, r) whose oxygenated component cir­ culates to all the organs of the body and whose de­oxygenated component is directed toward the lungs. This circulation is caused by the pumping of the heart. Assume that the heart pumps blood with a generation rate g(t, r) having units of density per unit time or equivalently mass per unit volume per unit time. Assume also that the force of this pumping is represented by a force density field f (t, r). Also there may be present external electromagnetic fields E(t, r), B(t, r) within the blood field which interact with the current field within the blood caused by non­zero conductivity σ of the blood. This current density is according to Ohm’s law given by J(t, r) = σ(E(t, r) + v(t, r) × B(t, r)) The relevant MHD equations and the continuity equation are given by ρ(∂t v + (v, \)v) = −\p + η\2 v + f + σ(E + v × B) × B ∂t ρ + div(ρ.v) = g Sometimes a part of the electromagnetic field within the blood is of internal origin, ie, generated by the motion of this conducting blood fluid. Thus, if we

Stochastic Processes in Classical and Quantum Physics and Engineering 17 3.1. A LECTURE ON BIOLOGICAL ASPECTS OF VIRUS ATTACKING THE IMMUNE write Eext , Bext for the externally applied electromagnetic field and Eint , Bint for the internally generated field, then we have Eint (t, r) = −�φint − ∂t Aint , Bint = � × Aint where Aint (t, r) = (µ/4π)



J(t − |r − r� |/c, r� )d3 r� /|r − r� |,

Φint (t, r) = −c2



t

divAint dt 0

The total electromagnetic field within the blood fluid is the sum of these two components: E = Eext + Eint , B = Bext + Bint We have control only over Eext , Bext and can use these control fields to ma­ nipulate the blood fluid velocity appropriately so that the virus does not enter into the lung region. Let rk (t) denote the position of the k th virus particle. It satisfies the obvious differential equation drk (t)/dt = v(t, rk (t)) with initial conditions rk (0) depending on where on the surface of the body the virus entered. If the virus is modeled as a quantum particle with nucleus at the classical position rk (t) at time t, then the protein molecules that are stuck to this virus surface have relative positions (rkj , j = 1, 2, ..., M } and their joint quantum mechanical wave function ψk (t, rkj , j = 1, 2, ..., M ) will satisfy the Schrodinger equation ih∂t ψk (t, rkj , j = 1, 2, ..., M ) =

M � (−h2 /2mkj )�2rkj ψk (t, rkj , j = 1, 2, ..., M ) j=1

+

M � j=1

ek Vk (t, rkj − rk (t))ψk (t, rkj , j = 1, 2, ..., M )

where Vk (t, r) is the binding electrostatic potential energy of the viral nucleus. There may also be other vector potential and scalar potential terms in the above Schrodinger equation, these potentials coming from the prevalent electromag­ netic field. Some times it is more accurate to model the entire swarm of viruses as a quantum field rather than as individual quantum particles but the result­ ing dynamics becomes more complicated as it involves elaborate concepts from quantum field theory. Suppose we have solved for the joint wave functions ψk of the protein blobs on the k th virus. Then the average position of the lth blob on the k th viral surface is given by � 3 < rkl > (t) = rkl ψk (t, rkj , j = 1, 2, ..., M )ΠM j=1 d rkj , 1 ≤ k ≤ M

18 Stochastic Processes in Classical and Quantum Physics and Engineering 26CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI We consider the time at which these average positions of the lth protein blob on the k th viral surface hits the boundary S = ∂V of a particular cell. This time is given by τ (k, l) = inf {t ≥ 0 :< rkl > (t) ∈ S} At this time, replication of this protein blob due to cell ribosome action begins. Let λ(n)dt denote the probability that in the time interval [t, t + dt], n protein blobs will be generated by the ribosome action on a single blob. Let Nkl (t) denote the number of blobs due to the (k, l)th one present at time t. Then for Pkln (t) = P r(Nkl (t) = n), n = 1, 2, ... we have the Chapman­Kolmogorov Markov chain equations � � dPkln (t)/dt = Pklr (t)λ(kl, r, n) − Pkln (t) Pkln (t) 1≤rs i−1 (∂θr − ∂θs ) Then,

(Du)(0) = Πr>s (mr − ms )

L

s∈ W

χs (0) = |W |Πr>s (mr − ms )

on the one hand and on the other, using the fact that L Δ(H) = E(s).χsn,n−1,...,1 (H) s∈W

where χn−1,...,0 (H) = exp(inθ1 + ... + i2θ2 + iθ1 ) we get and hence where

DΔ(0) = |W |Πn≥r>s≥1 (r − s) (Du)(0) = χ(0)(DΔ(0)) = d(χ).|W |.Πr>s (r − s) d(χ) = χ(0)

is the dimension of the irreducible representation. Equating these two expres­ sions, we obtain Weyl’s dimension formula d(χ) = d(m1 , ..., mn ) =

Πr>s (mr − ms ) Πr>s (r − s)

48

Stochastic Processes in Classical and Quantum Physics and Engineering 56CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI This argument works for the compact group SU (n) where a dominant integral weight is parametrized by an n tuple of non­negative integers (m1 ≥ m2 ≥ ... ≥ m1 ≥ 0). In the general case of a compact semisimple Lie group G, we use Weyl’s character in the form � �(s)exp(s.(λ + ρ)(H)) χ(H) = s∈W Δ(H) � Δ(H) = �(s).exp(s.ρ(H)) = Πα∈Δ+ (exp(α(H)/2) − exp(−α(H)/2)) s∈W

Then define

D = Πα∈Δ+ ∂(Hα ) We get DΔ(0) = |W |.Πα∈Δ+ ρ(Hα ) Thus, on the one hand, D(χΔ)(0) = χ(0)DΔ(0) = d(χ).|W |.Πα∈Δ+ ρ(Hα ) and on the other, D(Δ.χ)(0) = |W |Πα∈Δ+ (λ + ρ)(Hα ) Equating these two expressions gives us Weyl’s dimension formula for the irre­ ducible representation of any compact semisimple Lie group corresponding to the dominant integral weight λ in terms of the roots: d(χ) = d(λ) =

=

Πα∈Δ+ (λ + ρ)(Hα ) Πα∈Δ+ ρ(Hα )

Πα∈Δ+ < λ + ρ, α > Πα∈Δ+ < ρ, α >

3.13 Approximate solution to the wave equation in the vicinity of the event horizon of a Schwarchild black­hole This calculation would enable us to expand the solution to the wave equation in the vicinity of a blackhole in terms of ”normal modes” or eigenfunctions of the Helmholtz operator in Schwarzchild space­time. The coefficients in this expansion will be identified with the particle creation and annihilation oper­ ators which would therefore enable us to obtain a formula for the blackhole entropy assuming that it is in a thermal state. It would enable us to obtain the

Stochastic Processes in Classical and Quantum Physics and Engineering 49 3.13. APPROXIMATE SOLUTION TO THE WAVE EQUATION IN THE VICINITY OF blackhole temperature at which it starts radiating particles in the Rindler space­ time metric which is an approximate transformation of the Schwarzchild metric. By using the relationship between the time variable in the Rindler space­time and the time variable in the Schwarzchild space­time, we can thus obtain the characteristic frequencies of oscillation in the solution of the wave equation in Schwarzchild space­time in terms of the the gravitational constant G and the mass M of the blackhole. These characteristic frequencies can be used to calcu­ late the critical Hawking temperature at which the blackhole radiates in terms of G, M using the formula for the entropy of a Gaussian state. Let g00 = α(r) = 1 − 2GM/c2 r, g11 = α(r)−1 /c2 = c−2 (1 − 2GM/rc2 )−1 , g22 = −r2 /c2 , g33 = −r2 sin2 (θ)/c2 The Schwarzchild metric is then dτ 2 = gµν dxµ dxν , x0 = t, x1 = r, x2 = θ, x3 = φ or equivalently, dτ 2 = α(r)dt2 − α(r)−1 dr2 /c2 − r2 dθ2 /c2 − r2 sin2 (θ)dφ2 /c2 The Klein­Gordin wave equation in Schwarzchild space­time is √ √ (g µν ψ,ν −g),µ + µ2 −gψ = 0 where µ = mc/h with h denoting Planck’s constant divided by 2π. Now, √ −g = (g00 g11 g22 g33 )1/2 = r2 sin(θ)/c3 so the above wave equation expands to give α(r)−1 r2 sin(θ)ψ,tt − ((c2 /α(r))r2 sin(θ)ψ,r ),r −(c2 /r2 )r2 (sin(θ)ψ,θ )),θ − (c2 /r2 sin2 (θ))(r2 sin(θ)ψ,φφ + µ2 r2 sin(θ)ψ = 0 This equation can be rearranged as (1/c2 )∂t2 ψ = (α(r)/r2 )((r2 /α(r))ψ,r ),r +(α(r)/r2 )((sin(θ))−1 (sin(θ)ψ,θ ),θ + sin(θ)−2 ψ,φφ ) − µ2 α(r)ψ This wave equation can be expressed in quantum mechanical notation as (1/c2 )∂t2 ψ = (α(r)/r2 )((r2 /α(r))ψ,r ),r −(α(r)/r2 )L2 ψ − µ2 α(r)ψ

5058CHAPTERStochastic ProcessesPARTIAL in Classical andSTOCHASTIC Quantum Physics and Engineering EQUATI 3. ORDINARY, AND DIFFERENTIAL where L2 = −(

1 ∂ ∂ 1 ∂2 sin(θ) + ) sin(θ) ∂θ ∂θ sin2 (θ) ∂φ2

is the square of the angular momentum operator in non­relativistic quantum mechanics. Thus, we obtain using separation of variables, the solution � fnl (t, r)Ylm (θ, φ) ψ(t, r, θ, φ) = nlm

where fnl (t, r) satisfies the radial wave equation (1/c2 )∂t2 fnl (t, r) = (α(r)/r2 )((r2 /α(r))fnl, r ),r −((α(r)/r2 )l(l + 1) + µ2 α(r))fnl (t, r) where l = 0, 1, 2, .... We write fnl (t, r) = exp(iωt)gnl (r) Then the above equation becomes � (−ω 2 /c2 )gnl (r) = (α(r)/r2 )((r2 /α(r))gnl (r))�

−((α(r)/r2 )l(l + 1) + µ2 α(r))gnl (r) For each value of l, this defines an eigenvalue equation and on applying the boundary conditions, the possible frequencies ω assume discrete values say ω(n, l), n = 1, 2, .... These frequenices will depend upon GM and µ since α(r) depends on GM . Let gnl (r) denote the corresponding set of orthonormal eigen­ functions w.r.t the volume measure √ c3 −gd3 x = r2 sin(θ)dr.dθ.dφ The general solution can thus be expanded as ψ(t, r, θ, φ) =



[a(n, l, m)gnl (r)Ylm (θ, φ)exp(iω(nl)t)+a(n, l, m)∗ g¯nl (r)Y¯lm (θ, φ)exp(−iω(nl)t)]

nlm

where n = 1, 2, ..., l = 0, 1, 2, ..., m = −l, −l + 1, ..., l − 1, l. The Hamiltonian operator of the field can be expressed as � hω(nl)a(n, l, m)∗ a(n, l, m) H= nlm

and the bosonic commutation relations are

[a(nlm), a(n� l� m� )∗ ] = hω(nl)δ(nn� )δ(ll� )δ(mm� )

51 59

Stochastic Processes in Classical and Quantum Physics and Engineering 3.14. MORE ON GROUP THEORETIC IMAGE PROCESSING An alternate analysis of Hawking temperature. The equation of motion of a photon, ie null radial geodesic is given by α(r)dt2 − α(r)−1 dr2 = 0 or dr/dt = ±α(r) = ±(1 − 2m/r), m = GM/c2 The solution is



rdr/(r − 2m) = ±t + c

so that 2m.r + 2mln(r − 2m) = ±t + c The change in time t when the particle goes from 2m+0 to 2m−0 is purely imag­ inary an is given by δt = 2mπi since ln(r − 2m) = ln|r − 2m| + i.arg(r − 2m) with arg(r − 2m) = 0 for r > 2m and = π for r < 2m. Thus, a photon wave oscillating at a frequency ω has its amplitude changed from exp(iωt) to exp(iω(t + 2mπi)) = exp(iωt − 2mπω). The corresponding ratio of the proba­ bility when it goes from just outside to just inside the critical radius is obtained by taking the modulus square of the probability amplitude and is given by P = exp(−4mπω). The energy of the photon within the blackhole is hω (where h is Planck’s constant divided by 2π) and hence by Gibbsian theory in equilibrium statistical mechanics, its temperature T must be given by P = exp(−hω/kT ). Therefore hω/kT = 4πmω or equivalently, T = 4πmk/h = 8πGM k/hc2 This is known as the Hawking temperature and it gives us the critical temper­ ature of the interior of the blackhole at photons can tunnel through the critical radius, ie the temperature at which the blackhole can radiate out photons.

3.14

More on group theoretic image processing

Problem: Compute the Haar measure on SL(2, R) in terms of the Iwasawa decomposition x = u(θ)a(t)n(s) where u(θ) =



cos(θ) −sin(θ)

sin(θ) cos(θ)



= exp(θ(X − Y )),

a(t) = exp(tH) = diag(et , e−t ), n(s) = exp(sX) =



1 0

s 1



52 Stochastic Processes in Classical and Quantum Physics and Engineering 60CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI Solution: We first express the left invariant vector fields corresponding to H, X, Y as linear combinations of ∂/∂t, ∂/∂θ, ∂/∂s and then the reciprocial of the associated determinant gives the Haar density. Specifically, writing H ∗ = f11 (θ, t, s)∂/∂θ + f12 (θ, t, s)∂/∂t + f13 (θ, t, s)∂/∂s, X ∗ = f21 (θ, t, s)∂/∂θ + f22 (θ, t, s)∂/∂t + f23 (θ, t, s)∂/∂s, Y ∗ = f31 (θ, t, s)∂/∂θ + f32 (θ, t, s)∂/∂t + f33 (θ, t, s)∂/∂s, for the left invariant vector fields corresponding to H, X, Y respectively, it would follow that the Haar measure on SL(2, R) is given by F (θ, t, s)dθdtds, F = |det((fij ))| or more precisely, the Haar integral of a function φ(x) on SL(2, R) would be given by � � φ(x)dx = φ(u(θ)a(t)n(s))F (θ, t, s)dθdtds SL(2,R)

We have d/ds(u(θ)a(t)n(s)) = u(θ)a(t)n(s)X so

X ∗ = ∂/∂s

Then, d/dt(u(θ)a(t)n(s)) = u(θ)a(t)Hn(s) = u(θ)a(t)n(s)n(s)−1 Hn(s) Now, n(s)−1 Hn(s) = exp(−s.ad(X))(H) = H − s[X, H] = H − 2sX So

H ∗ − 2sX ∗ = ∂/∂t

Finally, d/dθ(u(θ)a(t)n(s)) = u(θ)(X−Y )a(t)n(s) = u(θ)a(t)n(s)(n(s)−1 a(t)−1 (X−Y )a(t)n(s)) and a(t)−1 (X − Y )a(t) = exp(−t.ad(H))(X − Y ) = exp(−2t)X − exp(2t)Y n(s)−1 a(t)−1 (X − Y )a(t)n(s) = n(s)−1 (exp(−2t)X − exp(2t)Y )n(s) = exp(−2t)X − exp(2t).exp(−s.ad(X))(Y )

exp(−s.ad(X))(Y ) = Y − s[X, Y ] + (s2 /2)[X, [X, Y ]] = Y − sH − s2 X so

exp(−2t)X ∗ − exp(2t)(Y ∗ − sH ∗ − s2 X ∗ ) = ∂/∂θ

Stochastic Processes ClassicalTHEORETIC and Quantum Physics Engineering 3.14. MORE ONinGROUP IMAGEand PROCESSING

6153

Problem: Solve these three equations and thereby express X ∗ , Y ∗ , H ∗ as linear combinations of ∂/∂s, ∂/∂t, ∂/∂θ. Problem: Suppose I(f, g) is the invariant functional for a pair of image fields f, g defined on SL(2, R). If f changes by a small random amount and so does g, then what is the rate functional of the change in I(f, g) ? Suppose that for each irreducible character χ of G, we have constructed the invariant functional ˆ denote the set of all the irreducible Iχ (f, g) for a pair of image fields f, g. Let G characters of G. Consider the Plancherel formula for G: � f (e) = χ(f )dµ(χ) ˆ G

ˆ In other words, we have where µ is the Plancherel measure on G. � δe (g) = χ(g)dµ(χ) ˆ G

Then we can write



Iχ (f, g) =

G×G

ˆ f (x)g(y)χ(yx−1 )dxdy, χ ∈ G

Let ψ(g) be any class function on G. Then, by Plancherel’s formula, we have � ψ(g) = ψ ∗ δe (g) = δe (h−1 g)ψ(h)dh = G



χ(h

=



−1

g).ψ(h)dµ(χ)dh =



ˆ G

ψ ∗ χ(g).dµ(χ)

and hence the invariant Iψ associated with a pair of image fields can be expressed as � Iψ (f, g) = f (x)g(y)ψ(yx−1 )dxdy = � ψ ∗ χ(yx−1 )f (x)g(y)dxdy.dµ(χ) ˆ G×G×G

ψ(h)χ(h−1 yx−1 )f (x)g(y)dxdy.dµ(χ)dh � = ψ(h)Iχ (f, g, h)dh.dµ(χ) ˆ G×G

where we define

Iχ (f, g, h) =



G×G

f (x)g(y)χ(yx−1 h)dxdy, h ∈ G

Note that (f, g) → Iχ (f, g, h) is not an invariant. We define the Fourier trans­ form of the class function ψ by � ˆ ψ(χ) = ψ(h)χ(h−1 )dh G

54 Stochastic Processes in Classical and Quantum Physics and Engineering 62CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI Then, we have δe (g) =



χ(g)dµ(χ),

For any function f (g) on G, not necessarily a class function, � � f (g) = δe (h−1 g)f (h)dh = χ(h−1 g)f (h)dhdµ(χ) Formally, let πχ denote an irreducible representation having character χ. Then, defining � � fˆ(πχ ) = f (g)π(g −1 )dg = f (g −1 )π(g)dg G

G

(Note that we are assuming G to be a unimodular group. In the case of SL(2, R), this anyway true), we have � f (g) = T r(πχ (h−1 g))f (h)dhdµ(χ) =



ˆ G

and hence, Iχ (f, g, h) = =



T r(fˆ(πχ )πχ (g))dµ(χ) �

f (x)g(y)χ(yx−1 h)dxdy G×G

f (x)g(y)T r(πχ (y)πχ (x−1 )πχ (h))dxdy = T r(ˆ g0 (πχ )fˆ(πχ )πχ (h))

where g0 (x) = g(x−1 ), x ∈ G Then, for a class function ψ, � g0 (πχ )fˆ(πχ )πχ (h))dhdµ(χ) Iψ (f, g) = ψ(h)T r(ˆ but, ˆ ψ(χ) =



ψ(g)T r(πχ (g −1 ))dg = ˆ χ )) T r(ψ(π

We have,



ψ(h)πχ (h)dh =



ψ0 (h)πχ (h−1 )dh

= ψˆ0 (πχ )

Stochastic Processes in Classical and Quantum Physics and Engineering 3.14. MORE ON GROUP THEORETIC IMAGE PROCESSING

55 63

We have, on the one hand ψ(x) =

j

ψ(h)χ(h−1 x)dhdµ(χ)

and since ψ is a class function, ∀x, y ∈ G, j ψ(x) = ψ(yxy −1 ) = ψ(h)χ(h−1 yxy −1 )dh.dµ(χ) =

j

= where χ1 (h, x, a) =

j

ψ(h)χ(y −1 h−1 yx)dh.dµ(χ) j

ψ(h)χ1 (h, x, a)dµ(χ)

a(y)χ(y

−1 −1

h

yx)dy =

G

j

a(y)χ(h−1 yxy −1 )dy G

where a(x) is any function on G satisfying, j a(x)dx = 1 G

Remark: SL(2, R) under the adjoint action, acts as a subgroup of the Lorentz transformations in a plane, ie, in the t−y −z space. Indeed, represent the space­ time point (t, x, y , z) by ( ) t + z x + iy Φ(t, x, y , z) = x − iy t − z Then define (t' , x' , y ' , z ' ) by Φ(t' , x' , y ' , z ' ) = A.Φ(t, x, y , z)A∗ where A ∈ SL(2, R) Clearly, by taking determinants, we get �



�2

t2−x2−y −z



2

= t2 − x2 − y 2 − z 2

and then choosing A = u(θ) = exp(θ(X − Y )), a(v) = exp(vH), n(s) = exp(sX) successively gives us the results that t' + z ' = t + z.cos(2θ) + x.sin(2θ), t' − z ' = t − z.cos(2θ) − x.sin(2θ) x' + iy ' = x.cos(2θ) − z.sin(2θ)

5664CHAPTERStochastic ProcessesPARTIAL in Classical andSTOCHASTIC Quantum Physics and Engineering EQUATI 3. ORDINARY, AND DIFFERENTIAL which means that t� = t, x� = x.cos(2θ) − z.sin(2θ), z � = x.sin(2θ) + z.cos(2θ), y � = y for the case A = u(θ), which corresponds to a rotation of the x − z plane around the y axis by an angle 2θ. The case A = a(v) gives x� + iy � = x + iy, t� + z � = e2v (t + z), t� − z � = e−2v (t − z) or equivalently, x� = x, y � = y, t� = t.cosh(2v) + z.sinh(2v), z � = t.sinh(2v) + z.cosh(2v) which corresponds to a Lorentz boost along the z direction with a velocity of u = −tanh(2v). Finally, the case A = n(s) gives t� = t + sx + s2 (t − z)/2, x� = x + s(t − z), z � = z + sx + s2 (t − z), y � = y This corresponds to a Lorentz boost along a definite direction in the xz plane. The direction and velocity of the boost is a function of s. To see what this one parameter group of transformations corresponding to A = n(s) corresponds to, we first evaluate it for infinitesimal s to get t� = t + sx, x� = x + s(t − z), z � = z + sx, y � = y This transformation can be viewed as a composite of two infinitesimal transfor­ mations, the first is x → x − sz, z → z + sx, t → t This transformation corresponds to an infinitesimal rotation of the xz plane around the y axis by an angle s. The second is x → x + st, t → t + sx which corresponds to an infinitesimal Lorentz boost along the x axis by a velocity of s. To see what the general non­infinitesimal case corresponds to, we assume that it corresponds to a boost with speed of u along the unit vector (nx , 0, nz ). Then the corresponding Lorentz transformation equations are (x� , z � ) = γ(u)(xnx + znz − ut)(nx , nz ) + (x, z) − (xnx + znz )(nx , nz ) t� = γ(u)(t − xnx − znz ) The first pair of equation is in component form, x� = γ(xnx + znz − ut)nx + x − (xnx + znz )nx , z � = γ(xnx + znz − ut)nz + z − (xnx + znz )nz To get the appropriate matching, we must therefore have 1 + s2 /2 = γ(u), s = −γ(u)nx , −s2 /2 = γ(u)nz ,

Stochastic Processes ClassicalTHEORETIC and Quantum Physics Engineering 3.14. MORE ONinGROUP IMAGEand PROCESSING

6557

which is impossible. Therefore, this transformation cannot correspond to a single Lorentz boost. However, it can be made to correspond to the composition of a rotation in the xz plane followed by a Lorentz boost in the same plane. Let Lx (u) denote the Lorentz boost along the x axis by a speed of u and let Ry (α) denote a rotation around the y axis by an angle α counterclockwise. Then, we have already seen that the finite transformation corresponding to A = n(s) is given by L(s) = limn→∞ (Lx (−s/n)Ry (−s/n))n = exp(−s(Kx + Xy )) where Kx is the infinitesimal generator of Lorentz boosts along the x axis while Xy is the infinitesimal generator of rotations around the y axis. Specifically, Kx = L�x (0), Xy = Ry� (0) The characters of the discrete series of SL(2, R). Let m be a nonpositive integer. Let em−1 be a highest weight vector of a representation πm of sl(2, R). Then π(X)em−1 = 0, π(H)em−1 = (m − 1)em−1 Define π(Y )r em−1 = em−1−2r , r = 0, 1, 2, ... Then π(H)em−1−2r = π(H)π(Y )r em−1 = ([π(H), π(Y )r ] + π(Y )r π(H))em−1 = (m − 1 − 2r)π(Y )r em = (m − 1 − 2r)em−1−2r , r ≥ 0

Thus, considering the module V of SL(2, R) defined by the span of em−2r , r ≥ 0, we find that � exp((m − 1)t) T r(π(a(t)) = T r(π(exp(tH)) = exp(t(m − 1 − 2r)) = 1 − exp(−2t) r≥0

= Now,

exp(mt) exp(t) − exp(−t)

π(X)em−1−2r = π(X)π(Y )r em−1 = [π(X), π(Y )r ]em−1 = (π(H)π(Y )r−1 + .... + π(Y )r−1 π(H))em−1 = [(−2(r−1)π(Y )r−1 +π(Y )r−1 π(H)−2(r−2)π(Y )r−1 +π(Y )r−1 π(H)+...+π(Y )r−1 π(H)]em−1 = [−r(r − 1) + r(m − 1)]π(Y )r−1 em−1 = r(m − r)em+1−2r Also note that π(Y )em−1−2r = em−3−2r , r ≥ 0 Problem: From these expressions, calculate T r(π(u(θ)) where u(θ) = exp(θ(X− Y )). Show that it equals exp(imθ)/(exp(iθ) − exp(−iθ)). In this way, we are able to determine all the discrete series characters for SL(2, R).

58

Stochastic Processes in Classical and Quantum Physics and Engineering 66CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI

3.15

A problem in quantum information theory

Let K be a quantum operation, ie, a CPTP map. Its action on an operator X is given in terms of the Choi­Kraus representation by � K(X) = Ea XEa∗ a

Define the following inner product on the space of operators: < X, Y >= T r(XY ∗ )

Then, calculate K ∗ relative to this inner product in terms of the Choi­Kraus representation of K. Answer: � � < K(X), Y >= T r(K(X)Y ∗ ) = T r( Ea XEa∗ Y ∗ ) = T r(XEa∗ Y ∗ Ea ) = T r(X.(



a

a

∗ Ea

Y Ea )∗ ) = T r(X.K ∗ (Y )∗ ) =< X, K(∗ (Y ) >

a

where

K ∗ (X) =



Ea∗ XEa

a

Problem: Let A, B be matrices A square and B rectangular of the appropri­ ate size so that � � A B X= ≥0 B∗ I

Then deduce that

A ≥ BB ∗

Solution: For any two column vectors u, v of appropriate size

� � u ∗ ∗ (u , v )x ≥0 v This gives Take

u∗ Au + u∗ Bv + v ∗ B ∗ u + v ∗ v ≥ 0 v = Cu

where C is any matrix of appropriate size. Then, we get u∗ (A + BC + C ∗ B ∗ + C ∗ C)u ≥ 0 this being true for all u, we can write A + BC + C ∗ B ∗ + C ∗ C ≥ 0

for all matrices C of appropriate size. In particular, taking C = −B ∗ gives us A − BB ∗ ≥ 0

Stochastic Processes in Classical and Quantum Physics and Engineering 3.16. HAWKING RADIATION

3.16

59 67

Hawking radiation

Problem: For a spherically symmetric blackhole with a charge Q at its centre, solve the static Einstein­Maxwell equations to obtain the metric. Hence, cal­ culate the equation of motion of a radially moving photon in this metric and determine the complex time taken by the photon to travel from just outside the critical blackhole radius to just inside and denote this imaginary time by iΔ. Note that Δ will be a function of the parameters G, M, Q of the blackhole. Deduce that if T is the Hawking temperature for quantum emission of photons for such a charged blackhole, then at frequency ω, exp(−hω/kT ) = exp(−ωΔ) and hence that the Hawking temperature is given by T = h/kΔ Evaluate this quantity. Let

D(ρ|σ) = T r(ρ.(log(ρ) − log(σ)))

the relative entropy between two states ρ, σ defined on the same Hilbert space. We know that it is non­negative and that if K is a quantum operation, ie, a CPTP map, then D(ρ|sigma) ≥ D(K(ρ)|K(σ)) Consider three subsystems A, B, C and let ρ(ABC) be a joint state on these three subsystems. Let ρm = IB /d(B) the completely mixed state on B. IB is the identity operator on HB and d(B) = dimHB . Let K denote the quantum operation corresponding to partial trace on the B system. Then consider the inequality D(ρ(ABC)|ρm ⊗ ρ(AC)) ≥ D(K(ρ(ABC))|K(ρm ⊗ ρ(AC))) = D(ρ(AC)|ρ(AC)) = 0 Evaluating gives us T r(ρ(ABC)log(ρ(ABC)) − T r(ρ(ABC).(log(ρ(AC)) − log(d(B))) ≥ 0 or equivalently, −H(ABC) + H(AC) + log(d(B)) ≥ 0 Now, let K denote partial tracing on the C­subsystem. Then, D(ρ(ABC)|ρm ⊗ ρ(AC)) ≥ D(K(ρ(ABC))|K(ρm ⊗ ρ(AC))) gives −H(ABC) + H(AC) + log(d(B)) ≥ D(ρ(AB)|Im ⊗ ρ(A))

60 Stochastic Processes in Classical and Quantum Physics and Engineering 68CHAPTER 3. ORDINARY, PARTIAL AND STOCHASTIC DIFFERENTIAL EQUATI = −H(AB) − T r(ρ(AB)(log(ρ(A)) − log(d(B))) = −H(AB) + H(A) + d(B) giving H(AB) + H(AC) − H(A) − H(ABC) ≥ 0 Let the metric of space­time be given in spherical polar coordinates by dτ 2 = A(r)dt2 − B(r)dr2 − C(r)(dθ2 + sin2 (θ)dφ2 ) The equation of the radial null geodesic is given by A(r) − B(r)(dr/dt)2 = 0 or equivalently for an incoming radial geodesic, � dr/dt = − A(r)/B(r) = −F (r) say where

F (r) =

� A(r)/B(r)

Let F (r) have a zero at the critical radius r0 so that we can write F (r) = G(r)(r − r0 ) where G(r0 ) =  0. Then, the time taken for the photon to go from r0 + 0 to r0 − 0 is given by � r0 +0 dr/((r − r0 )G(r)) = G(r0 )−1 lim�↓0 (log(�) − log(−�)) Δ= r0 −0

= −iπ.G(r0 )−1 so that the ratio change in the probability amplitude of a photon pulse of fre­ quency ω is the quantity exp(−ωπ.G(r0 )−1 ) and equating this to the Gibbsian probability exp(−hω/kT ) gives us the Hawking temperature as T = h.G(r0 )/πk

Chapter 4

Chapter 4

Research Research Topics on Topics on Representations represenations, Using the Schrodinger equation to track the time varying probability density of a signal. The Schrodinger equation naturally generates a family of pdf’s as the modulus square of the evolving wave function and by controlling the time varying potential in which the Schrodinger evolution takes place, it is possible to track a time varying pdf, ie, encode this time varying pdf into the form the weights of a neural network. From the stored weights, we can synthesize the time varying pdf of the signal.

4.1 Remarks on induced representations and Haar integrals Let G be a compact group. Let χ be a character of an Abelian subgroup N of G in a semidirect product G = HN . Let U = IndG N χ be the induced representation. Suppose a function f on G is orthogonal to all the matrix elements of U . Then we can express this condition as follows. First the representation space V of U consists of all functions h : G → C for which h(xn) = χ(n)−1 h(x)∀x ∈ G, n ∈ N and U acts on such a function to give U (y)h(x) = h(y −1 x), x, y ∈ G. The condition � f (x)U (x)dx = 0 G

gives



G

f (x)h(x−1 y)dx = 0∀y ∈ G∀h ∈ V

Now for h ∈ V , we have h(xn) = χ(n)−1 h(x), x ∈ G, n ∈ N 69

61

62 70

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

and hence for any representation π of G, � ˆ h(xn)π(x−1 )dx = χ(n)−1 h(π) or

or



ˆ h(x)π(n.x−1 )dx = χ(n)−1 h(π)

ˆ ˆ n∈N π(n)h(π) = χ(n−1 )h(π),

Define Pπ,N =



π(n)dn

N

Assume without loss of generality that π is a unitary representation of G. Then, Pπ,N is the orthogonal projection onto the space of all those vectors w in the representation space of π for which π(n)w = w∀n ∈ N . Then the above equation gives on integrating w.r.t n ∈ N , ˆ Pπ,N h(π) = 0, h ∈ V, χ �= 1 and

ˆ ˆ Pπ,N h(π) = h(π), h ∈ V, χ = 1

on using the Schur orthogonality relations for the characters of N . We therefore have The equation � f (x)h(x−1 y)dx = 0

G

gives us



f (x)h(x−1 y)π(y −1 )dxdy = 0

G

or



or

f (x)h(z)π(z −1 x−1 )dxdz = 0 ˆ ˆ(π) = 0, h ∈ V h(π)f

It also implies



or or

f (x)h(x−1 y)π(y)dxdy = 0 �

f (x)h(z)π(xz)dxdz = 0 ˆ (π)∗ = 0 fˆ(π)∗ h

which is the same as the above. Suppose on the other hand that f satisfies � f (xny)dn = 0, x, y ∈ G − − − (1) N

Stochastic Processes ON in Classical and REPRESENTATIONS Quantum Physics and Engineering 63 4.1. REMARKS INDUCED AND HAAR INTEGRALS71 Then, it follows that � which implies

f (xny)π(y −1 )dydn = 0 G×N



This implies

f (z)π(z −1 xn)dzdn = 0 G×N

fˆ(π)π(x)Pπ,N = 0, x ∈ G

It also implies

which implies

or equivalently,



f (xny)π(x−1 )dxdn = 0



f (z)π(nyz −1 )dzdn = 0

Pπ,N π(y)fˆ(π) = 0, y ∈ G

In particular, (1) implies Pπ,N fˆ(π) = fˆ(π)Pπ,N = 0 for all representations π of G. The condition (1) implies that P π(y)fˆ(π) = 0 where P = Pπ,N for all representations π and hence for all functions h on ˆ ˆ(π) = 0. More precisely, it implies that for all G it also implies that P h(π)f representations π and all functions h on G, ˆ ˆ(π) = 0 − − − (2) Pπ,N h(π)f Now suppose that h1 belongs to the the representation space of U = IndG N χ0 where χ0 is the identity character on N , ie, one dimensional unitary represen­ tation of N equal to unity at all elements of G. Then, it is clear that h1 can be expressed as � h1 (x) =

N

h(xn)dn, x ∈ N

for some function h on G and conversely if h is an arbitrary function on G for which the above integral exists, then h1 will belong to the representation space of U . That is because h1 as defined above satisfies h1 (xn) = h1 (x), x ∈ G, n ∈ N

The condition that f be orthogonal to the matrix elements of U is that � f (x)U (x−1 )dx = 0

64

72

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

and this condition is equivalent to the condition that for all h on G, � � f (x)h1 (xy)dx = 0, y ∈ G, h = h(xn)dn G

N

and this condition is in turn equivalent to � f (x)h1 (xy)π(y −1 )dxdy = 0 or

or equivalently,



f (x)h1 (z)π(z −1 x)dxdz = 0 ˆ 1 (π)fˆ(π)∗ = 0 h

for all unitary representations π of G and for all functions h on G. Now, � � ˆ 1 (π) = h(xn)π(x−1 )dxdn = h(z)π(nz −1 )dzdn = P.h(π) ˆ h and hence the condition is equivalent to ˆ ˆ(π)∗ = 0 − − − (3) Pπ,N h(π)f for all unitary representations π of G and for all functions h on G. This condition is the same as (2) if we observe that fˆ(π) can be replaced by fˆ(π)∗ because f (x) being orthogonal to the matrix elements of U (x) is equivalent to f (x−1 ) being orthogonal to the same because for compact groups, a finite dimensional representation is equivalent to a unitary representation which means that U is an appropriate basis is a unitary representation. Thus, we have proved that a necessary and sufficient condition that a function f be orthogonal to all the matrix elements of the representation of G induced by the identity character χ0 on N is that f be a cusp form, ie, it should satisfies � f (xny)dn = 0∀x, y ∈ G N

Now suppose χ is an arbitary character of N (ie a one dimensional unitary rep­ resentation of N ). Then let U = IndG N χ. The If h1 belongs to the representation space of U , then h1 (xn) = χ(n)−1 h1 (x), x ∈ G, n ∈ N . Now let h(x) be any function on G. Define � h1 (x) =

h(xn)χ(n)dn

N

Then,

h1 (xn0 ) =



h(xn0 n)χ(n)dn = N



h(xn)χ(n−1 0 n)dn

= χ(n0 )−1 h1 (x), n0 ∈ N, x ∈ G

Stochastic Processes in Classical and Quantum Physics and Engineering 65 4.1. REMARKS ON INDUCED REPRESENTATIONS AND HAAR INTEGRALS73 It follows that h1 belongs to the representation space of U . conversely if h1 belongs to this representation space, then J h1 (x) = h1 (xn)χ(n)dn N

Thus we have shown that h1 belongs to the representation space of U = IndG Nχ iff there exists a function h on G such that J h1 (x) = h(xn)χ(n)dn Note that this condition fails for non­compact groups since the integral dn on N cannot be normalized in general. Then, the condition for f to be orthogonal to U is that J f (x)U (x−1 )dx = 0 − − − (4) G

and this condition is equivalent to J f (x)h1 (xy)dx = 0

for all h1 in the representation space of U or equivalently, J f (x)h(xyn)χ(n)dxdn = 0 for all functions h on G. This condition is equivalent to J f (x)h(xyn)χ(n)π(y −1 )dxdydn = 0 for all unitary representations π of G which is in turn equivalent to J f (x)h(z)χ(n)π(nz −1 x)dxdzdn = 0 G×G×N

or equivalently,

ˆ ˆ(π)∗ = 0 − − − (5) Pχ,π,N .h(π).f

for all functions h on G and for all unitary representations π of G where J Pχ,π,N = P = χ(n)π(n)dn N

Now consider the following condition on a function f defined on G: J f (xny)χ(n)dn = 0, x, y ∈ G N

This condition is equivalent to J f (xny)χ(n)π(y −1 )dydn = 0 G×N

66

74

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

for all unitary representations π of G or equivalently, � f (z)χ(n)π(z −1 xn)dzdn = 0 G×N

and this is equivalent to fˆ(π)π(x)P = 0, x ∈ G − − − (6) which is in turn equivalent to ˆ fˆ(π)h(π)P = 0 − − − (7) for all functions h on G as follows by multiplying (5) with h(x−1 ) and integrating over G. This condition is the same as ˆ ∗ fˆ(π)∗ = 0 Pχ,π,N h(π) or since h(x) can be replaced by h(x−1 ), the condition is equivalent to ˆ ˆ(π)∗ = 0 − − − (8) Pχ,π,N h(π)f for all functions h on G and all unitary representations π of G. This condition is the same as (5). Thus, we have established that a necessary and sufficient condition for f to be orthogonal to all the matrix elements of U = IndG N χ is that f should satisfy � f (xny)χ(n)dn = 0, x, y ∈ G N

In other words, f should be a generalized cusp form corresponding to the char­ acter χ of N .

4.2 On the dynamics of breathing in the pres­ ence of a mask Let V denote the volume between the face and the mask. Oxygen passes from outside the mask to within which the person breathes in. While breathing out, the person ejects CO2 along with of course some fraction of oxygen. Let no (t) denote the number of oxygen molecules per unit volume at time t and nc (t) denote the number of carbon­dioxide molecules per unit volume at time t in the region between the face and the mask. Let nr (t) denote the number of molecules of the other gases in the same region per unit volume. Let λc (t)dt denote the probability of a CO2 molecule escaping away the volume between the mask and the face in time [t, t + dt]. Then let gc (t)dt denote the number of CO2 molecules

Stochastic Processes in Classical and Quantum Physics and Engineering 67 4.2. ON THE DYNAMICS OF BREATHING IN THE PRESENCE OF A MASK75 being ejected from the nose and mouth of the person per unit time. Then, we have a simple conservation equation dnc (t)/dt = −nc (t)λc (t) + gc (t)/V To this differential equation, we can add stochastic terms and express it as dnc (t)/dt = −nc (t)λc (t) + gc (t)/V + wc (t) We can ask the question of determining the statistics of the first time τc at which the CO2 concentration nc (t) exceeds a certain threshold Nc . Sometimes there can be a chemical reaction between the CO2 trapped in the volume with some other gases. Likewise other gases within the volume can react and produce CO2 . If we take into account these chemical reactions, then there will be additional terms in the above equation describing the rate of change of concentration of CO2 . For example, consider a reaction of the form a.CO2 + b.X → c.Y + d.Z where a, b, c, d are positive integers. X denotes the concentration of the other gases, ie, gases other than CO2 within the volume. At time t = 0, there are nc (0) molecules of CO2 and nX (0) molecules of X and zero molecules of Y and Z. Let K denote the rate constant at fixed temperature T . Then, we have an equation of the form dnc (t)/dt = −K.nc (t)a nX (t)b − − − (1) according to Arrhenius’ law of mass action. After time t, therefore, nc (0)−nc (t) molecules of CO2 have reacted with (nc (0) − nc (t))b/a moles of X. Thus, nX (t) = nX (0) − (nc (0) − nc (t))b/a from which we deduce dnX (t)/dt = (b/a)dnc (t)/dt = (−Kb/a)nc (t)a .nX (t)b − − − (2) equations (1) and (2) have to be jointly solved for nc (t) and nX (t). This can be reduced to a single differential equation for nc (t): dnc (t)/dt = −K.nc (t)a (nX (0) − (nc (0) − nc (t))b/a)b − − − (3) Likewise, we could also have a chemical reaction in which CO2 is produced by a chemical reaction between two compounds U, V . We assume that such reaction is of the form pU + qV → r.CO2 + s.M

Let nU (0), nV (0) denote the number of molecules of U and V at time t = 0. Then, the rate of at which their concentrations change with time is again given by Arrhenius’ law of mass action dnU (t)/dt = −K1 nU (t)p nV (t)q

68 76

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS, nV (t) = nV (0) − q(nU (0) − nU (t))/p

and then the number of CO2 molecules generated after time t due to this reaction will be given by nc (t) = (r/p)(nU (0) − nU (t)) and also nV (t) = nV (0) − (q/p)(nU (0) − nU (t)) = nV (0) − (q/r)nc (t) and hence dnc (t)/dt = (−r/p)dnU (t)/dt = (r/p)K1 nU (t)p .nV (t)q = (K1 r/p)(nU (0) − pnc (t)/r)p (nV (0) − (q/r)nc (t))q It follows that combining these two processes we can write Now consider the combination of two such reactions. Writing down explicitly these reactions, we have aCO2 + bX → c.Y + d.Z, p.U + q.V → r.CO2 + s.Z At time t = 0, we have initial conditions on nc (0), nX (0), nU (0), nV (0). Then the rate at which CO2 gets generated by the combination of these two reactions and the other non­chemical­reaction processes of diffusion exit etc. is given by dnc (t)/dt = −Knc (t)a nX (t)b + (K1 r/p)nU (t)p nV (t)q + f (nc (t), t) where dnX (t)/dt = (−Kb/a)nc (t)a nX (t)b , dnU (t)/dt = −K1 nU (t)p nV (t)q , dnV (t)/dt = (−K1 q/p)nU (t)p nV (t)q Here, f (nc (t), t) denotes the rate at which CO2 concentration changes due to the other effects described above. Of course, we could have more complex chemical reactions in of the sort a0 CO2 + a1 X1 + ... + an Xn → b1 Y1 + ... + bm Ym , for which the law of mass action would give dnc (t)/dt = −Knc (t)a0 nX1 (t)a1 ...nXn (t)an , nXj (t) = nXj (0) − (aj /a0 )(nc (0) − nc (t)), j = 1, 2, ..., n, nYj (t) = nYj (0) + (bj /a0 )(nc (0) − nc (t)), j = 1, 2, ..., m

Stochastic Processes in Classical and Quantum Physics and Engineering 4.3. PLANCHEREL FORMULA FOR SL(2, R)

4.3

77

Plancherel formula for SL(2, R)

Let G = KAN be the usual Iwasawa decomposition and let B denote the com­ pact/elliptic Cartan subgroup B = {exp(θ(X − Y )), θ ∈ [0, 2π)}. Let L denote the non­compact/hyperbolic Cartan subgroup L = {exp(tH), t ∈ R}. Let GB denote the conjugacy class of B, ie, GB is the elliptic subgroup gBg−1, g ∈ G and likewise GL will denote the conjugacy class of L ie, the hyperbolic subgroup gLg −1 , g ∈ G. These two subgroups have only the identity element in common. Now let Θm denote the discrete series characters. We know that Θm (u(θ)) = −sgn(m)

exp(imθ) , m ∈ Z, u(θ) = exp(θ(X − Y )) exp(iθ) − exp(−iθ)

Θm (a(t)) =

exp(mt) , a(t) = exp(tH) exp(t) − exp(−t)

Weyl’s integration formula gives

Θm (f ) = �

=

Θm (x)f (x)dx = G



Θm (u(θ))Ff,B (theta)ΔB (θ)dθ 0

+ �





R

Θm (a(t))Ff,L (t)ΔL (t)dt

sgn(m)exp(imθ)Ff,B (θ)dθ + = sgn(m)Fˆf,B (m) +



R



R

exp(−|mt|)Ff,L (t)dt

exp(−|mt|)Ff,L (t)dt

where Ff,B and Ff,L are respectively the orbital integrals of f corresponding to the elliptic and hyperbolic groups and if a(θ) is a function of θ, then a ˆ(m) denotes its Fourier transform, ie, its Fourier series coefficients. Thus, � Ff,B (θ) = ΔB (θ) f (xu(θ)x−1 )dx G/B

Ff,L (a(t)) = ΔL (t)



f (xa(t)x−1 )dx G/L

where Δ(θ) = (exp(iθ) − exp(−iθ)), Δ(t) = exp(t) − exp(−t) Now



modd

|m|Θm (f ) = +



R





ΔL (t)Ff,L (t) 0

ΔB (θ)Ff,B (θ).



modd



modd

|m|Θm (a(t))dt

|m|Θm (u(θ))dθ

69

70 78

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

Now,



modd

m≥0

k ≥0





modd

|m|exp(−|mt|)/(exp(t) − exp(−t))

mexp(−mt) =

m≥1,modd



So �

|m|Θm (a(t)) =



(2k + 1)exp(−(2k + 1)t)

k≥0

m.exp(−mt) = −d/dt(1 − exp(−t))−1 = exp(−t)(1 − exp(−t))−2 , t ≥ 0

(2k+1)exp(−(2k+1)t) = exp(−t)(2exp(−2t)(1−exp(−2t))−2 +(1−exp(−2t))−1 )

= exp(−t)(1+exp(−2t))/(1−exp(−2t))2 = (exp(t)+exp(−t))/(exp(t)−exp(−t))2 Thus, �

modd

|m|exp(−|mt|) = 2



m.exp(−m|t|) = tanh(t) = w1 (t)

m≥0

say. �

modd

|m|Θm (u(θ)) =



modd

m.exp(imθ)/(exp(iθ) − exp(−iθ)) = w2 (θ)

say. Then, the Fourier transform of Ff� ,B (θ) (ie Fourier series coefficients) Fˆf� ,B (m) = −





Ff� ,B (θ)exp(imθ)dθ =

Ff,B (θ)(im)exp(imθ)dθ = −imFˆf,B (m)

Now, we’ve already seen that Θm (f ) = sgn(m)Fˆf,B (m) +



R

exp(−|mt|)Ff,L (t)dt

So −imΘm (f ) = sgn(m)(−imFˆf,B (m)) − im = sgn(m)Fˆf� ,B (m) − im





exp(−|mt|)Ff,L (t)dt

exp(−|mt|)Ff,L (t)dt

Multiplying both sides by sgn(m) gives us � −i|m|Θm (f ) = Fˆf,B (m) − i|m|



exp(−|mt|)Ff,L (t)dt

Stochastic Processes in Classical and Quantum Physics and Engineering 71 4.4. SOME APPLICATIONS OF THE PLANCHEREL FORMULA FOR A LIE GROUP T Summing over odd m gives −i



modd

Now,

|m|Θm (f ) =

� m

Now,



modd

Fˆf� ,B (m) − i

Fˆf� ,B (m) = Ff� ,B (0) − i



Ff,B (θ) = (exp(iθ) − exp(−iθ))





w1 (t)Ff,L (t)dt

w1 (t)Ff,L (t)dt

f (xu(θ)x−1 )dx G/B

which gives on differentiating w.r.t θ at θ = 0 � (0) = 2if (1) Ff,B

Thus, we get the celebrated Plancherel formula for SL(2, R), � � 2f (1) = − |m|Θm (f ) + w1 (t)Ff,L (t)dt m

Now we know that the Fourier transform � Fˆf,L (µ) = Ff,L (t)exp(itµ)dt R

equals Tiµ (f ), the character of the Principal series corresponding to iµ, µ ∈ R. Thus defining � w1 (t)exp(itµ)dt w ˆ1 (µ) = R

the Plancherel formula can be expressed in explicit form: � � 2f (1) = − w ˆ(µ)∗ .Tiµ (f )dµ |m|Θm (f ) + R

m

This formula was generalized to all semisimple Lie groups and also to p­adic groups by Harish­Chandra.

4.4 Some applications of the Plancherel formula for a Lie group to image processing ˆ let dµ(χ) be its Let G be a group and let its irreducible characters be χ, χ ∈ G. Plancherel measure: � δe (g) = χ(g).dµ(χ) or equivalently,

f (e) =



ˆ G

χ(f )dµ(χ)

72 80

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 4. RESEARCH TOPICS ON REPRESENATIONS,

for all f in an appropriate Schwartz space. Consider an invariant for image pairs of the form � Iχ (f1 , f2 ) = f1 (x)χ(xy −1 )f2 (y)dxdy G×G

ˆ , we have Then if ψ is any class function on G � χ(ψ).Iχ (f1 , f2 )dµ(χ) = ˆ G



f1 (x)χ(ψ 0 )χ(xy −1 )f2 (y)dxdydµ(χ) =

where ψ1 (x) = so that





f1 (x)ψ1 (xy −1 )f2 (y)dxdy χ(ψ 0 )χ(x)dµ(χ), ψ 0 (x) = ψ(x−1 )

0

χ(ψ ) =



ψ(x−1 )χ(x)dx

Remark: let πχ be an irreducible representation having character χ. Then, we can write � f (x) = T r(fˆ(πχ )πχ (x))dµ(χ)

where

fˆ(πχ ) =



f (x)πχ (x−1 ))dx

Thus, in particular, if f is a class function, � f (x) = f (y)T r(πχ (y −1 )πχ (x))dµ(χ)dy =



f (y)χ(y −1 x)dµ(χ)dy



ψ(y)π(x−1 y −1 x)dydx

as follows from the Plancherel formula. Problem: What is the relationship between ψ1 (x) and ψ(x) when ψ is a class function ? Note that if the group is compact, then if ψ is a class function and π is an irreducible unitary representation of dimension d, we have � � ψˆ(π) = ψ(y)π(y)∗ )dy = ψ(xyx−1 )π(y −1 )dydx =

Stochastic Processes in Classical and Physics and Engineering 4.4. SOME APPLICATIONS OFQuantum THE PLANCHEREL FORMULA FOR A LIE73GROUP T Now,



πab (x

= πcd (y)



−1

yx)dx =



πac (x−1 )πcd (y)πdb (x)dx

π ¯ca (x).πdb (x)dx = d−1 πcd (y)δ(c, d)δ(a, b)

= d−1 πcc (y)δ(a, b) = d−1 χ(y)δ(a, b) where χ = T r(π) is the character of π. This means that ˆ ψ(π) = d−1 χ(ψ)I Thus, for a class function on a compact group, we have � ψ(g) = d(χ)−1 T r(χ(ψ 0 )π(g)) ˆ χ∈G

=



d(χ)−1 χ(ψ 0 )χ(g)

ˆ χ∈G

=



d(χ)−1 χ(g)

ˆ χ∈G



ψ(x)χ(x)dx ¯

This is the same as saying that δg (x) =



d(χ)−1 χ(g)χ(x) ¯

ˆ χ∈G

Is this result true for an arbitrary semisimple Lie group when we replace d(χ) by the Plancherel measure/formal dimension on the space of irreducible characters ? In other words, is the result � χ(g)χ(x)dµ(χ) ¯ δg (x) = ˆ G

true ?

Chapter 5 Chapter 5

Some Problems in Quantum Some problems in quantum Information Theory information theory A problem in quantum information theory used in the proof of the coding the­ orem for Cq channels : Let S, T be two non­negative operators with S ≤ 1, ie, 0 ≤ S ≤ 1, T ≥ 0. Then 1 − (S + T )−1/2 S(S + T )−1/2 ≤ 2 − 2S + 4T or equivalently, (S + T )−1/2 S(S + T )−1/2 ≥ 2S − 1 − 4T To prove this, we define the operators √ √ A = T ((T + S)−1/2 − 1), B = T Then, the equation or equivalently,

(A − B)∗ (A − B) ≥ 0 A∗ A + B ∗ B ≥ A∗ B + B ∗ A

gives ((T +S)−1/2 −1)T +T ((T +S)−1/2 −1) ≤ T +((T +S)−1/2 −1)T ((T +S)−1/2 −1) This equation can be rearranged go give 2(T + S)−1/2 T + 2T (T + S)−1/2 − 4T ≤ (T + S)−1/2 T.(T + S)−1/2 = (T + S)−1/2 (T + S − S)(T + S)−1/2 = 1 − (T + S)−1/2 S(T + S)−1/2 A rearrangement of this equation gives 4T +1+(S+T )−1/2 S(S+T )−1/2 ≥ 2(S+T )−1/2 S(S+T )−1/2 +2(S+T )−1/2 T +2T (S+T )−1/2 83

75

7684CHAPTERStochastic in Classical and Quantum Physics and Engineering 5. SOMEProcesses PROBLEMS IN QUANTUM INFORMATION THEORY So to prove the stated result, it suffices to prove that (S + T )−1/2 S(S + T )−1/2 + (S + T )−1/2 T + T (S + T )−1/2 ≥ S

Let A, B be two states. The hypothesis testing problem amounts to choosing an operator 0 ≤ T ≤ 1 such that C(T ) = T r(A(1 − T )) + T r(BT ) is a minimum. Now, C(T ) = 1 + T r((B − A)T ) It is clear that this is a minimum iff T = {B − A ≤ 0} = {A − B ≥ 0} ie T is the projection onto the space spanned by those eigenvectors of B −A that have positive eigenvalues or equivalently the space spanned by those eigenvectors of A − B that have non­negative eigenvalues. The minimum value of the cost function is then C(min) = T r(A{A < B}) + T r(B{A > B}) Note that the minimum is attained when T is a projection. An inequality: Let 0 ≤ s ≤ 1. Then if A, B are any two positive operators, we have T r(A{A < B}) ≤ T r(A1−s B s ) To see this we note that T r(A1−s B s ) ≤ T r((B+(A−B)+ )1−s B s ) ≤ T r(B+(A−B)+ )1−s (B+(A−B)+ )s ) = T r(B+(A−B)+ ) = T r(B)+T r((A−B)+ ) = T r(B)+T r((A−B){A−B > 0}) = T r(B{A − B < 0}) + T r(A.{A − B > 0}) Other related inequalities are as follows: T r(A1−s B s ) ≥ T r((B − (B − A)+ )1−s B s )

≥ T r((B − (B − A)+ )1−s .(B − (B − A)+ )s ) = T r((B − (B − A)+ )

= T r(B) − T r((B − A)+ ) = T r(B) − T r((B − A){B > A})

= T r(B.{A > B}) + T r(A.{B > A})

In particular, when A, B are states, the minimum probability of discrimination

error has the following upper bound: P (error) = T r(A.{B > A}) + T r(B.{A > B}) ≤ min0≤s≤1 T r(A1−s B s )

Stochastic Processes in Classical and Quantum Physics and Engineering

77 85

= min0≤s≤1 exp(φ(s|A|B)) where φ(s|A|B) = log(T r(A1−s B s )) To obtain the minimum in the upper bound, we set the derivative w.r.t s to zero to get d/ds(T r(A1−s B s ) = 0 or T r(log(A)A1−s B s ) = T r(A1−s B s log(B)) It is not clear how to solve this equation for s. Now define ψ1 (s) = log(T r(A1−s B s ))/s, ψ2 (s) = log(T r(A1−s B s )/(1 − s) Then an elementary application of Le’Hopital’s rule gives us ψ1 (0+) = lims↓0 ψ1 (s) = −D(A|B) = −T r(A.(log(A) − log(B)), ψ2 (1−) = lims↑1 ψ2 (s) = −D(B|A) = −T r(B.(log(B) − log(A))

Consider now the problem of distinguishing between tensor product states A⊗n and B ⊗n . Let Pn (e) denote the minimum error probability. Then, by the above, Pn (e) ≤ min0≤s≤1 exp(logT r(A⊗n(1−s) B ⊗ns ) Now,

T r(A⊗n(1−s) B ⊗ns ) = (T r(A1−s B s ))n

Therefore, we can write limsupn→∞ n−1 log(Pn (e)) ≤ min0≤s≤1 logT r(A1−s B s ) Define F (s) = T r(A1−s B s ) Then,

F � (s) = −T r(A1−s log(A)B s ) + T r(A1−s B s log(B))

F �� (s) = T r(A1−s (logA)2 B s ) + T r(A1−s B s (log(B))2 ) −T r(A1−s log(A)B s log(B)) + T r(A1−s .log(A).B s .log(B)) = T r(A1−s (logA)2 B s ) + T r(A1−s B s (log(B))2 ) ≥ 0

Hence F (s) is a convex function and hence F � (s) is increasing. Further, F � (0) = −T r(A.log(A))+T r(A.log(B)) = −D(A|B) ≤ 0, F � (1) = −T r(B.log(A))+T r(B.log(B)) = D(B|A) ≥ 0

Hence F � (s) changes sign in [0, 1] and hence it must be zero at some s ∈ [0, 1], say at s0 and since F is convex, it attains its minimum at s0 .

78 86CHAPTERStochastic in Classical and Quantum Physics and Engineering 5. SOMEProcesses PROBLEMS IN QUANTUM INFORMATION THEORY Let ρ, σ be two states and let

� � q(i)|fi >< fi |

ρ= p(i)|ei >< ei |, σ = i

i

be their respective spectral decompositions. Define p(i)| < ei |fj > |2 = P (i, j), q(j)| < ei |fj > |2 = Q(i, j)

Then P, Q are both classical probability distributions. We have for 0 ≤ T ≤ 1,

� T r(ρ(1 − T )) = < ei |ρ|fj >< fj |1 − T |ei >

i,j

=

� i,j

p(i) < ei |fj > (< fj |ei > − < fj |T |ei >) =1−

� i,j

p(i) < ei |fj > t(j, i)

and likewise, T r(σT ) =

� i,j

< ei |σ|fj > t(j, i) =

� i,j

q(j) < ei |fj > t(j, i)

where t(j, i) =< fj |T |ei > So, T r(ρ(1 − T )) + T r(σT ) = 1 +

� i,j

t(j, i)(q(j) − p(i)) < ei |fj > t(j, i)

Now, |

� i,j



p(i) < ei |fj > t(j, i)|2 ≤ =



i,j

p(i)|ei |fj > |2 .

� i,j

p(i)|t(j, i)|2

i,j

Thus, T r(ρ(1 − T )) ≥ 1 − T r(σT ) =

� i,j



p(i)|t(j, i)|2

i,j

q(j) < ei |fj > t(j, i)

So T r(σT )2 ≤

� i,j

q(j)|t(j, i)|2

p(i)|t(j, i)|2

Stochastic Processes in Classical and Quantum Physics and Engineering

79 87

Then, if c is any positive real number, � � q(j)|t(j, i)|2 T r(ρ(1 − T )) − c.T r(σT ) ≥ 1 − p(i)|t(j, i)|2 − c. i,j

i,j

Suppose now that T is a projection. Then, � � p(i) < ei |(1 − T )2 |ei > T r(ρ(1 − T )) = p(i) < ei |1 − T |ei >= i

=

� i,j

i

p(i)| < ei |1 − T |fj > |2 ≥i,j a(i, j)| < ei |1 − T |fj > |2

and likewise, �

T r(σT ) =

i,j

q(j)| < fj |T |ei > |2 ≥

� i,j

a(i, j)| < fj |T |ei > |2

where a(i, j) = min(p(i), q(j)) Now, | < ei |1 − T |fj > |2 + < ei |T |fj > |2 ≥ (1/2)| < ei |1 − T |fj > + < ei |T |fj > |2 = (1/2)| < ei |fj > |2 Thus, T r(ρ(1 − T )) + T r(σT ) ≥ (1/2) = (1/2)

� i,j

= (1/2)

� i,j

� i,j

P (i, j)χp(i)≤q(j) + (1/2)

P (i, j)χP (i,j)≤Q(i,j) + (1/2)

a(i, j)| < ei |fj > |2 �

Q(i, j)χp(i)>q(j)

i,j



Q(i, j)χP (i,j)>Q(i,j)

i,j

Now, for tensor product states, this result gives T r(ρ⊗n (1 − Tn )) + T r(σ ⊗n Tn ) ≥ (1/2)P n (P n ≤ Qn ) + (1/2)Qn (P n > Qn ) for any measurement Tn in the tensor product space, ie, 0 ≤ Tn ≤ 1. Now by the large deviation principle as n → ∞, we have n−1 .log(P n (P n ≥ Qn )) ≈ −infx≤0 I1 (x) where I1 is the rate function under P n of the random family n−1 log(P n /Qn ). Evaluating the moment generating function of this r.v gives us Mn (ns) = EP n exp(s.log(P n /Qn )) = EP n (P n /Qn )s

80

Stochastic Processes in Classical and Quantum Physics and Engineering 88CHAPTER 5. SOME PROBLEMS IN QUANTUM INFORMATION THEORY = (EP (P/Q)s )n = ( and hence the rate function is



P 1−s Qs )n = Ms (P, Q)n

I1 (x) = sups (sx − logMs (P, Q)) = � sups (sx − log( P (ω)1−s Q(ω)s )) ω

Then,

infx≤0 I1 (x) = sups≥0 − log( = −infs≥0 log(





P (ω)1−s Q(ω)s ))

ω

P (ω)1−s Q(ω)s ))

ω

This asymptotic result can also be expressed as liminfn→∞ inf0≤Tn ≤1 [n−1 log(T r(ρ⊗n (1 − Tn )) + T r(σ ⊗n Tn ))] � � P 1−s Qs )] − inf0≤s≤1 [log Q1−s P s ] ≥ −inf0≤s≤1 [log(

Relationship between quantum relative entropy and classical relative en­ tropy: � T r(ρ.log(ρ)) = p(i)log(p(i)) T r(ρ.log(σ)) =

� i,j

i

p(i)log(q(j)) < ei |fj > |2

D(ρ|σ) = T r(ρ.log(ρ) − ρ.log(σ)) Therefore, D(ρ|σ) =

� i,j



� i,j

p(i) < ei |fj > |2 (log(p(i)) + log(| < ei |fj > |2 ))

p(i) < ei |fj > |2 (log(q(j)) + log(| < ei |fj > |2 ))

(Note the cancellations) � � = P (i, j)log(P (i, j)) − P (i, j)log(Q(i, j)) i,j

i,j

= D(P |Q)

Let X be a r.v. with moment generating function

M (λ) = E[exp(λX)]

Stochastic Processes in Classical and Quantum Physics and Engineering 81 5.1. QUANTUM INFORMATION THEORY AND QUANTUM BLACKHOLE PHYSICS8 Its logarithmic moment generating function is Λ(λ) = log(M (λ)) Let µ = E(X) be its mean. Then, µ = Λ� (0) Further,

Λ�� (λ) ≥ 0∀λ

ie Λ is a convex function. Hence Λ� (λ) is increasing for all λ. Now suppose x ≥ µ. Consider ψ(x, λ) = λ.x − Λ(λ) Since

∂λ2 ψ(x, λ) = −Λ�� (λ) < 0

it follows that ∂λ ψ(x, λ) is decreasing or equivalently that ψ(x, λ) is concave in λ. Suppose λ0 is such that ∂λ ψ(x, λ0 ) = 0 Then it follows that for a fixed value of x, ψ(x, λ) attains its maximum at λ0 . Now ∂λ ψ(x, λ0 ) = 0 and ∂λ ψ(x, λ) is decreasing in λ. It follows that ∂λ ψ(x, λ) ≤ 0 for λ ≤ λ0 . Hence, ψ(x, λ) is decreasing for λ ≥ λ0 . Also for λ < λ0 , ∂λ ψ(x, λ) ≥ 0 (since ∂λ ψ(x, λ0 ) = 0 and ∂λ ψ(x, λ) is decreasing for all λ) which means that ψ(x, λ) is increasing for λ < λ0 . Now ∂λ ψ(x, 0) = x − µ ≥ 0 by hypothesis. Since ∂λ ψ(x, λ) is decreasing in λ, it follows that ∂λ ψ(x, λ) ≥ 0 for all λ ≤ 0. This means that ψ(x, λ) is increasing for all λ ≥ 0. In particular, ψ(x, λ) ≤ ψ(x, 0) = x, ∀λ ≤ 0 Thus, I(x) = supλ∈R ψ(x, λ) = supλ≥0 ψ(x, λ) in the case when x ≥ µ. Consider now the case when x ≤ µ.

5.1 Quantum information theory and quantum blackhole physics Let ρ1 (0) be the state of the system of particles within the event horizon of a Schwarzchild blackhole and ρ2 (0) the state outside. At time t = 0, the state of the total system to the interior and the exterior of the blackhole is ρ(0) = ρ1 (0) ⊗ ρ2 (0)

82

Stochastic Processes in Classical and Quantum Physics and Engineering 90CHAPTER 5. SOME PROBLEMS IN QUANTUM INFORMATION THEORY Here, as a prototype example, we are considering the Klein­Gordon field of particles in the blackhole as well as outside it. Let χ(x) denote this KG field. Its action functional is given by � � � S[χ] = (1/2) (g µν (x) −g(x)χ,µ (x)χ,ν (x) − µ2 −g(x)χ(x)2 )d4 x = (1/2)



[g µν (x)χ,µ (x)χ,ν (x) − µ2 χ(x)2 ]

which in the case of the Schwarzchild blackhole reads S[χ] = (1/2)



� −g(x)d4 x

[α(r)−1 χ2,t −α(r)χ2,r −r−2 χ2,θ −r−2 sin(θ)−2 χ2,φ ]r2 sin(θ)dtdrdθdφ =



Ld4 x

The corresponding KG Hamiltonian can also be written down. The canonical position field is χ and the canonical momentum field is π = δS/δχ,t = α(r)−1 χ,t so that the Hamiltonian density is H = π.χ,t − L =

2 2 2 + r−2 χ,θ + r−2 sin(θ)−2 χ2,φ ] + α(r)χ,r (1/2)[α(r)−1 χ,t

= (1/2)[α(r)π 2 + α(r)χ2,r + r−2 χ2,θ + r−2 sin(θ)−2 χ2,φ + µ2 χ2 ] The KG field Hamiltonian is then � � H = Hd3 x = Hr2 sin(θ)drdθdφ

The radial Hamiltonian: Assume that the angular oscillations of the KG field are as per the classical equations of motion of the field corresponding to an angular momentum square of l(l + 1). Then, we can write χ(t, r, θ, φ) = ψ(t, r)Ylm (θ, φ) an substituting this into the action functional, performing integration by parts using L2 Ylm (θ, φ) = l(l + 1)Ylm (θ, φ), ∂ ∂ 1 ∂2 sin(θ) + ∂θ ∂θ sin2 (θ) ∂φ2 gives us the ”radial action functional” as � 2 2 Sl [φ] = (1/2) (α(r)−1 ψ,t − α(r)ψ,r − l(l + 1)ψ 2 /r2 − µ2 ψ 2 )r2 dtdr L2 = −(sin(θ)−1

and the corresponding radial Hamiltonian as � ∞ 2 [α(r)(π 2 + ψ,r ) + (µ2 + l(l + 1)/r2 )ψ 2 ]r2 dr Hl (ψ, π) = (1/2) 0

where ψ = ψ(t, r), π = π(t, r), the canonical position and momentum fields are functions of only time and the radial coordinate.

Stochastic Processes in Classical and Quantum Physics and Engineering 83 5.2. HARMONIC ANALYSIS ON THE SCHWARZ SPACE OF G = SL(2, R)91

5.2

Harmonic analysis on the Schwarz space of G = SL(2, R)

Let fmn (x, µ) denote the matrix elements of the unitary principal series ap­ pearing in the Plancherel formula and let φkmn (x) the matrix elements of the discrete series appearing in the Plancherel formula. Then we have a splitting of the Schwarz space into the direct sum of the discrete series Schwarz space and the principal series Schwarz space. Thus, for any f ∈ L2 (G), we have the Plancherel formula � � � 2 2 d(k)| < f |φkmn > | + | < f |fmn (., µ) > |2 c(mn, µ)dµ |f (x)| dx = G

R

kmn

where d(k)' s are the formal dimensions of the discrete series while c(mn, µ) are determined from the wave packet theory for principal series, ie, from the asymptotic behaviour of the principal series matrix elements. The asymptotic behaviour of the principal and discrete matrix elements is obtained from the differential equations Ω.fmn (x, µ) = −µ2 fmn (x, µ) where Ω is the Casimir element for SL(2, R). We can express x ∈ G as x = u(θ1 )a(t)u(θ2 ) and Ω as a second order linear partial differential operator in ∂/∂t, ∂/∂θk , k = 1, 2. Then, observing that fmn (u(θ1 )a(t)u(θ2 ), µ) = exp(i(mθ1 + nθ2 ))fmn (a(t), µ) we get on noting that the restriction of the Casimir operator to R.H is given by (Ωf )(a(t)) = ΔL (t)−1 where

d2 ΔL (t)f (a(t)) dt2

ΔL (t) = et − e−t

as follows by applying both sides to the finite dimensional irreducible characters of SL(2, R) restricted to R.H on noting that the restriction of such a character is given by (exp(mt)−exp(−mt))/(exp(t)−exp(−t)), we get a differential equation of the form d2 fmn (a(t), µ) − qmn (t)fmn (a(t), µ) = −µ2 fmn (a(t), µ) dt2 where qmn (t) is expressed in terms of the hyperbolic Sinh functions. From this equation, the asymptotic behaviour of the matrix elements fmn (a(t), µ) as |t| → ∞ may be determined. It should be noted that the discrete series modules can be realized within the principal series modules and hence the asymptotic be­ haviour of the matrix elements fmn (a(t), µ) will determine that of the principal series when µ is non­integral and that of the discrete series when µ is integral.

84 Stochastic Processes in Classical and Quantum Physics and Engineering 92CHAPTER 5. SOME PROBLEMS IN QUANTUM INFORMATION THEORY Specifically, consider a principal series representation π(x). Let em denote its highest weight vector and define fm (x) =< π(x)em |em > where now x is regarded as an element of the Lie algebra g of G. We have for any a ∈ U (g), the universal enveloping algebra of g, the fact that π(a).em = 0 =⇒ afm (x) = f (x; a) = 0 This means that if we consider the U (g)­module V = U (g)fm and the map T : π(U (g))em → V defined by T (π(a)em ) = a.fm , then it is clear that T is a well defined linear map with kernel KT = {π(a)em : a ∈ U (g), a.fm = 0} Note that π(a)em ∈ KT implies a.fm = 0 implies b.a.fm = 0 implies π(b)π(a)em = π(b.a)em ∈ KT for any a, b ∈ U (g). Thus, KT is U (g) invariant. Then, the U (g)­module V is isomorphic to the quotient module π(U (g))em /KT . However, π(U (g))em is an irreducible (discrete series) module. Hence, the submodule KT of it must be zero and hence, we obtain the fundamental conclusion that V = U (g)fm is an irreducible discrete series module. Note that fm is a princi­ pal series (reducible since m is an integer) matrix element and this means that we have realized the discrete series representations within the principal series. This also means that in order to study the asymptotic properties of the matrix elements of the discrete series, it suffices to study asymptotic properties of dif­ ferential operators in U (g) acting on the principal series matrix elements fm .

5.3 Monotonicity of quantum entropy under pinch­ ing Let |ψ1 >, ..., |ψn > be normalized pure states, not necessarily orthogonal. Let p(i), i = 1, 2, ..., n be a probability distribution and let |i >, i = 1, 2, ..., n be an orthonormal set. Consider the pure state |ψ >=

n � � i=1

p(i)|ψi > ⊗|i >

Note that this is a state because �� �� p(i)p(j) < ψj |ψi > δ(j, i) < ψ|ψ >= p(i)p(j) < ψj |ψi >< j|i >= i,j

i,j

=

� i

We have ρ1 = T r2 (|ψ >< ψ|) =

� i

p(i)|ψi >< ψi |

p(i) = 1

Stochastic Processes in Classical and Quantum Physics and Engineering 85 5.3. MONOTONICITY OF QUANTUM ENTROPY UNDER PINCHING 93 ρ2 = T r1 (|ψ >< ψ| =

�� p(i)p(j) < ψj |ψi > |i >< j| i

Then by Schmidt’s theorem,

H(ρ1 ) = H(ρ2 ) ≤ H(p) Now let σ1 , σ2 be two mixed states and let p > 0, p + q = 1. Consider the spectral decompositions � � p(i)|ei >< ei |, σ2 = q(i)|fi > fi | σ1 = i

i

Then, pσ1 + qσ2 =

� (pp(i)|ei >< ei | + qq(i)|fi >< fi |) i

So by the above result,

H(pσ1 + qσ2 ) ≤ H({pp(i), qq(i)}) � =− (pp(i)log(pp(i)) + qq(i)log(qq(i))) i

= −p.log(p) − q.log(q) + pH({p(i)}) + qH({qi }) = H(p, q) + p.H(σ1 ) + q.H(σ2 ) where H(p, q) = −p.log(p) − q.log(q) + pH({p(i)}) Now consider several mixed states σ1 , ..., σn and a probability distribution p(i), i = 1, 2, ..., n. Then n � i=1

p(i)σi = p(1)σ1 + (1 − p(1))

n � i=2

p(i)/(1 − p(1))σi

So the same argument and induction gives n �

H(

i=1

p(i)σi ) ≤ H(p(1), 1−p(1))+p(1)H(σ1 )+(1−p(1))

n �

(p(i)/(1−p(1)))H(σi )

i=2

+H(p(i)/(1 − p(1)), i = 2, 3, ..., n) = H(p(i), i = 1, 2, ..., n) +

n �

p(i)H(σi )

i=1

Remark: Let ρ be a state and {Pi } a PVM. Define the state � � σ= Pi ρ.Pi = 1 − Pi (1 − ρ)Pi i

i

Then 0 ≤ D(ρ|σ) = T r(ρ.log(ρ)) − T r(ρ.log(σ))

86

Stochastic Processes in Classical and Quantum Physics and Engineering 94CHAPTER 5. SOME PROBLEMS IN QUANTUM INFORMATION THEORY = −H(ρ) −

� i

T r(ρ.Pi .log(Pi ρ.Pi )Pi ) = −H(ρ) −

= −H(ρ) −



T r(Pi ρ.Pi .log(

i

� j



T r(Pi ρ.Pi .log(Pi ρ.Pi )

i

Pj ρ.Pj )) = −H(ρ) + H(σ)

Thus, H(σ) ≥ H(ρ) which when stated in words, reads, pinching increases the entropy of a state. Remark: We’ve made use of the following facts from matrix theory: � � log( Pi .log(ρ)Pi Pi .ρ.Pi ) = i

i

because the Pi� s satisfy Pj Pi = 0, j = � i. In fact, we write � � Pi ρ.Pi = 1 − Pi (1 − ρ)Pi = 1 − A i

i

and use log(1 − z) = −z − z 2 /2 − z 3 /3 − ... combined with Pj Pj = Pi δ(i, j) to get � log( Pi ρ.Pi ) = −A − A2 /2 − A3 /3 − ... i

where A=

� i

Pi (1 − ρ)Pi , A2 =

� i

Pi (1 − ρ)2 Pi , ..., An = −

so that log(1 − A) =



� i

Pi (1 − ρ)n Pi

Pi .log(ρ)Pi

i

A particular application of this result has been used above in the form �� H( p(i)p(j) < ψj |ψi > |i >< j|) i

≤ H(

= H(

� k

�� i,k

p(i)p(j) < ψj |ψi >< k|i >< j|k > |k >< k|)

p(k)|k >< k|) = H({p(k)}) = −

� k

p(k).log(p(k))

Chapter 6 Chapter 6

Lectures on Statistical Lectures onSignal Statistical Processing Signal Processing [1] Large deviation principle applied to the least squares problem of estimating vector parameters from sequential iid vector measurements in a linear model. X(n) = H(n)θ + V (n), n = 1, 2, ... V (n)� s are iid and have the pdf pV . Then θ is estimated based on the measure­ ments X(n), n = 1, 2, ..., N using the maximum likelihood method as ˆ ) = argmaxθ N −1 θ(N

N �

n=1

log(pV (X(n) − H(n)θ))

Consider now the case when H(n) is a matrix valued stationary process in­ dependent of V (.). Then, the above maximum likelihood estimate is replaced by θˆ(N ) = argmaxθ



N [ΠN n=1 pV (X(n) − H(n)θ)]pH (H(1), ..., H(N ))Πj=1 dH(j)

In the special case when the H(n)� s are iid, the above reduces to ˆ ) = argmaxθ N −1 θ(N

N �

n=1

log(



pV (X(n) − Hθ)pH (H)dH)

The problem is to calculate the large deviation properties of θˆ(N ), namely the rate at which P (|θˆ(N ) − θ| > δ) converges to zero as N → ∞.

95

87

88

96

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING

6.1 Large deviation analysis of the RLS algo­ rithm Consider the problem of estimating θ from the sequential linear model X(n) = H(n)θ + V (n), n ≥ 1 sequentiall. The estimate based on data collected upto time N is θ(N ) = [

N �

n=1

H(n)T H(n)]−1 [

N �

H(n)T X(n)]

n=1

To cast this algorithm in recursive form, we define R(N ) =

N �

H(n)T H(n), r(N ) =

n=1

N �

H(n)T X(n)

n=1

Then, R(N +1) = R(N )+H(N +1)T H(N +1), r(N +1) = r(N )+H(N +1)T X(N +1)

Then,

R(N +1)−1

= R(N )−1 −R(N )−1 H(N +1)T (I+H(N +1)H(N +1)T )−1 H(N +1)R(N )−1 so that θ(N + 1) = θ(N ) + K(N + 1)(X(N + 1) − H(N + 1)θ(N )) where K(N +1) is a Kalman gain matrix expressible as a function of R(N ), H(N + 1). Our aim is to do a large deviation analysis of this recursive algorithm. Define δθ(N ) = θ(N ) − θ Then we have δθ(N + 1) = δθ(N ) + K(N + 1)(X(N + 1) − H(N + 1)θ − H(N + 1)δθ(N )) = δθ(N ) + K(N + 1)(V (N + 1) − H(N + 1)δθ(N )) We assume that the noise process V (.) is small and perhaps even non­Gaussian. This fact is emphasized by replacing V (n) with �.V (n). Then, the problem is to calculate the relationship between the LDP rate function of δθ(N ) and δθ(N + 1).

Stochastic Processes in Classical and Quantum Physics and Engineering 89 6.2. PROBLEM SUGGESTED BY MY COLLEAGUE PROF. DHANANJAY97

6.2

Problem suggested

Large deviation analysis of magnet with time varying magnetic moment falling thorugh a coil. Assume that the coil is wound around a tube with n0 turns per unit length extending from a height of h to h + b. The tube is cylindrical and extends from z = 0 to z = d. Taking relativity into account via the retarded potentials, if m(t) denotes the magnetic moment of the magnet, then the vector potential generated by it is approximately obtained from the retarded potential formula as A(t, r) = µm(t − r/c) × r/4πr3 where we assume that the magnet is located near the origin. This approximates to A(t, r) = µm(t) × r/4πr3 − µm� (t) × r/4πcr2 Now Assume that the magnet falls through the central axis with its axis always being oriented along the −zˆ direction. Then, after time t, let its z coordinate be ξ(t). We have the equation of motion M ξ �� (t) = F (t) − M g where F (t) is the z component of the force exerted by the magnetic field gener­ ated by the current induced in the coil upon the magnet. Now, the flux of the magnetic field generated by the magnet through the coil is given by � Bz (t, X, Y, Z)dXdY dZ Φ(t) = n0 2 ,h≤z≤h+b X2 +Y 2 ≤RT

where Bz (t, X, Y, Z) = zˆ.B(t, X, Y, Z), B(t, X, Y, Z) = curlA(t, R − ξ(t)ˆ z) where R = (X, Y, Z) and where RT is the radius of the tube. The current through the coil is then I(t) = Φ� (t)/R0 where R0 is the coil resistance. The force of the magnetic field generated by this current upon the falling magnet can now be computed easily. To do this, we first compute the magnetic field Bc (t, X, Y, Z) generated by the coil using the usual retarded potential formula Bc (t, X, Y, Z) = curlAc (t, X, Y, Z) Ac (t, X, Y, Z) = (µ/4π)



0

2n0 π

(I(t−|R−RT (ˆ x.cos(φ)+ˆ y .sin(φ))−zˆ(b/2n0 π)φ|/c).

90 98

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING

.(−x.sin ˆ (φ) + yˆ.cos(φ))/|R − RT (ˆ x.cos(φ) + yˆ.sin(φ)) − zˆ(b/2n0 π)φ|)RT dφ Note that Ac (t, X, Y, Z) is a function of ξ(t) and also of m(t) and m� (t). It follows that I(t) is a function of ξ(t), ξ � (t), m(t), m� (t), m�� (t). Then we calculate the interaction energy between the magnetic fields generated by the magnet and by the coil as � � −1 (Bc (t, X, Y, Z), B(t, X, Y, Z))dXdY dZ Uint (ξ(t), ξ (t), t) = (2µ) and then we get for the force exerted by the coil’s magnetic field upon the falling magnet as F (t) = −∂Uint (ξ(t), ξ � (t), t)/∂ξ(t) Remark: Ideally since the� interaction energy between the two magnetic fields appears in the Lagrangian (E 2 /c2 − B 2 )d3 R/2µ, we should write down the Euler­Lagrange equations using the Lagrangian L(t, ξ � (t), ξ � (t)) = M ξ � (t)2 /2 − Uint (ξ(t), ξ � (t), t) − M gξ(t) in the form

d ∂L/∂ξ � − ∂L/∂ξ = 0 dt

which yields M ξ �� (t) −

6.3

d ∂U/∂ξ � + ∂Uint /∂ξ + M g = 0 dt

Fidelity between two quantum states

Let ρ, σ be two quantum states in the same Hilbert space H1 . Let P (ρ), P (σ) denote respectively the set of all purifications of ρ and σ w.r.t the same reference Hilbert space H2 . Thus, if |u >∈ H1 ⊗ H2 is a purification of ρ, ie |u >∈ P(ρ), then ρ = T r2 |u >< u| and likewise for σ. Then, the fidelity between ρ and σ is defined as F (ρ, σ) = sup|u>∈P (ρ),|v>∈P (σ) | < u|v > | Theorem:

√ √ F (ρ, σ) = T r(| ρ. σ|)

Proof: Let |u >∈ P (ρ). Then, we can write |u >=

� i

|ui ⊗ wi >

Stochastic Processes in Classical and Quantum Physics and Engineering 6.3. FIDELITY BETWEEN TWO QUANTUM STATES

91 99

where {|wi >} is an onb for H2 and



ρ=

|ui >< ui |

i

Likewise if |v > is a purification of σ, we can write |v >= where σ=

� i

� i

But then < u|v >=

|vi ⊗ wi >

|vi >< vi |

� i

< ui |vi >

Define the matrices X=

� i

|ui >< wi |, Y =

� i

|vi >< wi |

Then we have ρ = XX ∗ , σ = Y Y ∗ Hence, by the polar decomposition theorem, we can write X=

√ √ ρU, Y = σV

where U, V are unitary matrices. Then, < u|v >=

� i

< ui |vi >= T r

� i

√ √ |vi >< ui | = T r(Y X ∗ ) = T r( σV U ∗ ρ)

√ √ √ √ = T r( ρ σV U ∗ ) = T r( ρ. σW )

where W = V U ∗ is a√unitary matrix. It is then clear from the singular value

√ decomposition of ρ.√ σ that the maximum of | < u|v > | which equals the √ maximum of |T r( ρ. σW )| as W√varies over all unitary matrices is simply the √ sum of the singular values of ρ. σ, ie, √ √ F (ρ, σ) = T r| ρ σ| √ √ √ √ = T r[( ρσ. ρ)1/2 ] = T r[( σ.ρ. σ)1/2 ]

92 Stochastic Processes in Classical and Quantum Physics and Engineering 100 CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING

6.4 Stinespring’s representation of a TPCP map, ie, of a quantum operation Let K be a quantum operation. Thus, for any state ρ, K(ρ) =

N �

Ea ρ.Ea∗ ,

a=1

N �

Ea∗ Ea = 1

a=1

We claim that K can be represented as K(ρ) = T r2 (U (ρ ⊗ |u >< u|)U ∗ ) where ρ acts in H1 , |u >∈ H2 and U is a unitary operator in H1 ⊗ H2 . To see this, we define ∗ ] V ∗ = [E1∗ , E2∗ , ..., EN Then,

V ∗ V = Ip

In other words, the columns of V are orthonormal. We assume that ρ ∈ Cp×p so V ∈ CN p×p Note that

V ρV ∗ = ((Ea ρEb∗ ))1≤a,b≤N

as an N p × N p block structured matrix, each block being of size p × p. Now define |u >= [1, 0, ..., 0]T ∈ CN Then, |u >< u| ⊗ ρ =



ρ 0

0 0



∈ CN p×N p

Since the columns of V are orthonormal, we can add columns to it to make it a square N p × N p unitary matrix U ∗ : U = [V, W ] ∈ CN p×N p , U ∗ U = IN p Then, ∗

U (|u >< u| ⊗ ρ)U = [V, W ] = V ρV ∗



ρ 0

0 0

��

V∗ W∗



from which it follows that T r1 (U (|u >< u| ⊗ ρ)U ∗ ) = T r1 (V ρV ∗ ) =

� a

Ea ρEa∗

Stochastic Processes in Classical and Quantum Physics and Engineering 93 6.5. RLS LATTICE ALGORITHM, STATISTICAL PERFORMANCE ANALYSIS101 Remark: Consider the block structured matrix X = ((Bij ))1≤i,j≤N where each Bij is a p × p matrix. Then T r1 (X) =

L i

Bii ∈ Cp×p

and T r2 (X) = ((T r(Bij )))1≤i,j≤N ∈ CN ×N

6.5 RLS lattice algorithm, statistical performance analysis The RLS lattice algorithm involves updating the forward and backward pre­ diction parameters simultaneously in time and order. In order to carry out a statistical performance analysis, we first simply look at the block processing problem involving estimating the forward prediction filter parameter vector aN,p of order p based on data collected upto time N . Let x(n) = [x(n), x(n−1), ..., x(0)]T , z −r x(n) = [x(n−r), x(n−r−1), ..., x(0), 0, ..., 0]T , r = 1, 2, ... where these vectors are all of size N + 1. Form the data matrix of order p at time N : XN,p = [z −1 x(n), ..., z −p x(n)] ∈ RN +1×p Then the forward prediction filter coefficient estimate is aN,p = (XTN,p XN,p )−1 XTN,p x(N ) If x(n), n ∈ Z is a zero mean stationary Gaussian process with autocorrelation R(τ ) = E(x(n + τ )x(n)), then the problem is to calculate the statistics of aN,p . To this end, we observe that N −1 (z −i x(N ))T (z −j x(N )) = N

−1

N L

n=max(i,j)

x(n − i)x(n − j) ≈ R(j − i)

So we write N −1 (z −i x(N ))T (z −j x(N )) = R(j − i) + δRN (j, i)

94102 CHAPTER Stochastic Processes in Classical and Quantum Physics and Engineering 6. LECTURES ON STATISTICAL SIGNAL PROCESSING where now δRN (j, i) is a random variable that is a quadratic function of the Gaussian process {x(n)} plus a constant scalar. Equivalently, N −1 XTN,p XN,p = Rp + δRN,p where Rp = ((R(j − i)))1≤i,j≤p is the statistical correlation matrix of size p and δRN,p is a random matrix. Likewise, N −1 (z −i x(N ))T x(N ) = N −1

N

� n=i

Thus,

x(n − i)x(n) = R(i) + δRN (0, i)

N −1 XTN,p x(n) = r(1 : p) + δrN (0, 1 : p)

where r(1 : p) = [R(1), ..., R(p)]T , δrN (1 : p) = [δRN (0, 1), ..., δRN (0, p)]T

6.6 Approximate computation of the matrix el­ ements of the principal and discrete series for G = SL(2, R) These matrix elements fmn (x, µ) for µ non­integral are the irreducible princi­ pal series matrix elements and fmm (x, m) = fm (x) for k integral generate the Discrete series Dm ­modules when acted upon by U (g), the universal enveloping algebra of G. These matrix elements satisfy the equations fmn (u(θ1 )xu(θ2 ), µ) = exp(i(mθ1 + nθ2 ))fmn (x, µ), Ωfmn (x, µ) = −µ2 fmn (x, µ) the second of which, in view of the first and the expression of the Casimir opera­ tor Ω in terms of t, θ1 , θ2 in the singular value decomposition x = u(θ1 )a(t)u(θ2 ) simplifies to an equation of the form d2 fmn (t, µ)/dt2 − qmn (t)fmn (t, µ) + µ2 fmn (t, µ) = 0 where fmn (t, µ) = fmn (a(t), µ). This determines the matrix elements of the principal series. For the discrete series matrix elements, we observe that these are φkmn (x) =< πk (x)em |en >

Stochastic Processes Classical and Quantum andCQ Engineering 95 6.7. PROOF OF in THE CONVERSE PARTPhysics OF THE CODING THEOREM103 where πk is the reducible principal series for integral k. Replacing k by −k where k = 1, 2, ..., the discrete series matrix elements are φ−kmn (x) =< π−k (x)em |en >, m, n = −k − 1, −k − 2, ... with π(X)e−k−1 = 0 Note that we can using the method of lowering operators, write π−k (Y )r e−k−1 = c(k, r)e−k−1−r , r = 0, 1, 2, ... where c(k, r) are some real constants. Thus, writing fk+1 (x) =< π−k (x)e−k−1 |e−k−1 >, x ∈ G, we have for the discrete series matrix elements, φ−k,−k−1−r,−k−1−s (x) =< π−k (xY r )e−k−1 |π−k (Y s )e−k−1 > =< π−k (Y ∗s xY r )e−k−1 |e−k−1 >= fk+1 (Y ∗s ; x; Y r ) In this way, all the discrete series matrix elements can be determined by applying differential operators on fk+1 (x). Now given an image field h(x) on SL(2, R), in order to extract invariant features, or to estimate the group transformation element parameter from a noisy version of it, we must first evaluate its matrix elements w.r.t the discrete and principal series, ie, � � h(x)fmn (x, µ)dx h(x)φ−k,−k−1−r,−k−1−s (x)dx, G

G

Now, �

G

h(x)φ−k,−k−1−r,−k−1−s (x)dx = =





h(x)fk+1 (Y ∗s ; x; Y r )dx

G

h(Y s ; x; Y ∗r )fk+1 (x)dx

G

6.7 Proof of the converse part of the Cq coding theorem Let φ(i), i = 1, 2, ..., N be a code. Note that the size of the code is N . Let s ≤ 0 and define ps = argmaxp Is (p, W )

96 104 CHAPTER Stochastic Processes in Classical and QuantumSIGNAL Physics and Engineering 6. LECTURES ON STATISTICAL PROCESSING where Is (p, W ) = minσ Ds (p ⊗ W |p ⊗ σ) where p ⊗ W = diag[p(x)W (x), x ∈ A], p ⊗ σ = diag[p(x)sigma, x ∈ A] Note that Ds (p ⊗ W |p ⊗ σ] = log(T r((p ⊗ W )1−s (p ⊗ σ)s )/(−s) = (1/ − s) log sumx T r(p(x)W (x))1−s (p(x)σ)s ) � = (1/ − s)log p(x)T r(W (x)1−s σ s ) x

so that

� x

where

lims→0 Ds (p ⊗ W |p ⊗ σ) = p(x)T r(W (x)log(W (x)) − T r(Wp .log(σ)) Wp =



p(x)Wx

x

Note that this result can also be expressed as � lims→0 Ds (p ⊗ W |p ⊗ σ) = p(x)D(W (x)|σ) x

We define f (t) = T r(tWx1−s + (1 − t) Note that Is (p, W ) = (T r(



� y

ps (y)Wy1−s )1/(1−s) , t ≥ 0

p(y)Wy1−s )1/(1−s) )1−s

y

and since by definition of ps , we have Is (ps , W ) ≥ Is (p, W )∀p, it follows that f (t) ≤ f (0)∀t ≥ 0 Thus

f � (0) ≤ 0

This gives T r[(Wx1−s −

� y

ps (y)Wy1−s ).(

� y

ps (y)Wy1−s )s/(1−s) ] ≤ 0

Stochastic Processes Classical and Quantum andCQ Engineering 97 6.7. PROOF OF in THE CONVERSE PARTPhysics OF THE CODING THEOREM105 or equivalently, T r(Wx1−s .(

� y

ps (y)Wy1−s )s/(1−s) ) ≤ T r(



ps (y)Wy1−s )1/(1−s)

y

Now define the state � � σs = ( ps (y)Wy1−s )1/(1−s) /T r( ps (y)Wy1−s )1/(1−s) y

y

Then we can express the above inequality as � T r(Wx1−s σss ) ≤ [T r( ps (y)Wy1−s )1/(1−s) ]1−s y

Remark: Recall that Ds (p ⊗ W |p ⊗ σ] = (1/ − s)log



p(x)T r(Wx1−s σ s )

x

= (1/ − s)log(T r(Aσ s ) where A=



p(x)Wx1−s

x

Note that we are all throughout assuming s ≤ 0. Application of the reverse Holder inequality gives for all states σ T r(Aσ s ) ≥ (T r(A1/(1−s) ))1−s .(T r(σ))s = (T r(A1/(1−s) ))1−s with equality iff σ is proportional to A1/(1−s) , ie, iff σ = A1/(1−s) /T r(A1/(1−s) ) Thus, minσ Ds (p ⊗ W |p ⊗ σ) = � (−1/s)[T r( p(x)Wx1−s )1/(1−s) ]1−s x

The maximum of this over all p is attained when p = ps and we denote the corresponding value of σ by σs . Thus, � ( x ps (x)Wx1−s )1/(1−s) σs = � T r( x ps (x)Wx1−s )1/(1−s)

n Now let φ(i), i = 1, 2, ..., N be a code � with φ(i) ∈ A . Let Yi , i = 1, 2, ..., N be detection operators, ie, Yi ≥ 0, i Yi = 1. Let �(φ) be the error probability corresponding to this code and detection operators. Then

1 − �(φ) = N −1

N � i=1

T r(Wφ(i) Yi )

98 Stochastic Processes in Classical and Quantum Physics and Engineering 106 CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING = N −1 T r(S(φ)T ) where S(φ) = N −1 diag[Wφ(i) , i = 1, 2, ..., N ], T = diag[Yi , i = 1, 2, ..., N ] Also define for any state σ, the state S(σ) = N −1 diag[σn , ..., σn ] = N −1 .IN ⊗ σn We observe that �

T r(S(σn )T ) == N −1

T r(σn Yi ) = T r(σ.N −1 .

i

N �

Yi )

i=1

= N −1 T r(σn ) = 1/N

where σn = σ ⊗n . Then for s ≤ 0 we have by monotonicity of quantum Renyi entropy, (The relative entropy between the pinched states cannot exceed the relative entropy between the original states) N −s (1 − �(φ))1−s = (T r(S(φ)T ))1−s (T r(S(σns )T ))s ≤ (T r(S(φ)T ))1−s (T r(S(σns )T ))s + (T r(S(φ)(1 − T )))1−s (T r(S(σns )(1 − T )))s ≤ T r(S(φ)1−s S(σns )s )

Note that the notation σns = σs⊗n is being used. Now, we have � 1−s s T r(Wφ(i) (T r(S(φ)1−s S(σns )s )) = (N −1 σns )) i

1−s s log(T r(Wφ(i) σns )

=

n �

s log(T r(Wφ1l−s (i) σs ))

l=1

= n.n−1

n �

log(T r(Wφ1−s σ s )) l (i) s

l=1

≤ n.log(n−1 . ≤ n.log(T r(



n � l=1

ps (y)Wy1−s )1/(1−s) ]1−s )

y

= n(1 − s).log(T r( Thus,

T r(Wφ1−s σ s )) l (i) s



ps (x)Wx1−s )1/(1−s) )

x

� ps (x)Wx1−s )1/(1−s) )n(1−s) T r(S(φ)1−s S(σns )s ) ≤ (T r( x

Stochastic Processes in Classical and Quantum Physics and Engineering 99 6.7. PROOF OF THE CONVERSE PART OF THE CQ CODING THEOREM107 and hence N −s (1 − �(φ))1−s ≤ (T r(



ps (x)Wx1−s )1/(1−s) )n(1−s)

x

or equivalently, 1 − �(φ) ≤ exp((s/(1 − s))log(N ) + n.log(T r( = exp((s/(1 − s))log(N ) + n.log(T r( = exp(ns((1 − s)log(N )/n + s−1 log(T r(





x



ps (x)Wx1−s )1/(1−s) ))

x

ps (x)Wx1−s )1/(1−s) ))

x

ps (x)Wx1−s )1/(1−s) )) − − − (a)

where s ≤ 0. Now, for any probability distribution p(x) on A, we have � d/dsT r[( p(x)Wx1−s )1/(1−s) ] = x

(1/(1 − s)2 )T r[( −(1/(1 − s))[T r(





p(x)Wx1−s )1/(1−s) .log(

x



p(x)Wx1−s )]

x

p(x)Wx1−s log(Wx ))].[T r(

x



p(x)Wx1−s )s/(1−s) ]



p(x)D(Wx |Wp )

x

and as s → 0, this converges to � p(x)Wx1−s )1/(1−s) ]|s=0 = d/dsT r[( x

� x

p(x)T r(Wx .(log(Wp ) − log(Wx ))) = −

= −(H(Wp ) − where

� p

x

(x)H(Wx )) = −(H(Y ) − H(Y |X)) = −Ip (X, Y ) Wp =



p(x)Wx

x

Write N = N (n) and define

R = limsupn log(N (n))/n, φ = φn Also define Is (X, Y ) = −(1/s(1 − s))log(T r(



ps (x)Wx1−s )1/(1−s) ))

x

Then, we’ve just seen that

lims→0 Is (X, Y ) = I0 (X, Y ) ≤ I(X, Y )

100 Stochastic Processes in Classical and Quantum Physics and Engineering 108 CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING where I0 (X, Y ) = Ip0 (X, Y ) and I(X, Y ) = supp Ip (X, Y ). and 1 − �(φn ) ≤ exp((ns/(1 − s))(log(N (n))/n − Is (X, Y ))) It thus follows that if R > I(X, Y ) then by taking s < 0 and s sufficiently close to zero we would have R > Is (X, Y ) and then we would have limsupn (1 − �(φn )) ≤ limsupn exp((ns/(1 − s))(R − Is (X, Y )) = 0 or equivalently, liminfn �(φn ) = 1 This proves the converse of the Cq coding theorem, namely, that if the rate of information transmission via any coding scheme is more that the Cq capacity I(X, Y ), then the asymptotic probability of error in decoding using any sequence of decoders will always converge to unity.

6.8 Proof of the direct part of the Cq coding theorem Our proof is based on Shannon’s random coding method. Consider the same scenario as above. Define πi = {Wφ(i) ≥ 2N Wpn }, i = 1, 2, ..., N where N = N (n), φ(i) ∈ An with φ(i), i = 1, 2, ..., N being iid with probability distribution pn = p⊗n on An . Define the detection operators N N � � Y (φ(i)) = ( π)−1/2 .πi .( πj )−1/2 , i = 1, 2, ..., N j=1

Then 0 ≤ Yi ≤ 1 and

�N

i=1

j=1

Yi = 1. Write S = πi , T =



πj

j=i

Then,

Y (φ(i)) = (S + T )−1/2 S(S + T )−1/2 , 0 ≤ S ≤ 1, T ≥ 0

and hence we can use the inequality 1 − (S + T )−1/2 S(S + T )−1/2 ≤ 2 − 2S + 4T

Stochastic Processes Classical and Quantum Physics and CODING Engineering 101 6.8. PROOF OFinTHE DIRECT PART OF THE CQ THEOREM109 Then, 1 − Y (φ(i)) =≤ 2 − 2πi + 4 and hence �(φ) = N −1

N � i=1

N �

≤ N −1

i=1



πj

j:j=i

T r(Wφ(i) (1 − Y (φ(i))))

T r(W (φ(i))(2 − 2πi ) + 4N −1



T r(W (φ(i))πj )

i=  j

Note that since for i = � j, φ(i) and φ(j) are independent random variables in An , it follows that W (φ(i)) and πj are also independent operator valued random variables. Thus, E(�(φ)) ≤ (2/N ) �

+(4/N )

N � i=1

i=j,i,j=1,2,...,N

=2



xn ∈An

+4(N − 1)

E(T r(W (φ(i)){W (φ(i)) ≤ 2N Wpn }))

T r(E(W (φ(i))).E{W (φ(j)) > 2N Wpn })

pn (xn )T r(W (xn ){W (xn ) ≤ 2N.W (pn )})



pn (xn )pn (yn )T r(W (xn ){W (yn ) > 2N W (pn )})

xn ,yn ∈An

= (2/N )

� xn

+4(N − 1)



pn (yn )T r(W (pn ){W (yn ) > 2N W (pn )})

yn ∈An

≤2 +4N

pn (xn )T r(W (xn ){W (xn ) ≤ 2N.W (pn )})



pn (xn )T r(W (xn )1−s (2N W (pn ))s )

xn



pn (yn )T r(W (yn )1−s W (pn )s (2N )s−1 )

yn

for 0 ≤ s ≤ 1. This gives

2s+2 N s



xn s+2

=2

E(�(φn )) ≤

pn (xn )T r(W (xn )1−s W (pn )s )

∈An

N s[



p(x)T r(W (x)1−s W (p)s )]n

x∈A

= 2s+2 .exp(sn(log(N (n)/n + s−1 .log(

� x

p(x)T r(W (x)1−s W (p)s ))))

102 Stochastic Processes in Classical and Quantum Physics and Engineering 110 CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING

6.9

Induced representations

Let G be a group and H a subgroup. Let σ be a unitary representation of H in a Hilbert space V . Let F be the space of all f : G → V for which f (gh) = σ(h)−1 f )(g), h ∈ H, g ∈ G Then define U (g)f (x) = ρ(g, g −1 xH)f (g −1 x), g, x ∈ G, f ∈ F where ρ(g, x) = (dµ(xH)/dµ(gxH))1/2 , g, x ∈ G with µ a measure on G/H. U is called the representation of G induced by the representation σ of H. We define � � f �= |f (g)|2 dµ(g) G

Then, unitarity of U (g) is easy to see: � � U (g)f (x) �2 dµ(x) = G/H





G/H

(ρ(g, g −1 xH))2 |f (g −1 x)|2 dµ(xH) =

(dµ(g −1 xH)/dµ(xH))|f (g −1 x)|2 dµ(xH) = =



G/H



G/H

|f (g −1 x)|2 dµ(g −1 xH)

|f (x)|2 dµ(xH)

proving the claim. Note that (ρ(g1 , g2 xH)ρ(g2 , xH))2 = (dµ(g2 xH)/dµ(g1 g2 xH))(dµ(xH)/dµ(g2 xH)) = dµ(xH)/dµ(g1 g2 xH) = ρ(g1 g2 , xH)2 ie ρ(g1 g2 , xH) = ρ(g1 , g2 xH)ρ(g2 , xH), g1 , g2 , x ∈ G ie, ρ is a scalar cocycle. Remark: |f (x)|2 can be regarded a function on G/H since it assumes the same value on xH for any x ∈ G. Indeed for x ∈ G, h ∈ H, |f (xh)|2 = |σ(h)−1 f (x)|2 = |f (x|2 since σ(h) is a unitary operator. There is yet another way to describe the induced representation, ie, in a way that yields a unitary representation of G that is equivalent to U . The method is as follows: Let X = G/H and let F1 be the space of all square integrable

Stochastic Processes in Classical and Quantum Physics and Engineering 6.9. INDUCED REPRESENTATIONS

103 111

functions f : X → V w.r.t the measure µX induced by µ on X. Let γ be any cocycle of (G, X) with values in the space of unitary operators on V , ie, γ : G × X → U(V ) satisfies γ(g1 g2 , x) = γ(g1 , g2 x)γ(g2 , x), g1 , g2 ∈ G, x ∈ X Then define a representation U1 of G in F1 by U1 (g)f (x) = ρ(g, g −1 x)γ(g, g −1 x)f (g −1 x), g ∈ G, x ∈ X, f ∈ F1 Let s : X → G be a section of X = G/H such that s(H) = e. This means that for any x ∈ X, s(x)H = x, ie, s(x) belong to the coset x. Define b(g, x) = s(gx)−1 gs(x) ∈ G, g ∈ G, x ∈ X Then,

b(g, x)H = s(gx)−1 gx = H

because gs(x)H = gx, s(gx)H = gx Now observe that b(g1 , g2 x)b(g2 , x) = s(g1 g2 x)−1 g1 s(g2 x)s(g2 x)−1 g2 s(x) = s(g1 g2 x)−1 g1 g2 s(x) = b(g1 g2 , x), g1 , g2 ∈ G, x ∈ X implying that γ = σ(b) is a (G, X, V ) cocycle, ie, γ(g1 g2 , x) = σ(b(g1 g2 , x)) = σ(b(g1 , g2 x)b(g2 , x)) = σ(b(g1 , g2 x))σ(b(g2 , x)) = γ(g1 , g2 x)γ(g2 , x) with γ assuming values in the set of unitary operators in V . Now let f (g) = γ(g, H)−1 f1 (gH) = T (f1 )(g) ∈ V, f1 ∈ F1 , g ∈ G Then

U (g)T (f1 )(g1 ) = U (g)f (g1 ) = ρ(g, g −1 g1 H)f (g −1 g1 ) = ρ(g, g −1 g1 H)γ(g −1 g1 , H)−1 f1 (g −1 g1 H) = ρ(g, g −1 g1 H)(γ(g −1 , g1 H)γ(g1 , H))−1 f1 (g −1 g1 H)

On the other hand, T (U1 (g)f1 )(g1 ) = γ(g1 , H)−1 U1 (g)f1 (g1 H) = γ(g1 , H)−1 ρ(g, g −1 g1 H)γ(g, g −1 g1 H)f1 (g −1 g1 H) = ρ(g, g −1 g1 H)γ(g1 , H)−1 γ(g, g −1 g1 H)f1 (g −1 g1 H)

104 Stochastic Processes in Classical and Quantum Physics and Engineering 112 CHAPTER 6. LECTURES ON STATISTICAL SIGNAL PROCESSING Now, γ(g1 , H) = γ(gg −1 g1 , H) = γ(g, g −1 g1 H)γ(g −1 g1 , H) or equivalently, γ(g1 , H)−1 γ(g, g −1 g1 H) = γ(g −1 g1 , H)−1 Thus,

T (U1 (g)f1 )(g1 ) = ρ(g, g −1 g1 H)γ(g −1 g1 , H)−1 f1 (g −1 g1 H) = ρ(g, g −1 g1 H)T (f1 )(g −1 g1 ) = U (g)T (f1 )(g1 )

In other words, T U1 (g) = U (g)T Thus, U1 and U are equivalent unitary representations.

6.10 Estimating non­uniform transmission line parameters and line voltage and current using the extended Kalman filter Abstract: The distributed parameters of a non­uniform transmission line are assumed to depend on a finite set of unknown parameters, for example their spatial Fourier series coefficients. The line equations are setup taking random voltage and current line loading into account with the random loading being assumed to be white Gaussian noise in the time and spatial variables. By pass­ ing over to the spatial Fourier series domain, these line stochastic differential equations are expressed as a truncated sequence of coupled stochastic differen­ tial equations for the spatial Fourier series components of the line voltage and current. The coefficients that appear in these stochastic differential equations are the spatial Fourier series coefficients of the distributed line parameters and these coefficients in turn are linear functions of the unknown parameters. By separating out these complex stochastic differential equations into their real and imaginary part, the line equations become a finite set of sde’s for the extended state defined to be the real and imaginary components of the line voltage and current as well as the parameters on which the distributed parameters depend. The measurement model is assumed to be white Gaussian noise corrupted ver­ sions of the line voltage and current sampled at a discrete set of spatial points on the line and this measurement model can be expressed as a linear transfor­ mation of the state variables, ie, of the spatial Fourier series components of the line voltage and current plus a white measurement noise vector. The extended Kalman filter equations are derived for this state and measurement model and our real time line voltage and current and distributed parameter estimates show excellent agreement with the true values based on MATLAB simulations.

Stochastic Processes in Classical and Quantum Physics and Engineering 105 6.11. MULTIPOLE EXPANSION OF THE MAGNETIC FIELD PRODUCED BY A TIME

6.11 Multipole expansion of the magnetic field produced by a time varying current distri­ bution Let J(t, r) be the current density. The magnetic vector potential generated by it is given by � A(t, r) = (µ/4π) J(t − |r − r� |/c, r� )d3 r� /|r − r� | = (µ/4π)



n

(1/c n!)

n≥0



(∂tn J(t, r� ))|r − r� |n−1 d3 r�

For |r| greater than the source dimension and with θ denoting the angle between the vectors r and r� , we have the expansion �

|r − r� |α = (r2 + r 2 − 2r.r� )α/2 = rα (1 + (r� /r)2 − 2r.r� /r2 )α/2 = rα (1 + (r� /r)2 − 2(r� /r)cos(θ))α/2 � Pm,α (cos(θ))(r� /r)m = rα m≥0

where the generating function of Pm,α (x), m = 0, 1, 2, ... is given by (1 + x2 − 2x.cos(θ))α/2 . Then, � � A(t, r) = (µ/4π) (1/cn n!)rn−1−m ∂tn J(t, r� )r� mPn−1 (r.r� /rr� )d3 r� n,m≥0

A useful approximation to this exact expansion is obtained by taking |r − r� |α ≈ (r2 − 2r.r� )α/2 ≈ rα (1 − αr.r� /r2 ) = rα − α.rα−2 r.r� Further, ∂t n J(t, r� )(r.r� ) = (r� × ∂tn J(t, r� )) × r + (∂tn J(t, r� ).r)r� In the special case when the source is a wire carrying a time varying current I(t), we can write J(t, r� )d3 r� = I(t)dr� and then the above equation gets replaced by

Chapter 7 Chapter 7

Some Aspects of Stochastic Some aspects of stochastic Processes processes �n[1] In a single server queue, let the arrivalsthtake place at the times Sn = customer be Yn and let his k=1 Xk , n ≥ 1, let the service time of the n waiting time in the queue be Wn . Derive the recursive relation Wn+1 = max(0, Wn + Yn − Xn+1 ), n ≥ 1 where W1 = 0. If the Xn� s are iid and the Yn� s are also iid and independent of {Xn }, then define Tn = Yn − Xn+1 and prove �nthat Wn+1 has the same distribution as that of max0≤k≤n Fk where Fn = k=1 Tk , n ≥ 1 and F0 = 0. When Xn , Yn are respectively exponentially distributed with parameters λ, µ respectively explain the sequence of steps based on ladder epochs and ladder heights by which you will calculate the distribution of the equilibrium waiting time W∞ = limn→∞ Wn in the case when µ > λ. What happens when µ ≤ λ ? Solution: The (n + 1)th customer arrives at the time Sn+1 . The previous, ie, the nth customer departs at time Sn + Wn + Yn . If this time of his departure is smaller than Sn+1 , (ie, Sn + Wn + Yn − Sn+1 = Wn + Yn − Xn+1 < 0), then the (n + 1)th customer finds the queue empty on arrival and hence his waiting time Wn+1 = 0. If on the other hand, the time of departure Sn + Wn + Yn of the nth customer is greater than Sn+1 , the arrival time of the (n + 1)th customer, then the n + 1th customer has to wait for a time Wn+1 = (Sn + Wn + Yn ) − Sn+1 = Wn + Yn − Xn+1 before his service commences. Therefore, we have the recursion relation Wn+1 = max(0, Wn + Yn − Xn+1 ), n ≥ 0 115

107

108 116

Stochastic 7. Processes Classical and Physics PROCESSES and Engineering CHAPTER SOMEinASPECTS OFQuantum STOCHASTIC

Note that W1 = 0. This can be expressed as Wn+1 = max(0, Wn + Tn ), Tn − Yn − Xn+1 By iteration, we find W2 = max(0, T1 ), W3 = max(0, T2 +max(0, T1 )) = max(0, max(T2 , T1 +T2 )) = max(0, T2 , T1 +T2 ) W4 = max(0, W3 +T3 ) = max(0, T3 +max(0, max(T2 , T1 +T2 )) = max(0, max(T3 , max(T2 +T3 , T1 +T2 +T3 )))) = max(0, T3 , T2 + T3 , T1 + T2 + T3 ) Continuing, we find by induction that n � Tk , r = 1, 2, ..., n) Wn+1 = max(0, k=r

Now the Tk� s are iid. Therefore, Wn+1 has the same distribution as max(0,

n �

k=r

= max(0,

Tn+1−k , r = 1, 2, ..., n) r �

Tk , r = 1, 2, ..., n)

k=1

In particular, the equilibrium distribution of the waiting time, ie of W∞ = limn→∞ Wn is the same as that of ∞ � max(0, Tk ) k=1

provided that this limiting � r.v. exists, ie, the provided that the infinite series n converges w.p.1. Let Fn = k=1 Tk , n ≥ 1, F0 = 0 Define τ1 denote the first time k ≥ 1 at which Fk > F0 = 0. For n ≥ 1, let τn+1 denote the first time k ≥ 1 at which Fk > Fτn , ie, the n + 1th time k at which Fk exceeds all of Fj , j = 0, 1, ..., k − 1. Then, it is clear that W∞ has the same distribution as that of limFτn , provided that τn exists for all n. In the general case, we have P (W∞ ≤ x) = limn→∞ P (Fτn ≤ x) where if for a given ω, τN (ω) exists for some positive integer N = N (ω) but τN +1 (ω) does not exist, then we set τk (ω) = τN (ω) for all k > N . It is also clear using stopping time theory, that the r.v’s Un+1 = Fτn+1 − Fτn , n = 1, 2, ... are iid and non­negative and we can write W∞ =

∞ �

n=0

U n , U0 = 0

Stochastic Processes in Classical and Quantum Physics and Engineering

109 117

Thus, the distribution of W∞ is formally the limit of FU∗n where FU is the distribution of U1 = Fτ1 with τ1 = min(k ≥ 1, Fk > 0). Note also that for x > 0, ∞ � P (maxk≤n Fk ≤ x < Fn+1 ) 1 − FU (x) = P (U > x) = n=0

[2] Consider a priority queue with two kinds of customers labeled C1 , C2 and a single server. Customers of type C1 arrive at the rate λ1 and customers of type C2 arrive at the rate λ2 . A customer of type C1 is served at the rate µ1 while a customer of type C2 is served at the rate µ2 . If a customer of type C1 is present in the queue, then his service takes immediate priority over a customer of type C2 , otherwise the queue runs on a first come first served basis. Let X1 (t) denote the number of type C1 customers and X2 (t) the number of type C2 customers in the queue at time t. Set up the Chapman­Kolmogorov differential equations for P (t, n1 , n2 ) = P r(X1 (t) = n1 , X2 (t) = n2 ), n1 , n2 = 0, 1, ... and also set up the equilibrium equations for P (n1 , n2 ) = limt→∞ P (t, n1 , n2 )

Solution: Let θ(n) = 1if n ≥ 0 and θ(n) = 0 if n < 0 and let δ(n, m) = 1 if n = m else δ(n, m) = 0. P (t+dt, n1 , n2 ) = P (t, n1 −1, n2 )λ1 dt.θ(n1 −1)+P (t, n1 , n2 −1)λ2 dt.θ(n2 −1)+ P (t, n1 + 1, n2 )µ1 dt + P (t, n1 , n2 + 1)µ2 .dt.δ(n1 , 0) +P (t, n1 , n2 )(1 − (λ1 + λ2 )dt − µ1 .dt.θ(n1 − 1) − µ2 dt.θ(n2 − 1).δ(n1 , 0)) Essentially, the term δ(n1 , 0) has been inserted to guarantee that the C2 type of customer is currently being served only when there is no C1 type of customer in the queue and the term θ(n2 − 1) term has been inserted to guarantee that the C2 type of customer can depart from the queue only if there is at least one such in the queue. The product δ(n1 , 0)θ(n2 − 1) term corresponds to the situation in which there is no C1 type of customer and at least one C2 type of customer which means that a C2 type of customer is currently being served and hence will depart immediately after his service is completed. From these equations follow the Chapman­Kolmogorov equations: ∂t P (t, n1 , n2 ) = P (t, n1 − 1, n2 )λ1 θ(n1 − 1) + P (t, n1 , n2 − 1)λ2 .θ(n2 − 1)+ P (t, n1 + 1, n2 )µ1 + P (t, n1 , n2 + 1)µ2 .δ(n1 , 0) −P (t, n1 , n2 )(λ1 + λ2 ) − µ1 .θ(n1 − 1) − µ2 .θ(n2 − 1).δ(n1 , 0))

110 118

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES

The equilibrium equations are obtained by setting the rhs to zero, ie ∂t P (n1 , n2 ) = 0. [3] Write short notes on any three of the following: [a] Construction of Brownian motion on [0, 1] from the Haar wavelet func­ tions including the proof of almost sure continuity of the sample paths. Solution: Define the Haar wavelet functions as Hn,k (t) = 2(n−1)/2 for (k − 1)/2n ≤ t ≤ k/2n , Hn,k (t) = −2(n−1)/2 for k/2n < t ≤ (k + 1)/2n whenever k is in the set I(n) = {k : kodd, k = 0, 1, ..., 2n−1 } for n=1,2,... and put H0 (t) = 1. Here t ∈ [0, 1]. It is then easily verified that the functions Hnk , k ∈ I(n), n ≥ 1, H00 form an orthonormal basis for L2 [0, 1]. We set I(0) = {0}. Define the Schauder functions � t Snk (t) = Hnk (s)ds 0

Then, from basic properties of orthonormal bases, we have � Hnk (t)Hnk (s) = δ(t − s) nk

and hence



Snk (t)Snk (s) = min(t, s)

nk

Let ξ(n, k), k ∈ I(n), n ≥ 1, ξ(0, 0) be iid N (0, 1) random variables and define the random functions BN (t) =

N � �

n=0 k∈I(n)

ξ(n, k)Snk (t), N ≥ 1

Then since Snk (t) are continuous functions, BN (t) is also continuous for all N . � s for a fixed n are non­overlapping functions and Further, the Snk maxt |Snk (t)| = 2−(n+1)/2 Thus, maxt



k∈I(n)

|Snk (t)| = 2−(n+1)/2

Let b(n) = max(|ξ(nk)| : k ∈ I(n)} Then, by the union bound n

n+1

P (b(n) > n) ≤ 2 .P (|ξ(nk)| > n) = 2



∞ n

≤ K.2n+1 n.exp(−n/ 2)

√ exp(−x2 /2)dx/ 2π

Stochastic Processes in Classical and Quantum Physics and Engineering Thus,



n≥1

111 119

P (b(n) > n) < ∞

which means that by the Borel­Cantelli lemma, for a.e.ω, there is a finite N (ω) such that for all n > N (ω), b(n, ω) < n. This means that for a.e.ω, N � �

n=1 k∈I(n)



|ξ(n, k, ω)||Snk (t)|

N �

n.2−(n+1)/2

n=1

which converges as N → ∞. This result implies that the sequence of con­ tinuous processes BN (t), t ∈ [0, 1] converges uniformly to a continuous process B(t), t ∈ [0, 1] a.e. Note that continuity of each term and uniform convergence imply continuity of the limit. That B(t), t ∈ [0, 1] is a Gaussian process with zero mean and covariance min(t, s) as can be seen from the joint characteristic function of the time samples of BN (.). Alternately, B(.) being an infinite linear superposition of Gaussian r.v’s must be a Gaussian process, its mean zero is obvious since ξ(n, k) have mean zero and the covariance follows from E(BN (t)BN (s)) =

N �



E(ξ(nk)ξ(mr))Snk (t)Smr (s)

n,m=0 k,r ∈I(n)

=

N �



δ(n, m)δ(k, r)Snk (t)Smr (s)

n,m=0 k,r∈I(n)

=

N � �

n=0 k∈I(n)

Snk (t)Snk (s) →

∞ � �

Snk (t)Snk (s) = min(t, s)

n=0 k∈I(n)

[b] L2 ­construction of Brownian motion from the Karhunen­Loeve expansion. The autocorrelation of B(t) is R(t, s) = E(B(t)B(s)) = min(t, s) Its eigenfunctions φ(t) satisfy �

1

0

Thus,



R(t, s)φ(s)ds = λφ(t), t ∈ [0, 1]

t

sphi(s)ds + t 0



1

φ(s)ds = λφ(t) t

112

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES

120

Differentiating, tφ(t) + Again differentiating



1 t

φ(s) − tφ(t) = λφ� (t)

−φ(t) = λφ�� (t)

Let λ = 1/ω 2 . Then, the solution is

φ(t) = A.cos(ωt) + B.sin(ωt) � �

t 0

sφ(s)ds = A((t/ω)sin(ωt) + (cos(ωt) − 1)/ω 2 )+ B(−t.cos(ωt)/ω + sin(ωt)/ω 2 )

1 t

φ(s)ds = A(sin(ω) − sin(ωt))/ω + (B/ω)(cos(ωt) − cos(ω))

Substituting all these expressions into the original integral equation gives A((t/ω)sin(ωt) + (cos(ωt) − 1)/ω 2 )+ B(−t.cos(ωt)/ω + sin(ωt)/ω 2 ) +tA(sin(ω) − sin(ωt))/ω + (Bt/ω)(cos(ωt) − cos(ω)) = (A.cos(ωt) + B.sin(ωt))/ω 2

On making cancellations and comparing coefficients, A.sin(ω) = 0, B.cos(ω) = 0, A/ω 2 = 0, Therefore, A = 0, ω = (n + 1/2)π, n ∈ Z and for normalization



we get B or

2



1

φ2 (t)dt = 1 0 1

sin2 (ωt)dt = 1 0

(B 2 /2)(1 − sin(2ω)/2ω) = 1 or B=



2

Thus, the KL expansion for Brownian motion is given by �� λn ξ(n)φn (t) B(t) = n∈Z

113 121

Stochastic Processes in Classical and Quantum Physics and Engineering

=



√ ((n + 1/2)π)−2 ξ(n). 2sin((n + 1/2)πt)

n∈Z

where ξ(n), n = 1, 2, ... are iid N (0, 1) r.v’s. [c] Reflection principle for Brownian motion with application to the evalua­ tion of the distribution of the first hitting time of the process at a given level. [d] The Borel­Cantelli lemmas with application to proving almost sure con­ vergence of a sequence of random variables. [e] The Feynman­Kac formula for solving Schrodinger’s equation using Brow­ nian motion with complex time. [f] Solving Poisson’s equation �2 u(x) = s(x) in an open connected subset of n R with the Dirichlet boundary conditions using Brownian motion stop times. [4] A bulb in a bulb holder is changed when it gets fused. Let Xn denote the time for which the nth bulb lasts. If n is even, then Xn has the exponential density λ.exp(−λ.x)u(x) while if n is odd then Xn has the exponential density µ.exp(−µ.x)u(x). Calculate (a) the probability distribution of the number of bulbs N (t) = max(n : Sn ≤ t) changed in [0, t] where Sn = X1 + ... + Xn , n ≥ 1 and S0 = 0. Define δ(t) = t − SN (t) , γ(t) = SN (t)+1 − t Pictorially illustrate and explain the meaning of the r.v’s δ(t), γ(t). What hap­ pens to the distributions of these r.v’s as t → ∞ ? Solution: Let n ≥ 1. P (N (t) ≥ n) = P (Sn ≤ t) = P ( Specifically,



Xk +

1≤k≤n,keven



1≤k≤n,nodd

n n � � P (N (t) ≥ 2n) = P ( X2k + X2k−1 ≤ t) k=1

=F

∗n

k=1

∗n

∗ G (t)

where F (t) = 1 − exp(−λt), G(t) = 1 − exp(−µt)

Note that with f (t) = F � (t), g(t) = G� (t), we have

f n∗ (t) = λn tn−1 exp(−λt)/(n − 1)!, g n∗ (t) = µn tn−1 exp(−µt)/(n − 1)! Equivalently, in terms of Laplace transforms, L(f (t)) = λ/(λ + s), L(g(t)) = µ/(µ + s)

Xk ≤ t)

114 122

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES

so L(f ∗n (t)) = λn /(λ + s)n , L(g ∗n (t)) = µn /(µ + s)n and hence f ∗n (t) ∗ g ∗n (t) = L−1 ((λµ)2 /(λ + s)n (µ + s)n ) = ψn (t) say. Then,



P (N (t) ≥ 2n) =

t

ψn (x)dx

0

Likewise, P (N (t) ≥ 2n − 1) = P (S2n−1 ≤ t) =



t

φn (x)dx 0

where φn (t) = g n∗ (t) ∗ f ∗(n−1) (t) = L−1 (λn−1 µn /(λ + s)n−1 (µ + s)n ) δ(t) = t − SN (t) is the time taken from the last change of the bulb to the current time. γ(t) = SN (t)+1 − t is the time taken from the current time upto the next change of the bulb. Thus, for 0 ≤ x ≤ t, � P (δ(t) ≤ x) = P (SN (t) > t − x) = P (Sn > t − x, N (t) = n) n≥1

=



n≥1

P (t − x < Sn ≤ t < Sn+1 ) = =

��

n≥1



n≥1

P (t − x < Sn < t, Xn+1 > t − Sn )

t t−x

P (Xn+1 > t − s)P (Sn ∈ ds)

Likewise, P (γ(t) ≤ x) = P (SN (t)+1 ≤ t + x) =

=



n≥ 0

P (Sn ≤ t < Sn+1 ≤ t + x) =



n≥0

��

n≥0

P (Sn+1 ≤ t + x, Sn ≤ t < Sn+1 )

t 0

P (Sn ∈ ds)P (t − s < Xn+1 < t + x)

[5] [a] Let {X(t) : t ∈ R} be a real stochastic process with autocorrelation E(X(t1 ).X(t2 )) = R(t1 , t2 ). Assume that there exists a finite positive real num­ ber T such that R(t1 + T, t2 + T ) = R(t1 , t2 )∀t1 , t2 ∈ R. Prove that X(t) is a mean square periodic process, ie, E[(X(t + T ) − X(t))2 ] = 0∀t ∈ R

Stochastic Processes in Classical and Quantum Physics and Engineering

115 123

Hence show that X(t) has a mean square Fourier series: limN →∞ E[|X(t) − where ω0 = 2π/T and c(n) = T −1



N �

n=−N

c(n)exp(jnω0 t)|2 ] = 0

T

X(t)exp(−jnω0 t)dt 0

[b] Define a cyclostationary and wide sense cyclostationary process with an example. Give an example of a wide sense cyclostationary process that is not cyclostationary. Solution: A process X(t), t ∈ R is said to be cyclostationary if its finite di­ mensional probability distribution functions are invariant under shifts by integer multiples of some fixed positive T , ie, if P (X(t1 ) ≤ x1 , ..., X(tN ) ≤ xN ) = P (X(t1 + T ) ≤ x1 , ..., X(tN + T ) ≤ xN ) for all x1 , ..., xN , t1 , ..., tN , N ≥ 1. For example, the process � X(t) = c(n)exp(jnωt) n

where c(n) are any r.v’s is cyclostationary with period 2π/ω. In fact, this process is a periodic process. A wide sense cyclostationary process is a process whose mean and autocorrelation function are periodic with period T for some fixed T > 0, ie, if µ(t) = E(X(t)), R(t, s) = E(X(t)X(s)) then µ(t + T ) = µ(t), R(t + T, s + T ) = R(t, s)∀t, s ∈ R An example of a WSS process that is not stationary is X(t) = cos(ω1 t + φ1 ) + cos(ω2 t + φ2 ) + cos(ω3 t + φ1 + φ2 ) where ω1 + ω2 = ω3 and φ1 , φ2 are iid r.v’s uniformly distributed over [0, 2π). This process has second moments that are invariant under time shifts but its  third moments are not invariant under time shifts. Likewise, an example of a WS­cyclostationary process that is not cyclostationary is the process X(t)+Y (t) with Y (t) having the same structure as X(t) but independent of it with frequen­ cies ω4 , ω5 , ω6 (and phases φ4 , φ5 , φ4 + φ5 with φ1 , φ2 , φ4 , φ5 being independent and uniform over [0, 2π)) such that (ω1 + ω2 − ω3 )/(ω4 + ω5 − ω6 ) is irrational. This process is also in fact WSS. To see why it is not cyclostationary, we compute its third moments: E(X(t)X(t+t1 )X(t+t2 )) = E(cos(ω3 t+φ1 +φ2 ).cos(ω1 (t+t1 )+φ1 ).cos(ω2 (t+t2 )+φ2 ))

116 124

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES +E(cos(ω3 t + φ1 + φ2 ).cos(ω1 (t + t2 ) + φ1 ).cos(ω2 (t + t1 ) + φ2 ))

= (1/4)(cos((ω1 + ω2 − ω3 )t + ω1 t1 + ω2 t2 )

+cos((ω1 + ω2 − ω3 )t + ω2 t1 + ω1 t2 ))

and likewise E(Y (t)Y (t + t1 )Y (t + t2 )) = = (1/4)(cos((ω4 + ω5 − ω6 )t + ω4 t1 + ω5 t2 ) +cos((ω4 + ω5 − ω6 )t + ω5 t1 + ω4 t2 )) Thus, if Z(t) = X(t) + Y (t) then E(Z(t)Z(t+t1 )Z(t+t2 )) = E(X(t)X(t+t1 )X(t+t2 ))+E(Y (t)Y (t+t1 )Y (t+t2 )) = (1/4)(cos((ω1 + ω2 − ω3 )t + ω1 t1 + ω2 t2 )

+cos((ω1 + ω2 − ω3 )t + ω2 t1 + ω1 t2 ))

+(1/4)(cos((ω4 + ω5 − ω6 )t + ω4 t1 + ω5 t2 )

+cos((ω4 + ω5 − ω6 )t + ω5 t1 + ω4 t2 ))

which is clearly not periodic in t since (ω1 + ω2 − ω3 )/(ω4 + ω5 − ω6 ) is irrational.

[6] Explain the construction of the Lebesgue measure as a countably additive non­negative set function on the Borel subsets of R whose value on an interval coincides with the interval’s length. Solution: We first define the Lebesgue measure µ on the field F of finite disjoint unions of intervals of the form (a, b] in (0, 1] as the sum of lengths. Then, we define the outer measure µ∗ on the class of all subsets of (0, 1] as �

µ(En ) : En ∈ F, E ⊂ En } µ∗ (E) = inf { n

n

We prove countable subadditivity of µ∗ . Then, we say that E is a measurable subset of (0, 1] if µ∗ (EA) + µ∗ (E c A) = µ∗ (A)∀A ⊂ (0, 1] Then, if M denotes the class of all measurable subsets, we prove that M is a field and that µ∗ is finitely additive on M. We then prove that M is a σ field and that µ∗ is countably additive on M. Finally, we prove that M contains the field F and hence the σ­field generated by F, namely, the Borel σ­field on (0, 1].

Stochastic Processes in Classical and Quantum Physics and Engineering

117 125

[7] [a] What is a compact subset of Rn ? Give an example and characterize it means of (a) the Heine­Borel property and (b) the Bolzano­Weierstrass property. Prove that the countable intersection of a sequence of decreasing non­empty compact subsets of Rn is non­empty. A compact subset E of Rn is a closed and bounded subset. By bounded, we mean that sup{|x| : x ∈ E} < ∞

By closed, we mean that if xn ∈ E is a sequence in E such that limxn = x ∈ Rn exists, then x ∈ E. For example, E = {x ∈ Rn : |x| ≤ 1}

It can be shown that E is compact iff every open cover of it has finite subcover, ie, if Oα , α ∈ I are open sets in Rn such that

E⊂ Oα α∈I

then there is a finite subset

F = {α1 , ..., αN } ⊂ I such that E⊂

N

O αk

k=1

(b) What is a regular measure on (Rn , B(Rn ))?. Show that every probability measure on (Rn , B(Rn )) is regular. Solution: A measure µ on (Rn , B(Rn )) is said to be regular if given any B ∈ B(Rn ) and � > 0, there exists an open set F and a closed set C in Rn such that C ⊂ B ⊂ F and µ(F − C) < �. Let us now show that any probability measure (and hence any finite measure) on (Rn , B(Rn )) is regular. Let P be a probability measure on the above space. Define C to be the family of all sets B ∈ B(Rn ) such that ∀� > 0, there exists a closed set C and an open set F such that C ⊂ B ⊂ F and P (F −C) < �. We prove that C is a σ­field and then since it contains all closed sets (If C is a closed set and Fn = {x ∈ Rn : d(x, C) < 1/n}, then Fn is open, C ⊂ C ⊂ Fn and P (Fn − C) → 0 because Fn ↓ C), it will follow that C = B(Rn ) which will complete the proof of the result. Let B ∈ C and let � > 0. Then, there exists a closed C and an open F such that C ⊂ B ⊂ F and P (F − C) < �. Then F c is closed, C c is open, F c ⊂ B c ⊂ C c and P (C c − F c ) = P (F − C) < �, thereby proving that B c ∈ C, ie, C is closed under complementation. Now, let Bn ∈ C, n = 1, 2, ... and let � > 0. Then, for each n we can choose a closed n with NCn and an open Fn such that Cn ⊂ Bn ⊂F∞ Then, n=1 Cn is closed for any finite integer N , n=1 Fn P (Fn − Cn ) < �/2n .  is open and for B = n Bn , we have that N

n=1

Cn ⊂ B ⊂

n

Fn , N ≥ 1

118126

Stochastic 7. Processes Classical and Physics PROCESSES and Engineering CHAPTER SOMEinASPECTS OFQuantum STOCHASTIC

and further P(

n



N �

n=1

Fn −

N

n=1

Cn ) ≤ P ((

N

n=1

P (Fn − Cn ) +



n>N

(Fn − Cn ))

P (Fn ) ≤

N �



�/2n +

n=1

Fn )

n>N



P (Fn )

n>N

which converges as N → ∞ to � and hence for sufficiently large N , it must be smaller than 2� proving thereby that B ∈ C, ie, C is closed under countable unions. This completes the proof that C is a σ­field and hence the proof of the result. (c) What is a consistent family of finite dimensional probability distributions on (RZ+ , B(RZ+ ))? Solution: Such a family is a family of probability distributions F (x1 , ..., xN , n1 , ..., nN ) on RN for each set of positive integers n1 , ..., nN and N = 1, 2, ... satisfying (a)

F (xσ1 , ..., xσN , nσ1 , ..., nσN ) = F (x1 , ..., xN , n1 , ..., nN )

for all permutations σ of (1, 2, ..., N ) and (b)

limxN →∞ F (x1 , ..., xN , n1 , ..., nN ) = F (x1 , ..., xN −1 , n1 , ..., nN −1 )

(d) Prove Kolmogorov’s consistency theorem, that given any consistent fam­ ily of finite dimensional probability distributions on (RZ+ , B(RZ+ )), there exists a probability space and a real valued stochastic process defined on this space whose finite dimensional distributions coincide with the given ones. Let F (x1 , ..., xN ; n1 , ..., nN ), n1 , ..., nN ≥ 0, x1 , ..., xN ∈ R, N ≥ 1 be a con­ sistent family of probability distributions. Thus F (x1 , ..., xN ; 1, 2, ..., N ) = FN (x1 , ..., xN ) is a probability distribution on RN with the property that limxN →∞ FN (x1 , ..., xN ) = FN −1 (x1 , ..., xN −1 ) The aim is to construct a (countably additive) probability measure P on (RZ+ , B(RZ+ ) such that where

P (B × RZ+ ) = PN (B), B ∈ B(RN )

PN (B) = Define



FN (x1 , ..., xN )dx1 ...dxN B

Q(B × RZ+ ) = PN (B), B ∈ B (RN )

Then by the consistency of FN , N ≥ 1, Q is a well defined finitely additive probability measure on the field

F= (B(RN ) × RZ+ ) N ≥1

Stochastic Processes in Classical and Quantum Physics and Engineering Note that

119 127

BN = B(RN ) × RZ+

is a field on RZ+ for every N ≥ 1 and that

BN ⊂ BN +1 , N ≥ 1 ˜N ∈ It suffices to prove that Q is countably additive on F , or equivalently, if B ˜ ˜ BN is a sequence such that BN ↓ φ, then Q(BN ) ↓ 0. Equivalently, it suffices to show that if BN ∈ B(RN ) is a sequence such that ˜ N = BN × R Z + ↓ φ B then PN (BN ) ↓ 0 Suppose that this is false. Then, without loss of generality, we may assume that there is a δ > 0 such that PN (BN ) ≥ δ∀N ≥ 1. By regularity of probability measures on RN for N finite, we can find for each N ≥ 1, a compact set KN ∈ B(RN ) such that KN ⊂ BN and PN (BN − KN ) < δ/2N . Now consider ˜ N = (K1 × RN −1 ) ∩ (K2 ∩ RN −2 ) ∩ ... ∩ (KN −1 × R) × KN K Then

˜ N ⊂ KN ⊂ RN K

˜ N being a closed subset of the compact set KN is also compact. K ˜ N is and K closed because each Kj , j = 1, 2, ..., N is compact and hence closed. Define ˆN = K ˜ N × RZ + , N ≥ 1 K ˆ N is decreasing and that Then, it is clear that K ˜ N = BN × R Z + ˆN ⊂ B K Further, Now,

ˆ N ) = Q(B ˜N ) − Q(B ˜N − K ˆN ) Q(K ˜N ) = PN (BN ) ≥ δ Q(B

and ˜N − K ˆ N ) = PN (BN − Q(B = PN (

N

k=1

≤ PN (

N �

k=1

(Kk × RN −k ))

(BN − (Kk × RN −k ))

N

k=1

(Bk − Kk ) × RN −k )

120 128

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES

≤ Thus,

N �

k=1

Pk (Bk − Kk ) =

N �

k=1

δ/2k ≤ δ/2

ˆ N ) ≥ δ/2 Q(K

ˆ N is non­empty for each N . Hence, so is K ˜ N for each N . Now and hence K ˆ ˜ ˆ KN ⊂ BN ↓ φ. Hence KN ↓ φ and we shall obtain a contradiction to this as ˜ N is non­empty, we may choose follows: Since K ˜ N , N = 1, 2, ... (xN 1 , ..., xN N ) ∈ K Then, xN 1 ∈ K1 , (xN 1 , xN 2 ) ∈ K2 , ..., (xN 1 , ..., xN N ) ∈ KN Since K1 is compact, xN 1 , N ≥ 1 has a convergent subsequence say {xN1 ,1 }. Now, (xN1 ,1 , xN1 ,2 ) ∈ K2 and since K2 is compact, this has a convergent subse­ quence say (xN2 ,1 , xN2 ,2 ). Then, {xN2 ,1 } is a subsequence of {xN1 ,1 } and hence also converges. Continuing this way, we finally obtain a convergent sequence (yn,1 , ..., yn,N ), n = 1, 2, ... in KN such that (yn,1 , ...yn,j ) ∈ Kj , j = 1, 2, ..., n and of course (yn,1 , ..., yn,j ), n ≥ 1 converges. Let yj = limn yn,j , j = 1, 2, ... Then, it is clear that since Kj is closed, (y1 , ..., yj ) ∈ Kj , j = 1, 2, ... Hence and hence,

˜N , N ≥ 1 (y1 , y2 , ...yN ) ∈ K ˆN , N ≥ 1 (y1 , y2 , ...) ∈ K

which means that (y1 , y2 , ...) ∈



ˆN K

N ≥1

which contradicts the fact that the rhs is an empty set. This proves the Kol­ mogorov consistency theorem.

Stochastic Processes in Classical and Quantum Physics and Engineering 7.1. PROBLEM SUGGESTED BY PROF.DHANANJAY

7.1

121 129

Problem suggested

When the magnet falls within the tube of radius R0 with the coil having n0 turns per unit length and extending from z = h to z = h + a and with the tube extending from z = 0 to z = d, we write down the equations of motion of the magnet. We usually may assume that at time t, the magnetic moment vector of the magnet is m and that its centre does not deviate from the central axis of the tube. This may be safely assumed provided that the tube’s radius is small. However suppose we do not make this assumption so that we aim a the most general possible non­relativistic situation in which the magnetic fields are strong which may cause the position of the CM of the magnet to deviate from the central axis of the cylinder. At time t, assume that the CM of the magnet is located at r0 (t) = (ξ1 (t), ξ2 (t), ξ3 (t)). We require a differential equation for ξ(t). The magnet at time t is assumed to have experienced a rotation R(t) (a 3 × 3 rotation matrix). Thus, it moment at time t is given by z m(t) = m0 R(t)ˆ assuming that at time t = 0, the moment pointed along the positive z direction. The magnetic field produced by the magnet at time t is given by Bm (t, r) = curl(µ/4π)(m(t) × (r − r0 (t))/|r − r0 (t)|3 ) and the flux of this magnetic field through the coil is 1 h+a 1 Φ(t) = n0 dz Bmz (t, x, y, z)dxdy h

x2 +y 2 ≤R02

The angular momentum vector of the magnet is proportional to its magnetic moment. So by the torque equation, we have dm(t)/dt = γ.m(t) × Bc (t, r0 (t)) − − − (1) where Bc (t, r) is the magnetic field produced by the coil. The coil current is I(t) = Φ' (t)/RL where RL is the coil resistance. Then, the magnetic field produced by the coil is 1 2n0 π [(−ˆ x.sin(φ)+ˆ y.cos(φ))×(r−R0 (cos(φ)ˆ x Bc (t, r) = (µ/4π)I(t) 0

+sin(φ)ˆ y )−(a/2πn0 )φzˆ]

/|r − R0 (cos(φ)ˆ x + sin(φ)ˆ y ) − (a/2πn0 )φzˆ|3 R0 dφ This expression can be used in (1). Note that I(t) depends on Φ(t) which in turn is a function of m(t). Since in fact I(t) is proportional to Φ' (t), it follows that I(t) is function of m(t) and m' (t). Note that Bc (t, r) is proportional to I(t) and is therefore also a function of m(t) and m' (t). However, we also note

122 130

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES

that Bm (t, r) is also a function of r0 (t). Hence, Φ(t) is also a function of r0 (t) and hence I(t) is also a function of r0 (t) and r�0 (t). Thus, Bc (t, r) is a function of r0 (t), r�0 (t), m(t), m� (t), ie, we can express it as Bc (t, r|r0 (t), r�0 (t), m(t), m� (t)) Thus, we write (1) as dm(t)/dt = γ.m(t) × Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t)) − − − (2) Next, the force exerted on the magnet by the magnetic field produced by the coil is given by F(t) = �r0 (t) (m(t).Bc ((t, r0 (t)|r0 (t), r0� (t), m(t), m� (t)) which means that the rectilinear equation of motion of the magnet must be given by M r0�� (t) = F(t) − M gzˆ

However to be more precise, the term V = −m(t).Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t)) represents a potential energy of interaction between the magnet and the reac­ tion field produced by the coil. So its recilinear equation of motion should be given by d/dt∂L/∂r0� (t) − ∂L/∂r0 (t) = 0 where L, the Lagrangian is given by L = M |r0� (t)|2 /2 − V − mgz0 (t) Thus the rectilinear equation of motion of the coil is given by M r0�� (t) + d/dt(�r�0 (t) m(t).Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t))) −�r0 (t) (m(t).Bc (t, r0 (t)|r0 (t), r0� (t), m(t), m� (t))) + mgzˆ = 0 − − − (3) (2) and (3) constitute the complete set of six coupled ordinary differential equa­ tions for the six components of m(t), r0 (t).

7.2

Converse of the Shannon coding theorem

Let A be an alphabet. Let u1 , ..., uM ∈ An with Pu i = P where for u ∈ An , Pu (x) = N (x|u)/n, x ∈ A with N (x|u) denoting the number of times that x has occurred in u. Let νx (y), x ∈ A, y ∈ B be the discrete memory channel transition probability from the alphabet A to the alphabet B. Define � P (x)νx (y), y ∈ B Q(y) = x

Stochastic Processes in Classical and Quantum Physics and Engineering 7.2. CONVERSE OF THE SHANNON CODING THEOREM

123 131

let D1 , ..., DM be disjoint sets in B n such that νuk (Dk ) > 1 − �, k = 1, 2, ..., n, where νu (v) = Πnk=1 νu(k) (v(k)) for u = (u(1), ..., u(n)), v = (v(1), ..., v(n)). Let Ek = Dk ∩ T (n, νuk , δ), k = 1, 2, ..., M where T (n, f, δ) is the set of δ­Bernoulli typical sequences of length n having probability distribution f . For u ∈ An with Pu = P , define T (n, νu , δ) to consist of all v ∈ B n for which vx ∈ T (N (x|u), νx , δ)∀x ∈ A Then we claim that if v ∈ T (n, νu , δ), then

√ v ∈ T (n, Q, δ a)

The notation used is as follows. For x ∈ A, and u ∈ An , v ∈ B n , define vx = (v(k) : u(k) = x, k = 1, 2, ..., n) Note that vx ∈ B N (x|u) . To prove this claim, we note that for y ∈ B, � � P (x)νx (y)| |N (y|v) − nQ(y)| = | N (y|vx ) − n x

=|

� x

N (y|vx ) −





x

x

N (x|u)νx (y)| ≤

� x

|N (y|vx ) − N (x|u)νx (y)|

� � δ N (x|u)νx (y)(1 − νx (y)) x

� � √ ≤ δ a. N (x|u)νx (y)(1 − νx (y)) x

√ = δ an. Now, � x

and therefore,





� x

P (x)νx (y)(1 − νx (y))

P (x)νx (y) = Q(y),

x

P (x)νx (y)2 ≥ (



P (x)νx (y))2 = Q(y)2

x

√ � |N (y|v) − nQ(y)| ≤ δ an Q(y)(1 − Q(y)

proving the claim. For v ∈√Ek , we have v ∈ T (n, νuk , δ) and hence by the result just proved, v ∈ T (n, Q, δ a). Now let v ∈ T (n, Q, δ).

124 132

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 7. SOME ASPECTS OF STOCHASTIC PROCESSES It follows that since Qn (v) = Πy∈B Q(y)N (y|v)

and since |N (y|v) − nQ(y)| ≤ δ we have nQ(y) − δ

� nQ(y)(1 − Q(y)), y ∈ B



nQ(y)(1 − Q(y)) ≤ N (y|v) ≤ nQ(y) + δ

and therefore, Πy Q(y)nQ(y)+δ



nQ(y)(1−Q(y))

� nQ(y)(1 − Q(y)), y ∈ B

≤ Qn (v) ≤ Πy Q(y)nQ(y)−δ

Thus summing over all v ∈ T (n, Q, δ), we get √

2−nH(Q)−c

n

µ(T (n, Q, δ)) ≤ Qn (T (n, Q, δ) ≤ 2−nH(Q)+c

where c = −δ





nQ(y)(1−Q(y))

n

µ(T (n, Q, δ))

�� y

Q(y)(1 − Q(y))log2 (Q(y))

and where µ(E) denotes the cardinality of a set E. In particular, µ(T (n, Q, δ)) ≤ 2nH(Q)+c.sqrtn It follows that M �

µ(Ek ) = µ(

k=1

k

√ √ Ek ) ≤ µ(T (n, Q, δ a)) ≤ 2nH(Q)+c an

On the other hand, νuk (Dk ) > 1 − � and by the Chebyshev inequality combined with the union bound, νx,N (x|u) (T (N (x|u), νx , δ)) ≥ 1 − b/δ 2 and hence νu (T (n, νu , δ)) ≥ 1 − ab/δ 2

for any u ∈ An . In particular,

νuk (Ek ) ≥ 1 − ab/δ 2 − �, k = 1, 2, ..., M Further, by the above argument, we have for v ∈ T (n, νu , δ), νx,N (x|u) (vx ) ≤ 2N (x|u)H(νx )+c and hence,

νu (v) ≤ 2−



x



n

√ N (x|u)H(νx )−c n

125 133

Stochastic Processes in Classical and Quantum Physics and Engineering 7.2. CONVERSE OF THE SHANNON CODING THEOREM = 2−nI(P,ν)+c where I(P, ν) =





n

P (x)H(νx ), P = Pu

x

From this we easily deduce that

νuk (Ek ) ≤ 2−nI(P,ν)+c Thus, (1 − ab/δ 2 − �)2nI(P,ν)−c



n

µ(Ek ), k = 1, 2, ..., M

n

≤ µ(Ek ), k = 1, 2, ..., M

n





so that √

M (1 − ab/δ 2 − �)2nI(P,ν)−c

� k

µ(Ek ) ≤ 2nH(Q)+c

or equivalently, √ log(M )/n ≤ H(Q) − I(P, ν) + O(1/ n) which proves the converse of Shannon’s coding theorem.



n

Chapter 8

Chapter 8

Problems in Problems inElectromagnetics and Gravitation electromagnetics and gravitation The magnetic field produced by a magnet of given magnetic dipole moment in a background gravitational field taking space­time curvature into account. The Maxwell equations in curved space­time are √ √ (F µν −g),ν = J µ −g or equivalently

√ √ (g µα g νβ −gFαβ ),ν = J µ −g

Writing gµν (x) = ηµν + hµν (x) where η is the Minkowski metric and h is a small perturbation, we have approx­ imately g µν = ηµν − hµν , hµν = ηµα ηνβ hαβ and



−g = 1 − h/2, h = ηµν hµν

We have Fµν = Aν,µ − Aµ,ν and writing (1) Aµ = A(0) µ + Aµ (1)

where Aµ(0) is the zeroth order solution, ie, without gravity and Aµ the first order correction to the solution, ie, it is a linear functional of hµν , we obtain using first order perturbation theory (0) (1) Fµν = Fµν + Fµν

135

127

128 Stochastic Processes in Classical and Quantum Physics and Engineering 136CHAPTER 8. PROBLEMS IN ELECTROMAGNETICS AND GRAVITATION where (k) (k) = A(k) Fµν ν,µ − Aµ,ν , k = 0, 1 √ g µα g νβ −g =

(ηµα − hµα )(ηνβ − hνβ )(1 − h/2)

= ηµα ηνβ − fµναβ where fµναβ = ηµα ηνβ h/2 + ηµα hνβ +ηνβ hµα Then

√ g µα g νβ −gFαβ = (0)

ηµα ηνβ Fαβ (1)

(0)

+ηµα ηνβ Fαβ − fµναβ Fαβ = F 0µν + F 1µν where

(0)

F 0µν = ηµα ηνβ Fαβ , (1)

(0)

F 1µν = ηµα ηνβ Fαβ − fµναβ Fαβ Also

√ J µ −g = J 0µ + J 1µ , J 0µ = J µ , J 1µ = −J µ h/2

The gauge condition becomes

√ (Aµ −g),µ = 0 √ 0 = (g µν −gAν ),µ =

[(ηµν − hµν )(1 − h/2)Aν ],µ (1) = [(ηµν − kµν )(A(0) ν + Aν )],µ

where kµν = hµν − hηµν /2 and hence equating coefficients of zeroth and first orders of smallness on both sides gives ηµν A(0) ν,µ = 0, (1) (kµν Aν(0) ),µ = ηµν Aν,µ

Chapter 9

Chapter 9

Some more of Someaspects More Aspects of quantum Quantum information Information Theory theory 9.1

An expression for state detection error

Let ρ, σ be two states and let their spectral decompositions be � � |fa > q(a) < fa | ρ= |ea > p(a) < ea |, σ = a

a

so that < ea |eb >= δ(a, b), < fa |fb >= δ(a, b). The relative entropy between these two states is D(ρ|σ) = T r(ρ.(log(ρ) − log(σ))) � � = p(a)log(p(a)) − p(a)log(qb )| < ea |fb > |2 a

=

� a,b

a,b

(p(a)| < ea |fb > |2 (log(p(a)| < ea |fb > |2 ) − log(qb | < ea |fb > |2 )) =

� a,b

(P (a, b)log(P (a, b)) − log(Q(a, b))) = D(P |Q)

where P, Q are two bivariate probability distributions defined by P (a, b) = p(a)| < ea |fb > |2 , Q(a, b) = q(b)| < ea |fb > |2 Now let 0 ≤ T ≤ 1, ie, T is a detection operator. Consider the detection error probabiity 2P (�) = T r(ρ(1 − T )) + T r(σT ) � = (p(a) < ea |1 − T |ea > −q(a) < fa |T |fa >) a

137

129

130138CHAPTER Stochastic Processes Classical and Quantum Physics and Engineering 9. SOME MOREin ASPECTS OF QUANTUM INFORMATION THEORY =

� a

p(a)(1− < ea |T |ea >) −

� a,b

q(a)| < fa |eb > |2 < eb |T |fa >



2

Suppose T is a projection, ie, T = T = T . Then, � � T r(ρ(1 − T )) = p(a) < ea |1 − T )|ea >= p(a) < ea |(1 − T )2 |ea > a

a

=

� a,b

p(a)| < ea |1 − T |fb > |2

Likewise, T r(σT ) =

� a,b

q(b) < ea |T |fb > |2

Hence, 2P (�) =

� a,b



� a,b

(p(a)| < ea |1 − T |fb > |2 + q(b) < ea |T |fb > |2 )

min(p(a), q(b))(| < ea |1 − T |fb > |2 + q(b) < ea |T |fb > |2 ) ≥

� a,b

(1/2)min(p(a), q(b))| < ea |fb > |2

where we have used 2(|z|2 + |w|2 ) ≥ |z + w|2 for any two complex numbers z, w. Note that � (1/2)min(p(a), q(b))| < ea |fb > |2 = a,b

(1/2)min0≤t(a,b)≤1,a,b=1,2,...,n

� a,b

[p(a)| < ea |fb > |2 (1−t(a, b))+ q(b)| < ea |fb > |2 t(a, b))]

because if u, v are positive real numbers, then as t varies over [0, 1], tu + (1 − t)v attains is minimum value of v when t = 0 if u ≥ v and u when t = 1 if u < v. In other words, mint∈[0,1] (tu + (1 − t)v) = min(u, v) In other words we have proved that 2P (�) ≥ (1/2)min0≤t(a,b)≤1,a,b=1,2,...,n

� a,b

[P (a, b)(1 − t(a, b)) + Q(a, b)t(a, b)]

The quantity on the rhs is the minimum error probability for the classical hy­ pothesis testing problem between the two bivariate probability distributions P, Q.

Stochastic Processes in Classical and Quantum Physics and Engineering 131 9.2. OPERATOR CONVEX FUNCTIONS AND OPERATOR MONOTONE FUNCTIONS1

9.2 Operator convex functions and operator mono­ tone functions [1] Let f be operator convex. Then, if K is a contraction matrix, we have f (K ∗ XK) ≤ K ∗ f (X)K for all Hermitian matrices X. By contraction we mean that K ∗ K ≤ I. To see this define the following positive (ie non­negative) matrices √ √ L = 1 − K ∗ K, M = 1 − KK ∗ Then by Taylor expansion and the identity K(K ∗ K)m = (KK ∗ )m K, m = 1, 2, ..., we have

KL = M K, LK ∗ = K ∗ M

Define the matrix U=



K L

−M K∗

and observe that since L∗ = L, M ∗ = M and



K ∗ K + L2 = I, −K ∗ M + LK ∗ = 0, −M K + KL = 0, M 2 + KK ∗ = I it follows that U is a unitary matrix. Likewise define the unitary matrix � � K M V = L −K ∗ Also define for two Hermitian matrices A, B � � A 0 T = , 0 B Then, we obtain the identity � ∗ K AK + LBL ∗ ∗ (1/2)(U T U + V T V ) = 0

0 KBK ∗ + M AM

Using operator convexity of f , we have f ((1/2)(U ∗ T U +V ∗ T V )) ≤ (1/2)(f (U ∗ T U )+f (V ∗ T V ))

= (1/2)(U ∗ f (T )U +V ∗ f (T )V ) which in view of the above identity, results in f (K ∗ AK + LBL) ≤ K ∗ f (A)K + Lf (B)L and

f (KBK ∗ + M AM ) ≤ Kf (B)K ∗ + M f (A)M



132 Stochastic Processes in Classical and Quantum Physics and Engineering 140CHAPTER 9. SOME MORE ASPECTS OF QUANTUM INFORMATION THEORY In particular, taking B = 0, the first gives f (K ∗ AK) ≤ K ∗ f (A)K + f (0)L2 and choosing A = 0, the second gives f (KB K ∗ ) ≤ Kf (B)K ∗ + f (0)M 2 In particular, if in addition to being operator convex, f satisfies f (0) ≤ 0, then we get f (K ∗ AK) ≤ K ∗ f (A)K, f (KBK ∗ ) ≤ Kf (B)K ∗

for any contraction K. Now we show that if f is operator convex and f (0) ≤ 0, then for operators K1 , ..., Kp that satisfy p � j=1

we have

p �

f(

j=1

In fact, defining

Kj∗ Kj ≤ I

Kj∗ Aj Kj ) ≤ ⎛

K1 ⎜ K2 K=⎜ ⎝ .. Kp

we observe that

p �

Kj∗ f (Aj )Kj

j=1

0 0 .. 0

... ... ... ...

� Kj∗ Kj 0 ≤I 0 0 ie K is a contraction and hence by the above result K ∗K =

� �p

⎞ 0 0 ⎟ ⎟ .. ⎠ 0

j=1

f (K ∗ T K) ≤ K ∗ f (T )K Taking T = diag[A1 , A2 , ..., Ap ] we get K ∗T K = and hence,



f(

� �p

j=1

�p

j=1

Kj∗ Aj Kj ) 0

0 f (0)

0 0 �



� Kj∗ f (Aj )Kj 0 0 0 which results in the desired inequality. Note that from the above discussion, we also have � � f( Kj Bj Kj∗ ) ≤ Kj f (Bj )Kj∗ ≤

j

� �p

Kj∗ Aj Kj 0

j=1

j

Stochastic Processes in Classical and Quantum Physics and Engineering 9.3. A SELECTION OF MATRIX INEQUALITIES

9.3

133 141

A selection of matrix inequalities

[1] If A, B are square matrices the eigenvalues of AB are the same as those of BA. If A is non­singular, then this is obvious since A−1 (AB)A = BA Likewise if B is non­singular then also it is obvious since B −1 (BA)B = AB If A is singular, we can choose a sequence cn → 0 so that A + cn I is non­singular for every n and then (A + cn I)−1 (A + cn I)B(A + cn I) = B(A + cn I)∀n Thus, (A + cn I)B has the same eigenvalues as those of B(A + cn I) for each n. Taking the limit n → ∞, the result follows since the eigenvalues of (A + cI)B are continuous functions of c. [2] Let A be any square diagonable matrix. Let λ1 , ..., λn be its eigenvalues. We have for any unit vector u, | < u, Au > |2 ≤< u, A∗ Au > Let e1 , ..., en be a basis of unit vectors of A corresponding to the eigenvalues λ1 , ..., λn respectively. Assume that |λ1 >≥ |λ2 | ≥ ... ≥ |λn | Let M be any k dimensional subspace and consider the n − k + 1 dimensional subspace W = span{ek , ek+1 ..., en }. Then dim(M ∩ W ) ≥ 1 and hence we can choose a unit vector u ∈ M ∩ W . Then writing u = ck ek + ... + cn en , it follows that | < u, Au > | = |ck λk ek + ... + cn λn en | ≤ |λk |.(|ck | + ... + |cn |) Suppose we Gram­Schmidt orthonormalize ek , ek+1 , ..., en in that order. The resulting vectors are wk = ek , wk+1 ∈ span(ek , ek+1 ), ..., wn ∈ span(ek , ..., en ) Specifically, wk+1 = r(k + 1, k)ek + r(k + 1, k + 1)ek+1 , ..., wn = r(n, k)ek + ... + r(n, n)en Then we can write u = dk wk +...+dm wn , |dk |2 +...+|dn |2 = 1, < wr , ws >= δ(r, s), r, s = k, k+1, ..., n

134142CHAPTER Stochastic Processes Classical and Quantum Physics and Engineering 9. SOME MOREin ASPECTS OF QUANTUM INFORMATION THEORY and we find that Awk = λk wk , Awk+1 = r(k + 1, k)λk ek + r(k + 1, k + 1)λk+1 ek+1 , Awn = r(n, k)λk ek + ... + r(n, n)λn en Diagonalize A so that A=

n �

λk ek fk∗ , fk∗ ej = δ(k, j)

k=1

Then, < fk , Aek >= λk Thus, |λk | ≤� fk � . � Aek � Let A⊗a k denote the k­fold tensor product of A with itself acting on the k fold antisymmetric tensor product of Cn . Then, the eigenvalues of A⊗a n are λi1 ...λik with n ≥ i1 > i2 > ... > ik ≥ 1 and likewise its singular values are si1 ...sik , n ≥ i1 > i2 > ... > ik ≥ 1 where s1 ≥ s2 ≥ ... ≥ sn ≥ 0 are the singular values of A. Now if T is any operator whose eigenvalue having maximum magnitude is µ1 and whose maximum singular value is s1 , then we have |µ1 | = spr(T ) = limn � T n �1/n ≤� T �= s1 Applying this result to T = A⊗a k gives us Πki=1 |λi | ≤ Πki=1 si , k = 1, 2, ..., n

[3] If A, B are such that AB is normal, then � AB �≤� BA � In fact since AB is normal, its spectral radius coincides with its spectral norm and further, since AB and BA have the same eigenvalues, the spectral radii of AB and BA are the same and finally since the spectral radius of any matrix is always smaller than its spectral norm (� T n �1/n ≤� T �), the result follows immediately: � AB �= spr(AB) = spr(BA) ≤� BA �

[1] Show that if S, T ≥ 0 and S ≤ 1, then 1 − (S + T )−1/2 S(S + T )−1/2 ≤ 2 − 2S + 4T

Stochastic Processes in Classical and Quantum Physics and Engineering 9.3. A SELECTION OF MATRIX INEQUALITIES

135 143

[2] Show that if Sk , Tk , Rk , k = 1, 2 are positive operators in a Hilbert space such that (a) [S1 , S2 ] = [T1 , T2 ] = [[R1 , R2 ] = 0 and Sk + Tk ≤ Rk , k = 1, 2,, then for any 0 ≤ s ≤ 1, we have Lieb’s inequality: R11−s R2s ≥ S11−s S2s + T11−s T2s First we show that this inequality holds for s = 1/2, then we show that if it holds for s = µ, ν, then it also holds for s = (µ + ν)/2 and that will complete the proof. 1/2 1/2 1/2 1/2 | < y|S1 S2 + T1 T2 |x > |2 1/2

1/2

1/2

1/2

≤ (| < S1 y, S2 x > | + | < T1 y, T2 x > |)2 1/2

1/2

1/2

1/2

≤ [� S1 y � . � S2 x � + � T1 y � . � T2 x �]2 1/2

1/2

1/2

1/2

≤ [� S1 y �2 + � T1 y �2 ] ×[� S2 x �2 + � T2 x �2 ] =< y, (S1 + T1 )y >< x, (S2 + T2 )x >≤< y, R1 y >< x, R2 x > −1/4

−1/4

1/4

1/4

Then, replacing y by R1 R2 x and x by R2 R1 x in this inequality gives us (recall that R1 and R2 are assumed to commute) −1/4

< x|R1

1/4

1/2

1/2

R2 (S1 S2

1/2

−1/4

1/2

+ T1 T2 )R2 1/2

1/4

R1 |x >

1/2

≤< x, R1 R2 x > Now define the matrices −1/4

B = R1

1/4

1/2

1/2

R2 (S1 S2

Then it is clear that

1/2

1/2

−1/4

1/2

+ T1 T 2 , A = R 2 1/2

AB = S1 S2

1/2

1/4

R1

1/2

+ T1 T 2

is a Hermitian positive matrix because [S1 , S2 ] = [T1 , T2 ] = 0 and we also have the result that the eigenvalues of AB are the same as those of BA. Further, from the above inequality, the k th largest eigenvalue of BA is ≤ the k th largest 1/2 1/2 eigenvalue of R1 R2 . It follows therefore that the k th largest eigenvalue of the Hermitian matrix AB is ≤ the k th largest eigenvalue of the Hermitian matrix 1/2 1/2 R1 R2 . However, from this result, we cannot deduce the desired inequality. To do so, we note that writing 1/2

1/2

S 1 S2

1/2

1/2

+ T1 T 2

we have proved that | < y, Cx > |2 ≤< y, R1 y >< x, R2 x > This implies −1/2

| < R1

−1/2

y, CR2

x > | ≤� y � . � x �

136 Stochastic Processes in Classical and Quantum Physics and Engineering 144CHAPTER 9. SOME MORE ASPECTS OF QUANTUM INFORMATION THEORY for all x, y and hence −1/2

| < y, R1

−1/2

CR2

x > |/ � y � . � x �≤ 1

for all nonzero x, y. Taking supremum over x, y yields −1/2

� R1

−1/2

CR2

�≤ 1

and by normality of −1/4

R1 1/4

−1/4

(R1 R2 where

1/4

−1/4

R2

−1/2

)(R1

−1/4

P = (R1 R2

−1/4

CR1

−1/4

R2

−1/4

CR1

−1/4

R2

−1/2

), Q = (R1

= ) = PQ

−1/4

CR1

−1/4

R2

)

we get � P Q �≤� QP �= −1/2

� R1

−1/2

CR2

�≤ 1

which implies since P Q is positive that PQ ≤ I or equivalently −1/4

(R1

−1/4

R2

−1/4

CR1

−1/4

R2

)≤I

from which we deduce that (Note that R1 and R2 commute) 1/2

1/2

C ≤ R1 R2

proving the claim. Now suppose it is valid for some s = µ, ν. Then we have R11−µ R2µ ≥ S11−µ S2µ + T11−µ T2µ , and

R11−ν R2ν ≥ S11−ν S2ν + T11−ν T2ν ,

Then, replacing R1 by R11−µ R2µ , R2 by R11−ν R2ν , S1 by S11−µ S2µ , S2 by S11−ν S2ν , T1 by T11−µ T2µ and T2 by T11−ν T2ν and noting that these replacements satisfy all the required conditions of the theorem, gives us 1−(µ+ν)/2

R1

(µ+ν)/2

R2

1−(µ+ν)/2

≥ S1

(µ+ν)/2

S2

1−(µ+ν)/2

+ T1

(µ+ν)/2

T2

,

ie the result also holds for s = (µ + ν)/2. Since it holds for s = 0, 1/2, 1 we deduce by induction that it holds for all dyadic rationals in [0, 1], ie, for all s of the form k/2n , k = 0, 1, ..., 2n , n = 0, 1, .... Now every real number in [0, 1] is a limit of dyadic rationals and hence by continuity of the inequality, the theorem has been proved.

Stochastic in Classical and Quantum Physics and Engineering 9.3. A Processes SELECTION OF MATRIX INEQUALITIES

137 145

An application example: Remark: Let T be a Hermitian matrix of size n × n. Let M be an n − k + 1­ dimensional subspace and let λ1 ≥ ... ≥ λn be the eigenvalues of T arranged in decreasing order. Then, supx∈M < x, T x >≥ λk and thus infdimN =n−k+1 supx∈N < x, T x >= λk Likewise, if M is a k dimensional subspace, then infx∈M < x, T x >≤ λk and hence supdimN =k infx∈N < x, T x >= λk Thus, λk (AB) = λk (BA) ≤ λk (C), k = 1, 2, ..., n where

1/2

1/2

C = R1 R2

Let A, B be two Hermitian matrices. Choose a subspace M of dimension n − k + 1 so that λk (A)↓ = supx∈M < x, Ax > Then, λk (A − B)↓ = infdimN =n−k+1 supx∈N < x, (A − B)x > ≤ supx∈M < x, (A − B)x >= supx∈M (< x, Ax > − < x, Bx >) ≤ supx∈M < x, Ax > −infx∈M < x, Bx > = λk (A)↓ − infx∈M < x, Bx >

λk (A − B)↓ ≤ λk (A)↓ − infx∈M < x, Bx > ≤ λk (A)↓ − λn (B)↓

Now let M be any n − k + 1 dimensional subspace such that infx∈M < x, Bx >= λn−k+1 (B)↓ = λk (B)↑ Then, λk (A−B)↓ ≤ supx∈M < x, (A−B)x >≤ supx∈M < x, Ax > −infx∈M < x, Bx > ≤ λk (A)↓ − λk (B)↑ This is a stronger result than the previous one.

138 Stochastic Processes in Classical and Quantum Physics and Engineering 146CHAPTER 9. SOME MORE ASPECTS OF QUANTUM INFORMATION THEORY

9.4

Matrix Holder Inequality

Now let A, B be two matrices. Consider (AB)⊗a k . its maximum singular value square is � (AB)⊗a k �2

which is the maximum eigenvalue of ((AB)∗ (AB))⊗a k . This maximum eigen­ value equals the product of the first k largest eigenvalues of (AB)∗ (AB), ie, the product of the squares of the first k largest singular values of AB. Thus, we can write � (AB)⊗a k �= Πkj=1 sj (AB)↓ On the other hand, � (AB)⊗a k �=� A⊗a k . � B ⊗a k � ≤� A⊗a k � . � B ⊗a k �

Now, in the same way, ≤� A⊗a k � equals Πkj=1 sj (A)↓ and likewise for B. Thus we obtain the fundamental inequality Πkj=1 sj (AB)↓ ≤ Πkj=1 sj (A)↓ sj (B)↓ , k = 1, 2, ..., n and hence we easily deduce that s(AB) ≤ s(A).s(B) which is actually a shorthand notation for k � j=1

sj (AB)↓ ≤

k �

sj (A)↓ .sj (B)↓ , k = 1, 2, ..., n

j=1

In particular, taking k = n gives us for p, q ≥ 1 and 1/p + 1/q = 1, T r(|AB|)) =

n � j=1

sj (AB) ≤ (



Now, T r(|A|1/p ) =

sj (A)p )1/p .(

j





sj (B)q )1/q

j

sj (A)1/p

j

and likewise for B. Therefore, we have proved the Matrix Holder inequality T r(|AB|) ≤ (T r(|A|p ))1/p .(T r(|B|q ))1/q for any two matrices A, B such that AB exists and is a square matrix. In fact, the method of our proof implies immediately that if ψ(.) is any unitarily invariant matrix norm, then ψ(AB) ≤ ψ(|A|p ).1/p .ψ(|B|q )1/q

Stochastic Processes in Classical and Quantum Physics and Engineering 9.4. MATRIX HOLDER INEQUALITY

139 147

This is seen as follows. Since ψ is a unitarily invariant matrix norm, ψ(A) = Φ(s(A)) where Φ() is a convex mapping from Rn+ into R+ that satisfies the triangle inequality and in fact all properties of a symmetric vector norm and also since ap /p + bq /q ≥ ab, a, b > 0 we get for t > 0, Φ(x.y) = φ(tx.y/t) ≤ φ(tp xp )/p + φ(y q /tq )/q ≤ tp φ(xp )/p + φ(y q )/qtq

Now utp + v/tq is minimized when

putp−1 − qv/tq+1 = 0 or equivalently, when tp+q = qv/pu Substituting this value of t immediately gives Φ(x.y) ≤ Φ(xp )1/p .Φ(y q )1/q and hence since s(A.B) ≤ s(A).s(B)

(in the sense of majorization), it follows that for two matrices A, B, ψ(A.B) = Φ(s(A.B)) ≤ Φ(s(A).s(B)) ≤ Φ(s(A)p )1/p .Φ(s(B)q )1/q = ψ(|A|p )1/p .Φ(|B|q )1/q

Note that the eigenvalues (or equivalently the singular values) of |A|p are same as s(A)p . Remark: We are assuming that if x, y are positive vectors such that x < y in the sense of majorization, then Φ(x) ≤ Φ(y) This result follows since if a < b are two positive real numbers, then a = tb + (1 − t)0 for t = a/b ∈ [0, 1] and by using the convexity of Φ and the fact that Φ(0) �k �k 0. If x < y, then i=1 xi ≤ i=1 yi , k = 1, 2, ..., n and hence writing ξk �k �k i=1 xi , ηk = i=1 yi , k = 1, 2, ..., n, we get by regarding Φ(x) as a function ξk , k = 1, 2, ..., n and noting that convexity of Φ in x implies convexity of Φ ξ and the above result that Φ(ξ) = Φ(ξ1 , ..., ξn ) ≤ Φ(η1 , ξ2 , ..., ξn ) ≤ Φ(η1 , η2 , ξ3 , ..., ξn ) ≤ ... ≤ Φ(η1 , ..., ηn ) = Φ(η)

= = of in

Chapter 10

Chapter 10

Lecture Plan for Statistical Lecture PlanSignal for Statistical Processing Signal Processing [1] Revision of basic probability theory: Probability spaces, random vari­ ables, expectation, variance, covariance, characteristic function, convergence of random variables, conditional expectation. (four lectures) [2] Revision of basic random process theory:Kolmogorov’s existence theo­ rem, stationary stochastic processes, autocorrelation and power spectral density, higher order moments and higher order spectra, Brownian motion and Poisson processes, an introduction to stochastic differential equations. (four lectures) [3] Linear estimation theory: Estimation of parameters in linear models and sequential linear models using least squares and weighted least squares methods, estimation of parameters in linear models with coloured Gaussian measurement noise using the maximum likelihood method, estimation of parameters in time series models (AR,MA,ARMA) using least squares method with forgetting fac­ tor, Estimation of parameters in time series models using the minimum mean square method when signal correlations are known. Linear estimation theory as a computation of an orthogonal projection of a Hilbert space onto a subspace, the orthogonality principle and normal equations. (four lectures) [4] Estimation of non­random parameters using the Maximum likelihood method, estimation of random parameters using the maximum aposteriori method, basic statistical hypothesis testing, the Cramer­Rao lower bound. (four lectures) [5] non­linear estimation theory: Optimal nonlinear minimum mean square estimation as a conditional expectation, applications to non­Gaussian models. (four lectures) [6] Finite order linear prediction theory: The Yule­Walker equations for esti­ mating the AR parameters. Fast order recursive prediction using the Levinson­ Durbin algorithm, reflection coefficients and lattice filters, stability in terms of reflection coefficients. (four lectures) [7] The recursive least squares algorithm for parameter estimation linear

149

141

142 Stochastic Processes in Classical and Quantum Physics and Engineering 150CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING models, simultaneous time and order recursive prediction of a time series using the RLS­Lattice filter. (four lectures) [8] The non­causal Wiener filter, causal Wiener filtering using Wiener’s spec­ tral factorization method and using Kolmogorov’s innovations method, Kol­ mogorov’s formula for the mean square infinite order prediction error. (four lectures) [9] An introduction to stochastic nonlinear filtering: The Kushner­Kallianpur filter, the Extended Kalman filter approximation in continuous and discrete time, the Kalman­Bucy filter for linear state variable stochastic models. (four lectures) [10] Power spectrum estimation using the periodogram and the windowed periodogram, mean and variance of the periodogram estimate, high resolution power spectrum estimation using eigensubspace methods like Pisarenko har­ monic decomposition, MUSIC and ESPRIT algorithms. Estimation of the power spectrum using time series models. (four lectures) [11] Some new topics: [a] Quantum detection and estimation theory, Quantum Shannon coding theory. (four lectures) [b] Quantum filtering theory (four lectures)

Total: Forty eight lectures.

[1] Large deviations for the empirical distribution of a stationary Markov process. [2] Proof of the Gartner­Ellis theorem. [3] Large deviations in the iid hypothesis testing problem. [4] Large deviations for the empirical distribution of Brownian motion pro­ cess. Let Xn , n ≥ 0 be a Markov process in discrete time. Consider MN (f ) = Eexp(

N �

f (Xn ))

n=0

Then, given X0 = x MN (f ) = (πf )N (1) where π is the transition probability operator kernel: � π(g)(x) = π(x, dy)g(y) and πf (g)(x) =



exp(f (x))π(x, dy)g(y)

ie πf (x, dy) = exp(f (x))π(x, dy)

Stochastic Processes in Classical and Quantum Physics and Engineering

143 151

If we assume that the state space E is discrete, then π(x, y) is a stochastic matrix and πf (x, y) = exp(f (x))π(x, y). Let Df denote the diagonal matrix diag[f (x) : x ∈ E]. Then πf = exp(Df )π and MN (f ) = (exp(Df )π)N (1) The Gartner­Ellis limiting logarithmic moment generating function is then Λ(f ) = limN →∞ N −1 .log((exp(Df )π)N (1)) = log(λm (f )) where λm (f ) is the maximum eigenvalue of the matrix exp(Df )π. Now let � I(µ) = supu>0 µ(x)log((πu)(x))/u(x)) x

where µ is a probability distribution on E. We wish to show that � I(µ) = supf ( f (x)µ(x) − λm (f )) x

= supf (< f, µ > −λm (f )) 152CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING

10.1 End­Semester Exam, duration:four hours, Pattern Recognition [1] [a] Consider N iid random vectors Xn , n = 1, 2, ..., N where Xn has the prob­ �K ability density k=1 π(k)N (x|µk , Σk ), ie, a Gaussian mixture density. Derive the optimal ML equations for estimating the parameters θ = {π(k), µk , σk ) : k = 1, 2, ..., K} from the measurements X = {Xn : n = 1, 2, ..., N } by maximizing ln(p(X|θ)) =

N L

n=1

ln(

K L

k=1

π(k)N (Xn |µk , Σk ))

[b] By introducing latent random vectors z(n) = {z(n, k) : k = 1, 2, ..., K}, n = 1, 2, ..., N which are iid with P (z(n, k) = 1) = π(k) such that for each n, exactly one of the z(n, k)' s equals one and the others are zero, cast the ML parameter estimation equation in the recursive EM form, namely as θm+1 = argmaxθ Q(θ, θm ) where Q(θ, θm ) =

L z

ln(p(X, Z|θ)).p(Z|X, θm )

with Z = {z(n) : n = 1, 2, ..., N }. Specifically, show that Q(θ, θm ) =

L

γnk (X, θm )ln(π(k)N (Xn |µ(k), Σk )))p(Z|X, θm )

θm+1 = argmaxθ Q(θ, θm ) where 144

L Stochastic Processes in Classical and Quantum Physics and Engineering ln(p(X, Z|θ)).p(Z|X, θm ) Q(θ, θm ) = z

with Z = {z(n) : n = 1, 2, ..., N }. Specifically, show that Q(θ, θm ) =

L n,k

γnk (X, θm )ln(π(k)N (Xn |µ(k), Σk )))p(Z|X, θm )

where =

L Z

γn,k (X, θm ) = E(z(n, k)|X, θm ) L z(n, k)p(z(n, k)|X, θm ) z(n, k)p(Z|X, θm ) = z(n,k)

Derive a formula for p(z(n, k)|X, θm ). Derive the optimal equations for θm+1 by maximizing Q(θ, θm ). [2] Let Xn , n = 1, 2, ..., N be vectors in RN . Choose K 0, there there exists a sequence {Tn } of detection operators such that P2 (n) → 0 and simultaneously limsupn n−1 .log(P1 (n)) ≤ −D(σ|ρ) + δ To prove this, we first use the monotonicity of the Renyi entropy to deduce that for any detection operator Tn , (T r(ρ⊗n (1 − Tn )))s T r(σ ⊗n (1 − Tn ))1−s ≤ T r(ρ⊗ns) σ ⊗n(1−s) ) for any s ≤ 0. Therefore, if T r(σ ⊗n Tn ) → 0 then

T r(σ ⊗n (1 − Tn )) → 1

and we get from the above on taking logarithms, limsupn−1 .log(T r(ρ⊗n (1 − Tn )) ≥ log(T r(ρs σ 1−s ))/s or equivalently writing α = limsupn−1 .log(P1 (n)) we get

α ≥ s−1 T r(ρs σ 1−s )∀s ≤ 0

Taking lims ↑ 0 in this expression gives us 162CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING α ≥ −D(σ|ρ) = −T r(σ(log(σ) − log(ρ))) This proves the first part. For the second part, we choose Tn in accordance with the optimal Neyman­Pearson test. Specifically, choose Tn = {exp(nR)ρ⊗n ≥ σ ⊗n } where the real number R will be selected appropriately.

154 Stochastic Processes in Classical and Quantum Physics and Engineering 10.3. QUESTION PAPER, SHORT TEST, STATISTICAL SIGNAL PROCESSING, M.TE

10.3 Question paper, short test, Statistical sig­ nal processing, M.Tech [1] Consider the AR(p) process x[n] = −

p �

k=1

a[k]x[n − k] + w[n]

where w[n] is white noise with autocorrelation 2 δ[n − kj E(w[n]w[n − k]] = σw

Assume that A(z) = 1 +

p �

k=1

−1 )(1 − z¯m z −1 ) a[k]z −k = Πrm=1 (1 − zm z −1 )Πr+l m=r+1 (1 − zm z

where p = r + 2l and z1 , ..., zr+l are all within the unit circle with z1 , ..., zr being real and zr+1 , ..., zr+l being complex with non­zero imaginary parts. Assume that the measured process is y[n] = x[n] + v[n] where v[.] is white, independent of w[.] and with autocorrelation σv2 δ[n]. Assume that 2 + σv2 A(z)A(z −1 ) = B(z)B(z −1 ) σw with p B(z) = KΠm=1 (1 − pm z −1 )

where p1 , ..., pp are complex numbers within the unit circle. Find the value of K 2 , σv2 , a[1], ..., a[p]. Determine the transfer function of the optimal in terms of σw non­causal and causal Wiener filters for estimating the process x[.] based on y[.] and cast the causal Wiener filter in time recursive form. Finally, determine the mean square estimation errors for both the filters. [2] Consider the state model X[n + 1] = AX[n] + bw[n + 1] and an associated measurement model z[n] = Cx[n] + v[n] Given an AR(2)process x[n] = −a[1]x[n − 1] − a[2]x[n − 2] + w[n]

Stochastic Processes in Classical and Quantum Physics and Engineering 155 164CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING and a measurement model z[n] = x[n] + v[n] with w[.] and v[.] being white and mutually uncorrelated processes with vari­ 2 and σv2 respectively. Cast this time series model in the above state ances of σw variable form with A being a 2×2 matrix, X[n] a 2×1 state vector, b a 2×1 vec­ tor and C a 1 × 2 matrix. Write down explicitly the Kalman filter equations for this model and prove that in the steady state, ie, as n → ∞, the Kalman filter converges to the optimal causal Wiener filter. For proving this, you may assume 2 an appropriate factorization of σv2 A(z)A(z −1 ) + σw into a minimum phase and a non­minimum phase part where A(z) = 1 + a[1]z −1 + a[2]z −2 is assumed to be minimum phase. Note that the steady state Kalman filter is obtained by setting the state estimation error conditional covariance matrices for prediction and filtering, namely P[n + 1|n], P[n|n] as well as the Kalman gain K[n] equal to constant matrices and thus express the Kalman filter in steady state form as ˆ + 1|n + 1] = DX[n|n] ˆ X[n + f z[n + 1] where D, f are functions of the steady state values of P[n|n] and P[n + 1|n] which satisfy the steady state Riccati equation. [3] Let Xk , k = 1, 2, ..., N and θ be random vectors defined on the same probability space with the conditional distribution of X1 , ..., XN given θ being that of iid N (θ, R1 ) random vectors. Assume further that the distribution of θ is N (µ, R2 ). Calculate (a) p(X1 , ..., XN |θ),(b) p(θ|X1 , ..., XN ), (3) the ML estimate of θ given X1 , ..., XN and (4) the MAP estimate of θ given X1 , ..., XN . Also evaluate Cov(θˆM L − θ), Cov(θM AP − θ) Which one has a larger trace ? Explain why this must be so. [4] Let X1 , ..., XN be iid N (µ, R). Evaluate the Fisher information matrix elements ∂2 −E[ ln(p(X1 , ..., XN |µ, R)] ∂µa ∂µb −E[ −E[

∂2 ln(p(X1 , ..., XN |µ, R)] ∂µa ∂Rbc

∂2 ln(p(X1 , ..., XN |µ, R)] ∂Rab ∂Rcd

Taking the special case when the X�k s are 2­dimensional random vectors, evalu­ ˆ ab ), a, b = 1, 2 ate the Cramer­Rao lower bound for V ar(ˆ µa ), a = 1, 2 and V ar(R ˆ where µ ˆ and R are respectively unbiased estimates of µ and R.

156 Stochastic Processes in Classical and Quantum Physics and Engineering 10.3. QUESTION PAPER, SHORT TEST, STATISTICAL SIGNAL PROCESSING, M.TE [5] Let ρ(θ) denote a mixed quantum state in CN dependent upon a classical parameter vector θ = (θ1 , ...θp )T . Let X, Y be two observables with eigende­ compositions X=

N �

a=1

|ea > x(a) < ea |, Y =

N �

a=1

|fa > y(a) < fa |, < ea |eb >=< fa |fb >= δab

First measure X and note its outcome x(a). Then allowing for state collapse, allow the state of the system to evolve from this collapsed state ρc to the new state m m � � Ek∗ Ek = I Ek ρc Ek∗ , T (ρc ) = k=1

k=1

Then measure Y in this evolved state and note the outcome y(b). Calculate the joint probability distribution P (x(a), y(b)|θ). Explain by linearizing about some θ0 , how you will calculate the maximum likelihood estimate of θ as a function of the measurements x(a), y(b). Also explain how you will calculate the CRLB for θ. Finally, explain how you will select the observables X, Y to be measured in such a way that the CRLB is minimum when θ is a scalar parameter. [6] For calculating time recursive updates of parameters in a linear AR model, we require to apply the matrix inversion lemma to a matrix of the form (λ.R + bbT )−1 while for calculating order recursive updates, we require to use the projection operator update formula in the form Pspan(,ξ) = PW + PP⊥ Wξ and also invert a block structured matrices of the form � � A b bT c and



c b

bT A



in terms of A−1 . Explain these statements in the light of the recurisive least squares lattice algorithm for simultaneous time and order recursive updates of the prediction filter parameters.

Stochastic Processes in Classical and Quantum Physics and Engineering 157 166CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING

10.4 Question paper on statistical signal pro­ cessing, long test 10.4.1 Atom excited by a quantum field [1] Let the state at time t = 0 of an atom be ρA (0) and that of the field be the coherent state |φ(u) >< φ(u)| so that the state of both the atom and the field at time t = 0 is ρ(0) = ρA (0) ⊗ |φ(u) >< φ(u)| Let the Hamiltonian of the atom and field in interaction be � H(t) = HA + ω(k)a(k)∗ a(k) + δ.V (t) k

� where HA acts in the atom Hilbert space, HF = k ω(k)a(k)∗ a(k) acts in the field Boson Fock space and the interaction Hamiltonian V (t) acts in the tensor product of both the spaces. Assume that V (t) has the form � V (t) = (Xk (t) ⊗ ak + Xk (t)∗ ⊗ a(k)∗ ) k

where Xk (t) acts in the atom Hilbert space. Note that writing exp(itHA )Xk (t)exp(−itHA ) = Yk (t), exp(itHF )ak .exp(−itHF ) = ak .exp(−iω(k)t) = ak (t) we have on defining the unperturbed (ie, in the absence of interaction) system plus field Hamiltonian � ω(k)a(k)∗ a(k) H0 = HA + HF = HA + k

that W (t) = exp(itH0 )V (t).exp(−itH0 ) =

� k

(Yk (t) ⊗ ak (t) + Yk (t)∗ ⊗ ak (t)∗ )

show that upto O(δ 2 ) terms, the total system ⊗ field evolution operator U (t) can be expressed as U (t) = U0 (t).S(t) where U0 (t) = exp(−itH0 ) = exp(−itHA ).exp(−itHF ) is the unperturbed evolution operator and � t � 2 S(t) = 1 − iδ W (s)ds − δ 0

0= u(k), ¯(k), T r(a(k)∗ |φ(u) >< φ(u)|) =< φ(u)|a(k)∗ |φ(u) >= u T r(a(k)a(m)|φ(u) >< φ(u)|) =< φ(u)|a(k)a(m)|φ(u) >= u(k)u(m), T r(a(k)a(m)∗ |φ(u) >< φ(u)|) =< φ(u)|a(k)a(m)∗ |φ(u) >

=< φ(u)|[a(k), a(m)∗ ]|φ(u) > + < φ(u)|a(m)∗ a(k)|φ(u) > = δ(k, m) + u(k)¯ u(m) and

T r(a(k)∗ a(m)|φ(u) >< φ(u)|) =< φ(u)|a(k)∗ a(m)|φ(u) >= =u ¯(k)u(m)

Remark: More generally, we can expand any general initial state of the system (atom) and bath (field) as � ρ(0) = FA (u, u ¯) ⊗ |φ(u) >< φ(u)|dudu ¯ =



FA (u, u ¯)exp(−|u|2 ) ⊗ |e(u) >< e(u)|du.du ¯

In this case also, we can calculate assuming that ρ(0) is the initial state, final state at time T using the formula ρ(T ) = U0 (T )(1 − iδS1 (T ) − δ 2 S2 (T ))ρ(0).(1 + iδS1 (T ) − δ 2 S2 (T )∗ )U0 (T )∗ by evaluating this using the fact that if Y1 , Y2 are system operators, then � (Y1 ⊗ ak )( F (u, u ¯) ⊗ |φ(u) >< φ(u)|dudu ¯)(Y2 ⊗ am ) =



Y1 F (u, u ¯)Y2 exp(−|u|2 ) ⊗ ak |e(u) >< e(u)|am dudu ¯ − − − (1)

Stochastic Processes in Classical and Quantum Physics and Engineering 159 168CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING Now ak |e(u) >= u(k)|e(u) >

a∗k |e(u)

>= ∂|e(u) > /∂u(k)

so (1) evaluates to after using integration by parts, � ¯)Y2 u(k)exp(−|u|2 ) ⊗ (∂/∂ u ¯(m))(|e(u) >< e(u)|)du.du ¯ Y1 F1 (u, u =



¯)Y2 ).exp(−|u|2 u(k))) ⊗ |e(u) >< e(u)|dudu ¯ (−∂/∂ u ¯(m))(Y1 F (u, u � = (−∂/∂ u ¯(m) + u(m))(Y1 F (u, u ¯)Y2 u(k))|φ(u) >< φ(u)|du.du ¯

Likewise,

(Y1 ⊗ a∗k )( = =

=









F (u, u ¯) ⊗ |φ(u) >< φ(u)|dudu ¯)(Y2 ⊗ am )

Y1 F (u, u ¯)Y2 ⊗ a∗k |φ(u) >< φ(u)|am dudu ¯

Y1 F (u, u ¯)Y2 exp(−|u|2 ) ⊗ (∂/∂u(k)).(∂/∂ u ¯(m))(|e(u) >< e(u)|)du.du ¯ � = (∂ 2 /∂u(k)∂u ¯(m))(Y1 F (u, u ¯)Y2 .exp(−|u|2 )) ⊗ |e(u) >< e(u)|du.du ¯

(exp(|u|2 )(∂ 2 /∂u(k)∂u ¯(m))(Y1 F (u, u ¯)Y2 .exp(−|u|2 )))⊗|φ(u) >< φ(u)|du.du ¯ � = (∂/∂u(k) + u ¯(k))(∂/∂ u ¯(m) − u(m))(Y1 F (u, u ¯)Y2 ) ⊗ |φ(u) < φ(u)|dudu ¯

and likewise terms like

a∗k )(



F (u, u ¯) ⊗ |φ(u) >< φ(u)|dudu ¯)(Y2 ⊗ a∗m )

(Y1 ⊗ ak )(



F (u, u ¯) ⊗ |φ(u) >< φ(u)|dudu ¯)(Y2 ⊗ a∗m )

(Y1 ⊗ and

We leave it as an exercise to the reader to evaluate these terms and hence derive upto the second order Dyson series approximation for the evolution operator the approximate final state of the atom and field and then by partial tracing over the field variables, to derive the approximate final atomic state under this interaction of atom and field (ie, system and bath). [2] If the state of the system is given by ρ = ρ0 + δ.ρ1 , then evaluate the Von­Neumann entropy of ρ as a power series in δ. Hint:

160 Stochastic Processes in Classical and Quantum Physics and Engineering 10.4. QUESTION PAPER ON STATISTICAL SIGNAL PROCESSING, LONG TEST169

10.4.2 Estimating the power spectral density of a station­ ary Gaussian process using an AR model:Performance analysis of the algorithm The process x[n] has mean zero and autocorrelation R(τ ). It’s pth order predic­ tion error process is given by e[n] = x[n] +

p �

k=1

a[k]x[n − k]

where the a[k]� s are chosen to minimize Ee[n]2 . The optimal normal equations are E(e[n]x[n − m]) = 0, m = 1, 2, ..., p and these yield the Yule­Walker equations p �

k=1

R[m − k]a[k] = −R[m], m = 1, 2, ..., p

or equivalently in matrix notation, a = −R−1 p rp where

Rp = ((R[m − k]))1≤m,k≤p , rp = ((R[m]))pm=1

The minimum mean square prediction error is then σ 2 = E(e[n]2 = E(e[n]x[n]) = R[0] +

p �

a[k]R[k] = R[0] + aT rp

k=1

= R[0] − rTp R−1 p rp If the order of the AR model p is large, then e[n] is nearly a white process, in fact in the limit p → ∞, e[n] becomes white, ie, E(e[n]x[n − m]) becomes zero for all m ≥ 1 and hence so does E(e[n]e[n − m]). The ideal known statistics AR spectral estimate is given by SAR,p (ω) = S(ω) =

σ2 , |A(ω)|2

A(ω) = 1 + aT e(ω), e(ω) = ((exp(−jωm)))pm=1 Since in practice, we do not have the exact correlations available with us, we estimate the AR parameters using the least squares method, ie, by minimizing N �

n=1

(x[n] + aT xn )2

Stochastic Processes Classical and Quantum and Engineering 161 170CHAPTER 10.in LECTURE PLAN FORPhysics STATISTICAL SIGNAL PROCESSING where

xn = ((x(n − m)))pm=1

The finite data estimate of Rp is ˆ p = N −1 R

N �

xn nT

n=1

and that of rp is rˆp = N

−1

N �

x[n]xn

n=1

ˆ = R+δR for R ˆ p . Likewise, we write r for rp and r +δr We write R for Rp and R ˆ and rˆ are unbiased estimates of R for rˆp for notational convenience. Note that R and r respectively and hence δR and δr have zero mean. Now the least squares estimate a ˆ = a + δa of a is given by ˆ −1 rˆ = −(R + δR)−1 (r + δr) a ˆ = a + δa = −R = −(R−1 − R−1 δRR−1 + R−1 δR.R−1 δR.R−1 )(r + δr)

= a − R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a upto quadratic orders in the data correlation estimate errors. Thus, we can write δa = −R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a Likewise, the estimate of the prediction error energy σ 2 is given by ˆ [0] + a σ ˆ 2 = σ 2 + δσ 2 = R ˆT rˆ and hence δσ 2 = δR[0] + aT δr + δaT r + δaT δr = δR[0] + aT δr + (−R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δr)T δr +(−R−1 δr − R−1 δRa)T δr

again upto quadratic orders in the data correlation estimate errors. In terms of the fourth moments of the process x[n], we can now evaluate the bias and mean square fluctuations of a ˆ and σ ˆ 2 w.r.t their known statistics values a, σ 2 respectively. First we observe that the bias of a ˆ is E(δa) = E(−R−1 δr − R−1 δRa + R−1 δR.R−1 δr + R−1 δR.R−1 δR.a) = E(R−1 δR.R−1 δr) + E(R−1 δR.R−1 δR.a) Specifically, for the ith component of this, we find that with P = R−1 , and using Einstein’s summation convention over repeated indices, E(δai ) =

162 Stochastic Processes in Classical and Quantum Physics and Engineering 10.4. QUESTION PAPER ON STATISTICAL SIGNAL PROCESSING, LONG TEST171 E(Pij δRjk Pkm δrm ) + E(Pij δRjk Pkm δRml al ) = Pij Pkm E(δRjk δrm ) + Pij Pkm al E(δRjk δRml ) Now, N �

E(δRjk .δrm ) = N −2

n,l=1

= N −2

N



E[(x[n − k]x[n − j] − R[j − k])(x[l]x[l − m] − R[m])]

(E(x[n − k]x[n − j]x[l]x[l − m]) − R[j − k]R[m])

n,l=1

upto fourth degree terms in the data. Note that here we have not yet made any assumption that x[n] is a Gaussian process, although we are assuming that the process is stationary at least upto fourth order. Writing E(x[n1 ]x[n2 ]x[n3 ]x[n4 ]) = C(n2 − n1 , n3 − n1 , n4 − n1 ) we get E(δRjk .δrm ) = N −2

N �

n,l=1

10.4.3

C(k − j, l − n + k, l − n + k − m)

Statistical performance analysis of the MUSIC and ESPRIT algorithms

Let ˆ = R + δR R be an estimate of the signal correlation matrix R. Let v1 , ..., vp denote the signal eigenvectors and vp+1 , ..., vN the noise eigenvectors. The signal subspace eigen­matrix is VS = [v1 , ..., vp ] ∈ CN ×p and the noise subspace eigen­matrix is VN = [vp+1 , ..., vN ] so that V = [VS |VN ] is an N × N unitary matrix. The signal correlation matrix model is R = EDE ∗ + σ 2 I

Stochastic Processes in Classical and Quantum Physics and Engineering 163 172CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING σ 2 is estimated as σ ˆ 2 which is the average of the N − p smallest eigenvalues of R. In this expression, E = [e(ω1 ), ..., e(ωp )] is the signal manifold matrix where e(ω) = [1, exp(jω), ..., exp(j(N − 1)ω)]T is the steering vector along the frequency ω. Note that since N > p, E has full column rank p. The signal eigenvalues of R are λi = µi + σ 2 , i = 1, 2, ..., p while the noise eigenvalues are λi = σ 2 , i = p + 1, ..., N Thus, Rvi = λi vi , i = 1, 2, ..., N The known statistics exact MUSIC spectral estimator is P (ω) = (1 − N −1 � VS∗ e(ω) �2 )−1 which is infinite when and only when ω ∈ {ω1 , ..., ωp }. Alternately, the zeros of the function Q(ω) = 1/P (ω) = 1 − N −1 � VS∗ e(ω) �2

are precisely {ω1 , ..., ωp } because, the VN∗ e(ω) = 0 iff e(ω) belongs to the signal subspace span{e(ωi ) : 1 ≤ i ≤ p} iff {e(ω), e(ωi ), i = 1, 2, ..., p} is a linearly dependent set iff ω ∈ {ω1 , ..., ωp }, the last statement being a consequence of the fact that N ≥ p + 1 and the Van­Der­Monde determinant theorem. Now, owing to finite data effects, vi gets perturbed to vi + δvi for each i = 1, 2, ..., p so that correspondingly, VS gets perturbed to VˆS = VS + δVS . Then Q gets perturbed to (upto first orders of smallness, ie, upto O(δR) Q(ω) + δQ(ω) = 1 − N −1 � (VS + δVS )∗ e(ω) �2 so that

δQ(ω) = (−2/N )Re(e(ω)∗ δVS VS∗ e(ω))

Let ω be one of the zeros of Q(ω), ie, one of the frequencies of the signal. Then, owing to finite data effects, it gets perturbed to ω + δω where Q� (ω)δω + δQ(ω) = 0 ie,

δω = −δQ(ω)/Q� (ω)

and Q� (ω) = (−2/N )Re(e(ω)∗ PS e� (ω)) = (−2/N )Re < e(ω)|PS |e� (ω) >

16410.4. QUESTION Stochastic Processes in Classical andSIGNAL QuantumPROCESSING, Physics and Engineering PAPER ON STATISTICAL LONG TEST173 where

PS = VS VS∗

is the orthogonal projection onto the signal subspace. Note that by first order perturbation theory |δvi >=

� |vj >< vj |δR|vi >  , i = 1, 2, ..., N λ i − λj j=i

and δλi =< vi |δR|vi > So δVS = column(

� 

j=i

|vj >< vj |δR|vi > /(λi − λj ) : i = 1, 2, ..., p]

= [Q1 δR|v1 >, ..., Qp δR|vp >] where Qi =

�  j=i

|vj >< vj |/(λi − λj )

We can express this as δVS = [Q1 , ..., Qp ](Ip ⊗ δR.VS ) and hence

δQ(ω) = (−2/N )Re(< e(ω)|δVS VS∗ |e(ω)) >) = (−2/N )Re(< e(ω)S(Ip ⊗ δR).VS VS∗ |e(ω) >) = (−2/N )Re(< e(ω)|S(Ip ⊗ δR)PS |e(ω) >)

where as before PS = VS VS∗ is the orthogonal projection onto the signal subspace and S = [Q1 , Q2 , ..., Qp ] The evaluation of E((δω)2 ) can therefore be easily done once we evaluate E(δR(i, j)δR(k, m)) and this is what we now propose to do.

Stochastic Processes Classical and Quantum and Engineering 165 174CHAPTER 10.inLECTURE PLAN FORPhysics STATISTICAL SIGNAL PROCESSING Statistical performance analysis of the ESPRIT algorithm. Here, we have two correlation matrices R, R1 having the structures R = EDE ∗ + σ 2 I, R1 = EDΦ∗ E ∗ + σ 2 Z where E has full column rank, D is a square, non­singular positive definite matrix and Φ is a diagonal unitary matrix. The structure of Z is known. We have to estimate the diagonal entries of Φ which give us the signal frequencies. Let as before VS denote the matrix formed from the signal eigenvectors of R and VN that formed from the noise eigenvectors of R. V = [VS |VN ] is unitary. ˆ = R + δR and R1 by When we form finite data estimates, R is replaced by R 2 2 2 2 ˆ ˆ = σ + δσ obtained by averaging the R1 = R1 + δR1 . The estimate of σ is σ ˆ The estimate of RS = EDE ∗ is then N − p smallest eigenvalues of R. ˆ−σ ˆ S = RS + δRS = R ˆ 2 I = R − σ 2 I + δR − δσ 2 R Thus, δRS = δR − δσ 2 I

Likewise, the estimate of RS1 = EDΦ∗ E ∗ is

ˆ1 − σ ˆ S1 = RS1 + δRS1 = R ˆ 2 Z = R1 − σ 2 Z + δR1 − δσ 2 Z R so that δRS1 = δR1 − δσ 2 Z

The rank reducing numbers of the matrix pencil

RS − γRS1 = ED(I − γΦ∗ )E ∗ are clearly the diagonal entries of Φ. Since R(E) = R(VS ), it follows that these rank reducing numbers are the zeroes of the polynomial equation det(VS∗ (RS − γRS1 )VS ) = 0 or equivalently of where

det(M − γVS∗ RS1 VS ) = 0 M = VS∗ RS VS = diag[µ1 , ..., µp ]

with µk = λk − σ 2 , kj = 1, 2, ..., p

being the nonzero eigenvalues of RS . When finite data effects are taken into consideration, these become the rank reducing numbers of det(M + δM − γˆ (VS + δVS )∗ (RS1 + δRS1 )(VS + δVS )) = 0 or equivalently, upto first order perturbation theory, the rank reducing number γ gets perturbed to γˆ = γ + δγ, where det(M − γX + δM − γ.δX − δγX) = 0

166 Stochastic Processes in ClassicalFOR and Quantum PhysicsEEG and Engineering 10.5. QUANTUM NEURAL NETWORKS ESTIMATING PDF FROM SPEECH where X = VS∗ RS1 VS , δX = δVS∗ RS1 VS + VS∗ RS1 δVS + VS∗ δRS1 VS Let ξ be a right eigenvector of M − γX and η ∗ a left eigenvector of the same corresponding to the zero eigenvalue. Note that M, X, δM, δX are all Hermitian but M − γX is not because γ = exp(jωi ) is generally not real. Then, let ξ + δξ be a right eigenvector of M − γX + δM − γδX − δγX corresponding to the zero eigenvalue. In terms of first order perturbation theory, 0 = (M − γX + δM − γδX − δγX)(ξ + δξ) = (M − γX)δξ + (δM − γδX)ξ − δγXξ

Premultiplying by η ∗ gives us

δγη ∗ Xξ = η ∗ (δM − γδX)ξ or equivalently, δγ =

< η|δM − γδX�ξ > < η|X|ξ >

which is the desired formula.

10.5 Quantum neural networks for estimating EEG pdf from speech pdf based on Schrodinger’s equation Quantum neural networks for estimating the marginal pdf of one signal from the marginal pdf of another signal. The idea is that the essential characteristics of either speech or EEG signals are contained in their empirical pdf’s, ie, the proportion of time that the signal amplitude spends within any Borel subset of the real line. This hypothesis leads one to believe that by estimating the pdf of the EEG data from that of speech data alone, we can firstly do away with the elaborate apparatus required for measuring EEG signals and secondly we can extract from the EEG pdf, the essential characteristics of the EEG sig­ nal parameters which determine the condition of a diseased brain. It is a fact that since the Hamiltonian operator H(t) in the time dependent Schrodinger equation is Hermitian, the resulting evolution operator U (t, s), t > s is unitary causing thereby the wave function modulus square to integrate to unity at any time t if at time t = 0 it does integrate to unity. Thus, the Schrodinger equation

Stochastic Processes in Classical and Quantum Physics and Engineering 167 176CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING naturally generates a family of pdf’s. Further, based on the assumption that the speech pdf and the EEG pdf for any patient are correlated in the same way (a particular kind of brain disease causes a certain distortion in the EEG signal and a correlated distortion in the speech process with the correlation being patient independent), we include a potential term in Schrodinger’s equation involving speech pdf and hence while tracking a given EEG pdf owing to the adaptation algorithm of the QNN, the QNN also ensures that the output pdf coming from Schrodinger’s equation will also maintain a certain degree of correlation with the speech pdf. The basic idea behind the quantum neural network is to mod­ ulate the potential by a ”weight process” with the weight increasing whenever the desired pdf exceeds the pdf generated by Schrodinger’s equation using the modulus of the wave function squared. A hieuristic proof of why the pdf gener­ ated by the Schrodinger equation should increase when the weight process and hence th modulated potential increases is provided at the end of the paper. First consider the following model for estimating the pdf p(t, x) of one signal via Schrodinger’s wave equation which naturally generates a continuous family of pdf’s via the modulus square of the wave function: ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + W (t, x)V (x)ψ(t, x) The weight process W (t, x) is generated in such a way that |ψ(t, x)|2 tracks the given pdf p(t, x). This adaptation scheme is ∂t W (t, x) = −βW (t, x) + α(p(t, x) − |ψ(t, x)|2 ) where α, β are positive constants. This latter equation guarantees that if W (t, x) is very large and positive (negative), then the term −βW (t, x) is very large and negative (positive) causing W to decrease (increase). In other words, −β.W is a threshold term that prevents the weight process from getting too large in magnitude which could cause the break­down of the quantum neural network. We are assuming that V (x) is a non­negative potential. Thus, increase in W will cause an increase in W V which as we shall see by an argument given below, will cause a corresponding increase in |ψ|2 and likewise a decrease in W will cause a corresponding decrease in |ψ|2 . Now we add another potential term p0 (t, x)V0 (x) to the above Schrodinger dynamics where p0 is the pdf of the input signal. This term is added because we have the understanding that the output pdf p is in some way correlated with the input pdf. Then the dynamics becomes ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + W (t, x)V (x)ψ(t, x) + p0 (t, x)V0 (x)ψ(t, x) Statistical performance analysis of the QNN: Suppose p0 (t, x) fluctuates to p0 (t, x) + δp0 (t, x) and p(t, x) to p(t, x) + δp(t, x). Then the corresponding QNN output wave function ψ(t, x) will fluctuate to ψ(t, x)+δψ(t, x) where δψ satisfies the linearized Schrodinger equation and likewise the weight process W (t, x) will

168 Stochastic Processes in Classical and Quantum Physics and Engineering 10.5. QUANTUM NEURAL NETWORKS FOR ESTIMATING EEG PDF FROM SPEECH fluctuate to W (t, x) + δW (t, x). The linearized dynamics for these fluctuations are ih∂t δψ(t, x) = (−h2 /2)∂x2 δψ(t, x) + V (x)(W δψ + ψδW ) + V0 (x)(p0 δψ + ψδp0 ) ¯ ∂t δW = −βδW + α(δp − ψ.δψ − ψ.δ ψ¯) Writing ψ(t, x) = A(t, x).exp(iS(t, x)/h) we get δψ = (δA + iAδS/h)exp(iS/h) so that with prime denoting spatial derivative, δψ " = (δA" + iA" δS/h + iAδS " /h + iS " δA/h − AS " δS/h2 )exp(iS/h) δψ "" = (δA"" +iA"" δS/h+2iA" δS " /h+iAδS "" /h+2iS " δA" /h+iS "" δA/h−AS " δS " /h2 d

d

−A" S " δS/h2 − AS "" δS/h2 − AS " δS " /h2 − S 2 δA/h2 − iAS 2 δS/h3 ).exp(iS/h) Thus we obtain the following differential equation for the wave function fluctu­ ations caused by random fluctuations in the EEG and speech pdf’s: ih∂t δA − ∂t (AδS) − δA.∂t S − AδS∂t S/h = (−h2 /2)δA"" − ihA"" δS/2 − ihA" δS " − ihA.δS "" /2 − ihS " δA" d

d

−ihS "" δA/2+AS " δS " /2+A" S " δS/2+AS "" δS/2+AS " δS " /2+S 2 δA/2+iAS 2 δS/2h +V0 (x)p0 (t, x)(δA + iAδS/h) + V0 (x)Aδp0 (t, x) +V (x)W (t, x)(δA + iAδS/h) + V (x)AδW ) On equating real and imaginary parts in this equation, we get two pde’s for the two functions A and S. However, it is easier to derive these equations by first deriving the unperturbed pde’s for A, S and then forming the perturbations. The unperturbed equations are (See Appendix A.1) h∂t A = −h∂x A.∂x S − (1/2)A∂x2 S −A∂t S = −((h2 /2)∂x2 A − (1/2)A(∂x S)2 ) + W V A + p0 V0 A ∂t W (t, x) = −β(W (t, x) − W0 (x)) + α(p(t, x) − A2 )

Forming the variation of this equation around a fixed operating point p0 (x), p(x) = |ψ(t, x)|2 and W (x) with ψ(t, x) = A(x).exp(−iEt + iS0 (x)) gives us h∂t δA = −hδA" .S0" − hA" δS " − (1/2)S0"" δA − (1/2)AδS "" , d

−A∂t δS + EδA = (−h2 /2)δA"" − (1/2)S02 δA − AS0" δS ""

+V W δA + V AδW + V0 Aδp0 + V0 p0 δA,

∂t δW = −βδW + α.δp − 2αAδA

178CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING

Stochastic Processes in Classical and Quantum Physics and Engineering

169

Appendix A.1

Proof that increase in potential causes increase in probability density deter­

mined as modulus square of the wave function. Consider Schrodinger’s equation in the form ih∂t ψ(t, x) = (−h2 /2)∂x2 ψ(t, x) + V (x)ψ(t, x) We write ψ(t, x) = A(t, x).exp(iS(t, x)/h) where A, S are real functions with A non­negative. Thus, |ψ|2 = A2 . Substitut­ ing this expression into Schrodinger’s equation and equating real and imaginary parts gives h∂t A = −h∂x A.∂x S − (1/2)A∂x2 S) −A∂t S = −((h2 /2)∂x2 A − (1/2)A(∂x S)2 ) + V A

We write S(t, x) = −Et + S0 (x) so that E is identified with the total energy of the quantum particle. Then, the second equation gives on neglecting O(h2 ) terms, � (∂x S0 ) = ± 2(E − V (x))

and then substituting this into the first equation with ∂t A = 0 gives S0�� /S0� = −2hA� /A or so

ln(S0� ) = −2h.ln(A) + C A = K1 (S0� )−1/2 = K2 (E − V (x))−1/4

which shows that as long as V (x) < E, A(x) and hence A(x)2 = |ψ(x)|2 will increase with increasing V (x) confirming the fact that by causing the driving potential to increase when pE − |ψ|2 is large positive, we can ensure that |ψ|2 will increase and viceversa thereby confirming the a validity of our adaptive algorithm for tracking the EEG pdf. A.2 Calculating the empirical distribution of a stationary stochastic process. Let x(t) be a stationary stochastic process. The empirical density estimate of N samples of this process spaced τk , k = 1, 2, ..., n − 1 seconds apart from the first sample, ie, an estimate of the joint density of x(t), x(t+τk ), k = 1, 2, ..., N −1 is given bylog( � T −1 ΠN pˆT (x1 , ..., xN ; τ1 , ..., τN −1 ) = T k=1 δ(x(t + τk ) − xk )dt 0

170 Stochastic Processes in Classical and Quantum Physics and Engineering 10.5. QUANTUM NEURAL NETWORKS FOR ESTIMATING EEG PDF FROM SPEECH where τ0 = 0. The Gartner­Ellis theorem in the theory of large deviations pro­ vides a reliability estimate of how this empirical density performs. Specifically, if f is a function of N variables, then the moment generating functional of pˆT (., τ1 , ..., τN −1 ) is given by � E[exp( pˆT (x1 , ..., xN ; τ1 , ..., τN −1 )f (x1 , ..., xN )dx1 ...dxN )] = E[exp(T −1



T 0

f (x(t), x(t + τ1 ), ..., x(t + τN −1 ))dt)] = MT (f |τ1 , ..., τN −1 )

say. Then, the rate functional for the family of random fields pˆT (., τ1 , ...τN −1 ), T → ∞ is given by � I(p) = supf ( f (x1 , ..., xN )p(x1 , ..., xN )dx1 ...dxN − Λ(f )) where Λ(f ) = limT →∞ T −1 log(MT (T.f )) assuming that this limit exists. It follows that as T → ∞, the asymptotic probability of pˆT taking values in a set E is given by T −1 log(P r(ˆ pT (., τ1 , ..., τN −1 ) ∈ E) ≈ −inf (I(p) : p ∈ E) or equivalently, P r(ˆ pT (., τ1 , ..., τN −1 ) = p) ≈ exp(−T I(p)), T → ∞

Appendix A.3 We can also use the electroweak theory or more specially, the Dirac­Yang­ Mills non­Abelian gauge theory for matter and gauge fields to track a given pdf. Such an algorithm enables us to make use of elementary particles in a particle accelerator to simulate probability densities for learning about the statistics of signals. Let Aaµ (x)τa be a non­Abelian gauge field assuming values in a Lie algebra g which is a Lie sub­algebra of u(N ). Here, τa , a = 1, 2, ..., N are Hermitian generators of g. Denote their structure constants by {C(abc)}, ie, [τa , τb ] = iC(abc)τc Then the gauge covariant derivative is �µ = ∂µ + ieAaµ τa

Stochastic Processes in Classical and Quantum Physics and Engineering 171 180CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING and the corresponding Yang­Mills antisymmetric gauge field tensor, also called the curvature of the gauge covariant derivative is given by ieFµν = [�mu , �ν ] = a = ie(Aν,µ − Aaµ,ν ) − eC(abc)Abµ Acν )τa

so that a τa Fµν = Fµν

where a a = Aν,µ − Aaµ,ν ) − eC(abc)Abµ Acν Fµν

The action functional for the matter field ψ(x) having 4N components and the gauge potentials Aaµ (x) is then given by (S.Weinberg, The quantum theory of fields, vol.II) � S[A, ψ] =

Ld4 x,

¯ µ (i∂µ + eAa τa ) − m)ψ − (1/4)F a F aµν L = ψ(γ µν µ

where

ψ¯ = ψ ∗ γ 0

The equations of motion obtained by setting the variation of S w.r.t the matter wave function ψ and the gauge potentials A are given by [γ µ (i∂µ + eAaµ τa ) − m]ψ = 0, Dν F µν = J µ where Dµ is the gauge covariant derivative in the adjoint representation, ie, [∂ν + ieAν , F µν ] = J µ or equivalently in terms of components, ∂ν F aµν − eC(abc)Abν F cµν = J aµ with J aµ being the Dirac­Yang­Mills current: ¯ µ ⊗ τa )ψ J aµ = −eψ(γ it is clear that ψ satisfies a Dirac­Yang­mills matter wave equation and it can be expressed in the form of Dirac’s equation with a self­adjoint Hamiltonian owing to which J aµ is a conserved current and in particular, the |ψ|2 integrates over space to give a constant at any time which can be set to unity by the initial condition. In other words, |ψ(x)|2 is a probability density at any time t where x = (t, r). Now it is easy to make the potential in this wave equation adaptive, ie, depend on the error p(x) − |ψ(x)|2 . Simply add an interaction term W (x)(p(x) − |ψ(x)|2 )V (x) to to the Hamiltonian. This is more easily accomplished using the electroweak theory where there is in addition to the

172 Stochastic Processes in Classical and Quantum Physics and Engineering 10.5. QUANTUM NEURAL NETWORKS FOR ESTIMATING EEG PDF FROM SPEEC Yang­Mills gauge potential terms a photon four potential Bµ with B0 (x) = W (x)(p(x) − |ψ(x)|2 )V (x). This results in the following wave equation [γ µ (i∂µ + eAaµ τa + eBµ ) − m]ψ = 0 The problem with this formalism is that a photon Lagrangian term (−1/4)(Bν,µ − Bµ,ν )(B ν,µ − B µ,ν ) also has to be added to the total Lagrangian which results in the Maxwell field equation ¯ µψ F,νphµν = −eψγ where ph Fµν = Bν,µ − Bµ,ν

is the electromagnetic field. In order not to violate these Maxwell equations, we must add another external charge source term to the rhs so that the Maxwell equation become ¯ µ ψ + ρ(x)δ µ F,νphµν = −eψγ 0 ρ(x) is an external charge source term that is given by −eψ ∗ ψ + ρ(x) = −�2 (W (x)(p(x) − |ψ(x)|2 )V (x)) = −�2 B0 so that B0 = W (p − |ψ|2 )V

The change in the Lagrangian produced by this source term is −ρ(x)B0 (x). This control charge term can be adjusted for appropriate adaptation. Thus in effect our Lagrangian acquires two new terms, the first is ¯ 0 B0 ψ = eψ ∗ ψ.B0 eψγ and the second is −ρ.B0 where

ρ = −�2 (W (p − ψ ∗ ψ)) + eψ ∗ ψ

The dynamics of the wave function ψ then acquires two new terms, the first ¯ 0 ψ.B0 , this term is coming the variation w.r.t ψ¯ of the term eψ ∗ ψ.B0 = eψγ 0 eB0 γ ψ, the second is the term coming from the variation w.r.t ψ¯ of the term ¯ 0 ψ)). This variation gives −ρ.B0 = B0 �2 (W (p − ψγ (�2 B0 )(W γ 0 ψ) The resulting equations of motion for the wave function get modified to [γ µ (i∂µ + eAaµ τa + eBµ ) − m + �2 B0 .W γ 0 ]ψ = 0 This equation is to be supplemented with the Maxwell and Yang­Mills gauge field equations and of course the weight update equation for W (x).

Stochastic Processes in Classical and Quantum Physics and Engineering 173 182CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING

10.6

Statistical Signal Processing,

[1] [a] Prove that the quantum relative entropy is convex, ie, if (ρi , σi ), i = 1, 2 are pairs of mixed states in Cd , then D(p1 ρ1 + p2 ρ2 |p1 σ1 + p2 σ2 ) ≤ p1 D(ρ1 |σ1 ) + p2 D(ρ2 |σ2 ) for p1 , p2 ≥ 0, p1 + p2 = 1. hint: First prove using the following steps that if K is a quantum operation, ie, TPCP map, then for two states ρ, σ, we have D(K(ρ)|K(σ)) ≤ D(ρ|σ) − − − (a) Then define ρ= σ=





p1 ρ1 0

0 p2 ρ2

p1 σ1 0

0 p2 σ 2

and K to be the operation given by





, ,

K(ρ) = E1 ρ.E1∗ + E2 ρ.E2∗ where E1 = [I|0], E2 = [0|I] ∈ Cd×2d Note that

E1∗ E1 + E2∗ E2 = I2d

To prove (a), make use of the result on the asymptotic error rate in hypothesis testing on tensor product states: Given that the first kind of error, ie, probability of detecting ρ⊗n given that the true state is σ ⊗n goes to zero, the minimum (Neyman­Pearson test) probability of error of the second kind goes to zero at the rate of −D(ρ|σ) (this result is known as quantum Stein’s lemma). Now consider any sequence of tests Tn , ie 0 ≤ Tn ≤ I on (Cd )⊗n . Then consider using this test for testing between the pair of states K ⊗n (ρ⊗n ) and K ⊗n (σ ⊗n ). Observe that the optimum performance for this sequential hypothesis testing problem is the same as that of the dual problem of testing ρ⊗n versus σ ⊗n using the dual test K ∗⊗n (Tn ). This is because T r(K ⊗n (ρ⊗n )Tn ) = T r(ρ⊗n K ∗⊗n (Tn )) etc. It follows that the optimum Neyman­Pearson error probability of the first kind for testing ρ⊗n versus σ ⊗n will be much smaller than that of test­ ing K ⊗n (ρ⊗n ) versus K ⊗n (σ ⊗n ) since K ∗⊗n (Tn ) varies only over a subset of tests as Tn varies over all tests in (Cd )⊗n . Hence going to the limit gives the result exp(−nD(ρ|σ)) ≤ exp(−nD(K(ρ)|K(σ))

174 Stochastic Processes in Classical and Quantum Physics and Engineering 10.6. STATISTICAL SIGNAL PROCESSING, M.TECH, LONG TESTMAX MARKS:501 asymptotically as n → ∞ which actually means that D(ρ|σ) ≥ D(K(ρ)|K(σ))

[b] Prove using [a] that the Von­Neumann entropy H(ρ) = −T r(ρ.log(ρ)) is a concave function, ie if ρ, σ are two states, then H(p1 ρ + p2 σ) ≥ p1 H(ρ) + p2 H(σ), p1 , p2 ≥ 0, p1 + p2 = 1 hint: Let K be any TPCP map. Let

� � 0 p1 ρ1 ρ = ∈ C2d×2d , 0 p2 ρ2 and σ = I2d /2d where ρk ∈ C

d×d

, k = 1, 2. Then choose E1 = [I|0], E2 = [0|I] ∈ Cd×2d

and

K(X) = E1 XE1∗ + E2 XE2∗ ∈ Cd×d , X ∈ C2d×2d

Note that

E1∗ E1 + E2∗ E2 = Id

Show that K(σ) = K(I2d /2d) = Id /d Then we get using [a], D(K(ρ)|K(σ)) ≤ D(ρ|sigma) Show that this yields H(p1 ρ1 + p2 ρ2 ) + log(d) ≥ p1 H(ρ1 ) + p2 H(ρ2 ) − H(p) + log(2d) where H(p) = −p1 log(p1 ) − p2 log(p2 ) Show that H(p) ≤ log(2) and hence deduce the result. [2] [a] Let X[n], n ≥ 0 be a M ­vector valued AR process, ie, it satisfies the first order vector difference equation X[n] = AX[n − 1] + W[n], n ≥ 1

Stochastic Processes10. in Classical andPLAN Quantum Physics and Engineering 175 184CHAPTER LECTURE FOR STATISTICAL SIGNAL PROCESSING Assume that W[n], N ≥ 1 is an iid N (0, Q) process independent of X[0] and that X[0] is a Gaussian random vector. Calculate the value of R[0] = E(X[0]X[0]T ) in terms of A and Q so that X[n], n ≥ 0 is wide­sense stationary in the sense that E(X[n]X[n − m]T ) = R[m] does not depend on n. In such a case, evaluate the ACF R[m] of {X[n]} in terms of A and Q. Assume that A is diagonable with all eigenvalues being strictly inside the unit circle. Express your answer in terms of the eigenvalues and eigen­projections of A. [b] Assuming X[0] known, evaluate the maximum likelihood estimator of A given the data {X[n] : 1 ≤ n ≤ N }. Also evaluate the approximate mean and covariance of the estimates of the matrix elements of A upto quadratic orders in the correlations {R[m]} based on the data X[n], 1 ≤ n ≤ N and upto O(1/N ). In other words, calculate using perturbation theory, E(δAij ) and E(δAij δAkm ) upto O(1/N ) where Aˆ − A = δA

where Aˆ is the maximum likelihood estimator of A. hint: Write N

−1

N �

n=1

X[n − i]X[n − j]T − R[j − i] = δR(j, i)

and then evaluate δA upto quadratic orders in δR(j, i). Finally, use the fact that since X[n], n ≥ 0 is a Gaussian process, E(Xa [n1 ]Xb [n2 ]Xc [n3 ]Xd [n4 ]) = Rab [n1 −n2 ]Rcd [n3 −n4 ]+Rac [n1 −n3 ]Rbd [n2 −n4 ]+Rad [n1 −n4 ]Rbc [n2 −n3 ] where ((Rab (τ )))1≤a,b≤M = R(τ ) = E(X[n]X[n − τ ]T ) [c] Calculate the Fisher information matrix for the parameters {Aij } and hence the corresponding Cramer­Rao lower bound. [3] Let X[n] be a stationary M ­vector valued zero mean Gaussian process with ACF R(τ ) = E(X[n]X[n−τ ]T ) ∈ RM ×M . Define its power spectral density matrix by ∞ � S(ω) = R(τ )exp(−jωτ ) ∈ CM ×M τ =−∞

Prove that S(ω) is a positive definite matrix. Define its windowed periodogram estimate by ˆ N (ω)(X ˆ N (ω))∗ SˆN (ω) = X

17610.6. STATISTICAL Stochastic SIGNAL ProcessesPROCESSING, in Classical and M.TECH, Quantum Physics Engineering LONGand TESTMAX MARKS:501 where ˆ N (ω) = N −1/2 X

N −1 �

W(n)X(n)exp(−jωn)

n=0

where W(n) is an M × M matrix valued window function. Calculate ESˆN (ω), Cov((SN (ω1 ))ab , (SN (ω2 ))cd ) and discuss the limits of these as N → ∞. Express these means and covariances in terms of ∞ � ˆ W (τ )exp(−jωτ ) W (ω) = τ =−∞

[4] This problem deals with causal Wiener filtering of vector valued sta­ tionary processes. Let X(n), S(n), n ∈ Z be two jointly WSS processes with correlations RXX (τ ) = E(X(n)X(n − τ )T ), RSX (τ ) = E(S(n)X(n − τ )T ) Let SXX (z) =



RXX (τ )z −τ

τ

Prove that

RXX (−τ ) = RXX (τ )T and hence

SXX (z −1 ) = SXX (z)T

Assume that this power spectral density matrix can be factorized as SXX (z) = L(z)L(z −1 )T where L(z) and G(z) = L(z)−1 admit expansions of the form � L(z) = l[n]z −n , |z | ≥ 1 n≥0

G(z) =



n≥0

with



n≥0



n≥0

Then suppose

g[n]z −n , |z| ≥ 1

� l[n] �< ∞, � g[n] �< ∞

H(z) =



n≥0

h[n]z −n

Stochastic Processes in Classical and Quantum Physics and Engineering 177 186CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING is the optimum causal matrix Wiener filter that takes X[.] as input and outputs ˆ of S[.] in the sense that an estimate S[.] � ˆ = S[n] h[k]X[n − k] k≥0

with {h[k]} chosen so that ˆ �2 ] E[� S[n] − S[n] is a minimum, then prove that h[.] satisfies the matrix Wiener­Hopf equation � h[k]RXX [m − k] = RSX [m], m ≥ 0 k≥0

Show that this equation is equivalent to [H(z)SXX (z) − SSX (z)]+ = 0 Prove that the solution to this equation is given by H(z) = [SSX (z)L(z −1 )−T ]+ L(z)−1 = [SSX (z)G(z −1 )T ]+ G(z) where

[SSX (z)G(z −1 )T ]+

� =[ RSX [n]g[m]T z −n+m ]+ = � � RSX [n + m]g[m]T z −n n≥0

m≥0

Prove that this is a valid solution for a stable Causal matrix Wiener filter, it is sufficient that � � RSX [n] �< ∞ n≥0

[5] Write short notes on any two of the following: [a] Recursive least squares lattice filter for prediction with time and order updates of the prediction filter coefficients. [b] Kolmogorov’s formula for the entropy rate of a stationary Gaussian pro­ cess in discrete time in terms of the power spectral density and its relationship to the infinite order one step prediction error variance. [c] The Cq Shannon coding theorem and its converse and a derivation of the classical Shannon noisy coding theorem and its converse from it. [d] Cramer­Rao lower bound for vector parameters and vector observations. [e] The optimum detector in a binary quantum hypothesis testing problem between two mixed states.

178 Stochastic Processes in Classical and Quantum Physics and Engineering 10.7. MONOTONOCITY OF QUANTUM RELATIVE RENYI ENTROPY187

10.7 Monotonocity of quantum relative Renyi entropy Let A, B be positive definite matrices and let 0 ≤ t ≤ 1, then prove that (tA + (1 − t)B)−1 ≤ t.A−1 + (1 − t).B −1 − − − (1) ie, the map A → A−1 is operator convex on the space of positive matrices. Solution: Proving (1) is equivalent to proving that A1/2 (tA + (1 − t)B)−1 .A1/2 ≤ tI + (1 − t)A1/2 B −1 A1/2 which is the same as proving that (tI + (1 − t)X)−1 ≤ tI + (1 − t)X −1 − − − (2) where

X = A−1/2 BA−1/2

It is enough to prove (2) for any positive matrix X. By the spectral theorem for positive matrices, to prove (2), it is enough to prove (t + (1 − t)x)−1 ≤ t + (1 − t)/x − − − (3) for all positive real numbers x. Proving (3) is equivalent to proving that (t + (1 − t)x)(tx + 1 − t) ≥ x for all x > 0, or equivalently, (t2 + (1 − t)2 )x + t(1 − t)x2 + t(1 − t) ≥ x or equivalently, adding and subtracting 2t(1 − t)x on both sides to complete the square, −2t(1 − t)x + t(1 − t)x2 + t(1 − t) ≥ 0 or equivalently, 1 + x2 − 2x ≥ 0 which is true. Problem: For p ∈ (0, 1), show that x → −xp , x → xp−1 and x → xp+1 are operator convex on the space of positive matrices. Solution: Consider the integral � ∞ I(a, x) = ua−1 du/(u + x) 0

It is clear that this integral is convergent when x ≥ 0 and 0 < a < 1. Now � ∞ −1 I(a, x) = x ua−1 (1 + u/x)−1 du 0

Stochastic Processes in Classical and Quantum Physics and Engineering 179 188CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING = xa−1





y a−1 (1 + y)−1 dy

0

on making the change of variables u = xy. Again make the change of variables 1 − 1/(1 + y) = y/(1 + y) = z, y = 1/(1 − z) − 1 = z/(1 − z) Then I(a, x) = xa−1



1

z a−1 (1−z)−a+1 (1−z)(1−z)−2 dz = xa−1

0

a−1

=x

β(a, 1 − a) = x

1−a

a−1

Γ(a)Γ(1 − a) = x



1

z a−1 (1−z)−a dz

0

π/sin(πa)

Thus, x

a−1

= (sin(πa)/π)I(a, x) = (sin(πa)/π)





ua−1 du/(u + x)

0

and since for u > 0, as we just saw, x → (u + x)−1 is operator convex, the proof that f (x) = xa−1 is operator convex when a ∈ (0, 1). Now, Now, ua−1 x/(u + x) = ua−1 (1 − u/(u + x)) = ua−1 − ua /(u + x) Integrating from u = 0 to u = ∞ and making use of the previous result gives � ∞ xa = (sin(πa)/π) (ua−1 − ua /(u + x))du 0

from which it immediately becomes clear that x → −xa is operator convex on the space of positive matrices. Now consider xa+1 . We have ua−1 x2 /(u + x) = ua−1 ((u + x)2 − u2 − 2ux)/(u + x) = ua−1 (u+x)−u a+1 /(u+x)−2ua x/(u+x) = ua−1 (u+x)−ua+1 /(u+x)−2ua (1−u/(u+x)) and integration thus gives � ∞ xa+1 = (sin(πa)/π) (ua−1 x + ua+1 /(u + x) − ua )du 0

from which it becomes immediately clear that xa+1 is operator convex. Monotonicity of Renyi entropy: Let s ∈ R. For two positive matrices A, B, consider the function Ds (A|B) = log(T r(A1−s B s )) Our aim is to show that for s ∈ (0, 1) and a TPCP map K, we have for two states A, B, Ds (K(A)|K(B)) ≥ Ds (A|B)

180 Stochastic Processes in Classical and Quantum Physics and Engineering 10.7. MONOTONOCITY OF QUANTUM RELATIVE RENYI ENTROPY189 while for s ∈ (−1, 0), we have Ds (K(A)|K(B)) ≤ Ds (A|B) For a ∈ (0, 1), and x > 0 we’ve seen that � ∞ xa+1 = sin(πa)/π) (ua−1 x + ua+1 /(u + x) − ua )du 0

and hence for s ∈ (−1, 0), replacing a by −s gives � ∞ x1−s = −(sin(πs)/π) (ua−1 x + ua+1 /(u + x) − ua )du 0

Let s ∈ (−1, 0). Then the above formula shows that x1−s is operator convex. Hence, if A, B are any two density matrices and K a TPCP map, then K(A)1−s ≤ K(A1−s ) whence B s/2 K(A)1−s B s/2 ≤ B s/2 K(A1−s )B s/2 for any positive B and taking taking trace gives T r(K(A)1−s B s ) ≤ T r(K(A1−s )B s ) Replacing B by K(B) here gives T r(K(A)1−s K(B)s ) ≤ T r(K(A1−s )K(B)s ) Further since s ∈ (−1, 0) it follows that xs is operator convex. Hence K(B)s ≤ K(B s ) and so A1/2 K(B)s A1/2 ≤ A1/2 K(B s )A1/2

for any positive A and replacing A by K(A1−s ) and then taking trace gives T r(K(A1−s )K(B)s ) ≤ T r(K(A1−s )K(B s )) More generally, we have the following: If X is any matrix, and s ∈ (0, 1) and A, B positive, and K any TPCP map T r(X ∗ K(A)1−s XK(B)s ) ≥ T r(X ∗ K(A)1−s XK(B s )) (since K(B)s ≥ K(B s ) because B → B s is operator concave and since K(A)1−s X is positive) = T r(K(A)1−s XK(B s )X ∗ ) ≥ T r(K(A1−s )XK(B s )X ∗ )

Stochastic Processes in Classical and Quantum Physics and Engineering 181 190CHAPTER 10. LECTURE PLAN FOR STATISTICAL SIGNAL PROCESSING (since K(A)1−s ≥ K(A1−s ) because A → A1−s is operator concave and since XK(B s )X ∗ is positive) = T r(X ∗ K(A1−s )XK(B s )) Since X is an arbitary complex square matrix, this equation immediately can be written as K(A)1−s ⊗ K(B)s ≥ K(A1−s ) ⊗ K(B s ) for all positive A, B and s ∈ (0, 1). We can express this equation as K(A)1−s ⊗ K(B)s ≥ (K ⊗ K)(A1−s ⊗ B s ) and hence if I denotes the identity matrix of the same size as A, B, then T r(K(A)1−s K(B)s ) ≥ T r((A1−s ⊗ B s )(K ⊗ K)(V ec(I).V ec(I)∗ )) Note that if {ek : k = 1, 2, ..., n} is an onb for Cn , the Hilbert space on which A, B act, then � � |ek ⊗ ek >= |ξ > I= |ek >< ek |, V ec(I) = k

k

V ec(I).V ec(I)∗ = |ξ >< ξ| = where

� ij

Eij ⊗ Eij

Eij = |ei >< ej | Thus, for s ∈ (0, 1), T r(K(A)1−s K(B)s ) ≥



T r(Kij A1−s Kji B s )

i,j

where Kij = K(Eij ) Likewise, if s ∈ (0, 1) so that s − 1 ∈ (−1, 0), 2 − s ∈ (1, 2), then both the maps A → As−1 and A → A2−s are operator convex and hence for any TPCP map K , K(A)s−1 ≤ K(A)s−1 , K(A)2−s ≤ K(A2−s ), from which we immediately deduce that T r(X ∗ K(A)2−s XK(B)s−1 ) ≤ T r(X ∗ K(A2−s )XK(B s−1 )) or equivalently, K(A)2−s ⊗ K(B)s−1 ≤ K(A2−s ) ⊗ K(B s−1 ) which result in the inequality T r(K(A)2−s K(B)s−1 ) ≤

� i,j

T r(Kij A2−s Kji B s−1 )

182 Stochastic Processes in Classical and Quantum Physics and Engineering 10.7. MONOTONOCITY OF QUANTUM RELATIVE RENYI ENTROPY191 Remark: Lieb’s inequality states that for s ∈ (0, 1), the map (A, B) → A1−t ⊗ B t = ft (A, B) is joint operator concave, ie, if A1 , A2 , B1 , B2 are positive and p, q > 0, p + q = 1 then with A = pA1 + qA2 , B = pB1 + qB2 we have ft (A, B) ≥ pft (A1 , B1 ) + qft (A2 , B2 ) From this, we get the result that (A, B) → T r(X ∗ A1−t XB t ) = Ht (A, B|X) is jointly concave. Now let K denote the pinching K(A) = (A + U AU ∗ )/2 where U is a unitary matrix. Then, we get Ht (K(A), K(B)|X) ≥ (1/2)Ht (A, B|X) + (1/2)Ht (U AU ∗ , U BU ∗ |X) and replacing X by I in this inequality gives T r(K(A)1−t K(B)t ) ≥ T r(A1−t B t ) which proves the monotonicity of the Renyi entropy when t ∈ (0, 1) under pinching operations K of the above kind. Now, this inequality will also hold for any pinching K since any pinching can be expressed as a composition of such elementary pinchings of the above kind. We now also wish to show the reverse inequality when t ∈ (−1, 0), ie show that if t ∈ (0, 1), then T r(K(A)2−t K(B)t−1 ) ≤ T r(A2−t B t−1 )

Chapter 11

Chapter 11

My commentary My commentary on the on

TheGita Ashtavakra Gita

Ashtavakra A couple of years ago, my colleague and friend presented me with a verse to verse translation of the celebrated Ashtavakra Gita in English along with a commentary on each verse. I was struck by the grandeur and the simplicity of the Sanskrit used by Ashtavakra for explaining to King Janaka, the true meaning of self­realization, how to attain it immediately without any difficulty and also King Janaka’s reply to Ashtavakra about how he has attained immeasurable joy and peace by understanding this profound truth. Ashtavakra was a sage who lived during the time in which the incidents of the Raamayana took place and hence his Gita dates much much before even the Bhagavad Gita spoken by Lord Krisnna to Arjuna was written down by Sage Vyasa through Ganapati the scribe. One can see in fact many of the slokas in the Bhagavad Gita have been adapted from the Ashtavakra Gita and hence I feel that it is imperative for any seeker of the absolute truth and everlasting peace, joy and tranquility to read the Ashtavakra Gita, so relevant in our modern times where there are so many disturbances. Unlike the Bhagavad Gita, Ashtavkra does not bring in the concept of a personal God to explain true knowledge and the process of self­realization. He simply says that you attain self­realization immediately at this moment when you understand that the physical universe that you see around you is illusory and the fact that your soul is the entire universe. Every living being’s soul is the same and it fills the entire universe and that all the physical events that take place within our universe as perceived by our senses are false, ie, they do not occur. In the language of quantum mechanics, these events occur only because of the presence of the observer, namely the living individuals with their senses. The absolute truth is that each individual is a soul and not a body with his/her senses and this soul neither acts nor causes to act. Knowledge is obtained immediately at the moment when we understand the truth that the impressions we have from our physical experiences are illusory, ie false and that the truth is that our soul fills the entire universe and that this 193

183

184 194

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 11. MY COMMENTARY ON THE ASHTAVAKRA GITA

soul is indestructible and imperishable as even the Bhagavad Gita states. The Bhagavad Gita however distinguishes between two kinds of souls, the jivatma and the paramatma, the former belonging to the living entities in our universe and the latter to the supreme personality of Godhead. The Bhagavad Gita tells us that within each individual entity, both the jivatma and the paramatma reside and when the individual becomes arrogant, he believes that his jivatma is superior to the paramatma and this then results in self conflict owing to which the individual suffers. By serving the paramatma with devotion as outlined in the chapter on Bhakti yoga, the jivatma is able to befriend the paramatma and then the individual lives in peace and tranquility. To explain self­knowledge more clearlh without bringing in the notion of a God, Ashtavakra tells Raja Janaka that there are two persons who are doing exactly the same kind of work every day both in their offices as well as in their family home. However, one of them is very happy and tranquil while the other is depressed and is always suffering. What is the reason for this ? The reason, Ashtavakra says is that the second person believes that his self is doing these works and consequently the reactions generated by his works affect him and make him either very happy or very sad. He is confused and does not know how to escape from this knot of physical experiences. The first person, on the other hand, knows very well that he is compelled to act in the way he is acting owing to effects of his past life’s karma’s and that in fact his soul is pure and has nothing at all to do with the results and effects of his actions. He therefore neither rejoices nor does he get depressed. The physical experiences that he goes through do not leave any imprints on his self because he knows that the physical experiences are illusory and that his soul which pure and without any blemish fills the entire universe and is the truth. Ashtavarka further clarifies his point by saying that the person in ignorance will practice meditation and try to cultivate detachment from the objects of this world thinking that this will give him peace. Momentarily such a person will get joy but when he comes of the meditation, immediately all the desires of this illusory physical world will come back to him causing even more misery. He will try to run to a secluded place like a forest and meditate to forget about his experiences, but again when his meditation is over, misery will come back to him. However, once he realizes that he is actually not his body but his soul and that the soul is free of any physical quality and that it is the immeasurable truth, then he will neither try forcefully to cultivate detachment nor will he try forcefully to acquire possessions. Both the extremes of attachment and detachment will not affect him. In short, he attains peace even living in this world immediately at this moment when true knowledge comes to him. According to Ashtavkara, the seeker of attachement as well as the seeker of detachment are both unhappy because they have not realized that the mind must not be forced to perform an activity. Knowledge is already present within the individual, he just has to remain in his worldly position as it is, performing his duties with the full knowledge that this compulsion is due to his past karmas and then simultaneously be conscious of the fact that his soul is the perfect Brahman and it pervades the entire universe. At one point, Ashtavakra tells Janaka that let even Hari (Vishnu) or Hara (Shiva) be your

Stochastic Processes in Classical and Quantum Physics and Engineering

185 195

guru, even then you cannot be happy unless you have realized at this moment the falsity of the physical world around you and the reality of your soul as that which pervades the entire universe. Such a person who has understood this, will according to Ashtavakra, be unmoved if either if death in some form approaches him or if temptations of various kinds appear before him. He will remain neutral to both of these extremes. Such a realized soul has eliminated all the thought processes that occur within his mind. When true knowledge dawns upon a person, his mind will become empty and like a dry leaf in a strong wind, he will appear to be blown from one place to other doing different kinds of karma but always with the full understanding that these karmas have nothing at all to do with his soul which is ever at bliss, imperturbable, imperishable and omnisicient. In the quantum theory of matter and fields, we learn about how the observer perturbs a system in order to measure it and gain knowledge about the system. When the system is thus perturbed, one measures not the true system but only the perturbed system. This is the celebrated Heisenberg uncertainty principle. Likewise, in Ashtavakra’s philosophy, if we attempt to gain true knowledge about the universe and its living entities from our sensory experience, we are in effect perturbing the universe and hence we would gain only perturbed knowledge, not true knowledge. True knowledge comes only with the instant realization that we are not these bodies, we are all permanent souls. Even from a scientific standpoint, as we grow older, all our body cells eventually get replaced by new ones, then what it is that in our new body after ten years that we are able to retain our identity ? What scientific theory of the DNA or RNA can explain this fact ? What is the difference between the chemical composition of our body just before it dies and just after it dies ? Absolutely no difference. Then what is it that disinguishes a dead body from a living body ? That something which distinguishes the two must be the soul which is not composed of any chemical. It must be an energy, the life force which our modern science is incapable of explaining.

Chapter 12

Chapter 12

LDP-UKF Basedcontroller Controller LDP­UKF based State equations: x(n + 1) = f (x(n), u(n)) + w(n + 1) Measurement equations: z(n) = h(x(n), u(n)) + v(n) w, v are independent white Gaussian processes with mean zero and covariances Q, R respectively. xd (n) is the desired process to be tracked and it satisfies xd (n + 1) = f (xd (n), u(n)) UKF equations: Let {ξ(k) : 1 ≤ k ≤ K} and {η(k) : 1 ≤ k ≤ K} be independent iid N (0, I) random vectors. The UKF equations are x ˆ(n + 1|n) = K −1

K �

f (ˆ x(n|n) +

k=1



P (n|n)ξ(k), u(n))

x ˆ(n + 1|n + 1) = x ˆ(n + 1|n) + K(n)(z(n + 1) − zˆ(n + 1|n)) where

K(n + 1) = ΣXY Σ−1 YY

Where ΣXY = K

−1

K � � � [ P (n + 1|n)ξ(k)(h(ˆ x(n+1|n)+ P (n + 1|n)η(k))−zˆ(n+1|n))T ],

k=1

zˆ(n + 1|n) = K −1

K �

h(ˆ x(n + 1|n) +



(h(ˆ x(n + 1|n) +



k=1

ΣY Y = K −1 .

K �

k=1

197

187

P (n + 1|n)η(k), u(n + 1))

P (n + 1|n)η(k))(, , )T

188 198

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 12. LDP­UKF BASED CONTROLLER

P (n + 1|n) = K −1

K �

[(f (ˆ x(n|n) +

k=1



P (n|n)ξ(k), u(n)) − x ˆ(n + 1|n))(, , )T ] + Q

T P (n + 1|n + 1) = ΣXX − ΣXY .Σ−1 Y Y ΣXY

where ΣXX = P (n + 1|n) Tracking error feedback controller: Let f (n) = xd (n) − x ˆ(n|n), f (n + 1|n) = xd (n + 1) − x ˆ(n + 1|n) Controlled dynamics: x(n + 1) = f (x(n), u(n)) + w(n + 1) + Kc f (n) Aim: To choose the controller coefficient Kc so that the probability P (max1≤n≤N (|e(n)|2 + |f (n)|2 ) > �) is minimized where e(n) = x(n) − x ˆ(n), e(n + 1|n) = x(n + 1) − x ˆ(n + 1|n), x ˆ(n) = x ˆ(n|n) Linearized difference equations for e(n), f (n): (Approximate) e(n+1|n) = x(n+1)−x ˆ(n+1|n) = f (x(n), u(n))+w(n+1)+Kc f (n+1|n)−K −1 .



f (ˆ x(n)+

k

x(n), u(n)) = f (x(n), u(n))−f (ˆ x(n), u(n))−f � (ˆ �



x(n), u(n))e(n)−f (ˆ x(n), u(n)) = f (ˆ





P (n|n)ξ(k), u(n))

� � P (n|n)K −1 ξ(k)+w(n+1)+Kc f (n)

P (n|n)K



−1

k

ξ(k)+w(n+1)+Kc f (n)−−−(1)

k

ˆ(n+1|n) = f( xd (n), u(n))−K −1 . f (n+1|n) = xd (n+1)−x



f (ˆ x(n)+

k

= f � (ˆ x(n), u(n))f (n) − f � (ˆ x(n), u(n))



P (n|n)K −1

� k



P (n|n)ξ(k), u(n))

ξ(k) − − − (2)

e(n + 1) = x(n + 1) − x ˆ(n + 1|n + 1) =

e(n + 1|n) + Kc f (n + 1|n) − K(n)(z(n + 1) − zˆ(n + 1|n)) = e(n+1|n)+Kc f (n+1|n)−K(n)(h(x(n+1), u(n+1))+ � � h(ˆ x(n+1|n)+ P (n + 1|n)η(k), u(n+1))) v(n+1)−K −1 k

x(n + 1|n), u(n + 1))e(n + 1|n) + v(n + 1) = e(n + 1|n) + Kc f (n + 1|n) − K(n)(h� (ˆ � � x(n + 1|n), u(n + 1)). P (n + 1|n).K −1 . η(k)) −h� (ˆ k



= (I − K(n)h (ˆ x(n + 1|n), u(n + 1)))e(n + 1|n) + Kc f (n) − K(n)v(n + 1)

Stochastic Processes in Classical and Quantum Physics and Engineering +K(n)h� (ˆ x(n + 1|n), u(n + 1)). Finally,



P (n + 1|n)K −1

� k

189 199

η(k) − − − (3)

f (n + 1) = xd (n + 1) − x ˆ(n + 1|n + 1) = f (n + 1|n) − K(n)(z(n + 1) − zˆ(n + 1|n)) = f (n+1|n)−K(n)(h(x(n+1), u(n+1))+ � � v(n+1)−K −1 h(ˆ x(n+1|n)+ P (n + 1|n)η(k), u(n+1)) k

x(n + 1|n), u(n + 1))e(n + 1|n) = f (n + 1|n) − K(n)(h� (ˆ � � η(k)) − K(n)v(n + 1) −h� (ˆ x(n + 1|n), u(n + 1)). P (n + 1|n)K −1 k

x(n + 1|n), u(n + 1)))e(n + 1|n) = f (n + 1|n) − K(n)h� (ˆ � � x(n+1|n), u(n+1)). P (n + 1|n).K −1 . η(k)−K(n)v(n+1)−−−(4) +K(n)h� (ˆ k

Substituting for e(n + 1|n) and f (n + 1|n) from (1) and (2) respectively into (3) and (4) results in the following recursion for e(n), f (n): e(n + 1) = � � x(n), u(n)) P (n|n)K −1 x(n+1|n), x(n), u(n))f (n)−f � (ˆ f � (ˆ ξ(k)−K(n)h� (ˆ k

u(n+1)))(f � (ˆ x(n), u(n))e(n)

−f � (ˆ x(n), u(n))



P (n|n)K −1



ξ(k) + w(n + 1) + Kc f (n))

k

� � x(n+1|n), u(n+1)). P (n + 1|n).K −1 . η(k)−K(n)v(n+1)−−−(5) +K(n)h� (ˆ k

and

f (n + 1) = x(n), u(n)) x(n), u(n))f (n) − f � (ˆ f � (ˆ



P (n|n)K −1



ξ(k)

k

x(n + 1|n), u(n + 1))). −K(n)h� (ˆ � � ξ(k) + w(n + 1) + Kc f (n)) x(n), u(n))e(n) − f � (ˆ x(n), u(n)) P (n|n)K −1 .(f � (ˆ k

� � x(n+1|n), u(n+1)). P (n + 1|n).K −1 . η(k)−K(n)v(n+1)−−−(6) +K(n)h (ˆ �

k

If we regard the error processes e(n), f (n) and the noise processes w(n),v(n), �K � ξ1 (n) = K −1 . k=1 ξ(k) and η1 (n) = K −1 k η(k) to be all of the first order of smallness in perturbation theory, then to the same order, we can replace in (4) and (5), x ˆ(n) appearing on the rhs by xd (n) and then the error thus incurred

190 200

Stochastic Processes in Classical and Quantum Physics and Engineering CHAPTER 12. LDP­UKF BASED CONTROLLER

will be only of the second order of smallness. Doing so, we can express (4) and (5) in the form after equilibrium has been obtained as e(n+1) = A11 e(n)+(A12 +A18 Kc )f (n)+A13 w(n+1)+ A14 A15 v(n+1)+A16 ξ1 (n+1)+A17 η1 (n+1) f (n+1) = A21 e(n)+(A22 +A28 Kc )f (n)+A23 w(n+1)+ A24 A25 v(n+1)+A26 ξ1 (n+1)+A27 η1 (n+1) Define the aggregate noise process W [n] = [w(n)T , v(n)T , ξ1 (n)T , η1 (n)T ]T It is a zero mean vector valued white Gaussian process and its covariance matrix is block diagonal: E(W (n)W (m)T ) = RW (n)δ(n − m) where RW (n) = E(W (n)W (n)T ) = diag[Q, R, P [n|n)/K, P (n + 1|n)/K] Let P denote the limiting equilibrium value of P [n|n] as n → ∞. The limiting equilibrium value of P [n + 1|n] is then given approximately by x, u)P f � (ˆ x, u)T + Q P+ = f � (ˆ where x ˆ may be replaced by the limiting value of xd , assuming it to be asymp­ totically d.c and f � (x, u) is the Jacobian matrix of f (x, u) w.r.t x. We write f � (xd , u) = F and then P+ = F P F T + Q If we desire to use a more accurate value of P+ , then we must replace it by P+ = Cov(f (xd + P.ξ1 )) + Q where ξ1 is an N (0, K −1 P ) random vector. Now, let RW denote the limiting equilibrium value of RW (n), ie, RW = diag[Q, R, P/K, P+ /K] Then, the above equations for χ[n] = [e[n]T , f [n]T ]T can be expressed as χ[n + 1] = (G0 + G1 Kc G2 )χ[n] + G3 W [n + 1] and its solution is given by χ[n] =

n−1 � k=0

(G0 + G1 Kc G2 )n−1−k G3 W [k + 1]

Stochastic Processes in Classical and Quantum Physics and Engineering

191 201

This process is zero mean Gaussian with an autocorrelation given by min(n,m)−1

E(χ[n]χ[m]T ) =



(G0 + G1 Kc G2 )n−1−k .RW .(G0 + G1 Kc G2 )T m−1−k

k=0

= Rχ [n, m|Kc ] say. The LDP rate function of the process χ[.] over the time interval [0, N ] is then given by I[χ[n] : 0 ≤ n ≤ N ] = (1/2) where

N �

χ[n]T Qχ [n, m|Kc ]χ[m]

n,m=0

Qχ = ((Rχ [n, m|Kc ]))−1 0≤n,m≤N

�N According to the LDP for Gaussian processes, the probability that n=0 � √ χ[n] �2 will exceed a threshold δ after we scale χ by a small number �| is given approximately by exp(−�−1 min(I(χ) :

N �

n=0

� χ[n] �2 > δ))

and in order to minimize this deviation probability, we must choose Kc so that the maximum eigenvalue λmax (Kc ) of ((Rχ [n, m|Kc ]))−1 0≤n,m≤N is a minimum.

Chapter 13

Chapter 13

Abstracts for a Book on Abstracts a book on Quantumfor Signal Processing quantum signal Field processing Using Quantum Theory using quantum field theory Chapter 1 An introduction to quantum electrodynamics [1] The Maxwell and Dirac equations. [2] Quantization of the electromagnetic field using Bosonic creation and an­ nihilation operators. [3] Second Quantization of the Dirac field using Fermionic creation and an­ nihilation operators. [4] The interaction term between the Dirac field and the Maxwell field. [5] Propagators for the photons and electron­positron fields. [7] The Dyson series and its application to calculation of amplitudes for absorption, emission and scattering processes between electrons and positrons. [8] Representing the terms in the Dyson series for interactions between pho­ tons, electrons and positrons using Feynman diagrams. [9] The role of propagators in Feynman diagrammatic calculations. [10] Quantum electrodynamics using the Feynman path integral method. [11] Calculating the propagator of fields using Feynman path integrals. [12] Electrons self energy, vacuum polarization, Compton scattering and anomalous magnetic moment of the electron. Chapter 2 The design of quantum gates using quantum field theory [1] Design of quantum gates using first quantized quantum mechanics by controlled modulation of potentials. [2] Design of quantum gates using harmonic oscillators with charges per­ turbed by and electric field. 203

193

194 Stochastic Processes in Classical and Quantum Physics and Engineering 204CHAPTER 13. ABSTRACTS FOR A BOOK ON QUANTUM SIGNAL PROCESSING [3] Design of quantum gates using 3­D harmonic oscillators perturbed by control electric and magnetic field. [4] Representing the Klein­Gordon Hamiltonian with perturbation using truncated spatial Fourier modes within a box. [5] Design of the perturbation potential in Klein­Gordon quantization for realizing very large sized quantum gates. [6] Representing the Hamiltonian of the Dirac field plus electromagnetic field in terms of photon and electron­positron creation and annihilation operators. [7] Design of control classical electromagnetic fields and control classical current sources in quantum electrodynamics for approximating very large sized quantum unitary gates. [8] The statistical performance of quantum gate design methods in the pres­ ence of classical noise. Chapter 3 The quantum Boltzmann equation with applications to the design of quan­ tum noisy gates, ie, TPCP channels. [1] The classical Boltzmann equation for a plasma of charged particles inter­ acting with an external electromagnetic field. [2] Calculating the conductivity and permeability of a plasma interacting with an electromagnetic field using first order perturbation theory. [3] The quantum Boltzmann equation for the one particle density opera­ tor for a system of N identically charged particles interacting with a classical electromagnetic field. The effect of quantum charge and current densities on the dynamics of the Maxwell field and the effect of the Maxwell field on the dynamics of the one particle density operator. [4] The quantum nonlinear channel generated by the evolving the one particle density operator under an external field in the quantum Boltzmann equation. Chapter 4 The Hudson­Parthasarathy quantum stochastic calculus with application to quantum gate and quantum TPCP channel design. [1] Boson and Fermion Fock spaces. [2] Creation, annihilation and conservation operator fields in Boson and Fermion Fock spaces. [3] Creation, annihilation and conservation processes in Boson Fock space. [4] The Hudson­Parthasarathy quantum Ito formulae. [5] Hudson­Parthasarathy­Schrodinger equation for describing the evolution of a quantum system in the presence of bath noise. [6] The GKSL (Lindlbad) master equation for open quantum systems. [7] Design of the Lindlbad operators for an open quantum system for gener­ ation of a given TPCP map, ie, design of noisy quantum channels. [8] Schrodinger’s equation with Lindlbad noisy operators in the presence of an external electromagnetic field. Design of the electromagnetic field so that the evolution matches a given TPCP map in the least squares sense.

Stochastic Processes in Classical and Quantum Physics and Engineering

195 205

[9] Quantum Boltzmann equation in the presence of an external electromag­ netic field with Lindblad noise terms. Design of the Lindblad operators and the electromagnetic field so that the evolution matches a given nonlinear TPCP map. Chapter 5 Non­Abelian matter and gauge fields applied to the design of quantum gates. [1] Lie algebra of a Lie group. [2] Gauge covariant derivatives. [3] The non­Abelian matter and gauge field Lagrangian that is invariant under local gauge transformations. [4] Application of non­Abelian matter and gauge field theory to the design of large sized quantum gates in the first quantized formalism, gauge fields are assumed to be classical and the control fields and current are also assumed to be classical. Chapter 6 The performance of quantum gates in the presence of quantum noise. Quantum plasma physics: For an N particle system with identitcal particles � � Hk , ρ] + [ Vij , ρ] + Lindlbadnoiseterms. iρ,t = [ k

i