Encyclopedia of Modern Optics [Volumes 1-5, 1st ed.]. ISBN 0122276000, 9780122276002

The encyclopedia provides valuable reference material for those working in the field of optics who wish to learn more about a particular topic.

English, 2285 pages, 2004

Table of contents:
Cover......Page 1
Editor-in-Chief......Page 2
Editors......Page 3
Consulting Editors......Page 4
Editorial Advisory Board......Page 5
Preface......Page 6
Introduction......Page 7
The Helium-Neon Laser......Page 0
Principles......Page 8
Qualification of Signal Regenerator Performance......Page 9
All-Optical 2R/3R Regeneration Using Optical Nonlinear Gates......Page 10
Optical Decision Element......Page 11
Optical Clock Recovery (CR)......Page 12
Optical Regeneration by Saturable Absorbers......Page 13
Synchronous Modulation Technique......Page 14
See also......Page 15
Further Reading......Page 16
Fraunhofer Diffraction......Page 17
Fresnel Diffraction......Page 18
Further Reading......Page 19
Laser Physics......Page 20
Nonlinear Dynamics and Chaos in Lasers......Page 21
Applications of Chaos in Lasers......Page 23
Further Reading......Page 25
Kubelka-Munk Theory of Reflectance......Page 36
Intermediate Case......Page 38
Sample Geometry......Page 39
Kinetic Analysis......Page 40
Rate Constant Distributions......Page 41
Examples......Page 42
Further Reading......Page 43
Principle of Laser Trapping......Page 44
Laser Manipulation System......Page 45
Patterning of Polymer Nanoparticles......Page 46
Application to Biotechnology......Page 47
Transfer of Cells in Microchannel......Page 48
Collection and Alignment of Cells......Page 49
Further Reading......Page 50
Second-Order Spectroscopies......Page 53
Third-Order Spectroscopies......Page 54
Ultrafast Time Resolved Spectroscopy......Page 56
Spatially Resolved Spectroscopy......Page 57
Further Reading......Page 58
Photoproperties of Photosensitizers......Page 59
Mechanisms of Photodynamic Therapy......Page 61
Lasers in PDT......Page 62
Clinical Applications of Lasers in PDT......Page 65
Future Prospects......Page 66
Further Reading......Page 67
Experimental Measurement of Pump-Probe Data......Page 68
Origin of the Signal......Page 69
Nuclear Wavepackets......Page 71
See also......Page 72
Further Reading......Page 73
High Repetition Rate......Page 74
Ti-Sapphire......Page 75
Diode Lasers......Page 77
Further Reading......Page 78
Experimental Setup......Page 80
The Transient Density Phase Grating Technique......Page 81
Investigation of Population Dynamics......Page 84
Polarization Selective Transient Grating......Page 85
Further Reading......Page 87
Introduction......Page 88
Mutual Coherence......Page 90
Spectral Representation of Mutual Coherence......Page 91
Generalized Propagation......Page 92
Types of Fields......Page 93
Perfectly Coherent Fields......Page 94
Cross-Spectrally Pure Fields......Page 95
Secondary Sources......Page 96
Quasi-Homogeneous Sources......Page 97
Scaling Law......Page 98
Experimental Confirmations......Page 99
Interference Spectroscopy......Page 100
Higher-Order Coherence......Page 101
Further Reading......Page 102
Elementary Coherence Concepts......Page 104
Two-point Imaging......Page 105
Incoherent Two-Point Imaging......Page 106
Source Distribution and Object Illumination Coherence......Page 107
Spatial Frequency Modeling of Imaging......Page 108
Experimental Examples of Important Coherence Imaging Phenomena......Page 110
Primary Source Generation......Page 111
Noise Immunity......Page 112
Digital Post-detection Processing and Partial Coherence......Page 114
Summary and Discussion......Page 115
Further Reading......Page 118
Some Basic Observations......Page 119
Speckle Limits of Metrology......Page 122
Can We Overcome Coherence Limits......Page 123
Speckles as a Carrier of Information......Page 124
Further Reading......Page 126
Quantum Optimal Control Theory for Designing Laser Fields......Page 129
Algorithms for Implementing Optimal Control Experiments......Page 131
Conclusions......Page 136
Introduction......Page 137
Two-Path Interference Methods......Page 139
Closed-Loop Control Methods......Page 140
Further Reading......Page 141
Coherence Control......Page 142
Coherence Control in Semiconductors......Page 143
Coherent Control of Electrical Current Using Two Color Beams......Page 144
Coherent Control of Carrier Density, Spin Population, and Spin Current Using Two Color Beams......Page 146
Further Reading......Page 147
Introduction......Page 148
Basic Principles......Page 149
PSK Homodyne Detection......Page 150
FSK Heterodyne Synchronous Detection......Page 151
CPFSK Heterodyne Differential Detection......Page 152
Coherent Receiver Sensitivity......Page 153
PSK Homodyne Detection - Phase-Locked Loop Schemes......Page 154
Phase-Diversity Receivers......Page 155
Polarization......Page 156
System Experiments......Page 157
Further Reading......Page 158
Optical Bloch Equations......Page 159
Maxwell-Bloch Equations......Page 160
Free Polarization Decay......Page 161
Photon Echo......Page 162
Stimulated Photon Echo......Page 164
Optical Ramsey Fringes......Page 165
Acknowledgments......Page 166
Further Reading......Page 167
Optical Bloch Equations......Page 168
Semiconductor Bloch Equations......Page 169
Superradiance......Page 170
Destructive Interference......Page 171
Quantum Beats......Page 172
Coherent Control......Page 173
Transient Absorption Changes......Page 174
Photon Echo......Page 176
Further Reading......Page 177
Coherent Dynamics......Page 178
Incoherent Relaxation Dynamics......Page 179
Hot Carrier Regime......Page 180
Hot Phonons......Page 181
Novel Coherent Phenomena......Page 182
Further Reading......Page 183
Photoreceptors in the Eye: Rods and Cones......Page 184
Temporal Response of the Human Visual System......Page 185
Color Perception Models......Page 186
The Commission Internationale de l’Eclairage (C.I.E.) Diagram......Page 187
Atmospheric Colors......Page 189
Colors due to Water and Water Droplets......Page 191
Sources of Color......Page 193
Further Reading......Page 194
The Absorption-Emission Cycle......Page 26
Signal vs Background......Page 27
Single Molecule Detection using Confocal Microscopy......Page 28
Optical Probe Volumes......Page 29
Intensity Fluctuations: Photon Burst Statistics......Page 30
Data Filtering......Page 32
Photon Burst Statistics......Page 33
Temporal Fluctuations: Autocorrelation Analysis......Page 34
Further Reading......Page 35
Introduction......Page 195
Interferometric Sensors......Page 196
Fiber Grating Sensors......Page 200
Fiber Laser Doppler Velocimeter......Page 202
Luminescence-Based Fiber Sensors......Page 203
Further Reading......Page 204
Heterodyning......Page 205
Introduction......Page 210
Image Formation......Page 212
Fractals for Texture Analysis......Page 213
Image Post-Processing Examples......Page 214
Example II: Frequency Domain Processing: Image Post-Processing in Biomedical Tissue Detection......Page 215
Example III: Time-Frequency Domain Filtering: Image Post-Processing in Noise Reduction (De-Noising)......Page 216
Image Post-processing, Transmission, and Distribution......Page 218
Requirements of Secure Digital Image Distribution......Page 219
Further Reading......Page 222
Silicon Technology for Image Sensing......Page 223
Basic Functionality and Physical Limitations of Conventional Solid-State Photosensors......Page 224
Desired Optoelectronic Functionality in Smart Pixel Arrays......Page 225
High-Sensitivity Charge Detection......Page 226
Extension of the Spectral Sensitivity Range......Page 227
Static Spatial Demodulation......Page 228
Dynamic Spatial Convolution......Page 229
Photonic Systems-on-chip......Page 230
Summary......Page 231
Further Reading......Page 232
Grating Equations......Page 233
Grating Theory......Page 235
Fabrication of Gratings......Page 236
Spectroscopic Gratings......Page 239
Diffractive Lenses......Page 240
Beam Splitter Gratings......Page 241
Inductive Grid Filters......Page 242
Introduction......Page 243
Kirchhoff Theory of Diffraction......Page 244
Fraunhofer Approximation......Page 249
Diffraction by a Rectangular Aperture......Page 251
Diffraction from a Circular Aperture......Page 252
Array Theorem......Page 254
N Rectangular Slits......Page 255
The Diffraction Grating......Page 256
Grating Spectrometer......Page 258
Blazed Gratings......Page 259
Fresnel Diffraction......Page 261
The Obliquity Factor......Page 262
The Cornu Spiral......Page 267
Fresnel Zones......Page 269
Circular Aperture......Page 271
Opaque Screen......Page 272
Zone Plate......Page 273
Further Reading......Page 274
Ray Tracing Simulation of DOEs......Page 275
Local Grating Model for the Ray Tracing Simulation of DOEs......Page 276
Aberration Correction for Interferometrically Recorded Holograms......Page 278
Testing of Aspheric Surfaces by Using a DOE as Aberration Compensating Element......Page 279
Correction of Chromatic Aberrations......Page 281
Summary......Page 283
Further Reading......Page 284
Overview of Conventional Lithography Systems......Page 285
Resolution Enhancement Techniques......Page 287
Micro-Optics in Conventional Lithography Systems......Page 288
Aperture Modulated Diffusers......Page 289
Fan-Out Based Diffusers......Page 291
Micro-Optics and Non-Conventional Lithography......Page 292
Introduction......Page 294
Design......Page 295
Multilevel Fabrication Using Binary Masks......Page 297
Continuous DOE Profiles Using Grayscale Lithography......Page 299
Further Reading......Page 301
Modes in Laser Resonators......Page 302
Design Principles......Page 303
Use of General Resonator Modes......Page 305
Examples......Page 306
Conclusion......Page 308
Introduction......Page 309
Multilayer Reflectors......Page 310
Diffraction Gratings......Page 313
Scattering (Incoherent Reflectors)......Page 318
Introduction......Page 320
Preform Fabrication......Page 321
Index Guided Fibers......Page 322
Bandgap Guided Fibers......Page 328
Tunable Microstructure Fiber Devices......Page 329
Further Reading......Page 330
Omnidirectional Reflecting Mirrors......Page 331
Fabrication Approach......Page 333
Materials Selection Criteria......Page 334
Bandstructure for Multilayer Fibers for External Reflection Applications......Page 335
Optical Characterization of ‘Mirror Fibers’......Page 336
Structure and Optical Properties of the Fabry-Perot Fibers......Page 337
Simulation of the Opto-Mechanical Behavior of the Fabry-Perot Fibers......Page 338
Mechanical Tuning Experiment and Discussion......Page 339
Wavelength-Scalable Hollow Optical Fibers with Large Photonic Bandgaps for CO2 Laser Transmission......Page 340
Further Reading......Page 343
Modeling Principles......Page 344
Free Propagation......Page 345
Propagation Through Elements......Page 347
Design Principles......Page 349
Fundamental Systems......Page 351
Further Reading......Page 356
Chromatic Dispersion in Optical Fiber Communication Systems......Page 357
Optical Nonlinearities as Factors to be Considered in Dispersion Compensation......Page 359
Dispersion Maps......Page 360
Corrections to Linear Dispersion Maps......Page 361
Fixed Dispersion Compensation......Page 362
Tunable Dispersion Compensation......Page 364
Chromatic Dispersion Monitoring......Page 367
Conclusion......Page 368
Further Reading......Page 369
Contrast Ratio......Page 370
Table of Emissive and Nonemissive Displays......Page 372
The Cathode Ray Tube (CRT)......Page 373
The Color CRT......Page 374
Field Emissive Displays (FEDs)......Page 375
Inorganic LED Displays......Page 376
Organic LEDs......Page 377
Thin Film EL (TFEL) Displays......Page 378
Reflective LCD......Page 379
Further Reading......Page 380
Introduction......Page 381
Theoretical Treatment of EIT in a Three-Level Medium......Page 382
Nonlinear Optical Processes......Page 384
Propagation and Wave-Mixing in a Doppler Broadened Medium......Page 385
Nonlinear Optics with a Pair of Strong Coupling Fields in Raman Resonance......Page 386
Pulse Propagation and Nonlinear Optics for Weak CW Fields......Page 387
Further Reading......Page 388
Backscattered Light......Page 389
Elements of a Doppler Lidar......Page 390
Description......Page 391
Applications of Coherent Doppler Lidar......Page 392
Description......Page 394
Heterodyne and Direct-Detection Doppler Trade-Offs......Page 396
Global Wind Measurements......Page 397
Further Reading......Page 398
Underlying Principles......Page 399
Spectral Mixture Analysis......Page 402
Geology......Page 403
Vegetation and the Environment......Page 404
Hyperspectral Remote Sensing of the Atmosphere......Page 406
Introduction......Page 407
Airborne DIAL Systems......Page 408
Global O3 Measurements......Page 409
Space-Based O3 DIAL System......Page 411
Global H2O Measurements......Page 412
H2O Raman Lidar Systems......Page 413
H2O DIAL Systems......Page 414
Tunable Laser Systems for Point Monitoring......Page 415
Laser Long-Path Measurements......Page 418
Further Reading......Page 419
Introduction......Page 420
Gases......Page 421
Spectral Resolution......Page 422
Scattering Methods......Page 423
An Example Atmospheric RT Model: MODTRAN......Page 424
Sun and Sky Viewing......Page 425
Further Reading......Page 427
Fiber Optics......Page 428
Active Fiber Compatible Components......Page 431
Telecommunications Technology......Page 432
Integrated Optics as a Future of Guided Wave Optics......Page 433
See also......Page 434
Effect of Dispersion on an Optical Signal......Page 435
Material Group Velocity Dispersion......Page 438
Waveguide Group Velocity Dispersion......Page 440
Further Reading......Page 442
Introduction......Page 443
MCVD Process......Page 444
VAD Process......Page 445
Fiber Drawing......Page 446
Fiber Coating......Page 447
Types of Fiber......Page 448
Light Propagation......Page 449
Attenuation......Page 452
Microbending Sensitivity......Page 455
Multimode Fiber Bandwidth......Page 456
Chromatic Dispersion......Page 457
Polarization Mode Dispersion......Page 459
Nonlinear Effects......Page 462
Fiber Geometry Characteristics......Page 464
Numerical Aperture......Page 466
Effective Area......Page 467
Further Reading......Page 468
Linear and Nonlinear Signatures......Page 469
Physical Origin of Optical Nonlinearity......Page 470
Parametric Phenomena in Optical Fibers......Page 471
Four-Wave Mixing......Page 472
Further Reading......Page 473
Introduction......Page 474
Second-Harmonic Generation in Crystals......Page 475
Quasi Phase Matching (QPM)......Page 477
Self-Phase Modulation (SPM)......Page 479
Propagation of a Pulse......Page 481
Cross Phase Modulation (XPM)......Page 482
Four-Wave Mixing (FWM)......Page 484
Supercontinuum Generation......Page 486
Conclusions......Page 487
Further Reading......Page 488
Tight Jacket......Page 489
Fiber Identification......Page 490
Strength Member......Page 491
Filling Compounds and Other Components......Page 492
Cable Structures......Page 493
Types and Applications......Page 494
Blown Fiber......Page 495
General Considerations......Page 496
Optical Connectors......Page 498
Branching Devices......Page 499
WDM......Page 500
Other Passive Optical Components......Page 502
Fiber Modes......Page 503
Fiber Grating Theory......Page 504
Photosensitivity......Page 508
Grating Inscription......Page 509
Optical Add/Drop Multiplexers......Page 510
Dispersion Compensator......Page 511
Optical Monitors......Page 512
Nonlinear Optics in Fiber Gratings......Page 513
Further Reading......Page 514
Fourier Transform Property of Lens......Page 515
4-f Coherent Optical Processor......Page 517
Spatial Filtering......Page 518
Complex Matched Spatial Filtering......Page 519
Joint Transform Correlator......Page 520
Further Reading......Page 522
The Cardinal Points and Planes of an Optical System......Page 523
Paraxial Matrices......Page 525
Using the Gaussian Constants......Page 526
Nodal Points and Planes......Page 529
Approximations for Thin Lenses......Page 530
The Paraxial System for Design or Analysis......Page 531
Reflectors......Page 532
Introduction......Page 533
Coma......Page 535
Astigmatism......Page 538
Curvature of Field and Distortion......Page 539
Conclusion......Page 540
Single Prisms as Reflectors......Page 541
Double Prism Reflectors......Page 543
Prisms as Instruments......Page 544
Further Reading......Page 545
Introduction......Page 546
Holographic Black Holes......Page 547
Public Display......Page 548
Object Holography......Page 550
Pulsed Lasers......Page 552
Pseudo Holograms......Page 553
Architectural Scale......Page 554
Twenty-First Century Art......Page 556
Further Reading......Page 557
Methodology of Holographic Recording and Replay......Page 558
Holographic Image Quality......Page 562
Underwater Holography......Page 563
Underwater Holographic Cameras......Page 564
The HoloMar System......Page 566
Further Reading......Page 567
Holographic Recording Materials......Page 568
Sensitivity of Photographic and Holographic Materials......Page 570
Processing of Silver Halide Emulsions......Page 571
Bleach Baths......Page 573
Dichromated Gelatin Materials......Page 574
Photopolymer Materials......Page 575
Thermoplastic Materials......Page 576
Bacteriorhodopsin......Page 577
Further Reading......Page 578
Overview......Page 579
Hologram of a Point Object......Page 580
Types of Holograms......Page 581
Recording Materials......Page 582
Application of Holography......Page 583
Further Reading......Page 584
The Development of Color Holography......Page 585
Silver Halide Materials......Page 586
Laser Wavelengths for Color Holograms......Page 587
Setup for Recording Color Holograms......Page 588
Processing of Color Holograms......Page 589
Computer-Generated Color Holograms......Page 590
The Future of Color Holography......Page 591
Further Reading......Page 592
From the Classical Hologram to the Computer-Generated Hologram: CGH......Page 593
From a Diffraction Grating to a Fourier CGH......Page 594
About Some CGH Algorithms......Page 596
Some CGH Applications......Page 598
Further Reading......Page 599
Direct Phase Reconstruction by Digital Holography......Page 600
The Fresnel Approximation......Page 603
Numerical Reconstruction by the Convolution Approach......Page 604
Numerical Reconstruction by the Lensless Fourier Approach......Page 605
Influences of Discretization......Page 606
See also......Page 607
Further Reading......Page 608
Double-Exposure Holographic Interferometry......Page 609
Real-Time Holographic Interferometry......Page 610
Digital Phase Measurement......Page 611
In-Plane Displacement Component Measurement......Page 612
Basic Interferometers for Dynamic Displacement Measurement......Page 613
Real-Time Time-Average Holographic Interferometry......Page 614
Flow Measurement......Page 615
The Dual Illumination Method......Page 616
Holographic Shearing Interferometry......Page 618
Further Reading......Page 619
Interferometric Sensitivity......Page 620
The Principle of Sandwich Holography......Page 621
Examples......Page 622
Introduction to Light-in-flight Recording by Holography......Page 623
Examples of Wavefront Studies......Page 624
Measuring the Shape of 3D Objects......Page 625
Further Reading......Page 626
Description......Page 628
Radiance Field......Page 629
Image Gathering......Page 630
Signal Coding......Page 631
Image Restoration......Page 632
Maximum-Realizable Fidelity F......Page 633
Electro-optical Design......Page 634
Information Rate, Fidelity and Robustness......Page 635
Information Rate and Visual Quality......Page 636
Further Reading......Page 638
Introduction......Page 639
Introduction......Page 648
Fast Corrections: Adaptive Optics......Page 650
Imaging Through the Atmosphere......Page 651
Adaptive System Design......Page 652
Future Development......Page 653
Other Adaptive Optics Applications......Page 654
Introduction......Page 655
Multispectral Imaging......Page 656
Interferometric Hyperspectral Imaging......Page 658
Spectral Resolution......Page 659
Grating-Dispersion Hyperspectral Imaging......Page 660
Software......Page 661
Further Reading......Page 663
Introduction......Page 664
Scattering Cross-Section......Page 665
Absorption Cross-Section......Page 666
Shallow Tissue Imaging Through Selection of Ballistic Photons......Page 667
Imaging in the Snake-Like Regime: Taking Advantage of Time (or Frequency) Gating......Page 669
Opto- (or Photo-)Acoustics......Page 671
Acousto-Optics......Page 672
Introduction......Page 673
History of Infrared Imaging......Page 676
Infrared Imager Performance......Page 677
General Characteristics of Infrared Imagers......Page 678
Testing Infrared Imagers: NETD, MTF, and MRTD (Sensitivity, Resolution and Acuity)......Page 680
Summary......Page 682
Modeling Infrared Imagers......Page 683
Further Reading......Page 684
Interferometric Imaging......Page 685
Introduction......Page 690
Frequency Modulated Continuous Wave (FM-CW) LIDAR......Page 691
Numerical Example......Page 692
Aperture Array......Page 693
Imaging Applications Involving Incoherent Sources......Page 694
Radar Range Equation......Page 695
Applications and Future Directions......Page 696
Introduction......Page 699
Spatially Incoherent Light......Page 701
Synthetic Apertures with Spatially Incoherent Objects......Page 702
Theta Rotating Interferometer......Page 704
The Two-Telescope Interferometer of Labeyrie......Page 705
Photon Density Wave Imaging......Page 706
Forward Problem: Photon Density Waves......Page 707
Inverse Problem and Imaging Algorithm......Page 709
Applications......Page 710
Further Reading......Page 711
Geometrical Optics Transformations......Page 712
Spatial Frequency Content of the 3D Point Spread Function......Page 713
Diffraction Calculation of the 3D Point Spread Function......Page 715
Volume Holographic Imaging......Page 716
Further Reading......Page 720
Imaging and Atmospheric Turbulence......Page 721
Speckle Imaging......Page 724
Blind Deconvolution......Page 725
Deconvolution from Wavefront Sensing......Page 726
Partially Redundant Pupil Masking......Page 727
Further Reading......Page 728
Incandescent Lamps......Page 729
Discharge Lamps......Page 731
Fluorescent Lamps......Page 732
High Intensity Discharge Lamps......Page 735
Further Reading......Page 737
Synchrotrons......Page 738
Insertion Devices......Page 740
Undulator Spectrum......Page 741
Timing Structure......Page 743
See also......Page 744
Introduction......Page 745
Principle of OTDM......Page 746
Packet Interleaved OTDM......Page 747
Optical Sources......Page 749
Mach-Zehnder (M-Z) Interferometers......Page 750
Sagnac Interferometers......Page 752
Four-Wave Mixing (FWM)......Page 754
Synchronization - Optical Phase-Locked Loops (PLL)......Page 755
OTDM Bit-Error Rate (BER) Performance......Page 756
Further Reading......Page 757
FT Without a Lens......Page 758
Object Before the Lens......Page 759
Serial Correlators......Page 760
Joint Transform Correlators......Page 762
Conclusion......Page 767
Introduction......Page 768
Historical Perspective......Page 769
Nonlinear Optical Element......Page 770
Absorptive Bistability......Page 771
Dispersive Bistability......Page 772
Four-Wave Mixing......Page 773
Optical Shadow Casting (OSC)......Page 774
Discrete Processors: Optical Matrix Processor......Page 775
Analog Optical Computing......Page 776
Further Reading......Page 777
Incoherent Image Formation......Page 778
Incoherent Spatial Filtering......Page 779
Incoherent Complex Matched Spatial Filtering......Page 780
Computed Tomography......Page 781
Further Reading......Page 783
SOA-XGM for Logic......Page 784
Integrated Optic Microring Resonator......Page 785
Further Reading......Page 786
Optical Digital Image Processing......Page 787
The Error Diffusion Algorithm......Page 788
The Hopfield-Type Neural Network......Page 790
The Error Diffusion Filter......Page 791
A Smart Pixel Implementation of the Error Diffusion Neural Network......Page 792
See also......Page 795
Neural Networks - Natural and Artificial......Page 796
Optical Processors......Page 797
Coherent Optical Fourier Transformation......Page 798
Photorefractive Neural Networks......Page 799
Conclusions......Page 800
Further Reading......Page 801
Astronomical Data......Page 802
Angular Resolution......Page 803
Imaging......Page 804
Spectroscopy......Page 805
Spectropolarimetry and Polarimetry......Page 813
Use of Optical Fibers in Astronomy......Page 815
Structure......Page 816
Further Reading......Page 817
Materials......Page 818
Interaction Between Light and Materials......Page 819
Ellipsometry Measurements......Page 820
Single Wavelength Ellipsometry (SWE)......Page 821
Variable Angle Ellipsometry......Page 822
Data Analysis......Page 823
Film Thickness......Page 824
Optical Constants......Page 825
Composition......Page 826
See also......Page 827
Fundamentals of Photometry......Page 828
Photometric Quantities......Page 829
Concepts of Advanced Photometry......Page 831
Primary Standards......Page 832
Secondary Type Measurements......Page 833
List of Units and Nomenclature......Page 836
Introduction......Page 838
Quantifying Scattered Light......Page 839
Angle Resolved Scatterometers......Page 840
TIS Instruments......Page 842
Analyzing Scatter from Surface Roughness......Page 843
Conclusion......Page 844
Prisms......Page 845
Gratings......Page 847
Imaging Spectrometers......Page 850
Further Reading......Page 856
Basic Imaging Theory......Page 857
Reflection at a Spherical Surface......Page 858
Reflector Surfaces......Page 859
Angular Spherical Aberration......Page 861
Chromatic Aberration......Page 862
Refractor Objectives......Page 863
Basic Oculars......Page 864
Reflectors......Page 865
Metrology......Page 866
Astronomical Telescopes......Page 869
Further Reading......Page 871
Plane Parallel Plate......Page 872
Fizeau Interferometer......Page 873
Radial Shear Interferometer......Page 874
Sagnac Interferometer......Page 875
Phase-Shifting Interferometry......Page 876
Further Reading......Page 877
Introduction......Page 878
Interferometer Styles......Page 879
Noise Sources and Interferometer Sensitivity......Page 882
Further Reading......Page 884
Basic Parts of a Phase-Measuring Interferometer......Page 885
Common Interferometer Types......Page 886
Phase Modulation Techniques......Page 887
Ramping Versus Stepping......Page 888
Phase Unwrapping......Page 889
Overview of Phase Measurement Algorithms and Techniques......Page 890
Algorithm Design......Page 891
Spatial Carrier-Frequency Technique......Page 892
Extended Range Phase Measurement Techniques......Page 893
Other Types of Systematic Errors to Consider......Page 894
Further Reading......Page 895
Interference in Thin Film......Page 896
Polarization Microscope......Page 897
White Light Interference......Page 898
Position of Fringes Under Envelope Due to Reflection of Dielectric......Page 899
Changes in Envelope and Fringes Due to Dispersion......Page 900
Controlled Phase Shift of Fringes Under the Envelope - Geometric Phase Shift......Page 901
Surface Topography and Object Structure Measurement......Page 902
Signal Processing of White Light Interferograms......Page 903
Scanner Nonlinearity......Page 904
Film Thickness Measurement......Page 906
Spatial Coherence Effects in the Interference Microscope......Page 907
Further Reading......Page 908
Brief History......Page 909
Characteristics......Page 910
Resonant Energy Transfer......Page 911
Stimulated Emission......Page 912
Optical Resonator......Page 913
Slow Axial Flow Lasers......Page 915
Diffusion Cooled Laser......Page 916
Conclusions......Page 917
Further Reading......Page 919
Background......Page 920
Brief History of Dye Lasers......Page 921
Laser-Pumped Pulsed Dye Lasers......Page 922
Continuous Wave Dye Lasers......Page 924
Multiple-Prism Dispersion Grating Theory......Page 925
Physics and Architecture of Solid-State Dye-Laser Oscillators......Page 926
Xanthenes......Page 928
Conjugated Hydrocarbons......Page 929
Solid-State Laser Dye Matrices......Page 930
Organic Hosts......Page 931
Dye Laser Applications......Page 932
The Future of Dye Lasers......Page 933
Introduction......Page 934
Further Reading......Page 940
Background: Why Excimer Lasers......Page 941
Excimer Laser Fundamentals......Page 944
Discharge Technology......Page 948
See also......Page 950
Introduction......Page 951
Principles of FEL Operation......Page 952
The Quantum-Theory Picture......Page 954
The Classical Picture......Page 957
Principles of FEL Theory......Page 959
The Pierce Dispersion Equation......Page 960
The FEL Gain Regimes......Page 961
Super-Radiance, Spontaneous Emission, and Self-Amplified Spontaneous Emission (SASE)......Page 963
Saturation Regime......Page 964
FEL Accelerator Technologies......Page 967
Magnetic Wiggler Schemes......Page 973
FEL Oscillators......Page 974
Copper Vapor Lasers......Page 977
Afterglow Recombination Metal Vapor Lasers......Page 981
Continuous-Wave Metal Ion Lasers......Page 982
History......Page 984
Theory of Operation......Page 985
Operating Characteristics......Page 987
Technology......Page 988
Further Reading......Page 991
Optical Fiber Lasers......Page 992
Fiber Laser Fundamentals......Page 993
Continuous Wave Fiber Lasers......Page 995
Pulsed Fiber Lasers......Page 996
Other Fiber Lasers......Page 1000
Further Reading......Page 1001
Conjugated Polymers......Page 1002
Organic Semiconductor Gain Materials......Page 1003
Measuring Gain......Page 1004
Polymer Laser Resonators......Page 1006
Towards Plastic Diode Lasers......Page 1008
Further Reading......Page 1009
Planar Waveguides......Page 1010
Lasers......Page 1011
Waveguide Laser Materials......Page 1012
Distributed Bragg Reflector (DBR) PWL......Page 1014
Distributed Feedback (DFB) PWL......Page 1015
PWL Arrays......Page 1016
PWL Stability......Page 1017
Further Reading......Page 1018
Basic Principles......Page 1019
Physics of the Gain Medium......Page 1021
The Resonator......Page 1022
Semiconductor Laser Dynamics......Page 1024
Introduction......Page 1025
Energy Transfers......Page 1026
Photon Avalanche......Page 1027
Up-Conversion from Second-Order Optical Nonlinearity......Page 1029
Energy Transfers......Page 1030
Sequential Two-Photon Absorption......Page 1031
Photon Avalanche......Page 1033
The Self-Frequency Doubling Laser......Page 1034
See also......Page 1035
LASER-INDUCED DAMAGE OF OPTICAL MATERIALS......Page 1036
Further Reading......Page 1038
Physical Fundamentals......Page 1039
White LED Structures......Page 1041
Application of LEDs......Page 1042
Further Reading......Page 1043
Introduction......Page 1044
Basic Principles......Page 1045
Basic Principles......Page 1046
Examples of Experimental Results......Page 1047
Effect of a Constant Transverse Magnetic Field - The Hanle Effect......Page 1049
Examples of Experimental Results for Optical Pumping and Optically Detected Magnetic Resonances......Page 1050
Further Reading......Page 1052
Theoretical Background......Page 1053
Experimental......Page 1055
Further Reading......Page 1057
Introduction......Page 1058
Definitions......Page 1059
Symmetry......Page 1060
Measurements......Page 1061
See also......Page 1067
Introduction......Page 1068
Theory of the Thin Sample z-Scan......Page 1069
Thick Sample z-Scan......Page 1072
Example of z-Scan: Effective Cubic Nonlinearity of Liquid Crystals......Page 1073
Discussion: Effective n2 of Other Materials......Page 1074
Introduction......Page 1076
Purely Optically Induced Orientation......Page 1079
Photorefractive Effect......Page 1080
Supra-Nonlinear Methyl-Red Doped NLC - Observed Phenomena......Page 1081
Thermal and Density Effect......Page 1082
Second and Third Order Nonlinear Susceptibilities......Page 1083
Concluding Remarks......Page 1084
Macroscopic......Page 1085
Molecules......Page 1086
Bulk Materials......Page 1087
Importance of Order......Page 1093
Two-Photon Absorbing Materials......Page 1095
Composite Materials......Page 1100
Further Reading......Page 1103
The Conventional Microscope......Page 1104
Conventional and Scanning Microscopes......Page 1108
Probe Microscopes......Page 1109
Other Radiation......Page 1110
Raman Microscopy......Page 1111
Introduction......Page 1112
Image Formation in Scanning Microscopes......Page 1113
Applications of Depth Discrimination......Page 1115
Fluorescence Microscopy......Page 1116
The Use of Structured Illumination to Achieve Optical Sectioning......Page 1118
Summary......Page 1119
Introduction......Page 1120
Multi-photon Excitation Selection Rules......Page 1121
Enhanced Microscope Resolution Based on Multi-photon Excitation......Page 1122
Basic Imaging Applications of Multi-Photon Microscopy......Page 1123
4Pi Confocal Microscopy......Page 1124
Multifocal Multi-Photon Microscopy (MMM)......Page 1125
Multi-Photon 3D Microfabrication......Page 1126
Background......Page 1127
Polarization-Division Interference Microscopes......Page 1128
White Light Optical Profilers......Page 1130
Interferometric Microscope Objectives......Page 1132
Applications......Page 1133
Laser-based Interference Microscopes......Page 1134
Introduction......Page 1135
Basic Principles of Two-Photon Microscopy (TPM)......Page 1136
Fluorescent Probes......Page 1137
Two-photon Instrumentation......Page 1138
Fluorescence Lifetime Imaging (FLIM)......Page 1139
Fluorescence Correlation Spectroscopy (FCS)......Page 1141
Second-Harmonic Generation Microscopy......Page 1142
Coherent Anti-Stokes Raman Scattering (CARS)......Page 1144
Conclusion......Page 1145
Introduction......Page 1146
Dark Field Microscopy......Page 1148
Schlieren Imaging, Hoffman Modulation Contrast, and Differential Phase Contrast (DPC)......Page 1149
Interference Microscopy......Page 1150
Differential Interference Microscopy (DIC)......Page 1151
Digital Phase Retrieval......Page 1152
Further Reading......Page 1153
The Photo-Elastic Effect......Page 1154
Diffraction by Acoustic Waves......Page 1155
Anisotropic Bragg Diffraction......Page 1157
AO Devices......Page 1158
AOTFs......Page 1160
Modulators......Page 1162
List of Units and Nomenclature......Page 1163
Electro-Optics......Page 1164
Basic Relationships......Page 1167
Materials......Page 1168
Devices......Page 1170
Introduction......Page 1172
The Modulation of Semiconductor Lasers......Page 1173
Electro-optic Modulation of Light......Page 1176
Demodulation using PIN Photodiodes......Page 1178
Demodulation using Avalanche Photodiodes......Page 1179
Wavelength Selective Demodulation......Page 1180
Further Reading......Page 1181
Classical Versus Quantum Waves......Page 1182
Nonclassical Light from a Single Atom......Page 1184
Application of Nonclassical Light......Page 1186
Further Reading......Page 1187
Introduction......Page 1188
Relativistic Optics......Page 1189
Some Examples of Relativistic Nonlinear Optics......Page 1191
Further Reading......Page 1195
Introduction......Page 1196
Birefringent Phase Matching......Page 1199
Quasi-Phase Matching (QPM)......Page 1201
Compensated Phase Matching (CPM)......Page 1203
Acceptance Bands......Page 1204
See also......Page 1205
Grating-Fiber Compressors......Page 1206
Soliton-Effect Compressors......Page 1208
Compression of Fundamental Solitons......Page 1209
Raman Lasers......Page 1211
Raman Shifters......Page 1212
Raman Lasers......Page 1214
Introduction......Page 1219
Nonlinear Index of Refraction......Page 1221
Self-Lensing......Page 1222
Beam Trapping......Page 1223
Self-Focusing in Space-Time......Page 1227
Other Related Effects (Multiphoton Nonlinearities)......Page 1229
Further Reading......Page 1231
3D Confinement......Page 1232
Opto-Mechanical Implementation......Page 1234
Excitation Sources......Page 1235
Structural Materials......Page 1236
Multiphoton Initiators......Page 1239
Resolution......Page 1240
Multibeam-Interference Three-Dimensional Microfabrication......Page 1242
Structures and Functional Devices......Page 1244
Further Reading......Page 1249
Nonlinear Phase Shift in Collinear Second Harmonic Generation......Page 1250
Frequency Shifting......Page 1253
An Optical Diode......Page 1254
List of Units and Nomenclature......Page 1255
Introduction......Page 1256
General Considerations......Page 1257
Basic Equations......Page 1258
Steady-State Solutions......Page 1259
Phase Matching......Page 1260
Gaussian Beams and Pulses......Page 1263
Further Reading......Page 1265
Polarization and the Third-Order Susceptibility......Page 1266
Wave Equation for Third-Harmonic Generation......Page 1267
Photonic Crystal......Page 1268
Defect Mode Using Photonic Crystal......Page 1269
Applications......Page 1270
Introduction......Page 1271
Relation Between Field and Polarization: The Response Function......Page 1272
Example and Selected Applications......Page 1273
Degenerate Four-Wave Mixing and Phase Conjugation......Page 1274
Conclusion......Page 1276
Kramers-Kronig Relations in Nonlinear Optics......Page 1277
Further Reading......Page 1282
Nomenclature Associated with the Excitation Light......Page 1283
Nonlinear Susceptibility......Page 1285
Complex Quantities......Page 1286
Nonlinear Absorption......Page 1287
Further Reading......Page 1289
Nonlinear Optical Phase Conjugation......Page 1290
Further Reading......Page 1293
The Standard Rate Equation Model......Page 1294
Coupled Wave Equations......Page 1296
Organic Materials......Page 1297
Distortion Compensation by Phase Conjugation......Page 1298
Optical Limiting, the Novelty Filter, and Laser Ultrasonic Inspection......Page 1299
List of Units and Nomenclature......Page 1300
Introduction......Page 1301
Self-Focusing, Supercontinuum Generation, and Filamentation......Page 1302
High-Harmonic Generation......Page 1303
Further Reading......Page 1304
Introduction......Page 1305
Typical Experimental Parameters......Page 1306
Second Step: Free-electron Trajectory......Page 1307
Feynman Path-Integral Approach......Page 1308
HHs Generated by Few-Optical-Cycle Pulses......Page 1309
HHs Beam Characteristics......Page 1310
Spectrometers and Monochromators for HHs......Page 1311
HHs at Work: Applications and Perspectives......Page 1312
Further Reading......Page 1313
Amplifier Gain and Bandwidth......Page 1314
Gain Saturation......Page 1315
Basic Amplifier Configurations......Page 1316
Further Reading......Page 1317
Introduction......Page 1318
Components for EDFAs......Page 1319
The Single-Stage Amplifier......Page 1320
Multiple-Stage EDFAs......Page 1321
Advanced EDFA Functions......Page 1326
Further Reading......Page 1327
Key Elements......Page 1328
Key Optical-Amplifier Parameters......Page 1329
Erbium Fiber Amplifier......Page 1330
Raman Fiber Amplifier......Page 1332
Distributed Amplification......Page 1333
Hybrid Amplification......Page 1334
Wavelength-Division Multiplexing......Page 1335
Terrestrial Systems......Page 1337
Unrepeatered Systems......Page 1338
See also......Page 1339
Fiber Raman Amplifiers......Page 1340
Gain Saturation......Page 1341
FRA Performance and Applications......Page 1343
Brillouin Gain and Bandwidth......Page 1344
Gain Saturation......Page 1345
FBA Performance and Applications......Page 1346
Phase-Matching in Parametric Processes......Page 1347
Parametric Gain......Page 1348
Amplifier Gain and Bandwidth......Page 1349
Amplifier Performance and Applications......Page 1350
Basic Principles......Page 1351
SOA Structures......Page 1352
Basic Applications of SOAs in Optical Communication Systems......Page 1353
Pattern Effects and Crosstalk......Page 1355
Ultrashort Pulse Amplification......Page 1356
Functional Applications......Page 1357
See also......Page 1358
Introduction......Page 1359
Historical Perspective......Page 1360
Watermarks......Page 1361
Laser Perforation......Page 1362
Moire Effects......Page 1363
Devices Based on Light Diffraction......Page 1364
Thin-Film Foils......Page 1368
Light Interference Inks......Page 1369
Devices Based on Light Polarization......Page 1370
Coatings Based on Diffractive Pigments......Page 1371
Aluminum Flake Based Inks and Coatings......Page 1372
Summary and Challenges for the Future......Page 1373
Diamond Optical Devices and Coatings......Page 1374
Diamond Growth by CVD......Page 1376
Absorption, Reflection, and Transmission in Diamond......Page 1377
Impurities in Diamond......Page 1378
Electroluminescence in Poly-C......Page 1379
Poly-C Applications......Page 1380
Optical MEMS......Page 1381
Fundamental Damage Mechanisms in Thin Films......Page 1382
Units and Scaling of Laser-Induced Damage Threshold......Page 1385
Measurement of Laser-Induced Damage Thresholds......Page 1386
Optical Coatings for High Power Lasers......Page 1387
Summary......Page 1390
Further Reading......Page 1391
Introduction......Page 1392
Uses of Black Surfaces......Page 1393
Selection of Optical Black Surfaces......Page 1394
Optical Characterization......Page 1396
Future Developments......Page 1400
See also......Page 1402
Thin Film Optical Coatings......Page 1403
Design of Optical Coatings......Page 1404
Production of Optical Coatings......Page 1407
Quality Parameters of Optical Coatings......Page 1410
Summary and Outlook......Page 1411
X-Ray Coatings......Page 1412
X-rays Compared to the Other Electromagnetic Spectral Regions......Page 1413
Total Reflection X-Ray Mirrors......Page 1415
Multilayer X-Ray Mirrors......Page 1416
Further Reading......Page 1418
Introduction......Page 1419
Optical Fiber......Page 1420
Communication Components......Page 1422
Data Modulation Formats......Page 1425
Data Multiplexing......Page 1426
Dispersion Management and Compensation......Page 1428
System Performance Parameters......Page 1429
Introduction......Page 1430
Optical Fiber Development......Page 1431
Fielded Systems......Page 1434
Entrance of Wavelength Division Multiplexing......Page 1435
Undersea Optical Cable Systems......Page 1436
Simple Optical Fiber Links......Page 1437
Networks of Links......Page 1439
Single-Span WDM Links......Page 1441
Passive Optical Networks......Page 1442
Wavelength-Routed Optical Networks......Page 1443
Introduction......Page 1445
Atmospheric Losses......Page 1446
Turbulence and Scintillation......Page 1448
The Light Source......Page 1449
The Detector System......Page 1450
Recent Research and Future Considerations......Page 1451
Semiconductor Laser Principles......Page 1452
DWDM Transceivers......Page 1453
Directly Modulated Lasers......Page 1454
Requirements for Externally Modulated Lasers......Page 1455
Reliability of Lasers in Fiber Optic Systems......Page 1456
Further Reading......Page 1457
Introduction......Page 1458
Overview......Page 1459
IEEE 802.3 100Mb/s Ethernet (Fast Ethernet)......Page 1460
IEEE 802.3 1Gb/s Ethernet (Gigabit Ethernet)......Page 1461
Overview......Page 1462
MAC Protocol......Page 1464
Overview......Page 1465
MAC Protocol......Page 1466
Further Reading......Page 1467
Principles of Time Division Multiplexing......Page 1468
Ultra-Short Optical Pulse Sources......Page 1470
Transmission of an OTDM Signal over Fiber......Page 1471
Demultiplexing of OTDM Signal at Receiver......Page 1472
OTDM Networking Issues......Page 1473
Conclusion......Page 1474
Further Reading......Page 1475
Dense Wavelength Division Multiplexing (DWDM)......Page 1476
Point-to-Point DWDM Links......Page 1477
DWDM Networks......Page 1478
Optical Amplifiers......Page 1479
Optical Filters......Page 1480
Chromatic Dispersion......Page 1481
Conclusions......Page 1482
Bandpass Filters......Page 1483
Dispersion Filters......Page 1485
Longpass/Shortpass Filters......Page 1486
Absorber Glasses......Page 1488
Introduction......Page 1489
The Optical Nanoparticle Resonances (Mie Resonances)......Page 1491
The Theory of Mie......Page 1492
Beyond Mie’s Theory......Page 1494
Aggregates of Nanoparticles with Electrodynamic Particle-Particle Coupling......Page 1495
Size Effects......Page 1496
Electronic Interface States......Page 1498
The Static Interface Charge Transfer and its Effects Upon Mie Resonance Positions......Page 1499
Experimental Results......Page 1500
Comparison of Plasmon Polariton Lifetimes with Femtosecond Experiments......Page 1501
Further Reading......Page 1502
Lightweight Mirrors......Page 1503
Direct Generation......Page 1504
Glass, Glass Ceramics, and Hybrids......Page 1505
Silicon Carbide......Page 1506
Composites......Page 1507
Grazing Incidence Mirrors......Page 1508
Introduction......Page 1509
Free Electron Oscillators......Page 1510
The Dipole Oscillator Model (Lorentz Oscillator)......Page 1511
The Refractive Index......Page 1512
Local Field Correction......Page 1513
Measurable Optical Parameters......Page 1514
Metals......Page 1516
Crown and Flint Glasses......Page 1517
Dispersion Formulae for Optical Glasses......Page 1518
Athermal Lenses......Page 1519
Ultraviolet-Transmitting Glasses......Page 1520
Infrared-Transmitting Glasses......Page 1521
Further Reading......Page 1522
Introduction......Page 1523
The First Plastic Lenses in Cameras......Page 1524
Thermoplastics......Page 1525
Elastomers......Page 1528
Compression Molding......Page 1529
Series Production......Page 1530
Further Reading......Page 1531
Structure in Thin Films......Page 1532
Optical Thin Films......Page 1535
Current Research......Page 1537
Further Reading......Page 1539
Photochromic Glass......Page 1540
Thermochromic Materials......Page 1541
Electrochromic Materials......Page 1542
Nonlinear Optical Materials......Page 1543
Emissive Materials......Page 1544
Sol-Gel Chemistry of Oxide Optical Materials......Page 1547
Shapes, Compositions and Precursor Concentrations......Page 1548
Sol-Gel Fiber Processing......Page 1549
After the Sol-Gel Transition......Page 1550
Encapsulation......Page 1551
Introduction......Page 1552
The Paraxial Approximation......Page 1553
Introduction to Third-Order Monochromatic Aberrations......Page 1557
Spherical Aberration......Page 1558
Introduction......Page 1560
Fraunhofer and Fresnel Diffraction......Page 1561
Diffraction in a Lens System: The Point Spread Function (PSF)......Page 1562
The Rayleigh and Marechal Criterion......Page 1563
Transfer Functions......Page 1564
Microlens Testing......Page 1565
Surface Profile Measurements......Page 1566
Measurement of the Optical Lens Performance......Page 1567
Summary of the Interferometric Instrumentation......Page 1571
Acknowledgment......Page 1572
Further Reading......Page 1573
What is an Optical Parametric Device......Page 1574
Nonlinear-Optical Origins of OPDs......Page 1575
Nanosecond-Pulsed OPOs - Design and Wavelength Control......Page 1576
Ultrafast-Pulsed OPOs......Page 1580
OPGs and DFGs - Dispensing with the Optical Cavity......Page 1581
Introduction......Page 1582
Basic Principle......Page 1583
Output Power......Page 1584
Frequency Control......Page 1585
SRO......Page 1586
DRO......Page 1587
Linewidth and Frequency Stability......Page 1588
Device Development Trends......Page 1589
Three-Wave Mixing in a χ(2) Medium with Focused Waves......Page 1590
Steady-State Description of the cwOPO......Page 1591
Synchronous Pumping......Page 1593
Dispersion in Nonlinear Media and Optical Glasses......Page 1594
Pump Sources......Page 1595
Tuning Characteristics and Methods......Page 1596
KTP......Page 1597
Fused Silica, BK7, SF2, SF5, SF10, SF11, SF18......Page 1598
Further Reading......Page 1599
Introduction......Page 1600
Optical Computing......Page 1601
Image Processing......Page 1602
Fourier Transform......Page 1603
Mitigation Processing in Nonideal Systems......Page 1604
Optical Transfer Functions for Image Motion and Vibration......Page 1605
Optical Transfer Functions for the Atmosphere......Page 1606
Spatial Filtering......Page 1607
Further Reading......Page 1608
Theory......Page 1609
Force Calibration by Viscous Drag......Page 1610
Design......Page 1612
Biopolymers......Page 1616
Further Reading......Page 1617
Introduction......Page 1618
Phase Conjugation by Holography......Page 1619
Pseudoscopic Image......Page 1621
One Way Phase Conjugation......Page 1622
Background and Basic Concepts......Page 1624
Wavefront Coding Theory......Page 1625
Wavefront Coding in Imaging Systems with Extended Depth of Field or Focus......Page 1628
Further Reading......Page 1636
The Origins of Quantum Optics......Page 1637
The Photon in Modern Physics......Page 1640
Further Reading......Page 1643
Radiative Decay and Photon-Atom Binding in a PC......Page 1644
EIT and Cross-Coupling of Photons in Doped PCs......Page 1647
Further Reading......Page 1650
Maxwell’s Equations in Periodic Media......Page 1651
The Origin of the Photonic Bandgap......Page 1652
Semi-analytical Methods: Perturbation Theory......Page 1654
Two-Dimensional Photonic Crystals......Page 1655
Photonic-Crystal Slabs......Page 1656
Three-Dimensional Photonic Crystals......Page 1657
Further Reading......Page 1658
Wire Mesh Photonic Crystals......Page 1659
Defect States......Page 1660
Capacitive Mesh Photonic Crystals......Page 1661
Surface Waves......Page 1662
High-Impedance Surfaces......Page 1665
Further Reading......Page 1669
Introduction......Page 1670
Dispersion in PCF......Page 1671
Nonlinear Phenomena......Page 1673
Supercontinuum Generation......Page 1674
Optical Switching in PCF......Page 1675
Conclusion......Page 1676
Electromagnetics of Photonic Crystals......Page 1677
Photonic Crystal Resonant Cavities and Lasers......Page 1679
Photonic Crystal Waveguides......Page 1682
Further Reading......Page 1685
Photonic Bandgap......Page 1686
The Self-Assembly Technique......Page 1687
Formation of Colloidal Lattices......Page 1688
Colloidal Lattices as Photonic Crystals......Page 1689
Colloidal Lattices as Templates......Page 1690
Functionalization......Page 1692
Tunable Photonic Crystals......Page 1693
Further Reading......Page 1694
Introduction......Page 1695
Scanning Near-Field Optical Microscopy (SNOM)......Page 1697
Cluster Physics......Page 1698
FELs in the Extended UV to X-ray Region......Page 1699
Further Reading......Page 1700
Theory......Page 1716
Experimental......Page 1717
Surface Specificity......Page 1718
Applications......Page 1719
Further Reading......Page 1720
Introduction......Page 1721
Birefringent Materials (Calcite)......Page 1723
Dichroic Absorbers......Page 1725
Reflection and Transmission......Page 1726
Miscellaneous Types......Page 1728
Retarders or Retardation Plates......Page 1729
Variable Retardation Plates and Compensators......Page 1731
Electro-Optic, Magneto-Optic, and Piezo-Optic Devices......Page 1732
Matrix Methods for Computing Polarization......Page 1734
Further Reading......Page 1735
Polarization Ellipse......Page 1736
Stokes Parameters......Page 1737
Mueller Calculus......Page 1738
Jones Vector......Page 1739
Jones Calculus......Page 1740
Further Reading......Page 1741
Computer to Plate Printing......Page 1702
Materials Processing - Heating......Page 1704
Materials Processing - Localized Heating......Page 1705
Semiconductor Lithography......Page 1706
Fine Control Material Removal......Page 1707
Microscopy......Page 1709
Raman Methods and Microscopy......Page 1710
Infrared Methods......Page 1712
Further Reading......Page 1714
Brief History......Page 1742
Electromagnetic Field as a Collection of Harmonic Oscillators......Page 1743
Mode Functions......Page 1744
Canonical Quantization......Page 1745
Space of Quantum States......Page 1746
Entangled States......Page 1747
Further Reading......Page 1748
Modification of the Spontaneous Transition Rate in Confined Space......Page 1749
Maser Operation......Page 1750
Generation of Number States (Fock States) of the Radiation Field......Page 1752
Other Cavity Experiments......Page 1753
Microlasers......Page 1754
Introduction......Page 1755
Quantized States: Observation via Strong Light-Matter Interaction......Page 1757
Superposition Principle: Observation of Quantum Optical Correlations......Page 1759
Heisenberg Uncertainty Principle: Squeezing of Light Emission......Page 1761
Summary......Page 1762
Atom Optics......Page 1763
Nanofabricated Atom Optics......Page 1765
Standing Waves of Light......Page 1767
Kapitza-Dirac Scattering......Page 1772
Comparison of Atom-Standing Wave Interactions......Page 1774
Atom Interferometry......Page 1776
Conclusion......Page 1777
Electromagnetically Induced Transparency......Page 1778
Introduction......Page 1780
Bringing Light to a Halt: Frozen Light......Page 1781
Storing and Retrieving Quantum Information......Page 1782
Introduction......Page 1783
Schemes with Population Inversion at the Driving Transition......Page 1785
Further Reading......Page 1786
Multiqubit States and Entanglement......Page 1787
Creating Entangled States Experimentally......Page 1788
Quantum Key Distribution......Page 1790
Quantum Superdense Coding......Page 1791
Quantum Teleportation......Page 1792
Lithography......Page 1793
Further Reading......Page 1794
Laser Cooling of Ions in a Radiofrequency Trap......Page 1795
Sideband Cooling......Page 1796
Generation of Chaos During the Cooling Process and Phase Transitions to Ordered Configurations......Page 1798
Conclusion......Page 1801
Further Reading......Page 1802
Quantum Computation: Basic Definitions......Page 1803
Single-Qubit Rotations......Page 1805
Conditional Dynamics: Controlled-Phase Gate......Page 1806
Quantum Computing with Atoms II: Ion Traps......Page 1807
The Cirac and Zoller Quantum Computer......Page 1808
Frequency Estimations as a Quantum Computation: Building Frequency Standards with Entangled States......Page 1809
Further Reading......Page 1810
Introduction......Page 1811
Dynamics of a Parametrically Excited Crystal......Page 1812
Physical Interactions Giving Phonon Squeezing......Page 1814
Experimental Generation and Detection of Squeezed Phonons by Second Order Raman Scattering......Page 1815
KTaO3......Page 1816
Further Reading......Page 1818
Introduction......Page 1819
Definitions......Page 1820
Single-Particle Motion......Page 1822
Constants of the Motion......Page 1823
Role of Initial Phase......Page 1826
Radiation from Relativistic Electrons......Page 1827
Collective Plasma Response......Page 1829
Propagation......Page 1830
Relativistic Self-Focusing......Page 1831
Raman Scattering, Plasma Wave Excitation and Electron Acceleration......Page 1832
Interactions with Solid-Density Targets......Page 1834
Further Reading......Page 1837
Raman Scattering......Page 1838
Stimulated Raman Scattering......Page 1839
Transient Effects......Page 1841
Anti-Stokes Raman Scattering......Page 1842
Scattering from Rough Surfaces......Page 1843
Scattering from Dielectric Thin Films......Page 1844
Angle Resolved Scattering (ARS)......Page 1845
Fields of Applications......Page 1846
Examples of Measurements......Page 1847
Further Reading......Page 1849
General Features......Page 1850
Stimulated Raman Scattering......Page 1851
Stimulated Brillouin Scattering......Page 1853
Further Reading......Page 1854
Helmholtz Equation. Hermann Ludwig von Helmholtz (1821-1894)......Page 1855
Silver-Müller Radiation Conditions......Page 1856
Huygens’ Principle (Green’s Theorem). Christiaan Huygens (1629-1695)......Page 1857
Mie Scattering......Page 1858
Stimulated Scattering......Page 1859
Stimulated Raman Scattering......Page 1860
Stimulated Brillouin and Rayleigh Scattering......Page 1863
Stimulated Brillouin Scattering......Page 1865
Stimulated Rayleigh Scattering......Page 1866
Parametric Fluorescence......Page 1867
Self-Focusing......Page 1868
Further Reading......Page 1869
Amorphous Structure......Page 1870
Amorphous Semiconductor Energy Bands......Page 1871
Preparation of Amorphous Silicon......Page 1872
Optical Properties of Amorphous Semiconductors......Page 1873
Electrical Properties of Amorphous Semiconductors......Page 1874
Further Reading......Page 1875
Suppression of Auger Recombination......Page 1876
Interband (Bipolar) Lasers......Page 1878
Phonon Scattering: QWIPs and QC Lasers......Page 1879
Further Reading......Page 1880
Energy Band Structure, Optical and Magneto-optical Properties of SMSs......Page 1881
Magnetic Properties......Page 1884
Low-Dimensional (LD) SMSs: Magneto-optical and Spintronic Effects......Page 1885
Further Reading......Page 1886
GaAs Bandstructure......Page 1887
Temperature Dependence......Page 1888
GaAs/AlxGa(1-x)As - The Most Significant Alloy......Page 1889
Optical Absorption......Page 1890
Refractive Index......Page 1891
Further Reading......Page 1892
Material Properties and Strain......Page 1893
Electronic Properties and Band Structure......Page 1895
Optical Properties......Page 1898
Conclusions......Page 1900
Introduction to InGaN......Page 1901
Optical Energy Relationships......Page 1902
Bandgap of InN......Page 1903
Structure of InGaN Epilayers......Page 1904
II-VI Materials Growth......Page 1906
Introduction......Page 1907
Transmission and Absorption of Light within II-VI Materials......Page 1909
Light-Semiconductor Interactions......Page 1910
Quantum Wells, High Carrier Density Effects, and Lasers......Page 1912
Further Reading......Page 1913
Basic Properties......Page 1914
Band Structure......Page 1915
Optical Properties......Page 1916
Infrared Lasers......Page 1917
Infrared Detectors......Page 1919
Summary......Page 1920
Introduction......Page 1921
Fundamental Material Properties......Page 1922
Carrier Lifetime Mechanisms......Page 1923
Substrates for Epitaxial Growth......Page 1926
HgCdTe Infrared Detector Configurations......Page 1927
PV HgCdTe Detectors......Page 1928
MIS HgCdTe Detectors......Page 1929
Two-Dimensional HgCdTe Infrared Focal Plane Arrays......Page 1930
Further Reading......Page 1931
Instrumentation......Page 1932
Bulk/Thin-Film Material......Page 1933
Surfaces/Interfaces......Page 1934
Micro- and Nanostructures......Page 1935
Summary......Page 1936
Introduction......Page 1937
Lithography for Quantum Dots......Page 1938
Growth of Self-Assembled Quantum Dots......Page 1939
Optical Properties......Page 1940
Lasers and Self-Assembled Quantum Dots......Page 1943
Single-Dot Spectroscopy......Page 1944
Introduction......Page 1946
InAs/GaSb/AlSb-Based Type-II Superlattices and Quantum Wells......Page 1948
Type-II IR Photodetectors......Page 1949
Type-II IR Lasers......Page 1951
Further Reading......Page 1954
Introduction......Page 1955
Density of States......Page 1956
Energy Bands and Energy Gaps: Semiconductors......Page 1957
Electrons and Holes......Page 1958
Infrared Absorption - Interband Optical Transitions, Excitons......Page 1959
Low Dimensional Systems: Quantum Wells......Page 1960
Introduction......Page 1961
k.p Theory......Page 1962
Narrow-Gap Semiconductors......Page 1963
Density of States......Page 1964
Quantum Wells......Page 1965
Further Reading......Page 1966
Optical Properties......Page 1967
Influence of Quantum Confinement......Page 1968
Exciton Scattering......Page 1970
Introduction......Page 1971
Dopants......Page 1972
Native Defects......Page 1973
1D Defects......Page 1974
2D Defects......Page 1976
Interactions of Defects......Page 1977
Further Reading......Page 1978
Phonons in a Diatomic Linear Chain Lattice......Page 1979
Born-Huang Optic Phonon Model......Page 1981
Impurity Phonon Modes......Page 1982
Group IV Semiconductors......Page 1983
Group III-V and II-VI Semiconductors: Bulk Crystals......Page 1984
Layered Semiconductor Structures......Page 1987
Light Scattering......Page 1989
Raman Scattering by Phonons......Page 1990
Selection Rules in Raman Scattering......Page 1991
Geometrical Aspects of First-Order Raman Scattering......Page 1992
Anharmonic Effects on Raman Spectra......Page 1993
Further Reading......Page 1994
The Polaron Concept......Page 1995
Optical Absorption of Polarons at Weak Coupling......Page 1996
Optical Absorption of Polarons at Strong Coupling......Page 1997
Polaron Cyclotron Resonance......Page 1998
Cyclotron Resonance of Polarons in Silver Halides......Page 1999
Cyclotron Resonance of Polarons in CdTe......Page 2000
Optical Properties of Quantum Dots: Effects of the Polaron Interaction......Page 2001
Further Reading......Page 2002
Electron States in Bulk Material......Page 2003
Electron States in Quantum Wells......Page 2005
Occupancy of States in Quantum Wells......Page 2007
Formation of Quantum Wells......Page 2008
General Principles......Page 2010
Optical Absorption......Page 2012
Concluding Remarks......Page 2013
Further Reading......Page 2014
Electron-Electron Interaction in Traps......Page 2015
Assumptions for Nonequilibrium Statistics......Page 2016
The Case of Defects......Page 2017
Quantum Efficiency......Page 2018
Auger Effects......Page 2020
Identification of Auger Effects......Page 2022
Spin Transport and Relaxation in Semiconductors; Spintronics......Page 2023
Optical Generation of Spin-Polarized Distributions......Page 2024
Mobile Electron Spin Time Evolution: Decoherence and Precession......Page 2025
Transport and Manipulation of Spin Coherence......Page 2026
Transfer of Spin Coherence from Mobile Electrons to Other Spin Systems......Page 2028
Further Reading......Page 2029
Principles of Operation......Page 2030
Experimental Details......Page 2032
Applications I: Uniform Semiconductors......Page 2033
Applications II: Multilayer Structures......Page 2035
Further Reading......Page 2036
Introduction......Page 2037
1D Kerr Spatial Solitons and the Nonlinear Schrodinger Equation......Page 2038
Kerr Spatial Solitons in Two Transverse Dimensions: Saturable and Nonlocal Media......Page 2040
Spatial Solitons in Photorefractive Media......Page 2041
Screening and Photovoltaic Solitons......Page 2042
Spatial Solitons in Nonlocal Media, Liquid Crystalline Media......Page 2043
Parametric Spatial Solitons in Quadratic Media......Page 2045
Conclusions......Page 2048
Further Reading......Page 2049
The Nonlinear Schrodinger Equation......Page 2050
Self-Phase Modulation......Page 2051
Bright Solitons......Page 2053
Dispersion-Managed Solitons......Page 2055
Modulational Instability......Page 2056
Conclusion......Page 2058
Introduction......Page 2059
Solitons in the Presence of Amplification and Loss......Page 2060
Soliton Control......Page 2061
Introduction......Page 2062
Properties......Page 2063
Loop Experiments......Page 2064
Future Outlook......Page 2065
Introduction......Page 2066
Fiber Temporal Solitons......Page 2067
Solitons in Optical Communications......Page 2068
Multi-Component Temporal Solitons......Page 2070
Temporal Solitons in Bragg Gratings......Page 2072
Solitons in Dissipative Nonlinear Optics......Page 2074
Further Reading......Page 2075
Introduction......Page 2076
Background......Page 2078
Optical Frequency Metrology with Femtosecond Combs......Page 2080
Outlook......Page 2083
Introduction......Page 2084
Fourier Analysis and Interferometry......Page 2085
Discrete Sampling......Page 2087
Phase Correction......Page 2088
Example of Experimental Results......Page 2090
Advantage of FTS Over Dispersive Spectrometer......Page 2091
Further Reading......Page 2093
Fourier Transform Multiplexing......Page 2094
Hadamard Transform Multiplexing......Page 2095
Theory of Hadamard Multiplexing......Page 2096
Hadamard Encoding Masks......Page 2097
2D Encoding Masks for Multiplexing in Spectral Imaging and Imaging......Page 2098
Tensor Product Construction......Page 2099
Optics for Hadamard Encoded Apertures......Page 2100
The History of Applied Hadamard Multiplexing......Page 2101
Further Reading......Page 2102
Introduction: Linear and Nonlinear Spectroscopy......Page 2103
Third-Order Nonlinear Spectroscopy......Page 2104
Saturation Spectroscopy......Page 2105
Polarization Spectroscopy......Page 2106
Multiphoton Absorption Spectroscopy......Page 2108
Degenerate Four-Wave Mixing (DFWM) Spectroscopy......Page 2109
Coherent Raman Spectroscopy......Page 2110
Laser-Induced Grating Spectroscopy......Page 2112
Introduction......Page 2113
Raman Microscopy......Page 2114
Beam Quality......Page 2116
Polarization......Page 2117
Emission Linewidth......Page 2118
Helium-Neon Laser......Page 2119
Krypton Ion Laser......Page 2120
Near Infrared Lasers......Page 2121
External-Cavity Semiconductor Lasers (ECSL)......Page 2122
Nd:YVO4 and Nd:YLF Lasers......Page 2123
UV Lasers......Page 2124
Frequency-Doubled Argon Ion Laser......Page 2125
Conclusions......Page 2126
Further Reading......Page 2127
Basic Notions......Page 2128
Anharmonic Oscillator Model......Page 2129
The Surface Nonlinear Response......Page 2131
Radiation Properties for Surface SHG......Page 2132
Experimental Considerations......Page 2133
Adsorbate Density......Page 2135
Surface Symmetry and Molecular Orientation......Page 2136
Surface and Interface Spectroscopy......Page 2137
Spatially Resolved Measurements......Page 2138
Probing Magnetization at Interfaces......Page 2139
Concluding Remarks......Page 2140
Introduction......Page 2141
Historical Outlook......Page 2142
Requirements for Single-Molecule Sensitivity......Page 2143
Fluorescence Emission......Page 2146
Other Spectroscopies......Page 2148
Biophysical and Biological Studies......Page 2151
Perspectives......Page 2155
Further Reading......Page 2156
Introduction......Page 2157
Free Electron THz Laser......Page 2158
Semiconductor THz Laser......Page 2159
THz Photoconductive Antenna (PCA)......Page 2160
Other Coherent THz Sources......Page 2161
THz Generation from Semiconductors......Page 2162
Few-Cycle THz Spectroscopy of Semiconductor Quantum Structures......Page 2167
Further Reading......Page 2169
Introduction......Page 2170
General Principles of Time-Resolved Fluorometry......Page 2171
Principle of the Single-Photon Timing Technique......Page 2173
Laser Sources for the Single-Photon Timing Technique......Page 2174
Phase Fluorometers Using the Harmonic Content of a Pulsed Laser......Page 2176
Data Analysis......Page 2177
Introduction......Page 2178
Ultrafast Laser Spectroscopic Techniques......Page 2179
Excimer Formation......Page 2180
Electronic Energy Transfer, Energy Migration and Trapping......Page 2182
Further Reading......Page 2186
Introduction......Page 2200
Optical Doppler Tomography......Page 2201
Phase-Resolved ODT Method......Page 2203
Spectral Domain Phase-Resolved ODT Method......Page 2204
Transverse Flow Velocity and Doppler Angle Determination......Page 2205
Applications......Page 2206
Polarization Sensitive OCT......Page 2207
Second Harmonic OCT......Page 2209
Acknowledgments......Page 2210
Further Reading......Page 2211
Principles of Operation......Page 2187
Fast Scanning......Page 2190
Beam Delivery......Page 2191
Developmental Biology......Page 2192
Medicine - Imaging Barrett’s Esophagus......Page 2194
Oncology - Identifying Tumors and Tumor Margins......Page 2195
Image-Guided Surgery......Page 2196
Materials......Page 2197
Conclusions......Page 2198
Further Reading......Page 2199
Active Mode-locking......Page 2212
Passive Mode-locking......Page 2213
Pulse Shaping by Material Dispersion......Page 2214
Ti:sapphire Lasers......Page 2217
Fiber Lasers......Page 2218
Sources of Amplified Ultrashort Pulses......Page 2219
Introduction......Page 2220
Autocorrelation......Page 2221
Spectral Interferometry - Relative Phase Measurements......Page 2223
Time-Frequency Representation......Page 2224
Frequency-Resolved Optical Gating......Page 2225
FROG Inversion Algorithms......Page 2227
Self Checks in FROG Measurements......Page 2229
Measuring Pulses Directly - SPIDER......Page 2230
Which Pulse Measurement Method Should I Use......Page 2231
Further Reading......Page 2232
From Nanosecond to Picosecond to Femtosecond Reaction Kinetics......Page 2233
The Formation of Coherent Wave Packets and their Motion......Page 2235
Pump-Probe Method......Page 2236
Three-Pulse Four-Wave Mixing Method......Page 2237
Dissociation on a Repulsive Surface......Page 2238
Reactions Involving Crossing of Potential Energy Surfaces......Page 2239
The Dynamics of Bond Formation......Page 2240
Different Measurements Possible with Three-Pulse FWM......Page 2242
Control Using Three-Pulse Four-Wave Mixing......Page 2244
Further Reading......Page 2245
Ultrafast Chemistry......Page 2246
Laser Technology......Page 2248
UV/VIS Pump-UV/VIS Probe Spectroscopy......Page 2249
Electronic Four-Wave Mixing Spectroscopy......Page 2250
Electronic Condensed Phase Spectroscopy: Femtochemistry and Solvation Dynamics......Page 2251
Ultrafast Vibrational Spectroscopy in Photochemistry: Structural Dynamics......Page 2252
Equilibrium Structural Dynamics in the Electronic Ground State......Page 2254
Further Reading......Page 2256
Generation and Characteristics of Spectrally Decomposed Waves......Page 2257
Nonlinear Wave Mixing With Nondepleting Pumps......Page 2260
Detection of Ultrafast Waveforms......Page 2261
Processing of Ultrafast Waveforms......Page 2262
Synthesis of Ultrafast Waveforms......Page 2264
Further Reading......Page 2266
Early History of Quantum Electronics......Page 2267
Further Reading......Page 2271
Race to the Light......Page 2273
Obstacles and Solutions......Page 2275
The Laser Design......Page 2276
Do it......Page 2277
The Light Fantastic......Page 2278
How Things Evolved......Page 2280
The Physics Effect......Page 2281
Optical Gain and Amplification Effect......Page 2282
Experimental Phase......Page 2283
The Impact......Page 2284
Further Reading......Page 2285


EDITOR-IN-CHIEF
Robert D. Guenther, Duke University, Durham, NC, USA

EDITORS
Duncan G. Steel, University of Michigan, Ann Arbor, MI, USA
Leopold Bayvel, London, UK

CONSULTING EDITORS
John E. Midwinter, University College London, London, UK
Alan Miller, University of St Andrews, Fife, UK

EDITORIAL ADVISORY BOARD
R Alferness, Lucent Technologies Bell Laboratories, Holmdel, NJ, USA
D J Brady, Duke University, Durham, NC, USA
J Caulfield, Diversified Research Corporation, Cornersville, TN, USA
R L Donofrio, Display Device Consultants LLC, Ann Arbor, MI, USA
J Dudley, Université de Franche-Comté, Besançon, France
J G Eden, University of Illinois, Urbana, IL, USA
H O Everitt, U.S. Army Research Office, Research Triangle Park, NC, USA
E L Hecht, Adelphi University, Garden City, NY, USA
H S Hinton, Utah State University, Logan, UT, USA
T N Hurst, Hewlett-Packard Laboratories, Tucson, AZ, USA
D Killinger, University of South Florida, Tampa, FL, USA
T A King, University of Manchester, UK
E Leith, University of Michigan, Ann Arbor, MI, USA
A Macleod, Thin Film Center, Inc., Tucson, AZ, USA
P F McManamon, Air Force Research Laboratory, Wright-Patterson AFB, OH, USA
J N Mait, U.S. Army Research Laboratory, Adelphi, MD, USA
D Phillips, Imperial College London, UK
C R Pidgeon, Heriot-Watt University, Edinburgh, UK
R Pike, King’s College London, UK
E Van Stryland, University of Central Florida, Orlando, FL, USA


Preface

We live in a world powered by light, but much of the understanding of light was developed around the time of the French Revolution. Before the 1950s, optics technology was viewed as well established and there was little expectation of growth either in scientific understanding or in technological exploitation. The end of the Second World War brought about huge growth in scientific exploration, and the field of optics benefited from that growth. The key event was the discovery of methods for producing a source of coherent radiation, with the key milestone being the demonstration of the first laser by Ted Maiman in 1960. Other lasers, nonlinear optical phenomena, and technologies such as holography and optical signal processing, followed in the early 1960s. In the 1970s the foundations of fiber optical communications were laid, with the development of low-loss glass fibers and sources that operated in the wavelength region of low loss. The 1980s saw the most significant technological accomplishment: the development of efficient optical systems and resulting useful devices. Now, some forty years after the demonstration of the first coherent light source, we find that optics has become the enabling technology in areas such as:

† information technology and telecommunications;
† health care and the life sciences;
† sensing, lighting, and energy;
† manufacturing;
† national defense;
† manufacturing of precision optical components and systems; and
† optics research and education.

We find ourselves depending on CDs for data storage, on digital cameras and printers to produce our family photographs, on high-speed internet connections based on optical fibers, and on optically based DNA sequencing systems; our physicians are making use of new therapies and diagnostic techniques founded on optics. To contribute to such a wide range of applications requires a truly multidisciplinary effort drawing together knowledge spanning many of the traditional academic boundaries. To exploit the accomplishments of the past forty years and to enable a revolution in world fiber-optic communications, new modalities in the practice of medicine, a more effective national defense, exploration of the frontiers of science, and much more, a resource to provide access to the foundations of this field is needed. The purpose of this Encyclopedia is to provide a resource for introducing optical fundamentals and technologies to the general technical audience for whom optics is a key capability in exploring their field of interest. Some 25 internationally recognized scientists and engineers served as editors. They helped in selecting the topical coverage and choosing the over 260 authors who prepared the individual articles. The authors form an international group who are expert in their discipline and come from every part of the technological community, spanning academia, government, and industry. The editors and authors of this Encyclopedia hope that the reader finds in these pages the information needed to provide guidance in exploring and utilizing optics. As Editor-in-Chief I would like to thank all of the topical editors, authors, and the staff of Elsevier for each of their contributions. Special thanks should go to Dr Martin Ruck of Elsevier, who provided not only organizational skills but also technological knowledge, which allowed all of the numerous loose ends to be tied.

B D Guenther
Editor-in-Chief

A

ALL-OPTICAL SIGNAL REGENERATION

O Leclerc, Alcatel Research & Innovation, Marcoussis, France

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The breakthrough of optical amplification, combined with the techniques of wavelength division multiplexing (WDM) and dispersion management, has made it possible to exploit a sizeable fraction of the optical fiber bandwidth (several terahertz). Systems based on a 10 Gbit/s per-channel bit-rate, with capacities of several terabit/s and transmission capabilities of hundreds or even thousands of kilometers, have reached the commercial arena. While greater capacities and spectral efficiencies are likely to be reached with current technologies, there is potential economic interest in reducing the number of wavelength channels by increasing the channel rate (e.g., to 40 Gbit/s). However, such a fourfold increase in the channel bit-rate clearly results in a significant increase in propagation impairments, stemming from the combined effects of noise accumulation, fiber dispersion, fiber nonlinearities, and inter-channel interactions, and contributing to two main forms of signal degradation. The first is related to the amplitude domain: the power levels of marks and spaces can suffer from random deviations arising from interaction between the signal and amplified spontaneous emission (ASE) noise, from interaction with signals from other channels through cross-phase modulation (XPM), or from distortions induced by chromatic dispersion. The second type of signal degradation occurs in the time domain: the time position of pulses can also suffer from random deviations arising from interactions between signal and ASE noise through fiber dispersion. Preservation of a high power contrast between ‘1’ and ‘0’, and maintenance of both amplitude fluctuations and timing jitter below some acceptable levels, are mandatory for high transmission quality, evaluated through bit-error-rate (BER) measurements or

estimated by Q-factors. Moreover, in future optical networks it appears mandatory to ensure a similarly high optical signal quality at the output of every node in the network, so as to enable successful transmission of the data over arbitrary distances. Among the possible solutions to overcome such system limitations is the implementation of Optical Signal Regeneration, either in-line for long-haul transmission applications or at the output of network nodes. Such Signal Regeneration performs, or should be able to perform, three basic signal-processing functions: Re-amplifying, Re-shaping, and Re-timing, hence the generic acronym ‘3R’ (Figure 1). When Re-timing is absent, one usually refers to the regenerator as a ‘2R’ device, which has only re-amplifying and re-shaping capabilities. Thus, full 3R regeneration with retiming capability requires clock extraction. Given system impairments after some transmission distance, two solutions remain for extending the actual reach of an optical transmission system or the scalability of an optical network. The first consists in segmenting the system into independent trunks, with full electronic repeater/transceivers at the interfaces (we shall refer to this as ‘opto-electronic regeneration’ or O/E Regeneration henceforth). The second solution, all-optical Regeneration, is not merely the optical version of the first: it performs the same signal-restoring functions with far reduced complexity and higher bandwidth capability. At this point, it should be noted that Optical 3R techniques are not necessarily void of any electronic functions (e.g., when using electronic clock recovery and O/E modulation), but the main feature is that these electronic functions are narrowband (as opposed to broadband in the case of electronic regeneration). Some key issues have to be considered when comparing such Signal Regeneration approaches. The first is that today’s and future optical transmission systems and/or networks are WDM networks.


Figure 1 Principle of 3R regeneration, as applied to NRZ signals. (a) Re-Amplifying; (b) Re-Shaping; (c) Re-Timing. NB: Such eye diagrams can be either optical or electrical eye diagrams.

Under this condition, WDM compatibility – the ability of a regeneration solution to simultaneously process several WDM channels – represents a key advantage. The maturity of the technology – either purely optical or opto-electronic – also plays an important role in the potential (pre-)development of such solutions. But the main parameter that will decide the actual technology (and also technique) is the tradeoff between the actual performance of the regeneration solutions and their costs (device and implementation), depending on the targeted applications (long-haul systems, medium-haul transport, wide-area optical networks, etc.). In this article, we review the current alternatives for all-optical Signal Regeneration, considering both theoretical and experimental performance and practical implementation issues. Key advantages and possible drawbacks of each solution are discussed, to sketch the picture in this field. However, first we must focus on some generalities about Signal Regeneration and the way to define (and qualify) regenerator performance. In a second part, we will detail the currently investigated optical solutions for Signal Regeneration, with a specific highlight on semiconductor-based solutions using either semiconductor optical amplifier (SOA) technology or newly-developed saturable absorbers. Optical Regeneration techniques based on synchronous modulation will also be discussed in a third section. The conclusion will summarize the key features of each solution,

Figure 2 Generic structure of Signal 2R/3R Regenerator based on Decision Element (2R) and Decision Element and Clock Recovery (3R).

so as to underline the demanding challenge optical components are facing in this application.

Generalities on Signal Regeneration

Principles

In the general case, Signal Regeneration is performed using a decision element exhibiting a nonlinear transfer function. Provided with a threshold level, and when associated with an amplifier, such an element performs the actual Re-shaping of the incoming data (either in the electrical or the optical domain) and completes a 2R Signal Regenerator. Figure 2 shows the generic structure of such a Signal Regenerator in the general case, as applied to non-return-to-zero (NRZ) data. A clock recovery block can be added (dotted lines) to provide the decision element with a time reference and hence perform the third R (Re-timing) of full Signal 3R Regeneration. At this point, it should be mentioned that the decision


element can operate either on electrical signals (standard electrical DFF), provided that optical → electrical and electrical → optical signal conversion stages are added, or directly on optical signals, using the different techniques described below. The clock signal can be of an electrical nature – as for the electrical decision element in an O/E regenerator – or either an electrical or a purely optical signal in all-optical regenerators. Prior to reviewing and describing the various current technology alternatives for such Optical Signal Regeneration, the issue of the actual characterization of regenerator performance needs to be explained and clarified. As previously mentioned, the core element of any Signal Regenerator is the decision element, showing a nonlinear transfer function that can be of varying steepness. As can be seen in Figure 3, the actual regenerative performance of the regenerator will indeed depend upon the degree of nonlinearity of the decision element transfer function. Figure 3 shows the principle of operation of a regenerator incorporating a decision element with two steepnesses of the nonlinear transfer function. In any case, the ‘1’ and ‘0’ symbol amplitude probability densities (PD) are squeezed after passing through the decision element. However, depending upon the addition of a clock reference for triggering the decision element, the symbol arrival time PD will be either squeezed (clocked decision = 3R regeneration) or enlarged (no clock reference = 2R regeneration), the latter resulting in conversion of amplitude fluctuations into time position fluctuations. As for system performance – expressed through BER – the regenerative capabilities of any regenerator


simultaneously depend upon both the output amplitude and arrival time PDs of the ‘1’ and ‘0’ symbols. In the unusual case of 2R regeneration (no clocked decision), a tradeoff has then to be derived, considering both the reduction of the amplitude PD and the enlarged arrival time PD induced by the regenerator, to ensure sufficient signal improvement. In Figure 3a, we consider a step function for the transfer function of the decision element. In this case, amplitude PDs are squeezed to Dirac PDs after the decision element and, depending upon the addition or not of a clock reference, the arrival time PD is reduced (3R) or dramatically enlarged (2R). In Figure 3b, the decision element exhibits a moderately nonlinear transfer function. This results in an asymmetric and less-pronounced squeezing of the amplitude PD compared to the previous case, but in turn in a significantly less enlarged arrival time PD when no clock reference is added (2R regeneration). Comparison of these two decision elements with different nonlinear transfer functions indicates that, for 3R regeneration applications, the more nonlinear the transfer function of the decision element, the better the performance, the ideal case being the step function. In the case of 2R regeneration applications, a tradeoff between the actual reduction of the amplitude PD and the enlargement of the timing PD has to be derived, and clearly depends upon the actual shape of the nonlinear transfer function of the decision element.
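This amplitude-PD squeezing is easy to visualize numerically. The short sketch below is illustrative only: the tanh-shaped gate, the two steepness values, the mid-eye threshold of 0.5, and the Gaussian noise level are assumptions chosen for the demonstration, not parameters of any particular device.

```python
# Illustrative sketch: squeezing of the '1'/'0' amplitude distributions
# by a nonlinear decision gate. All parameter values are assumed.
import numpy as np

rng = np.random.default_rng(0)

def gate(p_in, steepness):
    """Normalized decision gate: tanh transfer function centered on the
    threshold 0.5; larger steepness approaches the ideal step."""
    return 0.5 * (1.0 + np.tanh(steepness * (p_in - 0.5)))

marks = 1.0 + 0.1 * rng.standard_normal(100_000)   # noisy '1' symbols
spaces = 0.0 + 0.1 * rng.standard_normal(100_000)  # noisy '0' symbols

for s in (2.0, 20.0):  # moderately nonlinear gate vs. nearly step-like gate
    out1, out0 = gate(marks, s), gate(spaces, s)
    print(f"steepness {s:4.1f}: std('1') {marks.std():.3f} -> {out1.std():.4f}, "
          f"std('0') {spaces.std():.3f} -> {out0.std():.4f}")
```

The steeper gate squeezes both amplitude distributions far more strongly, mirroring the contrast between Figures 3a and 3b; what such a sketch cannot show is the price paid in the time domain when no clock reference is present.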

Qualification of Signal Regenerator Performance

To further illustrate the impact of the actual shape of the nonlinear transfer function of the decision

Figure 3 Signal Regeneration process using Nonlinear Gates. (a) Step transfer function (= electronic DFF); (b) ‘moderately’ nonlinear transfer function. As an illustration of the Regenerator operation, ‘1’ and ‘0’ symbol amplitude probability densities (PD) and arrival time probability densities (PD) are shown in light gray and dark gray, respectively.


element in 3R applications, the theoretical evolution of the BER with the number of concatenated regenerators has been plotted for regenerators having different nonlinear responses. Figure 4 shows the numerically calculated evolution of the BER of a 10 Gbit/s NRZ signal with fixed optical signal-to-noise ratio (OSNR) at the input of the 3R regenerator, as a function of the number of cascaded regenerators incorporating nonlinear gates with nonlinear transfer functions of different depths. Different behaviors can be seen, depending on the shape of the nonlinear function. As previously stated, the best regeneration performance is obtained with an ideal step function (case a), which is actually the case for an O/E regenerator using an electronic decision flip-flop (DFF). In that case, the BER increases linearly (i.e., more errors) along the cascade. Conversely, when the nonlinearity is reduced (cases (b) and (c)), both BER and noise accumulate, until the concatenation of nonlinear functions reaches some steady-state pattern, from which the BER increases linearly. Concatenation of nonlinear devices thus magnifies shape differences in their nonlinear response, and hence their regenerative capabilities. Moreover, as can be seen in Figure 4, all curves standing for different regeneration efficiencies pass through a common point defined after the first device. This clearly indicates that it is not possible to qualify the regenerative capability of any regenerator when considering the output signal after only one regenerator. Indeed, the BER is the same for either a 3R regenerator or a mere amplifier if only measured after a single element. This originates from the initial overlap between the noise distributions associated with marks and spaces, which cannot be suppressed but only minimized by a single decision element through threshold adjustment.
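The behavior of Figure 4 can be reproduced qualitatively with a Monte-Carlo sketch. Everything below is an assumption made for illustration (Gaussian noise loading per span, a tanh gate, a fixed mid-eye threshold); it is not the exact model behind the figure.

```python
# Illustrative Monte-Carlo sketch: BER accumulation along a cascade of
# noise-loaded 2R gates. Noise level, gate steepness, and threshold are
# assumed values, not those used to compute Figure 4.
import numpy as np

rng = np.random.default_rng(1)
n, sigma, steepness = 500_000, 0.12, 6.0  # symbols, per-span noise, gate steepness

def gate(p):
    """tanh-shaped decision gate centered on the threshold 0.5."""
    return 0.5 * (1.0 + np.tanh(steepness * (p - 0.5)))

marks, spaces = np.ones(n), np.zeros(n)
for span in range(1, 16):
    marks = gate(marks + sigma * rng.standard_normal(n))    # noise, then decision
    spaces = gate(spaces + sigma * rng.standard_normal(n))
    ber = 0.5 * (np.mean(marks < 0.5) + np.mean(spaces >= 0.5))
    if span in (1, 5, 10, 15):
        print(f"after {span:2d} regenerators: estimated BER ~ {ber:.1e}")
```

With this nearly step-like gate the error count grows roughly linearly with the number of elements, as in case (a); reducing the steepness lets noise accumulate over the first spans before the linear regime sets in, as in cases (b) and (c).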

As a general result, the actual characterization of the regenerative performance of any regenerator should ultimately be conducted considering a cascade of regenerators. In practice, this can easily be done with the experimental implementation of the regenerator under study in a recirculating loop. Moreover, such an investigation tool will also give access to the regenerator performance with respect to the transmission capabilities of the regenerated signal, which should not be overlooked. Let us now consider the physical implementation of such all-optical Signal Regenerators, along with the key features offered by the different alternatives.

All-Optical 2R/3R Regeneration Using Optical Nonlinear Gates

Prior to describing the different solutions for all-optical in-line Optical Signal Regeneration, it should be mentioned that, since the polarization states of the optical data signals cannot be preserved during propagation, the regenerator is required to exhibit an extremely low polarization sensitivity. This clearly translates into a careful optimization of all the different optical components making up the 2R/3R regenerator. It should be noted that this also applies to the O/E solutions, but with limited impact, since only the photodiode has to be polarization insensitive. Figure 5 illustrates the generic principle of operation of an all-optical 3R Regenerator using optical nonlinear gates. Contrary to what occurs in an O/E regenerator, where the extracted clock signal drives the electrical decision element, the incoming and distorted optical data signal triggers the nonlinear gate, hence generating a switching window which is

Figure 4 Evolution of the BER with concatenated regenerators for nonlinear gates with nonlinear transfer functions of decreasing depth, from case (a) (step function) to case (c).


Figure 5 Principle of operation of an all-Optical Signal 3R Regenerator using nonlinear gates.

applied to a newly generated optical clock signal so as to reproduce the initial data stream on the new optical carrier. In the case of 2R Optical Signal Regeneration, a continuous wave (CW) signal is substituted for the synchronized optical clock pulses. As previously mentioned, the actual regeneration performance of the 2R/3R devices will mainly depend upon the nonlinearity of the transfer function of the decision element, but in 3R applications the quality of the optical clock pulses also has to be considered. In the following, we describe current solutions for the two main building blocks of all-optical Signal Regenerators: the decision element (i.e., nonlinear gate) and the clock recovery (CR) element.

Optical Decision Element

In the physical domain, optical decision elements with an ideal step response – as for the electrical DFF – do not exist. Different nonlinear optical transfer functions, approaching more or less the ideal case, can be realized in various media such as fiber, SOAs, electroabsorption modulators (EAM), and lasers. Generally, as described below, the actual response (hence the regenerative properties) of such optical gates directly depends upon the instantaneous power of the incoming signal. Under these conditions, it appears essential to add an adaptation stage so as to reduce intensity fluctuations (as caused by propagation or by crossing a routing/switching node) and provide the decision element with fixed power conditions. In practice, this results in the addition of control circuitry (either optical or electrical) in the Re-amplification block, whose complexity directly depends on the actual system environment (ultra-fast power equalization for packet-switching applications, compensation of slow power fluctuations in transmission applications). As previously described, the decision gate performs Re-shaping (and Re-timing when clock pulses are added) of the incoming distorted optical signal, and represents the regenerator’s core element. Ideally, it should also act as a transmitter with characteristics ensuring the actual propagation of the regenerated data stream. In that respect, the chirp possibly


induced by the optical decision gate onto the regenerated signal – and the initial quality of the optical clock pulses in 3R applications – should be carefully considered (ideally by means of loop transmission) so as to adequately match line transmission requirements. Different solutions for the actual realization of the optical decision element have been proposed and extensively investigated, using, for example, cross-gain modulation in semiconductor optical amplifier (SOA) devices, but the most promising and flexible devices are probably interferometers, for which a description of the generic principle of operation follows. Consider a CW signal (probe) at wavelength λ2 injected into an optical interferometer, in which one arm incorporates a nonlinear medium into which an input signal carried by wavelength λ1 (command) is, in turn, injected. Such a signal at wavelength λ1 induces a phase shift through cross-phase modulation (XPM) in this arm of the interferometer, the amount depending upon the power level P_in,λ1. In turn, such phase modulation (PM) induces amplitude modulation (AM) on the signal at wavelength λ2 when recombined at the output of the interferometer, and translates the information carried by wavelength λ1 onto λ2. Under these conditions, such optical gates clearly act as wavelength converters (it should be mentioned that Wavelength Conversion is not necessarily equivalent to Regeneration; i.e., a linear transfer function performs suitable Wavelength Conversion but by no means Signal Regeneration). Optical interferometers can be classified according to the nature of the nonlinearity exploited to achieve a π phase shift. In the case of fiber-based devices, such as the nonlinear optical loop mirror (NOLM), the phase shift is induced through the Kerr effect in an optical fiber. The key advantage of fiber-based devices such as the NOLM lies in the near-instantaneous (fs) response of the Kerr nonlinearity, making them very attractive for ultra-high bit-rate operation (≥160 Gbit/s). Polarization-insensitive NOLMs have been realized, although with the same drawbacks concerning integrability. With recent developments in highly nonlinear (HNL) fibers, however, the required NOLM fiber length could be significantly reduced, hence dramatically reducing environmental instability. A second type of device is the integrated SOA-based Mach-Zehnder interferometer (MZI). In MZIs, the phase shift is due to the effect of photoinduced carrier depletion in the gain saturation regime of one of the SOAs. The control and probe can be launched in counter- or co-directional ways. In the first case, no optical filter is required at the output


of the device for rejecting the signal at wavelength λ1, but the operation of the MZI is then limited by its speed. At this point, one should mention that the photoinduced modulation effects in SOAs are intrinsically limited in speed by the gain recovery time, which is a function of the carrier lifetime and the injection current. An approach referred to as differential operation mode (DOM), illustrated in Figure 6, which takes advantage of the MZI’s interferometric properties, makes it possible to artificially increase the operation speed of such ‘slow’ devices up to 40 Gbit/s. As discussed earlier, the nonlinear response is a key parameter for regeneration efficiency. Combining two interferometers is a straightforward means to improve the nonlinearity of the decision element transfer function, and hence the regeneration efficiency. This approach was validated at 40 Gbit/s using a cascade of two SOA-MZIs (see Figure 7, left). Such a scheme offers the advantage of restoring data polarity and wavelength, hence making the regenerator inherently transparent. Finally, the second conversion stage can be used as an adaptation interface to the transmission link, achieved through chirp tuning in this second device. Such an Optical 3R Regenerator was upgraded to 40 Gbit/s, using DOM in both SOA-MZIs, with validation in a 40 Gbit/s RZ loop transmission. The 40 Gbit/s eye diagrams monitored at the regenerator output after 1, 10, and 100 circulations are shown in Figure 7 (right) and remain unaltered with distance. With this all-optical regenerator structure, the

Figure 6 Schematic and principle of operation of SOA-MZI in differential mode.

minimum OSNR tolerated by the regenerator (1 dB sensitivity penalty at 10⁻¹⁰ BER) was found to be as low as 25 dB/0.1 nm. Such results clearly illustrate the high performance of this SOA-based regenerator structure for 40 Gbit/s optical data signals. Such a complex mode of operation for addressing 40 Gbit/s bit-rates will probably be discarded in view of the recent demonstration of standard-mode wavelength conversion at 40 Gbit/s, which uses a newly-designed active-passive SOA-MZI incorporating evanescent-coupling SOAs. The device architecture is flexible in the number of SOAs, thus enabling easier operation optimization and reduced power consumption, leading to simplified architectures and operation for 40 Gbit/s optical 3R regeneration. Based on the same concept of wavelength conversion for Optical 3R Regeneration, it should be noted that many devices have been proposed and experimentally validated as wavelength converters at rates up to 84 Gbit/s, but with cascadability issues still to be demonstrated in order to assess their actual regenerative properties.
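The interferometric gating described in this section can be summarized by a toy transfer function. In the sketch below, the saturating phase response, its peak value of π, and the saturation power are assumptions chosen for illustration; only the cosine dependence on the phase difference is the generic two-arm interferometer behavior.

```python
# Toy model of an interferometric gate used as a wavelength converter:
# the command signal induces an XPM phase shift on one arm, which the
# interferometer converts into an S-shaped power transfer for the probe.
# The saturating phase model and all parameter values are assumptions.
import numpy as np

phi_max, p_sat = np.pi, 1.0   # assumed peak phase shift and saturation power (a.u.)

def probe_out(p_command, p_probe=1.0):
    """CW probe power at the interferometer output versus command power."""
    dphi = phi_max * p_command / (p_command + p_sat)  # assumed phase response
    return 0.5 * p_probe * (1.0 - np.cos(dphi))       # two-arm interference

for p in (0.0, 0.2, 1.0, 5.0, 20.0):
    print(f"command power {p:5.1f} -> probe output {probe_out(p):.3f}")
```

Spaces (low command power) leave the gate essentially closed, marks drive it toward full transmission, and the steep intermediate region provides the re-shaping whose depth, as discussed above, governs the regenerative performance.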

Optical Clock Recovery (CR)

Next to the decision element, CR is the second key function in 3R regenerators. One possible approach to CR uses electronics, while another only uses optics. The former goes with O/E conversion by means of a photodiode and subsequent E/O conversion through a modulator. This conversion becomes more complex and power-hungry as the data rate increases. It is clear that the maturity of electronics gives a current advantage to this approach. But considering the pros and cons of electronic CR for cost-effective implementation, the all-optical approach seems more promising, since full regenerator integration is potentially possible with reduced power consumption. In this view, we shall focus here on the optical approach and more specifically on the self-pulsating effect in three-section distributed feedback (DFB) lasers or, more recently, in distributed Bragg reflector

Figure 7 Optimized structure of a 40 Gbit/s SOA-based 3R regenerator. 40 Gbit/s eye diagram evolution: (a) B-to-B; (b) 1 lap; (c) 10 laps; (d) 100 laps.


(DBR) lasers. Recent experimental results illustrate the potential of such devices for high bit rates (up to 160 Gbit/s), broad dynamic range, broad frequency tuning, polarization insensitivity, and relatively short locking times (1 ns). This last feature makes these devices good candidates for operation in asynchronous-packet regimes.

Optical Regeneration by Saturable Absorbers

We next consider saturable absorbers (SA) as nonlinear elements for optical regeneration. Figure 8 (left) shows a typical SA transfer function and illustrates the principle of operation. When illuminated with an optical signal with peak power below some threshold (P_sat), the photonic absorption of the SA is high and the device is opaque to the signal (low transmittance). Above P_sat, the SA transmittance rapidly increases and asymptotically saturates to transparency (passive loss being overlooked). Such a nonlinear transfer function only applies to 2R optical regeneration. Different technologies for implementing SAs are available, but the most promising approach uses semiconductors. In this case, the SA relies upon the control of carrier dynamics through the material’s recombination centers. Parameters such as the on–off contrast (ratio of transmittance at high and low incident powers), recovery time (1/e), and saturation energy are key to device optimization. In the following, we consider a newly-developed ion-irradiated MQW-based device incorporated in a micro-cavity, shown in Figure 8 (right). The device operates as a reflection-mode vertical cavity, providing both a high on/off extinction ratio, by canceling reflection at low intensity, and a low saturation energy of 2 pJ. It is also intrinsically polarization-insensitive. Heavy ion-irradiation of the SA ensures recovery times (at 1/e) shorter than 5 ps (hence compatible with bit-rates above 40 Gbit/s), while maintaining a dynamic contrast in excess of 2.5 dB at a 40 GHz repetition rate.
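A simple phenomenological saturation model reproduces such a transfer function. The modulation depth, unsaturable transmission, and saturation power used below are assumed round numbers, not measured parameters of the micro-cavity device just described.

```python
# Phenomenological sketch of a saturable-absorber transmittance: the
# absorption bleaches as the incident power exceeds the saturation power.
# All parameter values are illustrative assumptions.
import numpy as np

depth, t_max, p_sat = 0.4, 0.5, 1.0  # saturable depth, high-power transmission, P_sat (a.u.)

def transmittance(p):
    """T(P) rises from t_max*(1 - depth) at low power toward t_max at high power."""
    return t_max * (1.0 - depth / (1.0 + p / p_sat))

for p in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"P/P_sat = {p:6.2f} -> T = {transmittance(p):.3f}")
```

With these numbers the low-power ‘0’s see roughly 2 dB more loss than the strongly saturating marks, which is the contrast enhancement exploited below.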


The regenerative properties of the SA make it possible to reduce the accumulated amplified spontaneous emission (ASE) in the ‘0’ bits, resulting in a higher contrast between marks and spaces, hence increasing system performance. Yet SAs do not suppress intensity noise in the marks, which makes the regenerator incomplete. A solution for this noise suppression is optical filtering with nonlinear (soliton) pulses. The principle is as follows. In the absence of chirp, the soliton temporal width scales as the reciprocal of its spectral width (Fourier-transform limit) times its intensity (fundamental soliton relation). Thus, an increase in pulse intensity corresponds to both time narrowing and spectral broadening. Conversely, a decrease in pulse intensity corresponds to time broadening and spectral narrowing. Thus, the filter causes higher loss when the intensity increases, and lower loss when the intensity decreases. The filter thus acts as an automatic power control (APC) in feed-forward mode, which causes power stabilization. The resulting 2R regenerator (composed of the SA and the optical filter) is fully passive, which is of high interest for submarine systems, where the power consumption must be minimal, but it does not include any control in the time domain (no Re-timing). System demonstrations of such passive SA-based Optical Regeneration have been reported with a 20 Gbit/s single-channel loop experiment. Implementation of the SA-based 2R Regenerator with 160 km loop periodicity made it possible to double the error-free distance (Q = 15.6 dB, or 10⁻⁹ BER) of a 20 Gbit/s RZ signal. So as to extend the capability of passive 2R regeneration to 40 Gbit/s systems, an improved configuration was derived from numerical optimization and experimentally demonstrated in a 40 Gbit/s WDM-like, dispersion-managed loop transmission, showing more than a fourfold increase in the WDM transmission distance at 10⁻⁴ BER (1650 km without the SA-based regenerator and 7600 km when implementing the 2R regenerator with 240 km periodicity).
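The APC action of the filter follows directly from the standard fundamental-soliton relations of the nonlinear Schrödinger equation (textbook relations, not results specific to the experiments above). For a fundamental soliton of peak power $P_0$ and duration $T_0$ in a fiber with dispersion $\beta_2$ and nonlinear coefficient $\gamma$,

$$P_0 T_0^2 = \frac{\lvert\beta_2\rvert}{\gamma} \qquad\text{and}\qquad \Delta\nu \propto \frac{1}{T_0} \propto \sqrt{P_0}$$

so a pulse that gains energy broadens spectrally and suffers more loss in the fixed narrowband filter, while a weakened pulse narrows spectrally and suffers less; the filter loss is therefore self-adjusting, which is the automatic power control invoked above.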

Figure 8 (a) Saturable Absorber (SA) ideal transfer function. (b) Structure of Multi-Quantum Well SA.


Such a result illustrates the potentially high interest of such passive optical 2R regeneration in long-haul transmission (typically in noise-limited systems), since the implementation of the SA increases the system’s robustness to OSNR degradation without any extra power consumption. Reducing both the saturation energy and the insertion loss, along with increasing the dynamic contrast, represent key future device improvements. Regeneration of WDM signals with the same device, such as one SA chip with multiple fibers implemented between Mux/DMux stages, should also be thoroughly investigated. In this respect, SA wavelength selectivity in quantum dots could possibly be advantageously exploited.

Synchronous Modulation Technique

All-optical 3R regeneration can also be achieved through in-line synchronous modulation (SM) associated with narrowband filtering (NF). Figure 9 shows the basic layout of such an Optical 3R Regenerator. It is composed of an optical filter followed by an intensity and phase modulator (IM/PM) driven by a recovered clock. Periodic insertion of SM-based modulators along the transmission link provides efficient jitter reduction and asymptotically controls the ASE noise level, resulting in virtually unlimited transmission distances. The Re-shaping and Re-timing provided by the IM/PM intrinsically require nonlinear (soliton) propagation in the trunk fiber following the SM block. Therefore, one can refer to the approach as distributed optical regeneration. This contrasts with lumped regeneration, where 3R is completed within the regenerator (see above, Optical Regeneration using nonlinear gates) and is independent of the line transmission characteristics. However, when using a new approach referred to as ‘black box’ optical regeneration (BBOR), it is possible

to make the SM regeneration function and transmission work independently, in such a way that any type of RZ signal (soliton or non-soliton) can be transmitted through the system. The BBOR technique includes an adaptation stage for the incoming RZ pulses in the SM-based regenerator, which ensures high regeneration efficiency regardless of the RZ signal format (linear RZ, DM-soliton, C-RZ, etc.). This is achieved using a local and periodic soliton conversion of the RZ pulses, by means of launching an adequate power into some length of fiber with anomalous dispersion. The actual demonstration of the BBOR approach, and of its superiority over the ‘classical’ SM-based scheme for DM transmission, was experimentally investigated in a 40 Gbit/s DM loop transmission. Under these conditions, one can then independently exploit dispersion management (DM) techniques for increasing the spectral efficiency in long-haul transmission, while ensuring high transmission quality through BBOR. One of the key properties of the SM-based all-optical Regeneration technique is its WDM compatibility. The first (Figure 10, left) and straightforward solution to apply Signal Regeneration to WDM channels amounts to allocating a regenerator to each WDM channel. The second consists in sharing a single modulator, thus processing the WDM channels at once in a serial fashion. This approach requires WDM synchronicity, meaning that all bits must be synchronous with the modulation, which can be achieved either by use of appropriate time-delay lines located within a DMux/Mux apparatus (Figure 10, upper right), or by making the WDM channels inherently time-coincident at specific regenerator locations (Figure 10, bottom right). Clearly, the serial regeneration scheme is far simpler and more cost-effective than the parallel version; however, optimized re-synchronization schemes still remain to be

Figure 9 Basic layout of the all-optical Regenerator by Synchronous Modulation and Narrowband Filtering and illustration of the principle of operation.


Figure 10 Basic implementation schemes for WDM all-Optical Regeneration. (a) parallel asynchronous; (b) serial re-synchronized; (c) serial self-synchronized.

developed for realistic applications. Experimental demonstration of this concept was assessed by means of a 4 × 40 Gbit/s dispersion-managed transmission over 10 000 km (BER < 5 × 10⁻⁸), in which a single modulator was used for the simultaneous regeneration of the 4 WDM channels. Considering next all-optical regeneration schemes with ultra-high-speed potential, a compact and loss-free 40 Gbit/s Synchronous Modulator, based on an optically-controlled SOA-MZI, was proposed and loop-demonstrated at 40 Gbit/s with an error-free transmission distance in excess of 10 000 km. Moreover, potential ultra-high-speed operation of this improved BBOR scheme was recently experimentally demonstrated by means of an 80 GHz clock conversion with appropriate characteristics through the SOA-MZI. One should finally mention all-fiber devices such as the NOLM and NALM for addressing ultra-high-speed SM-based Optical Regeneration, although no successful experimental demonstrations have been reported so far in this field.
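The re-timing action of the synchronous modulator can be checked with a short numerical sketch. The Gaussian pulse, the raised-cosine modulator drive, the 40 GHz clock, and the modulation depth are all illustrative assumptions; the point is simply that a mistimed pulse has its center of gravity pulled back toward the transmission peak.

```python
# Sketch of synchronous-modulation re-timing: a pulse arriving off-center
# in its bit slot is pulled back toward the modulator transmission peak.
# Pulse width, clock rate, and modulation depth are assumed values.
import numpy as np

t = np.linspace(-12.5e-12, 12.5e-12, 5001)   # one 40 GHz bit slot (seconds)
T_clock, tau, depth = 25e-12, 3e-12, 0.6     # clock period, pulse width, mod. depth

def modulator(t):
    """Clock-synchronous transmission, peaked at the bit-slot center t = 0."""
    return 1.0 - 0.5 * depth * (1.0 - np.cos(2 * np.pi * t / T_clock))

for offset in (0.0, 1e-12, 3e-12):
    pulse = np.exp(-(((t - offset) / tau) ** 2))  # mistimed intensity profile
    out = pulse * modulator(t)
    centroid = np.sum(t * out) / np.sum(out)      # center of gravity after the gate
    print(f"timing offset {offset*1e12:4.1f} ps -> centroid {centroid*1e12:5.2f} ps")
```

On its own the modulator only attenuates the late or early side of the pulse; it is the subsequent soliton propagation that converts this asymmetric loss into an actual position correction, which is why SM-based regeneration intrinsically requires nonlinear propagation in the following trunk fiber.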

Conclusion

In summary, optical solutions for Signal Regeneration present many key advantages. They are the only solutions to date that can possibly ensure WDM compatibility of the regeneration function (mostly 2R related). Such optical devices clearly exhibit the best 2R regeneration performance (with respect to O/E solutions) as a result of their moderately nonlinear transfer function (which, in turn, can be considered a drawback in 3R applications), but the optimum configuration is still to be clearly derived and identified depending upon the system application. Optics also makes it possible to foresee and possibly target ultrafast applications above 40 Gbit/s for signal regeneration, if needed. Among the current drawbacks, one

should mention the relative lack of wavelength/format flexibility of these solutions (compared to O/E solutions). It is complex or difficult to restore the input wavelength, to address any C-band wavelength at the output of the device, or to successfully regenerate modulation formats other than RZ. In that respect, investigations should be conducted to derive new optical solutions capable of processing more advanced modulation formats at 40 Gbit/s. Finally, the fact that the nonlinear transfer function of the optical gate is in general triggered by the instantaneous power of the input signal also turns out to be a drawback, since it requires control circuitry. The issue of the cost (footprint, power consumption, etc.) of these solutions, compared to O/E ones, is still open. In this respect, purely optical solutions incorporating all-optical clock recovery, the performance of which is still to be technically assessed, are of high interest for reducing costs. Complete integration of an all-optical 2R/3R regenerator, or of such regenerators in parallel, onto a single semiconductor chip should also contribute to making all-optical solutions cost-attractive, even though acceptable performance of such fully integrated devices is still to be demonstrated. From today’s status concerning the two alternative approaches to in-line regeneration (O/E or all-optical), it is safe to say that the choice between either solution will be primarily dictated by both engineering and economic considerations. It will result from a tradeoff between overall system performance, system complexity and reliability, availability, time-to-market, and rapid returns from the technology investment.

See also

Interferometry: Overview. Optical Amplifiers: Semiconductor Optical Amplifiers. Optical Communication Systems: Wavelength Division Multiplexing.


Further Reading

Bigo S, Leclerc O and Desurvire E (1997) All-optical fiber signal processing for soliton communications. IEEE Journal of Selected Topics in Quantum Electronics 3(5): 1208–1223.
Dagens B, Labrousse A, Fabre S, et al. (2002) New modular SOA-based active–passive integrated Mach–Zehnder interferometer and first standard mode 40 Gb/s all-optical wavelength conversion on the C-band. Paper PD 3.1, Proceedings of ECOC 2002, Copenhagen.
Durhuus T, Mikkelsen B, Joergensen C, Danielsen SL and Stubkjaer KE (1996) All-optical wavelength conversion by semiconductor optical amplifiers. Journal of Lightwave Technology 14(6): 942–954.
Lavigne B, Guerber P, Brindel P, Balmefrezol E and Dagens B (2001) Cascade of 100 optical 3R regenerators at 40 Gbit/s based on all-active Mach–Zehnder interferometers. Paper We.F.2.6, Proceedings of ECOC 2001, Amsterdam.
Leclerc O, Lavigne B, Chiaroni D and Desurvire E (2002) All-optical regeneration: Principles and WDM implementation. In: Kaminow IP and Li T (eds) Optical Fiber Telecommunications IVA, chap. 15, pp. 732–784. San Diego: Academic Press.

Leuthold J, Raybon G, Su Y, et al. (2002) 40 Gbit/s transmission and cascaded all-optical wavelength conversion over 1,000,000 km. Electronics Letters 38(16): 890–892.
Öhlén P and Berglind E (1997) Noise accumulation and BER estimates in concatenated nonlinear optoelectronic repeaters. IEEE Photonics Technology Letters 9(7): 1011–1013.
Otani T, Suzuki M and Yamamoto S (2001) 40 Gbit/s optical 3R regenerator for all-optical networks. Paper We.F.2.1, Proceedings of ECOC 2001, Amsterdam.
Oudar J-L, Aubin G, Mangeney J, et al. (2004) Ultra-fast quantum-well saturable absorber devices and their application to all-optical regeneration of telecommunication optical signals. Annales des Télécommunications 58(11–12): 1667–1707.
Sartorius B, Mohrle M, Reichenbacher S, et al. (1997) Dispersive self-Q-switching in self-pulsating DFB lasers. IEEE Journal of Quantum Electronics 33(2): 211–218.
Ueno Y, Nakamura S, Sasaki J, et al. (2001) Ultrahigh-speed all-optical data regeneration and wavelength conversion for OTDM systems. Paper Th.F.2.1, Proceedings of ECOC 2001, Amsterdam.

B

BABINET’S PRINCIPLE

B D Guenther, Duke University, Durham, NC, USA

© 2005, Elsevier Ltd. All Rights Reserved.

In the development of the theory of diffraction, the diffraction field is due to a surface integral, the Fresnel integral, but no restrictions are imposed on the choice of the surface over which the integration must be performed. This fact leads to a very useful property called Babinet’s principle, first stated by Jacques Babinet (1794–1872) in 1837 for scalar waves. We will discuss only the scalar Babinet’s principle; discussion of the rigorous vector formulation can be found in a number of books on electromagnetic theory. To introduce Babinet’s principle, we label a plane separating the source, S, and observation point, P, as Σ. If no obstructions are present, a surface integration over Σ yields the light distribution at P. If we place an opaque obstruction in this plane with a clear aperture, Σ1, then the field at P is given by integrating over only Σ1; contributions from Σ, outside of Σ1, are zero since the obstruction is opaque. We may define an aperture, Σ2, as complementary to Σ1 if the obstruction is constructed by replacing the transparent regions of Σ, i.e., Σ1, by opaque surfaces and the opaque regions of Σ by clear apertures. Figure 1 shows two complementary obstructions, where the shaded region indicates an opaque surface. The surface integral over Σ generates the field, $\tilde{E}$, in the absence of any obstruction. If obstruction Σ1 is present then the diffraction field is $\tilde{E}_1$, obtained by integrating over Σ1. According to Babinet’s principle, the diffraction field due to obstruction Σ2 must be

$$\tilde{E}_2 = \tilde{E} - \tilde{E}_1$$

We will look at examples of the application of the principle for both Fraunhofer and Fresnel diffraction.

Fraunhofer Diffraction

The electric field due to Fraunhofer diffraction is given by

$$E_P(\omega_x, \omega_y) = \frac{i a\, e^{-ikR_0}}{\lambda R_0} \iint_{\Sigma} f(x, y)\, e^{-i(\omega_x x + \omega_y y)}\, \mathrm{d}x\, \mathrm{d}y$$

where a is the amplitude of the incident plane wave and $R_0$ is the distance from the obstruction to the observation point. We have defined the spatial frequencies $\omega_x$ and $\omega_y$ by the equations

$$\omega_x = kL = -\frac{2\pi\xi}{\lambda R_0} \qquad \omega_y = kM = -\frac{2\pi\eta}{\lambda R_0}$$

This surface integral is an approximation of the Huygens–Fresnel integral and can be identified as a Fourier transform of the aperture transmission function, f(x, y), in two dimensions. For our discussion we will consider only one dimension and ignore the parameters in front of the integral. Assume that a(x) is the amplitude transmission of an aperture in an obstruction, b(x) is the amplitude transmission function of the complementary obstruction, and g is the amplitude of the wave when no


Figure 1 An aperture Σ1 shown by the unshaded region and its complement Σ2. Reproduced with permission from Guenther R (1990) Modern Optics. New York: Wiley. Copyright © 1990 John Wiley & Sons.


obstruction is present. The Fourier transforms of the one-dimensional amplitude transmission functions of the two apertures are equal to the Fraunhofer diffraction patterns that will be generated by the two apertures:

$$A(k) = \int_{-\infty}^{\infty} a(x)\, e^{-ikx}\, \mathrm{d}x \qquad B(k) = \int_{-\infty}^{\infty} b(x)\, e^{-ikx}\, \mathrm{d}x$$

With no aperture present, the far-field amplitude is

$$G(k) = g \int_{-\infty}^{\infty} e^{-ikx}\, \mathrm{d}x$$

Figure 2 Geometry for the analysis of Fresnel diffraction of a circular aperture. Reproduced with permission from Guenther R (1990) Modern Optics. New York: Wiley. Copyright q 1990 John Wiley & Sons.

Babinet’s principle states that BðxÞ ¼ G 2 AðxÞ must be the Fraunhofer diffraction field of the complementary aperture. We may rewrite this equation for the diffraction field as BðkÞ ¼ GdðkÞ þ AðkÞeip

½1

The first term on the right of the Fraunhofer diffraction field for the complementary obstruction [1] is located at the origin of the observation plane and is proportional to the amplitude of the unobstructed wave. The second term in the equation for the Fraunhofer diffraction pattern [1] is identical to the Fraunhofer diffraction pattern of the original aperture, except for a constant phase. Thus the Fraunhofer diffraction from the two complementary apertures are equal, except for a constant phase and the bias term at the origin. Physically this means that the diffraction intensity distributions of complementary apertures will be identical but their brightness will differ!
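The identity in [1] is easy to check numerically. The following short Python sketch (illustrative only; the slit width and grid are arbitrary choices) compares the far-field intensity patterns of a one-dimensional slit and its complementary screen via discrete Fourier transforms:

```python
import numpy as np

# Numerical check of Babinet's principle for 1D apertures.
# a(x): slit transmission; b(x) = 1 - a(x) is the complementary screen.
N = 4096
x = np.linspace(-50.0, 50.0, N)           # transverse coordinate (arbitrary units)
a = (np.abs(x) <= 2.0).astype(float)      # slit of half-width 2
b = 1.0 - a                               # complementary obstruction

# Far-field (Fraunhofer) amplitudes are Fourier transforms of the
# transmission functions: A(k), B(k).
A = np.fft.fftshift(np.fft.fft(a))
B = np.fft.fftshift(np.fft.fft(b))

# Away from the bias term at k = 0 the two intensity patterns coincide.
k0 = N // 2                               # index of zero spatial frequency
mask = np.ones(N, dtype=bool)
mask[k0 - 1:k0 + 2] = False               # exclude the bins around the origin
print(np.allclose(np.abs(A[mask])**2, np.abs(B[mask])**2))  # True
```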

Fresnel Diffraction

We can calculate the Fresnel diffraction from an opaque disk by applying Babinet's principle to the diffraction pattern calculated for a circular aperture. We assume that a circular aperture of radius a is illuminated by a point source a distance Z₀ from the aperture. We observe the transmitted wave at the point P, located a distance Z from the aperture (see Figure 2).

Figure 2 Geometry for the analysis of Fresnel diffraction of a circular aperture. Reproduced with permission from Guenther R (1990) Modern Optics. New York: Wiley. © 1990 John Wiley & Sons.

The Fresnel diffraction integral is

$$\tilde{E}_P = \frac{ia}{\lambda\rho D}\, e^{-ikD}\int\!\!\int f(x, y)\, e^{-\frac{ik}{2\rho}\left[(x - x_0)^2 + (y - y_0)^2\right]}\, dx\, dy \qquad [2]$$

where the distance between the source and observation point is

$$D = Z + Z_0 + \frac{(\xi - x_s)^2 + (\eta - y_s)^2}{2(Z + Z_0)}$$

the parameter ρ is defined by

$$\frac{1}{Z} + \frac{1}{Z_0} = \frac{1}{\rho}$$

and the coordinates of the stationary point are

$$x_0 = \frac{Z_0\xi + Z x_s}{Z + Z_0}, \qquad y_0 = \frac{Z_0\eta + Z y_s}{Z + Z_0}$$

The parameters of the integral are simplified by the geometry selected for Figure 2. Both the observation point and the source are located on the axis of symmetry of the aperture, which lies along the z-axis of a cylindrical coordinate system; thus

$$x_0 = y_0 = \xi = \eta = 0 \qquad \text{and} \qquad D = Z_0 + Z$$

The approximation for the distance from the source and observation point to the aperture yields

$$R + R_0 \approx Z + Z_0 + \frac{r^2}{2\rho} = D + \frac{r^2}{2\rho}$$

where $r^2 = x^2 + y^2$. The cylindrical coordinate version of [2] is then

$$\tilde{E}_P = \frac{ia}{\lambda\rho D}\, e^{-ikD}\int_0^{a}\!\!\int_0^{2\pi} f(r, \varphi)\, e^{-\frac{ikr^2}{2\rho}}\, r\, dr\, d\varphi \qquad [3]$$

We assume that the transmission function of the aperture is a constant, f(r, φ) = 1, to make the integral as simple as possible. After performing the integration over the angle φ, [3] can be

rewritten as

$$\tilde{E}_{Pa} = \frac{ia\pi}{D}\, e^{-ikD}\int_0^{a^2/\rho\lambda} e^{-i\pi r^2/\rho\lambda}\, d\!\left(\frac{r^2}{\rho\lambda}\right) \qquad [4]$$

Performing the integration of [4] results in the Fresnel diffraction amplitude

$$\tilde{E}_{Pa} = \frac{a}{D}\, e^{-ikD}\left(1 - e^{-i\pi a^2/\rho\lambda}\right) \qquad [5]$$

The intensity of the diffraction field is

$$I_{Pa} = 2I_0\left(1 - \cos\frac{\pi a^2}{\rho\lambda}\right) = 4I_0\sin^2\frac{\pi a^2}{2\rho\lambda} \qquad [6]$$

[6] predicts a sinusoidal variation in intensity as the observation point moves toward the aperture. Babinet's principle can now be used to evaluate Fresnel diffraction by an opaque disk of the same size as the circular aperture, i.e., of radius a. The field for the disk is obtained by subtracting the field diffracted by the circular aperture from the field of an unobstructed spherical wave, Ẽ_P:

$$\tilde{E}_{Pd} = \tilde{E}_P - \tilde{E}_{Pa} = \frac{a}{D}\, e^{-ikD} - \tilde{E}_{Pa} = \frac{a}{D}\, e^{-ikD}\, e^{-i\pi a^2/\rho\lambda} \qquad [7]$$

The intensity is

$$I_{Pd} = I_0$$

The result of the application of Babinet's principle is the conclusion that, at the center of the geometrical shadow cast by the disk, there is a bright spot with the same intensity as would exist if no disk were present. This is Poisson's spot. The intensity of this spot is independent of the choice of the observation point along the z-axis.
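A quick numerical check of eqns [5]–[7] (a minimal sketch with arbitrary, illustrative parameter values) confirms the oscillating on-axis aperture intensity and the constant disk intensity:

```python
import numpy as np

# On-axis Fresnel intensities behind a circular aperture and the
# complementary opaque disk (eqns [5]-[7]), in units where I0 = 1.
wavelength = 633e-9        # He-Ne wavelength (m), an arbitrary choice
a = 1e-3                   # aperture/disk radius (m)
Z0 = 1.0                   # source-to-aperture distance (m)

Z = np.linspace(0.5, 5.0, 1000)        # observation distances (m)
rho = 1.0 / (1.0 / Z + 1.0 / Z0)       # 1/Z + 1/Z0 = 1/rho

phase = np.pi * a**2 / (rho * wavelength)
I_aperture = 2.0 * (1.0 - np.cos(phase))   # eqn [6], oscillates with Z
E_disk = np.exp(-1j * phase)               # eqn [7], up to (a/D) e^{-ikD}
I_disk = np.abs(E_disk)**2                 # identically 1: Poisson's spot

print(I_aperture.min(), I_aperture.max())  # approximately 0 ... 4*I0
print(np.allclose(I_disk, 1.0))            # True
```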

See also

Diffraction: Fraunhofer Diffraction; Fresnel Diffraction.

Further Reading

Abramowitz M and Stegun IA (1964) Handbook of Mathematical Functions. NBS Applied Mathematics Series 55. Washington, DC: National Bureau of Standards.
Born M and Wolf E (1959) Principles of Optics. New York: Pergamon Press.
Guenther R (1990) Modern Optics. New York: Wiley.
Haus HA (1984) Waves and Fields in Optoelectronics. Englewood Cliffs, NJ: Prentice-Hall.
Hecht E (1998) Optics, 3rd edn. Reading, MA: Addison-Wesley.
Klein MV (1970) Optics. New York: Wiley.
Rossi B (1957) Optics. Reading, MA: Addison-Wesley.
Sommerfeld A (1964) Optics. New York: Academic Press.

C

CHAOS IN NONLINEAR OPTICS

R G Harrison and W Lu, Heriot-Watt University, Edinburgh, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Instabilities in laser emission, notably in the form of spontaneous coherent pulsations, have been observed almost since the first demonstration of laser action. Indeed, the first laser, operated in 1960 by Maiman, produced noisy spiked output even under conditions of quasi-steady excitation and provided the early impetus for studies of such effects. Subsequent theoretical efforts towards understanding these phenomena continued up to the 1980s at a modest level, due in part to the wide scope of alternative areas of fertile investigation provided by lasers during this period. However, since the 1990s, there has been a major resurgence of interest in this area. This has been due to profound mathematical discoveries over this period, which have revolutionized our understanding of classical dynamical systems. It is now clear that many systems containing some form of nonlinearity are dynamically unstable and even chaotic, and that such behavior is deterministic. Further, the discovery that chaos evolves through particular routes with well-defined scenarios, and that such routes are universal, has stimulated experimentalists to search for physical systems that exhibit these properties. These phenomena have now been observed in areas as diverse as fluid flow, chemical reactions, population ecology, and superconductor devices. The laser, perhaps the most recent newcomer to the field, is a paradigm for such investigation, owing to its simplicity both in construction and in the mathematics that describe it, and already a wide range of these phenomena have been observed. Along with a proliferation of new observations, many of the instabilities known to occur in these systems are now being reinterpreted in light of our

new insight in this area. From this, fascinating new concepts of control and synchronization of chaos have emerged and spawned new fields of applications, not least in secure communication. In this article we introduce the general principles of dynamical instabilities, chaos, and control in lasers. For in-depth study, the reader is referred to texts and review articles cited in the Further Reading section at the end of this article.

Laser Physics

Our conventional understanding of laser physics is concerned with how cooperative order evolves from randomness. This transition is explained by first considering the ensemble of lasing atoms in thermal equilibrium. Every atom executes a small motion, giving rise to an oscillating electric dipole, described by linear dynamics. For larger motions, the atomic dipoles interfere with each other's motion and, beyond a particular threshold, the motion becomes 'co-operative' or ordered over a long range. The role of the electromagnetic field in the laser cavity in ordering the induced atomic dipoles is collectively described as giving rise to a macroscopic polarization, the magnitude of which depends on the number of dipoles (excited atoms). The dynamical interplay between the cavity field amplitude (E) as one variable and the atomic material variables of polarization (P) and population (D) of excited atoms, all of which are in general complex quantities, provides a full description of lasing action. For the simplest laser, a single-mode two-level system with a homogeneously broadened gain lasing at resonance, these reduce to just three real variables, the Maxwell–Bloch equations:

$$\dot{E} = -\kappa E + \kappa P \qquad [1]$$

$$\dot{P} = \gamma_\perp ED - \gamma_\perp P \qquad [2]$$

$$\dot{D} = \gamma_\parallel(\lambda + 1) - \gamma_\parallel D - \gamma_\parallel\lambda EP \qquad [3]$$


where κ is the cavity decay rate, γ⊥ is the decay rate of the atomic polarization, γ∥ is the decay rate of the population inversion, and λ is the pumping parameter. Described in this way, the laser acts as a single oscillator in much the same way as a mechanical nonlinear oscillator, where laser output represents damping of the oscillator and excitation of the atoms is the driving mechanism that sustains lasing (oscillation). For ordered or coherent emission, it is necessary for one or both of the material variables (which are responsible for further lasing emission) to respond sufficiently fast to ensure a phase correlation with the existing cavity field. This is readily obtained in many typical lasers with output mirrors of relatively high reflectivity, since the field amplitude within the laser cavity will then vary slowly compared to the fast material variables, which may then be considered through their equilibrium values. This situation, commonly referred to as adiabatic elimination of fast variables, reduces the dynamics of lasing action to that of one (or two) variables, the field (and population), all others being forced to adapt constantly to the slowly varying field variable. Our familiar understanding of DC laser action presupposes such conditions to hold. However when, for example, the level of excitation of the atoms is increased beyond a certain value, i.e., the second laser threshold, all three dynamical variables may have to be considered, which satisfies the minimum requirement of three degrees of freedom (variables) for a system to display chaos. So this simple laser is capable of chaotic motion, for which the emission is aperiodic in time and has a broadband spectrum. Prediction of such behavior was initially made by Haken in 1975 through establishing the mathematical equivalence of the Maxwell–Bloch equations to those derived earlier by Lorenz describing chaotic motion in fluids.
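The onset of such chaotic emission can be explored by integrating eqns [1]–[3] directly. The following Python sketch (parameter values are illustrative choices in the 'bad-cavity', strongly pumped regime, not values taken from this article) produces the aperiodic intensity pulsations discussed above:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Single-mode resonant Maxwell-Bloch (Lorenz-Haken) equations [1]-[3].
# Illustrative parameters only: kappa > g_perp + g_par ("bad cavity"),
# pump lam chosen well above the second laser threshold.
kappa, g_perp, g_par = 3.0, 1.0, 0.5   # decay rates (units of g_perp)
lam = 18.0                              # pumping parameter

def maxwell_bloch(t, y):
    E, P, D = y
    dE = -kappa * E + kappa * P
    dP = g_perp * E * D - g_perp * P
    dD = g_par * (lam + 1.0) - g_par * D - g_par * lam * E * P
    return [dE, dP, dD]

sol = solve_ivp(maxwell_bloch, (0.0, 200.0), [1e-3, 0.0, 1.0], max_step=0.01)
intensity = sol.y[0] ** 2               # detected quantity ~ |E|^2
print(intensity[-10:])                  # irregular, never-repeating pulsations
```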

Nonlinear Dynamics and Chaos in Lasers

In general there is no analytical approach for solving nonlinear systems such as the Maxwell–Bloch equations. Instead, solutions are obtained by numerical means and analyzed through geometric methods originally developed by Poincaré as early as 1892. This involves the study of the topological structures of the dynamical trajectories in the phase space of the variables describing the system. If an initial condition of a dissipative nonlinear dynamical system, such as a laser, is allowed to evolve for a long time, the system, after all the transients have died out, will eventually approach a restricted region of the phase space called an attractor. A dynamical

system can have more than one attractor, in which case different initial conditions lead to different types of long-time behavior. The simplest attractor in phase space is a fixed point, which is a solution with just one set of values for its dynamical variables; the nonlinear system is attracted towards this point and stays there, giving a DC solution. For other control conditions the system may end up making a periodic motion; the attractor of this motion is called a limit cycle. However, when the operating conditions exceed a certain critical value, the periodic motion of the system breaks down into a more complex dynamical trajectory, which never repeats. This motion represents a third kind of attractor in phase space called a chaotic or strange attractor. Figure 1 shows a sequence of trajectories of the Lorenz equations in the phase space of (x, y, z) on increasing the value of one of the control parameters; this corresponds to (E, P, D) in the Maxwell–Bloch equations on increasing the laser pump. For a control setting near zero, all trajectories approach stable equilibrium at the origin, the topological structure of the basin of attraction being hyperbolic about zero (Figure 1a). For the laser equations, this corresponds to operation below the lasing threshold, for which the magnitude of the control parameter (laser gain) is insufficient to produce lasing. As the control parameter is increased, the basin lifts at its zero point to create an unstable fixed point there, but with two additional fixed points located in the newly formed troughs on either side of the zero point, which is now a saddle point. This is illustrated in Figure 1b, where the system eventually settles to one or other of the two new and symmetrical fixed points, depending on the initial conditions. This corresponds to DC or constant lasing as defined by the parameter values of one or other of these points, which are indistinguishable. Pictorially, think of a conventional saddle shape comprising a hyperbolic and an inverted hyperbolic form in mutually perpendicular planes, connected tangentially at the origin. With the curve of the inverted hyperbola turned up at its extremity, and filling in the volume with similar profiles that allow the two hyperbolic curves to merge into a topological volume, one sees that a ball placed at the origin is constrained to move most readily down either side of the inverted hyperbolas into one or other of the troughs formed by this volume. For the Lorenz system, chaotic behavior occurs at larger values of the control parameter, when all three fixed points become saddles. Since none of the equilibrium points is now attracting, the behavior of the system cannot be a steady motion. Although perhaps difficult to visualize topologically, it is then possible to find a region in this surface enclosing all three points and

large enough so that no trajectories leave the region. Thus, all initial conditions outside the region evolve into the region and remain inside from then on. A corresponding chaotic trajectory is shown in Figure 1c.

Figure 1 Trajectories in the three-dimensional phase space of the Lorenz attractor on increasing one of the control parameters. (Reproduced with permission from Thompson JMT and Stewart HB (1987) Nonlinear Dynamics and Chaos. New York: John Wiley.)

A point outwardly spirals from the proximity of one of the new saddle points until the motion brings it under the influence of the symmetrically placed saddle, the trajectory then being towards the center of this region, from where outward spiraling again occurs. The spiraling out and switching over continues forever, though the trajectory never intersects itself. In this case, arbitrarily close initial conditions lead to trajectories which, after a sufficiently long time, diverge widely. Since a truly exact assignment of the initial condition is never possible,

even numerically, a solution comprising several such trajectories therefore evolves and, as a consequence, long-term predictability is impossible. This is in marked contrast to the fixed point and limit cycle attractors, which settle down to the same solutions. A recording of one of these variables in time, say for a laser the output signal amplitude (in practice recorded as signal intensity, proportional to the square of the field amplitude), then gives oscillatory emission of increasing intensity with sudden discontinuities (resulting from flipping from one saddle to the other in Figure 1c), as expected. While the Lorenz–Haken model is attractive for its relative simplicity, many practical lasers cannot be reduced to this description. Nevertheless, the

underlying topology of the trajectory of solutions in phase space for these systems is often found to be relatively simple and quite similar, as for the three-level models descriptive of optically pumped lasers. An experimental example of such behavior for a single-mode far-infrared molecular laser is shown in Figure 2.

Figure 2 Lorenz-type chaos in the NH3 laser (emitting at 81 μm), optically pumped by a N2O laser.

In other lasers, for which there is adiabatic elimination of one or other of the fast variables, chaotic behavior is precluded if, as a consequence, the number of variables of the system is reduced to fewer than three. Indeed, this is the case for many practical laser systems. For these systems, the addition of an independent external control parameter, such as cavity length or loss modulation, has been extensively used as a means to provide the extra degrees of freedom. Examples include gas lasers with saturable absorbers, solid state lasers with loss modulation, and semiconductor lasers with external cavities, to name but a few. In contrast, for multimode rather than single-mode lasers, intrinsic modulation of the inversion (or photon flux) by multimode parametric interaction ensures the additional degrees of freedom. Furthermore, when the field is detuned from gain center, the dynamical variables E, P, D are complex, providing five equations for single-mode systems, which is more than sufficient to yield deterministic chaos for suitable parameter values. Also of significance is the remarkably low threshold found for the generation of instabilities and chaos in single-mode laser systems in which the gain is inhomogeneously broadened, an example being the familiar He–Ne laser. Erratic and aperiodic temporal behavior of any of the system's variables implies a corresponding continuous spectrum for its Fourier transform, which is thus a further signature of chaotic motion. Time series, power spectra, and routes to chaos collectively provide evidence of deterministic behavior. Of the wide range of possible routes to chaos, three have emerged as particularly common and are frequently observed in lasers. These are the period doubling, intermittency, and two-frequency scenarios. In the first, a solution which is initially stable is found to

oscillate, the period of which successively doubles at distinct values of the control parameter. This continues until the number of fixed points becomes infinite at a finite parameter value, where the variation in time of the solution becomes irregular. For the intermittency route, a signal that behaves regularly in time becomes interrupted by statistically distributed periods of irregular motion, the average number of which increases with the external control parameter until the condition becomes chaotic. The two-frequency route is more readily identified with early concepts of turbulence, considered to be the limit of an infinite sequence of instabilities (Hopf bifurcation) evolving from an initial stable solution each of which creates a new basic frequency. It is now known that only two or three instabilities (frequencies) are sufficient for the subsequent generation of chaotic motion.

Applications of Chaos in Lasers

It has been accepted as axiomatic since the discovery of chaos that chaotic motion is in general neither predictable nor controllable. It is unpredictable because a small disturbance will produce an exponentially growing perturbation of the motion. It is uncontrollable because small disturbances lead only to other forms of chaotic motion and not to any stable and predictable alternative. It is, however, this very sensitivity to small perturbations that has more recently been used to stabilize and control chaos, essentially using chaos to control itself. Among the many methods proposed in the late 1980s, a feedback control approach was proposed by Ott, Grebogi, and Yorke (OGY), in which tiny feedback was used to stabilize unstable periodic orbits or fixed points of chaotic attractors. This control strategy can best be understood from a schematic of the OGY control algorithm for stabilizing a saddle point P*, as shown in Figure 3. Curved trajectories follow a stable (unstable) manifold towards (away from) the saddle point. Without perturbation of a parameter, the starting state, s1, would evolve to the state s2.


Figure 3 The OGY control strategy for stabilizing an orbit in a chaotic attractor.

The effect of changing a parameter of the system is depicted as shifting states near P* along the solid black arrows, whereas the combination of the unperturbed trajectory and the effect of the perturbation is to bring the state to a point, s3, on the stable manifold. Once on the stable manifold, the trajectory naturally tends towards the desired point. This algorithm has been successfully applied in numerous experimental systems without a priori modeling of these systems, examples being in cardiology, electronics, and lasers. Based on the concept of feedback control, various other approaches have been developed, where emphasis has been given to algorithms which are more readily implemented in practical systems, in particular that utilizing occasional proportional feedback (OPF). These pioneering studies have since inspired prolific activity, both in theory and experiment, on the control of chaos across many disciplines, opening up possibilities of utilizing chaos in many diverse systems. In optics, Roy et al. first demonstrated dynamical control of an autonomously chaotic and high-dimensional laser system on microsecond time-scales. The laser, a diode-pumped solid-state Nd:YAG system with an intracavity KTP doubling crystal, exhibited chaotic behavior arising from coupling of the longitudinal modes through nonlinear sum-frequency generation. To control the system, the total laser output was sampled within a window of selected offset and width. A signal proportional to the deviation of the sampled intensity from the center of the window was generated and applied to perturb the driving current of the diode laser. The sampling frequency is related to the relaxation oscillation frequency of the system. This control signal repeatedly attempts to bring the system closer to an unstable periodic orbit that is embedded


in the chaotic attractor, resulting in a realization of the periodic orbit. By adjusting the frequency of the feedback signal they observed the basic (relaxation oscillation) and many higher-order periodic waveforms of the laser intensity. In a subsequent experiment, Gills et al. showed that an unstable steady state of this laser could also be stabilized by the OPF control technique and tracked with excellent stability to higher output power with increase of the pump excitation. Another area to receive considerable attention in the last few years is that of synchronization of chaos. It might be expected that chaotic systems would defy synchronization, because two such identical systems started at nearly the same initial conditions have dynamical trajectories that quickly become uncorrelated. However, it has now been widely demonstrated that when these systems are linked, their chaotic trajectories converge to be the same and remain in step with each other. Further, such synchronization is found to be structurally stable and does not require the systems to be precisely identical. Not surprisingly, these findings have attracted interest from the telecommunications community, the natural masking of information by chaotic fluctuations offering a means to a certain degree of security. In this new approach, a chaotic carrier of information can be considered as a generalization of the more traditional sinusoidal carrier. In communication systems that use chaotic waveforms, information can be recovered from the carrier using a receiver synchronized, or tuned, to the dynamics of the transmitter. Optical systems are particularly attractive since they display fast dynamics, offering the possibility of communication at bandwidths of a hundred megahertz or higher. Van Wiggeren and Roy first demonstrated data transmission rates of 10 Mbits per second using erbium-doped fiber ring lasers. These lasers are particularly well suited for communication purposes since their lasing wavelength is close to the minimum-loss wavelength in optical fiber. Figure 4 shows a schematic of this system. The tiny message, on the order of one thousandth of the carrier intensity, is encoded in the transmitter. The receiver tends to synchronize its behavior to the chaotic part of the transmitted wave (but not the message). Subtracting the waveform created in the receiver from the transmitted signal recovers the tiny message. In principle, it is possible to communicate information at ultrahigh data rates with this scheme because the chaotic dynamics in the ring laser have a very large spectral width. In a later experiment they showed that a receiver can recover information at 126 Mbits/sec from the chaotic carrier.


Figure 4 Schematic of communication with chaotic erbium-doped fiber amplifiers (EDFAs). Injecting a message into the transmitter laser folds the data into the chaotic frequency fluctuations. The receiver reverses this process, thereby recovering a high-fidelity copy of the message. (Reproduced with permission from Gauthier DJ (1998) Chaos has come again. Science 279: 1156–1157.)
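The OGY scheme described above is most easily illustrated on a simple chaotic map rather than the full laser equations. The Python sketch below (the logistic map and all numerical values are illustrative stand-ins, not taken from this article) stabilizes an unstable fixed point with tiny, occasional parameter perturbations, in the spirit of the OPF experiments:

```python
import numpy as np

# OGY-style control: stabilize the unstable fixed point of the chaotic
# logistic map x -> r*x*(1 - x) using small perturbations of r.
r0 = 3.9                              # nominal parameter (chaotic regime)
x_star = 1.0 - 1.0 / r0               # unstable fixed point
lam = r0 * (1.0 - 2.0 * x_star)       # local multiplier df/dx at x_star
dfdr = x_star * (1.0 - x_star)        # sensitivity df/dr at x_star

x = 0.3
for _ in range(2000):
    dr = 0.0
    if abs(x - x_star) < 0.01:                 # act only near the target point
        dr = -lam * (x - x_star) / dfdr        # linearized kick onto x_star
        dr = float(np.clip(dr, -0.1, 0.1))     # keep the perturbation small
    x = (r0 + dr) * x * (1.0 - x)
print(abs(x - x_star))                         # typically ~1e-12 once captured
```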

See also

Fourier Optics. Optical Amplifiers: Erbium-Doped Fiber Amplifiers for Lightwave Systems. Polarization: Introduction.

Further Reading

Abraham NB, Lugiato LA and Narducci LM (eds) (1985) Instabilities in active optical media. Journal of the Optical Society of America B 2: 1–272.
Arecchi FT and Harrison RG (1987) Instabilities and Chaos in Quantum Optics. Synergetics Series 34, pp. 1–253. Berlin: Springer Verlag.
Arecchi FT and Harrison RG (eds) (1994) Selected Papers on Optical Chaos. SPIE Milestone Series, vol. MS 75. SPIE Optical Engineering Press.

Gauthier DJ (1998) Chaos has come again. Science 279: 1156–1157.
Gills Z, Iwata C and Roy R (1992) Tracking unstable steady states: extending the stability regime of a multimode laser system. Physical Review Letters 69: 3169–3172.
Haken H (1985) Light. In: Laser Light Dynamics, vol. 2. Amsterdam: North-Holland.
Harrison RG (1988) Dynamical instabilities and chaos in lasers. Contemporary Physics 29: 341–371.
Harrison RG and Biswas DJ (1985) Pulsating instabilities and chaos in lasers. Progress in Quantum Electronics 10: 147–228.
Hunt ER (1991) Stabilizing high-period orbits in a chaotic system: the diode resonator. Physical Review Letters 67: 1953–1955.
Ott E, Grebogi C and Yorke JA (1990) Controlling chaos. Physical Review Letters 64: 1196–1199.


Pecora LM and Carroll TL (1990) Synchronization in chaotic systems. Physical Review Letters 64: 821–824.
Roy R, et al. (1992) Dynamical control of a chaotic laser: experimental stabilization of a globally coupled system. Physical Review Letters 68: 1259–1262.


VanWiggeren GD and Roy R (1998) Communication with chaotic lasers. Science 279: 1198–1200.
VanWiggeren GD and Roy R (1998) Optical communication with chaotic waveforms. Physical Review Letters 81: 3547–3550.

CHEMICAL APPLICATIONS OF LASERS

Contents

Detection of Single Molecules in Liquids
Diffuse-Reflectance Laser Flash Photolysis
Laser Manipulation in Polymer Science
Nonlinear Spectroscopies
Photodynamic Therapy of Cancer
Pump and Probe Studies of Femtosecond Kinetics
Time-Correlated Single-Photon Counting
Transient Holographic Grating Techniques in Chemical Dynamics

Detection of Single Molecules in Liquids

A J de Mello, J B Edel and E K Hill, Imperial College of Science, Technology and Medicine, London, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

A significant challenge facing experimentalists in the physical and biological sciences is the detection and identification of single molecules. The ability to perform such sensitive and selective measurements is extremely valuable in applications such as DNA analysis, immunoassays, environmental monitoring, and forensics, where small sample volumes and low analyte concentrations are the norm. More generally, most experimental observations of physical systems provide a measurement of ensemble averages, and yield information only on mean properties. In contrast, single molecule measurements permit observation of the interactions and behavior of a heterogeneous population in real time. Over the past few years a number of techniques with sufficient sensitivity have been developed to detect single molecules. Scanning probe microscopies (most notably scanning tunneling microscopy and atomic force microscopy) have been used to great effect in the analysis of surface-bound species, but for the detection of single molecules in liquids, optical methods, incorporating the measurement of absorption or emission processes, have proved most successful.

The Absorption–Emission Cycle

The key concept underlying most emissive approaches to single molecule detection is that a single molecule can be cycled repeatedly between its ground state and an excited electronic state to yield multiple photons. The process can be understood by reference to Figure 1. Fluorescence emission in the condensed phase can be described using a four-step cycle. Excitation from a ground electronic state to an excited state is followed by rapid (internal) vibrational relaxation. Subsequently, radiative decay to the ground state is observed as fluorescence emission and is governed by the excited-state lifetime. The final stage is internal relaxation back to the original ground state. Under saturating illumination, the rate-limiting step for this cycle is governed by the fluorescence lifetime (τf), which is typically of the order of a few nanoseconds. If a single molecule diffuses through an illuminated zone (e.g., the focus of a laser beam) it may reside in that

region for several milliseconds. The rapid photon absorption–emission cycle may therefore be repeated many times during the residence period, resulting in a burst of fluorescence photons as the molecule transits the beam.

Figure 1 Schematic illustration of the molecular absorption–emission cycle and timescales for the component processes. Competing processes that may reduce the ultimate photon yield are also shown.

The burst size is limited theoretically by the ratio of the beam transit time (τr) and the fluorescence lifetime:

$$N_{\text{photons}} = \frac{\tau_r}{\tau_f} \qquad [1]$$

For typical values of τr (5 ms) and τf (5 ns), up to one million photons may be emitted by the single molecule. In practice, photobleaching and photodegradation processes limit this yield to about 10⁵ photons. Furthermore, advances in optical collection and detection technologies enable registration of about 1–5% of all photons emitted. This results in a fluorescence burst signature of up to a few thousand photons or photoelectrons. Successful single molecule detection (SMD) depends critically on the optimization of the fluorescence burst size and the reduction of background interference from the bulk solvent and impurities. Specifically, a molecule is well suited for SMD if it is efficiently excited by an optical source (i.e., possesses a large molecular absorption cross-section at the wavelength of interest), has a high fluorescence quantum efficiency (favoring radiative deactivation

of the excited state), has a short fluorescence lifetime, and is exceptionally photostable. Ionic dyes are often well suited to SMD as fluorescence quantum efficiencies can be close to unity and fluorescence lifetimes below 10 nanoseconds. For example, xanthene dyes such as Rhodamine 6G and tetramethyl-rhodamine isothiocyanate are commonly used in SMD studies. However, other highly fluorescent dyes such as fluorescein are unsuitable for such applications due to unacceptably high photodegradation rate coefficients. Furthermore, some solvent systems may enhance nonradiative processes, such as intersystem crossing, and yield significant reduction in the photon output. Structures of three common dyes suitable for SMD are shown in Figure 2.

Signal vs Background

The primary challenge in SMD is to ensure sufficient reduction in background levels to enable discrimination between signal and noise. As an example, in a 1 nM aqueous dye solution each solute molecule occupies a volume of approximately 1 fL. However, this same volume also contains in excess of 10¹⁰ solvent molecules. Despite the relatively small scattering cross-section for an individual water molecule (~10⁻²⁸ cm² at 488 nm), the cumulative scattering


signal from the solvent may swamp the desired fluorescence signal. The principal method of reducing the solvent background is to minimize the optical detection volume: the signal from a single molecule is independent of probe volume dimensions, but the background scales proportionally with the size of the detection region. Although there are several experimental approaches to SMD in solution, several factors hold common:

1. Tiny detection volumes (10⁻¹² – 10⁻¹⁵ L) are used to reduce background signals.
2. A low analyte concentration, combined with the small observation volume, ensures that less than one analyte molecule is present in the probe volume on average.
3. High-efficiency photon collection (optics) and detection maximize the proportion of the isotropic fluorescence burst that is registered.
4. Background reduction methods are employed to improve signal-to-noise ratios. These include: optical rejection of Raman and Rayleigh scatter, time-gated discrimination between prompt scatter and delayed emission, and photobleaching of the solvent immediately before detection.

Figure 2 Structures of common dye molecules suitable for SMD in solution: (a) 3,6-diamino-9-(2-carboxyphenyl)-chloride (rhodamine 110); (b) 9-(2-(ethoxycarbonyl)phenyl)-3,6-bis(ethylamino)-2,7-dimethyl chloride (rhodamine 6G); (c) 9-(2-carboxyisothiocyanatophenyl)-3,6-bis(dimethylamino)-inner salt (tetramethylrhodamine-5-(and-6)-isothiocyanate).


The minute volumes within which single molecules are detected can be generated in a variety of ways. Picoliter volumes can be defined by mutually orthogonal excitation and detection optics focused in a flowing stream. Much smaller, femtoliter probe volumes are generated using confocal microscopes. At this level, background emission is significantly reduced and high signal-to-noise ratios can be achieved. Confocal detection techniques are versatile and have been widely adopted for SMD in freely diffusing systems. Consequently, confocal methods will be discussed in detail in this article. The other general approach to performing SMD in solution involves the physical restriction of single molecules within defined volumes. Of particular note are techniques where single molecules are confined within a stream of levitated microdroplets. Droplet volumes are typically less than 1 fL and imaging of the entire microdroplet enables single molecule fluorescence to be contrasted against droplet ‘blanks’ with good signal-to-noise. Furthermore, since molecules are confined within discrete volumes, the technique can be utilized for high-efficiency molecular counting applications. More recently, spatial confinement of molecules in capillaries and microfabricated channels (with submicron dimensions) has been used to create probe volumes between 1 fL and 1 pL, and immobilized molecules on surfaces have been individually probed using wide-field microscopy with epi-illumination or evanescent wave excitation.

Single Molecule Detection using Confocal Microscopy

As previously stated, the confocal fluorescence microscope is an adaptable and versatile tool for SMD. In its simplest form, a confocal microscope is one in which a point light source, a point focus in the object plane, and a pinhole detector are all confocal with each other. This optical superposition generates superior imaging properties and permits definition of ultra-small probe volumes. The concepts behind a basic confocal microscope and its use in SMD are schematically illustrated in Figure 3. Coherent light (typically from a laser and tuned to an optical transition of the molecule under investigation) behaves as a point light source and is focused into a sample chamber using a high numerical aperture objective lens. As a single molecule traverses the laser beam it is continuously cycled between the ground and an excited electronic state, emitting a burst of fluorescence photons. Fluorescence emission is isotropic (spontaneous emission), so photons are emitted in all directions (4π steradians).

Figure 3 Principle of confocal detection. A confocal pinhole only selects light that emanates from the focal region. Dashed lines indicate paths of light sampled above and below the focal plane that are rejected by the pinhole. The solid ray derives from the focal point and is transmitted through the pinhole to a detector.

Consequently, the high numerical aperture is used to collect as large a fraction of photons emitted from the focal plane as possible. Designing the objective to be used with an immersion medium, such as oil, glycerin, or water, can dramatically increase the objective numerical aperture, and thus the number of collected photons. Light collected by the objective is then transmitted towards a dichroic beam splitter. In the example shown, fluorescence photons (of lower energy) are reflected towards the confocal detector pinhole, whereas scattered radiation (of higher energy) is transmitted through the dichroic towards the light source. Creation of a precise optical probe volume is effected through the definition of the confocal pinhole. The detector is positioned such that only photons that pass through the pinhole are detected. Consequently, light emanating from the focal plane in the sample is transmitted through the pinhole and detected, whereas light not deriving from the focal plane is rejected by the aperture, and therefore not detected (Figure 3). To ensure that the maximum number of photons are detected by the system, high-efficiency detectors must be used. Photomultiplier tubes (the most common detectors for light-sensing applications) are robust and versatile but have poor detection efficiencies (approximately 5% of all photons that fall on the photocathode yield an electrical signal). Consequently, the most useful detectors for SMD (or low light level) applications are single photon-counting avalanche photodiodes (SPADs). A SPAD is essentially a p–n junction reverse biased above the breakdown voltage that sustains an avalanche current when triggered by a photon-generated carrier. Detection efficiencies for typical SPADs are normally between 60–70%, and SPADs are thus ideal for SMD in solution. An approximation of the overall detection efficiency of a confocal system for SMD can be made using eqn [2], which incorporates an estimation of photon losses at all stages of the collection/detection process. Typical efficiencies for each step are also shown:

$$\underbrace{0.06}_{\substack{\text{overall detection}\\ \text{efficiency}}} \approx \underbrace{0.24}_{\substack{\text{objective collection}\\ \text{efficiency}}} \times \underbrace{0.9}_{\substack{\text{dichroic transmission}\\ \text{efficiency}}} \times \underbrace{0.5}_{\substack{\text{additional optical}\\ \text{losses}}} \times \underbrace{0.6}_{\substack{\text{detector efficiency}\\ \text{coefficient}}} \qquad [2]$$

Optical Probe Volumes

The precise nature of the probe volume is determined by the image of the pinhole in the sample and the spherical aberration of the microscope objective. Routinely, confocal probe volumes are approximated as resembling a cylinder with a radius defined by the diffraction-limited waist of a Gaussian beam. This approximation is useful when the incident beam is narrow or not tightly focused. However, when the radius of the incident beam is large, the corresponding diffraction-limited focus is narrowed, and the probe volume more closely resembles a pair of truncated cones. Figure 4a illustrates the dependence of the curvature of the 1/e² intensity contour on the collimated beam radius. Consequently, it is clear that a simple cylindrical approximation for the probe volume breaks down for wide, tightly focused beams. If a broad incident beam (diameter > 1.5 mm) is used, a large noncylindrical contribution to the probe volume is anticipated and a more appropriate model is required. An alternative and more accurate model for the confocal probe volume considers the Gaussian profile of the focused beam. The 1/e² intensity contour radius of a Gaussian waveform with wavelength λ, at some distance z from the beam waist radius w₀, is given by eqn [3]:

$$w(z) = w_0\sqrt{1 + \left(\frac{\lambda z}{\pi w_0^2}\right)^2} \qquad [3]$$

Figure 4 (a) 1/e² Gaussian intensity contours plotted for a series of laser beam radii (λ = 488 nm, f = 1.6 mm, n = 1.52). (b) Cylindrical and curved components of the Gaussian probe volume. The curved contribution is more significant for larger beam radii and correspondingly tight beam waists.

In this case, the probe volume V is given by the volume of rotation of w(z) around the z-axis between Z₀ and −Z₀. The volume of rotation can therefore be simply defined according to

$$V = \int_{-Z_0}^{Z_0} \pi w(z)^2\, dz \qquad [4]$$

Solution of eqn [4] yields

$$V = 2\pi w_0^2 Z_0 + \frac{2\lambda^2}{3\pi w_0^2} Z_0^3 \qquad [5]$$

The Gaussian volume expression contains two terms. The first term, 2πw₀²Z₀, corresponds to a central cylindrical volume; the second term has a more complex form that describes the extra curved volume (Figure 4b). The diffraction-limited beam waist radius w₀ can be defined in terms of the focusing objective focal length f, the refractive index n, and the collimated beam radius R according to

$$w_0 = \frac{\lambda f}{n\pi R} \qquad [6]$$

Substitution in eqn [5] yields

$$V = 2\pi\left(\frac{\lambda f}{n\pi R}\right)^2 Z_0 + \frac{2\lambda^2}{3\pi}\left(\frac{n\pi R}{\lambda f}\right)^2 Z_0^3 = \frac{2\lambda^2 f^2}{\pi n^2 R^2}\, Z_0 + \frac{2\pi n^2 R^2}{3f^2}\, Z_0^3 \qquad [7]$$

The volume is now expressed in terms of identifiable experimental variables and constants. Once again, the first term may be correlated with the cylindrical contribution to the volume, and the second term is the additional volume due to the curved contour. It is clear from Figure 4a that, for a given focal length, the noncylindrical contribution to the probe volume increases with incident beam diameter, when the diffraction-limited focus is correspondingly sharp and narrow. Furthermore, it can also be seen that the second term in eqn [7] is inversely proportional to f², and thus the extent to which the probe volume is underestimated by the cylindrical approximation increases with decreasing focal length. This fact is significant when performing confocal measurements, since high numerical aperture objectives with short focal lengths are typical. Some realistic experimental parameters give an indication of typical dimensions for the probe volume in confocal SMD systems. For example, if λ = 488 nm, f = 1.6 mm, Z₀ = 1.0 μm, and n = 1.52, a minimum optical probe volume of 1.1 fL is achievable with a collimated beam diameter of 1.1 mm.
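The quoted example is easy to reproduce from eqns [5]–[7]; a minimal Python sketch using the parameter values given above:

```python
import numpy as np

# Confocal probe volume from eqns [5]-[7] for the quoted example.
lam = 488e-9        # wavelength (m)
f = 1.6e-3          # objective focal length (m)
n = 1.52            # refractive index of the immersion medium
Z0 = 1.0e-6         # half-depth of the probe region (m)
R = 0.55e-3         # collimated beam radius (m), i.e., 1.1 mm diameter

w0 = lam * f / (n * np.pi * R)                      # eqn [6]: beam waist
V = 2 * np.pi * w0**2 * Z0 \
    + (2 * lam**2 / (3 * np.pi * w0**2)) * Z0**3    # eqn [5]
print(w0 * 1e6, "um waist;", V * 1e18, "fL")        # ~0.30 um, ~1.1 fL
```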

Intensity Fluctuations: Photon Burst Statistics

When sampling a small volume within a system that may freely exchange particles with a large surrounding analyte bath, a Poisson distribution of particles is predicted. A Poisson distribution is a discrete series that is defined by a single parameter μ, equating to the mean and variance of the distribution:

$$P(n = x) = \frac{\mu^x e^{-\mu}}{x!} \qquad [8]$$

Common Poisson processes include radioactive disintegration, random walks, and Brownian motion. Although particle number fluctuations in the excitation volume are Poissonian in nature, the corresponding fluorescence intensity modulation induces a stronger correlation between photon counts. For a single molecular species the model is described by two parameters: an intrinsic molecular brightness and the average occupancy of the observation volume. A super-Poissonian distribution has a width or variance that is greater than its mean; in a Poisson distribution the mean value and the variance are equal. The fractional deviation Q is defined as the scaled difference between the variance and the expectation value of the photon counts, and gives a measure of the broadening of the photon counting histogram (PCH). Q is directly proportional to the molecular brightness factor ε and the shape factor γ of the optical point spread function (γ is constant for a given experimental geometry):

$$Q = \frac{\langle \Delta n^2\rangle - \langle n\rangle}{\langle n\rangle} = \gamma\varepsilon \qquad [9]$$

A pure Poisson distribution has Q = 0; for super-Poissonian statistics Q > 0. Deviation from the Poisson function is maximized at low number density and high molecular brightness. In a typical SMD experiment, raw data are generally collected with a multichannel scaler and photons are registered in binned intervals. Figure 5

illustrates typical photon burst scans demonstrating the detection of single molecules (R-phycoerythrin) in solution. Fluorescence photon bursts, due to single molecule events, are clearly distinguished above a low background baseline (top panel) of less than 5 counts per channel in the raw data. It is noticeable that bursts vary in both height and size. This is in part due to the range of possible molecular trajectories through the probe volume, photobleaching kinetics, and the nonuniform illumination intensity in the probe region. In addition, it can be seen that the burst frequency decreases as bulk solution concentration is reduced. This effect is expected since the properties of any given single-molecule event are determined by molecular parameters alone (e.g., photophysical and diffusion constants) and concentration merely controls the frequency/number of events. Although many fluorescence bursts are clearly distinguishable from the background, it is necessary to set a count threshold for peak discrimination in order to correctly identify fluorescence bursts above the background. A photocount distribution can be used as the starting point for determining an appropriate threshold for a given data set. The overlap between signal and background photocount distributions affects the efficiency of molecular detection. Figure 6 shows typical signal and background photocount probability distributions, with a threshold set at approximately 2 photocounts per channel. The probability of spurious (or false)

detection resulting from statistical fluctuations in the background can be quantified by the area under the 'background' curve at photocount values above the threshold. Similarly, the probability that 'true' single molecule events are neglected can be estimated by the area under the 'fluorescence' curve at photocount values below the threshold. Choice of a high threshold value will ensure a negligible probability of calling a false positive, but will also exclude a number of true single molecule events that lie below the threshold value. Conversely, a low threshold value will generate an unacceptably high number of false positives. Consequently, the choice of an appropriate threshold is key to efficient SMD.

Figure 5 Photon burst scans originating from 1 nM and 500 pM R-phycoerythrin buffered solutions. The sample is contained within a 50 μm square fused-silica capillary. Laser illumination = 5 mW, channel width = 1 ms. The top panel shows a similar burst scan originating from a deionized water sample measured under identical conditions.

Figure 6 Simulated fluorescence and background photocount probability distributions. The vertical dashed line at 2 counts represents an arbitrarily defined threshold value for peak determination.

Since the background shot noise is expected to exhibit Poisson statistics, the early part of the photocount distribution (i.e., the portion that is dominated by low, background counts) can be modeled with a Poisson distribution, to set a statistical limit for the threshold. Photon counting events above this threshold can be defined as photon bursts associated with the presence of single molecules. In an analogy with Gaussian systems, the selected peak discrimination threshold can be defined as three standard deviations from the mean count rate:

$$n_{\text{threshold}} = \mu + 3\sqrt{\mu} \qquad [10]$$

Adoption of a threshold that lies 3 standard deviations above the mean yields a confidence limit that is typically greater than 99%.
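Applied to a binned photon trace, eqn [10] gives a simple burst discriminator. The sketch below (synthetic data; all values illustrative) estimates the background mean from the low-count bulk of the distribution and flags channels above threshold:

```python
import numpy as np

# Poisson threshold for photon-burst discrimination (eqn [10]),
# applied to a synthetic binned trace.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=10_000)            # background shot noise, mean ~2
burst_pos = rng.choice(counts.size, 50, replace=False)
counts[burst_pos] += 40                           # inject single-molecule bursts

mu = np.sort(counts)[:9_900].mean()               # background mean from low-count bulk
threshold = mu + 3.0 * np.sqrt(mu)                # eqn [10]
hits = np.flatnonzero(counts > threshold)
print(round(threshold, 2), hits.size)             # all 50 bursts, plus ~0.5% of
                                                  # background channels as false positives
```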

Figure 7 illustrates a sample photocount distribution, a least-squares fit to an appropriate Poisson distribution, and the calculated threshold that results. Once the threshold has been calculated, its value is subtracted from all channel counts and a peak search utility is used to identify burst peaks in the resulting data set.

Figure 7 A photon counting histogram generated from a 16 second photon burst scan originating from a 10 mg/mL solution of 1000 nm fluorescent microbeads. The dotted curve shows a least-squares fit of early channels to a Poisson distribution, and the dashed vertical line marks the peak threshold (defined as μ + 3√μ = 4.47 counts).

Data Filtering

As stated, the primary challenge in detecting single molecules in solution is not the maximization of the detected signal, but the maximization of the signal-to-noise ratio (or the reduction of background interferences). Improving the signal-to-noise ratio in such experiments is important, as background levels can often be extremely high. Several approaches have been used to smooth SMD data with a view to improving signal-to-noise ratios. However, the efficacy of these methods is highly dependent on the quality of the raw data obtained in experiment. As examples, three common methods are briefly discussed. The first method involves the use of a weighted quadratic sum (WQS) smoothing filter. The WQS function creates a weighted sum of adjacent terms according to

$$\tilde{n}_{k,\text{WQS}} = \sum_{j=0}^{m-1} w_j\left(n_{k+j}\right)^2 \qquad [11]$$

The range of summation m is of the same order as the burst width, and the weighting factors wj are chosen


to best discriminate between single-molecule signals and random background fluctuations. This method proves most useful for noisy systems, in which the raw signal is weak. There is a practical drawback, in that peak positions are shifted by the smoothing function, and subsequent burst analysis is therefore hampered. Another popular smoothing filter is the Lee filtering algorithm. The Lee filter preferentially smoothes background photon shot noise and is defined according to

$$\tilde{n}_k = \bar{n}_k + \left(n_k - \bar{n}_k\right)\frac{\bar{\sigma}_k^2}{\bar{\sigma}_k^2 + \sigma_0^2} \qquad [12]$$

where the running mean ($\bar{n}_k$) and running variance ($\bar{\sigma}_k^2$) are defined by

$$\bar{n}_k = \frac{1}{2m+1}\sum_{j=-m}^{m} n_{k+j}, \qquad m < k \le N - m \qquad [13]$$

$$\bar{\sigma}_k^2 = \frac{1}{2m+1}\sum_{j=-m}^{m}\left(n_{k+j} - \bar{n}_{k+j}\right)^2, \qquad 2m < k \le N - 2m \qquad [14]$$

for a filter (2m + 1) channels wide. Here, nk is the number of detected photons stored in channel k, σ0 is a constant filter parameter, and N is the total number of data points. A final smoothing technique worth mentioning is the Savitzky–Golay smoothing filter. This filter uses a least-squares method to fit an underlying polynomial function (typically a quadratic or quartic function) within a moving window. This approach works well for smooth line profiles of a similar width to the filter window and tends to preserve features such as peak height, width, and position, which may be lost by simple adjacent averaging techniques. Figure 8 shows the effects of using each approach to improve signal-to-noise for raw burst data.

Figure 8 Effect of various smoothing filters on a 750 ms photon burst scan originating from a 1 nM R-phycoerythrin buffered solution. Raw data are shown in the top panel.
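As an illustration, the Lee filter of eqns [12]–[14] can be implemented in a few lines; the following Python sketch (window and σ0 values are arbitrary choices) smooths the baseline while leaving a strong burst essentially untouched:

```python
import numpy as np

# Lee filter for photon-burst traces (eqns [12]-[14]); a minimal sketch.
def lee_filter(n, m=5, sigma0=1.0):
    """Smooth background shot noise while preserving strong bursts."""
    kernel = np.ones(2 * m + 1) / (2 * m + 1)
    n = np.asarray(n, dtype=float)
    n_bar = np.convolve(n, kernel, mode="same")              # running mean [13]
    var = np.convolve((n - n_bar) ** 2, kernel, mode="same") # running variance [14]
    # Where variance is small (background) the output follows the running
    # mean; where variance is large (a burst) the raw data are kept.
    return n_bar + (n - n_bar) * var / (var + sigma0 ** 2)   # eqn [12]

rng = np.random.default_rng(1)
trace = rng.poisson(2.0, 1000).astype(float)
trace[500:505] += 30.0                      # synthetic fluorescence burst
smooth = lee_filter(trace)
print(smooth[500:505].max(), smooth.mean()) # burst preserved, baseline flattened
```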

Photon Burst Statistics

A valuable quantitative method for the analysis of fluorescence bursts utilizes Poisson statistics. Burst interval distributions are predicted to follow a Poissonian model, in which peak separation frequencies adopt an exponential form. The probability of a single molecule (or particle) event occurring after an interval Δt is given by eqn [15]:

$$N(\Delta t) = \lambda\exp(-\beta\,\Delta t) \qquad [15]$$

where λ is a proportionality constant and β is the characteristic frequency at which single molecule events occur. The recurrence time τR can then be simply defined as

$$\tau_R = \frac{1}{\beta} \qquad [16]$$

Equation [15] simply states that longer intervals between photon bursts are less probable than shorter intervals. Furthermore, the recurrence time reflects a combination of factors that control mobility, probe volume occupancy, or other parameters in the single molecule regime. Consequently, it is expected that τR should be inversely proportional to concentration, flow rate, or solvent viscosity in a range of systems. Figure 9 shows an example of frequency N(Δt) versus time plots for two identical particle systems moving at different velocities through the probe volume. A least-squares fit to a single exponential function yields values of τR = 91 ms for a volumetric flow rate of 200 nL/min and τR = 58 ms for a volumetric flow rate of 1000 nL/min.

Figure 9 Burst interval distribution analysis of photon burst scans. Data originate from 1 μm fluorescent beads moving through 150 μm wide microchannels at flow rates of 200 nL min⁻¹ (circles) and 1000 nL min⁻¹ (squares). Least-squares fits to a single exponential function are shown by the solid lines.

Temporal Fluctuations: Autocorrelation Analysis Autocorrelation analysis is an extremely sensitive method for detecting the presence of fluorescence bursts in single molecule experiments. This approach essentially measures the average of a fluctuating signal as opposed to the mean spectral intensity. As previously discussed, the number of molecules contained within a probe volume at any given time is governed by Poisson statistics. Consequently, the root mean square fluctuation can be defined according to eqn [17], pffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi kðdNÞ2 l kðN 2 kNlÞ2 l 1 ¼ ¼ pffiffiffiffiffi kNl kNl kNl

½17

1 ðT FðtÞdt T 0

½18

Here, t is defined as the total measurement time, FðtÞ is the fluorescence signal at time t; and kFðtÞl is the temporal signal average. Fluctuations in the fluorescence intensity, dFðtÞðdFðtÞ ¼ FðtÞ 2 kFðtÞlÞ; with time t; about an equilibrium value kFl; can be statistically investigated by calculating the normalized autocorrelation function, GðtÞ; where GðtÞ ¼

kFðt þ tÞFðtÞl kdFðt þ tÞdFðtÞl ¼ þ 1 ½19 2 kFl kFl2

In dedicated fluorescence correlation spectroscopy experiments, the autocorrelation curve is usually generated in real time in a high-speed digital correlator. Post data acquisition calculation is also possible using the following expression GðtÞ ¼

N21 X

gðtÞgðt þ tÞ

½20

t¼0

Here gðtÞ is the total number of counts during the time interval ðt; t þ DtÞ; gðt þ tÞ is the number of counts detected in an interval of Dt at a later time t þ t; and N is the total number of time intervals in the dataset. In a diffusion controlled system with a single fluorescent molecule that is irradiated with a three dimensional Gaussian intensity profile, the autocorrelation curve is governed by the mean probe volume occupancy N and the characteristic diffusion time ðtD Þ: The laser beam waist radius v and the probe depth 2z describe the Gaussian profile:  2 !21=2  1 t 21 v t 1þ GðtÞ ¼ 1 þ ½21 1þ N td td z The diffusion time is a characteristic molecular residence time in the probe volume and inversely

30

CHEMICAL APPLICATIONS OF LASERS / Detection of Single Molecules in Liquids

increased, the width of the autocorrelation curves is seen to narrow as a result of the reduced residence time in the probe volume. A plot of the reciprocal of the full width half maximum of the autocorrelation curve as a function of volumetric flow rate is linear, and provides a simple way of calculating particle/molecule velocities within flowing systems.

Applications

Figure 10 Autocorrelation analysis of photon burst scans of 1 mm fluorescent beads moving through 150 mm wide microchannels at flow rates of 500 nL min21 (stars) and 1000 nL min21 (circles). Solid lines represent fits to the data according to eqn [23].

related to the translational diffusion coefficient for the molecule: w2 4tD



½22
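The post-acquisition route of eqn [20] is straightforward to implement; the Python sketch below (synthetic trace, illustrative burst parameters) normalizes the correlation as in eqn [19], so that G(τ) decays towards 1 beyond the burst duration:

```python
import numpy as np

# Post-acquisition autocorrelation of a binned photon trace (eqn [20]),
# normalized as in eqn [19]; a minimal sketch with synthetic data.
def autocorr(g, max_lag):
    g = np.asarray(g, dtype=float)
    norm = g.mean() ** 2                          # <F>^2 normalization (eqn [19])
    return np.array([(g[: g.size - k] * g[k:]).mean() / norm
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(3)
trace = rng.poisson(2.0, 100_000).astype(float)   # uncorrelated background
events = rng.binomial(1, 5e-4, trace.size).astype(float)
trace += 20.0 * np.convolve(events, np.ones(5), mode="same")  # ~5-bin bursts

G = autocorr(trace, 10)
print(G)   # > 1 for lags shorter than the burst duration, ~1 beyond
```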

In a flowing system, the autocorrelation function depends on the average flow time through the probe volume tflow : A theoretical fit to the function can be described according to 1 A exp GðtÞ ¼ 1 þ N 

t A¼ 1þ td

21

(

t tflow 

v 1þ z

2 ) A 2

t td

!

½23

w tflow
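A nonlinear least-squares fit of eqn [23], with the velocity then obtained from eqn [24], might look like the following sketch. The beam waist, the structure parameter $z/\omega$, the lag-time grid, and the stand-in 'measured' curve are all assumptions chosen for illustration.

```python
# Fit of the flow-modified FCS model (eqn [23]); all numbers are placeholders.
import numpy as np
from scipy.optimize import curve_fit

OMEGA = 0.3e-6    # assumed beam waist radius (m)
KAPPA = 5.0       # assumed structure parameter z/omega

def g_flow(tau, n, tau_d, tau_flow):
    a = (1 + tau / tau_d) ** -1 * (1 + (tau / tau_d) / KAPPA**2) ** -0.5
    return 1 + (a / n) * np.exp(-(tau / tau_flow) ** 2 * a)

tau_s = np.logspace(-5, -1, 80)              # lag times (s)
g_data = g_flow(tau_s, 2.0, 1e-3, 5e-3)      # stand-in "measurement"

(n, tau_d, tau_flow), _ = curve_fit(g_flow, tau_s, g_data, p0=(1.0, 1e-3, 1e-2))
print(f"flow velocity v = {OMEGA / tau_flow * 1e3:.2f} mm/s")   # eqn [24]
```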

Applications

The basic tools and methods outlined in this article have been used to perform SMD in a variety of chemically and biologically relevant systems, and indeed there is a large body of work describing the motion, conformational dynamics, and interactions of individual molecules (see Further Reading). A primary application area has been the field of DNA analysis, where SMD methods have been used in DNA fragment sizing, single-molecule DNA sequencing, high-throughput DNA screening, single-molecule immunoassays, and DNA sequence analysis. SMD methods have also proved highly useful in studying protein structure, protein folding, protein–molecule interactions, and enzyme activity. More generally, SMD methods may prove to be highly important as a diagnostic tool in systems where an abundance of similar molecules masks the presence of the distinct molecular anomalies that are markers in the early stages of disease or cancer.

See also

Microscopy: Confocal Microscopy.

Further Reading

Ambrose WP, Goodwin PM, Jett JH, Van Orden A, Werner JH and Keller RA (1999) Single molecule fluorescence spectroscopy at ambient temperature. Chemical Reviews 99: 2929–2956.
Barnes MD, Whitten WB and Ramsey JM (1995) Detecting single molecules in liquids. Analytical Chemistry 67: 418A–423A.
Basche T, Orrit M and Rigler R (2002) Single Molecule Spectroscopy in Physics, Chemistry, and Biology. New York: Springer-Verlag.
Zander CJ, Enderlein J and Keller RA (2002) Single Molecule Detection in Solution: Methods and Applications. New York: John Wiley & Sons.


Diffuse-Reflectance Laser Flash Photolysis

D R Worrall and S L Williams, Loughborough University, Loughborough, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Flash photolysis was developed as a technique to study short-lived intermediates in photoinduced reactions by George Porter (later Lord Porter of Luddenham) and Ronald Norrish in 1949, drawing on Porter's wartime experience with radar techniques. Such was the impact of this development that it earned the Nobel prize jointly for Porter, Norrish, and Eigen in 1967. In this article we will describe the application of flash photolysis to opaque, scattering samples, detailing how light propagation in such samples can be treated theoretically, and will discuss methods by which the data obtained from diffuse-reflectance flash photolysis experiments may be analyzed.

The technique of flash photolysis was originally based on using an intense flash of light (the photolysis flash) from a xenon tube to excite the sample, followed a certain time delay later by a spectroscopic flash of lower intensity from a second flash tube, the light from the latter being detected using a photographic plate. The photolysis flash is of sufficient intensity to produce a large population of intermediates (radicals, ions, excited states, isomers) in the sample, which then absorb light from the spectroscopic flash depending on the concentration of intermediates according to the Beer–Lambert law:

$$A = \varepsilon c l \qquad [1]$$

where $A$ is the sample absorbance, $\varepsilon$ the molar absorption coefficient, $c$ the concentration, and $l$ the pathlength. The absorbance is related to the incident and transmitted intensities as

$$A = \log_{10}\left(\frac{I_0}{I}\right) \qquad [2]$$

With $I_0$ the incident and $I$ the transmitted intensities, there is therefore an exponential fall-off of intensity with pathlength for a homogeneous absorber. Hence, by monitoring the evolution of the absorption spectra, the changes in concentration of the photoproduced intermediates, and hence the kinetics of the processes in which they are involved, are elucidated. Flash photolysis has subsequently evolved to make use of laser sources and sophisticated electronic detection apparatus to push the limits of time resolution to the femtosecond regime. Indeed, recently the Nobel prize for chemistry was awarded to Ahmed Zewail for his work with ultrafast pump-probe techniques.

However, in its conventional geometry, flash photolysis is limited to transparent samples, since it is necessary to be able to probe the excited species by monitoring the absorption spectra. Many biological systems and industrially important samples are opaque or highly scattering, and hence the attenuation of light through the sample is no longer described by the Beer–Lambert law. In 1981, Frank Wilkinson and Rudolph Kessler had the idea of using diffusely reflected light to interrogate the changes in concentration within a scattering sample subjected to a high-intensity excitation pulse. When photons enter a sample, they may be absorbed or scattered. Those which are scattered may re-emerge from the irradiated surface as diffusely reflected light. The intensity of diffusely reflected light emerging from the surface at a particular wavelength is a unique function of the ratio of scattering to absorption. The more scattering events occurring before absorption, the more likely the photon is to escape from the sample as diffusely reflected light. Hence the probability of escape decreases as the absorption probability increases, and the diffusely reflected light is deficient in those wavelengths where the absorption is strongest, i.e., the ratio of the incident to absorbed light intensity at a given wavelength is related to the absorption of the sample at that wavelength.

Kubelka–Munk Theory of Reflectance

The theory which describes the relationship between incident and scattered light intensity, absorption and scatter, and concentration, and which is widely applied in this context, is the Kubelka–Munk theory of reflectance. The theory was originally developed to describe the reflectance characteristics of paint films, but it works quite well for many samples containing a homogeneous distribution of scatterers and absorbers. The limiting assumption in this theory is that the scatterers from which the scattering layer is composed are very much smaller than the total layer thickness. Additionally, the layer should be optically thick, such that all of the light entering the layer is either absorbed or reflected, with a negligible fraction transmitted. For a layer of thickness $X$ diffusely irradiated with monochromatic light, the diagram shown in Figure 1 can be constructed, with $I$ the incident flux and $J$ the diffusely reflected flux, and $i$ and $j$ the fluxes traveling upwards and downwards through an infinitesimally small thickness element $dx$.

Figure 1 Schematic of counterpropagating fluxes in a diffusely irradiated scattering medium.

Two further parameters may be defined which are characteristic of the medium:

K  The absorption coefficient. Expresses the attenuation of light due to absorption per unit thickness.
S  The scattering coefficient. Expresses the attenuation of light due to scattering per unit thickness.

Both of these parameters can be thought of as arising due to the particle (or chromophore) acting as a sphere of characteristic size, which casts a shadow either due to the prevention of on-axis transmission or due to absorption. The scattering or absorption coefficient then depends on the effective size of this sphere, and the number density of spheres in the medium. In each case, the probability $P$ of a photon being transmitted through a particular thickness $X$ of a medium is related exponentially to the absorption or scattering coefficient:

$$P = \exp(-KX) \qquad [3]$$

$$P = \exp(-SX) \qquad [4]$$

The scattering coefficient $S$ depends on the refractive index difference between the particle and the dispersion medium. The scattering coefficient is also dependent upon the particle size, and shows an inverse correlation, i.e., the scattering coefficient increases as the particle size decreases. This effect is a function of the size of the particle relative to the wavelength of light impinging on it, with the scattering coefficient increasing as the wavelength decreases. However, this change with wavelength is small provided the particle size is large relative to the wavelength of the light.

The effect of the material in the element $dx$ on the counterpropagating fluxes $i$ and $j$ depends on the absorption and scattering coefficients. Both $i$ and $j$ will be attenuated by both absorption and scattering:

$$i_2 = i_1 - i_1(S + K)\,dx \qquad [5]$$

$$j_2 = j_1 - j_1(S + K)\,dx \qquad [6]$$

Both $i$ and $j$ are reinforced by backscattering from the other flux:

$$i_2 = i_1 + j_1 S\,dx \qquad [7]$$

$$j_2 = j_1 + i_1 S\,dx \qquad [8]$$

The net effect of this on the attenuation of the flux propagating into the sample ($i$) and the flux backscattered from the sample ($j$) can be expressed as the following differential equations:

$$di = (S + K)\,i_1\,dx - j_1 S\,dx \qquad [9]$$

$$dj = -(S + K)\,j_1\,dx + i_1 S\,dx \qquad [10]$$

For a sample of infinite thickness, these equations can be solved to give an analytical solution for the observed reflectance of the sample in terms of the absorption and scattering coefficients:

$$\frac{K}{S} = \frac{(1 - R_\infty)^2}{2R_\infty} \qquad [11]$$

where $R_\infty$ is the reflectance from a sample of such thickness that an increase in sample thickness has no effect on the observed reflectance. The absorption coefficient $K$ is dependent upon the concentration of absorbers in the sample through

$$K = 2\varepsilon c \qquad [12]$$

with $\varepsilon$ the naperian absorption coefficient, $c$ the concentration of absorbers, and the factor 2 the geometrical factor for an isotropic scatterer. For multiple absorbers, $K$ is simply the linear sum of absorption coefficients and concentrations. Hence for a diffusely scattering medium, an expression analogous to the Beer–Lambert law can be derived to relate concentration to a physically measurable parameter, in this case the sample reflectance. The ratio $K/S$ is usually referred to as the Kubelka–Munk remission function, and is the parameter usually quoted in this context. It is important to note that the relationship with concentration is only valid for a homogeneous distribution of absorbers within a sample.
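The remission function is straightforward to evaluate and to invert numerically. The sketch below is a minimal illustration of eqn [11], with arbitrary K and S values; the inversion uses the physical root of the resulting quadratic in $R_\infty$.

```python
# Kubelka-Munk remission function (eqn [11]) and its inversion; K and S
# below are arbitrary illustrative coefficients, not data from the text.
import numpy as np

def remission(r_inf):
    # F(R_inf) = K/S = (1 - R_inf)^2 / (2 R_inf)
    return (1.0 - r_inf) ** 2 / (2.0 * r_inf)

def reflectance(k_over_s):
    # Physical root of R^2 - 2R(1 + K/S) + 1 = 0
    f = k_over_s
    return (1.0 + f) - np.sqrt((1.0 + f) ** 2 - 1.0)

K, S = 2.0, 50.0                   # absorption, scattering (cm^-1)
r = reflectance(K / S)
print(r, remission(r), K / S)      # remission(r) recovers K/S
```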

Transient Concentration Profiles

As discussed in the previous section, a light flux propagating through a scattering medium is attenuated by both scattering and absorption events, whilst in a nonturbid medium attenuation is by absorption only. Hence even in a sample with a very low concentration of absorbers, the light flux is rapidly attenuated by scattering events as it penetrates into the sample. This leads to a significant flux gradient through the sample, resulting in a reduction in the concentration of photo-induced species as penetration depth into the sample increases. There are three distinct concentration profiles which may be identified within a scattering sample, dependent on the scattering and absorption coefficients. These concentration profiles are interrogated by a beam of analyzing light, and hence an understanding of the effect of these differing profiles on the diffusely reflected intensity is vital in interpreting transient absorption data. The transient depth profiles are illustrated in Figure 2.

Figure 2 Transient concentration profiles following pulsed laser excitation of a scattering and absorbing sample.

Kubelka–Munk Plug

This occurs when the photolysis flash (i.e., laser) fluence is high and the absorption coefficient is low at the laser wavelength. If we assume a simple photophysical model involving simply the ground state S0, the first excited singlet state S1, and the first excited triplet state T1, and we assume that either the quantum yield of triplet state production is high or the S1 lifetime is very much shorter than the laser pulse, allowing re-population of the ground state, then it is possible at high enough fluence to completely convert all of the S0 states to T1 states within the sample. Provided the fluence is high enough, this complete conversion will penetrate some way into the sample (Figure 2). Under circumstances where the T1 state absorbs strongly at some wavelength other than the photolysis wavelength, probing at this wavelength will result in the probe beam being attenuated significantly within a short distance of the surface of the sample, and thus it will only interrogate regions where there is a homogeneous excited state concentration. Under these circumstances the reflectance of the sample as a function of the concentration of excited states is described by the Kubelka–Munk equation, and the change in remission function can be used to probe the change in excited state concentration.

Exponential Fall-Off of Concentration

This occurs when the laser fluence is low and the absorption coefficient at the laser wavelength is high. Again considering the simple photophysical model, most of the laser flux will be absorbed by the ground state in the first few layers of sample, and little will penetrate deeply into the sample. In the limiting case this results in an exponential fall-off of transient concentration with penetration depth into the sample (Figure 2). Here the distribution of absorbers is not random, and the limiting Kubelka–Munk equation is no longer applicable since the mean absorption coefficient varies with sample penetration depth. Lin and Kan solved eqns [9] and [10] with the absorption coefficient K varying exponentially with penetration depth, and showed that the series solution converges for changes in reflectance of less than 10%, such that the reflectance change is a linear function of the number of absorbing species.

Intermediate Case

Between the two extremes described above is a case where significant transient conversion takes place at the front surface, but with little penetration into the sample. This can occur, for example, with high laser fluences and large ground state absorption coefficients at the laser wavelength. Under these circumstances, illustrated in Figure 2, the analyzing light interrogates not only the transient concentration profile; a significant amount of the analyzing light may also penetrate through the transient layer into the unconverted sample behind, if the transient absorption at the analyzing wavelength is low. This creates a more complex problem for analysis, since effectively the sample is irradiated from both front and back faces, with consequent effects on the diffusely reflected intensity. It is possible to numerically model the reflectance change as a function of transient concentration under these circumstances, but a precise knowledge of the absorption and scattering coefficients is required. Under most circumstances this case is avoided, and diffuse-reflectance flash photolysis experiments are arranged such that one of the two limiting cases above, generally the exponential fall-off (usually achieved by attenuation of the laser beam), prevails in a particular experiment.
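In the low-fluence limiting case, the relative transient concentration simply decays exponentially with depth. A minimal sketch follows, with an arbitrary illustrative absorption coefficient at the laser wavelength:

```python
# Exponential fall-off of transient concentration with penetration depth
# (low-fluence limit, Figure 2); K_laser is an arbitrary illustration.
import numpy as np

K_laser = 50.0                       # ground-state absorption coeff. (cm^-1)
x = np.linspace(0.0, 0.1, 6)         # penetration depth (cm)
for xi, ci in zip(x, np.exp(-K_laser * x)):
    print(f"x = {xi:.2f} cm: c/c(0) = {ci:.3f}")
```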

Sample Geometry

In the case of conventional nanosecond laser flash photolysis, it is generally the case that right-angled geometry is employed for the photolysis and probe beams (Figure 3a). This geometry has a number of advantages over alternatives. The photolysis beam and analyzing beam are spatially well separated, such that the analyzing beam intensity is largely unaffected by scattered photolysis light. Also, the fluorescence generated by the sample will be emitted in all directions, whilst the analyzing beam is usually collimated, allowing spatial separation of fluorescence and analyzing light; sometimes an iris is used to aid this spatial discrimination. This geometry is appropriate for quite large beam diameters and fluences; where smaller beams are used, collinear geometry may be more appropriate in order to achieve long interaction pathlengths.

Figure 3 Sample geometries for (a) conventional and (b,c) diffuse-reflectance flash photolysis, showing the relative orientations of the photolysis and analyzing beams.

In the case of nanosecond diffuse-reflectance flash photolysis, the geometry required is quite different (Figure 3b,c). Here, the photolysis beam and analyzing beam must be incident on the same sample surface, and the diffusely reflected analyzing light is collected from that same surface. The geometry is often as shown in Figure 3b, where the analyzing light is incident almost perpendicularly on the sample surface, with the photolysis beam incident at an angle such that the specular reflection of the photolysis beam passes between detector and analyzing beam (not shown). Alternatively, the geometry shown in Figure 3c may be employed, where diffusely reflected light is detected emerging perpendicular to the sample surface. In both cases, the geometry is chosen such that specular (mirror) reflection of either exciting or analyzing light from the sample is not detected, since specular reflection is light which has not penetrated the sample and therefore contains no information regarding the concentrations of species present. A requirement, as in conventional flash photolysis, is that the analyzing beam probes only those areas which are excited by the photolysis beam, requiring the latter to be larger than the former.

The nature of the scattering described previously means that the required geometry does not give spatial discrimination at the detector between photolysis light, analyzing light, and fluorescence. This is because both analyzing and photolysis light undergo scattering and absorption processes (although with wavelength-dependent absorption and scattering coefficients) and emerge with the same spatial profiles. Fluorescence, principally that stimulated by the photolysis beam since this is of greatest intensity, originates from within the sample but again undergoes absorption and scattering and emerges with the same spatial distribution as the exciting light. In diffuse-reflectance flash photolysis, separation of the analyzing light from the excitation or fluorescence must therefore be achieved using spectral (filters and/or monochromators) or temporal, rather than spatial, discrimination. Time-gated charge-coupled device (CCD) or photodiode array detectors can be used effectively in nanosecond laser flash photolysis to exclude excitation light and fluorescence, since these occur on time-scales usually much shorter than the transient absorption of the species of interest.

The usual analyzing light source used in nanosecond diffuse-reflectance flash photolysis is a xenon arc lamp, due to its good spectral coverage and high intensity. High probe intensity is particularly important in diffuse-reflectance flash photolysis since the scattered intensities are often low, and the scattered light emerges into a large solid angle, only part of which can be effectively collected and detected. Also, high intensities allow the light from the analyzing source to dominate over fluorescence if the latter is relatively weak.

In femtosecond diffuse-reflectance laser flash photolysis, sample geometry considerations are also important. Such experiments are performed using the pump-probe technique, with the probe often being a femtosecond white-light continuum. The sample geometry employed is illustrated in Figure 4. Here the pump and probe are almost collinear, and are incident on the same sample area; again, the requirement is for the pumped area to be larger than the probed area. Diffusely reflected light is then collected and analyzed, time resolution being achieved by varying the delay between pump and probe beams. It should be noted that the temporal resolution is worse than in conventional pump-probe techniques. In conventional femtosecond flash photolysis, the time resolution is generally limited by the widths of the pump and probe pulses; in diffuse-reflectance mode, the pulses undergo numerous refractions, reflections, and diffractions, such that the pulse is temporally broadened during its transit through the material. The extent of this broadening is a sensitive function of the optical properties of the individual sample.

Figure 4 Sample geometry for femtosecond diffuse-reflectance flash photolysis, with the photolysis and analyzing beams nearly collinear on the same sample area.

Kinetic Analysis

Kinetic analysis, and of course time-resolved spectroscopic analysis, require a quantitative treatment of the concentration changes within a sample following an excitation pulse as a function of time. When studying transient absorption phenomena in opaque samples, it is usual to define the reflectance change in terms of the sample reflectance before and after excitation, such that in spectrally resolved data a transient difference spectrum, rather than an absolute reflectance spectrum of the transient species, is obtained. The latter can, however, be reconstructed from a knowledge of the absorption coefficients and concentrations of the species involved. It is possible to define the reflectance change as

$$\frac{\Delta J_t}{J_0} = 1 - \frac{R_t}{R_0} \qquad [13]$$

where $R_0$ and $R_t$ represent the intensity of probe light diffusely reflected from the sample before excitation and at a time $t$ after excitation, respectively. Frequently the reflectance change is expressed as '% absorption', which is defined in eqn [14]:

$$\%\ \text{absorption} = 100 \times \left(1 - \frac{R_t}{R_0}\right) \qquad [14]$$

These parameters are often used as being proportional to transient concentration, subject to satisfying the criteria for an exponential fall-off of transient concentration with penetration depth as discussed previously, and are used to replace transient concentration in the kinetic equations used in data analysis.

It is generally the case that the samples studied using diffuse-reflectance laser flash photolysis either have some degree of heterogeneity, for example paper, microcrystalline cellulose, silica gel, or alumina, or have well-defined porous structures, such as zeolites. Molecules adsorbed on these supports may be present on the surface, within micro- or mesopores, or intimately included within the structure. Hence each molecule may experience its own unique environment, and this will obviously influence its observed photophysics. It is therefore the case that in these systems even very simple photo-induced reactions, such as unimolecular photoisomerizations, do not follow first-order kinetics; rather, a distribution of rates is observed which reflects the differing environments experienced by the molecules, and hence the molecules act as probes of this heterogeneity.
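Converting measured reflectances to the quantities of eqns [13] and [14] is trivial in practice; the following sketch uses invented reflectance values purely for illustration.

```python
# Reflectance change (eqn [13]) and '% absorption' (eqn [14]) from probe
# reflectances before (R0) and after (Rt) excitation; values are invented.
import numpy as np

r0 = 1.00
rt = np.array([0.62, 0.70, 0.81, 0.90, 0.96])   # R(t) at successive delays

delta = 1.0 - rt / r0                # eqn [13]
pct_absorption = 100.0 * delta       # eqn [14]
print(pct_absorption)                # stands in for transient concentration
```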


There are a number of approaches to kinetic analysis in opaque, often heterogeneous, systems, as described below.

Rate Constant Distributions

Here the sample heterogeneity is treated as producing a series of micro-environments, and the reaction studied will have its own unique rate constant in each of these environments. The width of the observed rate constant distribution therefore describes the number of these possible environments, with the distribution amplitude at any particular rate constant reflecting the probability of the molecule existing in the corresponding environment.

Exponential series lifetime distribution analysis is the simplest approach to distributed lifetime analysis. This analysis has the inherent advantage that there are no presuppositions regarding the kinetic model describing the data. Rather, a large number of exponentials with fixed rate constants and with amplitudes as adjustable parameters are applied to the data, and a least-squares procedure is used to optimize the amplitude of each exponential. Generally there is a relationship between the rate constants, they being equally spaced on either a linear or a logarithmic scale. Physically this can be interpreted as a large number of first-order or pseudo-first-order (in the case of bimolecular reactions) rates arising from the sample heterogeneity. Hence the rate constant distribution emerges naturally from the fitting procedure, with no preimposed constraints. Since there are a large number of adjustable parameters, these models are applied most successfully to data with good signal-to-noise ratios.

Alternative approaches involve imposing a distribution shape onto a set of exponentials, and optimizing this distribution to the data. This approach has the advantage that, by assuming a rate constant distribution, the relative amplitudes of the exponentials are fixed and hence the number of fitting parameters is greatly reduced. One of the more successful models applied in this context is that developed by Albery et al. In the development of their model it is assumed that, for the reaction in question, the free energy of activation, $\Delta G^{\ddagger}$, is distributed normally around a mean value $\overline{\Delta G}^{\ddagger}$ according to eqn [15]:

$$\Delta G^{\ddagger} = \overline{\Delta G}^{\ddagger} - \gamma x RT \qquad [15]$$

with $\gamma$ being the width of the distribution for $\infty \geq x \geq -\infty$. This assumed normal distribution of the free energy of activation leads to a log-normal distribution of the decay rate constants, distributed around some average rate constant $\bar{k}$. The dispersion in the first-order rate constant is then

$$\ln(k) = \ln(\bar{k}) + \gamma x \qquad [16]$$

The equation used to describe the data is given as

$$\frac{c}{c_0} = \frac{\displaystyle\int_{-\infty}^{\infty} \exp(-x^2)\,\exp\!\left[-\bar{k}t\exp(\gamma x)\right] dx}{\displaystyle\int_{-\infty}^{\infty} \exp(-x^2)\, dx} \qquad [17]$$

with $c$ and $c_0$ being the concentrations at time $t = t$ and $t = 0$ relative to the excitation flash, respectively. Where the reflectance change is proportional to transient concentration, reflectances can be used directly in place of concentrations. This equation can be solved by making the following substitutions:

$$\int_{-\infty}^{\infty} \exp(-x^2)\, dx = \pi^{1/2} \qquad [18]$$

$$\int_{-\infty}^{\infty} \exp(-x^2)\exp\!\left[-\bar{k}t\exp(\gamma x)\right] dx = \int_0^{\infty} \lambda^{-1}\exp\!\left[-(\ln\lambda)^2\right]\left[\exp(-\bar{k}t\lambda^{\gamma}) + \exp(-\bar{k}t\lambda^{-\gamma})\right] d\lambda \qquad [19]$$

Hence there are only two fitting parameters, namely the width of the distribution, $\gamma$, and the distribution center, $\bar{k}$. Note that for $\gamma = 0$, eqn [17] reduces to a single exponential function.

A further model which constrains the rate constant distribution, and which has been successfully applied to describe the rates of photo-induced processes on silica gel surfaces, is a Lévy stable distribution of rate constants, described as

$$P(\bar{k}; \alpha, \gamma) = \frac{1}{\pi}\int_0^{\infty} \exp(-\gamma q^{\alpha})\cos(\bar{k}q)\, dq \qquad [20]$$

where $\alpha$ is the characteristic power law exponent $(0 < \alpha \leq 2)$ and $\gamma$ is the distribution width $(\gamma > 0)$. Special cases of the Lévy stable distribution occur for $\alpha = 1$ and $\alpha = 2$, where the distribution shape becomes Lorentzian and Gaussian, respectively. The Lévy distribution gives an increasing weighting to the tails of the distribution as the characteristic exponent $\alpha$ decreases, and can be described as a random walk consisting of long jumps followed by several short walks. It has been shown that this type of foraging behavior is more efficient at seeking out randomly distributed targets than a simple random walk. The Lévy stable distribution has three adjustable parameters, which allows greater flexibility in the distribution of rate constants than does the Gaussian model, but still constrains the distribution to be symmetrical about some mean value.
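The decay predicted by eqn [17] is conveniently evaluated by Gauss–Hermite quadrature, which treats the $\exp(-x^2)$ weight exactly. The sketch below is illustrative only; the $\bar{k}$ and $\gamma$ values are arbitrary, and $\gamma = 0$ recovers a single exponential.

```python
# Albery et al. dispersed-kinetics decay (eqn [17]) via Gauss-Hermite
# quadrature; k_bar and gamma are arbitrary illustrative parameters.
import numpy as np

X, W = np.polynomial.hermite.hermgauss(60)   # nodes/weights for exp(-x^2)

def albery_decay(t, k_bar, gamma):
    # c/c0 = int exp(-x^2) exp(-k_bar t e^{gamma x}) dx / int exp(-x^2) dx
    return (W * np.exp(-k_bar * t * np.exp(gamma * X))).sum() / W.sum()

for t in (1e-5, 1e-4, 1e-3):
    print(f"t = {t:.0e} s: c/c0 = {albery_decay(t, 1.0e4, 0.73):.3f}")
```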


Physical Models

A number of these have been applied to very specific data sets, where the parameters of the systems are accurately known. These include random walk models in zeolites, and time-dependent fractal-dimensional rate constants to describe kinetics on silica gel surfaces. The advantage of these methods over rate constant distributions is that, since they are based on a physical description of the system, they yield physically meaningful parameters, such as diffusion coefficients, from the analysis of kinetic data. However, for many systems the parameters are known in insufficient detail for accurate models to be developed.

Examples

An example of bimolecular quenching data is shown in Figure 5. Here, anthracene at a concentration of 1 μmol g⁻¹ is co-adsorbed on silica gel from acetonitrile solution together with azulene at a concentration of 0.8 μmol g⁻¹. Laser excitation at 355 nm from an Nd:YAG laser produces the excited triplet state of the anthracene, which undergoes energy transfer to the co-adsorbed azulene as a result of the rapid diffusion of the latter. The data shown in Figure 5 were recorded monitoring at 420 nm, and the laser energy (approximately 5 mJ per pulse) is such that an exponential fall-off of transient concentration with penetration depth is expected, so that the reflectance change is proportional to transient concentration (see the section on Transient Concentration Profiles above). The data are shown on a logarithmic time axis for clarity. The fitted line is obtained by applying the model of Albery et al. (see the section on Rate Constant Distributions above) with fitting parameters $\bar{k} = 1.01 \times 10^4$ s⁻¹ and $\gamma = 0.73$.

Figure 5 Transient absorption decay for anthracene (1 μmol g⁻¹) co-adsorbed with azulene (0.8 μmol g⁻¹) on silica gel, monitored at 420 nm. Fitted using the model of Albery et al.

Where ion–electron recombination is concerned, the Albery model often fails to adequately describe the data obtained, since it does not allow sufficient weight to small rate constants relative to the value of $\bar{k}$, given the constraints of the Gaussian distribution. The Lévy stable distribution is ideal in this application since it allows greater flexibility in the shape of the distribution. Figure 6 shows example data for naphthalene adsorbed on silica gel (1 μmol g⁻¹). The laser pulse energy at 266 nm (approximately 40 mJ per pulse) is such that photo-ionization of the naphthalene occurs, producing the radical cation. The subsequent decay of the radical cation via ion–electron recombination can be monitored at 680 nm. Note that decay is observed on a time-scale of several thousand seconds. The fitted line is according to a Lévy stable distribution with parameters $\bar{k} = 8.2 \times 10^{-4}$ s⁻¹, $\gamma = 0.5$, and $\alpha = 1.7$ (see the section on Rate Constant Distributions above). This model allows the shape of the distribution to deviate from a Gaussian, and can be more successful than the model of Albery et al. in modeling the complex kinetics which arise on surfaces such as silica gel. Note that where the model of Albery et al. can successfully model the data, the data can also be described by a Lévy stable distribution with $\alpha = 2$.

Figure 6 Transient absorption decay of the naphthalene (1 μmol g⁻¹) radical cation monitored at 680 nm on silica gel. Fitted using a Lévy stable distribution.

List of Units and Nomenclature

A  Sample absorbance
c  Concentration
I  Incident flux
J  Diffusely reflected flux
k  Rate constant
$\bar{k}$  Mean rate constant
K  Absorption coefficient (absorption per unit thickness) [cm⁻¹]
l  Pathlength [cm]
P  Probability of transmission of a photon through a sample of defined thickness
$R_\infty$  Reflectance of an infinitely thick sample
S  Scattering coefficient (scattering per unit thickness) [cm⁻¹]
t  Time
X  Thickness of sample layer [cm]
$\alpha$  Characteristic Lévy power law exponent
$\gamma$  Width of rate constant distribution
$\Delta G^{\ddagger}$  Free energy of activation [kJ mol⁻¹]
$\overline{\Delta G}^{\ddagger}$  Mean free energy of activation [kJ mol⁻¹]
$\varepsilon$  Molar decadic or naperian absorption coefficient

See also

Chemical Applications of Lasers: Pump and Probe Studies of Femtosecond Kinetics. Optical Materials: Measurement of Optical Properties of Solids. Scattering: Scattering from Surfaces and Thin Films; Scattering Theory.

Further Reading

Albery WJ, Bartlett PN, Wilde CP and Darwent JR (1985) A general model for dispersed kinetics in heterogeneous systems. Journal of the American Chemical Society 107: 1854–1858.
Anpo M (ed.) (1996) Surface Photochemistry. Wiley Series in Photoscience and Photoengineering, vol. 1. Chichester, UK: John Wiley and Sons.
Anpo M and Matsuura T (eds) (1989) Photochemistry on Solid Surfaces. Elsevier Series in Studies in Surface Science and Catalysis, vol. 47. Amsterdam, The Netherlands: Elsevier Science Publishers B.V.
Asahi T, Furube A, Fukumura H, Ichikawa M and Masuhara H (1998) Development of a femtosecond diffuse reflectance spectroscopic system, evaluation of its temporal resolution, and applications to organic powder systems. Review of Scientific Instruments 69: 361–371.
Bertoin J (1998) Lévy Processes. Cambridge, UK: Cambridge University Press.
Hapke B (1993) Introduction to the Theory of Reflectance and Emittance Spectroscopy. Cambridge, UK: Cambridge University Press.
Kamat PV (1993) Photochemistry on nonreactive and reactive (semiconductor) surfaces. Chemical Reviews 93: 287–300.
Kan HKA and Lin TP (1970) Calculation of reflectance of a light diffuser with non-uniform absorption. Journal of the Optical Society of America 60: 1252–1256.
Kessler RW and Wilkinson F (1981) Diffuse reflectance triplet–triplet absorption spectroscopy of aromatic hydrocarbons chemisorbed on γ-alumina. Journal of the Chemical Society – Faraday Transactions I 77: 309–320.
Kortüm G (1969) Reflectance Spectroscopy. Berlin: Springer-Verlag.
Kubelka P and Munk F (1931) Ein Beitrag zur Optik der Farbanstriche. Zeitschrift für Technische Physik 12: 593–601.
Ramamurthy V (ed.) (1991) Photochemistry in Organised & Constrained Media. New York: VCH Publishers.
Thomas JK (1993) Physical aspects of photochemistry and radiation chemistry of molecules adsorbed on SiO2, γ-Al2O3, zeolites and clays. Chemical Reviews 93: 301–320.
Thomas JK and Ellison EH (2001) Various aspects of the constraints imposed on the photochemistry of systems in porous silica. Advances in Colloid and Interface Science 89–90: 195–238.
Wendlandt WW and Hecht HG (1966) Reflectance Spectroscopy. Chemical Analysis Series, vol. 21. New York: Interscience Publishers.
Wilkinson F and Kelly G (1989) Diffuse reflectance flash photolysis. In: Scaiano JC (ed.) Handbook of Organic Photochemistry, vol. 1, pp. 293–314. Boca Raton, FL: CRC Press.

Laser Manipulation in Polymer Science

S Ito, Y Hosokawa and H Masuhara, Osaka University, Osaka, Japan

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Laser manipulation is a method for manipulating single particles with sizes of less than a few tens of micrometers, in which the optical pressure of a focused laser beam is used to trap and control the particles without any mechanical contact. As infrared lasers used as trapping light sources have become more user-friendly, even operators who are not familiar with lasers and microscopes can perform laser manipulation, and the laser manipulators already on sale have attracted significant attention as a new tool. It is especially interesting to combine this method with nanotechnology and biotechnology, which progressed rapidly during the 1990s. In this article, the principle and method of laser manipulation are described, and its applications and possibilities in nanotechnology and biotechnology are then summarized.

Principle and Method

When light is reflected or refracted at the interface between two media with different refractive indices, the momentum of the photons is changed, which leads to the generation of a photon force as a reaction to the momentum change. For example, when a laser beam is reflected by a mirror, the momentum of the photons is changed by $\Delta p$ (Figure 1a). As a result, a photon force, $F_{phot}$, acts on the mirror, pushing it away from the reflected beam. In refraction the momentum of the photons is also changed, so that a photon force acts on the interface, as shown in Figure 1b. Thus light not only gives its energy to materials via absorption but also applies a mechanical force to them. However, when we are exposed to light, such as that from a halogen lamp, we are never aware of the photon force, because its magnitude is less than of the order of a piconewton (pN). A photon force of pN order, acquired from a focused laser beam, is nevertheless sufficient to manipulate nm–μm-sized particles in solution under an optical microscope.

Figure 1 Optical pressure originating from the momentum change of photons.

Principle of Laser Trapping

If the size of a particle is larger than the wavelength of the trapping laser beam (μm-sized particles), the principle of single-beam gradient force optical trapping can be explained in terms of geometrical optics. When a tightly focused laser beam is irradiated onto a transparent dielectric particle, the incident beam is refracted at the interface between the particle and the medium, as represented in Figure 2a. The propagation direction of the beam is changed, i.e., the momentum of the photons is changed, and consequently a photon force is generated. As the laser beam leaves the particle and enters the surrounding medium, this refraction causes a photon force to be exerted again on that interface. Summing the force contributions of all rays, if the refractive index of the particle is higher than that of the medium, the resultant force exerted on the particle is directed toward the focal point as an attractive force. Reflection at the surface of the particle is small if the particle is transparent and the refractive index ratio of particle to medium is close to unity; the incident beam is then reflected at the two surfaces by only a small amount and, as a result, the particle is pushed slightly in the propagation direction of the incident light. Where the particle absorbs the incident beam, a photon force is also generated that pushes it in the propagation direction. These effects of reflection and absorption are negligible in trapping experiments on transparent particles, such as polymer particles and silica microspheres. However, particles with high reflectance and absorption coefficients, such as metallic particles, experience a dominant force at the surface, where absorption and reflection occur. This force is repulsive, directed away from the focal point; consequently, metallic microparticles cannot be optically trapped.

Figure 2 Principle of single-beam gradient force optical trapping explained by ray optics.

If a dielectric particle is much smaller than the wavelength of the trapping light (nm-sized), it can be regarded as a point dipole (the Rayleigh approximation), and the photon force ($F_{phot}$) acting on it is given by

$$F_{phot} = F_{grad} + F_{scat} \qquad [1]$$

Here, $F_{grad}$ and $F_{scat}$ are called the gradient force and the scattering force, respectively. The scattering force is caused by the scattering of light, and it pushes the particle in the direction of light propagation. The gradient force, on the other hand, is generated when a particle is placed in an inhomogeneous electric field. If the dielectric constant of the particle is higher than that of the surrounding medium, the gradient force pushes the particle toward the higher-intensity region of the beam. In laser trapping of dielectric nanoparticles, the magnitude of the gradient force is much larger than that of the scattering force; consequently, the particle is trapped at the focal point of the trapping laser beam, where the beam intensity (electric field intensity) is a maximum. The photon force can be expressed as follows:

$$F_{phot} \approx F_{grad} = \frac{1}{2}\varepsilon_m \alpha \nabla|E|^2 \qquad [2]$$

$$\alpha = 3V\,\frac{\varepsilon_p - \varepsilon_m}{\varepsilon_p + 2\varepsilon_m} \qquad [3]$$

where $E$ is the electric field of the light, $V$ is the volume of the particle, and $\varepsilon_p$ and $\varepsilon_m$ are the dielectric constants of the particle and the medium, respectively. In trapping nanoparticles that absorb the laser beam, such as gold nanoparticles, the complex dielectric constant and the attenuation of the electric field within the particle need to be taken into consideration. Although the forces due to light absorption and scattering both propel the particle in the direction of light propagation, the magnitude of the gradient force is much larger than these forces. This is in contrast to the μm-sized metallic particle, which cannot be optically trapped.
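An order-of-magnitude check of eqns [2] and [3] can be sketched as follows. Everything numerical here is an assumption chosen for illustration (particle size, refractive indices, laser power, and the crude estimate $\nabla|E|^2 \approx |E|^2/w_0$), not values taken from the text.

```python
# Rough gradient-force estimate for a Rayleigh dielectric particle,
# using eqns [2] and [3]; all inputs are illustrative assumptions.
import numpy as np

EPS0, c = 8.854e-12, 3.0e8
n_p, n_m = 1.59, 1.33                     # polystyrene in water
eps_p, eps_m = n_p**2, n_m**2             # optical dielectric constants

r = 50e-9                                 # particle radius (m)
V = 4.0 / 3.0 * np.pi * r**3
alpha = 3 * V * (eps_p - eps_m) / (eps_p + 2 * eps_m)     # eqn [3]

P, w0 = 0.1, 0.5e-6                       # 100 mW focused to ~0.5 um waist
I = P / (np.pi * w0**2)                   # on-axis intensity (W/m^2)
E2 = 2 * I / (c * n_m * EPS0)             # |E|^2 from I = (1/2) c n eps0 |E|^2
F = 0.5 * eps_m * EPS0 * alpha * E2 / w0  # eqn [2] with grad|E|^2 ~ |E|^2/w0
print(f"F_grad ~ {F * 1e12:.2f} pN")      # sub-pN, i.e. the pN order quoted
```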

Laser Manipulation System

An example of a laser micromanipulation system, with a pulsed laser to induce photoreactions, is shown in Figure 3. A linearly polarized laser beam from a CW Nd³⁺:YAG laser is converted to circularly polarized light by a λ/4 plate and then split into horizontally and vertically polarized beams by a polarizing beamsplitter (PBS1). The two laser beams are combined by another polarizing beamsplitter (PBS2), introduced coaxially into an optical microscope, and then focused onto a sample through a microscope objective. The focal spots of both beams in the sample solution can be scanned independently by two sets of computer-controlled galvano mirrors. Even if the beams are spatially overlapping, interference does not take place because of their orthogonal polarizations. Pulsed lasers, such as a Q-switched Nd³⁺:YAG laser, are used to induce photoreactions such as photopolymerization, photothermal reactions, and laser ablation.

Figure 3 A block diagram of a dual-beam laser manipulation-reaction system.

Laser Manipulation and Patterning of Nanoparticles

Laser manipulation techniques enable us to capture and mobilize fine particles in solution. Most studies using this technique have been conducted on μm-sized objects such as polymer particles, microcrystals, and living cells. Because it is difficult to identify individual nanoparticles with an optical microscope, laser manipulation techniques have rarely been applied to nanotechnology and nanoscience. However, the laser trapping technique can be a powerful tool for the manipulation of nanoparticles in solution where individual particles can be observed, which is achieved by detecting the fluorescence emission from labeled dyes or the scattered light. Single-molecule fluorescence spectroscopy has now been achieved by the use of highly sensitive photodetectors, and a single metallic nanoparticle can be examined by detecting the light scattered from it. There have been several reports on the application of laser manipulation techniques to the patterning of nm-sized particles. Here, fixation methods for nanoparticles, using the laser manipulation technique and local photoreactions, are introduced.

Patterning of Polymer Nanoparticles

Patterning of individual polymer nanoparticles onto a substrate can be achieved by using local photopolymerization; the following example shows the strength of this method. Polystyrene nanoparticles of 220 nm containing fluorescent dye were dispersed in an ethylene glycol solution containing a polymerizable vinyl monomer (acrylamide, 31 wt%), a crosslinker (N,N′-methylenebisacrylamide, 2.2 wt%), and a radical photoinitiator (Irgacure 2959, Ciba Specialty Chemicals, 1.1 wt%). When the sample was irradiated by blue light from a high-pressure mercury lamp, green fluorescence from the dye molecules within each nanoparticle was observed. A nanoparticle that entered the region irradiated by a near-infrared laser beam (1064 nm) was trapped at the focal point and moved onto the surface of the glass substrate by handling the 3D stage of the microscope. An additional fixation laser beam (355 nm, 0.03 mJ, pulse duration ~6 ns, repetition rate ~5 Hz) was then focused on the same point for ~10 s, which led to the generation of acrylamide gel around the trapped nanoparticle. By repeating the procedure, patterning of single polymer nanoparticles on a glass substrate was achieved; a fluorescence image of single nanoparticles forming the letter 'H' is shown in Figure 4. A magnified atomic force microscope (AFM) image of one of the fixed nanoparticles is also shown, confirming that only one polymer nanoparticle was contained in the polymerized gel.

Figure 4 (a) A fluorescence image of spatially patterned individual polymer nanoparticles as the letter 'H' in distilled water. (b) A magnified AFM image of one of the produced acrylamide gels containing only one polymer nanoparticle on a glass substrate in air.

Multiple polymer nanoparticles can also be gathered, patterned, and fixed on a glass substrate by scanning both the trapping and fixation laser beams with two pairs of galvano mirrors. The optical transmission and fluorescence images of the 'H'-patterned nanoparticles on a glass substrate are shown in Figures 5a and 5b, respectively. The letter 'H' consists of three straight lines of patterned and fixed nanoparticles. The trapping laser beam (180 mW) was scanned at 30 Hz along each line, ~10 μm in length, on the glass substrate for 300 s. Nanoparticles were gathered and patterned along the locus of the focal point on the substrate. The fixation laser beam (0.097 mJ) was then scanned for a further 35 s. As a result, the straight lines of patterned nanoparticles were fixed in the acrylamide gel generated on the substrate. By combining several simple fixed patterns, more complex arrangements can be created using the present manipulation and fixation techniques.

Figure 5 Optical transmission (a) and fluorescence (b) images of patterned and fixed polymer nanoparticles on the glass substrate, as the letter 'H'.

Fixation of Individual Gold Nanoparticles

Manipulation and fixation of single metallic nanoparticles in solution has been achieved by means of photo-induced transient melting of the nanoparticles, as the following example describes. Gold nanoparticles (diameter ~80 nm) were dispersed in ethylene glycol. In order to identify a single gold nanoparticle, the extinction of light from the halogen lamp of the optical microscope by the particle was utilized; the trapped particle was observed as a black spot in the transmission image. The optically trapped single gold nanoparticle was then transferred to a precise position on the surface of a glass substrate in the sample solution. A focused 355 nm pulse was additionally irradiated onto the pressed nanoparticle, which led to a transient temperature elevation of the gold nanoparticle, enabling its fixation. It was confirmed by AFM observation that, at suitable laser fluences (32–64 mJ cm⁻²), a gold nanoparticle was fixed on the glass substrate without fragmentation. By repeating the same manipulation and fixation procedure, single gold nanoparticles could be patterned on a glass substrate. Figure 6 shows an AFM image of successive spatial patterning of single 80 nm gold particles.

Figure 6 An AFM image of fixed and patterned gold nanoparticles, as the letter 'I'.

The significance of the laser manipulation-fixation technique is that we can trap, manipulate, and fix single and/or many nanoparticles in solution at room temperature. We have already demonstrated the assembly of entangled polymer chains of 10–20 nm mean radius by laser trapping, and the formation of a single μm-sized particle. Thus, various kinds of nanoscopic materials can be manipulated using these techniques, and it is believed that the present nanomanipulation-fixation method will be useful for future nanoscience and nanotechnology.
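Because fixation succeeds only within a fluence window, it is often useful to convert pulse energy and spot size into fluence. The helper below is a hypothetical sketch; the energy and spot diameter in the call are illustrative, not the experimental parameters of the study.

```python
# Laser fluence (mJ cm^-2) from pulse energy and spot diameter; the
# numbers used in the call are hypothetical illustrations.
import math

def fluence_mj_cm2(pulse_energy_mj, spot_diameter_um):
    area_cm2 = math.pi * (spot_diameter_um * 1e-4 / 2.0) ** 2
    return pulse_energy_mj / area_cm2

print(f"{fluence_mj_cm2(0.001, 50.0):.0f} mJ/cm^2")   # ~51 mJ/cm^2
```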

Application to Biotechnology

Recent progress in biotechnology using single-cell manipulation by laser and microscope has been attracting significant attention. In conventional methods, cell manipulation has been performed mechanically using microneedles and micropipettes. Laser manipulation, however, can be applied as a form of microtweezers. In comparison with mechanical manipulation, laser manipulation has the advantage of involving no contact with the cells, and it can perform complex and characteristic movements, for example rotation, separation, and accumulation. Furthermore, by combining it with a microcutter based on laser ablation, cell fusion and the extraction/injection of organelles from/into cells can be achieved. In this section, single-cell manipulation achieved by laser manipulation is introduced.

Noncontact Rotation of Cells Using Dual-Beam Laser Manipulation

A notable feature of laser-cell manipulation is that multiple laser tweezers can be operated independently without any contact. Here, noncontact rotation of the fission yeast Schizosaccharomyces pombe (h⁻) demonstrates this method. The yeast has the shape of an ellipse, with major and minor axes of 8 and 3 μm, and is shown set in a dual-beam laser manipulation system in Figure 7. One trapping beam, A, was focused at one end of the cell to anchor it; the other beam, B, was focused at the opposite end and scanned circularly around beam A by controlling the galvano mirrors. Rotation of the yeast cell was realized at a frequency of less than 2 Hz, as shown in Figure 8. Such cell manipulation is impossible by mechanical means and indicates the superior performance of laser manipulation.

Figure 7 An experimental setup of dual-beam laser manipulation and microfluidic devices.

Figure 8 Rotation of fission yeast cells using dual-beam laser manipulation. The bars are 10 μm.

Transfer of Cells in a Microchannel

Flow cytometry to physically separate and identify specific types of cells from heterogeneous populations by fluorescence, known as fluorescence-activated cell sorting (FACS), has attracted significant attention as an important technique in biotechnology. In the separation process, charged single droplets containing single fluorescence-labeled cells are selected by passing them between two high-voltage deflection plates. Since this is a sequential process which does not use microscopy, flexible and high-purity cell separation is limited. To overcome this problem, a selective cell separation system,


combining laser trapping with a microchannel, is used. The method using microscopy is useful and flexible, because cells are not simply detected by fluorescence but can also be identified by their shape, size, surface morphology, absorption, etc. However, cell sorting under a microscope is not processed efficiently by laser trapping alone; a more effective method is described in the following. The microchannel was prepared by laser microfabrication and photopolymerization; its top view is shown in Figure 9a, and it was then set on the microscope stage. Three syringes were equipped with homemade pumps A, B, and C, and connected to a microchip. There were two wing-like chambers in the microchip, which were connected to pumps A and C. Between the chambers there was a microchannel 100 μm wide, which joined the chambers and crossed a long microchannel connected to pump B, thus forming a drain. The thickness of the chambers was 100 μm. By controlling the pumps, the left and right chambers were filled with culture medium including the yeast cells and with neat culture medium, respectively, as shown in Figure 9b. Individual cells were transferred from the left to the right chamber by using a single laser beam and handling an electrically movable microscope stage. The trapping laser irradiated a yeast cell in the left chamber, and the trapped cell was transferred to the right chamber. A representative demonstration is given in Figure 9c. The position of the trapped yeast cell was controlled by the microscope stage at a velocity of 20 μm/s. Single yeast cells were successfully transferred from the left to the right chamber at a rate of one cell per 26 s. By combining laser trapping with the microchannel, separation of single cells was achieved under a microscope.

Figure 9 (a) A microchip fabricated by the laser polymerization technique and (b) its schematic representation, corresponding to the dashed area in the microchip. (c) Cell transfer in the microchip using laser trapping. The bars are 100 μm. A trapping laser beam is fixed at the center of each picture, by which a particle is trapped and transferred from the left to the right chamber. The position of each picture is shown in (b).

Collection and Alignment of Cells

When cell sorting using laser trapping is applied to biotechnology, the time needed to transfer cells should be short compared with the times mentioned above. One method is to transfer multiple cells simultaneously by developing dual-beam laser manipulation, which is shown schematically in Figure 10. One trapping beam, A, was scanned along a line, indicated by an arrow in Figure 10a, at a rate faster than the motion of the cells. The other trapping beam, B, was used to move individual cells onto the line sequentially (Figure 10b). Finally, by moving the microscope stage, the many particles trapped at the scanning line can be transferred to another chamber (Figure 10c).

Figure 10 A schematic illustration of efficient collection and transfer of cells based on dual-beam laser manipulation.

An example is shown in Figure 11. Trapping beam A was scanned at 15 Hz along a line 30 μm in length. Trapping beam B was used to trap a yeast cell and move it to the line, controlled by a computer mouse pointer. The cells stored on the line were successfully isolated; with this method three cells can be collected within 15 s. Furthermore, the cells could be transferred at a velocity of 20 μm/s by driving the microscope stage. By improving the presented cell-sorting system, it is expected that flexible and high-purity cell separation will be realized.

Figure 11 Collection and alignment of fission yeast cells using dual-beam laser manipulation.

List of Units and Nomenclature

Laser fluence (light energy density per pulse)  [mJ cm⁻²]

See also

Time-Resolved Fluorescence: Measurements in Polymer Science.

Further Reading

Ashkin A, Dziedzic MJ, Bjorkholm JE and Chu S (1986) Observation of a single-beam gradient force optical trap for dielectric particles. Optics Letters 11: 288–290.
Ashkin A, Dziedzic MJ and Yamane T (1987) Optical trapping and manipulation of single cells using infrared laser beams. Nature 330: 769–771.
Berns WM, Aist J, Edwards J, et al. (1981) Laser microsurgery in cell and developmental biology. Science 213: 505–513.
Hoffmann F (1996) Laser microbeams for the manipulation of plant cells and subcellular structures. Plant Science 113: 1–11.
Hosokawa Y, Masuhara H, Matsumoto Y and Sato S (2002) Dual-beam laser micromanipulation for sorting biological cells and its device application. Proceedings of SPIE 4622: 138–142.
Ito S, Yoshikawa H and Masuhara H (2001) Optical patterning and photochemical fixation of polymer nanoparticles on glass substrates. Applied Physics Letters 78: 2566–2568.
Ito S, Yoshikawa H and Masuhara H (2002) Laser manipulation and fixation of single gold nanoparticles in solution at room temperature. Applied Physics Letters 80: 482–484.
Kamentsky AL, Melamed RM and Derman H (1965) Spectrophotometer: new instrument for ultrarapid cell analysis. Science 150: 630–631.
Kim BH, Kogi O and Kitamura N (1999) Single-microparticle measurements: laser trapping-absorption microscopy under solution-flow conditions. Analytical Chemistry 71: 4338–4343.
Masuhara H, De Schryver FC, Kitamura N and Tamai N (eds) (1994) Microchemistry: Spectroscopy and Chemistry in Small Domains. Amsterdam: Elsevier Science.
Sasaki K, Koshioka M, Misawa H, Kitamura N and Masuhara H (1991) Laser-scanning micromanipulation and spatial patterning of fine particles. Japanese Journal of Applied Physics 30: L907–L909.
Schütze K, Clement-Segelwald A and Ashkin A (1994) Zona drilling and sperm insertion with combined laser microbeam and optical tweezers. Fertility and Sterility 61: 783–786.
Shikano S, Horio K, Ohtsuka Y and Eto Y (1999) Separation of a single cell by red-laser manipulation. Applied Physics Letters 75: 2671–2673.
Svoboda K and Block SM (1994) Optical trapping of metallic Rayleigh particles. Optics Letters 19: 930–932.

Nonlinear Spectroscopies

S R Meech, University of East Anglia, Norwich, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Chemistry is concerned with the induction and observation of changes in matter, where the changes are to be understood at the molecular level. Spectroscopy is the principal experimental tool for connecting the macroscopic world of matter with the microscopic world of the molecule, and is, therefore, of central importance in chemistry. Since its invention, the laser has greatly expanded the capabilities of the spectroscopist. In linear spectroscopy, the monochromaticity, coherence, high intensity, and high degree of polarization of laser radiation are ideally suited to high-resolution spectroscopic investigations of even the weakest transitions. The same properties allowed, for the first time, the investigation of the nonlinear optical response of a medium to intense radiation. Shortly after the foundations of nonlinear optics were laid, it became apparent that these nonlinear optical signals could be exploited in molecular spectroscopy, and since then a considerable number of nonlinear optical spectroscopies have been developed. This short article is not a comprehensive review of all these methods. Rather, it is a discussion of some of the key areas in the development of the subject, and indicates some current directions in this increasingly diverse area.

Nonlinear Optics for Spectroscopy

Introduction Chemistry is concerned with the induction and observation of changes in matter, where the changes are to be understood at the molecular level. Spectroscopy is the principal experimental tool for connecting the macroscopic world of matter with the microscopic world of the molecule, and is, therefore, of central importance in chemistry. Since its invention, the laser has greatly expanded the capabilities of the spectroscopist. In linear spectroscopy, the monochromaticity, coherence, high intensity, and high degree of polarization of laser radiation are ideally suited to high-resolution spectroscopic investigations of even the weakest transitions. The same properties allowed, for the first time, the investigation of the nonlinear optical response of a medium to intense radiation. Shortly after the foundations of nonlinear optics were laid, it became apparent that these nonlinear optical signals could be exploited in molecular spectroscopy, and since then a considerable number of nonlinear optical spectroscopies have been developed. This short article is not a comprehensive review of all these methods. Rather, it is a discussion of some of the key areas in the development of the subject, and indicates some current directions in this increasingly diverse area.

where 10 is the vacuum permittivity, xðnÞ is the nth order nonlinear susceptibility, the indices represent directions in space, and the implied summation over repeated indices convention is used. The signal field, resulting from the nonlinear polarization, is calculated by substituting it as the source polarization in Maxwell’s equations and converting the resulting field to the observable, which is the optical intensity. The nonlinear polarization itself arises from the anharmonic motion of electrons under the influence of the oscillating electric field of the radiation. Thus, there is a microscopic analog of eqn [1] for the induced molecular dipole moment, mi : 23 mi ¼ ½10 aij dj þ 122 0 bijk dj dk þ 10 gijkl dj dk dl þ · · ·

½2

in which α is the polarizability, β the first molecular hyperpolarizability, γ the second, and so on. The power series is expressed in terms of the displacement field d, rather than E, to account for local field effects. This somewhat complicates the relationship between the molecular hyperpolarizabilities, for example γ_ijkl, and the corresponding macroscopic susceptibility, χ⁽³⁾_ijkl, but it is nevertheless generally true that a molecule exhibiting a high value of γ_ijkl will also yield a large third-order nonlinear polarization. This relationship between the macroscopic and microscopic parameters is the basis for an area of chemistry which has influenced nonlinear optics (rather than the inverse). The synthesis of molecules which have giant molecular hyperpolarizabilities has been an active area, because of their potential applications in laser and electro-optic technology. Such molecules commonly exhibit a number of properties, including an extended linear π-electron conjugation and a large dipole moment which changes between the ground and excited electronic states. These properties are in accord with the predictions of theory and of quantum chemical calculations of molecular hyperpolarizabilities.

Nonlinear optical spectroscopy in the frequency domain is carried out by measuring the nonlinear signal intensity as a function of the frequency (or frequencies) of the incident radiation. Spectroscopic information is accessible because the molecular hyperpolarizability, and therefore the nonlinear susceptibility, exhibits resonances: the signal is enhanced when one or more of the incident or generated frequencies are resonant with the frequency of a molecular transition. The rich array of nonlinear optical spectroscopies arises in part from the fact that with more input fields there are more accessible resonances than is the case in linear spectroscopy. As an example we consider the third-order term, χ⁽³⁾_ijkl E_j E_k E_l, in the practically important case of two incident fields at the same frequency, ω₁, and a third at ω₂, where the difference between the two frequencies is close to the frequency of a Raman active vibrational mode, Ω_rg (Figure 1). The resulting susceptibility can be calculated to have the form:

\[ \chi^{(3)}_{ijkl}(-2\omega_1+\omega_2;\,\omega_1,\omega_1,-\omega_2) = \frac{N\,\Delta\rho_{rg}\,\Omega_{rg}\left(\alpha^{R}_{ij}\alpha^{R}_{kl} + \alpha^{R}_{ik}\alpha^{R}_{jl}\right)}{12\hbar\left[\Omega_{rg}^{2} - (\omega_1-\omega_2)^{2} + \Gamma^{2} - 2i(\omega_1-\omega_2)\Gamma\right]} \qquad [3] \]

in which the α^R are elements of the Raman susceptibility tensor, Γ is the homogeneous linewidth of the Raman transition, Δρ_rg a population difference, and N the number density. A diagram illustrating this process is shown in Figure 1, where the signal field is at the anti-Stokes Raman frequency (2ω₁ − ω₂). The spectroscopic method which employs this scheme is called coherent anti-Stokes Raman spectroscopy (CARS) and is one of the most widely applied nonlinear optical spectroscopies (see below).
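The dispersive resonance described by eqn [3] is easy to explore numerically. The following minimal Python sketch lumps the constant numerator of eqn [3] into a single amplitude and locates the maximum of |χ⁽³⁾|² as the difference frequency ω₁ − ω₂ is tuned through Ω_rg; all parameter values are illustrative assumptions rather than values taken from this article.

```python
import numpy as np

# Minimal sketch of the CARS resonance of eqn [3]. The numerator
# N*dRho*Omega*(alpha^R products)/(12*hbar) is lumped into a single
# amplitude A; all numbers are illustrative assumptions.
A = 1.0             # lumped amplitude (arbitrary units)
Omega_rg = 1000.0   # Raman transition frequency (cm^-1, assumed)
Gamma = 5.0         # homogeneous linewidth (cm^-1, assumed)

delta = np.linspace(900.0, 1100.0, 4001)    # omega_1 - omega_2 (cm^-1)
chi3 = A * Omega_rg / (Omega_rg**2 - delta**2 + Gamma**2
                       - 2j * delta * Gamma)

cars = np.abs(chi3)**2                      # CARS intensity ~ |chi(3)|^2
print(f"peak at omega_1 - omega_2 = {delta[np.argmax(cars)]:.1f} cm^-1; "
      f"width of order 2*Gamma = {2 * Gamma:.0f} cm^-1")
```

Adding a real constant to chi3 before taking the modulus squared mimics the distorted lineshapes caused by the nonresonant background discussed later for solution-phase CARS.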


Figure 1 An illustration of the resonance enhancement of the CARS signal, ω_sig = 2ω₁ − ω₂, when ω₁ − ω₂ = Ω_rg.

Clearly, from eqn [3], we can see that the signal will be enhanced when the difference frequency is resonant with the Raman transition frequency. With essentially the same experimental geometry there will also be nonlinear signals generated at both the Stokes frequency (2ω₂ − ω₁) and at the lower of the incident frequencies, ω₂. These signals have distinct phase matching conditions (see below), so they can easily be discriminated from one another, both spatially and energetically. Additional resonance enhancements are possible if either of the individual frequencies is resonant with an electronic transition of the molecule, in which case information on Raman active modes in the excited state is also accessible.

It is worth noting here that there is an equivalent representation of the nth-order nonlinear susceptibility tensor χ⁽ⁿ⁾(ω_sig; ω₁, …, ω_n) as a time domain response function, R⁽ⁿ⁾(t₁, …, t_n). While it is possible to freely transform between them, the frequency domain representation is the more commonly used. However, the response function approach is increasingly applied in time domain nonlinear optical spectroscopy when optical pulses shorter than the homogeneous relaxation time are used. In that case, the time ordering of the incident fields, as well as their frequencies, is of importance in defining an experiment.

An important property of many nonlinear optical spectroscopies is the directional nature of the signal, illustrated in Figure 2. The directional nonlinear signal arises from the coherent oscillation of induced dipoles in the sample. Constructive interference can lead to a large enhancement of the signal strength. For this to occur the individual induced dipole moments must oscillate in phase: the signal must be phase matched. This requires the incident and the generated frequencies to travel in the sample with the same phase velocity, ω/k_ω = c/n_ω, where n_ω is the index of refraction and k_ω is the wave propagation constant at frequency ω. For the simplest case of second-harmonic generation (SHG), in which the incident field oscillates at ω and the generated field at 2ω, phase matching requires 2k_ω = k_{2ω}. The most efficient generation of the second harmonic occurs when the phase mismatch Δk = 2k_ω − k_{2ω} = 0, but this is not generally fulfilled, owing to the dispersion of the medium (n_ω < n_{2ω}). In the case of SHG, a coherence length L can be defined as the distance traveled before the two waves are 180° out of phase, L = |π/Δk|. For cases in which the input frequencies also have different directions, as is often the case when laser beams at two frequencies are combined (e.g., CARS), the phase-matching condition must be expressed as a vectorial relationship; hence, for CARS, Δk = k_AS − 2k₁ + k₂ ≈ 0, where k_AS is the wavevector of the signal at the anti-Stokes frequency (Figure 2). Thus, for known input wavevectors, one can easily calculate the expected direction of the output signal, k_s. This is illustrated for a number of important cases in Figure 2.
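To make the coherence length concrete, the sketch below evaluates Δk = 2k_ω − k_{2ω} and L = |π/Δk| for a collinear SHG geometry; the wavelength and the two refractive indices are illustrative assumptions, not data from this article.

```python
import numpy as np

# Sketch: collinear SHG phase mismatch and coherence length.
# Wavelength and refractive indices are illustrative assumptions.
lam = 800e-9                   # fundamental wavelength (m)
n_w, n_2w = 1.6535, 1.6648     # n at omega and 2*omega (assumed)

k_w = 2 * np.pi * n_w / lam            # propagation constant at omega
k_2w = 2 * np.pi * n_2w / (lam / 2)    # propagation constant at 2*omega
dk = 2 * k_w - k_2w                    # phase mismatch
L = abs(np.pi / dk)                    # coherence length, L = |pi/dk|

print(f"dk = {dk:.3e} m^-1; coherence length L = {L * 1e6:.1f} um")
```

For the dispersion assumed here, L is only tens of micrometers, which is why the angle tuning of birefringent crystals described in the next section is so valuable.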


Figure 2 Experimental geometries and corresponding phase-matching diagrams for (a) CARS, (b) DFWM, (c) RIKES. Note that only the RIKES geometry is fully phase matched.

Nonlinear Optical Spectroscopy in Chemistry

As already noted, there is a vast array of nonlinear optical spectroscopies, so it is clear some means of classification will be required. For coarse graining, the order of the nonlinear process is very helpful, and that is the scheme we will follow here.

Second-Order Spectroscopies

Inspection of eqn [1] shows that in gases, isotropic liquids, and solids where the symmetry point group contains a center of inversion, χ⁽²⁾ is necessarily zero. This is required to satisfy the symmetry requirement that the polarization must change sign when the direction of the field is inverted, yet for a quadratic, or any even-order, dependence on the field strength it must remain positive. Thus second-order nonlinearities might not appear very promising for spectroscopy. However, there are two cases in which second-order nonlinear optical phenomena are of very great significance in molecular spectroscopy: harmonic generation and the spectroscopy of interfaces.

Harmonic conversion

Almost every laser spectroscopist will have made use of second-harmonic or sum frequency generation for frequency conversion of laser radiation. Insertion of an oscillating field of frequency ω into eqn [1] yields a second-order polarization oscillating at 2ω. If two different frequencies are input, the second-order polarization oscillates at their sum and difference frequencies. In either case, the second-order polarization acts as a source for the second-harmonic (or sum, or difference frequency) emission, provided χ⁽²⁾ is nonzero. The latter can be arranged by selecting a noncentrosymmetric medium for the interaction, the growth of such media being an important area of materials science. Optically robust and transparent materials with large values of χ⁽²⁾ are available for the generation of wavelengths from shorter than 200 nm to longer than 5 μm. Since such media are birefringent by design, a judicious choice of angle and orientation of the crystal with respect to the input beams allows a degree of control over the refractive indices experienced by each beam. Under the correct phase matching conditions n_ω ≈ n_{2ω}, and very long interaction lengths result, so the efficiency of signal generation is high.
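The frequency content of the second-order term can be verified directly: squaring a monochromatic field and Fourier transforming shows components at 0 (optical rectification) and at 2ω, exactly as the substitution into eqn [1] predicts. A minimal sketch in arbitrary units:

```python
import numpy as np

# Sketch: the chi(2) term driven at omega generates components at 0
# (optical rectification) and 2*omega. Units are arbitrary.
w = 2 * np.pi                          # drive frequency (rad per unit time)
t = np.linspace(0.0, 200.0, 2**14)     # time grid, many optical cycles
E = np.cos(w * t)                      # monochromatic driving field

P2 = E**2                              # quadratic part of the polarization
spec = np.abs(np.fft.rfft(P2 - P2.mean()))       # remove the DC component
freq = 2 * np.pi * np.fft.rfftfreq(t.size, d=t[1] - t[0])

print(f"dominant oscillating component at {freq[np.argmax(spec)]:.2f} "
      f"rad per unit time (2w = {2 * w:.2f})")
```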


Higher-order harmonic generation in gases is an area of growing importance for spectroscopy in the deep UV and X-ray region. The generation of such short wavelengths depends on the ability of amplified ultrafast solid-state lasers to generate extremely high instantaneous intensities. The mechanism is somewhat different to the one outlined above. The intense pulse is injected into a capillary containing an appropriate gas. The high electric field of the laser causes ionization of atoms, and the electrons generated begin to oscillate in the applied laser field. The driven recombination of the electrons with the atoms results in the generation of the high harmonic emission. Although the mechanism differs from the SHG case, questions of phase matching are still important: by containing the gas in a corrugated waveguide, phase matching is achieved, considerably enhancing the intensity of the high harmonics. Photon energies of hundreds of electronvolts are possible using this technique. The generation of such high energies is not yet routine, but a number of potential applications are already apparent. A powerful coherent source of X-ray and vacuum UV pulses will certainly aid surface analysis techniques such as UV photoemission and X-ray photoelectron spectroscopy, and much excitement is currently being generated by the possibility of using intense ultrashort X-ray pulses to record structural dynamics on an ultrafast timescale.

Surface second-order spectroscopy

At an interface, inversion symmetry is absent by definition, so the second-order nonlinear susceptibility is finite. If the two bulk phase media are themselves isotropic, then even a weak second-order signal necessarily arises from the interface. This surface-specific all-optical signal is unique, because it can be used to probe the interface between two condensed phases. This represents a great advantage over every other form of surface spectroscopy. In linear optical spectroscopy, the signal due to species at the interface is usually swamped by contributions from the bulk phase. Other surface-specific signals do exist, but they rely on the scattering of heavy particles (electrons, atoms) and so can only be applied to the solid–vacuum interface. For this reason the techniques of surface SHG and SFG are widely applied in interface spectroscopy.

The most widely used method is sum frequency generation (SFG) between temporally overlapped tuneable infrared and fixed frequency visible lasers, to yield a sum frequency signal in the visible region of the spectrum. The principle of the method is shown in Figure 3. The surface nonlinear susceptibility exhibits resonances at vibrational frequencies, which are detected as enhancements in the visible SFG intensity.


Figure 3 The experimental geometry for SFG, and an illustration of the resonance enhancement of ω_sig = ω_IR + ω_vis at a Raman- and IR-allowed vibrational transition.

Although the signal is weak, it is directional and background free, so it is relatively easily measured by photon counting techniques. Thus, SFG is used to measure the vibrational spectra of essentially any optically accessible interface. There are, however, some limitations on the method. First, the surface must exhibit a degree of order: if the distribution of adsorbate orientations is isotropic, the signal again becomes zero by symmetry. Second, a significant enhancement at the vibrational frequency requires the transition to be both IR and Raman allowed, as suggested by the energy level diagram (Figure 3).

The SHG signal can also be measured as a function of the frequency of the incident laser, to recover the electronic spectrum of the interface. This method has been used particularly in the study of semiconductor surfaces, but generally the electronic spectra of adsorbates contain less information than their vibrational spectra. However, by measuring the SHG intensity as a function of time, information on adsorbate kinetics is obtained, provided some assumptions connecting the surface susceptibility to the molecular hyperpolarizability are made. Finally, using similar assumptions, it is possible to extract the orientational distribution of the adsorbate by measuring the SHG intensity as a function of the polarization of the input and output beams. For these reasons, SHG has been widely applied to analyze the structure and dynamics of interfaces.

Third-Order Spectroscopies

The third-order coherent Raman spectroscopies were introduced above. One great advantage of these methods over conventional Raman is that the signal is generated in a coherent beam, according to the appropriate phase matching relationship (Figure 2). Thus, the coherent Raman signal can easily be distinguished from background radiation by spatial filtering. This has led to CARS finding widespread application in measuring spectra in (experimentally)


hostile environments. CARS spectroscopy has been widely applied to record the vibrational spectra of flames. Such measurements would obviously be very challenging for linear Raman or IR, due to the strong emission from the flame itself. The directional CARS signal, in contrast, can be spatially filtered, minimizing this problem. CARS has been used both to identify unstable species formed in flames, and to probe the temperature of the flame (e.g., from measured population differences, eqn [3]). A second advantage of the technique is that the signal is only generated where the two input beams overlap in space. Thus, small volumes of the sample can be probed. By moving the overlap position around in the sample, spatially resolved information is recovered. Thus, it is possible to map the population of a particular transient species in a flame.

CARS is probably the most widely used of the coherent Raman methods, but it does have some disadvantages, particularly in solution phase studies. In that case the resonant χ⁽³⁾ signal (eqn [3]) is accompanied by a nonresonant third-order background. The interference between these two components may result in unusual and difficult to interpret lineshapes. In this case, some other coherent Raman methods are more useful.

The phase matching scheme for Raman-induced Kerr effect spectroscopy (RIKES) was shown in Figure 2. The RIKES signal is always phase matched, which leads to a long interaction length. However, the signal is at ω₂ and in the direction of ω₂, which would appear to be a severe disadvantage in terms of detection. Fortunately, if the input polarizations are correctly chosen, the signal can be isolated by its polarization. In an important geometry, the signal (ω₂) is isolated by transmission through a polarizer oriented at 45° to a linearly polarized pump (ω₁). The pump is overlapped in the sample with the probe (ω₂), linearly polarized at −45°. Thus the probe is blocked by the polarizer, but the signal is transmitted. This geometry may be viewed as pump-induced polarization of the isotropic medium to render it birefringent, thus inducing ellipticity in the transmitted probe, such that the signal leaks through the analyzing polarizer (hence the alternative name, optical Kerr effect).

In this geometry, it is possible to greatly enhance signal-to-noise ratios by exploiting interference between the probe beam and the signal. Placing a quarterwave plate in the probe beam with its fast axis aligned with the probe polarization, and reorienting it slightly (~1°), yields a slightly elliptically polarized probe before the sample. A fraction of the probe beam, the local oscillator (LO), then also leaks through the analyzing polarizer, temporally and spatially overlapped with the signal. Thus, the signal

and LO fields are seen by the detector, which measures the intensity as:

\[ I(t) = \frac{nc}{8\pi}\left|E_{LO}(t) + E_S(t)\right|^2 = I_{LO}(t) + I_S(t) + \frac{nc}{4\pi}\,\mathrm{Re}\left[E_S^*(t)\cdot E_{LO}(t)\right] \qquad [4] \]

where the final term may be very much larger than the original signal, and is linear in χ⁽³⁾. This term is usually isolated from the strong I_LO by lock-in detection. This method is called optical heterodyne detection (OHD), and generally leads to excellent signal to noise. It can be employed with other coherent signals by artificially adding the LO to the signal, provided great care is taken to ensure a fixed phase relationship between LO and signal. In the RIKES experiment, however, the phase relationship is automatic. The arrangement described yields an out-of-phase LO, and measures the real part of χ⁽³⁾, the birefringence. Alternatively, the quarterwave plate is omitted, and the analyzing polarizer is slightly reoriented, to introduce an in-phase LO, which measures the imaginary part of χ⁽³⁾, the dichroism. This is particularly useful for absorbing media. The OHD-RIKES method has been applied to measure the spectroscopy of the condensed phase, and has found particularly widespread application in transient studies (below).
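The leverage provided by the cross term in eqn [4] can be quantified in a few lines. In this sketch the field amplitudes are arbitrary illustrative numbers, the LO and signal are taken as real and in phase, and the common prefactors nc/8π are dropped:

```python
# Sketch of the heterodyne advantage in eqn [4]; amplitudes are
# arbitrary, and the common nc/(8*pi) prefactors are dropped.
E_lo = 1.0     # local oscillator field amplitude
E_s = 1e-3     # weak signal field amplitude, in phase with the LO

I_signal = E_s**2           # homodyne signal intensity, ~|E_S|^2
I_cross = 2 * E_s * E_lo    # heterodyne cross term, ~Re[E_S* E_LO]

print(f"homodyne {I_signal:.1e}, heterodyne {I_cross:.1e}, "
      f"gain {I_cross / I_signal:.0f}x")
```

Because the cross term is linear in the signal field, it is also linear in χ⁽³⁾, which is what makes lock-in isolation of this term so attractive.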


Figure 4 Two cases of resonant two photon absorption. In (a) the excited state is two-photon resonant, and the process is detected by the emission of a photon. In (b) the intermediate state is resonant, and the final energy is above the ionization potential (IP) so that photocurrent or mass detection can be used.

Degenerate four-wave mixing (DFWM) spectroscopy is a simple and widely used third-order spectroscopic method. As the name implies, only a single frequency is required. The laser beam is split into three, and recombined in the sample, in the geometry shown in Figure 2. The technique is also known as laser-induced grating scattering. The first two beams can be thought of as interfering in the sample to write a spatial grating, with fringe spacing dependent on the angle between them. The signal is then scattered from the third beam in the direction expected for diffraction from that grating. The DFWM experiment has been used to measure electronic spectra in hostile environments, by exploiting resonances with electronic transitions. It has also been popular in the determination of the numerical value of χ⁽³⁾, partly because it is an economical technique, requiring only a single laser, but also because different polarization combinations make it possible to access different elements of χ⁽³⁾. The technique has also been used in time resolved experiments, where the decay of the grating is monitored by the diffraction intensity of a time-delayed third pulse.

Two-photon or, more generally, multiphoton excitation has applications in both fundamental spectroscopy and analytical chemistry. Two relevant level schemes are shown in Figure 4. Some property associated with the final state permits detection of the multiphoton absorption, for example, fluorescence in (a) and photocurrent in (b). Excitation of two-photon transitions, as in Figure 4a, is useful in spectroscopy because the selection rules are different to those for the corresponding one-photon transition. For example, the change in angular momentum quantum number in a two-photon transition is ΔL = 0, ±2, so, for example, an atomic S to D transition can be observed. High spectroscopic resolution may be attained using Doppler-free two-photon absorption spectroscopy. In this method, the excitation beams are arranged to be counter-propagating, so that the Doppler broadening is cancelled out in transitions where the two excitation photons arise from beams with opposing wavevectors. In this case, the spectroscopic linewidth is governed only by the homogeneous dephasing time.

The level scheme in Figure 4b is also widely used in spectroscopy, but in this case the spectrum of the intermediate state is obtained by monitoring the photocurrent as a function of ω₁. The general technique is known as resonance enhanced multiphoton ionization (REMPI) and yields high-quality spectra of intermediate states which are not detectable by standard methods, such as fluorescence. The sensitivity of the method is high, and it is the basis of a number of analytical applications, often in combination with mass spectrometry.
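The Doppler cancellation invoked above for counter-propagating beams can be checked with a toy calculation: a molecule with velocity v sees the two beams shifted to ω₀(1 + v/c) and ω₀(1 − v/c), so the two-photon sum is velocity independent. All numbers below are illustrative.

```python
import numpy as np

# Toy check of Doppler-free two-photon absorption. A molecule moving
# at velocity v sees opposite first-order shifts from the two beams.
c = 3.0e8                                # speed of light (m/s)
w0 = 1.0                                 # laser frequency (arb. units)
v = np.array([-700.0, -100.0, 0.0, 300.0, 900.0])   # velocities (m/s)

co = 2 * w0 * (1 + v / c)                # both photons from one beam
counter = w0 * (1 + v / c) + w0 * (1 - v / c)   # opposing wavevectors

print("co-propagating sum spread:     ", np.ptp(co))
print("counter-propagating sum spread:", np.ptp(counter))
```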

Ultrafast Time Resolved Spectroscopy

The frequency and linewidth of a Raman transition may be extracted from the CARS measurement, typically by combining two narrow bandwidth pulsed


lasers, and tuning one through the resonance while measuring the nonlinear signal intensity. The time resolved analogue requires two pulses, typically of a few picoseconds duration (and therefore a few wavenumbers bandwidth), at ω₁ and ω₂, to be incident on the sample. This pair coherently excites the Raman mode. A third pulse at ω₁ is incident on the sample a time t later, and stimulates the CARS signal at 2ω₁ − ω₂ in the phase-matched direction. The decay rate of the vibrational coherence is measured from the CARS intensity as a function of the delay time. Thus, the frequency of the mode is measured in the frequency domain, but the linewidth is measured in the time domain.

If very short pulses are used (such that the pulsewidth is shorter than the inverse frequency of the Raman active mode), the Raman transition is said to be impulsively excited, and the CARS signal scattered by the time delayed pulse reveals an oscillatory response at the frequency of the Raman active mode, superimposed on its decay. Thus, in this case, spectroscopic information is measured exclusively in the time domain. In the case of nonlinear Raman spectroscopy, similar information is available from the frequency and the time domain measurements, and the choice between them is essentially one of experimental convenience. For example, time domain CARS, RIKES, and DFWM spectroscopy have turned out to be particularly powerful routes to extracting low-frequency vibrational and orientational modes of liquids and solutions, thus providing detailed insights into molecular interactions and reaction dynamics in the condensed phase.
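The impulsive limit can be mimicked with a synthetic signal: a Raman mode of frequency ν dephasing with time constant T₂ appears in the delay scan as a damped oscillation, and a Fourier transform of the scan returns the mode frequency. Parameter values in the sketch below are illustrative assumptions.

```python
import numpy as np

# Sketch: impulsively excited Raman coherence read out in the time
# domain. Mode frequency and dephasing time are illustrative values.
nu = 3.0e12                     # mode frequency (Hz), i.e., ~100 cm^-1
T2 = 1.0e-12                    # dephasing time (s)

t = np.arange(0.0, 10e-12, 5e-15)                  # probe delay scan (s)
S = np.exp(-t / T2) * np.cos(2 * np.pi * nu * t)   # delay-scan signal

spec = np.abs(np.fft.rfft(S))
freq = np.fft.rfftfreq(t.size, d=t[1] - t[0])      # Hz
print(f"recovered frequency {freq[np.argmax(spec)]:.2e} Hz "
      f"(input {nu:.2e} Hz)")
```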


Other time domain experiments contain information that is not accessible in the frequency domain. This is particularly true of photon echo methods. The name suggests a close analogy with nuclear magnetic resonance (NMR) spectroscopy, and the (optical) Bloch vector approach may be used to describe both measurements, although the transition frequencies and timescales involved differ by many orders of magnitude. In the photon echo experiment, two or three ultrafast pulses with carefully controlled interpulse delay times are resonant with an electronic transition of the solute. In the two-pulse echo, the echo signal is emitted in the phase-matched direction at twice the interpulse delay, and its intensity as a function of time yields the homogeneous dephasing time associated with the transition. In the three-pulse experiment the pulses are separated by two time delays. By measuring the intensity of the stimulated echo as a function of both delay times it is possible to determine separately the dephasing time and the population relaxation time associated with the resonant transition. Such information is not accessible from linear spectroscopy, and can be extracted only with difficulty in the frequency domain. The understanding of photon echo spectroscopy has expanded well beyond the simple description given here, and it now provides unprecedented insights into optical dynamics in solution, and thus informs greatly our understanding of chemistry in the condensed phase. The methods have recently been extended to the infrared, to study vibrational transitions.

Higher-Order and Multidimensional Spectroscopies

The characteristic feature of this family of spectroscopies is the excitation of multiple resonances, which may or may not require measurements at χ⁽ⁿ⁾ with n > 3. Such experiments require multiple frequencies, and may yield weak signals, so they only became experimentally viable upon the availability of stable and reliable solid-state lasers and optical parametric generators. Measurements are made in either the time or the frequency domain, but in either case benefit from heterodyne detection. One of the earliest examples was two-dimensional Raman spectroscopy, where multiple Raman active modes are successively excited by temporally delayed pulse pairs, to yield a fifth-order nonlinear signal. The signal intensity measured as a function of both delay times (corresponding to the two dimensions) allows separation of homogeneous and inhomogeneous contributions to the line shape. This prodigiously difficult χ⁽⁵⁾ experiment has been completed in a few cases, but is plagued by interference from third-order signals.

More widely applicable are multidimensional spectroscopies using infrared pulses or combinations of them with visible pulses. The level scheme for one such experiment is shown in Figure 5 (which is one of many possibilities). From the scheme, one can see that the nonlinear signal in the visible depends on two resonances, so both can be detected. This can be regarded as a multiply resonant nondegenerate four-wave mixing (FWM) experiment. In addition, if the two resonant transitions are coupled, optical excitation of one affects the other. Thus, by measuring the signal as a function of both frequencies, the couplings between transitions are observed. These appear as cross peaks when the intensity is plotted in the two frequency dimensions, very much as in 2D NMR. This technique is already providing novel information on molecular structure and structural dynamics in liquids, solutions, and proteins.

Spatially Resolved Spectroscopy

A recent innovation is nonlinear optical microscopy.

Figure 5 Illustration of multiple resonance enhancements in a FWM geometry, from which 2D spectra may be generated.

The nonlinear dependence of signal strength on intensity means that nonlinear processes are localized at the focal point of a lens. When focusing is strong, such as in a microscope objective, spatial localization of the nonlinear signal can be dramatic. This is the basis of the two-photon fluorescence microscopy method, where a high repetition rate source of low-energy ultrafast pulses is focused by a microscope objective into a sample labeled with a fluorescent molecule, which has absorption at half the wavelength of the incident photons (Figure 4). The fluorescence is necessarily localized at the focal point because of its dependence on the square of the incident intensity. By measuring the intensity while scanning the position of the focal point in space, a 3D image of the distribution of the fluorophore is constructed. This technique turns out to have a number of advantages over one-photon fluorescence microscopy, most notably in terms of ease of implementation, minimization of sample damage, and depth resolution. The technique is widely employed in cell biology. Stimulated by the success of two-photon microscopy, further nonlinear microscopies have been developed, all relying on the spatial localization of the signal.
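The origin of the depth resolution can be seen from a Gaussian-beam estimate. Integrated over a transverse plane, the one-photon excitation (proportional to I) is the same in every plane, whereas the two-photon excitation (proportional to I²) falls off with the beam area and is confined to roughly a Rayleigh range of the focus. A sketch with an assumed Rayleigh range:

```python
import numpy as np

# Sketch: per-plane excitation for one- vs two-photon processes in a
# focused Gaussian beam. The Rayleigh range z_R is an assumed value.
z_R = 0.5e-6                            # Rayleigh range (m), assumed
z = np.linspace(-10e-6, 10e-6, 4001)    # axial coordinate (m)

area = 1.0 + (z / z_R)**2         # relative beam area, w(z)^2 / w0^2
one_photon = np.ones_like(z)      # plane integral of I: constant
two_photon = 1.0 / area           # plane integral of I^2: peaked

inside = z[two_photon >= 0.5]
print(f"two-photon excitation FWHM = "
      f"{(inside.max() - inside.min()) * 1e6:.1f} um (= 2 * z_R)")
```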


CARS microscopy has been demonstrated to yield 3D images of the distribution of vibrations in living cells; it would be difficult to recover such data by linear optical microscopy. SHG has also been applied in microscopy: in this case, by virtue of the symmetry selection rule referred to above, a 3D image of orientational order is recovered. Both these and other nonlinear signals provide important new information on complex heterogeneous samples, most especially living cells.

See also
Spectroscopy: Nonlinear Laser Spectroscopy; Raman Spectroscopy.

Further Reading
Andrews DL and Allcock P (2002) Optical Harmonics in Molecular Systems. Weinheim, Germany: Wiley-VCH.
Andrews DL and Demidov AA (eds) (2002) An Introduction to Laser Spectroscopy. New York: Kluwer Academic.
Butcher PN and Cotter D (1990) The Elements of Nonlinear Optics. Cambridge, UK: Cambridge University Press.
Cheng JX and Xie XS (2004) Coherent anti-Stokes Raman scattering microscopy: instrumentation, theory, and applications. Journal of Physical Chemistry B 108: 827.
de Boeij WP, Pshenichnikov MS and Wiersma DA (1998) Ultrafast solvation dynamics explored by femtosecond photon echo spectroscopies. Annual Review of Physical Chemistry 49: 99.
Eisenthal KB (1992) Equilibrium and dynamic processes at interfaces by second harmonic and sum frequency generation. Annual Review of Physical Chemistry 43: 627.
Fleming GR (1986) Chemical Applications of Ultrafast Spectroscopy. Oxford, UK: Oxford University Press.
Fleming GR and Cho M (1996) Chromophore-solvent dynamics. Annual Review of Physical Chemistry 47: 109.
Fourkas JT (2002) Higher-order optical correlation spectroscopy in liquids. Annual Review of Physical Chemistry 53: 17.
Hall G and Whitaker BJ (1994) Laser-induced grating spectroscopy. Journal of the Chemical Society, Faraday Transactions 90: 1.
Heinz TF (1991) Second order nonlinear optical effects at surfaces and interfaces. In: Ponath H-E and Stegeman GI (eds) Nonlinear Surface Electromagnetic Phenomena, p. 353.
Hesselink WH and Wiersma DA (1983) Theory and experimental aspects of photon echoes in molecular solids. In: Hochstrasser RM and Agranovich VM (eds) Spectroscopy and Excitation Dynamics of Condensed Molecular Systems, chap. 6. Amsterdam: North-Holland.
Levenson MD and Kano S (1988) Introduction to Nonlinear Laser Spectroscopy. San Diego, CA: Academic Press.
Meech SR (1993) Kinetic application of surface nonlinear optical signals. In: Lin SH, Fujimura Y and Villaeys A (eds) Advances in Multiphoton Processes and Spectroscopy, vol. 8, p. 281. Singapore: World Scientific.
Mukamel S (1995) Principles of Nonlinear Optical Spectroscopy. Oxford: Oxford University Press.
Rector KD and Fayer MD (1998) Vibrational echoes: a new approach to condensed-matter vibrational spectroscopy. International Reviews in Physical Chemistry 17: 261.
Richmond GL (2001) Structure and bonding of molecules at aqueous surfaces. Annual Review of Physical Chemistry 52: 357.
Shen YR (1984) The Principles of Nonlinear Optics. New York: Wiley.
Smith NA and Meech SR (2002) Optically heterodyne detected optical Kerr effect: applications in condensed phase dynamics. International Reviews in Physical Chemistry 21: 75.
Tolles WM, Nibler JW, McDonald JR and Harvey AB (1977) Review of theory and application of coherent anti-Stokes Raman spectroscopy (CARS). Applied Spectroscopy 31: 253.
Wright JC (2002) Coherent multidimensional vibrational spectroscopy. International Reviews in Physical Chemistry 21: 185.
Zipfel WR, Williams RM and Webb WW (2003) Nonlinear magic: multiphoton microscopy in the biosciences. Nature Biotechnology 21: 1368.
Zyss J (ed.) (1994) Molecular Nonlinear Optics. San Diego: Academic Press.

Photodynamic Therapy of Cancer
A J MacRobert and T Theodossiou, University College London, London, UK
© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

One of the most active areas of photomedical research in recent years has been the exploration of the use of light-activated drugs known as photosensitizers. These compounds may be activated using light, usually provided by a laser via an optical fiber which is placed at the site of the target lesion. This treatment is known as photodynamic therapy (PDT) and is being applied to the local destruction of malignant tumors and certain nonmalignancies. Activation of the photosensitizer results in the


generation of reactive oxidizing intermediates which are toxic to cells, and this process ultimately leads to tumor destruction. PDT is a relatively low-power, nonthermal, photochemical technique that uses fluence rates not exceeding 200 mW/cm² and total light doses, or fluences, of typically 100 J/cm². Generally red or near-infrared light is used, since tissue is relatively transparent at these wavelengths.

PDT is a promising alternative approach to the local destruction of tumors for several reasons. First, it is a minimally invasive treatment, since laser light can be delivered with great accuracy to almost any site in the body via thin flexible optical fibers, with minimal damage to overlying normal tissues. Second, the nature of PDT damage to tissues is such that healing is safer and more complete than after most other forms of local tissue destruction (e.g., radiotherapy). PDT is also capable of some degree of selectivity for tumors when the sensitizer levels, light doses, and irradiation geometry are carefully controlled. This selectivity is based on the higher sensitizer retention in tumors after administration, relative to the adjacent normal tissues in which the tumor arose (generally 3:1 for extracranial tumors, but up to 50:1 for brain tumors). The photosensitizer is administered intravenously to the patient and time is allowed (3–96 hours, depending on the sensitizer) for it to equilibrate in the body before the light treatment (Figure 1); this time is called the drug–light interval.

PDT may also be useful for treating certain nonmalignant conditions, in particular psoriasis and dysfunctional menorrhagia (a disorder of the uterus), and for the local treatment of infections, such as genital papillomas and infections of the mouth and upper gastrointestinal tract. In certain cases the photosensitizer may be applied directly to the lesions, particularly for the treatment of skin tumors, as discussed later.

Figure 1 Photodynamic therapy: from sensitization to treatment.

The main side-effect of PDT is skin photosensitivity, owing to retention of the drug in the skin, so patients must avoid exposure to sunlight in particular for a short period following treatment. Retreatment is then possible once the photosensitizer has cleared from the skin, since these drugs have little intrinsic toxicity, unlike many conventional chemotherapy agents.

Photoproperties of Photosensitizers

By definition, there are three fundamental requirements for obtaining a photodynamic effect: (a) light of the appropriate wavelength, matched to the photosensitizer absorption; (b) a photosensitizer; and (c) molecular oxygen. The ideal photochemical and biological properties of a photosensitizer may be easily summarized, although the assessment of a sensitizer in these terms is not as straightforward as might be supposed, because the heterogeneous nature of biological systems can sometimes profoundly affect these properties. Ideally, though, a sensitizer should possess the following attributes: (a) red or near-infrared light absorption; (b) nontoxic, and with low skin photosensitizing potency; (c) selective retention in tumors relative to normal adjacent tissue; (d) an efficient generator of cytotoxic species, usually singlet oxygen; (e) fluorescence, for visualization; (f) a defined chemical composition; and (g) preferably water soluble. A list of several photosensitizers possessing the majority of the above-mentioned attributes is given in Table 1.

The reasons for these requirements are partly self-evident, but worth amplifying. Strong absorption is desirable in the red and near-infrared spectral region, where tissue transmittance is optimum, enabling penetration of the therapeutic light within the tumor (Figure 2). To minimize skin photosensitization by solar radiation, the sensitizer absorption spectrum should ideally consist of a narrow red wavelength band, with little absorption at other wavelengths down to 400 nm, below which solar irradiation falls off steeply. Another advantage of red wavelength irradiation is that the potential mutagenic effects encountered with UV-excited sensitizers (e.g., psoralens) are avoided. Since the object of the treatment is the selective destruction of tumor tissue, leaving surrounding normal tissue undamaged, some degree of selective retention of the dye in tumor tissue is desirable. Unfortunately, the significance of this aspect has been exaggerated in the literature, and in many cases treatment selectivity owes more to careful light irradiation geometry. Nevertheless, many normal tissues have the capacity to heal safely following PDT damage.


Table 1 Properties of several photosensitizers

Compound | λ/nm (ε/M⁻¹ cm⁻¹) | Drug dose/mg kg⁻¹ | Light dose/J cm⁻² | Diseases treated
Hematoporphyrin (HpD–Photofrin–Photosan) | 628 (3.0 × 10³) | 1.5–5 | 75–250 | Early stage esophagus, bladder, lung, cervix, stomach, and mouth cancers. Palliative in later stages
ALA (converted to protoporphyrin IX) | 635 (5 × 10³) | 60 | 50–150 | Skin, stomach, colon, bladder, mouth cancers. Esophageal dysplasia. Various nonmalignant conditions
Benzoporphyrin derivative (BpD) | 690 (3.5 × 10⁴) | 4 | 150 | Age-related macular degeneration (AMD)
Tin etiopurpurin (SnET2–Purlytin) | 665 (3.0 × 10⁴) | 1.2 | 150–200 | Breast and skin cancers, AMD
Monoaspartyl chlorin e6 (MACE) | 660 (4.0 × 10⁴) | 1.0 | 25–200 | Skin cancers
Lutetium texaphyrin (Lu–Tex) | 732 (4.2 × 10⁴) | 1.0 | 150 | Metastatic brain tumors, breast cancers, atherosclerotic plaques
Aluminum disulfonated phthalocyanine | 675 (2.0 × 10⁵) | 1.0 | 50–200 | Brain, colon, bladder, and pancreatic cancers. Head and neck cancers in animal studies only
Metatetrahydroxychlorin (mTHPC–temoporfin–Foscan) | 652 (3.5 × 10⁴) | 0.15 | 5–20 | Head, neck, prostate, pancreas, lung, brain, biliary tract, and mouth cancers. Superior to HpD and ALA in mouth cancers
Palladium pheophorbide (Tookad) | 763 (8.6 × 10⁴) | 2.0 | – | Prostate cancer
Hypericin | 590 (3 × 10⁴) | – | – | Psoriasis

Figure 2 The absorption spectrum of m-THPC. The structure of m-THPC is shown in the inset.

The key photochemical property of photosensitizers is to mediate production of some active molecule which is cytotoxic, that is, will destroy cells. The first electronically excited state of molecular oxygen, so-called singlet oxygen, fulfills this role very well, and may be produced via the interaction of an excited electronic state of the sensitizer with oxygen present in the tissue. Thus, in summary, to achieve effective photosensitization, a sensitizer should exhibit appreciable absorption at red to near-infrared wavelengths and generate cytotoxic species via oxygen-dependent photochemical reactions. The first clinical photosensitizer

was hematoporphyrin derivative (HpD), which is derived synthetically from hematoporphyrin by reaction with acetic and sulfuric acids to give a complex mixture of porphyrins. A purified fraction of these (Photofrin) is available commercially, and this has been used most widely in clinical applications to date. Second-generation photosensitizers are now becoming available, including phthalocyanine and chlorin compounds, as shown in Table 1.

A new approach to PDT has recently emerged involving the administration of a natural porphyrin precursor, 5-aminolaevulinic acid (ALA), which is metabolized within cells to produce protoporphyrin IX (see Table 1). This porphyrin is known to be a powerful photosensitizing agent but suffers from the drawback of being a poor tumor localizer when used directly. In contrast, administration of ALA induces protoporphyrin biosynthesis, particularly in rapidly proliferating cells, which may then be destroyed using irradiation at 630 nm. Therefore, this new therapy may offer enhanced treatment selectivity with little risk of skin photosensitivity, owing to the rapid clearance of the protoporphyrin after 24 h. Investigation of this new approach has already proved successful in clinical treatment of


skin tumors using topical application of ALA in a thick emulsion. Considerable effort is also being expended on exploiting the retention of sensitizers in tumors for diagnostic purposes, although the results are rather mixed to date. The prospects with ALA-induced protoporphyrin IX are, however, more promising, as illustrated in Figure 3 where the fluorescence is selectively confined to the skin tumor (basal cell carcinoma).

Figure 3 (a) Image of a basal cell carcinoma; (b) fluorescence imaging of the same lesion after ALA sensitization and 405 nm light excitation.

Mechanisms of Photodynamic Therapy

The main principles of the photosensitization mechanism are now well established, with the initial step being excitation of the sensitizer from its electronic ground state to the short-lived fluorescent singlet state. The lifetime of the singlet state is generally only a few nanoseconds, and the main role of this state in the photosensitization mechanism is to act as a precursor of the metastable triplet state through intersystem crossing. Efficient formation of this metastable state is required because it is the interaction of the triplet state with tissue components that generates cytotoxic species such as singlet oxygen. Thus the triplet state quantum yield (i.e., the probability of triplet state formation per photon absorbed) of photosensitizers should ideally approach unity. Interaction of the metastable triplet state (which in de-aerated solutions has a lifetime extending to the millisecond range) with tissue components may proceed via either a type I or a type II mechanism, or a combination of the two (see Figure 4).

Figure 4 The photophysical and photochemical mechanisms involved in PDT. PS denotes the photosensitizer, SUB denotes the substrate, either biological, a solvent or another photosensitizer; * denotes the excited state and † denotes a radical.


A type I process can involve hydrogen abstraction from the sensitizer to produce free radicals, or electron transfer, resulting in ion formation. The type II mechanism, in contrast, exclusively involves interaction between molecular oxygen and the triplet state to form singlet oxygen, which is highly reactive in biological systems. The near-resonant energy transfer from the triplet state to O₂ can be highly efficient, and the singlet oxygen yield can approach the triplet state yield provided that the triplet state energy exceeds 94 kJ/mol, the singlet oxygen excitation energy. It is widely accepted that the type II mechanism underlies the oxygen-dependent phototoxicity of sensitizers used for photodynamic therapy.

Both proteins and lipids (the main constituents of membranes) are susceptible to photooxidative damage induced via a type II mechanism, which generally results in the formation of unstable peroxide species. For example, unsaturated lipids may be oxidized by the 'ene' reaction, where the singlet oxygen reacts with a double bond. Other targets include such important biomolecules as cholesterol, certain amino acid residues, collagen, and the coenzyme NADPH. Another synergistic mode of action involves the occlusion of the microcirculation to tumors and reperfusion injury. Reperfusion injury relates to the formation of xanthine oxidase from xanthine dehydrogenase under anoxic conditions; the oxidase reacts with the xanthine or hypoxanthine products of ATP dephosphorylation to convert the restored oxygen into superoxide anion, which directly, or via the Fenton reaction, causes cellular damage.

An important feature of the type II mechanism, which is sometimes overlooked, is that when the sensitizer transfers electronic energy to O₂ it returns to its ground state. Thus the cytotoxic singlet oxygen species is generated without chemical transformation of the sensitizer, which may then absorb another photon and repeat the cycle. Effectively, a single photosensitizer molecule is capable of generating many times its own concentration of singlet oxygen, which is clearly a very efficient means of photosensitization provided the oxygen supply is adequate.
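An order-of-magnitude estimate shows how large this photochemical amplification can be. In the sketch below the fluence rate is the representative clinical value quoted in the Introduction, while the absorption cross-section and singlet oxygen quantum yield are assumed round numbers, not values from this article:

```python
# Order-of-magnitude sketch of singlet oxygen turnover per sensitizer
# molecule. Cross-section and quantum yield are assumed round numbers.
h, c = 6.626e-34, 3.0e8      # Planck constant (J s), light speed (m/s)
wavelength = 652e-9          # red excitation wavelength (m)
irradiance = 0.15            # fluence rate (W/cm^2): 150 mW/cm^2
sigma = 1e-16                # absorption cross-section (cm^2), assumed
phi = 0.5                    # singlet oxygen quantum yield, assumed

photon_energy = h * c / wavelength      # J per photon
flux = irradiance / photon_energy       # photons cm^-2 s^-1
rate = sigma * flux * phi               # 1O2 per molecule per second

print(f"~{rate:.0f} singlet oxygen molecules per sensitizer per second")
```

Over an irradiation lasting several hundred seconds, each sensitizer molecule can therefore cycle thousands of times, provided the local oxygen supply is not depleted.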

Lasers in PDT

Lasers are the most popular light source in PDT since they have several key characteristics that differentiate them from conventional light sources, namely coherence, monochromaticity, and collimated output. The two main practical features that make them so useful in PDT are their monochromaticity and their combination with fiber-optic delivery. The monochromaticity is important since the laser can be tuned to a specific absorption peak of a photosensitizer, thus ensuring that all the energy delivered is utilized for the


excitation and photodynamic activation of the photosensitizer. This is not true for a conventional light source (for example a tungsten lamp), where the output power is divided over several hundred nanometers throughout the UV, visible, and IR regions, and only a fraction of the power lies within the absorption band of the photosensitizer. To illustrate this numerically, let us simplify things by representing the lamp output as a square profile and consider a photosensitizer absorption band of about 30 nm full width at half maximum. For a laser with a frequency span of about 2 nm and a lamp of the same power spanning about 600 nm, the useful portion of the lamp output is about 30/600 = 0.05; if we consider a Gaussian profile for the sensitizer absorption band, this fraction drops even lower, perhaps to 0.01. In that case the lamp would achieve an excitation rate 100 times lower than the laser; in other words, we would require a lamp with an output power 100 times that of the laser to achieve the same rate of excitation, and consequently the same treatment time, provided the laser power is low enough to remain within the linear regime of photosensitizer excitation. The other major disadvantage of a lamp source is its lack of collimation, which results in low efficiency for fiber-optic delivery.

We now review the different lasers that have found application in PDT. Laser technology has significantly advanced in recent years and there is now a range of options in PDT laser sources for closely matching the absorption profiles of the various photosensitizers. Moreover, since PDT is becoming more widely used clinically, practical considerations such as portability and turn-key operation are increasingly important.

In Figure 5 the optical spectrum is shown from about 380 nm (violet–blue) to 800 nm (near-infrared, NIR). We have superimposed on this spectrum the light penetration depth of tissue, to illustrate roughly how deeply light can penetrate (and consequently activate photosensitizer molecules) into lower-lying tissue layers. The term 'penetration depth' is a measure of the light attenuation by the tissue, so a figure of 2 mm corresponds to 1/e attenuation of the incident intensity at a tissue depth of 2 mm. Light at the blue end of the spectrum has only a superficial effect, whereas the penetration depth increases the further we shift into the red and NIR regions. This is due to two main factors: absorption and scattering of light by various tissue component molecules. For example, in the blue/green region the absorption of melanin and hemoglobin is relatively high. But there is a noticeable increase in penetration depth beyond about 600 nm, leading to an optimal wavelength region, or
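The factor of 100 quoted above follows from this simple overlap bookkeeping, which the sketch below reproduces; the bandwidths are the values used in the text, and the reciprocal of the useful fraction gives the required lamp-to-laser power ratio.

```python
# Sketch of the lamp-versus-laser estimate from the text: a square
# 600 nm wide lamp profile overlapping a 30 nm wide absorption band.
lamp_span = 600.0    # lamp emission width (nm), from the text
band_fwhm = 30.0     # absorption band FWHM (nm), from the text

fraction_square = band_fwhm / lamp_span   # ~0.05 for a square band
fraction_gauss = 0.01   # the text's rougher estimate for a Gaussian band

for frac in (fraction_square, fraction_gauss):
    print(f"useful fraction {frac:.3f} -> lamp needs "
          f"{1 / frac:.0f}x the laser power")
```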


Figure 5 Photosensitizers and laser light sources available in the visible and near-infrared spectral regions.

'therapeutic window', for laser therapy of around 600–1100 nm (Figure 5). The fundamental wavelength of the Nd:YAG laser lies within this therapeutic window at 1064 nm and

this laser is now widely used in thermal laser therapy. Nd:YAG lasers can be operated either in cw (output power ~200 W multimode), long-pulse (~500 W average power at 50 Hz), or Q-switched


(50 MW peak power at around 10 ns pulse duration) modes. Nd:YAG is a solid-state laser with a yttrium aluminum garnet crystal doped with about 1% trivalent Nd ions as the active medium; using another transition of the Nd ion, this laser (with the choice of different optics) can also operate at 1320 nm. Although dyes are available which absorb from 800–1100 nm, the generation of singlet oxygen via the type II mechanism is energetically unfavorable at these wavelengths because the triplet state energies of such dyes are too low. However, the fundamental frequency of the laser can be doubled (second-harmonic generation, SHG) or tripled (third-harmonic generation, THG) with the use of nonlinear crystals, to upconvert the output radiation to the visible range, thus rendering this laser suitable for PDT: frequency doubling yields 532 nm and tripling 355 nm. Note that the 532 nm output is suitable for activation of the absorption band of hypericin, with a maximum at 550 nm, even though it is not tuned to the maximum of this absorption.

In the early days of PDT and other laser therapies, ion lasers were widely used. The argon ion laser uses ionized argon plasma as gain medium, and produces two main wavelengths at 488 nm and 514 nm. The second of the two lies exactly at the maximum of the 514 nm m-THPC absorption. Argon ion lasers are operated in cw mode and usually have output powers in the region of 5–10 W at 514 nm (the most powerful argon ion line). Krypton ion lasers, which emit at 568 or 647 nm, are similar to their argon ion counterparts; however, they utilize ionized krypton plasma as gain medium. The 568 nm output can be used to activate the Rose Bengal peak at 559 nm, whereas the 647 nm line, the most powerful of the krypton ion laser, has been used for activation of m-THPC (652 nm).

Ideally, for optimal excitation, it is best to tune the laser wavelength exactly to the maxima of the photosensitizer absorption bands. For this reason tunable organic dye lasers have been widely used for PDT. In these lasers the gain medium is an organic dye with a high quantum yield of fluorescence. Due to the broad nature of their fluorescence gain profile, tuning elements within the laser cavity can be used for selection of the lasing frequency within the gain band. Tunable dye lasers with the currently available dyes can quite easily cover the range from about 350–1000 nm. However, their operation is not of a 'turn-key' nature, since they require frequent replacement of their active material, either for tuning to a different spectral region or for better performance when the dye degrades. Tunable dye lasers also need to be pumped by some other light source, such as an argon ion laser, an excimer laser, a solid-state laser (e.g., Nd:YAG 532 or 355 nm), a copper vapor


laser, or lamps. Dye lasers have been used for the excitation of HpD at 630 nm, protoporphyrin IX at 635 nm, photoprotoporphyrin at 670 nm, and phthalocyanines around 675 nm. A further example is hypericin which, apart from the band at 550 nm, has a second absorption band at 590 nm. This is the optimum operating wavelength of dye lasers with rhodamine 590, better known as rhodamine 6G.

Dye lasers can operate in either pulsed or cw mode. However, there is a potential disadvantage in using a pulsed laser for PDT, which becomes apparent with low repetition rate, high pulse energy (>1 mJ per pulse) laser excitation. High pulse energies can induce saturation or transient bleaching of the sensitizer during the laser pulse, and consequently much of the energy supplied is not efficiently absorbed by the sensitizer. It has been suggested that this effect accounts for the lack of photosensitized damage in tissue sensitized with a phthalocyanine and irradiated by a 5 Hz, 25 mJ-per-pulse flashlamp-pumped dye laser. However, using a low pulse energy, high repetition rate copper vapor pumped dye laser (see below), the results were indistinguishable from cw irradiation with an argon ion pumped dye laser.

Another laser that has found clinical application in PDT is the copper vapor laser. In this laser the active medium is copper vapor at high temperature, maintained in the tube by a repetitively pulsed discharge current. Copper vapor lasers are pulsed, with typical pulse durations of about 50 ns and repetition rates reaching 20 kHz, and they have been operated at quite high average powers, up to about 40 W. The output radiation is produced at two distinct wavelengths, namely 511 and 578 nm, both of which have found clinical use. Copper vapor pumped dye lasers have also been widely used for activating red-absorbing photosensitizers, and the analogous gold vapor lasers operating at 628 nm have been used to activate HpD.

A relatively new class of tunable lasers is the solid-state Ti:sapphire laser. The gain medium in this laser is a ~1% Ti-doped sapphire (Al₂O₃) crystal, and its output may be tuned throughout the 690–1100 nm region, covering many photosensitizer absorption peaks. For example, most of the bacteriochlorin-type photosensitizers have their absorption bands in that spectral region, e.g., m-THPBC with an absorption maximum at 740 nm. Texaphyrin sensitizers also absorb strongly in this spectral region: lutetium texaphyrin has an absorption maximum at 732 nm.

Although tunable dye or solid-state lasers are widely used in laboratory studies for PDT, fixed-wavelength semiconductor diode lasers are gradually supplanting tunable lasers, owing to their practical convenience. It is now generally the case that when


a photosensitizer enters clinical trials or standard clinical use, laser manufacturers provide dedicated diode lasers with outputs matched to the chosen sensitizer. In this context there are diode lasers available at 630 nm for use with HpD, at 635 nm for use with ALA-induced protoporphyrin IX, at 670 nm for ATSX-10 or phthalocyanines, and even in the region of 760 nm for use with bacteriochlorins. The advantage of these diode lasers is that they are tailor-made for use with a particular photosensitizer, they are highly portable, and they offer relatively easy turn-key operation.

So far we have concentrated on red wavelengths, but most of the porphyrin and chlorin family of photosensitizers exhibit quite strong absorption bands in the blue region, known as 'Soret' bands. Despite the fact that tissue penetration in this spectral region is minimal, these bands are far more intense than the corresponding red absorption bands of the sensitizer. In this respect it is possible to activate these sensitizers at blue wavelengths, especially for the treatment of superficial (e.g., skin) malignant lesions. There are now blue diode lasers (and light-emitting diodes) for the activation of these bands, in particular at 405 nm, where the Soret band of protoporphyrin IX lies, or at 430 nm for selective excitation of the photoprotoporphyrin species.

For very superficial lesions (e.g., in the retina) it may even be possible to use multiphoton excitation provided by femtosecond diode lasers operating at about 800 nm.

Clinical Applications of Lasers in PDT

In the previous section we reviewed the various types of lasers being used for PDT, but their combination with fiber-optic delivery is also of key importance for their clinical application. The laser light may be delivered to the target lesion either externally, using surface irradiation, or internally within the lesion, which is denoted interstitial irradiation, as depicted in Figure 6. For example, in the case of superficial cutaneous lesions the laser irradiation is applied externally (Figure 6a). The area of the lesion is marked prior to treatment to determine the diameter of the laser spot. For a given fluence rate (100–200 mW/cm²) and the area of the spot, the required laser output power is then calculated. A multimode optical fiber, terminated with a microlens to ensure uniform irradiation, is then positioned at a distance from the lesion yielding the desired beam waist, and the surrounding normal tissue is shielded with dark material. However, if the lesion is a deeper-lying solid tumor, interstitial PDT is employed

Figure 6 Clinical application of PDT. (a) Surface treatment. (b) Interstitial treatment.


Figure 7 Balloon applicator and diffuser fiber used for PDT in Barrett’s esophagus.

(Figure 6b). Surgical needles are inserted in an equidistant parallel pattern within the tumor, either freehand or under image (MRI/CT) guidance. Bare-tip optical fibers are guided through the surgical needles to the lesion and flagged for position. A beamsplitter is used to divide the laser output into two to four components, so that two to four optical fibers can be employed simultaneously. The laser power is adjusted so that the output power of each of the optical fibers is 100–200 mW. Even light distribution, in the form of overlapping spheres approximately 1 cm in diameter, ensures the treatment of an equivalent volume of tissue around each fiber tip. Once each 'station' has been treated in this way, the fibers are repositioned using a pull-back technique so that another station can be treated. Once the volume of the tumor has been scanned in this manner, the treatment is complete. Interstitial light delivery allows PDT to be used in the treatment of large, buried tumors and is particularly suitable for those in which surgery would involve extensive resection.

The treatment of tumors of internal hollow organs is also possible with PDT. The most representative example of this is the treatment of precancerous lesions in the esophagus, known medically as 'Barrett's esophagus'. In this case, a balloon applicator is used to house a special fiber with a light-diffusing tip or 'diffuser' (Figure 7). These diffusers are variable in length and transmit light uniformly along the whole of their length in order to facilitate treatment of circumferential lesions. The balloon applicator is endoscopically inserted into the patient. The balloon is inflated to stretch the esophageal walls


Figure 8 PDT treatment of a nasal basal cell carcinoma with mTHPC (Foscan®): (a) prior to PDT; (b) tissue necrosis a week after treatment; (c) four weeks after treatment: gradual recession of the necrosis and inflammation; (d) healing two months after treatment. Courtesy of Dr Alex Kuebler.

and the flagged diffuser fiber is placed in position, centered in the balloon's optical window. In this way the whole of the lesion may be treated circumferentially at the same time. Finally, the clinical response to PDT treatment is shown in Figure 8. The treated area becomes necrotic and inflamed within two to four days following PDT. This necrotic tissue usually sloughs off or is mechanically removed. The area eventually heals with minimal scarring (Figure 8).
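The surface-treatment dosimetry described above is a simple area calculation. The following minimal Python sketch illustrates it; all numbers are hypothetical examples, not clinical guidance.

```python
import math

# Hypothetical, non-clinical example values.
fluence_rate = 150.0          # mW/cm^2, within the 100-200 mW/cm^2 range quoted above
spot_diameter_cm = 2.0        # marked lesion diameter plus margin (assumed)
spot_area = math.pi * (spot_diameter_cm / 2.0) ** 2   # cm^2

power_mw = fluence_rate * spot_area                   # required laser output power
light_dose = fluence_rate * 1e-3 * 500.0              # J/cm^2 for an assumed 500 s exposure

print(f"spot area = {spot_area:.2f} cm^2")
print(f"required output power = {power_mw:.0f} mW")
print(f"delivered light dose = {light_dose:.0f} J/cm^2")
```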

Future Prospects

A major factor in the future development of PDT will be the application of relatively cheap and portable semiconductor diode lasers. High-power systems, consisting of visible diode arrays coupled into multimode fibers with an output of several watts, have recently become available and should greatly ease the technical difficulties which have held back PDT. The use of new photosensitizers in the treatment of certain tumors by PDT is making steady progress, and although toxicology testing and clinical evaluation are lengthy and costly processes, the expectation is that these compounds will become more widely available for patient use during the next five years. Regarding the current clinical status of PDT, Photofrin has recently been approved for treatment of lung and esophageal cancers in Europe, the USA, and Japan. Clinical trials are in progress for several of the second-generation photosensitizers, which offer significant improvements over Photofrin in terms of chemical purity, photoproperties, and skin



clearance. With the increasing clinical application of compact visible diode lasers the prospects for photodynamic therapy are therefore encouraging.

See also

Lasers: Dye Lasers; Free Electron Lasers; Metal Vapor Lasers. Ultrafast Laser Techniques: Generation of Femtosecond Pulses.

Further Reading

Bonnet R (2000) Chemical Aspects of Photodynamic Therapy. Amsterdam: Gordon and Breach Science Publishers.
Dolmans DEJGJ, Fukumura D and Jain RK (2003) Photodynamic therapy for cancer. Nature Reviews (Cancer) 3: 380–387.
Hopper C (2000) Photodynamic therapy: a clinical reality in the treatment of cancer. Lancet Oncology 1: 212–219.
Milgrom L and MacRobert AJ (1998) Light years ahead. Chemistry in Britain 34: 45–50.
Rosen GM, Britigan BE, Halpern HJ and Pou S (1999) Free Radicals: Biology and Detection by Spin Trapping. New York: Oxford University Press.
Svelto O (1989) Principles of Lasers, 3rd edn. New York: Plenum Press.
Vo-Dinh T (2003) Biomedical Photonics Handbook. Boca Raton, FL: CRC Press.

Pump and Probe Studies of Femtosecond Kinetics

G D Scholes, University of Toronto, Toronto, ON, Canada

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Pump-probe spectroscopy is a ubiquitous time-resolved optical spectroscopy that has found application in the study of all manner of ultrafast chemical processes that involve excited states. The principle of the measurement is that a 'pump pulse' – usually an intense, short laser pulse – impulsively excites a sample, thus defining the start time for the ensuing photophysical dynamics. A probe pulse interrogates the system at later times in order to capture 'snapshots' of the state of the system. Many kinds of pump-probe experiments have been devised. The present article will mainly focus on transient absorption measurements.

Femtosecond pump-probe experiments have been employed to reveal the kinetics of excited state chemical processes such as solvation, isomerization reactions, proton transfer, electron transfer, energy transfer, or photochromism. It is becoming more common to use pump-probe techniques to study systems of increasing complexity, such as photosynthetic proteins, photoreceptor proteins, molecular aggregates, nanostructured materials, conjugated polymers, semiconductors, or assemblies for artificial light harvesting. However, the underlying chemical kinetics are not always easily revealed by pump-probe signals.

The principle of the pump-probe measurement is related to the seminal flash-photolysis experiment devised by Norrish and Porter. However, in order to

achieve ultrafast time resolution – beyond that attainable by electronic detection – the probe pulse is temporally short and is controlled to arrive at the sample at variable time delays after the pump pulse has excited the sample. The change in intensity of the probe pulse, as it is transmitted through the sample, is monitored at each pump-probe time delay using a 'slow' detector that integrates over the probe pulse duration. One possible process that can diminish the probe transmission is transient absorption. After formation of an excited state S1 by the pump pulse, the probe pulse can monitor a resonance with a higher excited state (an S1 → Sn absorption). Figure 1 shows transient absorption spectra, corresponding to excited state absorption at various time delays, that have been probed by a white light continuum and monitored by a multichannel CCD detector. The transient spectrum is seen to blue-shift with

Figure 1 Excited state absorption spectra of 4′-n-pentyl-4-cyanoterphenyl in octanol solvent. The pump wavelength is 277 nm (100 nJ, 150 fs, 40 kHz) and the probe is a white light continuum (500 to 700 nm) generated in a sapphire crystal. The pump-probe delay T for each spectrum is indicated on the plot.



pump-probe delay owing to solvation of the S1 state of the molecule – the dynamic Stokes' shift. The dynamic Stokes' shift is the shift to lower energy of the emission spectrum as the solvent relaxes in response to the change in dipole moment of the excited electronic state compared to the ground state. The overall intensity of the induced absorption decays according to the lifetime of the excited state.

Although they will not be discussed in detail in the present article, various other pump-probe experiments are possible. For example, when in resonance with this transient absorption, the probe pulse can induce resonance Raman scattering, which reveals the evolution of the vibrational spectrum of the excited state. In this kind of pump-probe experiment we do not monitor the change in probe transmission through the sample. Instead, we monitor the difference between probe-induced resonance Raman scatter from the excited and ground states of the system. The resonance Raman (TR3) scatter, time-resolved as a function of pump-probe delay, can be detected provided that it is not overwhelmed by fluorescence emission. Typical TR3 data for different pump-probe delays are shown in Figure 2. The dashed lines show a deconvolution of the spectra to reveal the underlying vibrational bands. These bands are rather broad owing to the spectral width of the 1 ps laser pulses used in this experiment. It is clear that the time resolution of a TR3 experiment is limited to the picosecond regime, otherwise most of the vibrational band information is lost. TR3 and complementary pump-probe infrared transient absorption experiments, utilizing an infrared probe beam, have been useful for observing the evolution of structural information, such as geometry changes, subsequent to photoexcitation.

In the present article we will describe the foundation of experiment and theory necessary to understand resonant ultrafast pump-probe spectroscopy applied to chemical processes in the condensed phase. By resonant, it is meant that the pump and probe pulses have frequencies (energies) resonant with electronic transitions of the molecules being studied. The implications of condensed phase are that the experiment interrogates an ensemble of molecules and the electronic transitions of these molecules are coupled to the random motions and environments of a bath, for example the solvent.

Figure 2 TR3 spectra of 4′-n-pentyl-4-cyanoterphenyl in octanol solvent, recorded using the same set-up as summarized in Figure 1. The dashed lines are deconvolutions of the data to show the Raman bands.

Experimental Measurement of Pump-Probe Data

The experimental setup for a one-color pump-probe experiment (i.e., pump and probe of the same color) is shown in Figure 3. The setup is easily adapted for a two-color experiment. Most of the laser intensity (70%) is transmitted through the first beam splitter so that the pump is more intense than the probe. A second beam splitter is used to split off a small amount of probe light for the reference beam. Note that the arrangement of these beam splitters results in both the pump and probe passing through the same amount of dispersive material on their way to the sample. The pump and probe beams each propagate through half-wave plates in order to control polarization. Usually the probe polarization is set to the magic angle (54.7°) relative to the pump in order to remove polarization bias. The beams are sent towards the sample by way of retroreflectors mounted on x–y–z translation stages. The pump retroreflector is mounted on a precision computer-controlled translation stage such that it can be scanned in the x-direction to arrive at variable delays before and after the probe. The pump and probe beams are aligned using the y–z controls on the retroreflector translation stages together with the pick-off mirrors



Figure 3 An experimental layout for femtosecond pump-probe spectroscopy. The reference beam ideally passes through the sample (not shown as such, for clarity). See text for a description. HWP, half-wave plate; BS, beamsplitter (% reflection); FL, focusing lens; CL, collimating lens; ND, neutral density filter; PD, photodetectors.

such that they travel in parallel towards the sample. The overall setup of the beams is in a box geometry so that pump and probe arms have the same path length. The pump and probe beams are focused into the sample using a transmissive (i.e., lens) or reflective optic FL. The focal length is typically 20 to 30 cm, providing a small crossing angle to improve phase matching. The pump beam should have a larger spot size at its focal point in the sample than the probe beam, in order to avoid artifacts arising from the wandering of the pump and probe spatial overlap as a function of delay time – a potential problem, particularly for long delays. Good spatial overlap of the beams in the sample is important; it can be checked using a 50 μm pinhole. Fine adjustment of the pump-probe delay, to establish temporal overlap of the pump and probe pulses, can be guided by autocorrelating the beams in a second-harmonic-generating crystal mounted at the sample position. At the same time, the pulse compression is adjusted to ensure that the pulse dispersion in the experiment has been pre-compensated. The sample is usually flowed or mounted in a spinning cell if it is liquid, in order to avoid thermal effects or photochemical bleaching. The path length of the sample is typically ≤1 mm. The transmitted probe beam is spatially isolated from the pump beam using an iris, is collimated, and then directed onto the photodetector.
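As a brief aside, the magic-angle setting mentioned above is simply arccos(1/√3); a minimal numeric check:

```python
import numpy as np

# With the probe polarization at arccos(1/sqrt(3)) to the pump, the
# rotational-diffusion (anisotropy) contribution cancels and the measured
# kinetics reflect pure population dynamics.
magic_angle = np.degrees(np.arccos(1.0 / np.sqrt(3.0)))
print(f"magic angle = {magic_angle:.2f} degrees")   # ~54.74
```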

The reference beam, which provides the relative reference intensity I0, should preferably pass through the sample (away from the pump spot). This provides the probe-only transmission – that is, the reference beam intensity is attenuated according to the sample ground state optical density. The photodetector outputs are used to ratio the pump-probe arm intensity transmitted through the sample with the reference intensity, I_pr(ω, T)/I0. Since the pump beam is being chopped at the lock-in reference frequency, the lock-in amplifier outputs the pump-induced fractional change in transmission intensity, ΔI_pr(ω, T)/I0, usually simply written as ΔT/T. When the pump-induced signal ΔI_pr is small compared to the reference intensity I0, the detected ΔI_pr(ω, T)/I0 signal is approximately equal to the change in optical density, ΔO.D.:

\[ \Delta\mathrm{O.D.} = \log\!\left(\frac{I_0 - \Delta I_{pr}}{I_0}\right) \approx \frac{-\Delta I_{pr}}{I_0} = \frac{-\Delta T}{T} \qquad [1] \]

Here we have used a Taylor expansion log(1 + x) ≈ x for small x. The power dependence of the signal in the χ(3) limit should be linear in both the pump and probe intensities. It is also possible to pump the sample using two-photon absorption, then probe an excited state absorption. This is a χ(5) experiment, so that the signal depends quadratically on the pump intensity.
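The power-dependence check just described can be made quantitative with a log-log slope fit. A minimal sketch with synthetic data (all values illustrative):

```python
import numpy as np

# Distinguishing chi(3) (slope ~ 1 in pump intensity) from a two-photon-pumped
# chi(5) signal (slope ~ 2) by fitting the log-log slope of synthetic data.
rng = np.random.default_rng(0)
pump = np.logspace(-1, 1, 12)                                   # pump energy, arb. units
signal_x3 = 0.37 * pump * (1 + 0.02 * rng.standard_normal(pump.size))
signal_x5 = 0.11 * pump**2 * (1 + 0.02 * rng.standard_normal(pump.size))

for name, sig in (("chi(3)", signal_x3), ("chi(5)", signal_x5)):
    slope = np.polyfit(np.log(pump), np.log(sig), 1)[0]
    print(f"{name}: fitted log-log slope = {slope:.2f}")
```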

What is Measured in a Pump-Probe Experiment?

Origin of the Signal

The pump-probe measurement is a third-order nonlinear spectroscopy. A complete calculation of the signal requires computation of the third-order polarization induced by the interaction of the pump and probe with the material sample, together with an account of the kinetic evolution of populations (e.g., excited state reactant and product states, etc.). The pump pulse, wavevector k1, interacts twice with the sample, thereby creating an excited state population density |e⟩⟨e| and a hole in the ground state population density |g⟩⟨g|. These population densities propagate until the probe pulse, with wavevector k2, interacts with the system to induce a polarization that depends on the state of the system at time T after the pump. This induced polarization P_s(t) is radiated in the direction k_s = k1 − k1 + k2 = k2. Because the signal is radiated in the probe direction, the probe pulse acts as a local oscillator to heterodyne the signal. The change in probe-pulse spectral intensity after transmission through an


Figure 4 Labeling and time variables for the description of a pump-probe experiment. T defines the pump-probe delay.

optically thin sample of path length l, with concentration of absorbers c and refractive index n, is given by eqn [2]:

\[ \Delta I_{pr}(\omega, T) \propto \frac{c\,l\,\omega_{pr}}{n}\, \mathrm{Im} \int \mathrm{d}t\, E_{pr}^{*}(t)\, P^{(3)}(0, T, t)\, \mathrm{e}^{i\omega t} \qquad [2] \]

In eqn [2], E_pr(t) is the probe pulse electric field, ω is the center frequency of the probe, and * indicates the complex conjugate. The time variables correspond to those indicated in Figure 4, and the signal is resolved as a function of frequency by Fourier transformation of t → ω. Experimentally this is achieved by dispersing the transmitted probe using a spectrograph. The first time variable, the time interval between the two pump pulse interactions, is set to zero because we are interested here in the limit where the pulse duration is much shorter than T. In this regime – impulsive pump-probe – the measurement is sensitive to formation and decay of excited state species. The induced polarization is given by eqn [3]:

\[ P^{(3)}(0, T, t) = \int_0^{\infty} \mathrm{d}t \int_0^{\infty} \mathrm{d}T\, [R_{SE} - R_{ESA} + R_{GSR}]\; E_1^{*}(t - T)\, E_1(t - T)\, E_2(t) \qquad [3] \]

where the response functions that contain information about the lineshape functions and kinetics are given by eqns [4]:

\[ R_{SE}(0, T, t) = |\mu_{eg}|^2 |\mu_{eg}|^2 \exp(-i\omega_{eg} t)\, \exp[-g^{*}(t) + 2i g''(T + t) - 2i g''(T)] \times K_{SE}(T) \qquad [4a] \]

\[ R_{ESA}(0, T, t) = |\mu_{eg}|^2 |\mu_{fe}|^2 \exp(-i\omega_{fe} t)\, \exp[-g^{*}(t) - 2i g''(T + t) + 2i g''(T)] \times K_{ESA}(T) \qquad [4b] \]

\[ R_{GSR}(0, T, t) = |\mu_{eg}|^2 |\mu_{eg}|^2 \exp(-i\omega_{eg} t)\, \exp[-g(t)] \times K_{GSR}(T) \qquad [4c] \]


The overall intensity of each contribution to the signal is scaled by the transition dipole moment that connects the ground and excited state, μ_eg, or the excited state and a higher excited state, μ_fe. The intensity is further influenced by resonance of the probe-pulse spectrum with the transition frequencies ω_eg and ω_fe. These transition energies have a time-dependence owing to the relaxation of the excited state density in the condensed phase environment – the dynamic Stokes' shift. The details of the dynamic Stokes' shift depend on the bath, and are therefore contained in the imaginary part of the lineshape function g(t) = g′(t) + i g″(t). The evolution of the Stokes' shift with pump-probe delay T is clearly seen in the excited state absorption spectrum shown in Figure 1. The lineshape function contains all the details of the timescales of fluctuations of the bath and the coupling of the electronic transition of the molecule to these fluctuations. The lineshape function can be cast in the simple form given by eqn [5] using the Brownian oscillator model in the high temperature, high friction limit:

\[ g(t) \approx \frac{2\lambda k_B T_B}{\hbar \Lambda}\left[\Lambda t - 1 + \exp(-\Lambda t)\right] + i\,\frac{\lambda}{\Lambda}\left[1 - \exp(-\Lambda t)\right] \qquad [5] \]

The Brownian oscillator model attributes fluctuations of the transition frequencies to coupling between electronic states and an ensemble of low-frequency bath motions. In eqn [5], λ is the solvent reorganization energy (half the Stokes' shift) and Λ is the modulation frequency of the solvent oscillator (Λ = τ_L⁻¹ for a Debye solvent, where τ_L is the longitudinal dielectric relaxation time); k_B is the Boltzmann constant, and T_B is the temperature. Information about the kinetics of the evolution of the system initiated by the pump pulse is contained in the terms K_SE, K_ESA, and K_GSR. These terms will be discussed in the next section.
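To make eqn [5] concrete, the following minimal sketch evaluates g(t) in reduced units (ħ = k_B = 1); the parameter values are illustrative assumptions only:

```python
import numpy as np

# Brownian oscillator lineshape function, eqn [5], in reduced units.
# lam = solvent reorganization energy, Lam = solvent modulation frequency,
# TB = temperature; all values are assumptions for illustration.
def g_brownian(t, lam=0.3, Lam=1.0, TB=2.0):
    g_real = (2.0 * lam * TB / Lam) * (Lam * t - 1.0 + np.exp(-Lam * t))
    g_imag = (lam / Lam) * (1.0 - np.exp(-Lam * t))
    return g_real + 1j * g_imag

for ti in np.linspace(0.0, 10.0, 6):
    gi = g_brownian(np.array([ti]))[0]
    # g'(t) grows and broadens the lineshape; g''(t) saturates at lam/Lam,
    # encoding the dynamic Stokes shift seen in Figure 1.
    print(f"t = {ti:5.2f}: g'(t) = {gi.real:7.3f}, g''(t) = {gi.imag:6.3f}")
```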



Figure 6 A schematic depiction of a pump-probe signal. The coherent spike is labeled 1. In region 2 we can see quantum beats as a sum of sinusoidal modulations of the signal. The label 3 denotes the long time decay of the signal, as determined by the population dynamics, K_SE, K_GSR, and K_ESA.

Figure 5 (a) shows the absorption and emission spectrum of a dilute solution of oxazine-1 in water. (b) shows the transient absorption spectrum of this system at various delays after pumping at 600 nm using picosecond pulses. The transient spectrum, ground state recovery and stimulated emission resemble the sum of the absorption and emission spectra.

The evolution of the excited state density is mapped onto the response functions R_SE and R_ESA. R_SE is the contribution arising from the probe pulse stimulating emission from the excited state e in the probe direction, which increases the probe intensity. Thus the spectrum of the signal arising from R_SE resembles the fluorescence emission of the sample, as shown in Figure 5. The excited state absorption contribution to the signal, R_ESA, depletes the probe intensity, and corresponds to an e → f electronic resonance, such as that shown in Figure 1. Note that the ESA signal contributes to the overall ΔI_pr(ω, T) with an opposite sign to the SE and GSR contributions. The ground-state recovery term R_GSR arises from depletion of the ground-state population by the pump pulse, thereby increasing the transparency of the sample over the spectrum of the ground-state absorption (see Figure 5). Thus the transmitted probe intensity is increased relative to that without the pump pulse. The GSR contribution to ΔI_pr(ω, T) decreases with T according to the excited state lifetime of the photoexcited species.

The Coherent Spike

To calculate the pump-probe signal over delay times that are of the order of the pulse duration, one must calculate all possible ways that a signal can be generated with respect to all possible time orderings of the two pump interactions and the probe interactions (Liouville space pathways). Such a calculation reveals coherent as well as sequential contributions to the response of the system, owing to the entanglement of time orderings that arises from the overlap of pulses of finite duration. The nonclassical, coherent contributions dominate the signal for the time period during which pump and probe pulses overlap, leading to the coherent spike (or coherent 'artifact') (Figure 6).

Nuclear Wavepackets

The electronic absorption spectrum represents a sum over all possible vibronic transitions, each weighted by a corresponding nuclear overlap factor according to the Franck–Condon principle. It was recognized by Heller and co-workers that in the time-domain representation of the electronic absorption spectrum, the sum over vibrational overlaps is replaced by the propagation of an initially excited nuclear wavepacket on the excited state according to |i(t)⟩ = exp(−iHt/ħ)|i(0)⟩. Thus the absorption spectrum is written as

\[ \sigma_i(\omega) = \frac{2\pi\omega}{3c} \int_{-\infty}^{\infty} \mathrm{d}t\, \exp[-i(\omega - \omega_{eg}^{i})t - g(t)]\, \langle i(0)|i(t)\rangle \qquad [6] \]

for mode i. The Hamiltonian H is determined by the displacement of mode i, Δ_i. A large displacement results in a strong coupling between the vibration and the electronic transition from state g to e. In the frequency domain this is seen as an intense vibronic progression. In the time domain this translates to an oscillation in the time-dependent overlap ⟨i(0)|i(t)⟩, with frequency ω_i, thereby modulating the intensity of the linear response as a function of t. It follows that an impulsively excited coherent superposition of vibrational modes, created by an optical pulse that is much shorter than the vibrational period of each mode, propagates on the excited state as a wavepacket. The oscillations of this wavepacket then modulate the intensity of the transmitted probe pulse as a function of T to introduce features into the pump-probe signal known as quantum beats (see Figure 6). In fact, nuclear wavepackets propagate on both the ground and excited states in a pump-probe experiment. The short probe pulse interrogates the evolution of the nuclear wavepackets by projecting them onto the manifold of vibrational wavefunctions in an excited or ground state.
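As an illustration of eqn [6], the sketch below computes the absorption spectrum of a single displaced harmonic mode from the oscillating overlap ⟨i(0)|i(t)⟩. The zero-temperature overlap expression, the Gaussian damping standing in for exp(−g(t)), and all parameter values are assumptions for this example:

```python
import numpy as np

# Assumed model: <i(0)|i(t)> = exp[S (exp(-i w_vib t) - 1)] for one displaced
# harmonic mode, with Huang-Rhys factor S = D_i**2 / 2 (here D_i ~ 1.73).
cm_to_rad_fs = 2.0 * np.pi * 2.998e-5      # cm^-1 -> angular frequency in rad/fs
w_vib = 1200.0 * cm_to_rad_fs              # mode frequency (1200 cm^-1, assumed)
S = 1.5                                    # Huang-Rhys factor (assumed)

t = np.arange(0.0, 1000.0, 0.5)            # fs
overlap = np.exp(S * (np.exp(-1j * w_vib * t) - 1.0))
kernel = overlap * np.exp(-(t / 80.0) ** 2)   # Gaussian damping in place of g(t)

# sigma(detuning) from the half-Fourier transform of the damped overlap;
# detuning is measured from the 0-0 transition frequency.
detune = np.linspace(-2000.0, 8000.0, 500) * cm_to_rad_fs
sigma = np.array([np.trapz(np.real(np.exp(1j * dw * t) * kernel), t)
                  for dw in detune])

# The spectrum is a vibronic progression spaced by 1200 cm^-1 with
# Franck-Condon weights exp(-S) S**n / n!; for S = 1.5 the n = 1 line is strongest.
peak = detune[np.argmax(sigma)] / cm_to_rad_fs
print(f"strongest vibronic peak at ~{peak:.0f} cm^-1 above the 0-0 line")
```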



Femtosecond Kinetics

According to eqns [4] the kinetics of the evolution of the system initiated by the pump pulse are governed by the terms K_SE, K_ESA, and K_GSR. Each of these terms describes the kinetics of decay, production, or recovery of excited state and ground state populations. In the simplest conceivable case, each might be a single exponential function. The stimulated emission term K_SE contains information about the depopulation of the initially excited state, which, for example, may occur through a chemical reaction involving the excited state species or energy transfer from that state. The excited state absorption term K_ESA contains information on the formation and decay of all excited state populations that give a transient absorption signal. The ground-state recovery term K_GSR contains information on the timescales for return of all population to the ground state, which will occur by radiative and nonradiative processes. Usually the kinetics are assumed to follow a multi-exponential law such that

\[ K_m(T, \lambda_{exc}, \omega) = \sum_{j=1}^{N} A_j(\lambda_{exc}, \omega)\, \exp(-t/\tau_j) \qquad [7] \]

where the amplitude A_j, but not the decay time coefficient τ_j, of each contribution in the coupled N-component system depends on the excitation wavelength λ_exc and the signal frequency ω. The index m denotes SE, ESA, or GSR. In order to extract meaningful physical parameters from analysis of pump-probe data, a kinetic scheme in the form of eqn [7] needs to be constructed, based on a physical model that connects the decay of the initially excited state population to the formation of a product excited state and the subsequent recovery of the ground-state population. An example of such a model is depicted in Figure 7. Here the stimulated emission contribution to the signal is attenuated according to the rate at which state 1 goes to 2. The transient absorption appears at that same rate and decays according to the radiationless process 2 to 3. The rate of ground state recovery depends on 1 to 2 to 3 to 5, and possibly on pathways involving the isomer 4. The kinetic scheme is not easily extracted from pump-probe data because K_SE, K_ESA, and K_GSR contribute additively at each detection wavelength ω, as shown in eqns [2]–[4]. Moreover, additional wavelength-dependent decay or rise components may appear in the data on ultrafast timescales owing to the time-dependent Stokes' shift (the g″(t) terms indicated in eqns [4]). A spectral shift can resemble either a decay or a rise component. In addition, the very fastest dynamics may be hidden beneath the coherent spike. For these

Figure 7 Free energy curves corresponding to a model for an excited state isomerization reaction, with various contributions to a pump-probe measurement indicated. See text for a description. (Courtesy of Jordanides XJ and Fleming GR.)

reasons, simple fitting of pump-probe data usually cannot reveal the underlying kinetic model. Instead the data should be simulated according to eqns [3] and [4]. Alternative strategies involve global analysis of wavelength-resolved data to extract the kinetic evolution of spectral features, known as ‘species associated decay spectra’. Global analysis of data effectively reduces the number of adjustable parameters in the fitting procedure and allows one to obtain physically meaningful information by associating species with their spectra. These methods are described in detail by Holzwarth.
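As a minimal illustration of the fitting problem, the sketch below simulates the sequential scheme of Figure 7 (states 1 → 2 → 3) and fits the composite signal with the multi-exponential law of eqn [7]; all rates and amplitudes are assumed values:

```python
import numpy as np
from scipy.optimize import curve_fit

# Sequential kinetics 1 -> 2 -> 3 with assumed rate constants (ps^-1).
k12, k23 = 0.5, 0.1
t = np.linspace(0.0, 60.0, 400)   # ps

n1 = np.exp(-k12 * t)                                             # ~ K_SE
n2 = k12 / (k23 - k12) * (np.exp(-k12 * t) - np.exp(-k23 * t))    # ~ K_ESA
ground_hole = n1 + n2                                             # ~ K_GSR (bleach)

# Synthetic signal at one probe wavelength: SE and ESA enter with opposite
# signs (see eqns [4]); amplitudes are illustrative; add small noise.
rng = np.random.default_rng(3)
signal = 1.0 * n1 - 0.6 * n2 + 0.4 * ground_hole \
         + 0.01 * rng.standard_normal(t.size)

def multiexp(t, a1, tau1, a2, tau2):
    # eqn [7] with N = 2 components
    return a1 * np.exp(-t / tau1) + a2 * np.exp(-t / tau2)

popt, _ = curve_fit(multiexp, t, signal, p0=(1.5, 1.5, -0.2, 12.0))
print("amplitudes and lifetimes:", np.round(popt, 2))  # expect ~2 ps and ~10 ps
```

Note that the recovered lifetimes (2 ps and 10 ps here) are the eigenvalues of the kinetic scheme, not properties of any single species — which is exactly why a physical model or global analysis is needed to interpret them.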

List of Units and Nomenclature

ESA    excited state absorption
GSR    ground state recovery
SE     stimulated emission
S_n    nth singlet excited state
TR3    time-resolved resonance Raman
χ(n)   nth-order nonlinear optical susceptibility

See also

Coherent Lightwave Systems. Coherent Transients: Ultrafast Studies of Semiconductors. Nonlinear Optics, Applications: Raman Lasers. Optical Amplifiers: Raman, Brillouin and Parametric Amplifiers. Scattering: Raman Scattering. Spectroscopy: Raman Spectroscopy.



Further Reading

Fleming GR (1986) Chemical Applications of Ultrafast Spectroscopy. New York: Oxford University Press.
Heller EJ (1981) The semi-classical way to molecular spectroscopy. Accounts of Chemical Research 14: 368–375.
Holzwarth AR (1996) Data analysis of time-resolved measurements. In: Amesz J and Hoff AJ (eds) Biophysical Techniques in Photosynthesis, pp. 75–92. Dordrecht, Germany: Kluwer.
Ippen EP, Shank CV, Wiesenfeld JM and Migus A (1980) Subpicosecond pulse techniques. Philosophical Transactions of the Royal Society of London A 298: 225–232.
Jonas DM and Fleming GR (1995) Vibrationally abrupt pulses in pump-probe spectroscopy. In: El-Sayed MA, Tanaka I and Molin Y (eds) Ultrafast Processes in Chemistry and Photobiology, pp. 225–256. Oxford, UK: Blackwell Scientific.
Klimov VI and McBranch DW (1998) Femtosecond high-sensitivity, chirp-free transient absorption spectroscopy using kilohertz lasers. Optics Letters 23: 277–279.
Lee SY (1995) Wave-packet model of dynamic dispersed and integrated pump-probe signals in femtosecond transition-state spectroscopy. In: Manz J and Wöste L (eds) Femtosecond Chemistry, vol. 1, pp. 273–298. Weinheim, Germany: VCH.
Lin SH, Alden R, Islampour R, Ma H and Villaeys AA (1991) Density Matrix Method and Femtosecond Processes. Singapore: World Scientific.
Mukamel S (1995) Principles of Nonlinear Optical Spectroscopy. New York: Oxford University Press.
Norrish RGW and Porter G (1949) Chemical reactions produced by very high light intensities. Nature 164: 658.
Towrie M, Parker AW, Shaikh W and Matousek P (1998) Tunable picosecond optical parametric generator-amplifier system for time resolved Raman spectroscopy. Measurement Science and Technology 9: 816–823.
Yan YJ and Mukamel S (1990) Femtosecond pump-probe spectroscopy of polyatomic molecules in condensed phases. Physical Review A 41: 6485–6504.

Time-Correlated Single-Photon Counting

A Beeby, University of Durham, Durham, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Time-correlated single-photon counting, TCSPC, is a well-established technique for the determination of fluorescence lifetimes and related time-resolved fluorescence properties. Since the 1980s, there have been dramatic changes in the laser sources used for this work, which in turn have opened up the technique to a wide user base.

TCSPC: The Basics

In the TCSPC experiment, the sample is repeatedly excited by a high-repetition-rate, low-intensity light source. Fluorescence from the sample is collected and may be passed through a cut-off filter or monochromator to remove scattered excitation light and to select the emission wavelength. The fluorescence is then focused onto a detector, typically a photomultiplier (PMT) or single-photon avalanche diode (SPAD), which detects single emitted photons. The 'raw' signal from the detector fluctuates considerably from event to event. In order to remove any timing error that could be induced by this, the signal is processed first by a constant fraction discriminator (CFD) in order to provide a stable and consistent timing pulse. This signal is used as an

input for the time-to-amplitude converter (TAC) which determines the time interval between the excitation of the sample and the emission of the photon. The output from the TAC is a relatively slow pulse whose intensity is proportional to the time interval between the start and stop signals. A pulse height analyzer (PHA) is used to process the output from the TAC and increments a channel in its memory corresponding to a time window during which the photon was detected. As the measurement is repeated many times over, a histogram is constructed in the memory of the PHA, showing the number of photons detected as a function of the time interval. This histogram corresponds to the fluorescence decay. As outlined below, it is desirable to have a light source with a moderately high repetition rate, typically of the order of MHz. Because it is not practical to start the TAC each time the light source ‘fires’, it is more usual to operate the TAC in the so-called ‘reversed mode’, whereby the start signal is derived from the detected photon and the stop is derived from the light source. The electronic circuitry is illustrated in Figure 1. In practice, many researchers still use modular components that are wired together manually, although there is increasingly a tendency towards the use of single PC-based cards which contain all the necessary discriminators, timing circuits, and data acquisition electronics. There are a number of commercial PC


cards that contain all the timing electronics, etc., required for TCSPC, for example, Becker and Hickl (Germany). This approach to obtaining the fluorescence decay profile can only work if the detected photons are ‘randomly selected’ from the total emission from the sample. In practice, this means that the rate of detection of photons should be of the order of 1% of the excitation rate: that is, only one in a hundred excitation pulses gives rise to a detected photon. This histogram mirrors the fluorescence decay of the sample convolved with the instrument response function (IRF). The IRF is itself a convolution of the

Figure 1 Schematic diagram of a TCSPC spectrometer. Fluorescence from the sample is collected by L1, which may also include a monochromator/wavelength selection filter. A synchronization pulse is obtained either from the laser driver electronics directly, or via a photodiode and second constant fraction discriminator. The TAC is shown operating in a 'reversed mode' as is normal with a high repetition rate laser source.


temporal response of the optical light pulse, the optical path in the collection optics, and the response of the detector and ancillary electronics. Experimentally, the IRF is usually recorded by placing a scattering material in the spectrometer. In principle, the decay function of the sample can be extracted by deconvolution of the IRF from the measured decay, although in practice it is more common to use the method of iterative re-convolution with a model decay function(s). Thus, the experimentally determined IRF is convolved with a model decay function and parameters within the function are systematically varied to obtain the best fit for a specific model. The most commonly used model assumes the decaying species to follow one or more exponential functions (a sum of exponentials) or a distribution of exponentials. Further details regarding the mechanics of data analysis and fitting procedures can be found in the literature. A typical fit is illustrated in Figure 2.

The main aim of this review is to illustrate how recent advances in laser technology have changed the face of TCSPC and to discuss the types of laser light sources that are in use. The easiest way of approaching this topic is to define the criteria for an ideal light source for this experiment, and then to discuss the ways in which new technologies are meeting these criteria.
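A minimal sketch of the iterative re-convolution just described, using a synthetic Gaussian IRF, Poisson (photon-counting) noise, and a single-exponential model — all values are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

dt = 0.025                                    # ns per channel (assumed)
t = np.arange(0.0, 50.0, dt)
irf = np.exp(-0.5 * ((t - 2.0) / 0.1) ** 2)   # Gaussian IRF, ~235 ps FWHM (assumed)
irf /= irf.sum()                              # normalize to unit area

def reconvolved(t, amplitude, tau):
    """Model decay convolved with the (fixed) measured IRF."""
    model = amplitude * np.exp(-t / tau)
    return np.convolve(irf, model)[: t.size]  # discrete convolution on the same grid

true_counts = reconvolved(t, 1000.0, 1.50)                 # 'true' 1.50 ns decay
data = np.random.default_rng(2).poisson(true_counts)       # photon-counting noise

popt, _ = curve_fit(reconvolved, t, data, p0=(800.0, 1.0))
print(f"fitted lifetime = {popt[1]:.3f} ns")               # recovers ~1.50 ns
```

In a real analysis the fit quality would then be judged with the reduced χ² and weighted residuals, as in the caption of Figure 2.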

High Repetition Rate

A typical decay data set may contain something of the order of 0.1–5 million 'counts': it is desirable to

Figure 2 The fluorescence decay of Rhodamine B in water. The sample was excited using a 397 nm pulsed laser diode (1 MHz, ~200 ps FWHM), and the emission was observed at 575 nm. Deconvolution of the IRF (dark gray trace) and observed decay (light gray trace) with a single exponential function gave a lifetime of 1.50 ns, with χ²_R = 1.07, Durbin–Watson parameter = 1.78. The fitted curve is shown overlaid in gray, and the weighted residuals are shown offset in black.



acquire such a number of events in order that we can reliably employ statistical parameters to judge the quality of the fitted decay function(s). As stated above, the rate of photon acquisition should be of the order of 1% of the excitation rate in order to obtain reliable, good-quality data. It is clear then that a high repetition rate source is desirable in order to reduce the overall data acquisition time. However, it is important that the repetition rate of the source is not too high, otherwise the data show 'overlap' effects, caused by the fluorescence intensity from one excitation pulse not completely decaying before the next laser pulse arrives at the sample. Under these conditions the observed fluorescence decay is perturbed and extraction of complex kinetic behavior is more difficult. As a general rule of thumb, it is recommended that the inter-pulse separation is greater than five times the lifetime of the longest component of the fluorescence decay. Fluorescence lifetimes can be as long as 100 ns, indicating a maximum repetition rate of 2 MHz, although when investigating fluorophores with shorter lifetimes, correspondingly higher repetition rates can be selected. Clearly the ability to vary the repetition rate according to the type of sample under investigation is an advantage.
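The rule of thumb above translates directly into a maximum usable repetition rate, as in this small sketch:

```python
# Keep the inter-pulse separation at least five times the longest fluorescence
# lifetime to avoid 'overlap' effects (rule of thumb from the text).
def max_rep_rate_hz(longest_lifetime_s, margin=5.0):
    return 1.0 / (margin * longest_lifetime_s)

for tau_ns in (1.0, 10.0, 100.0):
    f = max_rep_rate_hz(tau_ns * 1e-9)
    print(f"tau = {tau_ns:5.1f} ns -> max repetition rate ~ {f / 1e6:.1f} MHz")
# The 100 ns case reproduces the 2 MHz limit quoted above.
```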

Short Pulse Duration

Typically the minimum lifetime that can be determined by a TCSPC spectrometer is limited by the instrument response function: as a general rule it is possible to extract fluorescence lifetimes of about one tenth of the full width at half maximum (FWHM) of the IRF by reconvolution methods. As discussed above, the IRF is the product of a number of factors, including the optical pulse duration, differences in the optical path traversed by the fluorescence, the response time of the detector, and the temporal response of the associated timing electronics. Ideally the excitation source should be as short as possible, and with modern laser-based TCSPC systems, it is usually the case that the time-response of the detector is the limiting factor. The introduction of high-speed microchannel plate (MCP) photomultipliers has led to very short IRFs, which can be <50 ps FWHM. When using these detectors it is essential to keep the optical transit-time spread as low as possible: that is, the path taken by light through a monochromator should be independent of wavelength and trajectory taken through the optics. Many research groups advocate the use of a subtractive dispersive double monochromator for this purpose.

Tunability

Early TCSPC experiments were carried out using hydrogen- or nitrogen-filled flashlamps which produced broadband ns-duration pulses with low repetition rates (<100 kHz). Variable wavelength excitation is possible using these sources, but their low intensity and low repetition rates mean that data acquisition periods can be very long. The early 1980s saw the introduction of cavity-dumped ion lasers and synchronously pumped, mode-locked, cavity-dumped dye lasers, which provided significantly higher repetition rates, typically up to 4 MHz, and greater intensities. However, the TCSPC experiments that could be carried out using these laser-based sources were somewhat limited by the wavelengths that could be generated by the argon ion and synchronously pumped dye lasers, and their second harmonics. The dye lasers were restricted to the use of Rhodamine 6G as the gain medium, which has a limited tuning range (~580–620 nm).

It is often desirable to vary the excitation wavelength as part of a photophysical study, either as a systematic study of the effects of excitation energy upon the excited state behavior, or in order to optimize excitation of a particular chromophore in a system. To some extent in the past, the wavelengths available from the laser systems dictated the science that they could be used to address. The mid- to late-1980s saw the introduction of mode-locked continuous wave (CW) Nd:YAG lasers as pump lasers and an increased number of dyes that could be synchronously pumped, providing a broader range of excitation wavelengths as well as increased stability. One drawback of these complex and often fickle laser systems was the level of expertise and attention they required to keep their performance at its peak.

Since the 1990s, the introduction of two important new types of laser has had a significant impact on the use of TCSPC as an investigative method. The first is based upon the mode-locked titanium sapphire laser, which offers tuneable radiation from the deep UV to the NIR, whilst the second is based upon (relatively) low-cost, turn-key solid-state diode lasers. The developments and merits of these new laser sources, which now dominate the market for TCSPC sources, are discussed below.

Ti-Sapphire

The discovery of the titanium doped sapphire laser has led to a revolution in ultrafast laser sources. Coupled with efficient diode-pumped solid state


Nd:YAG pump lasers, this medium currently dominates the entire ultrafast laser market, from high repetition rate, low pulse energy systems through to systems generating very high pulse energies at kHz repetition rates. Ti-sapphire is, in many respects, an ideal laser gain medium: it has a high energy storage capacity, a very broad gain bandwidth, and good thermal properties. Following the first demonstration of laser action in a Ti-sapphire crystal, the CW Ti-sapphire laser was quickly commercialized. It proved to be a remarkable new laser medium, providing greater efficiency, with outputs of the order of 1 W. The laser also provided an unprecedented broad tuning range, spanning from just below 700 nm to 1000 nm.

The most important breakthrough for time-resolved users came in 1991, when self-mode-locking in the medium was demonstrated. This self-mode-locking, sometimes referred to as Kerr-lens mode-locking, arises from the nonlinear refractive index of the material and is implemented by simply placing an aperture within the cavity at an appropriate point. The broad gain bandwidth of the medium results in very short pulses: pulse durations of the order of tens of femtoseconds can be routinely achieved with commercial laser systems.

The mode-locked Ti-sapphire laser was quickly seized upon and commercialized. The first generation of systems used multiwatt CW argon ion lasers to pump them, making the systems unwieldy and expensive to run, but in recent years these have been replaced by diode-pumped Nd:YAG lasers, providing an all solid-state, ultrafast laser source. Equally significantly, the current generation of lasers is very efficient and does not require significant electrical power or cooling capacity. There are several commercial suppliers of pulsed Ti-sapphire lasers, including Coherent and Spectra Physics.

A typical mode-locked Ti-sapphire laser provides an 80 MHz train of pulses with a FWHM of ~100 fs and an average power of up to 1.5 W, and is tuneable from 700 nm to almost 1000 nm. The relatively high peak powers associated with these pulses mean that second- or even third-harmonic generation is relatively efficient, providing radiation in the ranges 350–480 nm and 240–320 nm. The output powers of the second and third harmonics far exceed those required for TCSPC experiments, making these lasers an attractive source for fluorescence lifetime measurements. The simple mode-locked Ti-sapphire laser has some drawbacks: the very short pulse duration provides a broad-bandwidth excitation, the repetition rate is often too high, with an inter-pulse separation of ~12.5 ns,


and the laser does not provide continuous tuning from the deep-UV through to the visible. Incrementally, all of these points have been addressed by the laser manufacturers.

As well as operating as a source of femtosecond pulses, with their associated bandwidths of many tens of nm, some modern commercial Ti-sapphire lasers can, by adjustment of the optical path and components, be operated in a mode that provides picosecond pulses whilst retaining their broad tuning curve. Under these conditions, the bandwidth of the output is greatly reduced, making the laser resemble a synchronously pumped mode-locked dye laser.

As stated above, a very high repetition rate can be undesirable if the system under study has a long fluorescence lifetime, due to the fluorescence from an earlier pulse not decaying completely before the next excitation pulse. Variation of the output repetition rate can be achieved by two means. The first is to use a pulse-picker. This is an opto-acoustic device which is placed outside the laser cavity. An electrical signal is applied to the device, synchronized to the frequency of the mode-locked laser, to provide pulses at frequencies from ~4 MHz down to single-shot. The operation of a pulse-picker is somewhat inefficient, with less than half of the original pulse energy being obtained in the output, although output power is not usually an issue when using these sources. More importantly, when using pulse-pickers, it is essential to ensure that there is no pre- or post-pulse which can lead to artifacts in the observed fluorescence decays.

The second means of reducing the repetition rate is cavity dumping. This is achieved by extending the laser cavity and placing an opto-acoustic device, known as a Bragg cell, at a beam-waist within the cavity. When a signal is applied to the intracavity Bragg cell, synchronized with the passage of a mode-locked pulse through the device, a pulse of light is extracted from the cavity. The great advantage of cavity dumping is that energy builds up within the optical cavity, so the extracted pulses have a higher pulse energy than those of the mode-locked laser alone. This in turn leads to more efficient harmonic generation. The addition of the cavity dumper extends the laser cavity and hence reduces the fundamental repetition rate of the mode-locked laser to ~50 MHz, and can provide pulses at 9 MHz down to single shot.

The extension of the tuning range of the Ti-sapphire laser has been facilitated by the introduction of the optical parametric oscillator (OPO). In the OPO, a short wavelength pump beam passes through a nonlinear optical crystal and generates two tuneable



Figure 3 An illustration of the spectral coverage provided by currently available pulsed semiconductor lasers and LEDs, and a modern Ti-sapphire laser and OPO. Note that the OPO's output is frequency doubled, and wavelengths in the range ~440–525 nm may be obtained by sum frequency mixing of the Ti-sapphire's fundamental output with the output of the OPO. * denotes that there are many long wavelength pulsed laser diodes falling in the range λ > 720 nm.

beams of longer wavelengths. Synchronous pumping of the OPO with either a fs or ps Ti-sapphire laser ensures reasonably high efficiencies. The outputs from the OPO may be frequency doubled, with the option of intracavity doubling enhancing the efficiency still further. Using an 800 nm pump wavelength, an OPO provides typical signal and idler tuning ranges of 1050 to >1350 nm and 2.1–3.0 μm, respectively. The signal can be doubled to give 525–680 nm, effectively filling the visible spectrum. The average powers obtainable from Ti-sapphire-pumped OPOs make them eminently suitable for TCSPC experiments. A summary of the spectral coverage of the Ti-sapphire laser and OPO, and of low-cost solid-state sources, is illustrated in Figure 3.

An additional advantage of the Ti-sapphire laser is the possibility of multiphoton excitation. The combination of high average power and very short pulse duration of a mode-locked Ti-sapphire laser means that the peak powers of the pulses can be very high. Focusing the near-IR output of the laser gives rise to exceedingly high photon densities within a sample, which can lead to the simultaneous absorption of two or more photons. This is of particular significance when recording fluorescence lifetimes in conjunction with a microscope, particularly when the microscope is operated in a confocal configuration. Such systems are now being used for a derivative of TCSPC, known as fluorescence lifetime imaging (FLIM).

In summary, the Ti-sapphire laser is a very versatile and flexible source that is well suited to the TCSPC experiment. Off-the-shelf systems provide broad spectral coverage and variable repetition rates. Undoubtedly this medium will be the mainstay of TCSPC for some time to come.

Diode Lasers

Low-power diode lasers, with outputs in the range 630–1000 nm, have been commercially available for some time. These devices, which are normally operated as CW sources, can be operated in a pulsed mode, providing a high frequency train of sub-ns pulses. However, the relatively long output wavelengths produced by this first generation of diode lasers severely restricted the range of chromophores that could be excited, and this proved to be a limiting factor for their use as TCSPC sources. Attempts to generate shorter wavelengths from these NIR laser diodes by second harmonic generation (SHG) were never realized: the SHG process is extremely inefficient due to the very low peak output power of these lasers. Shorter wavelengths could be obtained by pulsing conventional light-emitting diodes (LEDs), which were commercially available with emission wavelengths covering the visible spectrum down to ~450 nm. However, these had the disadvantage of exhibiting longer pulse durations than those achievable from laser diodes and have


much broader emission bandwidths than a laser source. The introduction of violet and blue diode lasers and UV-LEDs by the Nichia Corporation (Japan) in the late 1990s resulted in a renewed interest in sources utilizing these small, solid-state devices. At least three manufacturers produce ranges of commercial TCSPC light sources based upon pulsed LEDs and laser diodes, with pulsed laser sources now available at 375, 400, 450, 473, 635, 670, and 830 nm. Currently, commercial pulsed laser diodes and LEDs are manufactured by Jobin-Yvon IBH (UK), Picoquant (Germany), and Hamamatsu (Japan). These small, relatively low-cost devices offer pulse durations of ~100 ps and excitation repetition rates of up to tens of MHz, making them ideally suited to 'routine' lifetime measurements. Although these diode laser sources are not tunable and do not span the full spectrum, they are complemented by pulsed LEDs which fill the gap in the visible spectrum. Pulsed LEDs generally have a longer pulse duration and broader, lower intensity spectral output, but are useful for some applications. At the time of writing there are no diode laser sources providing wavelengths of <350 nm, although Jobin-Yvon IBH (UK) have recently announced pulsed LEDs operating at


280 nm and 340 nm. As LEDs, and possibly diode lasers, emitting at these shorter wavelengths become more widely available, the prospects of low running costs and ease of use make these semiconductor devices an extremely attractive option for many TCSPC experiments. Commercial TCSPC spectrometers based around these systems offer a comparatively low-cost, turn-key approach to fluorescence lifetime based experiments, opening up what was a highly specialized field to a much broader range of users.

See also

Lasers: Noble Gas Ion Lasers. Light Emitting Diodes. Ultrafast Laser Techniques: Pulse Characterization Techniques.

Further Reading

Lakowicz JR (ed.) (2001) Principles of Fluorescence Spectroscopy. New York: Plenum Press.
O'Connor DV and Phillips D (1984) Time-Correlated Single Photon Counting. London: Academic Press.
Spence DE, Kean PN and Sibbett W (1991) Optics Letters 16: 42–44.

Transient Holographic Grating Techniques in Chemical Dynamics

E Vauthey, University of Geneva, Geneva, Switzerland

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Over the past two decades, holographic techniques have proved to be valuable tools for investigating the dynamics of chemical processes. The aim of this article is to give an overview of the main applications of these techniques for chemical dynamics. The basic principle underlying the formation and the detection of elementary transient holograms, also called transient gratings, is first presented. This is followed by a brief description of a typical experimental setup. The main applications of these techniques to solve chemical problems are then discussed.

Basic Principle

The basic principle of the transient holographic technique is illustrated in Figure 1. The sample

material is excited by two laser pulses at the same wavelength, crossed at an angle 2θ_pu. If the two pump pulses have the same intensity, I_pu, the intensity distribution in the interference region, assuming plane waves, is

\[ I(x) = 2 I_{pu} \left[ 1 + \cos\!\left( \frac{2\pi x}{\Lambda} \right) \right] \qquad [1] \]

where Λ = λ_pu/(2 sin θ_pu) is the fringe spacing, λ_pu is the pump wavelength, and I_pu is the intensity of one pump pulse. As discussed below, there can be many types of light–matter interactions that lead to a change in the optical properties of the material. For a dielectric material, they result in a spatial modulation of the optical susceptibility and thus of the complex refractive index, ñ. The latter distribution can be described as a Fourier cosine series:

\[ \tilde{n}(x) = \tilde{n}_0 + \sum_{m=1}^{\infty} \tilde{n}_m \cos\!\left( \frac{2m\pi x}{\Lambda} \right) \qquad [2] \]



where ñ_0 is the average value of ñ. In the absence of saturation effects, the spatial modulation of ñ is harmonic and the Fourier coefficients with m > 1 vanish. In this case, the peak-to-null variation of the complex refractive index, Δñ, is equal to the Fourier coefficient ñ_1. The complex refractive index can be split into its real and imaginary components:

\[ \tilde{n} = n + iK \qquad [3] \]

where n is the refractive index and K is the attenuation constant. The hologram created by the interaction of the crossed pump pulses consists of periodic, one-dimensional spatial modulations of n and K. Such distributions are nothing but phase and amplitude gratings, respectively. A third laser beam at the probe wavelength, λ_pr, striking these gratings at the Bragg angle, θ_B = arcsin(λ_pr/2Λ), will thus be partially diffracted (Figure 1b). The diffraction efficiency, η, depends on the modulation amplitude of the optical properties. In the limit of small diffraction

efficiency (η < 0.01), this relationship is given by

\[ \eta = \frac{I_{dif}}{I_{pr}} \approx \left[ \left( \frac{\ln 10\, \Delta A}{4\cos\theta_B} \right)^{2} + \left( \frac{\pi d\, \Delta n}{\lambda_{pr}\cos\theta_B} \right)^{2} \right] \exp\!\left( -\frac{\ln 10\, A}{\cos\theta_B} \right) \qquad [4] \]

where I_dif and I_pr are the diffracted and the probe intensity, respectively, d is the sample thickness, and A = 4πdK/(λ ln 10) is the average absorbance. The first and the second terms in the square bracket describe the contributions of the amplitude and phase gratings, respectively, and the exponential term accounts for the reabsorption of the diffracted beam by the sample.

Figure 1 Principle of the transient grating technique: (a) grating formation (pumping), (b) grating detection (probing).

Figure 2 Classification of the possible contributions to a transient grating signal.

The main processes responsible for the variation of the optical properties of an isotropic dielectric material are summarized in Figure 2. The modulation of the absorbance, ΔA, is essentially due to the photoinduced concentration changes, ΔC_i, of the different chemical species i (excited state, photoproduct, …):

\[ \Delta A(\lambda_{pr}) = \sum_i \varepsilon_i(\lambda_{pr})\, \Delta C_i \qquad [5] \]

where ε_i is the absorption coefficient of the species i. The variation of the refractive index, Δn, has several origins and can be expressed as

\[ \Delta n = \Delta n_K + \Delta n_p + \Delta n_d \qquad [6] \]

$\Delta n_K$ is the variation of refractive index due to the optical Kerr effect (OKE). This nonresonant interaction results in an electronic polarization (electronic OKE) and/or in a nuclear reorientation of the molecules (nuclear OKE) along the direction of the electric field associated with the pump pulses. As a consequence, a transient birefringence is created in the material. This effect is usually discussed within the framework of nonlinear optics in terms of an intensity-dependent refractive index or a third-order nonlinear susceptibility. Electronic OKE occurs in any dielectric material under sufficiently high light intensity. Nuclear OKE, on the other hand, is mostly observed in liquids and gases and depends strongly on the molecular shape.

$\Delta n_p$ is the change of refractive index related to population changes. Its magnitude and wavelength dependence can be obtained by Kramers–Kronig transformation of $\Delta A(\lambda)$ or $\Delta K(\lambda)$:

$$\Delta n_p(\lambda) = \frac{1}{2\pi^2}\int_0^{\infty} \frac{\Delta K(\lambda')}{1 - (\lambda'/\lambda)^2}\,d\lambda' \qquad [7]$$

$\Delta n_d$ is the change of refractive index associated with density changes. Density phase gratings can have essentially three origins:

$$\Delta n_d = \Delta n_d^{t} + \Delta n_d^{v} + \Delta n_d^{e} \qquad [8]$$

$\Delta n_d^{t}$ is related to the temperature-induced change of density. If a fraction of the excitation energy is converted into heat, through a nonradiative transition or an exothermic process, the temperature becomes spatially modulated. This results in a variation of density, and hence in a modulation of the refractive index with amplitude $\Delta n_d^{t}$. Most of the temperature dependence of $n$ originates from the density; the temperature-induced variation of $n$ at constant density is much smaller than $\Delta n_d^{t}$.

$\Delta n_d^{v}$ is related to the variation of volume upon population changes. This volume comprises not only the reactant and product molecules but also their environment. For example, in the case of a photodissociation, the volume of the product is larger than that of the reactant and a positive volume change can be expected. This will lead to a decrease of the density and to a negative $\Delta n_d^{v}$.

Finally, $\Delta n_d^{e}$ is related to electrostriction in the sample by the electric field of the pump pulses. Like the OKE, this is a nonresonant process that also contributes to the intensity-dependent refractive index. Electrostriction leads to material compression in the regions of high electric field strength. The periodic compression is accompanied by the generation of two counterpropagating acoustic waves with wave vectors $\vec k_{ac} = \pm(2\pi/\Lambda)\,\vec i$, where $\vec i$ is the unit vector along the modulation axis. The interference of these acoustic waves leads to a temporal modulation of $\Delta n_d^{e}$ at the acoustic frequency $\nu_{ac}$, with $2\pi\nu_{ac} = k_{ac} v_s$, $v_s$ being the speed of sound. As $\Delta n_d^{e}$ oscillates between negative and positive values, the diffracted intensity, which is proportional to $(\Delta n_d^{e})^2$, shows a temporal oscillation at twice the acoustic frequency. In most cases, $\Delta n_d^{e}$ is weak and can be neglected if the pump pulses are within an absorption band of the sample.

The modulation amplitudes of absorbance and refractive index are not constant in time; their temporal behavior depends on various dynamic processes in the sample. The whole point of the transient grating techniques is precisely the measurement of the diffracted intensity as a function of time after excitation, from which dynamic information on the system is deduced. In the following, we will show that the various processes shown in Figure 2, which give rise to a diffracted signal, can in principle be separated by choosing appropriate experimental parameters, such as the timescale, the probe wavelength, the polarization of the four beams, or the crossing angle.

Experimental Setup

A typical experimental arrangement for pump–probe transient grating measurements is shown in Figure 3. The laser output pulses are split into three parts. Two parts of equal intensity are used as

Figure 3 Schematic of a transient grating setup with pump–probe detection.



Figure 4 Beam geometry for transient grating: (a) in plane; (b) boxcars.

pump pulses and are crossed in the sample. To ensure time coincidence, one pump pulse travels along an adjustable optical delay line. The third part, which is used for probing, can be frequency converted using a dye laser, a nonlinear crystal, a Raman shifter, or white-light continuum generation. The probe pulse is sent along a motorized optical delay line before striking the sample at the Bragg angle. There are several possible beam configurations for transient grating experiments, and the two most commonly used are illustrated in Figure 4. When the probe and pump pulses are at different wavelengths, they can be in the same plane of incidence, as shown in Figure 4a. However, if the pump and probe wavelengths are the same, the folded boxcars geometry shown in Figure 4b has to be used. The transient grating technique is background free, and the diffracted signal propagates in a well-defined direction. In a pump–probe experiment, the diffracted signal intensity is measured as a function of the time delay between the pump and probe pulses. Such a setup can be used to probe dynamic processes occurring on timescales from a few fs to a few ns, the time resolution depending essentially on the duration of the pump and probe pulses. For slower processes, the grating dynamics can be probed in real time with a cw laser beam and a fast photodetector.
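The geometry of such an experiment follows directly from the expressions given above: the fringe spacing $\Lambda = \lambda_{pu}/(2\sin\theta_{pu})$ and the Bragg angle $\theta_B = \arcsin(\lambda_{pr}/2\Lambda)$. A minimal Python helper (the wavelengths and crossing angles are illustrative choices, not values prescribed by the article):

```python
import numpy as np

def fringe_spacing(lam_pu, two_theta_pu_deg):
    """Lambda = lam_pu / (2 sin(theta_pu)); 2*theta_pu is the pump crossing angle."""
    return lam_pu / (2.0 * np.sin(np.radians(two_theta_pu_deg) / 2.0))

def bragg_angle_deg(lam_pr, spacing):
    """theta_B = arcsin(lam_pr / (2 Lambda))."""
    return np.degrees(np.arcsin(lam_pr / (2.0 * spacing)))

lam_pu, lam_pr = 355e-9, 590e-9          # illustrative pump/probe wavelengths (m)
for two_theta in (0.5, 27.0, 50.0):      # pump crossing angles (degrees)
    L = fringe_spacing(lam_pu, two_theta)
    print(f"2theta_pu = {two_theta:5.1f} deg   Lambda = {L*1e6:7.3f} um   "
          f"theta_B = {bragg_angle_deg(lam_pr, L):5.1f} deg")
```

Note that for counterpropagating pumps (crossing angle near 180°) the spacing inside the medium is $\lambda_{pu}/2n$, as discussed below.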

Applications

The Transient Density Phase Grating Technique

If the grating is probed at a wavelength far from any absorption band, the variation of absorbance, $\Delta A(\lambda_{pr})$, is zero and the corresponding change of refractive index, $\Delta n_p(\lambda_{pr})$, is negligibly small. In this case, eqn [4] simplifies to

$$\eta \approx \left(\frac{\pi d}{\lambda_{pr}\cos\theta_B}\right)^2 \Delta n_d^2 \qquad [9]$$

In principle, the diffracted signal may also contain contributions from the optical Kerr effect, $\Delta n_K$, but we will assume here that this nonresonant and ultrafast response is negligibly small. The density change can originate from both heat-releasing processes and volume differences between the products and reactants, the former contribution usually being much larger than the latter. Even if the heat-releasing process is instantaneous, the rise time of the density grating is limited by thermal expansion. This expansion is accompanied by the generation of two counterpropagating acoustic waves with wave vectors $\vec k_{ac} = \pm(2\pi/\Lambda)\,\vec i$. One can distinguish two density gratings: 1) a diffusive density grating, which reproduces the spatial distribution of temperature and decays by thermal diffusion; and 2) an acoustic density grating, originating from the standing acoustic wave, whose amplitude oscillates at the acoustic frequency $\nu_{ac}$. The amplitudes of these two gratings are equal but of opposite sign. Consequently, the time dependence of the modulation amplitude of the density phase grating is given by

$$\Delta n_d(t) = \left(\frac{\beta Q}{\rho C_v} + \Delta V\right)\left(\rho\frac{\partial n}{\partial \rho}\right) R(t) \qquad [10a]$$

with

$$R(t) = 1 - \cos(2\pi\nu_{ac} t)\exp(-\alpha_{ac} v_s t) \qquad [10b]$$

where $\rho$, $\beta$, $C_v$, and $\alpha_{ac}$ are the density, the volume expansion coefficient, the heat capacity, and the acoustic attenuation constant of the medium, respectively, $Q$ is the amount of heat deposited during the photoinduced process, and $\Delta V$ is the corresponding volume change. As the standing acoustic wave oscillates, its density grating interferes with the diffusive density grating. The total modulation amplitude of the density, and thus $\Delta n_d$, therefore oscillates at $\nu_{ac}$. Figure 5 shows the time profile of the diffracted

Figure 5 Time profile of the diffracted intensity measured with a solution of malachite green after excitation with two pulses in a nearly counterpropagating geometry.


intensity measured with a solution of malachite green. After excitation to the S₁ state, this dye relaxes nonradiatively to the ground state in a few ps. For this measurement, the sample solution was excited with two 30 ps laser pulses at 532 nm crossed at an angle close to 180°. The continuous line is the best fit of eqns [9] and [10]. The damping of the oscillation is due to acoustic attenuation. After complete damping, the remaining diffracted signal is due to the diffusive density grating only.

$R(t)$ can be considered as the response function of the sample to a prompt heat release and/or volume change. If these processes are not instantaneous compared to an acoustic period ($\tau_{ac} = \nu_{ac}^{-1}$), the acoustic waves are not created impulsively and the time dependence of $\Delta n_d$ is

$$\Delta n_d(t) = \left(\frac{\beta Q}{\rho C_v} + \Delta V\right)\left(\rho\frac{\partial n}{\partial \rho}\right) F(t) \qquad [11a]$$

with

$$F(t) = \int_{-\infty}^{t} R(t-t')\,f(t')\,dt' \qquad [11b]$$

where $f(t)$ is a normalized function describing the time evolution of the temperature and/or volume change. In many cases, $f(t) = \exp(-k_r t)$, $k_r$ being the rate constant of the process responsible for the change. Figure 6 shows the time profile of the diffracted intensity calculated with eqns [9] and [11] for different values of $k_r$. If several processes take place, the total change of refractive index is the sum of the changes due to the individual processes. In this case, $\Delta n_d$ should be expressed as

$$\Delta n_d(t) = \sum_i \left(\frac{\beta Q_i}{\rho C_v} + \Delta V_i\right)\left(\rho\frac{\partial n}{\partial \rho}\right) F_i(t) \qquad [12]$$
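The effect of a finite heat-release rate on the signal, illustrated in Figure 6, can be reproduced numerically from eqns [9]–[11]: compute $F(t)$ as the convolution of $R(t)$ with a normalized $f(t)$ and square the result. The following sketch is only a demonstration; all parameter values are arbitrary, and $f(t)$ is taken as $k_r\exp(-k_r t)$ so that it integrates to unity.

```python
import numpy as np

def R(t, nu_ac, damping):
    """Response to a prompt perturbation, eqn [10b]; 'damping' is alpha_ac * v_s."""
    return 1.0 - np.cos(2 * np.pi * nu_ac * t) * np.exp(-damping * t)

nu_ac, damping = 3.1e9, 2.0e8     # illustrative acoustic frequency and damping (s^-1)
t = np.linspace(0.0, 5e-9, 5000)
dt = t[1] - t[0]

for k_r in (1e11, 3e9, 5e8):      # heat-release rate constants (s^-1)
    f = k_r * np.exp(-k_r * t)                                # normalized f(t)
    F = np.convolve(R(t, nu_ac, damping), f)[:t.size] * dt    # eqn [11b]
    I = F ** 2                                                # eqn [9]: I_dif ~ (Dn_d)^2
    print(f"k_r = {k_r:.0e} s^-1  ->  peak/plateau ratio = {I.max() / I[-1]:.2f}")
```

As in Figure 6, the acoustic oscillation is pronounced when $k_r \gg 2\pi\nu_{ac}$ and washes out when the heat release is slow compared to the acoustic period.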

The separation of the thermal and volume contributions to the diffracted signal is problematic. Several approaches have been proposed, the most used being measurement in a series of solvents with different expansion coefficients $\beta$. This method requires that the other solvent properties, such as refractive index, dielectric constant, or viscosity, are constant, or that the energetics of the system investigated does not depend on them. The separation is easier when working with water, because its $\beta$ value vanishes at 4 °C. At this temperature, the density variations are due to the volume changes only.

The above equations describe the growth of the density phase grating. However, this grating is not permanent and decays through diffusive processes. The phase grating originating from thermal expansion decays via thermal diffusion with a rate constant $k_{th}$ given by

$$k_{th} = D_{th}\left(\frac{2\pi}{\Lambda}\right)^2 \qquad [13]$$

where $D_{th}$ is the thermal diffusivity. Table 1 shows $k_{th}$ values in acetonitrile for different crossing angles. The decay of the phase grating originating from volume changes depends on the dynamics of the population responsible for $\Delta V$ (vide infra).

Time-resolved optical calorimetry

A major application of the density phase grating in chemistry is the investigation of the energetics of photoinduced processes. The great advantage of this technique over other optical calorimetric methods, such as thermal lensing or photoacoustic spectroscopy, is its superior time resolution. The time constant of the fastest heat-releasing process that can be time resolved with this technique is of the order of the acoustic period. The shortest acoustic period is achieved when forming the grating with two counterpropagating pump pulses ($2\theta_{pu} = 180°$).

Figure 6 Time profiles of the diffracted intensity calculated using eqns [9] and [11] with various kr values.

Table 1 Fringe spacing, $\Lambda$, acoustic frequency, $\nu_{ac}$, and thermal diffusion rate constant, $k_{th}$, for various crossing angles of the pump pulses, $2\theta_{pu}$, at 355 nm and in acetonitrile

2θ_pu    Λ (µm)    ν_ac (s⁻¹)     k_th (s⁻¹)
0.5°     40.7      3.2 × 10⁷      4.7 × 10³
50°      0.42      3.1 × 10⁹      4.5 × 10⁷
180°     0.13      9.7 × 10⁹      4.7 × 10⁸
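The entries of Table 1 are tied together by $\nu_{ac} = v_s/\Lambda$ and eqn [13]. As a consistency check, the sketch below uses a speed of sound and thermal diffusivity for acetonitrile chosen to match the table ($v_s \approx 1.29\times 10^3$ m s⁻¹, $D_{th} \approx 2.0\times 10^{-7}$ m² s⁻¹); both numbers are our assumptions, not values quoted in the article.

```python
import numpy as np

v_s  = 1.29e3    # speed of sound (m/s), assumed for acetonitrile
D_th = 2.0e-7    # thermal diffusivity (m^2/s), assumed

for L_um in (40.7, 0.42, 0.13):            # fringe spacings of Table 1
    L = L_um * 1e-6
    nu_ac = v_s / L                         # acoustic frequency
    k_th = D_th * (2.0 * np.pi / L) ** 2    # eqn [13]
    print(f"Lambda = {L_um:6.2f} um   nu_ac = {nu_ac:.1e} s^-1   k_th = {k_th:.1e} s^-1")
```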



In this case the fringe spacing is $\Lambda = \lambda_{pu}/2n$ and, with UV pump pulses, an acoustic period of the order of 100 ps can be obtained in a typical organic solvent. For example, Figure 7a shows the energy diagram for a photoinduced electron transfer (ET) reaction between benzophenone (BP) and an electron donor (D) in a polar solvent. Upon excitation at 355 nm, ¹BP* undergoes intersystem crossing (ISC) to ³BP* with a time constant of about 10 ps.

Figure 7 (a) Energy diagram of the states involved in the photoinduced electron transfer reaction between benzophenone (BP) and an electron donor (D) (VR: vibrational relaxation; ISC: intersystem crossing; ET: electron transfer; REC: charge recombination). (b) Time profile of the diffracted intensity measured at 590 nm after excitation at 355 nm of a solution of BP and 0.05 M D. (c) Same as (b) but measured at 1064 nm with a cw laser.

After diffusional encounter with the electron donor, ET takes place and a pair of ions is generated. With this system, the whole energy of the 355 nm photon ($E = 3.49$ eV) is converted into heat. The different heat-releasing processes can be differentiated according to their timescales. The vibrational relaxation to ¹BP* and the ensuing ISC to ³BP* induce an ultrafast release of 0.49 eV as heat. With donor concentrations of the order of 0.1 M, the heat deposition due to the electron transfer is typically in the ns range. Finally, the recombination of the ions produces a heat release on the microsecond timescale. Figure 7b shows the time profile of the diffracted intensity measured after excitation at 355 nm of BP with 0.05 M D in acetonitrile. The 30 ps pump pulses were crossed at 27° ($\tau_{ac} = 565$ ps) and the 10 ps probe pulses were at 590 nm. The oscillatory behavior is due to the ultrafast heat released upon formation of ³BP*, while the slow dc rise is caused by the heat dissipated upon ET. As the amount of energy released in the ultrafast process is known, the energy released upon ET can be determined by comparing the amplitudes of the fast and slow components of the time profile. The energetics as well as the dynamics of the photoinduced ET process can thus be determined. Figure 7c shows the decay of the diffracted intensity due to the washing out of the grating by thermal diffusion. As both the acoustic frequency and the thermal diffusion depend on the grating wavevector, the experimental time window in which a heat-releasing process can be measured depends mainly on the crossing angle of the pump pulses, as shown in Table 1.

Determination of material properties

Another important application of the transient density phase grating technique is the investigation of material properties. Acoustic waves of various frequencies, depending on the pump wavelength and crossing angle, can be generated without physical contact with the sample. Therefore, acoustic properties, such as the speed of sound and the acoustic attenuation of the material, can easily be obtained from the frequency and the damping of the oscillation of the transient grating signal. Similarly, the optoelastic constant of a material, $\rho\,\partial n/\partial\rho$, can be determined from the amplitude of the signal (see eqn [11]). This is done by comparing the signal amplitude of the material under investigation with that obtained with a known standard. Finally, the thermal diffusivity, $D_{th}$, can easily be obtained from the decay of the thermal density

CHEMICAL APPLICATIONS OF LASERS / Transient Holographic Grating Techniques in Chemical Dynamics

phase grating, such as that shown in Figure 7c. This technique can be used with a large variety of bulk materials, as well as with films, surfaces, and interfaces.

Investigation of Population Dynamics

The dynamic properties of a photogenerated species can be investigated by using a probe wavelength within its absorption or dispersion spectrum. As shown by eqn [4], either $\Delta A$ or $\Delta n_p$ has to be different from zero. For this application, $\Delta n_K$ and $\Delta n_d$ should ideally be equal to zero. Unless working with ultrashort pulses ($< 1$ ps) and a weakly absorbing sample, $\Delta n_K$ can be neglected. In practice, it is almost impossible to find a sample system in which some fraction of the energy absorbed as light is not released as heat. However, by using a sufficiently small crossing angle of the pump pulses, the formation of the density phase grating, which depends on the acoustic period, can take as much as 30 to 40 ns. In this case, $\Delta n_d$ is negligible during the first few ns after excitation and the diffracted intensity is due to the population grating only:

$$I_{dif}(t) \propto \sum_i \Delta C_i^2(t) \qquad [14]$$

where $\Delta C_i$ is the modulation amplitude of the concentration of each species $i$ whose absorption and/or dispersion spectrum overlaps with the probe wavelength. The temporal variation of $\Delta C_i$ can be due either to the dynamics of species $i$ within the illuminated grating fringes or to processes taking place between the fringes, such as translational diffusion, excitation transport, and charge diffusion. In liquids, the decay of the population grating by translational diffusion is slow and occurs on the microsecond to millisecond timescale, depending on the fringe spacing. As thermal diffusion is typically a hundred times faster, eqn [14] is again valid on this long timescale. Therefore, if the population dynamics is very slow, the translational diffusion coefficient of a chemical species can be obtained by measuring the decay of the diffracted intensity as a function of the fringe spacing. This procedure has also been used to determine the temperature of flames; in that case, however, the decay of the population grating by translational diffusion typically occurs on the sub-ns timescale.

In the condensed phase, these interfringe processes are of minor importance when measuring the diffracted intensity on the short timescale, i.e., before the formation of the density phase grating. In this


case, the transient grating technique is similar to transient absorption, and it thus allows the measurement of population dynamics. However, because holographic detection is background free, it is at least a hundred times more sensitive than transient absorption.

The population gratings are usually probed with monochromatic laser pulses. This procedure is well suited for simple photoinduced processes, such as the decay of an excited-state population to the ground state. For example, Figure 8 shows the decay of the diffracted intensity at 532 nm measured after excitation of a cyanine dye at the same wavelength. These dynamics correspond to the ground-state recovery of the dye by nonradiative deactivation of the first singlet excited state. Because the diffracted intensity is proportional to the square of the concentration changes (see eqn [14]), its decay time is half the ground-state recovery time. If the reaction is more complex, or if several intermediates are involved, the population grating has to be probed at several wavelengths. Instead of repeating many single-wavelength experiments, it is preferable to perform multiplex transient grating measurements. In this case, the grating is probed by white-light pulses generated by focusing high-intensity fs or ps laser pulses in a dispersive medium. If the crossing angle of the pump pulses is small enough ($< 1°$), the Bragg angle for probing is almost independent of the wavelength. A transient grating spectrum is obtained by dispersing the diffracted signal in a spectrograph. This spectrum consists of the sum of the squares of the transient absorption and transient dispersion spectra. In practice, it is very similar to a transient absorption spectrum, but with a much superior signal-to-noise ratio.

Figure 8 Time profile of the diffracted intensity at 532 nm measured after excitation at the same wavelength of a cyanine dye (inset) in solution and best fit assuming exponential ground state recovery.
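Since the diffracted intensity scales with the square of the concentration modulation (eqn [14]), a population decaying as $\exp(-t/\tau)$ produces a signal decaying as $\exp(-2t/\tau)$, which is why a fit such as that in Figure 8 returns a signal decay time equal to half the ground-state recovery time. A short numerical check with an arbitrary lifetime:

```python
import numpy as np

tau_pop = 3.0e-12                       # hypothetical population lifetime (s)
t = np.linspace(0.0, 12e-12, 2401)
I_dif = np.exp(-t / tau_pop) ** 2       # eqn [14]: signal ~ (Delta C)^2

tau_sig = t[np.argmax(I_dif < np.exp(-1.0))]   # first time the signal falls below 1/e
print(f"population lifetime = {tau_pop:.1e} s, signal 1/e time = {tau_sig:.1e} s")
# the diffracted intensity decays twice as fast as the population itself
```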



Figure 9 shows the transient grating spectra measured at various time delays after excitation of a solution of chloranil (CA) and methylnaphthalene (MNA) in acetonitrile. The reaction that takes place is

$$^3\mathrm{CA}^{*}\cdot\mathrm{MNA} \xrightarrow{\text{diffusion}} {}^3(\mathrm{CA}^{*}\cdot\mathrm{MNA}) \xrightarrow{\text{ET}} {}^3(\mathrm{CA}^{-}\cdot\mathrm{MNA}^{+}) \xrightarrow{\mathrm{MNA}} {}^3(\mathrm{CA}^{-}\cdot\mathrm{MNA}_2^{+})$$

After its formation upon ultrafast ISC, ³CA*, which absorbs around 510 nm, decays upon ET with MNA to generate CA⁻ (450 nm) and MNA⁺ (690 nm). The ion pair reacts further with a second MNA molecule to form the dimer cation (580 nm). The time profile of the diffracted intensity reflects the population dynamics as long as these populations follow first- or pseudo-first-order kinetics. Higher-order kinetics leads to inharmonic gratings, and in this case the time dependence of the diffracted intensity is no longer a direct measure of the population dynamics.

Figure 9 Transient grating spectra obtained at various time delays after excitation at 355 nm of a solution of chloranil and 0.25 M methylnaphthalene ((a), from top to bottom: 60, 100, 180, 260, 330, and 600 ps; (b), from top to bottom: 100, 400, 750, 1100, and 1500 ps).

Polarization Selective Transient Grating

In the above applications, the selection between the different contributions to the diffracted signal was essentially made by choosing the probe wavelength and the crossing angle of the pump pulses. However, this approach is not always sufficient. Another important parameter is the polarization of the four waves involved in a transient grating experiment. For example, when measuring population dynamics, the polarization of the probe beam has to be at the magic angle (54.7°) relative to that of the pump pulses. This ensures that the observed time profiles are not distorted by the decay of the orientational anisotropy of the species created by the polarized pump pulses.

The magnitudes of $\Delta A$ and $\Delta n$ depend on the pump pulse intensity, and therefore the diffracted intensity can be expressed using the formalism of nonlinear optics:

$$I_{dif}(t) = C\int_{-\infty}^{+\infty} dt''\, I_{pr}(t-t'')\left[\int_{-\infty}^{t''} dt'\,\lvert R_{ijkl}(t''-t')\rvert\, I_{pu}(t')\right]^2 \qquad [15]$$

where $C$ is a constant and $R_{ijkl}$ is an element of the fourth-rank tensor $R$ describing the nonlinear response of the material to the applied optical fields. In isotropic media, this tensor has only 21 nonzero elements, which are related as follows:

$$R_{1111} = R_{1122} + R_{1212} + R_{1221} \qquad [16]$$

where the subscripts are the Cartesian coordinates. Going from right to left, they designate the directions of polarization of the pump, probe, and signal pulses. The remaining elements can be obtained by permutation of these indices ($R_{1122} = R_{1133} = R_{2211} = \ldots$). In a conventional transient grating experiment, the two pump pulses are at the same frequency and are time coincident, and therefore their indices can be interchanged. In this case, $R_{1212} = R_{1221}$ and the number of independent tensor elements is further reduced:

$$R_{1111} = R_{1122} + 2R_{1212} \qquad [17]$$

The tensor $R$ can be decomposed into four tensors according to the origin of the sample response: population and density changes, and electronic and nuclear optical Kerr effects:

$$R = R(p) + R(d) + R(K,e) + R(K,n) \qquad [18]$$

As they describe different phenomena, these tensors do not have the same symmetry properties. Therefore, the various contributions to the diffracted intensity can, in some cases, be measured selectively by choosing the appropriate polarization of the four waves. Table 2 shows the relative amplitude of the most important elements of these four tensors.


The tensor $R(p)$ depends on the polarization anisotropy of the sample, $r$, created by excitation with polarized pump pulses:

$$r(t) = \frac{N_\parallel(t) - N_\perp(t)}{N_\parallel(t) + 2N_\perp(t)} = \frac{2}{5}\,P_2[\cos(\gamma)]\exp(-k_{re} t) \qquad [19]$$

where $N_\parallel$ and $N_\perp$ are the numbers of molecules with the transition dipole oriented parallel and perpendicular to the polarization of the probe pulse, respectively, $\gamma$ is the angle between the transition dipoles involved in the pump and probe processes, $P_2$ is the second Legendre polynomial, and $k_{re}$ is the rate constant for the reorientation of the transition dipole, for example by rotational diffusion of the molecule or by energy hopping.

Table 2 shows that $R_{1212}(d) = 0$, i.e., the contribution of the density grating can be eliminated with the set of polarizations (0°, 90°, 0°, 90°), the so-called crossed grating geometry. In this geometry, the two pump pulses have orthogonal polarizations, the polarization of the probe pulse is parallel to that of one pump pulse, and the polarization component of the signal that is orthogonal to the polarization of the probe pulse is measured. $R_{1212}(p)$ is then nonzero as long as there is some polarization anisotropy ($r \neq 0$), and the diffracted intensity is

$$I_{dif}(t) \propto \lvert R_{1212}(p)\rvert^2 \propto [\Delta C(t)\,r(t)]^2 \qquad [20]$$

The crossed grating technique can thus be used to investigate the reorientational dynamics of molecules, through $r(t)$, especially when the dynamics of $r$ is faster than that of $\Delta C$. For example, Figure 10 shows the time profile of the diffracted intensity measured in the crossed grating geometry with rhodamine 6G in ethanol. The decay is due to the reorientation of the molecule by rotational diffusion, the excited-state lifetime of rhodamine being about 4 ns. On the other hand, if the decay of $r$ is slower than that of $\Delta C$, the crossed grating geometry can be used to measure the population dynamics without any interference from the density phase grating. For example, Figure 11 shows the time profile of the diffracted intensity after excitation of a suspension of TiO₂ particles in water. The upper profile was measured with the set of polarizations (0°, 0°, 0°, 0°) and thus reflects the time dependence of $R_{1111}(p)$ and $R_{1111}(d)$. $R_{1111}(p)$ is due to the trapped electron population, which decays by charge recombination, and $R_{1111}(d)$ is due to the heat dissipated upon both charge separation and recombination. The lower time profile was measured in the crossed grating geometry and thus reflects the time dependence of $R_{1212}$, in particular that of $R_{1212}(p)$.

Figure 10 Time profile of the diffracted intensity measured with a solution of rhodamine 6G with crossed grating geometry.

Table 2 Relative values of the most important elements of the response tensors $R(i)$, and polarization angle $\zeta$ of the signal beam at which the contribution of the corresponding process vanishes for the set of polarizations ($\zeta$, 45°, 0°, 0°)

Process                  R₁₁₁₁   R₁₁₂₂   R₁₂₁₂   ζ
Electronic OKE           1       1/3     1/3     -71.6°
Nuclear OKE              1       -1/2    3/4     63.4°
Density                  1       1       0       -45°
Population: γ = 0°       1       1/3     1/3     -71.6°
Population: γ = 90°      1       2       -1/2    -26.6°
No correlation: r = 0    1       1       0       -45°
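The $\zeta$ column of Table 2 can be reproduced with a simple projection argument: with both pump pulses polarized along $x$ and the probe at 45°, the two signal field components are proportional to $R_{1111}$ and $R_{1122}$, so a given contribution vanishes for an analyzer angle $\zeta$ satisfying $R_{1111}\cos\zeta + R_{1122}\sin\zeta = 0$. This reading of the geometry is our own gloss on the table, not a statement taken from the article:

```python
import numpy as np

# (R1111, R1122) relative values taken from Table 2
processes = [
    ("electronic OKE",        1.0,  1.0 / 3.0),
    ("nuclear OKE",           1.0, -0.5),
    ("density",               1.0,  1.0),
    ("population, gamma=0",   1.0,  1.0 / 3.0),
    ("population, gamma=90",  1.0,  2.0),
    ("no correlation, r=0",   1.0,  1.0),
]

for name, r1111, r1122 in processes:
    # vanishing condition R1111*cos(zeta) + R1122*sin(zeta) = 0
    zeta = np.degrees(np.arctan(-r1111 / r1122))
    print(f"{name:22s}  zeta = {zeta:+6.1f} deg")
```

Under this assumption the printed angles match the last column of Table 2.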

Figure 11 Time profiles of the diffracted intensity after excitation at 355 nm of a suspension of TiO₂ particles in water, using different sets of polarizations of the four beams.



Each contribution to the signal can be selectively eliminated by using the set of polarizations ($\zeta$, 45°, 0°, 0°), where the value of the 'magic angle' $\zeta$ for each contribution is listed in Table 2. This approach allows, for example, the nuclear and electronic contributions to the optical Kerr effect to be measured separately.

Concluding Remarks

The transient grating techniques offer a large variety of applications for investigating the dynamics of chemical processes. We have only discussed the cases where the pump pulses are time coincident and at the same wavelength. Excitation with pump pulses at different wavelengths results in a moving grating. The well-known CARS spectroscopy is such a moving grating technique. Finally, the three-pulse photon echo can be considered as a special case of transient grating in which the sample is excited by two pump pulses that are at the same wavelength but are not time coincident.

List of Units and Nomenclature

C: concentration [mol L⁻¹]
C_v: heat capacity [J K⁻¹ kg⁻¹]
d: sample thickness [m]
D_th: thermal diffusivity [m² s⁻¹]
I_dif: diffracted intensity [W m⁻²]
I_pr: intensity of the probe pulse [W m⁻²]
I_pu: intensity of a pump pulse [W m⁻²]
k_ac: acoustic wavevector [m⁻¹]
k_r: rate constant of a heat-releasing process [s⁻¹]
k_re: rate constant of reorientation [s⁻¹]
k_th: rate constant of thermal diffusion [s⁻¹]
K: attenuation constant
n: refractive index
ñ: complex refractive index
N: number of molecules per unit volume [m⁻³]
r: polarization anisotropy
R: fourth-rank response tensor [m² V⁻²]
v_s: speed of sound [m s⁻¹]
V: volume [m³]
α_ac: acoustic attenuation constant [m⁻¹]
β: cubic volume expansion coefficient [K⁻¹]
γ: angle between transition dipoles
Λ: fringe spacing [m]
Δn_d: variation of refractive index due to density changes
Δn_d^e: Δn_d due to electrostriction
Δn_K: variation of refractive index due to the optical Kerr effect
Δn_d^v: Δn_d due to volume changes
Δn_p: variation of refractive index due to population changes
Δn_d^t: Δn_d due to temperature changes
Δx: peak-to-null variation of the parameter x
ε: molar decadic absorption coefficient [L mol⁻¹ cm⁻¹]
ζ: angle of polarization of the diffracted signal
η: diffraction efficiency
θ_B: Bragg angle (angle of incidence of the probe pulse)
θ_pu: angle of incidence of a pump pulse
λ_pr: probe wavelength [m]
λ_pu: pump wavelength [m]
ρ: density [kg m⁻³]
τ_ac: acoustic period [s]
ν_ac: acoustic frequency [s⁻¹]

See also

Chemical Applications of Lasers: Nonlinear Spectroscopies; Pump and Probe Studies of Femtosecond Kinetics. Holography, Applications: Holographic Recording Materials and their Processing. Holography, Techniques: Holographic Interferometry. Materials Characterization Techniques: χ(3). Nonlinear Optics Basics: Four-Wave Mixing. Ultrafast Technology: Femtosecond Chemical Dynamics, Gas Phase; Femtosecond Condensed Phase Spectroscopy – Structural Dynamics.

Further Reading

Bräuchle C and Burland DM (1983) Holographic methods for the investigation of photochemical and photophysical properties of molecules. Angewandte Chemie International Edition in English 22: 582–598.
Eichler HJ, Günter P and Pohl DW (1986) Laser-Induced Dynamic Gratings. Berlin: Springer.
Fleming GR (1986) Chemical Applications of Ultrafast Spectroscopy. Oxford: Oxford University Press.
Fourkas JT and Fayer MD (1992) The transient grating: a holographic window to dynamic processes. Accounts of Chemical Research 25: 227–233.
Hall G and Whitaker BJ (1994) Laser-induced grating spectroscopy. Journal of the Chemical Society, Faraday Transactions 90: 1–16.
Levenson MD and Kano SS (1987) Introduction to Nonlinear Laser Spectroscopy, revised edn. Boston: Academic Press.
Mukamel S (1995) Nonlinear Optical Spectroscopy. Oxford: Oxford University Press.
Rullière C (ed.) (1998) Femtosecond Laser Pulses. Berlin: Springer.
Terazima M (1998) Photothermal studies of photophysical and photochemical processes by the transient grating method. Advances in Photochemistry 24: 255–338.



cross-sectional area 10⁴ times and dispersed the beam across it. The CPA technique makes it possible to use conventional laser amplifiers and to stay below the onset of nonlinear effects.

See also

Diffraction: Diffraction Gratings. Ultrafast Laser Techniques: Pulse Characterization Techniques.

COHERENCE

Contents

Overview
Coherence and Imaging
Speckle and Coherence

Overview

A Sharma and A K Ghatak, Indian Institute of Technology, New Delhi, India
H C Kandpal, National Physical Laboratory, New Delhi, India

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Coherence refers to the characteristic of a wave that indicates whether different parts of the wave oscillate in tandem, or in a definite phase relationship. In other words, it refers to the degree of confidence with which one can predict the amplitude and phase of a wave at a point from the knowledge of these quantities at another point, at the same or a different instant of time. Emission from a thermal source, such as a light bulb, is a highly disordered process, and the emitted light is incoherent. A well-stabilized laser source, on the other hand, generates light in a highly ordered manner, and the emitted light is highly coherent. Incoherent and coherent light represent two extreme cases. While describing the phenomena of physical optics and diffraction theory, light is assumed to be perfectly coherent in both the spatial and temporal senses, whereas in radiometry it is generally assumed to be incoherent. However, practical light sources, and the fields generated by them, lie between the two extremes and are termed partially coherent sources and fields. The degree of order that exists in an optical field produced by a source of any kind may be described in terms of various correlation functions. These correlation functions are the basic theoretical

tools for the analysis of the statistical properties of partially coherent light fields.

Light fields generated by real physical sources fluctuate randomly to some extent. On a microscopic level, quantum mechanical fluctuations produce randomness, and on a macroscopic level randomness occurs as a consequence of these microscopic fluctuations, even in free space. In real physical sources, spontaneous emission causes random fluctuations, and even in the case of lasers, spontaneous emission cannot be suppressed completely. In addition to spontaneous emission, there are many other processes that give rise to random fluctuations of light fields. Optical coherence theory was developed to describe the random nature of light, and it deals with the statistical similarity between light fluctuations at two (or more) space–time points.

As mentioned earlier, in developing the theory of interference or diffraction, light is assumed to be perfectly coherent, or, in other words, it is taken to be monochromatic and sinusoidal for all times. This is, however, an idealization, since the wave is obviously generated at some point in time by an atomic or molecular transition. Furthermore, a wavetrain generated by such a transition is of finite duration, which is related to the finite lifetime of the atomic or molecular levels involved in the transition. Thus, any wave emanating from a source is an ensemble of a large number of such wavetrains of finite duration, say $\tau_c$. A simplified visualization of such an ensemble is shown in Figure 1, where a wave is shown as a series of wavetrains of duration $\tau_c$. It is evident from the figure that the fields at times $t$ and $t + \Delta t$ will have a definite phase relationship if $\Delta t \ll \tau_c$, and will have no, or negligible, phase relationship when $\Delta t \gg \tau_c$. The time $\tau_c$ is known as the coherence time of the



Figure 1 Typical variation of the radiation field with time. The coherence time is $\sim\tau_c$.

radiation, and the field is said to remain coherent for times $\sim\tau_c$. This property of waves is referred to as time coherence, or temporal coherence, and is related to the spectral purity of the radiation. If one obtains the spectrum of the wave shown in Figure 1 by taking the Fourier transform of its time variation, it has a width $\Delta\nu$ around $\nu_0$, the frequency of the sinusoidal variation of the individual wavetrains. The spectral width $\Delta\nu$ is related to the coherence time as

$$\Delta\nu \sim \frac{1}{\tau_c} \qquad [1]$$

For thermal sources such as a sodium lamp, $\tau_c \sim 100$ ps, whereas for a laser beam it can be as large as a few milliseconds. A related quantity is the coherence length $l_c$, which is the distance covered by the wave in time $\tau_c$:

$$l_c = c\tau_c \sim \frac{c}{\Delta\nu} = \frac{\lambda_0^2}{\Delta\lambda} \qquad [2]$$

where $\lambda_0$ is the central wavelength ($\lambda_0 = c/\nu_0$) and $\Delta\lambda$ is the wavelength spread corresponding to the spectral width $\Delta\nu$. In a two-beam interference experiment (e.g., the Michelson interferometer, Figure 2), the interfering beams derived from the same source will produce good interference fringes if the path difference between them is less than the coherence length of the radiation given out by the source. It must be added here that, for real fields generated by innumerable atoms or molecules, the individual wavetrains have different lengths around an average value $\tau_c$. Furthermore, several wavetrains in general propagate simultaneously, overlapping in space and time, to produce an ensemble whose properties are best understood in a statistical sense, as we shall see in later sections.
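Equations [1]–[4] lend themselves to quick order-of-magnitude estimates. The sketch below uses representative numbers (a 0.01 nm-wide sodium line, and a 1 mm source viewed from 1 m) that are our own illustrative choices, not data from the article:

```python
c = 3.0e8  # speed of light (m/s)

# temporal coherence, eqns [1]-[2]
lam0, dlam = 589e-9, 0.01e-9            # sodium line, assumed spectral width
l_c = lam0**2 / dlam                    # coherence length
tau_c = l_c / c                         # coherence time
print(f"tau_c ~ {tau_c:.1e} s, l_c ~ {l_c*100:.1f} cm")

# spatial coherence, eqns [3]-[4]
S, R = (1e-3)**2, 1.0                   # source area (m^2), observation distance (m)
dOmega_S = S / R**2                     # solid angle subtended by the source
A_c = lam0**2 / dOmega_S                # coherence area
print(f"A_c ~ {A_c:.1e} m^2, transverse coherence length ~ {A_c**0.5*1e3:.2f} mm")
```

The first estimate reproduces the $\tau_c \sim 100$ ps quoted above for a thermal sodium source.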

Figure 2 Michelson interferometer to study the temporal coherence of radiation from source S; M₁ and M₂ are mirrors and BS is a 50%–50% beamsplitter.

Another type of coherence associated with the fields is space coherence, or spatial coherence, which is related to the size of the source of radiation. It is evident that when the source is an ideal point source, the fields at any two points (within the coherence length) have a definite phase relationship. However, the field from a thermal source of finite area $S$ can be thought of as the resultant of the fields from each point on the source. Since each point source is usually independent of the others, the phase relationship between the fields at any two points depends on their positions and on the size of the source. It can be shown that two points will have a strong phase relationship if they lie within the solid angle $\Delta\Omega$ from the source (Figure 3) such that

$$\Delta\Omega \sim \frac{\lambda_0^2}{S} \qquad [3]$$

Thus, on a plane a distance $R$ away from the source, one can define an area $A_c = R^2\,\Delta\Omega$ over which the field remains spatially coherent. This area $A_c$ is called the coherence area of the radiation, and its square root is sometimes referred to as the transverse coherence length. It is trivial to show that

$$A_c \sim \frac{\lambda_0^2}{\Delta\Omega_S} \qquad [4]$$

where $\Delta\Omega_S$ is the solid angle subtended by the source at the plane in which the coherence area is defined. In Young's double-hole experiment, good contrast in the fringes is observed if the two pinholes lie within the coherence area of the radiation from the primary source. Combining the concepts of coherence length and coherence area, one can define a coherence volume as $V_c = A_c l_c$. For the wavefield from a thermal source, this volume represents the portion of space in which the field is coherent, and any



interference produced using the radiation from points within this volume will produce fringes of good contrast.

Figure 3 Spatial coherence and coherence area.

Mathematical Description of Coherence

Analytical Field Representation

Coherence properties associated with fields are best analyzed in terms of a complex representation of the optical fields. Let the real function $V^{(r)}(\mathbf r,t)$ represent the scalar field, which could be one of the transverse Cartesian components of the electric field associated with the electromagnetic wave. One can then define a complex analytic signal $V(\mathbf r,t)$ such that

$$V^{(r)}(\mathbf r,t) = \mathrm{Re}[V(\mathbf r,t)], \qquad V(\mathbf r,t) = \int_0^{\infty} \tilde V^{(r)}(\mathbf r,\nu)\,e^{-2\pi i\nu t}\,d\nu \qquad [5]$$

where the spectrum $\tilde V^{(r)}(\mathbf r,\nu)$ is the Fourier transform of the scalar field $V^{(r)}(\mathbf r,t)$, and the spectrum for negative frequencies has been suppressed, as it gives no new information since $\tilde V^{(r)*}(\mathbf r,\nu) = \tilde V^{(r)}(\mathbf r,-\nu)$.

In general, the radiation from a quasi-monochromatic thermal source fluctuates randomly, as it is made up of a large number of mutually independent contributions from individual atoms or molecules in the source. The field from such a source can be regarded as an ensemble of a large number of randomly different analytic signals such as $V(\mathbf r,t)$. In other words, $V(\mathbf r,t)$ is a typical member of an ensemble, which is the result of a random process representing the radiation from a quasi-monochromatic source. This process is assumed to be stationary and ergodic, so that the ensemble average is equal to the time average of a typical member of the ensemble and the origin of time is unimportant. Thus, the quantities of interest in the theory of coherence are defined as time averages:

$$\langle f(t)\rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} f(t)\,dt \qquad [6]$$

Mutual Coherence

In order to define the mutual coherence, the key concept in the theory of coherence, we consider Young's interference experiment, as shown in Figure 4, where the radiation from a broad source of size $S$ illuminates a screen with two pinholes $P_1$ and $P_2$. The light emerging from the two pinholes produces interference, which is observed on another screen at a distance $R$ from the first screen. The fields at point $P$ due to the pinholes would be $K_1 V(\mathbf r_1, t-t_1)$ and $K_2 V(\mathbf r_2, t-t_2)$, respectively, where $\mathbf r_1$ and $\mathbf r_2$ define the positions of $P_1$ and $P_2$, $t_1$ and $t_2$ are the times taken by the light to travel to $P$ from $P_1$ and $P_2$, and $K_1$ and $K_2$ are imaginary constants that depend on the geometry and size of the respective pinhole and its distance from the point $P$. Thus, the resultant field at point $P$ is

$$V(\mathbf r,t) = K_1 V(\mathbf r_1, t-t_1) + K_2 V(\mathbf r_2, t-t_2) \qquad [7]$$

Figure 4 Schematic of Young's double-hole experiment.

Since the optical periods are extremely small compared to the response time of a detector, a detector placed at point $P$ records only the time-averaged intensity:

$$I(P) = \langle V^{*}(\mathbf r,t)\,V(\mathbf r,t)\rangle \qquad [8]$$

where some constants have been ignored.
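The ensemble description above can be simulated directly: model the analytic signal $V(t)$ as a carrier at $\nu_0$ whose phase jumps randomly every $\tau_c$ (the wavetrain picture of Figure 1), and evaluate the degree of self-coherence $|\gamma(\tau)|$ as a time average in the spirit of eqns [6] and [8]. Everything in this sketch (carrier frequency, $\tau_c$, time grid) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
nu0, tau_c, dt = 5.0e14, 1.0e-13, 1.0e-16   # carrier (Hz), coherence time (s), step (s)
t = np.arange(0.0, 2.0e-11, dt)

# piecewise-constant random phase: a fresh value for every wavetrain of length tau_c
n_tr = int(tau_c / dt)
phase = np.repeat(rng.uniform(0, 2*np.pi, t.size // n_tr + 1), n_tr)[:t.size]
V = np.exp(-2j * np.pi * nu0 * t + 1j * phase)

def abs_gamma(k):
    """|gamma(tau)| for tau = k*dt, estimated as a time average."""
    return abs(np.mean(np.conj(V[:-k]) * V[k:]))

for k in (1, n_tr // 2, 5 * n_tr):
    print(f"tau = {k*dt:.1e} s   |gamma| ~ {abs_gamma(k):.2f}")
```

The printed values fall from near 1 at $\tau \ll \tau_c$ to near 0 at $\tau \gg \tau_c$, as the qualitative discussion of Figure 1 anticipates.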


With eqn [7], the intensity at point $P$ becomes

$$I(P) = I_1 + I_2 + 2\,\mathrm{Re}\{K_1 K_2^{*}\,\Gamma(\mathbf r_1,\mathbf r_2; t-t_1, t-t_2)\} \qquad [9]$$

where $I_1$ and $I_2$ are the intensities at point $P$ due to radiation from pinholes $P_1$ and $P_2$ independently (defined as $I_i = \langle|K_i V(\mathbf r_i,t)|^2\rangle$) and

$$\Gamma(\mathbf r_1,\mathbf r_2; t-t_1, t-t_2) \equiv \Gamma(\mathbf r_1,\mathbf r_2;\tau) = \langle V^{*}(\mathbf r_1,t)\,V(\mathbf r_2,t+\tau)\rangle \qquad [10]$$

is the mutual coherence function of the fields at $P_1$ and $P_2$; it depends on the time difference $\tau = t_2 - t_1$, since the random process is assumed to be stationary. This function is also sometimes denoted by $\Gamma_{12}(\tau)$. The function $\Gamma_{ii}(\tau) \equiv \Gamma(\mathbf r_i,\mathbf r_i;\tau)$ defines the self-coherence of the light from the pinhole at $P_i$, and $|K_i|^2\,\Gamma_{ii}(0)$ defines the intensity $I_i$ at point $P$ due to the light from pinhole $P_i$.

Degree of Coherence and the Visibility of Interference Fringes

One can define a normalized form of the mutual coherence function, namely:

$$\gamma_{12}(\tau) \equiv \gamma(\mathbf r_1,\mathbf r_2;\tau) = \frac{\Gamma_{12}(\tau)}{\sqrt{\Gamma_{11}(0)}\sqrt{\Gamma_{22}(0)}} = \frac{\langle V^{*}(\mathbf r_1,t)\,V(\mathbf r_2,t+\tau)\rangle}{[\langle|V(\mathbf r_1,t)|^2\rangle\langle|V(\mathbf r_2,t)|^2\rangle]^{1/2}} \qquad [11]$$

which is called the complex degree of coherence. It can be shown that $0 \le |\gamma_{12}(\tau)| \le 1$. The intensity at point $P$, given by eqn [9], can now be written as

$$I(P) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\mathrm{Re}\,\gamma(\mathbf r_1,\mathbf r_2;\tau) \qquad [12]$$

Expressing $\gamma(\mathbf r_1,\mathbf r_2;\tau)$ as

$$\gamma(\mathbf r_1,\mathbf r_2;\tau) = \lvert\gamma(\mathbf r_1,\mathbf r_2;\tau)\rvert \exp\left[i\alpha(\mathbf r_1,\mathbf r_2;\tau) - 2\pi i\nu_0\tau\right] \qquad [13]$$

where $\alpha(\mathbf r_1,\mathbf r_2;\tau) = \arg[\gamma(\mathbf r_1,\mathbf r_2;\tau)] + 2\pi\nu_0\tau$, and $\nu_0$ is the mean frequency of the light, the intensity in eqn [12] can be written as

$$I(P) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\lvert\gamma(\mathbf r_1,\mathbf r_2;\tau)\rvert\cos\left[\alpha(\mathbf r_1,\mathbf r_2;\tau) - 2\pi\nu_0\tau\right] \qquad [14]$$

Now, if we assume that the source is quasi-monochromatic, i.e., its spectral width $\Delta\nu \ll \nu_0$, the quantities $|\gamma(\mathbf r_1,\mathbf r_2;\tau)|$ and $\alpha(\mathbf r_1,\mathbf r_2;\tau)$ vary slowly across the observation screen, and the interference fringes are obtained mainly from the cosine term. Thus, defining the visibility of fringes as $\mathcal V = (I_{max} - I_{min})/(I_{max} + I_{min})$, we obtain

$$\mathcal V = \frac{2(I_1 I_2)^{1/2}}{I_1 + I_2}\,\lvert\gamma(\mathbf r_1,\mathbf r_2;\tau)\rvert \qquad [15]$$

which shows that for maximum visibility the two interfering fields must be completely coherent, $|\gamma(\mathbf r_1,\mathbf r_2;\tau)| = 1$. On the other hand, if the fields are completely incoherent, $|\gamma(\mathbf r_1,\mathbf r_2;\tau)| = 0$, no fringes are observed ($I_{max} = I_{min}$). The fields are said to be partially coherent when $0 < |\gamma(\mathbf r_1,\mathbf r_2;\tau)| < 1$. When $I_1 = I_2$, the visibility equals $|\gamma(\mathbf r_1,\mathbf r_2;\tau)|$. The relation in eqn [15] shows that in an interference experiment one can obtain the modulus of the complex degree of coherence by measuring $I_1$, $I_2$, and the visibility. Similarly, eqn [14] shows that from the measured positions of the maxima one can obtain the phase of the complex degree of coherence, $\alpha(\mathbf r_1,\mathbf r_2;\tau)$.

Temporal and Spatial Coherence

If the source illuminating the pinholes is a point source of finite spectral width situated at point $Q$, the fields at points $P_1$ and $P_2$ (Figure 4) at any given instant are the same, and the mutual coherence function becomes

$$\Gamma_{11}(\tau) = \Gamma(\mathbf r_1,\mathbf r_1;\tau) = \langle V^{*}(\mathbf r_1,t)V(\mathbf r_1,t+\tau)\rangle = \langle V^{*}(\mathbf r_2,t)V(\mathbf r_2,t+\tau)\rangle = \Gamma_{22}(\tau) \qquad [16]$$

The self-coherence, $\Gamma_{11}(\tau)$, of the light from pinhole $P_1$ is then a direct measure of the temporal coherence of the source. On the other hand, if the source is of finite size and we observe the interference at the point $O$ corresponding to $\tau = 0$, the mutual coherence function becomes

$$\Gamma_{12}(0) = \Gamma(\mathbf r_1,\mathbf r_2;0) = \langle V^{*}(\mathbf r_1,t)\,V(\mathbf r_2,t)\rangle \equiv J_{12} \qquad [17]$$

which is called the mutual intensity and is a direct measure of the spatial coherence of the source. In general, however, the function $\Gamma_{12}(\tau)$ measures, for a source of finite size and spectral width, a combination of temporal and spatial coherence, and only in some limiting cases can the two types of coherence be separated.

Spectral Representation of Mutual Coherence

One can also analyze the correlation between two fields in the spectral domain. In particular, one can define the cross-spectral density function $W(\mathbf r_1,\mathbf r_2;\nu)$, which describes the correlation between the amplitudes of the spectral components of frequency $\nu$ of the light



at the points $P_1$ and $P_2$. Thus:

$$W(\mathbf r_1,\mathbf r_2;\nu)\,\delta(\nu-\nu') = \langle \tilde V^{*}(\mathbf r_1,\nu)\,\tilde V(\mathbf r_2,\nu')\rangle \qquad [18]$$

Using the generalized Wiener–Khintchine theorem, the cross-spectral density function can be shown to be the Fourier transform of the mutual coherence function:

$$\Gamma(\mathbf r_1,\mathbf r_2;\tau) = \int_0^{\infty} W(\mathbf r_1,\mathbf r_2;\nu)\,e^{-2\pi i\nu\tau}\,d\nu \qquad [19]$$

$$W(\mathbf r_1,\mathbf r_2;\nu) = \int_{-\infty}^{\infty} \Gamma(\mathbf r_1,\mathbf r_2;\tau)\,e^{2\pi i\nu\tau}\,d\tau \qquad [20]$$

If the two points $P_1$ and $P_2$ coincide (i.e., there is only one pinhole), the cross-spectral density function reduces to the spectral density function of the light, which we denote by $S(\mathbf r,\nu)$ $[\equiv W(\mathbf r,\mathbf r;\nu)]$. Thus, it follows from eqn [20] that the spectral density of the light is the inverse Fourier transform of the self-coherence function. This leads to Fourier transform spectroscopy, as we shall see later. One can also define the spectral degree of coherence at frequency $\nu$ as

$$\mu(\mathbf r_1,\mathbf r_2;\nu) = \frac{W(\mathbf r_1,\mathbf r_2;\nu)}{\sqrt{W(\mathbf r_1,\mathbf r_1;\nu)\,W(\mathbf r_2,\mathbf r_2;\nu)}} = \frac{W(\mathbf r_1,\mathbf r_2;\nu)}{\sqrt{S(\mathbf r_1,\nu)\,S(\mathbf r_2,\nu)}} \qquad [21]$$

It is easy to see that $0 \le |\mu(\mathbf r_1,\mathbf r_2;\nu)| \le 1$. It is also sometimes referred to as the complex degree of coherence at frequency $\nu$. It may be noted that in the literature the notation $G^{(1,1)}(\nu) \equiv G(\mathbf r_1,\mathbf r_2;\nu)$ has also been used for $W(\mathbf r_1,\mathbf r_2;\nu)$.

Propagation of Coherence

The van Cittert–Zernike Theorem

Perfectly coherent waves propagate according to diffraction formulae, which are discussed elsewhere in this volume. However, incoherent and partially coherent waves evidently propagate somewhat differently. The earliest treatment of the propagation of noncoherent light is due to van Cittert, which was later generalized by Zernike to obtain what is now the van Cittert–Zernike theorem. The theorem deals with the correlations developed between the fields at two points, generated by a quasi-monochromatic and spatially incoherent planar source. Thus, as shown in Figure 5, we consider a quasi-monochromatic ($\Delta\nu \ll \nu_0$) planar source $\sigma$, which has an intensity distribution $I(\mathbf r')$ on its plane and is spatially incoherent, i.e., there is no correlation between the fields at any two points on the source.

Figure 5 Geometry for the van Cittert–Zernike theorem.

The field due to this source would develop finite correlations after propagation, and the theorem states that

$$\gamma(\mathbf r_1,\mathbf r_2;0) = \frac{\Gamma_{12}(0)}{\sqrt{\Gamma_{11}(0)}\sqrt{\Gamma_{22}(0)}} = \frac{1}{\sqrt{I(\mathbf r_1)I(\mathbf r_2)}}\iint_\sigma I(\mathbf r')\,\frac{e^{2\pi i\nu_0(R_2-R_1)/c}}{R_1 R_2}\,d^2 r' \qquad [22]$$

where $R_i = |\mathbf r_i - \mathbf r'|$ and $I(\mathbf r_i)$ is the intensity at point $P_i$. This relation is similar to the diffraction pattern produced at point $P_1$ by a wave with a spherical wavefront converging towards $P_2$ and with an amplitude distribution $I(\mathbf r')$, when it is diffracted by an aperture of the same shape, size, and intensity distribution as those of the incoherent source $\sigma$. Thus, the theorem shows that radiation which was incoherent at the source becomes partially coherent as it propagates.

Generalized Propagation

The expression $e^{-2\pi i\nu_0 R_1/c}/R_1$ can be interpreted as the field obtained at $P_1$ due to a point source located at the point $\mathbf r'$ on the planar source. Thus, this expression is simply the point spread function of the homogeneous space between the source and the observation plane containing the points $P_1$ and $P_2$. Hopkins generalized this to include any linear optical system characterized by a point spread function $h(\mathbf r_j - \mathbf r')$, and obtained the formula for the complex degree of coherence of radiation emerging from an incoherent, quasi-monochromatic planar source after it has propagated through such a linear optical system:

$$\gamma(\mathbf r_1,\mathbf r_2;0) = \frac{1}{\sqrt{I(\mathbf r_1)I(\mathbf r_2)}}\iint_\sigma I(\mathbf r')\,h(\mathbf r_1-\mathbf r')\,h^{*}(\mathbf r_2-\mathbf r')\,d^2 r' \qquad [23]$$

It would thus seem that the correlations propagate in much the same way as does the field. Indeed, Wolf showed that the mutual coherence function


$\Gamma(\mathbf r_1,\mathbf r_2;\tau)$ satisfies the wave equations:

$$\nabla_1^2\,\Gamma(\mathbf r_1,\mathbf r_2;\tau) = \frac{1}{c^2}\frac{\partial^2 \Gamma(\mathbf r_1,\mathbf r_2;\tau)}{\partial \tau^2}, \qquad \nabla_2^2\,\Gamma(\mathbf r_1,\mathbf r_2;\tau) = \frac{1}{c^2}\frac{\partial^2 \Gamma(\mathbf r_1,\mathbf r_2;\tau)}{\partial \tau^2} \qquad [24]$$

where $\nabla_j^2$ is the Laplacian with respect to the point $\mathbf r_j$. Here the first of eqns [24], for instance, gives the variation of the mutual coherence function with respect to $\mathbf r_1$ and $\tau$ for fixed $\mathbf r_2$. Further, the variable $\tau$ is the time difference defined through the path difference, since $\tau = (R_2 - R_1)/c$, and the actual time does not affect the mutual coherence function (as the fields are assumed to be stationary). Using the relation in eqn [20], one can also obtain from eqns [24] the propagation equations for the cross-spectral density function $W(\mathbf r_1,\mathbf r_2;\nu)$:

$$\nabla_1^2 W(\mathbf r_1,\mathbf r_2;\nu) + k^2 W(\mathbf r_1,\mathbf r_2;\nu) = 0, \qquad \nabla_2^2 W(\mathbf r_1,\mathbf r_2;\nu) + k^2 W(\mathbf r_1,\mathbf r_2;\nu) = 0 \qquad [25]$$

where $k = 2\pi\nu/c$.

Thompson and Wolf Experiment

One of the most elegant experiments for studying various aspects of coherence theory was carried out by Thompson and Wolf, by making slight modifications to Young's double-hole experiment. The experimental setup, shown in Figure 6, consists of a quasi-monochromatic broad incoherent source S of diameter 2a. This was obtained by focusing filtered narrowband light from a mercury lamp (not shown in the figure) onto a hole of size 2a in an opaque screen A. A mask consisting of two pinholes, each of diameter 2b, with their axes separated by a distance d, was placed symmetrically about the optical axis of the setup at plane B, between two lenses L₁ and L₂, each of focal length f. The source was at the front focal plane of the lens L₁, and the observations were made in the plane C at the back focus of the lens L₂. The separation d was varied to study the spatial coherence function in the mask


plane by measuring the visibility and the phase of the fringes formed in plane C. Using the van Cittert–Zernike theorem, the complex degree of coherence at the pinholes $P_1$ and $P_2$ on the mask after the lens L₁ is

$$\gamma_{12} = \lvert\gamma_{12}\rvert e^{i\beta_{12}} = \frac{2J_1(v)}{v}, \qquad \text{with } v = \frac{2\pi\nu a d}{c f} \qquad [26]$$

for pinholes placed symmetrically about the optical axis. Here $\beta_{12}$ is the phase of the complex degree of coherence; in this special case, where the two holes are equidistant from the axis, $\beta_{12}$ is either zero or $\pi$, for positive or negative values of $2J_1(v)/v$, respectively.

Let us assume that the intensities at the two pinholes $P_1$ and $P_2$ are equal. The interference pattern observed at the back focal plane of lens L₂ is due to the superposition of the light diffracted from the pinholes. The beams are partially coherent, with degree of coherence given by eqn [26]. Since the pinholes are symmetrically placed, the intensity due to either of the pinholes at a point $P$ at the back focal plane of the lens L₂ is the same and is given by the Fraunhofer formula for diffraction from circular apertures, i.e.:

$$I_1(P) = I_2(P) = \left\lvert\frac{2J_1(u)}{u}\right\rvert^2, \qquad \text{with } u = \frac{2\pi\nu}{c}\,b\sin\phi \qquad [27]$$

where $\phi$ is the angle that the diffracted beam makes with the normal to the plane of the pinholes. The intensity of the interference pattern produced at the back focal plane of the lens L₂ is

$$I(P) = 2I_1(P)\left[1 + \left\lvert\frac{2J_1(v)}{v}\right\rvert\cos(\delta + \beta_{12})\right] \qquad [28]$$

where $\delta = (2\pi\nu/c)\,d\sin\phi$ is the phase difference between the two beams reaching $P$ from $P_1$ and $P_2$. For the on-axis point $O$, the quantity $\delta$ is zero. The maximum and minimum values of $I(P)$ are given by

$$I_{max}(P) = 2I_1(P)\left[1 + \lvert 2J_1(v)/v\rvert\right] \qquad [29a]$$

$$I_{min}(P) = 2I_1(P)\left[1 - \lvert 2J_1(v)/v\rvert\right] \qquad [29b]$$

Figure 7 shows an example of the observed fringe patterns obtained by Thompson (1958) in a subsequent experiment.
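Equation [26] makes such patterns quantitative: for equal pinhole intensities the fringe visibility equals $|2J_1(v)/v|$ (eqns [15] and [29]), and the sign of $2J_1(v)/v$ fixes the phase $\beta_{12}$ at 0 or $\pi$. A short numerical sketch; the focal length and source diameters below are illustrative stand-ins, not the experiment's actual values:

```python
import numpy as np
from scipy.special import j1   # Bessel function of the first kind, order 1

lam = 0.579e-6                 # wavelength used by Thompson (m)
f = 1.5                        # focal length of L1 (m), assumed
d = 0.5e-2                     # pinhole separation (m), as in Figure 7

for two_a_mm in (0.10, 0.20, 0.25):   # illustrative source diameters (mm)
    a = 0.5 * two_a_mm * 1e-3
    v = 2.0 * np.pi * a * d / (lam * f)   # v = 2*pi*nu*a*d/(c*f), eqn [26]
    g = 2.0 * j1(v) / v                   # gamma_12 (real, sign included)
    beta12 = 0.0 if g >= 0 else np.pi
    print(f"2a = {two_a_mm:4.2f} mm   visibility = {abs(g):.2f}   "
          f"beta12 = {beta12:.2f} rad")
```

Enlarging the source drives $|γ_{12}|$ through zero and flips $\beta_{12}$ to $\pi$, which is the phase change illustrated in Figure 7.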

Figure 6 Schematic of the Thompson and Wolf experiment.

Types of Fields

As mentioned above, $\gamma_{12}(\tau)$ is a measure of the correlation of the complex fields at any two points $P_1$



Figure 7 Two-beam interference patterns obtained using partially coherent light. The wavelength of the light used was 0.579 µm and the separation d of the pinholes in the screen at B was 0.5 cm. The figure shows the observed fringe pattern and the calculated intensity variation for three sizes of the secondary source. The dashed lines show the maximum and minimum intensity. The values of the diameter, 2a, of the source and the corresponding values of the magnitude, |γ₁₂|, and the phase, β₁₂, are also shown in each case. Reproduced with permission from Thompson BJ (1958) Illustration of phase change in two-beam interference with partially coherent light. Journal of the Optical Society of America 48: 95–97.

and $P_2$ at a specific time delay $\tau$. Extending this definition, an optical field may be called coherent or incoherent if $|\gamma_{12}(\tau)| = 1$ or $|\gamma_{12}(\tau)| = 0$, respectively, for all pairs of points in the field and for all time delays. In the following, we consider some specific cases of fields and their properties.

Perfectly Coherent Fields

A field is termed perfectly coherent, or self-coherent, at a fixed point if it has the property that $|\gamma(\mathbf R,\mathbf R;\tau)| = 1$ at some specific point $\mathbf R$ in the domain of the field for all values of the time delay $\tau$. It can also be shown that $\gamma(\mathbf R,\mathbf R;\tau)$ is periodic in time, i.e.:

$$\gamma(\mathbf R,\mathbf R;\tau) = e^{-2\pi i\nu_0\tau} \quad \text{for } \nu_0 > 0 \qquad [30]$$

For any other point $\mathbf r$ in the domain of the field, $\gamma(\mathbf R,\mathbf r;\tau)$ and $\gamma(\mathbf r,\mathbf R;\tau)$ are also unimodular and periodic in time.

A perfectly coherent optical field at two fixed points has the property that $|\gamma(\mathbf R_1,\mathbf R_2;\tau)| = 1$ for two fixed points $\mathbf R_1$ and $\mathbf R_2$ in the domain of the field, for all values of the time delay $\tau$. For such a field, it can be shown that $\gamma(\mathbf R_1,\mathbf R_2;\tau) = \exp[i(\beta - 2\pi\nu_0\tau)]$, where $\nu_0\,(> 0)$ and $\beta$ are real constants. Further, as a corollary, $|\gamma(\mathbf R_1,\mathbf R_1;\tau)| = 1$ and $|\gamma(\mathbf R_2,\mathbf R_2;\tau)| = 1$ for all $\tau$, i.e., the field is self-coherent at each of the field points $\mathbf R_1$ and $\mathbf R_2$, and

$$\gamma(\mathbf R_1,\mathbf R_1;\tau) = \gamma(\mathbf R_2,\mathbf R_2;\tau) = \exp(-2\pi i\nu_0\tau) \quad \text{for } \nu_0 > 0$$

It can be shown that for any other point $\mathbf r$ within the field

$$\gamma(\mathbf r,\mathbf R_2;\tau) = \gamma(\mathbf R_1,\mathbf R_2;0)\,\gamma(\mathbf r,\mathbf R_1;\tau) = \gamma(\mathbf r,\mathbf R_1;\tau)\,e^{i\beta}, \qquad \gamma(\mathbf R_1,\mathbf R_2;\tau) = \gamma(\mathbf R_1,\mathbf R_2;0)\exp(-2\pi i\nu_0\tau) \qquad [31]$$

A perfectly coherent optical field for all points in the domain of the field has the property that $|\gamma(\mathbf r_1,\mathbf r_2;\tau)| = 1$ for all pairs of points $\mathbf r_1$ and $\mathbf r_2$ in the domain of the field, for all values of the time delay $\tau$. It can then be shown that $\gamma(\mathbf r_1,\mathbf r_2;\tau) = \exp\{i[\alpha(\mathbf r_1) - \alpha(\mathbf r_2)] - 2\pi i\nu_0\tau\}$, where $\alpha(\mathbf r)$ is a real function of a single position $\mathbf r$ in the domain of the field. Further, the mutual coherence function $\Gamma(\mathbf r_1,\mathbf r_2;\tau)$ for such a field has a factorized periodic form:

$$\Gamma(\mathbf r_1,\mathbf r_2;\tau) = U(\mathbf r_1)U^{*}(\mathbf r_2)\exp(-2\pi i\nu_0\tau) \qquad [32]$$

which means the field is monochromatic with frequency $\nu_0$, and $U(\mathbf r)$ is any solution of the Helmholtz equation:

$$\nabla^2 U(\mathbf r) + k_0^2 U(\mathbf r) = 0, \qquad k_0 = 2\pi\nu_0/c \qquad [33]$$

The spectral and cross-spectral densities of such fields are represented by Dirac $\delta$-functions with their singularities at some positive frequency $\nu_0$. Such fields, however, can never occur in nature.

Quasi-Monochromatic Fields

Optical fields are found in practice for which the spectral spread $\Delta\nu$ is much smaller than the mean frequency $\bar{\nu}$. These are termed quasi-monochromatic fields. The cross-spectral density $W(\mathbf{r}_1,\mathbf{r}_2,\nu)$ of a quasi-monochromatic field attains an appreciable value only in a small interval $\Delta\nu$ around $\bar{\nu}$, and falls to zero outside this interval:

$W(\mathbf{r}_1,\mathbf{r}_2,\nu) = 0$ for $|\nu - \bar{\nu}| > \Delta\nu$, with $\Delta\nu \ll \bar{\nu}$   [34]

The mutual coherence function $\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau)$ can be written as the Fourier transform of the cross-spectral density function as

$\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau) = e^{-2\pi i\bar{\nu}\tau}\displaystyle\int_0^\infty W(\mathbf{r}_1,\mathbf{r}_2,\nu)\, e^{-2\pi i(\nu-\bar{\nu})\tau}\, d\nu$   [35]

If we limit our attention to small $\tau$ such that $\Delta\nu|\tau| \ll 1$ holds, the exponential factor inside the integral is approximately equal to unity and eqn [35] reduces to

$\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau) = \exp(-2\pi i\bar{\nu}\tau)\displaystyle\int_0^\infty W(\mathbf{r}_1,\mathbf{r}_2,\nu)\, d\nu$   [36]

which gives

$\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau) = \Gamma(\mathbf{r}_1,\mathbf{r}_2,0)\exp(-2\pi i\bar{\nu}\tau)$   [37]

Equation [37] describes the behavior of $\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau)$ for a limited range of $\tau$ values for quasi-monochromatic fields, and in this range the field behaves as a monochromatic field of frequency $\bar{\nu}$. However, owing to the factor $\Gamma(\mathbf{r}_1,\mathbf{r}_2,0)$, a quasi-monochromatic field may be coherent, partially coherent, or even incoherent.

Cross-Spectrally Pure Fields

The complex degree of coherence is called reducible if it can be factored into a product of a component dependent on the spatial coordinates and a component dependent on the time delay. In the case of perfectly coherent light, the complex degree of coherence is reducible, as we have seen above, e.g., in eqn [32], and in the case of quasi-monochromatic fields it is approximately reducible, as shown in eqn [37]. There also exists the very special case of a cross-spectrally pure field, for which the complex degree of coherence is reducible. A field is called cross-spectrally pure if the normalized spectrum of the superposed light is equal to the normalized spectrum of the component beams, a concept introduced by Mandel. In the space-frequency domain, the intensity interference law is expressed as the so-called spectral interference law:

$S(\mathbf{r},\nu) = S^{(1)}(\mathbf{r},\nu) + S^{(2)}(\mathbf{r},\nu) + 2\sqrt{S^{(1)}(\mathbf{r},\nu)}\sqrt{S^{(2)}(\mathbf{r},\nu)}\,\mathrm{Re}\left[\mu(\mathbf{r}_1,\mathbf{r}_2,\nu)\, e^{-2\pi i\nu\tau}\right]$   [38]

where $\mu(\mathbf{r}_1,\mathbf{r}_2,\nu)$ is the spectral degree of coherence, defined in eqn [21], and $\tau$ is the relative time delay needed by the light from the pinholes to reach a point on the screen; $S^{(1)}(\mathbf{r},\nu)$ and $S^{(2)}(\mathbf{r},\nu)$ are the spectral densities of the light reaching $P$ from the pinholes $P_1$ and $P_2$, respectively (see Figure 4), and it is assumed that the spectral densities of the field at the two pinholes are proportional to one another $[S(\mathbf{r}_2,\nu) = C\,S(\mathbf{r}_1,\nu)]$. Now, if we consider a point for which the time delay is $\tau_0$, it can be seen that the last term in eqn [38] will be independent of frequency, provided that

$\mu(\mathbf{r}_1,\mathbf{r}_2,\nu)\exp(-2\pi i\nu\tau_0) = f(\mathbf{r}_1,\mathbf{r}_2,\tau_0)$   [39]

where $f(\mathbf{r}_1,\mathbf{r}_2,\tau_0)$ is a function of $\mathbf{r}_1$, $\mathbf{r}_2$ and $\tau_0$ only, and the light at this point has the same spectral density as that at the pinholes. If a region exists around the specified point on the observation plane, such that the spectral distribution of the light in this region is of the same form as the spectral distribution of the light at the pinholes, the light at the pinholes is cross-spectrally pure. In terms of the spectral distribution $S(\mathbf{r}_1,\nu)$ of the light at pinhole $P_1$ and $S(\mathbf{r}_2,\nu)\,[= C\,S(\mathbf{r}_1,\nu)]$ at pinhole $P_2$, the mutual coherence function at the pinholes can be written as

$\Gamma(\mathbf{r}_1,\mathbf{r}_2,\tau) = \sqrt{C}\displaystyle\int S(\mathbf{r}_1,\nu)\,\mu(\mathbf{r}_1,\mathbf{r}_2,\nu)\exp(-2\pi i\nu\tau)\, d\nu$   [40]

and using eqn [39], we get the very important condition for the field to be cross-spectrally pure, i.e.:

$\gamma(\mathbf{r}_1,\mathbf{r}_2,\tau) = \gamma(\mathbf{r}_1,\mathbf{r}_2,\tau_0)\,\gamma(\mathbf{r}_1,\mathbf{r}_1,\tau-\tau_0)$   [41]


The complex degree of coherence $\gamma(\mathbf{r}_1,\mathbf{r}_2,\tau)$ is thus expressible as the product of two factors: one factor characterizes the spatial coherence at the two pinholes at time delay $\tau_0$, and the other characterizes the temporal coherence at one of the pinholes. Equation [41] is known as the reduction formula for cross-spectrally pure light. It can further be shown that

$\mu(\mathbf{r}_1,\mathbf{r}_2,\nu) = \gamma(\mathbf{r}_1,\mathbf{r}_2,\tau_0)\exp(2\pi i\nu\tau_0)$   [42]

Thus, the absolute value of the spectral degree of coherence is the same for all frequencies and is equal to the absolute value of the degree of coherence at the point specified by $\tau_0$. It has been shown that cross-spectrally pure light can be generated, for example, by linear filtering of the light that emerges from the two pinholes in Young's interference experiment.

Types of Sources

Primary Sources

A primary source is a set of radiating atoms or molecules. In a primary source the randomness comes from true source fluctuations, i.e., from the spatial distributions of fluctuating charges and currents. Such a source gives rise to a fluctuating field. Let $Q(\mathbf{r},t)$ represent the fluctuating source variable at any point $\mathbf{r}$ at time $t$; the field generated by the source is then represented by the fluctuating field variable $V(\mathbf{r},t)$. The source is assumed to be localized in some finite domain, so that $Q(\mathbf{r},t) = 0$ at any time $t$ at points outside the domain. Assuming that the field variable $V(\mathbf{r},t)$ and the source variable $Q(\mathbf{r},t)$ are scalar quantities, they are related by the inhomogeneous wave equation

$\nabla^2 V(\mathbf{r},t) - \dfrac{1}{c^2}\dfrac{\partial^2 V(\mathbf{r},t)}{\partial t^2} = -4\pi Q(\mathbf{r},t)$   [43]

The mutual coherence functions of the source, $\Gamma_Q(\mathbf{r}_1,\mathbf{r}_2,\tau) = \langle Q^*(\mathbf{r}_1,t)\,Q(\mathbf{r}_2,t+\tau)\rangle$, and of the field, $\Gamma_V(\mathbf{r}_1,\mathbf{r}_2,\tau) = \langle V^*(\mathbf{r}_1,t)\,V(\mathbf{r}_2,t+\tau)\rangle$, characterize the statistical similarity of the fluctuating quantities at the points $\mathbf{r}_1$ and $\mathbf{r}_2$. Following eqn [20], one can define the cross-spectral density functions of the source and of the field, respectively, as

$W_Q(\mathbf{r}_1,\mathbf{r}_2,\nu) = \displaystyle\int_{-\infty}^{\infty} \Gamma_Q(\mathbf{r}_1,\mathbf{r}_2,\tau)\, e^{-2\pi i\nu\tau}\, d\tau, \quad W_V(\mathbf{r}_1,\mathbf{r}_2,\nu) = \displaystyle\int_{-\infty}^{\infty} \Gamma_V(\mathbf{r}_1,\mathbf{r}_2,\tau)\, e^{-2\pi i\nu\tau}\, d\tau$   [44]

The cross-spectral density functions of the source and of the field are related as

$(\nabla_2^2 + k^2)(\nabla_1^2 + k^2)\, W_V(\mathbf{r}_1,\mathbf{r}_2,\nu) = (4\pi)^2\, W_Q(\mathbf{r}_1,\mathbf{r}_2,\nu)$   [45]

The solution of eqn [45] is represented as

$W_V(\mathbf{r}_1,\mathbf{r}_2,\nu) = \displaystyle\int_S\int_S W_Q(\mathbf{r}'_1,\mathbf{r}'_2,\nu)\, \dfrac{e^{ik(R_2-R_1)}}{R_1 R_2}\, d^3 r'_1\, d^3 r'_2$   [46]

where $R_1 = |\mathbf{r}_1 - \mathbf{r}'_1|$ and $R_2 = |\mathbf{r}_2 - \mathbf{r}'_2|$ (see Figure 8). Using eqn [46], one can then obtain an expression for the spectrum at a point ($\mathbf{r}_1 = \mathbf{r}_2 = \mathbf{r} = r\mathbf{u}$) in the far field ($r \gg r'_1, r'_2$) of a source as

$S_\infty(r\mathbf{u},\nu) = \dfrac{1}{r^2}\displaystyle\int_S\int_S W_Q(\mathbf{r}'_1,\mathbf{r}'_2,\nu)\, e^{-ik\mathbf{u}\cdot(\mathbf{r}'_1-\mathbf{r}'_2)}\, d^3 r'_1\, d^3 r'_2$   [47]

where $\mathbf{u}$ is the unit vector along $\mathbf{r}$. The integral in eqn [47], i.e., the quantity defined by $r^2 S_\infty(r\mathbf{u},\nu)$, is also defined as the radiant intensity, which represents the rate of energy radiated at the frequency $\nu$ from the source per unit solid angle in the direction of $\mathbf{u}$.

Figure 8 Geometry of a three-dimensional primary source S and the radiation from it.

Secondary Sources

Sources used in a laboratory are usually secondary planar sources. A secondary source is a field that arises from a primary source in the region outside the domain of the primary source. Such a source is typically an aperture in an opaque screen, illuminated either directly or via an optical system by primary sources. Let $V(\mathbf{r},t)$ represent the fluctuating field in a secondary source plane $\sigma$ at $z = 0$ (Figure 9), and let $W_0(\mathbf{r}_1,\mathbf{r}_2,\nu)$ represent its cross-spectral density (the subscript 0 refers to $z = 0$). One can then solve eqn [25] to obtain the propagation of the cross-spectral density from this planar source. For two points $P_1$ and $P_2$ located at distances that are large compared to the wavelength, the cross-spectral density is


then given by

$W(\mathbf{r}_1,\mathbf{r}_2,\nu) = \left(\dfrac{k}{2\pi}\right)^2 \displaystyle\int_\sigma\int_\sigma W_0(\mathbf{r}'_1,\mathbf{r}'_2,\nu)\, \dfrac{e^{ik(R_2-R_1)}}{R_1 R_2}\,\cos\theta'_1\cos\theta'_2\, d^2 r'_1\, d^2 r'_2$   [48]

where $R_j = |\mathbf{r}_j - \mathbf{r}'_j|\ (j = 1,2)$, and $\theta'_1$ and $\theta'_2$ are the angles that the $R_1$ and $R_2$ directions make with the $z$-axis (Figure 9). Using eqn [48], one can then obtain an expression for the spectrum at a point ($\mathbf{r}_1 = \mathbf{r}_2 = \mathbf{r} = r\mathbf{u}$) in the far field ($r \gg r'_1, r'_2$) of a planar source as

$S_\infty(r\mathbf{u},\nu) = \dfrac{k^2\cos^2\theta}{(2\pi r)^2} \displaystyle\int_\sigma\int_\sigma W_0(\mathbf{r}'_1,\mathbf{r}'_2,\nu)\, e^{-ik\mathbf{u}_\perp\cdot(\mathbf{r}'_1-\mathbf{r}'_2)}\, d^2 r'_1\, d^2 r'_2$   [49]

where $\mathbf{u}_\perp$ is the projection of the unit vector $\mathbf{u}$ on the plane $\sigma$ of the source.

Figure 9 Geometry of a planar source $\sigma$ and the radiation from it.

Schell-Model Sources

In the framework of coherence theory in the space-time domain, two-dimensional planar model sources of this kind were first discussed by Schell. Later, the model was adopted for the formulation of coherence theory in the space-frequency domain. Schell-model sources are sources whose degree of spectral coherence $\mu_A(\mathbf{r}_1,\mathbf{r}_2,\nu)$ (for either a primary or a secondary source) is stationary in space. This means that $\mu_A(\mathbf{r}_1,\mathbf{r}_2,\nu)$ depends on $\mathbf{r}_1$ and $\mathbf{r}_2$ only through the difference $\mathbf{r}_2 - \mathbf{r}_1$, i.e., it is of the form

$\mu_A(\mathbf{r}_1,\mathbf{r}_2,\nu) \equiv \mu_A(\mathbf{r}_2-\mathbf{r}_1,\nu)$   [50]

for each frequency $\nu$ present in the source spectrum. Here $A$ stands for the field variable $V$, in the case of a Schell-model secondary source, and for the source variable $Q$, in the case of a Schell-model primary source. The cross-spectral density function of a Schell-model source is of the form

$W_A(\mathbf{r}_1,\mathbf{r}_2,\nu) = [S_A(\mathbf{r}_1,\nu)]^{1/2}\,[S_A(\mathbf{r}_2,\nu)]^{1/2}\,\mu_A(\mathbf{r}_2-\mathbf{r}_1,\nu)$   [51]

where $S_A(\mathbf{r},\nu)$ is the spectral density of the light at a typical point in the primary source or on the plane of a secondary source. Schell-model sources do not assume low coherence and can therefore be applied to spatially stationary light fields of any state of coherence. The Schell model of the form given in eqn [51] has been used to represent both three-dimensional primary sources and two-dimensional secondary sources.

Quasi-Homogeneous Sources

Useful models of partially coherent sources that are frequently encountered in nature or in the laboratory are the so-called quasi-homogeneous sources. These are an important subclass of Schell-model sources. A Schell-model source is called quasi-homogeneous if its intensity is essentially constant over any coherence area. Under this approximation, the cross-spectral density function of a quasi-homogeneous source is given by

$W_A(\mathbf{r}_1,\mathbf{r}_2,\nu) = S_A\!\left(\tfrac{1}{2}(\mathbf{r}_1+\mathbf{r}_2),\nu\right)\mu_A(\mathbf{r}_2-\mathbf{r}_1,\nu) = S_A(\bar{\mathbf{r}},\nu)\,\mu_A(\mathbf{r}',\nu)$   [52]

where $\bar{\mathbf{r}} = (\mathbf{r}_1+\mathbf{r}_2)/2$ and $\mathbf{r}' = \mathbf{r}_2-\mathbf{r}_1$. The subscript $A$ stands for either $V$ or $Q$, for the field variable or the source variable, respectively. It is clear that for a quasi-homogeneous source the spectral density $S_A(\bar{\mathbf{r}},\nu)$ varies so slowly with position that it is approximately constant over distances across the source that are of the order of the correlation length $\Delta$, which is a measure of the effective width of $|\mu_A(\mathbf{r}',\nu)|$. Therefore, $S_A(\bar{\mathbf{r}},\nu)$ is a slowly varying function of $\bar{\mathbf{r}}$ (Figure 10b) and $|\mu_A(\mathbf{r}',\nu)|$ is a fast varying function of $\mathbf{r}'$ (Figure 10a). In addition, the linear dimensions of the source are large compared with the wavelength of light and with the correlation length $\Delta$ (Figure 10c). Quasi-homogeneous sources are always spatially incoherent in the 'global' sense, because their linear dimensions are large compared with the correlation length. This model is very good for representing two-dimensional secondary sources with sufficiently low coherence that the intensity does not vary over a coherence area on the input plane. It has also been applied to three-dimensional primary sources, three-dimensional scattering potentials, and two-dimensional primary and secondary sources.

Figure 10 Concept of quasi-homogeneous sources.
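As a concrete check on the quasi-homogeneous approximation, the following sketch compares eqn [51] with eqn [52] for a hypothetical one-dimensional Gaussian Schell-model source; the Gaussian forms and all parameter values are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Sketch of eqns [51]-[52]: cross-spectral density of a Gaussian Schell-model
# source versus its quasi-homogeneous approximation, at a fixed frequency.
# Assumed: intensity width sigma_S far exceeds coherence width sigma_mu.
sigma_S, sigma_mu = 1.0e-3, 5.0e-5         # widths in meters (hypothetical)

def S(r):                                   # spectral density S_A(r, nu)
    return np.exp(-r**2 / (2 * sigma_S**2))

def mu(dr):                                 # degree of coherence mu_A(r2 - r1, nu)
    return np.exp(-dr**2 / (2 * sigma_mu**2))

def W_schell(r1, r2):                       # eqn [51]
    return np.sqrt(S(r1)) * np.sqrt(S(r2)) * mu(r2 - r1)

def W_quasihomog(r1, r2):                   # eqn [52]
    return S((r1 + r2) / 2) * mu(r2 - r1)

r1, r2 = 2.0e-4, 2.4e-4                     # two nearby source points (m)
print("Schell model     :", W_schell(r1, r2))
print("quasi-homogeneous:", W_quasihomog(r1, r2))
```

The two values agree to within a small fraction of a percent, because the intensity profile is essentially constant over one coherence width, which is precisely the quasi-homogeneous condition.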


Equivalence Theorems

The study of partially coherent sources led to the formulation of a number of equivalence theorems, which show that sources of any state of coherence can produce the same distribution of radiant intensity as a fully spatially coherent laser source. These theorems provide conditions under which sources of different spatial distributions of spectral density and of different states of coherence will generate fields that have the same radiant intensity. It has been shown, by taking examples of Gaussian Schell-model sources, that sources of completely different coherence properties and different spectral distributions across the source generate identical distributions of radiant intensity. Experimental verifications of the results of these theorems have also been carried out. For further details on this subject, the reader is referred to Mandel and Wolf (1995).

Correlation-Induced Spectral Changes

It was long assumed that the spectrum is an intrinsic property of light that does not change as the radiation propagates in free space, until studies of partially coherent sources and the radiation from them in the 1980s revealed that this is true only for specific types of sources. It was discovered on general grounds that the spectrum of light which originates in an extended source, either a primary source or a secondary source, depends not only on the source spectrum but also on the spatial coherence properties of the source. It was also predicted theoretically by Wolf that the spectrum of light would, in general, be different from the spectrum of the source, and be different at different points in space on propagation in free space. For a quasi-homogeneous planar secondary source defined by eqn [52], whose normalized spectrum is the same at each source point, one can write the spectral density as

$S_0(\mathbf{r},\nu) = I_0(\mathbf{r})\, g_0(\nu)$ with $\displaystyle\int_0^\infty g_0(\nu)\, d\nu = 1$   [53]

where $I_0(\mathbf{r})$ is the intensity of the light at point $\mathbf{r}$ on the plane of the source, $g_0(\nu)$ is the normalized spectrum of the source, and the subscript 0 refers to quantities on the source plane. Using eqn [49], one can obtain an expression for the far-field spectrum due to this source as

$S_\infty(r\mathbf{u},\nu) = \dfrac{k^2\cos^2\theta}{(2\pi r)^2}\displaystyle\int_\sigma\int_\sigma I_0(\bar{\mathbf{r}})\, g_0(\nu)\,\mu_0(\mathbf{r}',\nu)\, e^{-ik\mathbf{u}_\perp\cdot(\mathbf{r}_1-\mathbf{r}_2)}\, d^2 r_1\, d^2 r_2$   [54]

Noting that $\bar{\mathbf{r}} = (\mathbf{r}_1+\mathbf{r}_2)/2$ and $\mathbf{r}' = \mathbf{r}_2-\mathbf{r}_1$, one can transform the variables of integration and obtain, after some manipulation,

$S_\infty(r\mathbf{u},\nu) = \dfrac{k^2\cos^2\theta}{(2\pi r)^2}\, \tilde{I}_0\,\tilde{\mu}_0(k\mathbf{u}_\perp,\nu)\, g_0(\nu)$   [55]

where

$\tilde{\mu}_0(k\mathbf{u}_\perp,\nu) = \displaystyle\int_\sigma \mu_0(\mathbf{r}',\nu)\, e^{-ik\mathbf{u}_\perp\cdot\mathbf{r}'}\, d^2 r'$   [56]

and

$\tilde{I}_0 = \displaystyle\int_\sigma I_0(\bar{\mathbf{r}})\, d^2\bar{r}$   [57]

Equation [55] shows that the spectrum of the field in the far zone depends on the coherence properties of the source through its spectral degree of coherence $\mu_0(\mathbf{r}',\nu)$ and on the normalized source spectrum $g_0(\nu)$.
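A minimal numerical sketch of eqn [55] makes the correlation-induced spectral change explicit. The Gaussian source spectrum, the Gaussian frequency-independent degree of coherence, and all parameter values below are assumptions chosen for illustration; such a source violates the scaling law introduced in the next subsection, so its normalized far-zone spectrum depends on direction.

```python
import numpy as np

# Sketch: evaluating eqn [55] for a hypothetical quasi-homogeneous source with
# a Gaussian spectrum g0(nu) and a Gaussian degree of coherence mu0 whose width
# sigma_mu does not depend on frequency (violating the scaling law).
c = 3e8
nu0, dnu = 5.0e14, 0.5e14                 # center frequency and rms width (Hz)
sigma_mu = 0.5e-6                          # coherence width on the source (m)
nu = np.linspace(3e14, 7e14, 2001)         # frequency grid (Hz)
k = 2 * np.pi * nu / c

g0 = np.exp(-(nu - nu0)**2 / (2 * dnu**2))   # normalized source spectrum (shape)

def far_zone_spectrum(theta):
    """Eqn [55] up to direction-independent constants: k^2 cos^2(theta) times
    the Fourier transform of the Gaussian mu0, evaluated at k*sin(theta)."""
    mu0_tilde = np.exp(-(k * np.sin(theta) * sigma_mu)**2 / 2)
    return k**2 * np.cos(theta)**2 * mu0_tilde * g0

for theta in (0.0, 0.2, 0.4):              # observation angles (radians)
    S = far_zone_spectrum(theta)
    print(f"theta = {theta:.1f} rad: peak at {nu[np.argmax(S)]/1e14:.3f} x 10^14 Hz")
# The normalized far-zone spectrum is slightly blue-shifted on axis (the k^2
# factor) and progressively red-shifted off axis: a purely correlation-induced
# spectral change of the kind predicted by Wolf.
```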

Scaling Law

The reason why coherence-induced spectral changes were not observed until recently is that the usual thermal sources employed in laboratories, or commonly encountered in nature, have special coherence properties: their spectral degree of coherence has the functional form

$\mu^{(0)}(\mathbf{r}_2-\mathbf{r}_1,\nu) = f[k(\mathbf{r}_2-\mathbf{r}_1)]$ with $k = \dfrac{2\pi\nu}{c}$   [58]

which shows that the spectral degree of coherence depends on the frequency and the space coordinates only through their product. This formula expresses the so-called scaling law, which was enunciated by Wolf. Commonly used sources satisfy this property. For example, the spectral degree of coherence of Lambertian sources and blackbody sources can be shown to be

$\mu_0(\mathbf{r}_2-\mathbf{r}_1,\nu) = \dfrac{\sin(k|\mathbf{r}_2-\mathbf{r}_1|)}{k|\mathbf{r}_2-\mathbf{r}_1|}$   [59]

This expression evidently satisfies the scaling law. If the spectral degree of coherence does not satisfy the scaling law, the normalized far-field spectrum will, in general, vary in different directions in the far zone and will differ from the source spectrum.
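The scaling law is easy to verify numerically for the functional form of eqn [59]: scaling the frequency up and the separation down by the same factor leaves the degree of coherence unchanged. A minimal sketch:

```python
import numpy as np

# Sketch: the Lambertian/blackbody degree of coherence of eqn [59] satisfies
# the scaling law of eqn [58], since it depends only on the product nu*|r2-r1|.
c = 3e8

def mu0(dr, nu):
    k = 2 * np.pi * nu / c
    return np.sinc(k * dr / np.pi)          # np.sinc(x) = sin(pi x)/(pi x)

# Doubling the frequency while halving the separation leaves mu0 unchanged:
print(mu0(1.0e-6, 5.0e14))
print(mu0(0.5e-6, 1.0e15))                  # identical value
```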

Spectral changes in Young’s interference experiment with broadband light are not as well understood as in experiments with quasi-monochromatic, probably because in such experiments no interference fringes are formed. However, if one were to analyze the spectrum of the light in the region of superposition, one would observe changes in the spectrum of light in the region of superposition in the form of a shift in the spectrum for narrowband spectrum and spectral modulation for broadband light. One can readily derive an expression for the spectrum of light in the region of superposition. Let Sð1Þ ðP; nÞ be the spectral density of the light at P which would be obtained if the small aperture at P1 alone was open; Sð2Þ ðP; nÞ has a similar meaning if only the aperture at P2 was open (Figure 4). Let us assume, as is usually the case, that Sð2Þ ðP; nÞ < Sð1Þ ðP; nÞ and let d be the distance between the two pinholes. Consider the spectral density at the point P; at distance x from the axis of symmetry in an observation plane located at distance of R from the plane containing the pinholes. Assuming that x=R p 1; one can make the approximation R2 2 R1 < xd=R: The spectral interference law (eqn [38]) can then be written as SðP; nÞ < 2Sð1Þ ðP; nÞ{1 þ lmðP1 ; P2 ; nÞl   £ cos bðP1 ; P2 ; nÞ þ 2pnxd=cR }

½60


where $\beta(P_1,P_2,\nu)$ denotes the phase of the spectral degree of coherence. Equation [60] implies two results: (i) at any fixed frequency $\nu$, the spectral density varies sinusoidally with the distance $x$ of the point from the axis, with the amplitude and phase of the variation depending on the (generally complex) spectral degree of coherence $\mu(P_1,P_2,\nu)$; and (ii) at any fixed point $P$ in the observation plane, the spectrum $S(P,\nu)$ will, in general, differ from the spectrum $S^{(1)}(P,\nu)$, the change also depending on the spectral degree of coherence $\mu(P_1,P_2,\nu)$ of the light at the two pinholes.
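The following sketch evaluates eqn [60] for a hypothetical broadband Gaussian spectrum, taking $|\mu| = 1$ and $\beta = 0$ for simplicity; all numerical values are assumptions for illustration.

```python
import numpy as np

# Sketch of eqn [60]: spectral modulation in the region of superposition in
# Young's experiment with broadband light (fully correlated pinholes assumed).
c = 3e8
nu = np.linspace(4e14, 8e14, 4000)                 # optical frequencies (Hz)
S1 = np.exp(-(nu - 6e14)**2 / (2 * (0.8e14)**2))   # broadband spectrum at one pinhole
d, R, x = 1e-3, 1.0, 5e-3                          # pinhole spacing, screen distance, off-axis point (m)

S = 2 * S1 * (1 + np.cos(2 * np.pi * nu * x * d / (c * R)))  # eqn [60]

print("spectral modulation period:", c * R / (x * d), "Hz")
# The cosine oscillates with nu across the band, so the superposed spectrum is
# channeled (modulated) rather than merely shifted; for a narrowband source the
# same term reduces to an overall shift of the spectral peak.
```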

Experimental Confirmations

Experimental tests of the theoretical predictions of spectral invariance and noninvariance due to correlation of fluctuations across the source were performed soon after the theoretical predictions. Figure 11 shows the results of one such experiment, in which spectral changes in Young's experiment were studied. Several other experiments also confirmed the source-correlation-dependent spectral changes. One important application of these observations has been to explain discrepancies in the maintenance of spectroradiometric scales by national laboratories in different countries. These studies also have potential application in determining experimentally the spectral degree of coherence of partially coherent fields. Knowledge of the spectral degree of coherence is often important in remote sensing, e.g., for determining the angular diameters of stars.

Figure 11 Correlation-induced changes in spectrum in Young's interference. Dashed line represents the spectrum when only one of the slits is open, the continuous curve shows the spectrum when both slits are open, and the circles are the measured values in the latter case. Reproduced with permission from Santarsiero M and Gori F (1992) Spectral changes in Young interference pattern. Physics Letters 167: 123-128.


Applications of Optical Coherence

Stellar Interferometry

The Michelson stellar interferometer, named after Albert Michelson, was used to determine the angular diameters of stars and also the intensity distribution across a star. The method was devised by Michelson without using any concept of coherence, although the full theory of the method was subsequently developed on the basis of the propagation of correlations. A schematic of the experiment is shown in Figure 12. The interferometer is mounted in front of a telescope, a reflecting telescope in this case. The light from a star is reflected from mirrors M1 and M2 and is directed towards the primary mirror (or the objective lens) of the telescope. The two beams thus collected superpose in the focal plane F of the telescope, where an image crossed with fringes is formed. The outer mirrors M1 and M2 can be moved along the axis defined by M1M3M4M2, while the inner mirrors M3 and M4 remain fixed. The fringe spacing depends on the position of mirrors M3 and M4 and hence is fixed, while the visibility of the fringes depends on the separation of the mirrors M1 and M2 and hence can be varied. Michelson showed that from the measurement of the variation of the visibility with the separation of the two mirrors, one can obtain information about the intensity distribution across stars that are rotationally symmetric. He also showed that if the stellar disk is circular and uniform, the visibility curve as a function of the separation $d$ of the mirrors M1 and M2 will have zeros for certain values of $d$, and that the smallest of these $d$ values for which a zero occurs is given by $d_0 = 0.61\bar{\lambda}/\alpha$, where $\alpha$ is the semi-angular diameter of the star and $\bar{\lambda}$ is the mean wavelength of the filtered

quasi-monochromatic light from the star. Angular diameters of several stars down to 0.02 second of arc were determined. From the standpoint of second-order coherence theory the principles of the method can be readily understood. The star is considered an incoherent source and, according to the van Cittert-Zernike theorem, the light reaching the outer mirrors M1 and M2 of the interferometer will be partially coherent. This coherence depends on the size of, and the intensity distribution across, the star. Let $(x_1,y_1)$ and $(x_2,y_2)$ be the coordinates of the positions of the mirrors M1 and M2, respectively, and $(\xi,\eta)$ the coordinates of a point on the surface plane of the star, which is assumed to be at a very large (astronomical) distance $R$ from the mirrors. The complex degree of coherence at the mirrors is then given by eqn [22], which can now be written as

$\gamma(\Delta x,\Delta y,0) = \dfrac{\displaystyle\int_\sigma I(u,v)\, e^{-ik_a(u\Delta x + v\Delta y)}\, du\, dv}{\displaystyle\int_\sigma I(u,v)\, du\, dv}$   [61]

where $I(u,v)$ is the intensity distribution across the star disk $\sigma$ as a function of the angular coordinates $u = \xi/R$, $v = \eta/R$; $\Delta x = x_1 - x_2$, $\Delta y = y_1 - y_2$; and $k_a = 2\pi/\bar{\lambda}$, $\bar{\lambda}$ being the mean wavelength of the light from the star. Equation [61] shows that the equal-time ($\tau = 0$) complex degree of coherence of the light incident on the outer mirrors of the interferometer is the normalized Fourier transform of the intensity distribution across the stellar disk. Further, eqn [15] shows that the visibility of the interference fringes is the absolute value of $\gamma$ if the intensities of the two interfering beams are equal, as in the present case. The phase of $\gamma$ can be determined from the positions of the intensity maxima of the fringe pattern (eqn [14]). If one is interested in determining only the angular size of the star, and the star is assumed to be a circularly symmetric disk of angular diameter $2\alpha$ and of uniform intensity [$I(u,v)$ constant across the disk], then eqn [61] reduces to

$\gamma(\Delta x,\Delta y) = \dfrac{2J_1(v)}{v}, \qquad v = \dfrac{2\pi\alpha}{\bar{\lambda}}\, d, \qquad d = \sqrt{(\Delta x)^2 + (\Delta y)^2}$   [62]

The smallest separation of the mirrors for which the visibility vanishes corresponds to $v = 3.832$, i.e., $d_0 = 0.61\bar{\lambda}/\alpha$, which is in agreement with Michelson's result.

Figure 12 Schematic of the Michelson stellar interferometer.
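A short numerical sketch of eqn [62], for an assumed star of semi-angular diameter 0.01 arcsec observed at a mean wavelength of 550 nm, locates the first visibility zero and confirms Michelson's $d_0 = 0.61\bar{\lambda}/\alpha$ (SciPy is used for the Bessel function and root finding):

```python
import numpy as np
from scipy.special import j1
from scipy.optimize import brentq

# Sketch of eqn [62]: visibility of a uniform stellar disk versus mirror
# separation d. The stellar size and wavelength are illustrative assumptions.
alpha = 0.01 * np.pi / (180 * 3600)         # semi-angular diameter (rad)
lam = 550e-9                                 # mean wavelength (m)

def visibility(d):
    v = 2 * np.pi * alpha * d / lam          # argument of eqn [62]
    return 2 * j1(v) / v

d0 = brentq(visibility, 1.0, 10.0)           # first zero, bracketed in meters
print(f"first visibility zero: d0 = {d0:.3f} m")
print(f"Michelson's 0.61*lam/alpha = {0.61 * lam / alpha:.3f} m")
```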

Interference Spectroscopy

Another contribution of Michelson, which was subsequently identified as an application of coherence theory, was the use of his interferometer (Figure 2) to determine the energy distribution in spectral lines. The method he developed is capable of resolving spectral lines that are too narrow to be analyzed by a spectrometer. The visibility of the interference fringes depends on the energy distribution in the spectrum of the light, and its measurement can give information about the spectral lines. In particular, if the energy distribution in a spectral line is symmetric about some frequency $\nu_0$, its profile is simply the Fourier transform of the visibility variation as a function of the path difference between the two interfering beams. This method is the basis of interference spectroscopy, or Fourier transform spectroscopy. Within the framework of second-order coherence theory, if the mean intensities of the two beams in a Michelson interferometer are the same, then the visibility of the interference fringes in the observation plane is related to the complex degree of coherence of the light at a point on the beamsplitter where the two beams superimpose (eqn [15]). The two quantities are related as $\mathcal{V}(\tau) = |\gamma(\tau)|$, where $\gamma(\tau) \equiv \gamma(\mathbf{r}_1,\mathbf{r}_1,\tau)$ is the complex degree of self-coherence at the point $\mathbf{r}_1$ on the beamsplitter. Following eqn [19], $\gamma(\tau)$ can be represented by a Fourier integral as

$\gamma(\tau) = \displaystyle\int_0^\infty s(\nu)\exp(-i2\pi\nu\tau)\, d\nu$   [63]

where $s(\nu)$ is the normalized spectral density of the light, defined as $s(\nu) = S(\nu)/\int_0^\infty S(\nu)\, d\nu$, and $S(\nu) \equiv S(\mathbf{r}_1,\nu) = W(\mathbf{r}_1,\mathbf{r}_1,\nu)$ is the spectral density of the beam at the point $\mathbf{r}_1$. The method is usually applied to very narrow spectral lines, for which one can assume that the peak occurs at $\nu_0$, and $\gamma(\tau)$ can be represented as

$\gamma(\tau) = \bar{\gamma}(\tau)\exp(-2\pi i\nu_0\tau)$ with $\bar{\gamma}(\tau) = \displaystyle\int_{-\infty}^{\infty} \bar{s}(\mu)\exp(-i2\pi\mu\tau)\, d\mu$   [64]

where $\bar{s}(\mu)$ is the shifted spectrum, such that

$\bar{s}(\mu) = s(\nu_0 + \mu)$ for $\mu \geq -\nu_0$; $\quad \bar{s}(\mu) = 0$ for $\mu < -\nu_0$

From the above one readily gets

$\mathcal{V}(\tau) = |\bar{\gamma}(\tau)| = \left|\displaystyle\int_{-\infty}^{\infty} \bar{s}(\mu)\exp(-i2\pi\mu\tau)\, d\mu\right|$   [65]

If the spectrum is symmetric about $\nu_0$, then $\mathcal{V}(\tau)$ is an even function of $\tau$, and Fourier inversion gives

$\bar{s}(\mu) = s(\nu_0 + \mu) = 2\displaystyle\int_0^\infty \mathcal{V}(\tau)\cos(2\pi\mu\tau)\, d\tau$   [66]

which can be used to calculate the spectral energy distribution of a spectrum symmetric about $\nu_0$ from the visibility curve. For an asymmetric spectral distribution, however, both the visibility and the phase of the complex degree of coherence must be determined, since the Fourier transform of the shifted spectrum is then no longer real everywhere.
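As a sketch of eqn [66] in action, the following fragment synthesizes the visibility curve of a hypothetical Gaussian line (for which $\bar{\gamma}(\tau)$ is itself Gaussian) and recovers the line profile by the cosine transform; all values are illustrative assumptions.

```python
import numpy as np

# Sketch of eqn [66]: recovering a symmetric line profile from the visibility.
# Assumed Gaussian line of rms width dnu, whose visibility is the Gaussian
# V(tau) = exp(-2 (pi dnu tau)^2).
nu0, dnu = 5.0e14, 1.0e12                  # line center and rms width (Hz)
tau = np.linspace(0, 2e-12, 4000)           # interferometer delays (s)
vis = np.exp(-2 * (np.pi * dnu * tau)**2)   # "measured" visibility V(tau)

mu = np.linspace(-4e12, 4e12, 801)          # shifted frequency nu - nu0 (Hz)
dt = tau[1] - tau[0]
# Eqn [66]: s_bar(mu) = 2 * integral_0^inf V(tau) cos(2 pi mu tau) dtau
s_bar = 2 * dt * (vis[None, :] * np.cos(2 * np.pi * mu[:, None] * tau[None, :])).sum(axis=1)

half = mu[s_bar >= s_bar.max() / 2]         # full width at half maximum
print(f"recovered FWHM = {(half.max() - half.min()) / 1e12:.2f} THz "
      f"(expected 2.355*dnu = {2.355 * dnu / 1e12:.2f} THz)")
```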

Higher-Order Coherence

So far we have considered correlations of the fluctuating field variables at two space-time $(\mathbf{r},t)$ points, as defined in eqn [10]. These are termed second-order correlations. One can extend the concept of correlations to more than two space-time points, which involves higher-order correlations. For example, one can define the space-time cross-correlation function of order $(M,N)$ of the random field $V(\mathbf{r},t)$, represented by $\Gamma^{(M,N)}$, as an ensemble average of the product of the field values $V(\mathbf{r},t)$ at $N$ space-time points and $V^*(\mathbf{r},t)$ at $M$ other points. In this notation, the mutual coherence function as defined in eqn [10] would be $\Gamma^{(1,1)}$. Among the higher-order correlations, the one with $M = N = 2$ is of practical significance and is called the fourth-order correlation function, $\Gamma^{(2,2)}(\mathbf{r}_1,t_1;\mathbf{r}_2,t_2;\mathbf{r}_3,t_3;\mathbf{r}_4,t_4)$. The theory of Gaussian random variables tells us that any higher-order correlation can be written in terms of second-order correlations over all permutations of pairs of points. In addition, if we assume that $(\mathbf{r}_3,t_3) = (\mathbf{r}_1,t_1)$ and $(\mathbf{r}_4,t_4) = (\mathbf{r}_2,t_2)$, and that the field is stationary, then $\Gamma^{(2,2)}$ is called the intensity-intensity correlation and is given by

$\Gamma^{(2,2)}(\mathbf{r}_1,\mathbf{r}_2,t_2-t_1) = \langle V(\mathbf{r}_1,t_1)V(\mathbf{r}_2,t_2)V^*(\mathbf{r}_1,t_1)V^*(\mathbf{r}_2,t_2)\rangle = \langle I(\mathbf{r}_1,t_1)\,I(\mathbf{r}_2,t_2)\rangle = \langle I(\mathbf{r}_1,t_1)\rangle\langle I(\mathbf{r}_2,t_2)\rangle\left[1 + |\gamma^{(1,1)}(\mathbf{r}_1,\mathbf{r}_2,t_2-t_1)|^2\right]$   [67]

where

$\gamma^{(1,1)}(\mathbf{r}_1,\mathbf{r}_2,t_2-t_1) = \dfrac{\Gamma^{(1,1)}(\mathbf{r}_1,\mathbf{r}_2,t_2-t_1)}{[\langle I(\mathbf{r}_1,t_1)\rangle]^{1/2}\,[\langle I(\mathbf{r}_2,t_2)\rangle]^{1/2}}$   [68]

We now define the fluctuation of the intensity at $(\mathbf{r}_j,t_j)$ as $\Delta I_j = I(\mathbf{r}_j,t_j) - \langle I(\mathbf{r}_j,t_j)\rangle$


and then the correlation of the intensity fluctuations becomes

$\langle\Delta I_1\,\Delta I_2\rangle = \langle I(\mathbf{r}_1,t_1)\,I(\mathbf{r}_2,t_2)\rangle - \langle I(\mathbf{r}_1,t_1)\rangle\langle I(\mathbf{r}_2,t_2)\rangle = \langle I(\mathbf{r}_1,t_1)\rangle\langle I(\mathbf{r}_2,t_2)\rangle\,|\gamma^{(1,1)}(\mathbf{r}_1,\mathbf{r}_2,t_2-t_1)|^2$   [69]

where we have used eqn [67]. Equation [69] forms the basis for intensity interferometry.
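Equation [69] can be checked statistically for thermal light. The sketch below synthesizes two partially correlated circular-Gaussian field amplitudes with an assumed equal-time degree of coherence and compares the measured correlation of the intensity fluctuations with $\langle I_1\rangle\langle I_2\rangle|\gamma|^2$; the construction of the fields is an illustrative assumption, not the Hanbury-Brown and Twiss procedure itself.

```python
import numpy as np

# Sketch of eqn [69] for thermal (circular Gaussian) light with an assumed
# equal-time degree of coherence |gamma| = 0.6 between the two points.
rng = np.random.default_rng(0)
n = 200_000
gamma = 0.6

def cgauss(size):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.normal(size=size) + 1j * rng.normal(size=size)) / np.sqrt(2)

V_common = cgauss(n)                        # shared fluctuation between the points
V1 = np.sqrt(gamma) * V_common + np.sqrt(1 - gamma) * cgauss(n)
V2 = np.sqrt(gamma) * V_common + np.sqrt(1 - gamma) * cgauss(n)
I1, I2 = np.abs(V1)**2, np.abs(V2)**2       # instantaneous intensities

lhs = np.mean(I1 * I2) - np.mean(I1) * np.mean(I2)   # <dI1 dI2>
rhs = np.mean(I1) * np.mean(I2) * gamma**2           # eqn [69] prediction
print(f"<dI1 dI2> = {lhs:.4f}   <I1><I2>|gamma|^2 = {rhs:.4f}")
```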

Hanbury-Brown and Twiss Experiment

In this landmark experiment, conducted both on the laboratory scale and on the astronomical scale, Hanbury-Brown and Twiss demonstrated the existence of intensity-intensity correlations in terms of the correlations between the photocurrents of two detectors, and thus measured the squared modulus of the complex degree of coherence. In the laboratory experiment, the arc of a mercury lamp was focused onto a circular hole to produce a secondary source. The light from this source was divided equally into two parts by a beamsplitter. Each part was detected by one of two photomultiplier tubes with identical square apertures in front. One of the tubes could be translated normally to the direction of propagation of the light, and was positioned so that its image through the splitter could be made to coincide with the other tube. Thus, by suitable translation, a measured separation $d$ between the two square apertures could be introduced. The output currents from the photomultiplier tubes were taken by cables of equal length to a correlator. In the path of each cable a high-pass filter was inserted, so that only the current fluctuations were transmitted to the correlator. Thus the normalized correlation between the two current fluctuations,

$C(d) = \dfrac{\langle\Delta J_1(t)\,\Delta J_2(t)\rangle}{\langle[\Delta J_1(t)]^2\rangle^{1/2}\,\langle[\Delta J_2(t)]^2\rangle^{1/2}}$   [70]

could be measured as a function of the detector separation $d$. Now, when the detector response time is much longer than the time-scale of the fluctuations in intensity, it can be shown that the correlations between the fluctuations of the photocurrents are proportional to the correlations between the fluctuations in the intensity of the light being detected. Thus, we have

$C(d) \approx \delta\,|\gamma^{(1,1)}(\mathbf{r}_1,\mathbf{r}_2,0)|^2$   [71]

where $\delta$ is the average number of photocounts of light of one polarization during the time-scale of the fluctuations (for general thermal sources, this is much less than one). Equation [71] represents the Hanbury-Brown-Twiss effect.

Stellar Intensity Interferometry

Michelson stellar interferometry can resolve stars that have angular sizes of the order of 0.01'', since for smaller stars the separation between the primary mirrors runs into several meters, and maintaining the stability of the mirrors so that the optical paths do not change, even by a fraction of a wavelength, is extremely difficult. Atmospheric turbulence adds to this problem, and obtaining a stable fringe pattern becomes next to impossible for very small stars. Hanbury-Brown and Twiss applied intensity interferometry, based on their photoelectric correlation technique, to the determination of the angular sizes of such stars. Two separate parabolic mirrors collected light from the star, and the outputs of the photodetectors placed at the focus of each mirror were sent to a correlator. The cable lengths were made unequal so as to compensate for the time difference of the light arrival at the two mirrors. The normalized correlation of the fluctuations of the photocurrents was determined as described above. This gives the variation of the modulus-squared degree of coherence as a function of the mirror separation $d$, from which the angular size of the star can be estimated. The advantage of the stellar intensity interferometer over the stellar (amplitude) interferometer is that the light need not interfere as in the latter, since the photodetectors are mounted directly at the foci of the primary mirrors of the telescopes. Thus, the constraint on the large path difference between the two beams is removed, and large values of $d$ can be used. Moreover, atmospheric turbulence and mirror movements have a very small effect. Stellar angular diameters as small as 0.0004'' of arc, with a resolution of 0.00003'', could be measured by such interferometers.

See also

Coherence: Coherence and Imaging. Coherent Lightwave Systems. Coherent Transients: Coherent Transient Spectroscopy in Atomic and Molecular Vapours. Coherent Control: Applications in Semiconductors; Experimental; Theory. Interferometry: Overview. Information Processing: Coherent Analogue Optical Processors. Terahertz Technology: Coherent Terahertz Sources.

Further Reading

Beran MJ and Parrent GB (1964) Theory of Partial Coherence. Englewood Cliffs, NJ: Prentice-Hall.


Born M and Wolf E (1999) Principles of Optics. New York: Pergamon Press.
Carter WH (1996) Coherence theory. In: Bass M (ed.) Handbook of Optics. New York: McGraw-Hill.
Davenport WB and Root WL (1960) An Introduction to the Theory of Random Signals and Noise. New York: McGraw-Hill.
Goodman JW (1985) Statistical Optics. Chichester: Wiley.
Hanbury-Brown R and Twiss RQ (1957) Interferometry of intensity fluctuations in light: I. Basic theory: the correlation between photons in coherent beams of radiation. Proceedings of the Royal Society A 242: 300-324.
Hanbury-Brown R and Twiss RQ (1957) Interferometry of intensity fluctuations in light: II. An experimental test of the theory for partially coherent light. Proceedings of the Royal Society A 243: 291-319.
Kandpal HC, Vaishya JS and Joshi KC (1994) Correlation-induced spectral shifts in optical measurements. Optical Engineering 33: 1996-2012.
Mandel L and Wolf E (1965) Coherence properties of optical fields. Reviews of Modern Physics 37: 231-287.


Mandel L and Wolf E (1995) Optical Coherence and Quantum Optics. Cambridge, UK: Cambridge University Press.
Marathay AS (1982) Elements of Optical Coherence. New York: John Wiley.
Perina J (1985) Coherence of Light. Boston, MA: D. Reidel.
Santarsiero M and Gori F (1992) Spectral changes in Young interference pattern. Physics Letters 167: 123-128.
Schell AC (1967) A technique for the determination of the radiation pattern of a partially coherent aperture. IEEE Transactions on Antennas and Propagation AP-15: 187.
Thompson BJ (1958) Illustration of phase change in two-beam interference with partially coherent light. Journal of the Optical Society of America 48: 95-97.
Thompson BJ and Wolf E (1957) Two beam interference with partially coherent light. Journal of the Optical Society of America 47: 895-902.
Wolf E and James DFV (1996) Correlation-induced spectral changes. Reports on Progress in Physics 59: 771-818.

Coherence and Imaging

J van der Gracht, HoloSpex, Inc., Columbia, MD, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Optical imaging systems are strongly affected by the coherence of the light that illuminates the object of interest. In many cases, the light is approximately coherent or incoherent. These approximations lead to simple mathematical models for the image formation process and allow straightforward analysis and design of such systems. When the light is partially coherent, the mathematical models are more complicated and system analysis and design are more difficult. Partially coherent illumination is often used in microscopy, machine vision, and optical lithography. The intent of this article is to provide the reader with a basic understanding of the effects of coherence on imaging. The information should enable the reader to recognize when coherence effects are present in an imaging system and give insight into when coherence can be modified to improve imaging performance. The material relies mostly on concepts drawn from the Fourier optics perspective of imaging, and a rigorous coherence theory treatment is avoided. We encourage the reader to consult the Further Reading list at the end of this article for more complete definitions of coherence theory terms. A number of experimental results highlight the effects of the spatial coherence of the illumination on image formation. Discussions of the role of coherence in such key applications are interspersed throughout the article.

Image Formation – Ideal and Optimal

An image is typically defined as the reproduction or likeness of the form of an object. An image that is indistinguishable from the original object is generally considered to be ideal. In a general context, the sound of a voice coming from a loudspeaker can be thought of as the image of the sound coming directly from the original speaker's mouth. In optical imaging, the ideal image replicates the light emanating from the object. Taken to the extreme, the ideal image replicates the light leaving the object in terms of intensity, wavelength, polarization, and even coherence. When the final image is viewed by the eye, the ideal image only needs to replicate the spatial distribution of the light leaving the object in terms of color and relative intensity at each point on the object. (In this article, intensity is defined as optical power per unit area, in watts per meter squared.) A general diagram of a direct view image formation system is shown in Figure 1. The condenser optics gathers light from a primary source and illuminates a transmissive object having a complex wave amplitude


transmission of $O(x,y)$. The imaging optics produce an optical image that is viewed directly by the viewer.

Figure 1 Direct view optical system. Imaging optics conveys the light from an illuminated object directly to the human viewer.

A growing number of image formation systems now include a solid-state image detector, as shown in Figure 2. The raw intensity optical image is converted to an electronic signal that can be digitally processed and then displayed on a monitor. In this system, the spatial intensity distribution emanating from the monitor is the final image. The task of the optical system to the left of the detector is to gather spatial information about the light properties of the object. In fact, the information gathered by the detector includes information about the object, the illumination system, and the image formation optics. If the observer is only interested in the light transmission properties of the object, the effects of the illumination and image formation optics must be well understood. When the light illuminating the object is known to be coherent or incoherent, reasonably simple models for the overall image formation process can be used. More general, partially coherent illumination can produce optically formed images that differ greatly from the intensity pattern leaving the object. Seen from this perspective, partially coherent illumination is an undesirable property that creates nonideal images and complicates image analysis.

The ideal image as described above is not necessarily the optimum image for a particular task. Consider the case of object recognition when the object has low optical contrast. Using the above definition, an ideal image would mimic the low contrast and render recognition difficult. An image formation system that alters the contrast to improve the recognition task would be better than the so-called ideal image. The image formation system that maximizes the appropriate performance metric for a particular recognition task would be considered optimal. In optimizing the indirect imaging system of Figure 2, the designer can adjust the illumination, the imaging optics, and the post-detection processing. In fact, research microscopes often include an adjustment that modifies the illumination coherence and thereby alters the image contrast. Darkfield and phase contrast imaging microscopes usually employ partially coherent illumination combined with pupil modification to view otherwise invisible objects. Seen from this perspective, partially coherent illumination is a desirable property that provides more degrees of freedom to the imaging system designer. Quite often, partially coherent imaging systems provide a compromise between the performance of coherent and incoherent systems.

Elementary Coherence Concepts

Most readers are somewhat familiar with the concept of temporal coherence. In Figure 3, a Michelson interferometer splits light from a point source into two paths and recombines the beams to form interference fringes. The presence of interference fringes indicates that the wave amplitude fluctuations of the two beams are highly correlated, so the light adds in wave amplitude. If the optical path difference between the two paths can be made large without reducing the fringe contrast, the light is said to be highly temporally coherent. Light from very narrowband lasers can maintain coherence over very large optical path differences. Light from broadband sources requires very small optical path differences to add in wave amplitude.

Figure 2 Indirect view optical system. The raw intensity optical image is detected electronically, processed, and a final image is presented to the human observer on a display device.

Figure 3 The presence of high contrast fringes in a Michelson interferometer indicates high temporal coherence.

Figure 4 A Young's two pinhole interferometer produces: (a) a uniform high contrast fringe pattern for light that is highly spatially and temporally coherent; (b) high contrast fringes only near the axis for low temporal coherence and high spatial coherence light; and (c) no fringe pattern for high temporal coherence, low spatial coherence light.

Spatial coherence is a measure of the ability of two separate points in a field to interfere. The Young's two pinhole experiment in Figure 4 measures the coherence between two points sampled by the pinhole mask. Only a one-dimensional pinhole mask is shown for simplicity. Figure 4a shows an expanded laser beam illuminating two spatially separated pinholes and recombining to form an intensity fringe pattern. The high contrast of the fringes indicates that the wave amplitudes of the light from the two pinholes are highly correlated. Figure 4b shows a broadband point source expanded in a similar fashion. The fringe contrast is high near the axis because the optical path difference between the two beams is zero on the axis and remains small near it. For points far from the axis, the fringe pattern disappears because the low temporal coherence of the broadband source results in a loss of correlation between the wave amplitudes. A final Young's experiment example, in Figure 4c, shows that highly temporally coherent light can be spatially incoherent. In the figure, the light illuminating the two pinholes comes from two separate and

highly temporally coherent lasers that are designed to have the same center frequency. Since the light from the lasers is not synchronized in phase, any fringes that might form for an instant will move rapidly and average to a uniform intensity pattern over the integration time of a typical detector. Since the fringe contrast is zero over a practical integration time, the light at the two pinholes is effectively spatially incoherent.

Two-point Imaging

In a typical partially coherent imaging experiment we need to know how light from two pinholes adds at the optical image plane, as shown in Figure 5. Diffraction and system aberrations cause the image of a single point to spread, so that the images of two spatially separated object points overlap in the image plane. The wave amplitude image of a single pinhole is called the coherent point spread function (CPSF). Since the CPSF is compact, two CPSFs will only overlap when the corresponding object points are closely spaced. When the two points are sufficiently close, the relevant optical path differences will be small, so full temporal coherence can be assumed. Spatial coherence will be the critical factor in determining how to add the responses to pairs of


point images. We assume full temporal coherence in the subsequent analyses, and the light is said to be quasi-monochromatic.

Coherent Two-Point Imaging

Consider imaging a two-point object illuminated by a spatially coherent plane wave produced from a point source, as depicted in Figure 6a. In the figure, the imaging system is assumed to be space-invariant. This means that the spatial distribution of the CPSF is the same regardless of the position of the input pinhole. The CPSF in the figure is broad enough that the image plane point responses overlap for this particular pinhole spacing. Since the wave amplitudes from the two pinholes are correlated, the point responses add in wave amplitude, resulting in a two-point image intensity given by

$I_{2coh}(x) = I_0\,|h(x-x_1) + h(x-x_2)|^2$   [1]

where $h(x)$ is the normalized CPSF and $I_0$ is a scaling factor that determines the absolute image intensity value. Since the CPSF has units of wave amplitude, the magnitude-squaring operation accounts for the square-law response of the image plane detector. More generally, the image intensity for an arbitrary object distribution can be found by breaking the object into a number of points and adding the CPSFs due to each point on the object. The resultant image plane intensity is the spatial convolution of the object amplitude transmittance with the CPSF and is given by

$I_{coh}(x) = I_0\left|\displaystyle\int O(\xi)\, h(x-\xi)\, d\xi\right|^2$   [2]

where $O(x)$ is the object amplitude transmittance.

Incoherent Two-Point Imaging

Two-pinhole imaging with spatially incoherent light is shown in Figure 6b, where a spatially extended blackbody radiator is placed directly behind the pinholes. Once again the imaging optics are assumed to be space-invariant and to have the same CPSF as the system in Figure 6a. Since the radiation originates from two completely different physical points on the source, no correlation is expected between the wave amplitudes of the light leaving the pinholes, and the light is said to be spatially incoherent. The resultant image intensity corresponding to the two-pinhole object with equal amplitude transmittance values is calculated by adding the individual intensity responses to give

$I_{2inc}(x) = I_0\,|h(x-x_1)|^2 + I_0\,|h(x-x_2)|^2$   [3]

For more general object distributions, the image intensity is the convolution of the object intensity transmittance with the incoherent PSF, and is given by

$I_{inc}(x) = I_0\displaystyle\int |o(\xi)|^2\, |h(x-\xi)|^2\, d\xi$   [4]

Figure 5 Imaging of two points. The object plane illumination coherence determines how the light adds in the region of overlap of the two image plane points.

Figure 6 Imaging of two points for (a) spatially coherent illumination and (b) spatially incoherent illumination.


where the intensity transmittance is the squared magnitude of the amplitude transmittance and the incoherent PSF is the squared magnitude of the CPSF.

Partially Coherent Two-Point Imaging

Finally, we consider two-pinhole imaging with partially spatially coherent illumination. Now the two point responses do not add simply in amplitude or in intensity. Rather, the image intensity is given by the more general equation

$I_{2pc}(x) = I_0\left[|h(x-x_1)|^2 + |h(x-x_2)|^2 + 2\,\mathrm{Re}\{\mu(x_1,x_2)\, h(x-x_1)\, h^*(x-x_2)\}\right]$   [5]

where the $*$ denotes complex conjugation and $\mu(x_1,x_2)$ is the normalized form of the mutual intensity of the object illumination evaluated at the object points in question. The mutual intensity function is often denoted $J_0(x_1,x_2)$ and is a measure of the cross-correlation of the wave amplitude distributions leaving the two pinholes. Rather than providing a rigorous definition, we note that the magnitude of $J_0(x_1,x_2)$ corresponds to the fringe contrast that would be produced if the two object pinholes were placed at the input of a Young's interference experiment. The phase is related to the relative spatial shift of the fringe pattern. When the light is uncorrelated, $J_0(x_1,x_2) = 0$ and eqn [5] collapses to the incoherent limit of eqn [4]. When the light is coherent, $J_0(x_1,x_2) = 1$ and eqn [5] reduces to the coherent form of eqn [2]. Image formation for a general object distribution is given by the bilinear equation

$I_{pc}(x) = I_0\displaystyle\int\!\!\int o(\xi_1)\, o^*(\xi_2)\, h(x-\xi_1)\, h^*(x-\xi_2)\, J_0(\xi_1-\xi_2)\, d\xi_1\, d\xi_2$   [6]

Note that, in general, $J_0(x_1,x_2)$ must be evaluated for all pairs of object points. Close examination of this equation reveals that, just as a linear system can be evaluated by considering all possible single-point responses, a bilinear system requires consideration of all possible pairs of points. This behavior is much more complicated and does not allow the application of the well-developed linear system theory.
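The difference between these addition rules is easy to see numerically. The following sketch evaluates eqns [1], [3], and [5] for a hypothetical sinc-shaped CPSF and an assumed mutual intensity value; the dip in intensity midway between the two image points depends strongly on the illumination coherence.

```python
import numpy as np

# Sketch of eqns [1], [3], and [5]: adding two point responses under coherent,
# incoherent, and partially coherent illumination. The sinc CPSF and the
# mutual intensity value mu = 0.5 are illustrative assumptions.
x = np.linspace(-10, 10, 4001)              # image plane coordinate (arb. units)
x1, x2 = -1.5, 1.5                          # locations of the two point images

def h(x):
    return np.sinc(x)                        # np.sinc(x) = sin(pi x)/(pi x)

h1, h2 = h(x - x1), h(x - x2)

I_coh = np.abs(h1 + h2)**2                           # eqn [1]: add amplitudes
I_inc = np.abs(h1)**2 + np.abs(h2)**2                # eqn [3]: add intensities
mu = 0.5                                             # assumed J0(x1, x2)
I_pc = I_inc + 2 * np.real(mu * h1 * np.conj(h2))    # eqn [5]

mid = len(x) // 2                            # point midway between the images
for name, I in (("coherent", I_coh), ("incoherent", I_inc), ("partially coh.", I_pc)):
    print(f"{name:14s} midpoint/peak = {I[mid] / I.max():.3f}")
# The coherent sum fills in the dip between the points, the incoherent sum
# preserves it, and the partially coherent case falls in between.
```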

Source Distribution and Object Illumination Coherence

According to eqn [6], the mutual intensity of all pairs of points at the object plane must be known in order to calculate a partially coherent image. Consider the telecentric Kohler illumination imaging system shown in Figure 7. The primary source is considered to be spatially incoherent and illuminates the object after passing through lens L1, located one focal distance away from the source and one focal distance away from the object. Even though the primary source is spatially incoherent, the illumination at the object plane is partially coherent. The explicit mutual intensity function corresponding to the object plane illumination is given by applying the van Cittert-Zernike theorem:

$J_0(\Delta x) = \displaystyle\int S(\xi)\exp\left(\dfrac{j2\pi\,\Delta x\,\xi}{\lambda F}\right) d\xi$   [7]

where $S(x)$ is the intensity distribution of the spatially incoherent primary source, $F$ is the focal length of the lens, $\lambda$ is the illumination wavelength, and $\Delta x = x_1 - x_2$. The van Cittert-Zernike theorem reflects a Fourier transform relationship between the source image intensity distribution and the mutual intensity at the object plane. When the source plane is effectively spatially incoherent, the object plane mutual intensity is only a function of the separation distance. For a two-dimensional object, the mutual intensity needs to be characterized for all pairs of unique spacings in $x$ and $y$. In Figure 7, the lens arrangement ensures that the object plane is located in the far field of the primary source plane. In fact, the van Cittert-Zernike theorem applies more generally, even in the Fresnel propagation regime, as long as the standard paraxial approximation of optics is valid.

Figure 7 A telecentric Kohler illumination system with a spatially incoherent primary source imaged onto the pupil.
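A minimal numerical version of eqn [7] is given below for a hypothetical uniform one-dimensional source; the lens focal length, source width, and wavelength are illustrative assumptions. It reproduces the sinc-shaped mutual intensity discussed in connection with Figure 8c below.

```python
import numpy as np

# Sketch of eqn [7] (van Cittert-Zernike): object plane mutual intensity as a
# Fourier transform of the primary source intensity distribution.
lam, F, a = 0.6e-6, 0.1, 2e-3               # wavelength, focal length, source width (m)
xi = np.linspace(-a / 2, a / 2, 4001)        # source coordinate (m)
S = np.ones_like(xi)                         # uniform source intensity
dxi = xi[1] - xi[0]

def J0(dx):
    """Numerical version of eqn [7] (unnormalized)."""
    return np.sum(S * np.exp(1j * 2 * np.pi * dx * xi / (lam * F))) * dxi

for dx in (0.0, 10e-6, 30e-6, 45e-6):        # object point separations (m)
    mu = J0(dx) / J0(0.0)                    # normalized mutual intensity
    print(f"dx = {dx * 1e6:5.1f} um   |mu| = {abs(mu):.3f}")
# Analytically |mu(dx)| = |sinc(a dx / (lam F))|, with its first zero at
# dx = lam F / a = 30 um for these numbers.
```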


Figure 8 Light propagates from the source plane to the object plane and produces (a) coherent illumination from a single source point; (b) incoherent illumination from an infinite extent source; and (c) partially coherent illumination from a finite extent primary source.

Figure 8 shows some simple examples of primary source distributions and the corresponding object plane mutual intensity functions. Figure 8a assumes an infinitesimal point source, and the corresponding mutual intensity is 1.0 for all possible pairs of object points. Figure 8b assumes an infinite extent primary source and results in a Dirac delta function for the mutual intensity function. This means that there is no correlation between any two object points having a separation greater than zero, so the object plane illumination is spatially incoherent. In Figure 8c, a finite extent uniform source gives a mutual intensity function of the form $\sin(a\pi\Delta x)/(a\pi\Delta x)$. The finite-sized source corresponds to partially coherent imaging and shows that the response to pairs of points is affected by the spatial distribution of the source in a complicated way. Note that a large primary source corresponds to a large range of angular illumination at the object plane. Varying the size of the source in the imaging system of Figure 8 will affect the spatial coherence of the illumination and hence the optical image intensity. Many textbook treatments discuss the imaging of two points separated by the Rayleigh resolution criterion, which corresponds to the case where the first minimum of one point image coincides with the maximum of the adjacent point image. With a large source that provides effectively incoherent light, the two-point image has a modest dip in intensity between the two points, as shown in Figure 9b. Fully coherent illumination of two points separated by the Rayleigh distance produces a single large spot with no dip in intensity, as shown in Figure 9a.

Varying the source size, and hence the illumination spatial coherence, produces a dip that is less than the incoherent intensity dip. This result is often used to suggest that coherent imaging gives poorer resolution than incoherent imaging. In fact, generalizations about two-point resolution can be misleading. Recent developments in optical lithography have shown that coherence can be used to effectively increase two-point resolution beyond traditional diffraction limits. In Figure 9c, one of the pinholes in a coherent two-point imaging experiment has been modified with a phase shift corresponding to one half of a wavelength of the illuminating light. The two images add in wave amplitude, and the phase shift creates a distinct null at the image plane, effectively enhancing the two-point resolution. This approach is termed phase screen lithography and has been exploited to produce finer features in lithography by purposely introducing small phase shift masks at the object mask. In practice, the temporal and spatial coherence of the light is engineered to give sufficient coherence to take advantage of the two-point enhancement while maintaining sufficient incoherence to avoid the speckle-like artifacts associated with coherent light.

Figure 9 Comparison of coherent and incoherent imaging of two pinholes separated by the Rayleigh distance. (a) Coherent image cannot resolve the two points. (b) Incoherent image barely resolves the two points. (c) Coherent illumination with a phase shifting plate at one pinhole produces a null between the two image points.

Spatial Frequency Modeling of Imaging

Figure 10 The object Fourier transform is filtered by the pupil function in a coherent imaging system.

Spatial frequency models of image formation are also useful in understanding how coherence affects image formation. A spatially coherent imaging system has a particularly simple spatial frequency model. In the spatially coherent imaging system shown in Figure 10, the on-axis plane wave illumination projects the spatial Fourier transform of the object, or Fraunhofer pattern, onto the pupil plane of the imaging system. The pupil acts as a frequency domain filter that can be modified to perform spatial frequency filtering. The complex transmittance of the pupil is the wave amplitude spatial frequency transfer function of the imaging system. It follows that the image plane wave amplitude frequency spectrum, $\tilde{U}_c(f_x)$, is given by

$\tilde{U}_c(f_x) = \tilde{O}(f_x)\, H(f_x)$   [8]

where $\tilde{O}(f_x)$ is the spatial Fourier transform of the object amplitude transmittance function and $H(f_x)$ is

proportional to the pupil plane amplitude transmittance. Equation [8] is the frequency domain version of the convolution representation given by eqn [2], but does not account for the squared magnitude response of the detector. In the previous section, we learned that an infinite extent source is necessary to achieve fully incoherent imaging for the imaging system of Figure 7. As a thought experiment, one can start with a single point source on the axis and keep adding mutually incoherent source points to build up from a coherent imaging system to an incoherent imaging system. Since the individual source points are assumed to be incoherent with each other, the images from each point source can be added in intensity. In Figure 11,

106 COHERENCE / Coherence and Imaging

Figure 11 A second source point projects a second object Fourier transform onto the pupil plane. The images produced by the two source points add in intensity at the image plane.

only two point sources are shown. Each point source projects an object Fraunhofer pattern onto the pupil plane. The centered point source will result in the familiar coherent image. The off-axis point source projects a displaced object Fraunhofer pattern that is also filtered by the pupil plane before forming an image of the object. The image formed from this second point source can be modeled with the same coherent imaging model with a shifted pupil plane filter. Since the light from the two source points is uncorrelated, the final image is calculated by adding the intensities of the two coherently formed images. The two source point model of Figure 11 can be generalized to an arbitrary number of source points. The final image is an intensity superposition of a number of coherent images. This suggests that partially coherent imaging systems behave as a number of redundant coherent imaging systems, each having a slightly different amplitude spatial frequency transfer function due to the relative shift of the pupil filter with respect to the object Fraunhofer pattern. As the number of point sources is increased to infinity, the primary source approaches an infinite extent spatially incoherent source. In practice, the source need not be infinite. When the source is large enough to effectively produce linear-in-intensity imaging, the imaging system is effectively spatially incoherent. The corresponding image intensity in the spatial frequency domain, $\tilde{I}_{inc}(f_x)$, is given by

$\tilde{I}_{inc}(f_x) = \tilde{I}_{obj}(f_x)\,\mathrm{OTF}(f_x)$   [9]

where $\tilde{I}_{obj}(f_x)$ is the spatial Fourier transform of the object intensity transmittance and $\mathrm{OTF}(f_x)$ is the incoherent optical transfer function, which is proportional to the spatial autocorrelation of the pupil function. Many texts include detailed discussions comparing the coherent transfer function and the OTF and attempt to draw conclusions about the relative performance of coherent and incoherent systems. These comparisons are often misleading, since one transfer function describes the wave amplitude spatial frequency response and the other describes the intensity spatial frequency response. Here we note that incoherent imaging systems do indeed allow higher object amplitude spatial frequencies to participate in the image formation process. This argument is often used to support the claim that incoherent systems have higher resolution. However, both systems have the same image intensity spatial frequency cutoff. Furthermore, the nature of the coherent transfer function tends to produce high-contrast images that are typically interpreted as higher-resolution images than their incoherent counterparts. Perhaps the real conclusion is that the term resolution is not well defined, and direct comparisons between coherent and incoherent imaging must be treated carefully. The frequency domain treatment of partially coherent imaging of two-dimensional objects involves a four-dimensional spatial frequency transfer function that is sometimes called the bilinear transfer function or the transmission cross-coefficient model. This model describes how constituent object wave amplitude spatial frequencies interact to form image plane intensity frequencies. The utility of this approach is limited for someone new to the field, but it is often used in numerical simulations of the partially coherent imaging systems used in optical lithography.
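The coherent filtering model of eqn [8] can be sketched directly with discrete Fourier transforms. In the fragment below, the binary grating object, the pupil cutoff, and all parameter values are illustrative assumptions:

```python
import numpy as np

# Sketch of eqn [8]: coherent imaging as pupil-plane filtering of the object
# spectrum, followed by square-law detection at the image plane.
N, L = 2048, 1.0                             # samples and field width (arb. units)
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
fx = np.fft.fftfreq(N, d=L / N)              # spatial frequency grid

obj = (np.sign(np.sin(2 * np.pi * 40 * x)) + 1) / 2   # amplitude grating, 40 cyc/unit
H = (np.abs(fx) <= 60).astype(float)         # pupil: passes |fx| <= 60 cyc/unit

U_img = np.fft.ifft(np.fft.fft(obj) * H)     # eqn [8]: O~(fx) H(fx), back to space
I_img = np.abs(U_img)**2                     # detector records intensity

print("object intensity swing:", obj.max() - obj.min())
print("image intensity swing :", round(float(I_img.max() - I_img.min()), 3))
# The grating fundamental (40 cyc/unit) passes the pupil, so the image retains
# strong modulation; raising the grating frequency above 60 would remove it.
```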

Experimental Examples of Important Coherence Imaging Phenomena

Perhaps the best way to gain an understanding of coherence phenomena in imaging is to examine experimental results. In the following section we use experimental data to see how coherence affects edge response, noise immunity, and depth of field. Several experimental configurations were used to collect the image data, but all of the systems can be represented generically by the Kohler illumination system shown in Figure 12. Kohler illumination is often employed in projection illumination systems. In order to obtain Kohler illumination, the primary spatially incoherent source is imaged onto the pupil plane of the imaging portion of the system and the object is placed at the pupil plane of the condenser optics.

Primary Source Generation

Figure 13 shows a highly coherent illumination system produced by focusing a laser beam to a point and imaging the focused spot onto the pupil of the imaging system to produce spatially coherent illumination. The use of a laser produces highly temporally coherent light. Two methods were used to create an extended spatially incoherent primary source with control over the spatial intensity distribution. Figure 14a shows a collimated laser beam with a 633 nm center wavelength illuminating a rotating diffuser. A photographically produced mask defines the spatial shape of the primary source. The laser provides highly temporally coherent light and the diffuser destroys the spatial coherence of the light. Consider the thought experiment of two pinholes placed immediately to

Figure 12 General representation of the Kohler illumination imaging systems used in the experimental result section. The primary incoherent source is imaged onto the imaging system pupil plane and the object resides in the pupil of the condenser optics.

Figure 13 Highly spatially and temporally coherent illumination produced by imaging a focused laser beam into the imaging system pupil.

Figure 14 Two methods for generating a spatially incoherent primary source. (a) An expanded laser beam passes through a moving diffuser followed by a mask to define the extent and shape of the source. (b) A broadband source exits a multifiber lightguide and propagates to a moving diffuser followed by a source mask.

the right of the moving diffuser. Without the diffuser, the two wave amplitudes would be highly correlated. Assuming that the diffuser can be modeled as a spatially random phase plate, a fixed diffuser would only introduce a fixed phase difference between the amplitudes leaving the two pinholes and would not destroy the coherence. When the diffuser is rotated, the light from each pinhole encounters a different phase modulation that is changing over time. This random modulation destroys the effective correlation between the wave amplitude of the light leaving the two pinholes provided that the rotation speed is sufficiently fast. The moving diffuser method is light inefficient but is a practical way of exploring coherence in imaging in a laboratory environment. The choice of the diffuser is critical. The diffuser should spread light out uniformly over an angular subtense that overfills the object of interest. Many commercially available diffusers tend to pass too much light in the straight through direction. Engineered diffusers can be purchased to produce an optimally diffused beam. When quick and inexpensive results are required, thin plastic sheets used in day-to-day packaging often serve as excellent diffusers. When the diffusion angles are not high enough, a number of these plastic sheets can be layered on top of each other to achieve the appropriate angular spread. The second method for producing a spatially incoherent source is shown in Figure 14b. Broadband


light is delivered through a large multifiber lightguide and illuminates a moving diffuser. The main purpose of the diffuser is to ensure that the entire object is illuminated uniformly. The low temporal coherence of the source and the long propagation distances are usually enough to destroy any spatial coherence at the plane of the diffuser. The rotation further ensures that no residual spatial coherence exists at the primary source plane. A chromatic filter centered around 600 nm, with a spectral width of approximately 100 nm, is shown in the light path. The filter was used to minimize the effects of chromatic aberrations in the system. The wide spectral width certainly qualifies as a broadband source in relation to a laser. High-contrast photographically defined apertures determine the spatial intensity distribution at the plane of the primary source. In the following results, the primary source distributions were restricted to circular sources of varying sizes as well as annular

sources. These shapes are representative of most of the sources employed in microscopy, machine vision, and optical lithography.
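The pinhole thought experiment above can be mimicked numerically. The sketch below (illustrative parameters, not the laboratory values) models the moving diffuser as a sequence of statistically independent random phase screens: a single screen yields fully developed speckle with contrast near unity, while intensity-averaging M independent screens during the exposure reduces the contrast roughly as 1/sqrt(M):

```python
# Sketch: a moving diffuser modeled as independent random phase screens.
import numpy as np

rng = np.random.default_rng(0)
N = 256

def speckle_intensity():
    """One speckle pattern: random phase screen, low-pass filtered by a
    small imaging aperture, then detected as intensity."""
    field = np.exp(1j * 2 * np.pi * rng.random((N, N)))
    F = np.fft.fft2(field)
    fy, fx = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N), indexing="ij")
    F[fx**2 + fy**2 > 0.05**2] = 0.0          # imaging aperture cutoff
    return np.abs(np.fft.ifft2(F)) ** 2

def contrast(I):
    return I.std() / I.mean()

for M in (1, 4, 16, 64):                      # frames averaged per exposure
    I = sum(speckle_intensity() for _ in range(M)) / M
    print(f"M = {M:3d}: speckle contrast C = {contrast(I):.2f}"
          f"  (1/sqrt(M) = {1/np.sqrt(M):.2f})")
```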

Figure 15 High temporal coherence imaging with disk sources corresponding to (a) extremely high spatial coherence (K = 0.05) and (b) slightly reduced but high spatial coherence (K = 0.1).

Figure 16 High spatial coherence disk illumination (K = 0.1) imaging through a dust-covered surface with (a) narrowband laser illumination and (b) with broadband illumination.

Noise Immunity

Coherent imaging systems are notorious for introducing speckle-like noise artifacts at the image. Dust and optically rough surfaces within an optical system result in a complicated textured image plane intensity distribution that is often called speckle noise. We refer to such effects as coherent artifact noise. Figure 15a shows the image of a standard binary target as imaged by a benchtop imaging system, with the highly coherent illumination method shown in Figure 13. Some care was taken to clean individual lenses and optical surfaces within the system and no dust was purposely introduced. The image is corrupted by a complicated texture that is due in


part to imperfections in the laser beam itself and in part to unwanted dust and optically rough surfaces. Certainly, better imaging performance can be obtained with more attention to surface cleanliness and laser beam filtering, but the result shows that it can be difficult to produce high-quality coherent images in a laboratory setting. Lower temporal coherence and lower spatial coherence will reduce the effect of speckle noise. The image of Figure 15b was obtained with laser illumination according to Figure 14a with a source mask corresponding to a source-to-pupil diameter ratio of K = 0.1. The object plane illumination is still highly temporally coherent and the spatial coherence has been reduced but is still very high. The artifact noise has been nearly eliminated by the modest reduction of spatial coherence. The presence of diagonal fringes in some regions is a result of multiple reflections produced by the cover glass in front of


the CCD detector. The reflections produce a weak secondary image that is slightly displaced from the main image, and the high temporal coherence allows the two images to interfere. The noise performance was intentionally perturbed in the image of Figure 16a by inserting a slide with a modest sprinkling of dust at an optical surface in between the object and the pupil plane. The illumination conditions are the same as for the image of Figure 15b. The introduction of the dust has further degraded the image. The image of Figure 16b maintains the same spatial coherence (K = 0.1) but employs the broadband source. The lower temporal coherence eliminates the unwanted diagonal fringes, but the speckle noise produced by the dust pattern does not improve significantly relative to the laser illumination K = 0.1 system. Figures 17a–c show that increasing the source size (and hence the range of angular illumination) reduces

Figure 17 Low temporal coherence imaging through a dust-covered surface with various source sizes and shapes: (a) disk source with K = 0.3; (b) disk source with K = 0.7; (c) disk source with K = 2.0; and (d) a thin annular source with an outer diameter corresponding to K = 0.5 and inner diameter corresponding to K = 0.45.


the spatial coherence and virtually eliminates the unwanted speckle pattern. Finally, Figure 17d shows that an annular source can also provide some noise immunity. The amount of noise immunity is related to the total area of the source rather than the outer diameter of the annular source. Seen from the spatial frequency model, each source point produces a noise-corrupted coherent image. Since the effect of the noise is different for each coherent image, the image plane noise averages out as the images add incoherently. This perspective suggests that an extended source provides redundancy in the transfer of object information to the image plane.

Edge Response

Spatial coherence has a strong impact on the images of edges. The sharp spatial frequency cutoff of spatially coherent imaging systems creates oscillations at the images of an edge. Figure 18 shows slices of experimentally gathered edge imagery as a function of the ratio of source diameter to pupil diameter. Higher-coherence systems produce sharper edges, but tend to have overshoots. The sharper edges contribute to the sense that high-coherence systems produce higher-resolution images. One advantage of an incoherent imaging system is that the exact location of the edge corresponds to the 50% intensity location. The exact location of an edge in a partially coherent imaging system is not as easily determined. The presence or lack of edge ringing can be used to assess whether a given imaging system can be modeled as spatially incoherent or partially coherent.
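These edge effects are easy to reproduce in one dimension. In the sketch below (an illustrative model with an assumed pupil cutoff, not the laboratory parameters), the same edge is imaged coherently by filtering the amplitude and incoherently by filtering the intensity with the OTF; the coherent image rings and does not pass through 50% intensity at the true edge location:

```python
# Sketch: 1D edge imaged coherently (amplitude filtered by a hard pupil
# cutoff -> ringing) versus incoherently (intensity filtered by the OTF
# -> smooth, with the true edge at the 50% intensity point).
import numpy as np

N = 4096
x = np.arange(N)
edge = (x >= N // 2).astype(float)            # unit-amplitude edge
f = np.fft.fftfreq(N)
pupil = (np.abs(f) <= 0.02).astype(float)

# Coherent: filter the amplitude, then detect intensity.
I_coh = np.abs(np.fft.ifft(np.fft.fft(edge) * pupil)) ** 2

# Incoherent: filter the intensity spectrum with the normalized OTF.
otf = np.fft.ifft(np.abs(np.fft.fft(pupil)) ** 2).real
otf /= otf[0]
I_inc = np.fft.ifft(np.fft.fft(edge ** 2) * otf).real

print("coherent overshoot  :", I_coh.max())          # > 1 -> ringing
print("incoherent overshoot:", I_inc.max())          # ~ 1 -> no ringing
print("intensity at true edge (coherent)  :", I_coh[N // 2])  # ~ 0.25
print("intensity at true edge (incoherent):", I_inc[N // 2])  # ~ 0.5
```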

Depth of Field

Coherent imaging systems exhibit an apparent increase in depth-of-field compared to spatially incoherent systems. Figure 19 shows spatially incoherent imagery for four different focus positions. Figure 20 shows imagery with the same amount of defocus produced with highly spatially coherent light. Finally, Figure 21 gives an example of how defocused imagery depends on the illumination coherence. The images of a spoke target were all gathered with a fixed amount of defocus and the source size was varied to control the illumination coherence. The images of Figure 21 differ greatly, even though the CPSF was the same for all the cases.

Figure 18 Slices of edge intensity images of high contrast edges taken by a laboratory system with low temporal coherence and (a) high spatial coherence (disk source with K = 0.1) and (b) effectively incoherent (disk source with K = 2).

Digital Post-detection Processing and Partial Coherence

The model of Figure 2 suggests that post-detection image processing can be considered as part of the image formation system. Such a general view can result in imaging that might be otherwise unobtainable by classical means. In fact, microscopists routinely use complicated deblurring methods to reconstruct out-of-focus imagery and build up three-dimensional images from multiple image slices. The coherence of the illumination should be considered when undertaking such image restoration. Rigorous restoration of partially coherent imagery is computationally intensive and requires precise knowledge of the coherence. In practice, a linear-in-intensity model for the image formation is almost always used in developing image restoration algorithms. Even nonlinear restoration algorithms have built-in linear assumptions about the image formation models which imply spatially incoherent illumination.


Figure 19 Spatially incoherent imaging for (a) best focus, (b) moderate misfocus, and (c) high misfocus.

The images of Figures 22a and b are digitally restored versions of the images of Figure 19b and Figure 20b. The point spread function was directly measured and used to create a linear restoration filter that presumed spatially incoherent illumination. The restored image in Figure 22a is faithful since the assumption of spatially incoherent light was reasonable. The restored image of Figure 22b suffers from a loss in fidelity since the actual illumination coherence was relatively high. The image is visually pleasing and correctly conveys the presence of three bar targets. However, the width of the bars is not faithful to the original target which had spacings equal to the widths of the bars. As discussed earlier, the ideal image is not always the optimal image for a given task. In general, restoration of partially coherent imaging with an implicit spatially incoherent imaging model will produce visually pleasing images that are not necessarily faithful to the object plane intensity.

While they are not faithful, they often preserve and even enhance edge information and the overall image may appear sharper than the incoherent image. When end task performance is improved, the image may be considered to be more optimal than the ideal image. It is important to keep in mind that some of the spatial information may be misleading and a more complete understanding of the coherence may be needed for precise image plane measurements. This warning is relevant in microscopy where the user is often encouraged to increase image contrast by reducing illumination spatial coherence. Annular sources can produce highly complicated spatial coherence functions that will strongly impact the restoration of such images.
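The restorations of Figure 22 can be emulated with any linear-in-intensity deconvolution. The fragment below is a minimal sketch: the article does not specify the restoration filter, so a Wiener filter with an assumed noise-to-signal ratio stands in for the measured-PSF filter described above; all sizes and the Gaussian defocus PSF are illustrative:

```python
# Sketch: linear restoration from a "measured" PSF under an assumed
# linear-in-intensity (spatially incoherent) imaging model.
import numpy as np

def wiener_restore(image, psf, nsr=1e-2):
    """Restore `image` assuming image = object_intensity (*) psf + noise.
    `nsr` is an assumed noise-to-signal power ratio."""
    H = np.fft.fft2(np.fft.ifftshift(psf), s=image.shape)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)   # Wiener inverse filter
    return np.real(np.fft.ifft2(np.fft.fft2(image) * W))

# Toy usage with two bars and a defocus-like Gaussian PSF:
rng = np.random.default_rng(1)
obj = np.zeros((128, 128))
obj[48:80, 40:50] = 1.0
obj[48:80, 60:70] = 1.0
yy, xx = np.mgrid[-64:64, -64:64]
psf = np.exp(-(xx**2 + yy**2) / (2 * 4.0**2))
psf /= psf.sum()
blurred = np.real(np.fft.ifft2(np.fft.fft2(obj) *
                               np.fft.fft2(np.fft.ifftshift(psf))))
blurred += 0.01 * rng.standard_normal(blurred.shape)
restored = wiener_restore(blurred, psf)
```

Applied to imagery that was actually gathered with high spatial coherence, such a filter can be expected to behave like Figure 22b: visually sharp bars whose widths are not faithful to the object.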

Summary and Discussion

The coherence of the illumination at the object plane is important in understanding image formation.


Figure 20 High spatial coherence disk source (K = 0.2) imaging for (a) best focus, (b) moderate misfocus, and (c) high misfocus.

Highly coherent imaging systems produce high-contrast images with high depth of field and provide the opportunity for sophisticated manipulation of the image with frequency plane filtering. Darkfield imaging and phase contrast imaging are examples of frequency plane filtering. Unfortunately, coherent systems are sensitive to optical noise and are generally avoided in practical system design. Relaxing the temporal coherence of the illumination can provide some improvement, but reducing the spatial coherence is more powerful in combating noise artifacts. Research microscopes, machine vision systems, and optical lithography systems are the most prominent examples of partially coherent imaging systems. These systems typically employ adjustable spatially incoherent extended sources in the shapes of disks or annuli. The exact shape and size of the primary source shapes the angular extent of the illumination at the object plane and determines the spatial coherence at the object plane. Spatial coherence

effects can be significant, even for broadband light. Control over the object plane spatial coherence allows the designer to find a tradeoff between the various strengths and weaknesses of coherent and incoherent imaging systems. As more imaging systems employ post-detection processing, there is an opportunity to design fundamentally different systems that effectively split the image formation process into a physical portion and a post-detection portion. The simple example of image deblurring cited in this article shows that object plane coherence can affect the nature of the digitally restored image. The final image can be best understood when the illumination coherence effects are well understood. The spatially incoherent case results in the most straightforward model for image interpretation, but is not always the best choice since the coherence can often be manipulated to increase the contrast and hence the amount of useful information in the raw image. A completely general approach to


Figure 21 Imaging with a fixed amount of misfocus and varying object spatial coherence produced by: (a) highly incoherent disk source illumination (K = 2); (b) moderate coherence disk source (K = 0.5); (c) high coherence disk source (K = 0.2); and (d) annular source with outer diameter corresponding to K = 0.5 and inner diameter corresponding to K = 0.45.

Figure 22 Digital restoration of blurred imagery with an inherent assumption of linear-in-intensity imaging for (a) highly spatially incoherent imaging (K = 2) and (b) high spatial coherence (K = 0.2).


imaging would treat the coherence of the source, the imaging optics, and the post-detection restoration all as free variables that can be manipulated to produce the optimal imaging system for a given task.

See also

Coherent Lightwave Systems. Information Processing: Coherent Analogue Optical Processors. Terahertz Technology: Coherent Terahertz Sources.

Further Reading

Bartelt H, Case SK and Hauck R (1982) Incoherent Optical Processing. In: Stark H (ed.) Applications of Optical Fourier Transforms, pp. 499–535. London: Academic Press.
Born M and Wolf E (1985) Principles of Optics, 6th edn. (corrected), pp. 491–554. Oxford, UK: Pergamon Press.

Bracewell RN (1978) The Fourier Transform and its Applications. New York: McGraw Hill.
Gonzalez RC and Woods RE (1992) Digital Image Processing, 3rd edn., pp. 253–304. Reading, MA: Addison-Wesley.
Goodman JW (1985) Statistical Optics, 1st edn. New York: John Wiley & Sons.
Goodman JW (1996) Introduction to Fourier Optics, 2nd edn. New York: McGraw Hill.
Hecht E and Zajac A (1979) Optics, 1st edn., pp. 397–435. Reading, MA: Addison-Wesley.
Inoue S and Spring KR (1997) Video Microscopy, 2nd edn. New York: Plenum.
Reynolds GO, DeVelis JB, Parrent GB and Thompson BJ (1989) Physical Optics Notebook: Tutorials in Fourier Optics, 1st edn. Bellingham, WA: SPIE; New York: AIP.
Saleh BEA and Teich MC (1991) Fundamentals of Photonics, pp. 343–381. New York: John Wiley & Sons.

Speckle and Coherence

G Häusler, University of Erlangen-Nürnberg, Erlangen, Germany

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Opticists are aware that the amount of coherence plays a significant role in imaging systems: laser speckles are known to add significant noise to the image, as do parasitic interferences from dusty lenses. Optical systems are often called coherent if a laser is used (right), and incoherent if other sources come into play (wrong). Many users of optical systems are unaware that it is not the high temporal coherence but the spatial coherence that commonly afflicts the image quality, and that this parasitic spatial coherence is ubiquitous, even though not obvious. Coherent artifacts can occur without the use of lasers, although speckle noise is more prominent with lasers. Even opticists sometimes underestimate the damage that residual coherent noise can cause, and as laser-oriented sensor funding programs are 'en vogue', nonexperts are disappointed if some metrology device does not include a laser. This encyclopedia addresses the many uses of lasers. In this article, we will discuss the costs of coherence. The incoherent approach commonly pretended in everyday optics may lead to significant quantitative measuring errors of illumination or reflectivity, 3D shape, distance, or size. Spatial

coherence is the dominant source of noise. We will give some rules of thumb to estimate these errors and a few tricks to reduce coherent noise. These rules will help to minimize coherent noise; however, since spatial coherence is ubiquitous, it turns out that there are only limited options available to eliminate it. One of the options to build good sensors that measure shape, reflectivity, etc., is to avoid the use of lasers! To become familiar with some basics of the theory of coherence, we refer the reader to the Further Reading section at the end of this article. Coherence can be a boon or a disaster for the opticist, as is explained in other articles of this encyclopedia about interferometry, diffraction, and holography. A specific topic is information acquisition from coherently scattered light. An enlightening example is the exploitation of speckles in white light interferometry at rough surfaces and in speckle interferometry. We will briefly discuss white light interferometry at rough surfaces in the section on speckles as carriers of information below.

Practical Coherence Theory

A major issue of this article will be the corrupting properties of coherence in the daily life of an optical metrologist. We will demonstrate that 'speckle' noise is ever present, and essentially unavoidable, in the images of (diffusely reflecting) objects. Its influence on the quality of optical measurements leads to a lower limit of the physical measuring uncertainty.


Figure 1 Ground glass in sunlight: (a) the image of a ground glass, illuminated by the sun, observed with a small aperture (from a large distance), displays no visible speckles; (b) a medium observation aperture (smaller distance) displays weakly visible speckles; (c) observation with very high aperture displays strong-contrast speckles; and (d) the image of the ground glass on a cloudy day does not display speckles, due to the very big aperture of illumination. Reproduced with permission from Häusler G (1999) Optical sensors and algorithms for reverse engineering. In: Donah S (ed.) Proceedings of the 3rd Topical Meeting on Optoelectronic Distance/Displacement Measurements and Applications. IEEE-LEOS.

This 'coherent noise' limit is, not surprisingly, identical to Heisenberg's limit. We will start by summarizing the important results, and will give some simple explanations later.

Some Basic Observations

(i) The major source of noise in optical measurements is spatial coherence; the temporal coherence status generally does not play a significant role. (ii) An optically rough surface displays subjective speckles with contrast $C = 1$ if the observation aperture $\sin u_o$ is larger than the illumination aperture $\sin u_q$. (iii) The speckle contrast is $C \sim \sin u_o/\sin u_q$ if the observation aperture is smaller than the illumination aperture. Figure 1 illustrates the situation by a simple experiment. The results of Figure 1 can be explained simply by summarizing some 80 pages of coherence theory in a nutshell, by useful simplifications and approximations (Figure 2). If the source is at an infinite distance, the coherence function no longer depends on the two variables but just on the slit distance $d = |x_1 - x_2|$. In this case, the coherence function $G(d)$ can be easily calculated as the Fourier transform of the spatial intensity

Figure 2 Basic experiment for spatial coherence. An extended source, such as the sun or some incandescent lamp, illuminates a double slit from a large distance. On a screen behind the double slit we can observe 'double slit interference', if the waves coming from $x_1$ and $x_2$ display coherence. The contrast of the interference fringes is given by the magnitude of the coherence function G (neglecting some normalization factor). G is a function of the two slit locations $x_1$ and $x_2$.

distribution of the source (van Cittert–Zernike theorem). Let us assume a one-dimensional source with an angular size of $2u_q$, where $u_q$ is commonly called the aperture of the illumination. In this case, the coherence function will be:

$G(d) \sim \operatorname{sinc}(2 u_q d/\lambda)$  [1]

The width of the coherence function, which gives the size of the coherently illuminated area (coherence area), can be approximated from eqn [1]:

$d_G = \lambda/u_q$  [2]

Equation [1] includes some approximations: for circular sources such as the sun, we get an additional


factor 1.22 for the width of the coherence function – the sinc-function has to be replaced by the Bessel function $J_1(r)/r$. There is another approximation: we should replace the aperture angle $u_q$ by $\sin u_q$ for larger angles. We choose the wavelength of the maximum sensitivity of the eye, $\lambda = 0.55\ \mu\text{m}$, which is also the wavelength of the maximum emission of the sun. In conclusion, if two points at the object are closer than $d_G$, these points will be illuminated with a high degree of coherence. The light waves scattered from these object locations can display interference contrast. Specifically, they may cancel each other out, causing a low signal-to-noise ratio of only 1:1. This does not happen if the points have a distance larger than $d_G$. From Figure 3 we learn that the width of the spatial coherence function from sunlight illumination at the earth's surface is about 110 µm. (Stars more distant than the sun appear smaller, and hence have a wider coherence function at the earth's surface. Michelson was able to measure the angular size of some close stars by measuring the width of this coherence function, which was about $d_G \approx 10$ m.) Figure 4 again displays an easy to perform experiment: we can see speckles at sunlight illumination. We can observe speckles as well in shops, because they often use small halogen spot illumination. So far we have only discussed the first stage (from the source to the object). We still have to discuss the

Figure 5 Coherence in a nutshell: spatial coherence and observer.

way from the object to the retina (or CCD-chip of a video camera). This is illustrated by Figure 5. As an object, we choose some diffusely reflecting or transmitting surface (such as a ground glass). The coherence function at the ground glass has again a width of $d_G$. What happens if we image this ground glass onto our retina? Let us assume our eye to be diffraction limited (which, in fact, it is at sunlight conditions, where the eye pupil has a diameter $F_{\text{pupil}} = 2.5$ mm or less). Then a point $x_1$ at the ground glass will cause a diffraction spot at the retina, with a size of:

$d'_{\text{diffr}} = \lambda/u'_o = 8\ \mu\text{m}$  [3]

The image-side aperture of observation $u'_o$ of the eye is calculated from $F_{\text{pupil}}$ and from its focal length $f_{\text{eye}} = 18$ mm as $u'_o = F_{\text{pupil}}/2f_{\text{eye}} = 0.07$. If we project the diffraction spot at the retina back onto the object, we get its size $d_{\text{diffr}}$ from:

$d_{\text{diffr}} = \lambda/u_o = \lambda/(F_{\text{pupil}}/2z_o)$  [4]

Figure 3 Coherence from sunlight illumination. With the van Cittert–Zernike theorem and an approximate illumination aperture of the sun $u_{q,\text{sun}} \approx 0.005$ (0.25°), we get a width of the coherence function at the Earth's surface of about $d_G \approx 110\ \mu\text{m}$.

where $z_o$ is the observation distance of the object from the eye. We call $u_o = F_{\text{pupil}}/2z_o$ the object-side observation aperture. Let us calculate the laterally resolvable distance $d_{\text{diffr}}$ at the object, at a distance $z_o = 250$ mm, which is called the 'comfortable viewing distance':

$d_{\text{diffr}}(z_o = 250\ \text{mm}) = 110\ \mu\text{m}$  [5]

Figure 4 Finger nail in sunlight, with speckles.
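A quick numeric check of eqns [2]–[5] (a sketch; values rounded):

```python
# Numeric check of eqns [2]-[5] (illustrative):
lam = 0.55e-6                      # wavelength of maximum eye sensitivity [m]
u_q_sun = 0.005                    # angular radius of the sun [rad]
d_G = lam / u_q_sun                # coherence width at the Earth's surface
print(f"d_G = {d_G*1e6:.0f} um")   # ~110 um

F_pupil, f_eye, z_o = 2.5e-3, 18e-3, 0.25     # pupil, focal length, 250 mm
u_o_img = F_pupil / (2 * f_eye)               # image-side aperture
print(f"u'_o = {u_o_img:.2f}, d'_diffr = {lam/u_o_img*1e6:.0f} um")  # ~8 um

u_o = F_pupil / (2 * z_o)                     # object-side aperture
print(f"d_diffr(250 mm) = {lam/u_o*1e6:.0f} um")   # ~110 um, equal to d_G
```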

After these preparations we come to the crucial issue: how does the image at the retina appear if we cannot resolve distances at the object smaller than $d_G$, and how does it appear if the resolution of the eye is sufficient to resolve distances smaller than $d_G$? Let us start with the first assumption: $d_{\text{diffr}} \gg d_G$. Now the images of points at the object, over an area of the diffraction spot, are more or less incoherently averaged at the retina, so we will see little interference contrast or 'speckles'. From eqn [4] we see that


this incoherent averaging starts for distances larger than 250 mm, which is the comfortable viewing distance. Adults cannot see objects from much closer distances. We generally do not see speckles, except just at the limit of maximum accommodation. For larger distances, if we take $z_o = 2500$ mm, the resolvable distance at the object will be 1.1 mm, which is 10 times larger than the diameter of the coherence area. Averaging over such a large area will drop the interference contrast by a factor of 10. Note that such a small interference contrast might not be visible, but it is not zero! Coming to the second assumption, we can laterally resolve distances smaller than $d_G$. In order to understand this, we first have to learn what an 'optically rough' object surface is. Figure 6 illustrates the problem. A surface is commonly called 'rough' if the local height variations are larger than $\lambda$. However, a surface appears rough only if the height variation within the distance that can be resolved by the observer is larger than $\lambda$ (reflected light will travel the distance twice, hence a roughness of $\lambda/2$ will be sufficient). Then the different scattered phasors, or 'Huygens elementary waves':

$u_s \sim \exp[i\,2kz(x, y)]$  [6]


scattered from the object at $x, y$ may have big phase differences, so that destructive interference (and speckle) can occur in the area of the diffraction image. So the attribute 'rough' depends on the object as well as on the observation aperture. With a high observation aperture (microscope), the diffraction image is small, and within that area the phase differences might be small as well. So a ground glass may then look locally like a mirror, while with a small observation aperture it appears 'rough'. With the 'rough' observation mode assumption, we can summarize what we have assumed: with resolving distances smaller than the coherence width $d_G$ we will see high interference contrast; in fact we see a speckle contrast of $C = 1$, if the roughness of the object is smaller than the coherence length of the source. This is the case for most metal surfaces, for ground glasses, or for worked plastic surfaces. The assumption is not true for 'translucent' surfaces, such as skin, paper, or wood. This will be discussed below. The observation that coherent imaging is achieved if we can resolve distances smaller than the coherence width $d_G$ is identical to the simple rule that fully coherent imaging occurs if the observation aperture $u_o$ is larger than the illumination aperture $u_q$. As mentioned, we will incoherently average in the image plane over some object area determined by the size of the diffraction spot. According to the rules of speckle averaging, the speckle contrast C decreases with the inverse square root of the number N of incoherently averaged speckle patterns. This number N is equal to the ratio of the area $A_{\text{diff}}$ of the diffraction spot divided by the coherence area $A_G = d_G^2$. So we obtain for the speckle contrast C:

$C = 1$, for $u_q < u_o$  [7a]

$C = 1/\sqrt{N} = u_o/u_q$, for $u_q \geq u_o$  [7b]
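Eqns [7a,b] condense into a one-line rule. The helper below (a sketch) evaluates it for the sunlit-scene example above: the naked eye at a 2.5 m viewing distance has $u_o \approx 0.0005$, so with $u_q \approx 0.005$ the residual contrast is about 0.1 – weak, but not zero:

```python
# Speckle contrast per eqns [7a,b] (sketch):
def speckle_contrast(u_q, u_o):
    """Contrast C for illumination aperture u_q, observation aperture u_o."""
    return 1.0 if u_q < u_o else u_o / u_q

print(speckle_contrast(u_q=0.005, u_o=0.0005))   # ~0.1, cf. the text above
```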

We summarize the results in Figure 7. Equation [7b] has an interesting consequence: we never get rid of speckle noise, even for large illumination apertures and small observation apertures. In many practical instruments, such as a slide projector, the illumination 'aperture stop' cannot be greater than the imaging lens. Fortunately, the observer's eye commonly has a pupil smaller than that of the projector, and/or looks from a distance at the projection screen. Laser projection devices, however, cause strong and disturbing speckle effects for the user, and significant effort is invested to cope with this effect.

Figure 6 What is a rough surface?

Figure 7 Coherence in a nutshell: what are the conditions for incoherent and coherent imaging?


Figure 8 Microfiche projection. For high observation aperture (a) strong speckles occur, and for small observation aperture (b), the speckle contrast is low.

In Figure 8, an image is depicted from a microfiche reading projector. Figure 8a displays the close-up look (high observation aperture) and Figure 8b depicts a more distant view (small observation aperture). The amount of noise is much less in the second case, as it should be according to the rules of eqn [7]. Let us finish this section with some speculation. How would a fly see the sunny world, with a human-type eye? With high observation aperture and at short distance, the world is full of speckle noise. Fortunately, nature invented the facet eye for insects, as depicted in Figure 9.

Figure 9 Facet eye and ‘human’ eye. The facet eye consists of many low aperture lenses, in contrast to the human eye.

Speckle Limits of Metrology

The consequences of the effects discussed above are often underestimated. If some effect in nature that disturbs us – like our coherent noise – is ubiquitous, we should suspect that there might be some deep underlying principle that does not allow us to know everything about the object under observation. Indeed, it turns out that Heisenberg's uncertainty principle is strongly connected with coherent noise (Figure 10). We can see this from the following experiment: a laser spot is projected onto a ground glass and imaged with high magnification by a video camera with an aperture $\sin u_o$. The ground glass is macroscopically planar. When the ground glass is laterally shifted, we find that the observed spot is 'dancing' at the video target. Its observed position is not constant, although its projected position is. It turns out that the standard deviation of the observed position is equal to the uncertainty calculated from the aperture by Heisenberg's principle. We can calculate the limit for the distance measuring uncertainty from speckle theory, as well as from Heisenberg's principle (within some factor of order 1):

$\delta z \cdot \delta p_z \geq h/4\pi$  [8]

Figure 10 Coherence makes it impossible to localize objects with very high accuracy. Laser spots projected onto a ground glass cannot be imaged without some uncertainty of the lateral position. The four pictures of the spot images, taken with different lateral positions of the ground glass, display the cross bar (true position) and the apparent position of the spot images. The position uncertainty is equal to the uncertainty calculated from Heisenberg’s uncertainty principle.

where $\delta p_z$ is the uncertainty of the photon momentum $h/\lambda$ in the z-direction (along the optical axis of the measuring system) and h is Planck's constant. For a small measurement uncertainty of the distance, we should allow a big uncertainty $\delta p_z$. This can be achieved by a large aperture of the observation system, giving the photons a wide range of possible directions to the lens. We can also allow different


wavelengths, and come to white light interferometry; see the section on speckles as carriers of information below. The result is – not surprisingly – the same as Rayleigh found for the depth of field:

$\delta z = \lambda/\sin^2 u_o$  [9]

Coherent noise is the source of the fundamental limit of the distance measuring uncertainty $\delta z$ of triangulation-based sensors:

$\delta z = C \cdot \lambda/(\sin u_o \sin u)$  [10]

where C is the speckle contrast, $\lambda$ is the wavelength, $\sin u_o$ is the aperture of observation, and u is the angle of triangulation (between the direction of the projection and the direction of observation). For a commercially available laser triangulation sensor (see Figure 11), with $\sin u_o \approx 0.01$, $C = 1$ and $u = 30°$, the measuring uncertainty $\delta z$ will be larger than 100 µm. We may add that for $\sin u_o = \sin u$, which is valid for auto-focus sensors (such as the confocal scanning microscope), eqn [10] degenerates to the well-known Rayleigh depth of field (eqn [9]). The above results are remarkable in that we cannot know the accurate position of the projected spot or the position of an intrinsic reflectivity feature; we also cannot know the accurate local reflectivity of a coherently illuminated object. A further consequence is that we do not know the accurate shape of an object in 3D-space. We can calculate this 'physical measuring uncertainty' from the considerations above. These consequences hold for optically rough surfaces, and for measuring principles that exploit the local intensity of some image, such as with all triangulation types of sensors. The consequences of these considerations are depressing: triangulation with a strong laser is not better than triangulation with only one single photon. The deep reason is that all coherent photons stem from the same quantum mechanical phase cell and are indistinguishable. Hence, many photons do not supply more information than one single photon.

Figure 11 Principle of laser triangulation. The distance of the projected spot is calculated from the location of the speckled spot image and from the angle of triangulation. The speckle noise introduces an ultimate limit of the achievable measuring uncertainty. Reproduced with permission from Physikalische Blätter (May 1997): 419.

Why Are We Not Aware of Coherence

Generally, coherent noise is not well visible; even in sophisticated technical systems the visibility might be low. There are two main reasons: first, the observation aperture might be much smaller than the aperture of illumination. This is often true for large-distance observation, even with small apertures of illumination. The second reason holds for technical systems, if the observation is implemented via pixelized video targets. If the pixel size is much larger than a single speckle, which is commonly so, then, by averaging over many speckles, the noise is greatly reduced. However, we have to take into account that we pay for this by a loss of lateral resolution $1/\delta x$. We can formulate another uncertainty relation:

$\delta x > \lambda/(C \cdot \sin u_o)$  [11]

which says that if we want to reduce the speckle contrast C (noise) by lateral averaging, we can do this, but we lose lateral resolution $1/\delta x$.
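Eqns [10] and [11] are easy to evaluate for the quoted sensor. In the sketch below the wavelength is an assumption (0.55 µm; the article does not state the sensor's wavelength):

```python
# Eqns [10] and [11] evaluated for the quoted triangulation sensor (sketch):
import numpy as np

lam, C, sin_uo, u = 0.55e-6, 1.0, 0.01, np.deg2rad(30)
dz = C * lam / (sin_uo * np.sin(u))           # eqn [10]
print(f"triangulation uncertainty dz ~ {dz*1e6:.0f} um")   # > 100 um

# Eqn [11]: averaging speckle down to C = 0.1 costs lateral resolution.
dx = lam / (0.1 * sin_uo)
print(f"minimum resolvable dx ~ {dx*1e6:.0f} um")
```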

Can We Overcome Coherence Limits?

Since the daily interaction of light with matter is coherent scattering, we can overcome the limit of eqn [10] only by looking for measuring principles that are not based on the exploitation of local reflectivity, i.e., on 'conventional imaging'. Concerning optical 3D-measurements, the principle of triangulation uses the position of some image detail for the calculation of the shape of objects, and has to cope with coherent noise. Are there different mechanisms of photon–matter interaction, with noncoherent scattering? Fluorescence and thermal excitation are incoherent mechanisms. It can be shown (see Figure 12) that triangulation utilizing fluorescent light displays much less noise than given by eqn [10]. This is exploited in fluorescent confocal microscopy. Thermally excited matter emits perfectly incoherent radiation as well. We use this incoherence to measure the material wear in laser processing (see Figure 13), with, again, much better accuracy than given by eqns [9,10]. The sensor is based on triangulation. Nevertheless, by its virtually zero coherent noise, it allows a measuring uncertainty which is limited only by camera noise and other technical imperfections. The uncertainty of the depth measurement through the narrow aperture of the laser nozzle is only a few


microns, beating the limitation of eqn [10] by a factor of about 5, even in the presence of a turbulent plasma spot with a temperature of 3000 K.

Broadband Illumination

Speckle noise can hardly be avoided by broadband illumination. The speckle contrast C is related to the surface roughness $\sigma_z$ and the coherence length $l_c$ by eqn [12]:

$C^2 \sim l_c/2\sigma_z$, for $\sigma_z \gg l_c$  [12]

So, only for very rough surfaces and white light illumination can C be reduced. However, the majority of technical objects are smooth, with a roughness of a few micrometers; hence speckles can be observed even with white light illumination ($l_c \approx 3\ \mu\text{m}$). This situation is different for 'translucent' objects such as skin, paper, or some plastic material.

Speckles as a Carrier of Information

Figure 12 Reduction of speckle noise by triangulation with fluorescent light. A flat metal surface is measured by laser triangulation. Top: surface measured by laser triangulation with full speckle contrast. Bottom: the surface is covered with a very thin fluorescent layer, and illuminated by the same laser, however, the triangulation is done after suppressing the scattered laser light, and utilizing the (incoherent!) fluorescent light. This experiment proves that it is the coherent noise that causes the measuring uncertainty.

Figure 13 A 100 W laser generates a plasma that emits perfectly incoherent radiation at the object surface. The emitted light can be used to measure on-line the distance of the object surface. A triangulation sensor, exploiting the incoherent plasma emission, controls surface ablation with an accuracy of only a few microns. Reproduced with permission from F & M, Feinwerktechnik Mikrotechnik Messtechnik (1995) Issue 9.

So far we discussed the problems of speckles when measuring rough surfaces. We can take advantage of speckles if we do not stick to triangulation as the basic mechanism of distance measurement. We may utilize the fact that, although the phase within each speckle has a random value, this phase is constant over some area. This observation can be used to build an interferometer that works even at rough surfaces. So, with a proper choice of the illumination aperture and observation aperture, and with a surface roughness less than the coherence length, we can measure distances by the localization of the temporal coherence function. This function can easily be measured by moving the object under test along the optical axis and measuring the interference intensity (correlogram) at each pixel of the camera. The signal generating mechanism of this 'white light interferometer' distance measurement is not triangulation but 'time of flight'. Here we do not suffer from the limitations of eqn [10]. The limits of white light rough surface interferometry are discussed in the Further Reading section at the end of this article. The 'coherence radar' and the correlograms of different speckles are depicted in Figures 14 and 15. The physical mechanism of the signal formation in white light interferometry at rough surfaces is different from that of white light interferometry at smooth surfaces. That is why the method is sometimes referred to as 'coherence radar'. We will briefly summarize some advantageous and – probably unexpected – features of the coherence radar.
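The correlogram evaluation can be sketched in a few lines. The following toy model (all parameters assumed: 0.8 µm center wavelength, 3 µm coherence length, a Gaussian envelope) generates the correlogram recorded in one speckle during the depth scan and localizes the envelope peak, which is the 'time-of-flight' distance estimate:

```python
# Toy model of a coherence-radar correlogram and its evaluation (sketch):
import numpy as np

lam, l_c = 0.8e-6, 3e-6                 # center wavelength, coherence length
z = np.linspace(-15e-6, 15e-6, 3001)    # depth-scan positions
z0 = 4.2e-6                             # true surface height in this speckle

envelope = np.exp(-((z - z0) / l_c) ** 2)
I = 1.0 + envelope * np.cos(4 * np.pi * (z - z0) / lam)   # correlogram

# Crude envelope detection: peak of the smoothed, rectified AC part.
ac = np.abs(I - I.mean())
kernel = np.ones(51) / 51
z_est = z[np.argmax(np.convolve(ac, kernel, mode="same"))]
print(f"estimated height: {z_est*1e6:.2f} um (true {z0*1e6:.2f} um)")
```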


We call the real shape of the object, or the surface profile, $z(x, y)$, where z is the local distance at the position $(x, y)$. We give the measured data an index 'm':

(i) The physical measuring uncertainty of the measured data is independent of the observation aperture. This is quite a remarkable property, as it enables us to take accurate measurements at the bottom of deep boreholes. (ii) The standard deviation $\sigma_z$ of the object can be calculated from the measured data $z_m$: $\sigma_z = \langle |z_m| \rangle$. According to German standards, $\sigma_z$ corresponds to the roughness measure $R_q$.

This measure $R_q$ was calculated from measurements at different roughness gauges, with different observation apertures. Two measurements are depicted in Figure 16.

The experiments shown in Figure 16 display the correct roughness measure, even if the higher frequencies of the object spectrum $Z(\nu, \mu)$ (Figure 17) are not optically resolved. This paradox might be explained as follows. From the coherently illuminated object each point scatters a spherical wave. The waves scattered from different object points have, in general, different phases. At the image plane, we see the laterally

Figure 14 ‘Coherence radar’, white light interferometry at rough surfaces. The surface under test is illuminated with high spatial coherence and low temporal coherence. We acquire the temporal coherence function (correlogram) within each speckle, by scanning the object under test in depth.

Figure 15 White light speckles in the x – z plane. The left side displays the acquired correlograms, for a couple of speckles, the right side shows the graphs of some of the correlograms. We see that the correlograms are not located at a constant distance, but display some ‘distance uncertainty’. This uncertainty does not originate from the instrument, but from the roughness of the object.

Figure 16 Optical measurement of the roughness beyond the Abbe resolution limit. The roughness is measured for 4 roughness standards N1– N4, with different observation apertures. The roughness is measured correctly, although the microtopology is not resolved by the observing optics.


Figure 17 Signal generation beyond the Abbe limit. The microtopology of the object $z(x, y)$ is not resolved, due to the large diffraction image with diameter $d_B$. Yet we get information from beyond the bandlimit $1/d_B$ of the observing optics.

averaged complex amplitude $\langle u \rangle$, according to eqn [6]. The averaging is due to the diffraction-limited resolution. Equation [6] reveals that the (spatially) averaged field amplitude $\langle u \rangle$ is a nonmonotonic, nonlinear function of the surface profile $z(x, y)$, if the surface is rough (i.e., $z \gg \lambda$). We do not average over the profile z but over the complex amplitude. As a consequence of this nonlinearity, the limited observation aperture does not only collect the spatial frequencies $\nu, \mu$ of the spatial Fourier spectrum $Z(\nu, \mu)$ within the bandlimit $1/d_B$, but also acquires information beyond this bandlimit. The reason is the 'down conversion' of higher spatial frequencies by nonlinear mixing. Simulations and experiments confirm the consequences. Thus we can evaluate the surface roughness, even without laterally resolving the microtopology. However, there is a principal uncertainty on the measured data. We do not see the true surface but a surface with a 'noise' equivalent to the roughness of the real surface. Nonmonotonic nonlinearities in a sequence of operations may cause 'chaotic' behavior; in other words, small parameter changes such as vibrations, humidity on the surface, etc., may cause large differences in the outcome. In fact, we observe a significant variation of the measured data $z_m(x, y)$ within a sequence of repeated measurements. Our conjecture is that the complex signal formation may be involved in this irregular behavior. The hypothesis may be supported by the observation that much better repeatability can be achieved at specular surfaces, where eqn [6] degenerates to a linear averaging over the surface profile:

$\langle u \rangle \sim 1 + ik\langle z(x, y) \rangle$, for $z \ll \lambda$  [13]
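The nonmonotonic averaging of eqn [6] versus its linearization (cf. eqn [13]) can be made concrete with a small Monte Carlo sketch (illustrative roughness values; a Gaussian height distribution is assumed):

```python
# Sketch: averaged amplitude <u> = <exp(i 2k z)> over an unresolved patch,
# nonlinear in z for rough surfaces and near-linear for smooth ones.
import numpy as np

rng = np.random.default_rng(2)
lam = 0.55e-6
k = 2 * np.pi / lam

for sigma_z in (0.005e-6, 0.05e-6, 0.5e-6):    # patch roughness values
    z = sigma_z * rng.standard_normal(10_000)  # unresolved micro-profile
    u_avg = np.mean(np.exp(1j * 2 * k * z))    # eqn [6], spatially averaged
    u_lin = 1 + 1j * 2 * k * np.mean(z)        # linearization, cf. eqn [13]
    print(f"sigma_z = {sigma_z*1e6:.3f} um: |<u>| = {abs(u_avg):.3f}, "
          f"|linearized| = {abs(u_lin):.3f}")
```

For the smooth case both values stay near unity, while for roughness well above the wavelength the true averaged amplitude collapses toward zero, which is the speckle regime discussed above.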

Summary

Spatial coherence is ubiquitous and unavoidable. Spatial coherence disturbs most optical

measurements, where optically rough surfaces exist. Then, coherence gives the ultimate limit of the achievable measuring uncertainty. There are simple rules to estimate the measuring uncertainty, by calculating the width of the coherence function and the resolution of the observation system. Spatial coherence and Heisenberg's uncertainty principle lead to the same results of measuring uncertainty. On the other hand, the formation of signals from coherent light that is scattered at rough surfaces is quite complex and a strongly nonlinear process, which sometimes might encode information that is not otherwise available. One example appears to be the measurement of roughness with white light interferometry – which is possible even if the microtopology of the surface is not optically resolved.

See also

Coherence: Overview; Coherence and Imaging.

Further Reading

Born M and Wolf E (1999) Principles of Optics, 7th edn. Cambridge: Cambridge University Press.
Brooker G (2003) Modern Classical Optics. Oxford: Oxford University Press.
Dainty JC (ed.) (1984) Laser Speckle and Related Phenomena. Berlin, Heidelberg, New York: Springer Verlag.
Dorsch R, Herrmann J and Häusler G (1994) Laser triangulation: fundamental uncertainty in distance measurement. Applied Optics 33: 1306–1314.
Dresel T, Venzke H and Häusler G (1992) 3D-sensing of rough surfaces by coherence radar. Applied Optics 31: 919–925.
Ettl P (2002) Über die Signalentstehung bei Weißlichtinterferometrie. PhD Dissertation, University of Erlangen-Nürnberg.
Häusler G, Ettl P, Schenk M, Bohn G and Laszlo I (1999) Limits of optical range sensors and how to exploit them. In: Asakura T (ed.) Trends in Optics and Photonics. ICO IV, Springer Series in Optical Sciences, vol. 74, pp. 328–342. Berlin, Heidelberg, New York: Springer Verlag.
Häusler G and Herrmann J (1992) Physical limits of 3D-sensing. Proceedings of the SPIE Conf. 1822. In: Svetkov D (ed.) Optics, Illumination, and Image Sensing for Machine Vision VII, pp. 150–158. Boston, MA: SPIE.
Klein M and Furtak TE (1986) Optics. New York: John Wiley & Sons.
Klinger P, Spellenberg B, Herrmann JM and Häusler G (2001) In Process 3D-Sensing for Laser Material Processing. Proceedings of Third Int. Conf. on 3-D Digital Imaging and Modeling, Quebec City, Canada. Los Alamitos: IEEE Computer Society, 38–41.


Pedrotti F and Pedrotti L (1993) Introduction to Optics, 2nd edn. New Jersey: Prentice Hall.
Pérez J-Ph (1996) Optik. Heidelberg: Spektrum Akademischer Verlag.
Sharma A, Kandpal HC and Ghatak AK (2004) Optical Coherence. See this Encyclopedia.


van der Gracht J (2004) Coherence and Imaging. See this Encyclopedia.
Young M (1993) Optics and Lasers, Including Fibers and Optical Wave Guides, 4th revised edn. Berlin, Heidelberg: Springer Verlag.

COHERENT CONTROL

Contents

Theory
Experimental
Applications in Semiconductors

Theory

H Rabitz, Princeton University, Princeton, NJ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Since the inception of quantum mechanics almost a century ago, a prime activity has been the observation of quantum phenomena in virtually all areas of chemistry and physics. However, the natural evolution of science leads to the desire to go beyond passive observation to active manipulation of quantum mechanical processes. Achieving control over quantum phenomena could be viewed as engineering at the atomic scale, guided by the principles of quantum mechanics, for the alteration of system properties or dynamic behavior. From this perspective, the construction of quantum mechanically operating solid-state devices through selective material growth would fall into this category. The focus of this article is principally on the manipulation of quantum phenomena through tailored laser pulses. The suggestion of using coherent radiation for the active alteration of microworld processes may be traced to the early 1960s, almost immediately after the discovery of lasers. Since then, the subject has grown enormously to encompass the manipulation of (1) chemical reactions, (2) quantum electron transport in semiconductors, (3) excitons in solids, (4) quantum information systems, (5) atom lasers, and (6) high harmonic generation, amongst other topics. Perhaps

the most significant use of these techniques may be their provision of refined tools to ultimately better understand the basic physical interactions operative at the atomic scale. Regardless of the particular application of laser control over quantum phenomena, there is one basic operating principle involved: active manipulation of constructive and destructive quantum wave interferences. This process is depicted in Figure 1, showing the evolution of a quantum system from the initial state $|\psi_i\rangle$ to the desired final state $|\psi_f\rangle$ along three of possibly many interfering pathways. In general, there may be many possible final accessible states $|\psi_f\rangle$, and often the goal is to achieve a high amplitude in one of these states and low amplitude in all the others. The target state might actually be a superposition of states, and an analogous picture to

Figure 1 The evolution of a quantum system under laser control from the initial state $|\psi_i\rangle$ to the final state $|\psi_f\rangle$. The latter state is chosen to have desirable physical properties, and three of possibly many pathways between the states are depicted. Successful control of the process $|\psi_i\rangle \to |\psi_f\rangle$ by a tailored laser pulse generally requires creating constructive quantum wave interferences in the state $|\psi_f\rangle$ from many pathways, and destructive interferences in all other accessible final states $|\psi_{f'}\rangle \neq |\psi_f\rangle$.


that in Figure 1 would also apply to steering about the density matrix. The process depicted in Figure 1 may be thought of as a microscale analog of the classic double-slit experiment for light waves. As the goal is often high-finesse focusing into a particular target quantum state, success may call for the manipulation of many quantum pathways (i.e., the notion of a 'many-slit' experiment). Most applications are concerned with achieving a desirable outcome for the expectation value $\langle\psi_f|O|\psi_f\rangle$ associated with some observable Hermitian operator O. The practical realization of the task above becomes a control problem when the system is expressed through its Hamiltonian $H = H_0 + V_c$, where $H_0$ is the free Hamiltonian describing the dynamics without explicit control; it is assumed that the free evolutionary dynamics under $H_0$ will not satisfactorily achieve the physical objective. Thus, a laboratory-accessible control term $V_c$ is introduced in the Hamiltonian to achieve the desired manipulation. Even just considering radiative interactions, the form of $V_c$ could be quite diverse, depending on the nature of the system (e.g., nuclear spins, electrons, atoms, molecules, etc.) and the intensity of the radiation field. Many problems may be treated through an electric dipole interaction $V_c = -\mu\cdot\varepsilon(t)$, where $\mu$ is a system dipole moment and $\varepsilon(t)$ is the laser electric field. Where appropriate, this article will consider the interaction in this form, but other suitable radiative control interactions may be equally well treated. Thus, the laser field $\varepsilon(t)$ as a function of time (or frequency) is at our disposal for attempted manipulation of quantum systems. Before considering any practical issues associated with the identification of shaped laser control fields, a fundamental question concerns whether it is, in principle, possible to steer about any particular quantum system from an arbitrary initial state $|\psi_i\rangle$ to an arbitrary final state $|\psi_f\rangle$. Questions of this sort are addressed by a controllability analysis of Schrödinger's equation

i"

›lcðtÞl ¼ ½H0 2 m·1ðtÞlcðtÞl; ›t

lcð0Þl ¼ lci l

½1

Controllability concerns whether, in principle, some field $\varepsilon(t)$ exists such that the quantum system described by $H = H_0 - \mu\cdot\varepsilon(t)$ permits arbitrary degrees of control. For finite-dimensional quantum systems (i.e., those described by evolution amongst a discrete set of quantum states), the formal tools for such a controllability analysis exist, both for evaluating the controllability of the wave function as well as of the more general time evolution operator U(t),

which satisfies

$i\hbar \frac{\partial U}{\partial t} = [H_0 - \mu\cdot\varepsilon(t)]\,U, \qquad U(0) = 1$  [2]
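As a concrete illustration of eqns [1] and [2] (a sketch, not an example from the text: a two-level system in units with ħ = 1, all parameters illustrative), the following propagation shows a resonant pulse of area π steering $|\psi_i\rangle$ essentially completely into the target state:

```python
# Sketch: a two-level system, hbar = 1, steered from |0> to |1> by a
# resonant pulse of area pi. Parameters are illustrative.
import numpy as np

w0 = 1.0                                    # transition frequency
H0 = np.diag([0.0, w0])                     # free Hamiltonian
mu = np.array([[0.0, 1.0], [1.0, 0.0]])     # dipole operator

T, steps = 200.0, 20_000
dt = T / steps
t = np.arange(steps) * dt
env = np.sin(np.pi * t / T) ** 2            # smooth pulse envelope
env *= np.pi / np.trapz(env, t)             # scale to pulse area pi (RWA)
eps = env * np.cos(w0 * t)                  # resonant control field

psi = np.array([1.0, 0.0], dtype=complex)   # |psi_i> = |0>
for ek in eps:
    H = H0 - mu * ek                        # H(t) = H0 - mu*eps(t), eqn [1]
    w, V = np.linalg.eigh(H)                # exact short-step propagator
    psi = V @ (np.exp(-1j * w * dt) * (V.conj().T @ psi))

print("final population in target state:", abs(psi[1]) ** 2)  # close to 1
```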

Analyses of this type can be quite insightful, but they require detailed knowledge about $H_0$ and $\mu$. Cases involving quantum information science applications are perhaps the most demanding with regard to achieving total control. Most other physical applications would likely accept much more modest levels of control and still be categorized as excellent achievements. Theoretical tools and concepts have a number of roles in considering the control of quantum systems, including (1) an exploration of physical/chemical phenomena under active control, (2) the design of viable control fields, (3) the development of algorithms to actively guide laboratory control experiments towards achieving their dynamical objectives, and (4) the introduction of special algorithms to reveal the physical mechanisms operative in the control of quantum phenomena. Activities (1)–(3) have thus far been the primary focus of theoretical studies in this area, and it is anticipated that item (4) will grow in importance in the future. A few comments on the history of laser control over quantum systems are relevant, as they speak to the special nature of the currently employed successful closed-loop quantum control experiments. Starting in the 1960s and spanning roughly 20 years, it was thought that the design of lasers to manipulate molecular motion could be achieved by the application of simple physical logic and intuition. In particular, the thinking at the time focused on using cw laser fields resonant with one or more local modes of the molecule, as state or energy localization was believed to be the key to successful control. Since quantum dynamics phenomena typically occur on ultrafast time-scales, possibly involving spatially dispersed wave packets, expecting to achieve high quality control with a light source operating at one or two resonant frequencies is generally wishful thinking. Quantum dynamics phenomena occur in a multifrequency domain, and controls with a few frequencies will not suffice. Over approximately the last decade, it became clear that successful control often calls for manipulating multiple interfering quantum pathways (cf., Figure 1). In turn, this recognition led to the need for broad-bandwidth laser sources (i.e., tailored laser pulses). Fortunately, the necessary laser pulse-shaping technologies have become available, and these sources continue to expand into new frequency ranges with enhanced bandwidth capabilities. Many chemical and physical


applications of control over quantum phenomena involve performing a significant amount of work (i.e., quantum mechanical action), and sufficient laser field intensity is required. A commonly quoted adage is 'no field, no yield', and at the other extreme, there was much speculation that operating nonlinearly at high field intensities would lead to a loss of control because the quantum system would effectively act as an amplifier of even weak laser field noise. Fortunately, the latter outcome has not occurred, as is evident from a number of successful high field laser control experiments. Quantum control may be expressed as an inverse problem with a prescribed chemical or physical objective, the task being the discovery of a suitable control field $\varepsilon(t)$ to meet the objective. Normally, Schrödinger's equation [1] is viewed as linear, and this perspective is valid if the Hamiltonian is known a priori, leaving the equation to be solved for the wave function. However, in the case of quantum control, neither the wave function nor the control field is known a priori, and the two enter bilinearly on the right-hand side of eqn [1]. Thus, quantum control is mathematically a nonlinear inverse problem. As such, one can anticipate possibly diverse behavior in the process of seeking successful controls, as well as in the ensuing quantum dynamical control behavior. The control must take into account the evolving system dynamics, thereby resulting in the control field being a function(al) of the current state of the system, $\varepsilon(\psi)$, in the case of quantum control design, or dependent on the system observations, $\varepsilon(\langle O\rangle)$, in the case of laboratory closed-loop field discovery efforts, where $\langle O\rangle = \langle\psi|O|\psi\rangle$. Furthermore, the control field at the current time t will depend in some implicit fashion on the future state of the evolving system at the final target time T. It is evident that control field design and laboratory discovery present highly nontrivial tasks. Although the emphasis in this review is on the theoretical aspects of quantum control theory, the subject exists for its laboratory realizations, and those realizations, in turn, intimately depend on the capabilities of theoretical and algorithmic techniques for their successful implementations. Accordingly, theoretical laser control field design techniques will be discussed, along with algorithmic aspects of current laboratory practice. This material respectively will focus on optimal control theory (OCT) and optimal control experiments (OCEs), as seeking optimality provides the best means to achieve any posed objective. Finally, the last part of the article will present some general conclusions on the state of the quantum control field.


Quantum Optimal Control Theory for Designing Laser Fields

When considering control in any domain of application, a reasonable approach is to computationally design the control for subsequent implementation in the laboratory. Quantum mechanical laser field design has taken on a number of realizations. At one limit is the application of simple intuition for design purposes, and in some special cases (e.g., the weak field perturbation theory regime), this approach may be applicable. Physical insights will always play a central role in laser field design, but to be especially useful, they need to be channeled into the proper mathematical framework. Many of the interesting applications operate in the strong-field nonperturbative regime. In this domain, serious questions arise regarding whether sufficient information is available about the Hamiltonian to execute reliable designs. Regardless of whether the Hamiltonian is known accurately or is an acknowledged model, the theoretical study of quantum control can provide physical insight into the phenomena involved, as well as possibly yielding trial laser pulses for further refinement in the laboratory.

Achieving control over quantum mechanical phenomena often involves a balance of competing dynamical processes. For example, in the case of aiming to create a particular excitation in a molecule or material, there will always be the concomitant need to minimize other unwanted excitations. Often, there are also limitations on the form, intensity, or other characteristics of the laser controls that must be adhered to. Another goal is for the control outcome to be as robust as possible to the laser field fluctuations and Hamiltonian uncertainties. Overriding all of these issues is merely the desire to achieve the best possible physical result for the posed quantum mechanical objective. In summary, all of these desired extrema conditions translate over to posing the design of control laser fields as an optimization problem. Thus, optimal control theory forms a basic foundation for quantum control field design.

Control field design starts by specification of the Hamiltonian components H0 and μ in eqn [1], with the goal of finding the best control electric field ε(t) to balance the competing objectives. Schrödinger's equation must be solved as part of this process, but this effort is not merely a forward propagation task, as the control field is not known a priori. As an optimal design is the goal, a cost functional J = J(objectives, penalties, ε(t)) is prescribed, which contains the information on the physical objectives, competing penalties, and any costs or constraints associated with the structure of the laser field. The functional J could


contain many terms if the competing physical goals are highly complex. As a specific simple illustration, consider the common goal of steering the system to achieve an expectation value ⟨ψ(T)|O|ψ(T)⟩ as close as possible to the target value O_target associated with the observable operator O at time T. An additional cost is imposed to minimize the laser fluence. These criteria may be embodied in a cost functional of the form

J = \left[ \langle\psi(T)|O|\psi(T)\rangle - O_{\rm target} \right]^2 + \omega \int_0^T \varepsilon^2(t)\,{\rm d}t - 2\,\Im \int_0^T \left\langle \lambda(t) \right| i\hbar\,\frac{\partial}{\partial t} - H_0 + \mu\cdot\varepsilon(t) \left| \psi(t) \right\rangle {\rm d}t \qquad [3]

Here, the parameter ω ≥ 0 is a weight to balance the significance of the fluence term relative to achieving the target expectation value, and |λ(t)⟩ serves as a Lagrange multiplier, assuring that Schrödinger's equation is satisfied during the variational minimization of J with respect to the control field. Carrying out the latter minimization will lead to eqn [1], along with the additional relations

i\hbar\,\frac{\partial}{\partial t}\,|\lambda(t)\rangle = \left[ H_0 - \mu\cdot\varepsilon(t) \right] |\lambda(t)\rangle, \qquad |\lambda(T)\rangle = 2sO\,|\psi(T)\rangle \qquad [4]

\varepsilon(t) = \frac{1}{\omega}\,\Im\,\langle\lambda(t)|\,\mu\,|\psi(t)\rangle \qquad [5]

s = \langle\psi(T)|O|\psi(T)\rangle - O_{\rm target} \qquad [6]

Equations [1] and [4]-[6] embody the OCT design equations that must be solved to yield the optimal field ε(t) given in eqn [5]. These design equations have an unusual mathematical structure within quantum mechanics. First, insertion of the field expression in eqn [5] into eqns [1] and [4] produces two cubically nonlinear coupled Schrödinger-like equations. These equations are in the same family as the standard nonlinear Schrödinger equation, and as such, one may expect that unusual dynamical behavior could arise. Although eqn [4] for |λ(t)⟩ is identical in form to the Schrödinger equation [1], only |ψ(t)⟩ is the true wavefunction, with |λ(t)⟩ serving to guide the controlled quantum evolution towards the physical objective. Importantly, eqn [1] is an initial value problem, while eqn [4] is a final value problem. Thus, the two equations together form a two-point boundary value problem in time, which is an inherent feature of temporal engineering optimal control as well. The boundary value nature of these equations typically leads to the existence of multiple solutions, where s in eqn [6] plays the role of a discrete eigenvalue, specifying the quality of the particular

achieved control field design. The fact that there are generally multiple solutions to the laser design equations can be attractive from a physical perspective, as the designs may be sorted through to identify those being most attractive for laboratory implementation. The overall mathematical structure of eqns [1] and [4]-[6] can be melded together into a single design equation

i\hbar\,\frac{\partial}{\partial t}\,|\psi(t)\rangle = \left[ H_0 - \mu\cdot\varepsilon(\psi, s, t) \right] |\psi(t)\rangle, \qquad |\psi(0)\rangle = |\psi_i\rangle \qquad [7]

The structure of this equation embodies the comments in the introduction that control field design is inherently a nonlinear process. The main source of complexity arising in eqn [7] is through the control field ε(ψ, s, t) depending not only on the wave function at the present time t, but also on the future value of the wavefunction at the target time T contained in s. This structure again reflects the two-point boundary value nature of the control equations. Given the typical multiplicity of solutions to eqns [1] and [4]-[6], or equivalently, eqn [7], it is attractive to consider approximations to these equations (except perhaps for the inviolate Schrödinger equation [1]), and much work continues to be done along these lines. One case involves what is referred to as tracking control, whereby a path is specified for ⟨ψ(t)|O|ψ(t)⟩ evolving from t = 0 out to t = T. Tracking control eliminates s to produce a field of the form ε(ψ, t), thereby permitting the design equations to be explicitly integrated as a forward-marching problem toward the target; the tracking equations are still nonlinear with respect to the evolving state. Many variations on these concepts can be envisioned, and other approximations may also be introduced to deal with special circumstances. One technique appropriate for at least few-level systems is stimulated Raman adiabatic passage (STIRAP), which seeks robust adiabatic passage from an initial state to a particular final state, typically with nearly total destructive interference occurring in the intermediate states. This design technique can have especially attractive robustness characteristics with respect to field errors. STIRAP, tracking, and various perturbation theory-based control design techniques can be expressed as special cases of OCT, with suitable cost functionals and constraints. Further approximation methods will surely be developed, with the rigorous OCT concepts forming the general foundation for control field design. As the OCT equations are inherently nonlinear, their solution typically requires numerical iteration,


and a variety of procedures may be utilized for this purpose. Almost all of the methods employ local search techniques (e.g., gradient methods), and some also show monotonic convergence with respect to each new iteration step. Local techniques typically evolve to the nearest solution in the space of possible control fields. Such optimizations can be numerically efficient, although the quality of the attained result can depend on the initial trial for the control field and the details of the algorithm involved. In contrast, global search techniques, such as simulated annealing or genetic algorithms, can search more broadly for the best solution in the control field space. These more expansive searches are attractive, but at the price of typically requiring more intensive computations. Effort has also gone into seeking global or semiglobal input → output maps relating the electric field ε(t) structure to the observable O[ε(t)]. Such maps may be learned, hopefully, from a modest number of computations with a selected set of fields, and then utilized as high-speed interpolators over the control field space to permit the efficient use of global search algorithms to attain the best control solution possible. At the foundation of all OCT design procedures is the need to solve the Schrödinger equation. Thus, computational technology to improve this basic task is of fundamental importance for designing laser fields.

Many computations have been carried out with OCT to attain control field designs for manipulating a host of phenomena, including rotational, vibrational, electronic, and reactive dynamics of molecules, as well as electron motion in semiconductors. Every system has its own rich details, which, in turn, are compounded by the fact that multiple control designs will typically exist in many applications producing comparable physical outcomes. Collectively, these control design studies confirm the manipulation of constructive and destructive quantum wave interferences as the general mechanism for achieving successful control over quantum phenomena. This conclusion may be expressed as the following principle:

Control field-system cooperativity principle: Successful quantum control requires that the field must have the proper structure to take full advantage of all of the dynamical opportunities offered by the system, to best satisfy the physical objective.

This simple statement of system-field cooperativity embodies the richness, as well as the complexity, of seeking control over quantum phenomena, and also speaks to why simple intuition alone has not proved to be a generally viable design technique. Quantum systems can often exhibit highly complex dynamical behavior, including broad dispersion of the


wave packet over spatial domains or multitudes of quantum states. Handling such complexity can require fields with subtle structure to interact with the quantum system in a global fashion to manage all of the motions involved. Thus, we may expect that successful control pulses will often have broad bandwidth, including amplitude and phase modulation. Until relatively recently, laser sources with these characteristics were not available, but the technology is now in hand and rapidly evolving (see the discussion later).

The cooperativity principle above is of fundamental importance in the control of all quantum phenomena, and a simple illustration of this principle is shown in Figure 2 for the control of wave packet motion on an excited state of the NO molecule. The target for the control is a narrow wave packet located over the well of the excited B state. The optimal control field consists of two coordinated pulses, at early and late times, with both features having internal structure. The figure indicates that the ensuing excited state wave packet has two components, one of which actually passes through the target region during the initial evolution, only to return again and meet the second component at just the right place and time, to achieve the target objective as well as possible. Similar cooperativity interpretations can be found in virtually all implementations of OCT.

The main reason for performing quantum field design is to ultimately implement the designs in the laboratory. The design procedures must face laboratory realities, which include the fact that most Hamiltonians are not known to high accuracy (especially for polyatomic molecules and complex solid-state structures), and secondly, that a variety of laboratory field imperfections may unwittingly be present. Notwithstanding these comments, OCT has been fundamental to the development of quantum control, including laying out the logic for how to perform the analogous optimal control experiments. Design implementations and further OCT development will continue to play a basic role in the quantum control field. At present, perhaps the most important contribution of OCT has been to (1) highlight the basic control cooperativity principle above, and (2) provide the basis for developing algorithms to successfully guide optimal control experiments, as discussed below.
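Before turning to the laboratory procedures, a minimal numerical sketch of the OCT machinery may be helpful. The following is an illustration only, not an algorithm from the literature: for a two-level system it propagates |ψ⟩ forward under eqn [1], imposes the final condition of eqn [4], propagates |λ⟩ backward, and refreshes the field from eqn [5]. The damped field update and fixed iteration count are ad hoc choices.

import numpy as np
from scipy.linalg import expm

hbar, dt, N = 1.0, 0.02, 2000                # natural units; grid sizes are arbitrary
H0 = np.diag([0.0, 1.0])
mu = np.array([[0.0, 1.0], [1.0, 0.0]])
O = np.diag([0.0, 1.0])                      # observable: excited-state projector
O_target, omega = 1.0, 1.0                   # target value and fluence weight
psi0 = np.array([1.0, 0.0], complex)
eps = 0.01 * np.ones(N)                      # initial trial field

def step(v, field, sign):
    # one short-time step of eqn [1] (sign = -1, forward) or eqn [4] (sign = +1, backward)
    return expm(sign * 1j * (H0 - mu * field) * dt / hbar) @ v

for it in range(50):
    psi = np.empty((N + 1, 2), complex); psi[0] = psi0
    for k in range(N):                       # forward sweep, eqn [1]
        psi[k + 1] = step(psi[k], eps[k], -1)
    s = np.real(psi[N].conj() @ O @ psi[N]) - O_target      # eqn [6]
    lam = np.empty((N + 1, 2), complex); lam[N] = 2 * s * (O @ psi[N])  # final condition
    for k in range(N, 0, -1):                # backward sweep, eqn [4]
        lam[k - 1] = step(lam[k], eps[k - 1], +1)
    new_eps = np.array([np.imag(lam[k].conj() @ mu @ psi[k]) / omega for k in range(N)])
    eps = 0.5 * eps + 0.5 * new_eps          # damped update toward eqn [5]

print("final target mismatch s:", s)

Production OCT solvers replace the naive damped update with carefully constructed, often monotonically convergent, iteration schemes of the kind mentioned above.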

Algorithms for Implementing Optimal Control Experiments

The ultimate purpose of considering control theory within quantum mechanics is to take the matter into


Figure 2 Control of wave packet evolution on the electronically excited B state of NO, with the goal of creating a narrow final packet over the excited state well. (a) indicates that the ground-state packet is brought up in two pieces using the coordinated dual pulse in (b). The piece of the packet from the first pulse actually passes through the target region, then bounces off the right side of the potential, to finally meet the second piece of the packet at just the right time over the target location r_p for successful control. This behavior is an illustration of the control field-system cooperativity principle stated in the text.

the laboratory for implementation and exploitation of its capabilities. Ideally, theoretical control field designs may be attained using the techniques discussed above, followed by the achievement of successful control upon execution of the designs in the laboratory. This appealing approach is burdened with three difficulties: (1) Hamiltonians are often imprecisely known, (2) accurately solving the design equations can be a significant task, and (3) realization of any given design will likely be imperfect due to laboratory noise or other unaccounted-for systematic errors. Perhaps the most serious of these difficulties is point (1), especially considering that the best quality control will be achieved by maximally drawing on subtle constructive and destructive quantum wave interference effects. Exploiting such subtleties will generally require high-quality control designs that, in turn, depend on having reliable Hamiltonians. Although various designs have been carried out seeking robustness with respect to Hamiltonian uncertainties, the issue in point (1) should remain of significance in the foreseeable future, especially for the most complex (and often, the most interesting!) chemical/physical applications. Mitigating this serious problem is the ability to create shaped laser pulse controls and apply them to a quantum system, followed by a probe of their effects at an unprecedented rate of thousands or more independent trials per minute. This unique capability led to the suggestion of

partially, if not totally, sidestepping the design process by performing closed-loop experiments to let the quantum system teach the laser how to achieve its control in the laboratory. Figure 3 schematically shows this closed-loop process, drawing on the following logic:

1. The molecular view. Although there may be theoretical uncertainty about the system Hamiltonian, the actual chemical/physical system under study 'knows' its own Hamiltonian precisely! This knowledge would also include any unusual exigencies, perhaps associated with structural or other defects in the particular sample. Furthermore, upon exposure to a control field, the system 'solves' its own Schrödinger equation impeccably accurately and as fast as possible in real time. Considering these points, the aim is to replace the arduous offline digital computer control field design effort with the actual quantum system under study acting as a precise analog computer, solving the true equations of motion.

2. Control laser technology. Pulse shaping under full computer control may be carried out using even hundreds of discrete elements in the frequency domain controlling the phase and amplitude structure of the pulse. This technology is readily available and expanding in terms of pulse center frequency flexibility and bandwidth capabilities.


Figure 3 A schematic of the closed-loop concept for allowing the quantum system to teach the laser how to achieve its control. The actual quantum system under control is in a loop with a laser and pulse shaper, all slaved together with a pattern recognition algorithm to guide the excursions around the loop. The success of this concept relies on the ability to perform very large numbers of control experiments in a short period of time. In principle, no knowledge of the system Hamiltonian is required to steer the system to the desired final objective, although a good trial design ε₀(t) may accelerate the process. The ith cycle around the loop attempts to find a better control field εᵢ(t) such that the system response ⟨O(T)⟩ᵢ for the observable operator O comes closer to the desired value O_target.

3. Quantum mechanical objectives. Many chemical/physical objectives may be simply expressed as the desire to steer a quantum mechanical flux out one clearly defined channel versus another.

4. Detection of the control action. Detection of the control outcome can often be carried out by a second laser pulse or any other suitable high duty cycle detection means, such as laser-induced mass spectrometry. The typical circumstance in point (3) implies that little, if any, time-consuming offline data analysis will be necessary beyond that of simple signal averaging.

5. Fast learning algorithms. All of the four points above may be slaved together using pattern recognition learning algorithms to identify those control fields which are producing better results, and bias the next sequence of experiments in their favor for further improvement. Although many algorithms may be employed for this purpose, the global search capabilities of genetic algorithms are quite attractive, as they may take full advantage of the high throughput nature of the experiments; a minimal sketch of such a learning loop follows.
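The following sketch (not from the article; every parameter is an illustrative choice) shows the skeleton of such a genetic-algorithm learning loop. Here run_experiment() is a hypothetical stand-in for the shaper, laser, and detector of Figure 3, which in the laboratory would return the measured objective for a given set of knob settings:

import numpy as np

rng = np.random.default_rng(0)
N_KNOBS, POP, GENS = 16, 30, 40              # knob count and search sizes (arbitrary)

def run_experiment(phases):
    # Placeholder surrogate: in reality, program the pulse shaper with
    # `phases`, fire the laser at a fresh sample, and read the detector.
    return float(np.cos(0.5 * (phases - 1.0)).mean())

population = rng.uniform(0, 2 * np.pi, (POP, N_KNOBS))
for gen in range(GENS):
    scores = np.array([run_experiment(p) for p in population])
    elite = population[np.argsort(scores)[-POP // 3:]]          # keep the best third
    children = elite[rng.integers(len(elite), size=POP - len(elite))]
    children = children + rng.normal(0.0, 0.3, children.shape)  # mutate the offspring
    population = np.vstack([elite, children])

print("best objective found:", max(run_experiment(p) for p in population))

Because only the knob settings and the measured objective enter the loop, no knowledge of the Hamiltonian, or even of the pulse structure itself, is required, in line with the discussion above.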

for efficiency reasons. Fortunately, this economy is coincident with all the tasks and technologies involved. For achieving control alone, there is no requirement that the actual laser pulse structure be identified on each cycle of the loop. Rather, the laser control ‘knob’ settings are adequate information for the learning algorithm, as the only criterion is to suitably adjust the knobs to achieve an acceptable value for the chemical/physical objective. The learning algorithm in point (5) operates with a cost functional J, analogous to that used for computational design of fields, but now the cost functional can only depend on those quantities directly observable in the laboratory (i.e., minimally, the current achieved target expectation value). In cases where a physical understanding is sought about the control mechanism, the laser field structure must also be measured. This additional information is typically also not a burden to attain, as it often is only required for the single best control field at the end of the closed-loop learning experiments. The OCE process in Figure 3 and steps (1) –(5) above constitute the laboratory process for achieving optimal quantum control, and exactly the same reasons put forth for OCT motivate the desire for attaining optimal performance: principally, seeking


the best control result that can possibly be obtained. Seeking optimal performance also ensures some degree of inherent robustness to field noise, as fleeting observations would not likely survive the signal averaging carried out in the laboratory. Additionally, robustness as a specific criterion may also be included in the laboratory cost functional in item (5). When a detailed physical interpretation of the controlled dynamics is desired, it is essential to remove any extraneous control field features that have little impact on the system manipulations. Without the latter clean-up carried out during the experiments, the final field may be contaminated by structures that have little physical impact. Although quantum dynamics typically occurs on femtosecond or picosecond time-scales, the loop excursions in Figure 3 need not be carried out on the latter time-scales. Rather, the loop may be traversed as rapidly as convenient, consistent with the capabilities of the apparatus. A new system sample is introduced on each cycle of the loop. This process is referred learning control, to distinguish it from real-time feedback control. The closed-loop quantum learning control algorithm can be self-starting, requiring no prior control field designs, under favorable circumstances. Virtually all of the current experiments were carried out in this fashion. It remains to be seen if this procedure of ‘going in blind’ will be generally applicable in highly complex situations where the initial state is far from the target state. Learning control can only proceed if at least a minimal signal is observed in the target state. Presently, this requirement has been met by drawing on the overwhelmingly large number of exploratory experiments that can be carried out, even in a brief few minutes, under full computer control. However, in some situations, the performance of intermediate observations along the way to the target goal may be necessary in order to at least partially guide the quantum dynamics towards the ultimate objective. Beneficial use could also be made from prior theoretical laser control designs 10(t) capable of at least yielding a minimal target signal. The OCE closed-loop procedure was based on the growing number of successful theoretical OCT design calculations, even with all of their foibles, especially including less than perfect Hamiltonians. The OCT and OCE processes are analogous, with their primary distinction involving precisely what appears in their corresponding cost functionals. Theoretical guidance can aid in identifying the appropriate cost functionals and learning algorithms for OCE, especially for addressing the needs associated with attaining a physical understanding about the mechanisms governing controlled quantum dynamics phenomena.

A central issue is the rate that learning control occurs upon excursions around the loop in Figure 3. A number of factors control this rate of learning, including the nature of the physical system, the choice of objective, the presence of field uncertainties and measurement errors, the number of control variables, and the capabilities of the learning algorithm employed. Achieving control is a matter of discrimination, and some of these factors, whether inherent to the system or introduced by choice, may work against attaining good-quality discrimination. At this juncture, little is known quantitatively about the limitations associated with any of these factors. The latter issues have not prevented the successful performance of a broad variety of quantum control experiments, and at present the ability to carry out massive numbers of closed-loop excursions has overcome any evident difficulties. The number of examples of successful closed-loop quantum control is rapidly growing, with illustrations involving the manipulation of laser dye fluorescence, chemical reactivity, high harmonic generation, semiconductor optical switching, fiber optic pulse transmission, and dynamical discrimination of similar chemical species, amongst others. Perhaps the most interesting cases are those carried out at high field intensities, where prior speculation suggested that such experiments would fail due to even modest laser field noise being amplified by the quantum dynamics. Fortunately, this outcome did not occur, and theoretical studies have indicated that surprising degrees of robustness to field noise may accompany the manipulation of quantum dynamics phenomena. Although the presence of field noise may diminish the beneficial influence of constructive and destructive interferences, evidence shows that field noise does not appear to kill the actual control process, at least in terms of obtaining reasonable values for the control objectives. One illustration of control in the high-field regime is shown in Figure 4, demonstrating the learning process for the dissociative rearrangement of acetophenone to form toluene. Interestingly, this case involves breaking two bonds, with the formation of a third, indicating that complex dynamics can be managed with suitably tailored laser pulses. Operating in the strong-field regime, especially for chemical manipulations, also has the important benefit of producing a generic laser tool to control a broad variety of systems by overcoming the long-standing problem of having sufficient bandwidth. For example, the result in Figure 4 uses a Ti:sapphire laser in the near-infrared regime. Starting with a bandwidth-limited pulse of intensity near


Figure 4 An illustration of dissociative rearrangement achieved by closed-loop learning control in Figure 3. The optimally deduced laser pulse broke two bonds in the parent acetophenone molecule and formed a new one to yield the toluene product. The laboratory learning curve is shown for the toluene product signal as a function of the generations in the genetic algorithm guiding the experiments. The fluctuations in the learning curve for optimizing the toluene yield correspond to the algorithm searching for the optimal yield as the experiments proceed.

~10¹⁴ W cm⁻², the dynamic power broadening in this regime can easily be on the order of ~1 eV or more, thereby effectively converting the otherwise discrete spectrum of the molecule into an effective continuum for ready multiphoton matching by the control laser pulse. Operating under these physical conditions is very attractive, as the apparatus is generic, permitting a single shaped-pulse laser source (e.g., a Ti:sapphire system with phase and amplitude modulation) to be utilized with virtually any chemical system where manipulations are desired. Although every desired physical/chemical goal may not be satisfactorily met (i.e., the goals must be physically attainable!), the means are now available to explore large numbers of systems. Operating under closed loop in the strong-field tailored-pulse regime eliminates the prior serious limitation of first finding a physical system to meet the laser capabilities; now, the structure of the laser pulse can be shaped to meet the molecule's characteristics. Strong field operations may be attractive for these reasons, but other broadband laser sources, possibly working at weaker field intensities, might be essential in some applications (e.g., controlling electron dynamics in semiconductors, where material damage is to be avoided). It is anticipated that additional broadband laser sources will become available for these purposes.
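A back-of-the-envelope check (not from the article; the 1 debye transition dipole is an assumption purely for illustration) supports the ~1 eV scale quoted above, since the Rabi energy μE sets the size of the dynamic level shifts:

import math

I = 1e14 * 1e4                      # intensity: 10^14 W/cm^2, converted to W/m^2
eps0, c = 8.854e-12, 2.998e8        # vacuum permittivity and speed of light (SI)
E = math.sqrt(2 * I / (c * eps0))   # peak electric field, roughly 2.7e10 V/m
mu = 3.336e-30                      # assumed 1 debye transition dipole, in C m
print("Rabi energy ~ %.1f eV" % (mu * E / 1.602e-19))   # ~0.6 eV, i.e., of order 1 eV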


closed-loop quantum control capabilities. As with the use of any laboratory tool, certain applications may be more amenable than others to attaining successful control. The particular physical/chemical questions need to be well posed, and controls need to have sufficient flexibility to meet the objectives. The experiments ahead should be able to reveal the degree to which drawing on optimality in OCE, combined with the performance of massive numbers of experiments, can lead to broad-scale successful control of quantum phenomena. One issue of concern is the richness associated with the large numbers of phase and amplitude control knobs that may be adjusted in the laboratory. Some experiments have already operated with hundreds of knobs, while others have restricted their number in a variety of ways, to simplify the search process. Additional technological and algorithmic advances may be required to manage the high-dimensional control space searches. Fortunately, for typical applications, the search does not reduce to seeking a needle in a haystack, as generally there are multiple control solutions, possibly all of very good quality.

As a final comment on OCE, it is useful to appreciate the subtle distinction between the procedure and closed-loop feedback experiments. This distinction is illustrated in Figure 5, pointing to three types of closed-loop quantum control laboratory experiments. The OCE procedure in Figure 5a produces a generic laser tool capable of controlling a broad variety of systems, with an emphasis in the figure placed on the point that each cycle around the loop starts with a new sample for control. The replacement of the sample on each cycle eliminates a number of difficulties, principally including concerns about sample damage, avoidance of operating the loop at the ultrafast speed of the quantum mechanical processes, and elimination of the effect that 'to observe is to disturb' in quantum mechanics. Thus, learning control provides a generic practical procedure, regardless of the nature of the quantum system.

Figure 5b has the same structure as that of Figure 5a, except the loop is now closed around the same single quantum system, which is followed throughout its evolution. All of the issues mentioned above that are circumvented by operating through laboratory learning control in Figure 5a must now be directly faced in the setup of Figure 5b. The procedure in Figure 5b will likely only be applicable in special circumstances, at least for the reason that many quantum systems operate on time-scales far too fast for opto-electronics and computers to keep up with. However, there are certain quantum mechanical processes that are sufficiently slow to meet the criteria. Furthermore, a period of free evolutionary


quantum dynamics may be permitted to occur between one control pulse and the next, to make the time issue manageable. While the learning control process in Figures 3 and 5a can be performed model-free, the feedback algorithm in Figure 5b generally must operate with a sufficiently reliable system Hamiltonian to carry out fast design corrections to the control field, based on the previous control outcome. These are very severe demands, but they may be met under certain circumstances. One reason for considering closed-loop feedback control in Figure 5b is to explore the basic physical issue of quantum mechanical limitations inherent in the statement 'to observe is to disturb'. It is suggestive that control over many types of quantum phenomena may fall into the semiclassical regime, lying somewhere between classical engineering behavior and the hard limitations of quantum mechanics. Experiments of the type in Figure 5b will be most interesting for exploring this matter.

Finally, Figure 5c introduces a gedanken experiment in the sense that the closed-loop process is literally built into the hardware. That is, the laser control and quantum system act as a single functioning unit operating in a stable fashion, so as to automatically steer the system to the desired target. This process may involve engineering the quantum mechanical system prior to control in cases where that freedom exists, as well as engineering the laser components involved. The meaning of such a device in Figure 5c can be understood by considering an analogy with airplane flight, where the aircraft is constructed to have an inherent degree of aerodynamic stability and will essentially fly (glide) on its own accord when pushed forward. It is an open question whether closing the loop in the hardware can be attained for quantum control, and an exploration of this concept may require additional laser technologies.

Figure 5 Three possible closed-loop formulations for attaining laser control over quantum phenomena. (a) is a learning control process where a new system is introduced on each excursion around the loop. (b) utilizes feedback control with the loop being traversed with the same sample, generally calling for an accurate model of the system to adjust the controls on each loop excursion. (c) is, at present, a dream machine where the laser and the sample under control are one unit operating without external algorithmic guidance. The closed-loop learning control procedure in (a) appears to form the most practical means to achieve laser control over quantum phenomena, especially for complex systems.

Conclusions

This article presented an overview of theoretical concepts and algorithmic considerations associated with the control of quantum phenomena. Theory has played a central role in this area by revealing the fundamental principles underlying quantum control, as well as by providing algorithms for designing controls and guiding experiments to discover successful controls in the laboratory. The challenge of controlling quantum phenomena, in one sense, is an old subject, going back at least 40 years. However, roughly the first 30 of these years would best be described as a period of frustration, due to a lack of full understanding of the principles involved and the nature of the lasers needed to achieve success. Thus, from another perspective, the subject is quite young, with perhaps the most notable development being the introduction of closed-loop laboratory learning procedures. At the time of writing this article, these procedures are just beginning to be explored for their full capabilities. To appreciate the young nature of this subject, it is useful to note that the analogous engineering control disciplines presently occupy many thousands of engineers worldwide, both theoreticians and practitioners, and have done so for many years. Yet, engineering control is far from being considered a mature subject. Armed now with the basic concepts and proper laboratory tools, one may anticipate a thorough exploration of control over quantum phenomena, including its many possible applications.


Acknowledgments

Support for this work has come from the National Science Foundation and the Department of Defense.

Further Reading

Brixner T, Damrauer NH and Gerber G (2001) Femtosecond quantum control. Advances in Atomic, Molecular, and Optical Physics 44: 1-54.


Rabitz H and Zhu W (2000) Optimal control of molecular motion: Design, implementation and inversion. Accounts of Chemical Research 33: 572-578.
Rabitz H, de Vivie-Riedle R, Motzkus M and Kompa K (2000) Whither the future of controlling quantum phenomena? Science 288: 824-828.
Rice SA and Zhao M (2000) Optical Control of Molecular Processes. New York: John Wiley.

Experimental

R J Levis, Temple University, Philadelphia, PA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The experimental control of quantum phenomena in atoms, molecules, and condensed phase materials has been a long-standing challenge in the physical sciences. For example, the idea that a laser beam could be used to selectively cleave a chemical bond has been pursued since the demonstration of the laser in the early 1960s. However, it was not until very recent times that one could realize selective bond cleavage. One key to the rapid acceleration of such experiments is the capability of manipulating the relative phases between two or more paths leading to the same final state. This is done in practice by controlling the phase of two (or two thousand!) laser frequencies coupling the initial state to some desired final state. The present interest in phase-related control stems in part from research focusing on quantum information sciences, controlling chemical reactivity, and developing new optical technologies, such as in biophotonics for imaging cellular materials. Ultimately, one would like to identify coherent control applications having impact similar to linear regime applications like compact disk reading (and writing), integrated circuit fabrication, and photodynamic therapy. Note that in each of these the phase of the electromagnetic field is not an important parameter, and only the brightness of the laser at a particular frequency is used. While phase is of negligible importance in any one-photon process, for excitations involving two or more photons, phase not only becomes a useful characteristic, it can modulate the yield of a desired process between zero and one hundred percent.

The use of optical phase to manipulate atomic and molecular systems rose to the forefront of laser science in the 1980s. Prior to this there was

considerable effort directed toward using the intensity and high spectral resolution features of the laser for various applications. For example, the high spectral resolution character was employed in purifying nuclear isotopes. Highly intense beams were also used in the unsuccessful attempt to pump enough energy into a single vibrational resonance to selectively dissociate a molecule. The current revolution in coherent control has been driven in part by the realization that controlling energy deposition into a single degree of freedom (such as selectively breaking a bond in a polyatomic molecule) cannot be accomplished by simply driving a resonance with an ever more intense laser source. In the case of laser-induced photodissociation, such excitation ultimately results in statistical dissociation throughout the molecule due to vibrational mode coupling. In complex systems, controlling energy deposition into a desired quantum state requires a more sophisticated approach, and one scheme involves coherent control.

The two distinct regimes for excitation in coherent control, the time and frequency domain experiments, are formally equivalent. While the experimental approach and level of success for each domain is vastly different, both rely on precisely specifying the phase relations for a series of excitation frequencies. The basic idea behind phase control is simple. One emulates the interference effects that can be easily visualized in a water tank experiment with a wave propagating through two slits. In the coherent control experiment, two or more indistinguishable pathways must be excited that connect one quantum state to another. In either water waves or coherent control, the probability for observing constructive or destructive interference at a target state depends on the phase difference between the paths connecting the two locations in the water tank, or the initial and final states in the quantum system. In the case of coherent control, the final state may have some technological


value, for instance, a desirable chemical reaction product or an intermediate in the implementation of quantum computation.

Frequency domain methods concentrate on interfering two excitation routes in a system by controlling the phase difference between two distinct frequencies coupling an initial state to a final state. Perhaps the first mention of the concept of phase control in the optical regime can be traced to investigations in Russia in the early 1960s, where the interference between a one- and three-photon path was theoretically proposed as a means to modulate a two-photon absorption. This two-path interference idea lay dormant until the reality of controlling the relative phase of two distinct frequencies was achieved in 1990 using gas phase pressure modulation for frequency domain experiments. The idea was theoretically extended to the control of nuclear motion in 1988. The first experimental example of two-color phase control over nuclear motion was demonstrated in 1991. It is interesting that the first demonstrations of coherent control did not, however, originate in the frequency domain approach, but came from time domain investigations in the mid-1980s. This was work that grew directly out of the chemical dynamics community and will be described below.

While frequency domain methods were developed exclusively for the observation of quantum interference, the experimental implementation of time-domain methods was first developed for observing nuclear motion in real time. In the time domain, a superposition of multiple states is prepared in a system using a short duration (~50 fs) laser pulse. The superposition state will subsequently evolve according to the phase relationships of the prepared states and the Hamiltonian of the system. At some later time the evolution must be probed using a second, short-time-duration laser pulse. The time-domain methods for coherent control were first proposed in 1985 and were based on wavepacket propagation methods developed in the 1970s. The first experiments were reported in the mid-1980s.

To determine whether the time or frequency domain implementation is more applicable for a particular control context, one should compare the relevant energy level spacing of the quantum system to the bandwidth of the excitation laser. For instance, control of atomic systems is more suited to frequency domain methods because the characteristic energy level spacing in an atom (several electron volts) is large compared to the bandwidth of femtosecond laser pulses (millielectron volts). In this case, preparation of a superposition of two or more states is impractical with a single laser pulse at the present time, dictating that the control scheme must employ

the initial state and a single excited eigenstate. In practice, quasi-CW nanosecond duration laser pulses are employed for such experiments because current ultrafast (fs duration) laser sources cannot typically prepare a superposition of electronic states. One instructive exception involves the excitation of Rydberg states in atoms. Here the electronic energy level spacing can be small enough that the bandwidth of a femtosecond excitation laser may span several electronic states and thus can create a superposition state. One should also note that the next generation of laser sources in the attosecond regime may permit the preparation of a wavepacket from low-lying electronic states in an atom. In the case of molecules, avoiding a superposition state is nearly impossible. Such experiments employ pico- to femtosecond duration laser pulses, having a bandwidth sufficient to excite many vibrational levels, either in the ground or excited electronic state manifold of a molecule. To complete the time-dependent measurement, the propagation of the wavepacket to the final state of interest must be observed (probed) using a second ultrafast laser pulse that can produce fluorescence, ionization, or stimulated emission in a time-resolved manner. The multiple paths that link the initial and final state via the pump and the probe pulses are the key to the equivalence of the time and frequency domain methods.

There is another useful way to classify experiments that have been performed recently in coherent control, that is, into either open-loop or closed-loop techniques. Open-loop signifies that a calculation may be performed to specify the required pulse shape for control before the experiment is attempted. All of the frequency-based, and most of the time-based, experiments fall into this category. Closed-loop, on the other hand, signifies that the results from experimental measurements must be used to assist in determining the correct parameters for the next experiment; this pioneering approach was suggested in 1992. It is interesting that in the context of control experiments performed to date, closed-loop experiments represent the only route to determining the optimal laser pulse shape for controlling even moderately complex systems. In this approach, no input from theoretical models is required for coherent control of complex systems. This is unlike typical control engineering environments, where closed-loop signifies the use of experiments to refine parameters used in a theoretical model.

There is one additional distinction that is of value for understanding the state of coherent control at the present time. Experiments may be classified into systems having Hamiltonians that allow calculation of frequencies required for the desired interference


patterns (thus enabling open-loop experiments) and those having ill-defined Hamiltonians (requiring closed-loop experiments). The open-loop experiments have demonstrated the utility of various control schemes, but are not amenable to systems having even moderate complexity. The closed-loop experiments are capable of controlling systems of moderate complexity but are not amenable to precalculation of the required control fields. Systems requiring closed-loop methods include propagation of short pulses in an optical fiber, control of high harmonic generation for soft X-ray generation, control of chemical reactions, and control of biological systems. In the experimental regime, the value of learning control for dealing with complex systems has been well documented.

The remainder of this article represents an overview of the experimental implementation of coherent control. The earliest phase control experiments involved time-dependent probing of molecular wavepackets, and were followed by two-path interference in an atomic system. The most recent, and general, implementation of control involves tailoring a time-dependent electromagnetic field for a desired objective using closed-loop techniques. Since the use of tailored laser pulses appears to be rather general, a more complete description of the experiment is provided.

Time-Dependent Methods

The first experiments demonstrating the possibility of coherent control were performed to detect nuclear motion in molecules in real time. These investigations were the experimental realization of the pump-dump or pulse-timing control methods as originally described in 1985. The experimental applications have expanded to numerous systems including diatomic and triatomic molecules, small clusters, and even biological molecules. For such experiments, a vibrational coherence is prepared by exciting a superposition of states. These measurements implicitly used phase control both in the generation of the ultrashort pulses and in the coherent probing of the superposition state by varying the time delay between the pump and probe pulses, in effect modulating the outcome of a quantum transition. Since the earliest optical experiment in the mid-1980s, there have been many hundreds of such experiments reported in the literature. The motivation for the very first experiments was not explicitly phase control, but rather the observation of nuclear motion of molecules in real time. In later measurements, oscillations in the superposition states were observable for many cycles, up to many tens of


picoseconds in favorable cases, i.e., diatomic molecules in the gas phase. The loss of signal in the coherent oscillations is ultimately due to passage of the superposition state into surrounding bath states. This may be due to collision with another molecule, dissociation of the system, or redistribution of the excitation energy into other modes of the system not originally excited in the superposition. It is notable that coherence can be maintained in solution phase systems for tens of picoseconds, and for the case of resonances having weak coupling to other modes, coherence can even be modulated in large biological systems on the picosecond time scale.

Decoherence represents loss of phase information about a system. In this context, phase information represents our detailed knowledge of the electronic and nuclear coordinates of the system. In the case of vibrational coherence, decoherence may be represented by intramolecular vibrational energy transfer to other modes of the molecule. This can be thought of as the propensity of vibrational modes of a molecule to couple together in a manner that randomizes deposited energy throughout the molecule. Decoherence in a condensed phase system includes transfer of energy to the solvent modes surrounding the system of interest. For an electronically excited molecule or atom, decoherence can involve spontaneous emission of radiation (fluorescence), or dissociation in the case of molecules. Certain excited states may have high probability for fluorescence and would represent a significant decoherence pathway and an obstacle for coherent control. These states may be called 'lossy' and are often used in optical pumping schemes as well as for stimulated emission pumping. An interesting question arises as to whether one can employ such states in a pathway leading to a desired final state, or must such lossy states be avoided at all costs in coherent control. This question has been answered to some degree by the method of rapid adiabatic passage. In this experiment, an initial state is coupled to a final state by way of a lossy state. The coupling is a coherent process and the preparation involves dressed states that are eigenstates of the system. In this coherent superposition the lossy state is employed, but no population is allowed to build up, and thus decoherence is circumvented.

Two-Path Interference Methods

Interference methods form perhaps the most intuitive example of coherent control. The first experimental demonstration was reported in 1990 and involved the modulation of ionization probability in Hg by interfering a one-photon and three-photon excitation


pathway to an excited electronic state (followed by two-photon ionization). The long delay between the introduction of the two-path interference concept in 1960 and the demonstration in 1990 was due to the difficulty in controlling the relative phase of the one- and three-photon paths. The solution involved generating the third harmonic of the fundamental in a nonlinear crystal, and copropagating the two beams through a low-density gas. Changing the pressure of the gas changes the optical phase retardance at different rates for different frequencies, thus allowing relative phase control between two colors. The ionization yield of Hg was clearly modulated as a function of pressure. Such schemes have been extended to diatomic molecular systems, and the concept of determining molecular phase has been investigated. While changing the phase of more than two frequencies has not been demonstrated using pressure tuning (because of experimental difficulties), a more flexible method has been developed to alter the relative phase of up to 1000 frequency bands, as will be described next.

Closed-Loop Control Methods

Perhaps the most exciting new aspect of control in recent years concerns the prospect for performing closed-loop experiments. In this approach a more complex level of control in matter is achieved in comparison to the two-path interference and pump-probe control methods. In the closed-loop method, an arbitrarily complex time-dependent electromagnetic field is prepared that may have multiple subpulses and up to thousands of interfering frequency bands. The realization of closed-loop control over complex processes relies on the confluence of three technologies: (i) production of large bandwidth ultrafast laser pulses; (ii) spatial modulation methods for manipulating the relative phases and amplitudes of component frequencies of the ultrafast laser pulse; and (iii) closed-loop learning control methods for sorting through the vast number of pulse shapes available as potential control fields. In the closed-loop method, the result of a laser-molecule interaction is used to tailor an optimized pulse.

In almost all of the experimental systems used for closed-loop control today, Kerr lens mode-locking in the Ti:sapphire crystal is used to phase lock the wide bandwidth of available frequencies (750-950 nm) to create the ultrafast pulse. In this phenomenon, the intensity variation in the short laser pulse creates a transient lens in the Ti:sapphire lasing medium that discriminates between free lasing and mode-locked pulses. As a result, all of the population inversion available in the lasing medium may be coerced into

enhancing the ultrashort traveling wave in the cavity in this way. The wide bandwidth available may then be amplified and tailored into a shaped electromagnetic field. Spatial light modulation is one such method for modulating the relative phases and amplitudes of the component frequencies in the ultrafast laser pulse to tailor the shaped laser pulse. Pulse shaping first involves dispersing the phase-locked frequencies on an optical grating; the dispersed radiation is then collimated using a cylindrical lens, and the individual frequencies are focused to a line forming the Fourier plane. At the Fourier plane, the time-dependent laser pulse is transformed into a series of phase-locked continuous wave (CW) laser frequencies, and thus Fourier transformed from time to frequency space. The relative phases and amplitudes may be modulated in the Fourier plane using an array of liquid crystal pixels. After altering the spectral phase and amplitude profile, the frequencies are recombined on a second grating to form the shaped laser pulse. With one degree of phase control and 10 pixels, there is an astronomically large number (360¹⁰) of pulse shapes that may be generated in this manner. (At the present time, pulse shapers have up to 2000 independent elements.) Evolutionary algorithms are employed to manage the available phase space. Realizing that the pulse shape is nothing more than the summation of a series of sine waves, each having an associated phase and amplitude, we find that a certain pulse shape can be represented by a genome consisting of an array of these frequency-dependent phases and amplitudes. This immediately suggests that the methods of evolutionary search strategies may be useful for determining the optimal pulse shape for a desired photoexcitation process.

The method of closed-loop optimal control for experiments was first proposed in 1992, and the first experiments were reported in 1997 regarding optimization of the efficiency of a laser dye molecule in solution. Since that time a number of experiments have been performed in both the weak and strong field regime. In the weak field regime, the intensity of the laser does not alter the field-free structure of the excited states of the system under investigation. In the strong field, the field-free states of the system are altered by the electric field of the laser. Whether one is in the weak or strong field regime depends on parameters including the characteristic level spacing of the states of the system to be controlled and the intensity of the laser coupling into those states. Examples of weak field closed-loop control include optimization of laser dye efficiency, compression of an ultrashort laser pulse, dissociation of alkali clusters, optimization of coherent anti-Stokes


Raman excitation, and optical pulse propagation in fibers. In the strong field, the laser field is used to manipulate the excited states of the molecule as well as induce population transfer among these states. In this sense, the laser is both creating and exciting resonances, in effect allowing universal excitation with a limited frequency bandwidth of 750-850 nm. For chemical applications, it is worth noting that most molecules do not absorb in this wavelength region in the weak field. Strong field control has been used for controlling bond dissociation, chemical rearrangement, photonic reagents, X-ray generation, molecular centrifuges, and the manipulation of mass spectral fragmentation intensities. In the latter case, a new sensing technology has emerged wherein each individual pulse shape represents an independent sensor for a molecule. Given the fact that thousands of pulse shapes can be tested per minute, this represents a new paradigm for molecular analysis.

At the time of this writing, the closed-loop method for adaptively tailoring control fields has far outstripped our theoretical understanding of the process (particularly in the strong field regime). Adaptive control is presently an active field of investigation, encompassing the fields of physics, engineering, optics, and chemistry. In the coming years, there will be as much work done on the mechanism of control as in the application of the methods. Due to small energy level spacings and the complexity of the systems, we anticipate the majority of applications to be in the chemical sciences.
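The Fourier-plane shaping described in this section can be summarized in a few lines of code. The following is a minimal sketch (not from the article; the grid size, bandwidth, and random phase mask are arbitrary illustrative choices) of how a per-pixel spectral phase mask converts a bandwidth-limited pulse into a tailored field:

import numpy as np

n = 1024
f = np.fft.fftfreq(n)                       # normalized frequency grid (arbitrary units)
spectrum = np.exp(-(f / 0.05) ** 2)         # envelope of the phase-locked bandwidth (assumed)

n_pixels = 128                              # liquid-crystal pixels across the Fourier plane
pixels = (np.abs(f) / 0.15 * n_pixels).astype(int).clip(0, n_pixels - 1)
phase_mask = np.random.default_rng(1).uniform(0, 2 * np.pi, n_pixels)[pixels]
amp_mask = np.ones(n)                       # amplitude shaping could be applied here as well

shaped = np.fft.ifft(spectrum * amp_mask * np.exp(1j * phase_mask))  # back to the time domain
unshaped = np.fft.ifft(spectrum)
print("peak intensity, shaped/unshaped:",
      np.max(np.abs(shaped)) ** 2 / np.max(np.abs(unshaped)) ** 2)

Each of the n_pixels phase values is one control 'knob'; the 360¹⁰ counting quoted above corresponds to 10 such pixels with 360 resolvable phase settings each.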


See also

Coherence: Coherence and Imaging; Overview. Interferometry: Overview. Microscopy: Interference Microscopy.

Further Reading

Brumer PW and Shapiro M (2003) Principles of the Quantum Control of Molecular Processes. New York: Wiley Interscience.
Chen C and Elliott DS (1990) Measurement of optical phase variations using interfering multiphoton ionization processes. Physical Review Letters 65: 1737.
Levis RJ and Rabitz H (2002) Closing the loop on bond selective chemistry using tailored strong field laser pulses. Journal of Physical Chemistry A 106: 6427-6444.
Levis RJ, Menkir GM and Rabitz H (2001) Selective bond dissociation and rearrangement with optimally tailored, strong-field laser pulses. Science 292(5517): 709-713.
Rice SA and Zhao M (2000) Optical Control of Molecular Dynamics. New York: Wiley Interscience.
Stolow A (2003) Femtosecond time-resolved photoelectron spectroscopy of polyatomic molecules. Annual Review of Physical Chemistry 54: 89-119.
Zewail AH, Casati G, Rice SA, et al. (1997) Femtochemistry: Chemical reaction dynamics and their control. Chemical Reactions and Their Control on the Femtosecond Time Scale, Solvay Conference on Chemistry 101: 3-45.

Applications in Semiconductors

H M van Driel and J E Sipe, University of Toronto, Toronto, ON, Canada

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Interference phenomena are well known in classical optics. In the early 19th century, Thomas Young performed the first interference experiment and showed conclusively that light has wave properties. He passed quasi-monochromatic light from a single source through a pair of double slits using the configuration of Figure 1a and observed an interference pattern consisting of a series of bright and dark fringes on a distant screen. The intensity distribution can be explained only if it is assumed that light has wave or phase properties. In modern

terminology, if $E_1$ and $E_2$ are the complex electric fields of the two light beams arriving at the screen, then by the superposition principle of field addition the intensity at a particular point on the screen can be written as

$$I \propto |E_1 + E_2|^2 = |E_1|^2 + |E_2|^2 + 2|E_1||E_2|\cos(\phi_1 - \phi_2) \qquad [1]$$

where $\phi_1 - \phi_2$ is the phase difference of the beams arriving at the screen. While this simple experiment was used originally to demonstrate the wave properties of light, it has subsequently been used for many other purposes, such as measuring the wavelength of light. However, for our purposes here, the Young's double slit apparatus can also be viewed as a device that redistributes light, or controls its intensity at a particular location. For example, let's consider one of the slits to be a source, with the other slit taken to be a gate, with both capable of letting through the same amount of light. If the gate is closed, the distant screen is nearly uniformly illuminated. But if the gate is open, at a particular point on the distant screen there might be zero intensity, or as much as four times the intensity emerging from one slit, because of interference effects, with all the light merely being spatially redistributed. In this sense the double slit system can be viewed as a device to redistribute the incident light intensity. The key to all this is the superposition principle and the properties of optical phase.

Figure 1 (a) Interference effects in the Young's double slit experiment; (b) illustration of the general concept of coherence control via multiple quantum mechanical pathways; (c) interference of single- and two-photon transitions connecting the same valence and conduction band states in a semiconductor.

The superposition principle and interference effects also lie at the heart of quantum mechanics. For example, let's now consider a system under the influence of a perturbation with an associated Hamiltonian that has phase properties. In general the system can evolve from one quantum state $|i\rangle$ to another $|f\rangle$ via multiple pathways involving intermediate states $|m\rangle$. Because of the phased perturbation, interference between those pathways can influence the system's final state. If $a_m$ is the (complex) amplitude associated with a transition from the initial to the final state via intermediate virtual state $|m\rangle$, then, summing over all possible intermediate states, the overall transition probability $W$ can be written as

$$W = \Big|\sum_m a_m\Big|^2 \qquad [2]$$

In the case of only two pathways, as illustrated in Figure 1b, $W$ becomes

$$W = |a_1 + a_2|^2 = |a_1|^2 + |a_2|^2 + 2|a_1||a_2|\cos(\phi_1 - \phi_2) \qquad [3]$$

where $\phi_1$ and $\phi_2$ are now the phases of the two transition amplitudes. While this expression strongly resembles eqn [1], here $\phi_1$ and $\phi_2$ are influenced by the phase properties of the perturbation Hamiltonian that governs the transition process; these phase factors arise, for example, if light fields constitute the perturbation. But overall it is clear that the change in the state of the system is affected by both the amplitude and phase properties of the perturbation. In analogy with the classical interferometer, equality of the transition amplitudes leads to maximum contrast in the transition rate. Although it has been known since the early days of quantum mechanics that the phase of a perturbation can influence the evolution of a system, generally phase has only been discussed in a passive role. Only in recent years has phase been used as a control parameter for a system, on the same level as the strength of the perturbation itself.
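The remark that equal transition amplitudes give maximum contrast can be checked directly from eqn [3]; the amplitudes below are arbitrary illustrative values.

```python
import numpy as np

# Two-pathway transition rate W = |a1 + a2*exp(i*dphi)|^2 (eqn [3])
dphi = np.linspace(0, 2 * np.pi, 361)

def rate(a1, a2):
    return np.abs(a1 + a2 * np.exp(1j * dphi)) ** 2

for a1, a2 in [(1.0, 1.0), (1.0, 0.2)]:
    w = rate(a1, a2)
    contrast = (w.max() - w.min()) / (w.max() + w.min())
    print(f"|a1|={a1}, |a2|={a2}: contrast = {contrast:.2f}")  # 1.00 vs 0.38
```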

Coherence Control

Coherence control, or quantum control as it is sometimes called, refers to the active process through which one can use phase-dependent perturbations originating with, for example, coherent light waves, to control one or more properties of a quantum system, such as state population, momentum, or spin. This more general perspective of interference leads to a picture of interference of matter waves, rather than simply light waves, and one can now speak of an effective 'matter interferometer'. Laser beams, in particular, are in a unique position to play a role in such processes, since they offer a macroscopic 'phase handle' with which to create such effects. This was recognized by the community of atomic and molecular scientists, who emphasized the active manifestations of quantum interference effects and proposed that branching ratios in photochemical reactions


might be controlled through laser-induced interference processes. The use of phase as a control parameter also represents a novel horizon of applications for the laser, since most previous applications had involved only amplitude (intensity). Of course, just as the 'visibility' of the screen pattern for a classical Young's double slit interferometer is governed by the coherence properties of the two slit source, the evolution of a quantum system reflects the coherence properties of the perturbations, and the degree to which one can exercise phase control via external perturbations is also influenced by the interaction of the system of interest with a reservoir – essentially any other degrees of freedom in the system – which can cause decoherence and reduce the effectiveness of the 'matter interferometer'. Since the influence of a reservoir will generally increase with the complexity of a system, one might think that coherence control can only be effectively achieved in simple atomic and molecular systems. Indeed, some of the earliest suggestions for coherence control of a system involved using interference between single- and three-photon absorption processes, connecting the same initial and final states with photons of frequency $3\omega$ and $\omega$, respectively, and controlling the population of excited states in atomic or diatomic systems. However, it has been difficult to extend this 'two color' paradigm to more complex molecular systems. Since the reservoir leads to decoherence of the system overall, shorter and shorter pulses must be used to overcome decoherence effects. However, short pulses possess a large bandwidth, with the result that, for complex systems, selectivity of the final state can be lost. One must therefore consider using interference between multiple pathways in order to control the system, as dictated by the details of the unperturbed Hamiltonian of the system. For a polyatomic molecule or a solid, the complete Hamiltonian is virtually impossible to determine exactly, and it is therefore equally difficult to prescribe the optimal spectral (amplitude and phase) content of the pulses that should be used for control purposes. An alternative approach has therefore emerged, in which one foregoes knowledge of the eigenstates of the Hamiltonian and details of the different possible interfering transition paths in order to achieve control of the end state of a system. This branch of coherence control has come to be known as optimal control. In optimal control one employs an optical source for which one can (ideally) have complete control over the spectral and phase properties. One then uses a feedback process in which experiments are carried out, the effectiveness of achieving a certain result is determined, and the pulse characteristics are then


altered to obtain a new result. A key component is the use of an algorithm to select the new pulse properties as part of the feedback system. In this approach the molecule teaches the external control system what it ‘requires’ for a certain result to be optimally achieved. While the ‘best’ pulse properties may not directly reveal details of the multiple interference process required to achieve the optimal result, this technique can nonetheless be used to gain some insight into the properties of the unperturbed Hamiltonian and, regardless of such understanding, achieve a desirable result. It has been used to control chemical reaction rates involving several polyatomic molecules, with considerable enhancement in achieving a certain product relative to what can be done using simple thermodynamics.

Coherence Control in Semiconductors

Since coherence control of chemical reactions involving large molecules has represented a significant challenge, it was generally felt that ultrafast decoherence processes would also make control in solids, in general, and semiconductors, in particular, difficult. Early efforts therefore focused on atomic-like situations in which the electrons are bound and associated with discrete states and long coherence times. In semiconductors, the obvious choices are the excitonic system, defect states, or the discrete states offered by quantum wells. Population control of excitons and directional ionization of electrons from quantum wells have been clearly demonstrated, similar to related population control of, and directional ionization from, atoms. Other manifestations of coherence control of semiconductors include control of electron–phonon interactions and intersub-band transitions in quantum wells. Among the different types of coherence control in semiconductors is the remarkable result that it is possible to coherently control the properties of free electrons associated with continuum states. Although optimal control might be used for some of these processes, we have achieved clear illustrations of control phenomena based on the use of harmonically related beams and interference of single- and two-photon absorption processes connecting the continuum valence and conduction band states, as generically illustrated in Figure 1c. Details of the various processes observed can be found elsewhere. Of course, the momentum relaxation time of electrons or holes in continuum states is typically of the order of 100 fs at room temperature, but this time is sufficiently long that it can permit phase-controlled processes. Indeed, with respect to conventional carrier transport, this 'long time' lapse is responsible


for the typically high electrical mobilities in crystalline semiconductors such as Si and GaAs. In essence, a crystalline semiconductor with translational symmetry has crystal momentum as a good quantum number, and selection rules for scattering prevent the momentum relaxation time from being prohibitively small. Since electrons or holes in any continuum state can participate in such control processes, one need not be concerned about pulsewidth or bandwidth, unless the pulses were to be so short that carriers of both positive and negative effective mass were generated within one band.

Coherent Control of Electrical Current Using Two Color Beams

We now illustrate the basic principles that describe how the interference process involving valence and conduction band states can be used to control properties of a bulk semiconductor. We do so in the case of using one or two beams to generate and control carrier population, electrical current, and spin current. In pointing out the underlying ideas behind coherence control of semiconductors, we will skip or suppress many of the mathematical details which are required for a full understanding but which might obscure the essential physics. In particular we consider how phase-related optical beams with frequencies $\omega$ and $2\omega$ interact with a direct gap semiconductor such that $\hbar\omega < E_g < 2\hbar\omega$, where $E_g$ is the electronic bandgap. For simplicity we consider exciting electrons from a single valence band via single-photon absorption at $2\omega$ and two-photon absorption at $\omega$. As is well known, within an independent particle approximation, the states of electrons and holes in semiconductors can be labeled by their vector crystal momentum $\mathbf{k}$; the energy of states near the conduction or valence bandedges varies quadratically with $|\mathbf{k}|$. For one-photon absorption the transition amplitude can be derived using a perturbation Hamiltonian of the form $H = (e/mc)\mathbf{A}_{2\omega}\cdot\mathbf{p}$, where $\mathbf{A}_{2\omega}$ is the vector potential associated with the light field and $\mathbf{p}$ is the momentum operator. The transition amplitude is therefore of the form

$$a_{2\omega} \propto E_{2\omega} e^{i\phi_{2\omega}} p_{cv} \qquad [4]$$

where $p_{cv}$ is the interband matrix element of $\mathbf{p}$ along the field ($E_{2\omega}$) direction; for illustration purposes the field is taken to be linearly polarized. The overall transition rate between two particular states of the same $\mathbf{k}$ can be expressed as $W_1 \propto a_{2\omega}(a_{2\omega})^* \propto I_{2\omega}|p_{cv}|^2$, where $I_{2\omega}$ is the intensity of the beam. This rate is independent of the phase of the light beam as well as the sign of $\mathbf{k}$. Hence the absorption of light via single-photon transitions generally populates states of equal and opposite momentum with equal probability or, equivalently, establishes a standing electron wave with zero crystal momentum. This is not surprising since photons possess very little momentum and, in the approximation of a uniform electric field, do not give any momentum to an excited electron. The particular states that are excited depend on the light polarization and crystal orientation. However, the main point is that while single-photon absorption can lead to anisotropic filling of electron states, the distribution in momentum space is not polar (dependent on sign of $\mathbf{k}$). Similar considerations apply to the excited holes, but to avoid repetition we will focus on the electrons only.

For two-photon absorption involving the $\omega$ photons and connecting states similar to those connected by single-photon absorption, one must employ 2nd order perturbation theory using the Hamiltonian $H = (e/mc)\mathbf{A}_{\omega}\cdot\mathbf{p}$. For two-photon absorption there is a transition from the valence band to an (energy nonallowed) intermediate state followed by a transition from the intermediate state to the final state. To determine the total transition rate one must sum over all possible intermediate states. For semiconductors, by far the dominant intermediate state is the final state itself, so that the associated transition amplitude has the form

$$a_{\omega} \propto (E_{\omega} e^{i\phi_{\omega}} p_{cv})(E_{\omega} e^{i\phi_{\omega}} p_{cc}) \propto (E_{\omega})^2 e^{2i\phi_{\omega}} p_{cv}\,\hbar k \qquad [5]$$

where the matrix element $p_{cc}$ is simply the momentum of the conduction band state ($\hbar k$) along the direction of the field. Note that unlike an atomic system, $p_{cc}$ is nonzero, since Bloch states in a crystalline solid do not possess inversion symmetry. If two-photon absorption acts alone, the overall transition rate between two particular $\mathbf{k}$ states would be $W_2 \propto (I_{\omega})^2|p_{cv}|^2 k^2$. As with single-photon absorption, because this transition rate is independent of the sign of $\mathbf{k}$, two-photon absorption leads to production of electrons with no net momentum.

When both single- and two-photon transitions are present simultaneously, the transition amplitude is the sum of the transition amplitudes expressed in eqns [4] and [5]. The overall transition rate is then found as in eqn [3] and yields $W = W_1 + W_2 + \mathrm{Int}$, where the interference term Int is given by

$$\mathrm{Int} \propto E_{2\omega}E_{\omega}E_{\omega}\sin(\phi_{2\omega} - 2\phi_{\omega})\,k \qquad [6]$$

Note, however, that the interference effect depends on the sign of $\mathbf{k}$ and hence can be constructive for one part of the conduction band but destructive for other parts of that band, depending on the value of the relative phase, $\Delta\phi = \phi_{2\omega} - 2\phi_{\omega}$. In principle one can largely eliminate transitions with $+\mathbf{k}$ and enhance those with $-\mathbf{k}$. This effectively generates a net momentum for electrons or holes and hence, at least temporarily, leads to an electrical current in the absence of any external bias. The net momentum of the carriers, which is absent in the individual processes, must come from the lattice. Because the carriers are created with net momentum during the pulses, one has a form of current injection that can be written as

$$\frac{dJ}{dt} \propto E_{2\omega}E_{\omega}E_{\omega}\sin(\Delta\phi) \qquad [7]$$

where $J$ is the current density. This type of current injection is allowed in both centrosymmetric and noncentrosymmetric materials. The physics and concepts behind the quantum interference leading to this form of current injection are analogous with the Young's double slit experiment. Note as well that if this 'interferometer' is balanced (single- and two-photon transition rates are similar), then it is possible to control the overall transition rate with great contrast. Roughly speaking, one balances the two arms of the 'effective interferometer' involved in the interference process, one 'arm' corresponding to the one-photon process and the other to the two-photon process.

As an example, let's consider the excitation of GaAs, which has a room-temperature bandgap of 1.42 eV (equivalent to 870 nm). For excitation of GaAs using 1550 and 775 nm light under balanced conditions, the electron cloud is injected with a speed close to 500 km s⁻¹, and under 'balanced' conditions nearly all the electrons are moving in the same direction. Assuming that the irradiance of the 1550 nm beam is 100 MW cm⁻², while that of the second harmonic beam is only 15 kW cm⁻² (satisfying the 'balance' condition), with Gaussian pulse widths of 100 fs, one obtains a surprisingly large peak current of about 1 kA cm⁻² for a carrier density of only 10¹⁴ cm⁻³ if scattering effects are ignored. When scattering is taken into account, the value of the peak current is reduced and the transient current decays on a time-scale of the momentum relaxation time. Figure 2a shows an experimental setup that can be used to demonstrate coherence control of electrical current using the above parameters: the region between a pair of electrodes on GaAs is illuminated by a train of harmonically related pulses. Figure 2b illustrates how the steady-state experimental voltage across the capacitor changes as the phase parameter $\Delta\phi$ is varied. Transient electrical currents, generated through incident femtosecond optical pulses, have also been detected through the emission of the associated terahertz radiation.
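The quoted peak current is consistent with a simple drifting-cloud estimate, $J = nev$, using the numbers in the paragraph above (a rough check, assuming every injected electron moves at the injection speed):

```python
# Back-of-envelope check of the peak injected current (values from the text)
e = 1.602e-19   # electron charge, C
n = 1e14        # injected carrier density, cm^-3
v = 5.0e7       # injection speed ~500 km/s, expressed in cm/s

j_peak = n * e * v
print(f"peak current density ~ {j_peak:.0f} A/cm^2")  # ~800 A/cm^2, i.e. ~1 kA/cm^2
```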

Figure 2 (a) Experimental setup to measure the steady-state voltage (V) across a pair of electrodes on GaAs with the intervening region illuminated by phased radiation at $\omega$ and $2\omega$; (b) induced voltage across a pair of electrodes on a GaAs semiconductor as a function of the phase parameter associated with two harmonically related incident beams.

Current injection via coherence control differs from the conventional case (i.e., injection under a DC bias) with respect to its evolution. The current injected via coherent control has an onset determined by the rise time of the optical pulses. In the case of normal current production, existing carriers are accelerated by an electric field, and the momentum distribution is never far from isotropic. For a carrier density of 10¹⁴ cm⁻³ a DC field of ~80 kV cm⁻¹ is required to produce a current density of 1 kA cm⁻². In GaAs, with an electron mobility of 8000 cm² V⁻¹ s⁻¹, this current would be reached about 1/2 ps after the field is 'instantaneously' turned on. This illustrates that the coherently controlled phenomenon produces a large current more efficiently and quickly than can be achieved by redirecting statistically distributed electrons.


the maximum electrical current injection in the case of GaAs occurs for linearly polarized beams both oriented along the (111) or equivalent direction. However, large currents can also be observed with the light beams polarized along other high symmetry directions.

Coherent Control of Carrier Density, Spin Population, and Spin Current Using Two Color Beams

The processes described above do not exhaust the coherent control effects that can be observed in bulk semiconductors using harmonic beams. Indeed, for noncentrosymmetric materials, certain light polarizations and crystal orientations allow one to coherently control the total carrier generation rate with or without generating an electrical current. In the case of the cubic material GaAs, provided the $\omega$ beam has electric field components along two of the three principal crystal axes, with the $2\omega$ beam having a component along the third direction, one can coherently control the total carrier density. However, the overall degree of control here is determined by how 'noncentrosymmetric' the material is. The spin degrees of freedom of a semiconductor can also be controlled using two color beams. Due to the spin–orbit interaction, the upper valence bands of a typical semiconductor have certain spin characteristics. To date, optical manipulation of electron spin has been largely based on the fact that partially spin-polarized carriers can be injected in a semiconductor via one-photon absorption of circularly polarized light from these upper valence bands. In such carrier injection – where in fact two-photon absorption could be used as well – spins with no net velocity are injected, and then are typically dragged

by a bias voltage to produce a spin-polarized current. However, given the protocols discussed above it should not come as a surprise that the two color coherence scheme, when used with certain light polarizations, can coherently control the spin polarization, making it dependent on $\Delta\phi$. Furthermore, for certain polarization combinations it is also possible to generate a spin current with or without an electrical current. Given the excitement surrounding the field of spintronics, where the goal is the use of the spin degree of freedom for data storage and processing, the control of quantities involving the intrinsic angular momentum of the electron is of particular interest. Various polarization and crystal geometries can be examined for generating spin currents with or without electrical current. For example, with reference to Figure 3a, for both beams propagating in the $z$ (normal) direction of a crystal and possessing the same circular polarization, an electrical current can be injected in the ($xy$) plane, at an angle from the crystallographic $x$ direction dependent on the relative phase parameter $\Delta\phi$; this current is spin-polarized in the $z$ direction. As well, the injected carriers have a $\pm z$ component of their velocity, as many with one component as with the other; but those going in one direction are preferentially spin-polarized in one direction in the ($xy$) plane, while those in the other direction are preferentially spin-polarized in the opposite direction. This is an example of a spin current in the absence of an electrical current. A perhaps more striking pure spin current is observable with the two beams cross-linearly polarized, for example with the fundamental beam polarized in the $x$ direction and the second harmonic beam in the $y$ direction, as shown in Figure 3b. Then there is no net spin injection; the average spin in any

Figure 3 (a) Excitation of a semiconductor by co-circularly polarized $\omega$ and $2\omega$ pulses. Arrows indicate that a spin-polarized electrical current is generated in the $x$–$y$ plane in a direction dependent on $\Delta\phi$, while a pure spin current is generated in the beam propagation ($z$) direction. (b) Excitation of a semiconductor by orthogonally, linearly polarized $\omega$ and $2\omega$ pulses. Arrows indicate that a spin-polarized electrical current is generated in the direction of the fundamental beam polarization as well as along the beam propagation ($z$) direction.


direction is zero. And for a vanishing phase parameter $\Delta\phi$ there is no net electrical current injection. Yet, for example, the electrons injected with a $+x$ velocity component will have one spin polarization with respect to the $z$ direction, while those injected with a $-x$ component to their velocity will have the opposite spin polarization with respect to the $z$ direction. The examples given above are the spin current analogs of the two-color electrical current injection discussed above. There should also be the possibility of injecting a spin current with a single beam into crystals lacking center-of-inversion symmetry. The simple analysis presented above relies on simple quantum mechanical ideas and calculations, using essentially nothing more than Fermi's Golden Rule. Yet the process involves a mixing of two frequencies, and can therefore be thought of as a nonlinear optical effect. Indeed, one of the key ideas to emerge from the theoretical study of such phenomena is that an interpretation of coherence control effects, alternative to that given by the simple quantum interference picture, is provided by the usual susceptibilities of nonlinear optics. These susceptibilities are, of course, based on quantum mechanics, but the macroscopic viewpoint allows for the identification and classification of the effects in terms of 2nd order nonlinear optical effects, 3rd order optical effects, etc. Indeed, one can generalize many of the processes we have discussed above to a hierarchy of frequency mixing effects or high-order nonlinear processes involving multiple beams with frequencies $n\omega$, $m\omega$, $p\omega$, … with $n$, $m$, $p$ being integers. Many of these schemes require high intensity of one or more of the beams, but can occur in a simple semiconductor.

Coherent Control of Electrical Current Using Single Color Beams

It is also possible to generate coherence control effects using beams at a single frequency ($\omega$), if one focuses on the two orthogonal polarization components (e.g., $x$ and $y$) and uses a noncentrosymmetric semiconductor of reduced symmetry, such as a strained cubic semiconductor or a wurtzite material such as CdS or CdSe. In this case, interference between absorption pathways associated with the orthogonal components can lead to electrical current injection given by

$$\frac{dJ}{dt} \propto E_{\omega}E_{\omega}\sin(\phi^{x}_{\omega} - \phi^{y}_{\omega}) \qquad [8]$$

Since in this process current injection is linear in the beam's intensity, the high intensities necessary for two-photon absorption are not necessary. Nonetheless, the efficacy of this process is limited by the fact that it relies on the breaking of center-of-inversion symmetry of the underlying crystal. It is also clear that the maximum current occurs for circularly polarized light and that right and left circularly polarized light lead to a difference in sign of the current injection. Finally, for this particular single beam scheme, when circularly polarized light is used the electrical current is partially spin polarized.

Conclusions

Through the phase of optical pulses it is possible to control electrical and spin currents, as well as carrier density, in bulk semiconductors on a time-scale that is limited only by the rise time of the optical pulse and the intrinsic response of the semiconductor. A few optically based processes that allow one to achieve these types of control have been illustrated here. Although at one level one can understand these processes in terms of quantum mechanical interference effects, at a macroscopic level one can understand these control phenomena as manifestations of nonlinear optical phenomena. For fundamental as well as applied reasons, our discussion has focused on coherence control effects using the continuum states in bulk semiconductors, although related effects can also occur in quantum dot, quantum well, and superlattice semiconductors. Applications of these control effects will undoubtedly exploit the all-optical nature of the process, including the speed at which the effects can be turned on or off. The turn-off effects, although not discussed extensively here, are related to transport phenomena as well as momentum scattering and related dephasing processes.

See also

Semiconductor Physics: Spin Transport and Relaxation in Semiconductors; Spintronics. Interferometry: Overview. Coherent Control: Experimental; Theory.

Further Reading

Brumer P (1995) Laser control of chemical reactions. Scientific American 272: 56.
Rabitz H, de Vivie-Riedle R, Motzkus M and Kompa K (2000) Whither the future of controlling quantum phenomena? Science 288: 824.
Shah J (1999) Ultrafast Spectroscopy of Semiconductors and Semiconductor Nanostructures. New York: Springer.
van Driel HM and Sipe JE (2001) Coherent control of photocurrents in semiconductors. In: Tsen T (ed.) Ultrafast Phenomena in Semiconductors, p. 261. New York: Springer Verlag.


COHERENT LIGHTWAVE SYSTEMS

M J Connelly, University of Limerick, Limerick, Ireland

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Conventional optical communication systems are based on the intensity modulation (IM) of an optical carrier by an electrical data signal and direct detection (DD) of the received light. The simplest scheme employs on/off keying (OOK), whereby turning an optical source on or off transmits a binary 1 or 0. The signal light is transmitted down an optical fiber and at the receiver is detected by a photodetector, as shown in Figure 1. The resulting photocurrent is then processed to determine if a 1 or 0 was received. In the ideal case, where a monochromatic light source is used and where no receiver noise (detector dark current and thermal noise) is present, the probability of error is determined by the quantum nature (shot noise) of the received light. In this case, the number of photons/bit required (sensitivity) at the receiver, to achieve a bit-error rate (BER) of 10⁻⁹, is 10. This is called the quantum limit. In practical receivers without an optical preamplifier, operating in the 1.3 μm and 1.55 μm optical fiber communication windows, the actual sensitivity is typically 10–30 dB less than the theoretical sensitivity. This is due to receiver noise, in particular thermal noise. This means that the actual sensitivity is between 100 and 10 000 photons/bit. The capacity of IM/DD fiber links can be increased using wavelength division multiplexing (WDM). At the receiver the desired channel is selected using a narrowband optical filter. Some advanced commercial IM/DD systems utilize dense WDM with 50 GHz spacing and single-channel bit rates as high as 40 Gb/s. Indeed, IM/DD systems with terabit (10¹² bit/s) capacity are now possible. Compared to IM/DD systems, coherent optical communication systems have greater sensitivity and selectivity. In the context of coherent lightwave systems, the term coherent refers to any technique employing nonlinear mixing between two optical waves on a photodetector, as shown in Figure 2. Typically, one of these is an information-bearing signal and the other is a locally generated wave (local oscillator). The result of this heterodyne process is a modulation of the photodetector photocurrent at a frequency equal to the difference between the signal and local oscillator frequencies. This intermediate

frequency (IF) electronic signal contains the information in the form of amplitude, frequency, or phase that was present in the original optical signal. The IF signal is filtered and demodulated to retrieve the transmitted information. An automatic frequency control circuit is required to keep the local-oscillator frequency stable. It is possible for the information signal to contain a number of subcarriers (typically at microwave frequencies), each of which can be modulated by a separate data channel. It is a simple matter to select the desired channel at the receiver by employing electronic heterodyning and low pass filtering, as shown in Figure 3. In IM/DD systems, channel selection can only be carried out using narrowband optical filters. Coherent optical communications utilize techniques that were first investigated in radio communications. Most of the basic research work on coherent optical communications was carried out in the 1980s and early 1990s, and was primarily motivated by the need for longer fiber links using no repeaters. Improving receiver sensitivity using coherent techniques made it possible to increase fiber link spans. In the mid-1990s, reliable optical fiber amplifiers became available. This made it possible to construct fiber links using optical amplifiers spaced at appropriate intervals to compensate for fiber and other transmission losses. The sensitivity of an IM/DD receiver can be greatly improved by the use of an optical preamplifier. Indeed the improvement is so marked that coherent techniques are not a viable alternative to IM/DD in the vast majority of commercial optical communication systems due to their complexity and higher cost. Moreover, because semiconductor lasers with very precise wavelengths and narrowband optical filters can be fabricated, the selectivity advantage of coherent optical communications has become less important. This has meant that the research work currently being carried out on coherent optical communications is very limited, and in particular, very few field trials have been carried out since the early 1990s. We will review the basic principles underlying coherent optical communications, modulation schemes, detection schemes, and coherent receiver sensitivity. The effects of phase noise and polarization on coherent receiver performance will be outlined. A brief summary of pertinent experimental results will also be presented.
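The 10 photons/bit quantum limit quoted above follows from Poisson photodetection statistics: an error occurs only when a transmitted 1 (with mean peak photon number twice the per-bit average) yields zero detected photons. A one-line check under that standard assumption:

```python
import math

n_avg = 10                        # average photons per bit
ber = 0.5 * math.exp(-2 * n_avg)  # half the bits are 1s; zero-photon probability
print(f"BER at {n_avg} photons/bit: {ber:.2e}")  # ~1e-9
```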



Figure 1 IM/DD optical receiver.

Figure 2 Basic coherent optical receiver. AFC: automatic frequency control.

Figure 3 Coherent detection and processing of a subcarrier modulated signal lightwave. The power spectrum of the detected signal is shown: (a) after IF filtering; (b) after multiplication by the IF oscillator; (c) after multiplication with the subcarrier oscillator (in this case channel 1’s carrier frequency); and (d) after low pass filtering prior to demodulation and data detection. LPF: lowpass filter.

Basic Principles

In coherent detection, a low-level optical signal field $E_s$ is combined with a much larger power optical signal field $E_{LO}$ from a local oscillator laser. $E_s$ and $E_{LO}$ can be written as

$$E_s(t) = \sqrt{2P_s}\cos[2\pi f_s t + \phi_s(t)] \qquad [1]$$

$$E_{LO}(t) = \sqrt{2P_{LO}}\cos[2\pi f_{LO} t + \phi_{LO}(t)] \qquad [2]$$

where $P_s$, $f_s$, and $\phi_s$ are the signal power, optical frequency, and phase (including phase noise), respectively. $P_{LO}$, $f_{LO}$, and $\phi_{LO}$ are the equivalent quantities for the local oscillator. For ideal coherent detection, the polarization states of the signal and local oscillator must be equal. If this is the case, the photocurrent is given by

$$i_d(t) = R[E_s(t) + E_{LO}(t)]^2 \qquad [3]$$

where $R$ is the detector responsivity. Because the detector cannot respond to optical frequencies, we have

$$i_d(t) = R\{P_s + P_{LO} + 2\sqrt{P_s P_{LO}}\cos[2\pi f_{IF} t + \phi_s(t) - \phi_{LO}(t)]\} \qquad [4]$$

The IF signal is centered at frequency $f_{IF} = f_s - f_{LO}$. In homodyne detection $f_s = f_{LO}$ and the homodyne signal is centered at baseband. In heterodyne detection $f_s \neq f_{LO}$ and the signal generated by the photodetector is centered around $f_{IF}$ (typically three to six times the bit rate). The IF photocurrent is proportional to $\sqrt{P_s}$, rather than $P_s$ as is the case in direct detection, and is effectively amplified by a factor proportional to $\sqrt{P_{LO}}$. If $P_{LO}$ is large enough, the signal power can be raised above the receiver noise, thereby leading to a greater sensitivity than is possible with conventional IM/DD receivers using p–i–n or avalanche photodiodes. Theoretically it is possible to reach shot-noise-limited receiver sensitivity, although this is not possible in practice. Homodyne optical receivers usually require less bandwidth and are more sensitive than heterodyne receivers, but require that the local oscillator is phase locked to the information signal. This requires an optical phase locked loop (OPLL), which is very difficult to design.
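The heterodyne beat of eqn [4] can be reproduced numerically from the square-law detector model of eqn [3]. In the sketch below the carrier frequencies are scaled far below optical values purely so the fields can be sampled directly, and all powers are illustrative:

```python
import numpy as np

f_s, f_lo = 1.00e9, 0.80e9        # scaled-down 'optical' frequencies, Hz
P_s, P_lo, R = 1e-6, 1e-3, 1.0    # powers (W) and responsivity (A/W)

t = np.arange(0, 200e-9, 25e-12)  # 200 ns record sampled at 40 GS/s
E_s = np.sqrt(2 * P_s) * np.cos(2 * np.pi * f_s * t)
E_lo = np.sqrt(2 * P_lo) * np.cos(2 * np.pi * f_lo * t)

i_d = R * (E_s + E_lo) ** 2       # square-law photodetection (eqn [3])

spec = np.abs(np.fft.rfft(i_d - i_d.mean()))
freqs = np.fft.rfftfreq(len(t), 25e-12)
band = freqs < 0.5e9              # IF filter: keep only the difference frequency
f_if = freqs[band][np.argmax(spec[band])]
print(f"recovered IF: {f_if/1e6:.0f} MHz (expected {abs(f_s - f_lo)/1e6:.0f})")
```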

Modulation Schemes

The three principal modulation schemes used in coherent optical communications, amplitude shift keying (ASK), frequency shift keying (FSK), and phase shift keying (PSK), are shown in Figure 4. ASK modulation is essentially the same as the OOK scheme used in IM/DD systems. The optical signal is given by

$$E_s(t) = 2a(t)\sqrt{P_s}\cos[2\pi f_s t + \phi_s(t)] \qquad [5]$$

where $a(t) = 0$ or 1 for transmission of a 0 or 1, respectively. $P_s$ is the average signal power, assuming that it is equally likely that a 0 or 1 is transmitted. ASK modulation can be achieved by modulating the output of a semiconductor laser using an external modulator. The most common type is based on LiNbO3 waveguides in a Mach–Zehnder interferometer configuration.

In FSK modulation, the optical frequency of the signal lightwave is changed slightly by the digital data stream. In binary FSK, the optical signal is given by

$$E_s(t) = \sqrt{2P_s}\cos[2\pi(f_s + a(t)\Delta f)t + \phi_s(t)] \qquad [6]$$

where $a(t) = -1$ or 1 for transmission of a 0 or 1, respectively. The choice of the frequency deviation $\Delta f$ depends on the available bandwidth, the bit rate $B_T$, and the demodulation scheme used in the receiver. The modulation index $m$ is given by

$$m = \frac{2\Delta f}{B_T} \qquad [7]$$

The minimum value of $m$ that produces orthogonality (for independent detection of the 0s and 1s) is 0.5. If $m = 0.5$, the signal is referred to as minimum FSK (MFSK). When $m \geq 2$, the signal is referred to as wide-deviation FSK, and when $m < 2$, as narrow-deviation FSK. Wide-deviation FSK receivers are more resilient to phase noise than narrow-deviation FSK receivers. FSK modulation can be achieved using LiNbO3 external modulators, acousto-optic modulators, and distributed feedback semiconductor lasers. The phase of the lightwave does not change between bit transitions.

In PSK modulation, the digital data stream changes the phase of the signal lightwave. In binary PSK the optical signal is given by

$$E_s(t) = \sqrt{2P_s}\cos\Big[2\pi f_s t + a(t)\theta + \frac{\pi}{2} + \phi_s(t)\Big] \qquad [8]$$

where $a(t) = -1$ or 1 for transmission of a 0 or 1, respectively. The phase shift between a 0 and 1 is $2\theta$. Most PSK schemes use $\theta = \pi/2$. PSK modulation can be achieved by using an external phase modulator or a multiple quantum well electroabsorption modulator.
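For reference, the three signal formats of eqns [5], [6], and [8] can be synthesized directly. The sketch below uses arbitrary normalized parameters (carrier frequency, bit rate, deviation) and assumes the equation forms as reconstructed above:

```python
import numpy as np

P_s, f_s = 1.0, 40.0         # normalized power and carrier frequency
bit_rate, n_bits = 4.0, 8
df, theta = 2.0, np.pi / 2   # FSK deviation and PSK phase shift

t = np.linspace(0, n_bits / bit_rate, 4000, endpoint=False)
bits = np.random.default_rng(1).integers(0, 2, n_bits)
a = bits[(t * bit_rate).astype(int)]   # NRZ data, 0/1
a_pm = 2 * a - 1                       # bipolar data, -1/+1

ask = 2 * a * np.sqrt(P_s) * np.cos(2 * np.pi * f_s * t)                         # eqn [5]
fsk = np.sqrt(2 * P_s) * np.cos(2 * np.pi * (f_s + a_pm * df) * t)               # eqn [6]
psk = np.sqrt(2 * P_s) * np.cos(2 * np.pi * f_s * t + a_pm * theta + np.pi / 2)  # eqn [8]
```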

Figure 4 Modulation schemes for coherent optical communications: (a) ASK; (b) FSK; and (c) PSK.

Detection Schemes

There are two classes of demodulation schemes: synchronous and asynchronous. The former exploits the frequency and phase of the carrier signal to perform the detection; the latter uses only the envelope of the carrier. In the following we review the principal demodulation schemes.

PSK Homodyne Detection

In an ideal PSK homodyne receiver, shown in Figure 5, the signal light and local oscillator have identical



Figure 5 PSK homodyne receiver.

Figure 6 PSK synchronous heterodyne receiver. BPF: bandpass filter.

Figure 7 FSK heterodyne synchronous receiver.

optical frequency and phase. If the resulting baseband information signal is positive, then a 1 has been received. If the baseband information signal is negative, then a 0 bit has been received. This scheme has the highest theoretical sensitivity, but requires an OPLL and is also highly sensitive to phase noise.

ASK Homodyne Detection

An ideal ASK homodyne receiver is identical to an ideal PSK homodyne receiver. Because the baseband information signal is either zero (0 bit) or nonzero (1 bit), the receiver sensitivity is 3 dB less than that of the PSK homodyne receiver.

PSK Heterodyne Synchronous Detection

Figure 6 shows an ideal PSK synchronous heterodyne receiver. The IF signal is filtered by a bandpass filter, multiplied by the phase locked reference oscillator to move the signal to baseband, low pass filtered and sent to a decision circuit that decides if the received

bit is a 0 or 1. The receiver requires synchronization of the reference oscillator with the IF signal. This is not easy to achieve in practice because of the large amount of semiconductor laser phase noise.

ASK Heterodyne Synchronous Detection

This scheme is basically the same as PSK heterodyne synchronous detection and is similarly difficult to design.

FSK Heterodyne Synchronous Detection

This scheme is shown in Figure 7 for a binary FSK signal consisting of two possible frequencies $f_1 = f_s - \Delta f$ and $f_2 = f_s + \Delta f$. There are two separate branches in which correlation with the two possible IF signals at $f_1$ and $f_2$ is performed. The two resulting signals are subtracted from each other and a decision taken on the sign of the difference. Because of the requirement for two electrical phase locked loops, the scheme is not very practical.


ASK Heterodyne Envelope Detection

In this scheme, shown in Figure 8, the IF signal is envelope detected to produce a baseband signal proportional to the original data signal. The output of the envelope detector is relatively insensitive to laser phase noise.

FSK Heterodyne Dual-Filter Detection

In this scheme, shown in Figure 9, two envelope detectors are used to demodulate the two possible IF signals at $f_1$ and $f_2$, and a decision is taken based on the outputs. The scheme is tolerant to phase noise and can achieve high sensitivity. However, it needs high-bandwidth receiver electronics because a large frequency deviation is required.

FSK Heterodyne Single-Filter Detection

This scheme omits one branch of the FSK heterodyne dual-filter receiver and performs a decision in the same way as for ASK heterodyne envelope detection. Because half the power of the detected signal is not used, the sensitivity of the scheme is 3 dB worse than the dual-filter scheme.

Figure 8 ASK heterodyne envelope receiver.

Figure 9 FSK heterodyne dual-filter receiver.

Figure 10 CPFSK heterodyne differential detection receiver.

CPFSK Heterodyne Differential Detection

This scheme, shown in Figure 10, uses the continuous phase (CP) characteristic of an FSK signal, i.e., there is no phase discontinuity at bit transition times. After bandpass filtering, the IF photocurrent is

$$i_{IF}(t) = A\cos\{2\pi[f_{IF} + a(t)\Delta f]t\} \qquad [9]$$

where $A = R\sqrt{P_s P_{LO}}$. The output from the delay-line demodulator, after lowpass filtering to remove the double IF frequency term, is

$$x(t) = \frac{A^2}{2}\cos\{2\pi[f_{IF} + a(t)\Delta f]\tau\} \qquad [10]$$

The function of the delay-line demodulator is to act as a frequency discriminator. We require that $x(t)$ is a maximum when $a(t) = 1$ and a minimum when $a(t) = -1$. This is the case if the following relations are satisfied:

$$2\pi(f_{IF} + \Delta f)\tau = 2\pi k$$
$$2\pi(f_{IF} - \Delta f)\tau = (2k - 1)\pi, \quad k \text{ an integer} \qquad [11]$$


Hence, we require that

$$\Delta f\,\tau = \frac{1}{4}, \qquad f_{IF}\,\tau = k - \frac{1}{4}, \quad k \text{ integer} \qquad [12]$$

In terms of the modulation index we have

$$\tau = \frac{1}{2mB_T} \qquad [13]$$

The above equation shows that it is possible to decrease the value of $\tau$ required by increasing $m$, thereby reducing the sensitivity of the receiver to phase noise. The minimum value of $m$ possible is 0.5. This scheme can operate with a smaller modulation index than the dual-filter and single-filter schemes for a given bit rate, thereby relaxing the bandwidth requirements of the receiver electronics.
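The design relations [12] and [13] are straightforward to tabulate; the bit rate and the choice $k = 1$ below are arbitrary illustrations:

```python
def delay_for(m, bit_rate):
    """Delay-line value that makes the discriminator output extremal (eqn [13])."""
    return 1.0 / (2.0 * m * bit_rate)

B_T = 2.5e9                       # 2.5 Gb/s, an assumed example rate
for m in (0.5, 1.0, 2.0):
    tau = delay_for(m, B_T)
    f_if = (1 - 0.25) / tau       # eqn [12] with k = 1: f_IF = (k - 1/4)/tau
    print(f"m = {m}: tau = {tau * 1e12:.0f} ps, f_IF = {f_if / 1e9:.2f} GHz")
```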

DPSK Heterodyne Differential Detection

In differential phase shift keying (DPSK) modulation, a 0 is sent by changing the phase of the carrier by $\pi$ with respect to the previous signal. A 1 is sent by not changing the phase. The receiver configuration is identical to the CPFSK heterodyne differential detection receiver, except that the time delay $\tau$ is equal to the inverse of the data rate. Compared to other PSK detection schemes, DPSK differential detection is relatively simple to implement and is relatively immune to phase noise.

Coherent Receiver Sensitivity

The signal-to-noise ratio $\gamma_c$ of an ideal PSK homodyne receiver, in the shot-noise limit, is given by

$$\gamma_c = \frac{2RP_s}{eB_c} \qquad [14]$$

where $B_c$ is the receiver bandwidth. The detector responsivity is given by

$$R = \frac{\eta e}{hf_s} \qquad [15]$$

where $\eta$ is the detector quantum efficiency ($\leq 1$). The average signal power $P_s = N_s hf_s B_T$, where $N_s$ is the average number of photons per bit. If it is assumed that $B_c = B_T$, then

$$\gamma_c = 2\eta N_s \qquad [16]$$

The actual detected number of photons per bit is $N_R = \eta N_s$, so

$$\gamma_c = 2N_R \qquad [17]$$

If the noise is assumed to have a Gaussian distribution, then the bit error rate (BER) is given by

$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\big(\sqrt{2N_R}\big) \qquad [18]$$

where erfc is the complementary error function; $\mathrm{erfc}(x) \approx \mathrm{e}^{-x^2}/(x\sqrt{\pi})$ for $x > 5$. The receiver sensitivity for a BER of $10^{-9}$ is 9 photons/bit. BER expressions and sensitivity for the demodulation schemes discussed above are listed in Table 1. The sensitivity versus $N_R$ is plotted for some of the demodulation schemes in Figure 11. The received signal power required for a given BER is proportional to $B_T$. This dependency is shown in Figure 12 for some of the demodulation schemes.

Table 1 Shot-noise limited BER expressions and sensitivity for coherent demodulation schemes

Modulation scheme         BER                      Sensitivity (photons/bit)
Homodyne
  PSK                     (1/2) erfc(√(2N_R))      9
  ASK                     (1/2) erfc(√N_R)         18 (peak 36)
Synchronous heterodyne
  PSK                     (1/2) erfc(√N_R)         18
  ASK                     (1/2) erfc(√(N_R/2))     36 (peak 72)
  FSK                     (1/2) erfc(√(N_R/2))     36
Asynchronous heterodyne
  ASK (envelope)          (1/2) exp(-N_R/2)        40 (peak 80)
  FSK (dual-filter)       (1/2) exp(-N_R/2)        40
  FSK (single-filter)     (1/2) exp(-N_R/4)        80
Differential detection
  CPFSK                   (1/2) exp(-N_R)          20
  DPSK                    (1/2) exp(-N_R)          20

Figure 11 Shot-noise limited BER versus number of received photons/bit for various demodulation schemes: (a) PSK homodyne; (b) synchronous PSK; (c) synchronous FSK; (d) ASK envelope; (e) FSK dual-filter; and (f) CPFSK and DPSK.
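A few of the Table 1 sensitivities can be verified numerically by solving BER$(N_R) = 10^{-9}$ for the listed expressions (the root-bracketing interval is arbitrary):

```python
import numpy as np
from scipy.special import erfc
from scipy.optimize import brentq

schemes = {
    "homodyne PSK":         lambda n: 0.5 * erfc(np.sqrt(2 * n)),
    "sync. heterodyne PSK": lambda n: 0.5 * erfc(np.sqrt(n)),
    "ASK envelope":         lambda n: 0.5 * np.exp(-n / 2),
    "DPSK":                 lambda n: 0.5 * np.exp(-n),
}

for name, ber in schemes.items():
    n_sens = brentq(lambda n: ber(n) - 1e-9, 1, 200)
    print(f"{name}: {n_sens:.1f} photons/bit")  # ~9, ~18, ~40, ~20
```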


Phase Noise

A major influence on the sensitivity possible with practical coherent receivers is laser phase noise, which is responsible for the laser linewidth (the 3 dB bandwidth of the laser power spectrum). The effect of phase noise on system performance depends on the modulation and demodulation scheme used. In general, the effects of phase noise are more severe in homodyne and synchronous heterodyne schemes than in asynchronous schemes. If $\Delta\nu_{IF}$ is the beat linewidth (without modulation present) between the signal and local oscillator lasers, the normalized beat linewidth $\Delta\nu$ is defined as

$$\Delta\nu = \frac{\Delta\nu_{IF}}{B_T} \qquad [19]$$

where $\Delta\nu_{IF}$ is the IF phase noise bandwidth. Semiconductor lasers usually have a Lorentzian spectrum, in which case $\Delta\nu_{IF}$ is equal to the sum of the signal and local-oscillator laser linewidths. The requirements placed on $\Delta\nu$ become less severe as the bit rate increases. Table 2 compares the $\Delta\nu$ requirement for a 1 dB power penalty at a BER of $10^{-9}$ for

some of the principal modulation/demodulation schemes.

Table 2 Required normalized beat linewidth at 1 dB power penalty for a BER = 10⁻⁹ for various modulation/demodulation schemes

Modulation scheme                           Δν (%)
Homodyne PSK                                6 × 10⁻⁴ (balanced loop); 0.01 (decision-driven loop)
Synchronous heterodyne PSK                  0.2–0.5
Synchronous heterodyne ASK                  0.3–0.7
Asynchronous heterodyne ASK (envelope)      3
Asynchronous heterodyne FSK (dual-filter)   10–20
Differential detection CPFSK                0.33 (m = 0.5); 0.66 (m = 1.0)
Differential detection DPSK                 0.33

Figure 12 Required signal power versus bit rate for a BER = 10⁻⁹ for various shot-noise limited demodulation schemes: (a) PSK homodyne; (b) ASK envelope; and (c) CPFSK. The signal wavelength is 1550 nm and the detector is assumed to have unity quantum efficiency.

PSK Homodyne Detection – Phase Locked Loop Schemes

Practical homodyne receivers require an OPLL to lock the local-oscillator frequency and phase to that of the signal lightwave. There are a number of OPLL schemes possible for PSK homodyne detection, including the balanced PLL, the Costas-type PLL, and the decision-driven PLL. The balanced PLL scheme, shown in Figure 13, requires that the PSK signal uses a phase shift of less than $\pi$ between a 0 and 1. $E_s(t)$ and $E_{LO}(t)$ can be written as

$$E_s(t) = \sqrt{2P_s}\cos\Big[2\pi f_0 t + a(t)\theta + \phi_{N,s}(t) + \frac{\pi}{2}\Big] \qquad [20]$$

$$E_{LO}(t) = \sqrt{2P_{LO}}\cos[2\pi f_0 t + \phi_{N,LO}(t)] \qquad [21]$$

where $f_0$ is the common signal and local oscillator optical frequency, and $\phi_{N,s}(t)$ and $\phi_{N,LO}(t)$ are the signal and local oscillator phase noise, respectively. When $\theta \neq \pi/2$, $E_s(t)$ contains an unmodulated carrier component (residual carrier) in quadrature with the information-bearing signal. $E_s(t)$ and $E_{LO}(t)$ are combined by an optical 180° hybrid. The output optical signals from the $\pi$-hybrid are given by

$$E_1(t) = \frac{1}{\sqrt{2}}[E_s(t) + E_{LO}(t)] \qquad [22]$$

$$E_2(t) = \frac{1}{\sqrt{2}}[E_s(t) - E_{LO}(t)] \qquad [23]$$

The voltage at the output of the summing circuit is given by

$$v_L(t) = 2RR_L\sqrt{P_s P_{LO}}\{a(t)\sin\theta\cos[\phi_e(t)] + \cos\theta\sin[\phi_e(t)]\} \qquad [24]$$

where the photodetectors have the same responsivity and load resistance $R_L$. The phase error is given by

$$\phi_e(t) = \phi_{N,s}(t) - \phi_{N,LO}(t) - \phi_c(t) \qquad [25]$$

$\phi_c(t)$ is the controlled phase determined by the control signal $v_c(t)$ at the output of the loop filter. $v_L$ contains two terms: an information-bearing signal proportional to $\sin\theta$, which is processed by the data detection circuit, and a phase error signal proportional to $\cos\theta$ used by the PLL for locking. The power penalty, due to the residual carrier transmission, is $10\log_{10}(1/\sin^2\theta)$ dB. This scheme is not practical because the $\Delta\nu$ needed for a power penalty of < 1 dB is typically less than $10^{-5}$. This would require that $\Delta\nu_{IF}$ < 10 kHz for a data rate of 1 Gb/s. The narrow signal and local-oscillator linewidths required are only achievable with external cavity semiconductor lasers.

Figure 13 PSK homodyne balanced PLL receiver.

The Costas-type and decision-driven PLLs also utilize $\pi$-hybrids and two photodetectors, but as they have full suppression of the carrier ($\theta = \pi/2$), nonlinear processing of the detected signals is required to produce a phase error signal that can be used by the PLL for locking. In the decision-driven PLL, shown in Figure 14, the signal from one of the photodetector outputs is sent to a decision circuit and the output of the circuit is multiplied by the signal from the second photodetector. The mixer output is sent back to the local oscillator laser for phase locking. The $\Delta\nu$ required for a power penalty of < 1 dB is typically less than $2 \times 10^{-4}$. This is superior to both the balanced PLL and Costas-type PLL receivers.

Figure 14 PSK decision-driven PLL.

Asynchronous receivers do not require optical or electronic phase locking. This means that they are less

sensitive to phase noise than synchronous receivers. They do have some sensitivity to phase noise because IF filtering leads to phase-to-amplitude noise conversion. Differential Detection

Differential detection receivers have sensitivities to phase noise between synchronous and asynchronous receivers because, while they do not require phase locking, they use phase information in the received signal. Phase-Diversity Receivers

Asynchronous heterodyne receivers are relatively insensitive to phase noise but require a much higher receiver bandwidth for a given bit rate. Homodyne receivers only require a receiver bandwidth equal to the detected signal bandwidth but require an OPLL, which is difficult to implement. The phase diversity receiver, shown in Figure 15, is an asynchronous homodyne scheme, which does not require the use of an OPLL and has a bandwidth approximately equal to synchronous homodyne detection. This is at the expense of increased receiver complexity and reduced sensitivity compared to synchronous homodyne detection. The phase-diversity scheme can be used with ASK, DPSK, and CPFSK modulation. ASK modulation uses an squarer circuit for the demodulator component, while DPSK and CPFSK modulation use a delay line and mixer.

152 COHERENT LIGHTWAVE SYSTEMS

Figure 15 Multiport phase-diversity receiver. PD: photodetector.

In the case of ASK modulation, the outputs of the N-port hybrid can be written as    pffiffiffiffiffi 1 2pk Ek ðtÞ ¼ pffiffiffi aðtÞ 2Ps cos 2pfs t þ fs ðtÞ þ N N  pffiffiffiffiffiffiffi þ 2PLO cos½2pfLO t þ fLO ðtÞ ; k ¼ 1;2;…;N; N $ 3

½26

For N ¼ 2; the phase term is p=2 for k ¼ 2: The voltage input to the k-th demodulator (squarer), after the dc terms have been removed by the blocking capacitor, is given by  RRL pffiffiffiffiffiffiffiffi vk ðtÞ ¼ aðtÞ 2 Ps PLO cos 2pðfs 2 fLO Þ N  2pk þ fs ðtÞ 2 fLO ðtÞ þ ½27 N These voltages are then squared and added to give an output voltage 8 "

N < X RRL 2 1þcos 4p vL ðtÞ¼ a 4Ps PLO aðtÞ : N k¼1 #9

4pk = ½28  fs 2fLO tþ2fs ðtÞ22fLO ðtÞþ N ; where a is the squarer parameter. The lowpass filter at the summing circuit output effectively integrates vL ðtÞ over a bit period. If 2ðfs 2fLO ÞpBT and the difference between the signal and local oscillator phase noise is small within a bit period, then the input voltage to the decision circuit is ðRRL Þ2 4Ps PLO aðtÞ ½29 N which is proportional to the original data signal. In shot-noise limited operation, the receiver sensitivity is 3 dB worse than the ASK homodyne case. vD ðtÞ¼ a

The scheme is very tolerant of laser phase noise. The $\Delta\nu$ for a power penalty of < 1 dB is of the same order as for ASK heterodyne envelope detection.
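The reason the summed, squared port outputs are insensitive to the unknown phase is the identity $\sum_k \cos^2(\phi + 2\pi k/N) = N/2$ for $N \geq 3$, which is what eqn [28] reduces to after lowpass filtering. A quick numerical check (N = 3, arbitrary phases):

```python
import numpy as np

N = 3
k = np.arange(1, N + 1)
for phi in np.linspace(0, 2 * np.pi, 5):
    total = np.sum(np.cos(phi + 2 * np.pi * k / N) ** 2)
    print(round(float(total), 12))   # always N/2 = 1.5
```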

Polarization

The mixing efficiency between the signal and local-oscillator lightwaves is a maximum when their polarization states are identical. In practice the polarization state of the signal arriving at the receiver is unknown and changes slowly with time. This means that the IF photocurrent can change with time and, in the worst case, when the polarization states of the signal and local-oscillator are orthogonal, the IF photocurrent will be zero. There are a number of possible solutions to this problem.

1. Polarization maintaining (PM) fiber can be used in the optical link to keep the signal polarization from changing. It is then a simple matter to adjust the local-oscillator polarization to achieve optimum mixing. However, the losses associated with PM fiber are greater than for conventional single-mode (SM) fiber. In addition, most installed fiber is SM fiber.
2. An active polarization controller, such as a fiber coil using bending-induced birefringence, can be used to ensure that the local-oscillator polarization tracks the signal polarization.
3. A polarization scrambler can be used to scramble the polarization of the transmitted signal at a speed greater than the bit rate. The sensitivity degradation due to the polarization scrambling is 3 dB compared to a receiver with perfect polarization matching. However, this technique is only feasible at low bit rates.
4. The most general technique is to use a polarization-diversity receiver, as shown in Figure 16. In this scheme, the signal light and local-oscillator



Figure 16 Polarization-diversity receiver.

Table 3 General comparisons between various modulation/demodulation schemes. The sensitivity penalty is relative to ideal homodyne PSK detection

Modulation/demodulation scheme              Sensitivity penalty (dB)   Immunity to phase noise   IF bandwidth/bit rate   Complexity
Homodyne PSK                                0                          Very poor                 1
Synchronous heterodyne PSK                  3                          Poor                      3–6
Asynchronous heterodyne FSK (dual-filter)   6.5                        Excellent                 3–6
Differential detection CPFSK, DPSK          3.5                        Moderate

Theoretical Treatment of EIT in a Three-Level Medium

It was realized by several workers in the 1970s that laser-induced interference effects could lead to a cancellation of absorption at certain frequencies. To gain a more quantitative understanding of the effects of the coupling field upon the optical properties of a dense ensemble of three-level atoms we require a treatment that computes the optical susceptibility of the medium. A treatment originally carried out by Harris et al. for a Λ scheme similar to that illustrated in Figure 1 was the first to derive the modified susceptibilities that will be discussed below. In that treatment the state amplitudes in the three-level atom were solved in the steady-state limit and from these the linear and nonlinear susceptibilities (see below) are then derived. In what follows we write the total Hamiltonian of the system as

$$H = H_0 + V \qquad [2]$$

Figure 2 A four-wave mixing scheme incorporating the three-level lambda system in which electromagnetically induced transparency has been created for the generated field ωd by the coupling field ωc. The decay rates γi from the three atomic levels are also shown. For a full explanation of this system see text.


where H0 is the unperturbed Hamiltonian of the system and is written as

$$H_0 = \hbar\omega_1|1\rangle\langle1| + \hbar\omega_2|2\rangle\langle2| + \hbar\omega_3|3\rangle\langle3| \qquad [3]$$

and V is the interaction Hamiltonian and can be expressed as

$$V = \hbar\Omega_a\,e^{-i\omega_a t}|2\rangle\langle1| + \hbar\Omega_c\,e^{-i\omega_c t}|2\rangle\langle3| + \hbar\Omega_d\,e^{-i\omega_d t}|3\rangle\langle1| + \mathrm{c.c.} \qquad [4]$$

Note that the Rabi frequencies in this equation are defined by $\hbar\Omega_{ij} = \mu_{ij}|E(\omega_{ij})|$, where μij is the dipole moment of the transition between states |i⟩ and |j⟩; the Rabi frequency Ωa is a two-photon Rabi frequency that characterizes the coupling between the laser field a and the atom for this two-photon transition. We have assumed for simplicity that ωa = ωb = ω12/2. Assuming all the fields lie close to resonance, the rotating wave approximation can be applied in the interaction picture, and the interaction Hamiltonian VI is given as

$$V_I = \hbar\Omega_a\,e^{i\Delta_a t}|2\rangle\langle1| + \hbar\Omega_c\,e^{i\Delta_c t}|2\rangle\langle3| + \hbar\Omega_d\,e^{i\Delta_d t}|3\rangle\langle1| + \mathrm{c.c.} \qquad [5]$$

where Δa, Δc and Δd are the detunings of the fields and can be written as:

$$\Delta_a = \omega_{12} - 2\omega_a,\qquad \Delta_c = \omega_{32} - \omega_c,\qquad \Delta_d = \omega_{13} - \omega_d \qquad [6]$$

For the evaluation of the density matrix with this interaction VI, the Schrödinger equation can be restated in terms of the density matrix components. This form is called the Liouville equation and can be written as:

$$\hbar\frac{\partial}{\partial t}\rho_{ij}(t) = -i\sum_k H_{ik}(t)\rho_{kj}(t) + i\sum_k \rho_{ik}(t)H_{kj}(t) + \Gamma_{ij} \qquad [7]$$

where Γij represents phenomenologically added decay terms (i.e. spontaneous decays, collisional broadening, etc.). This formalism leads to a set of nine differential equations for the nine density matrix elements that describe the three-level system. To remove the optical frequency oscillations a coordinate transform is needed, and to incorporate the relevant oscillatory detuning terms into the off-diagonal elements we make the substitution:

$$\tilde\rho_{12} = \rho_{12}\,e^{-i\Delta_a t},\qquad \tilde\rho_{23} = \rho_{23}\,e^{-i\Delta_c t},\qquad \tilde\rho_{31} = \rho_{31}\,e^{-i\Delta_d t} \qquad [8]$$

This operation conveniently eliminates the explicit time dependencies in the rate equations, and the equations of motion for the density matrix are given by:

$$\begin{aligned}
\frac{\partial}{\partial t}\rho_{11} &= \frac{i}{2}\Omega_a\tilde\rho_{12} + \frac{i}{2}\Omega_d\tilde\rho_{13} - \frac{i}{2}\Omega_a^{*}\tilde\rho_{21} - \frac{i}{2}\Omega_d^{*}\tilde\rho_{31} + \Gamma_{11}\\
\frac{\partial}{\partial t}\rho_{22} &= \frac{i}{2}\Omega_c^{*}\tilde\rho_{23} + \frac{i}{2}\Omega_a^{*}\tilde\rho_{21} - \frac{i}{2}\Omega_c\tilde\rho_{32} - \frac{i}{2}\Omega_a\tilde\rho_{12} + \Gamma_{22}\\
\frac{\partial}{\partial t}\rho_{33} &= \frac{i}{2}\Omega_c\tilde\rho_{32} - \frac{i}{2}\Omega_c^{*}\tilde\rho_{23} + \frac{i}{2}\Omega_d^{*}\tilde\rho_{31} - \frac{i}{2}\Omega_d\tilde\rho_{13} + \Gamma_{33}
\end{aligned} \qquad [9]$$

and, for the coherences,

$$\begin{aligned}
\frac{\partial}{\partial t}\tilde\rho_{23} &= \frac{i}{2}\Omega_c\left(\tilde\rho_{22} - \tilde\rho_{33}\right) - i\Delta_c\tilde\rho_{23} + \frac{i}{2}\Omega_d^{*}\tilde\rho_{21} - \frac{i}{2}\Omega_a\tilde\rho_{13} + \Gamma_{23}\\
\frac{\partial}{\partial t}\tilde\rho_{21} &= \frac{i}{2}\Omega_a\left(\tilde\rho_{22} - \tilde\rho_{11}\right) - i\Delta_a\tilde\rho_{21} + \frac{i}{2}\Omega_d\tilde\rho_{23} - \frac{i}{2}\Omega_c\tilde\rho_{31} + \Gamma_{21}\\
\frac{\partial}{\partial t}\tilde\rho_{31} &= \frac{i}{2}\Omega_d\left(\tilde\rho_{33} - \tilde\rho_{11}\right) - i\Delta_d\tilde\rho_{31} + \frac{i}{2}\Omega_a\tilde\rho_{32} - \frac{i}{2}\Omega_c^{*}\tilde\rho_{21} + \Gamma_{31}
\end{aligned}$$

Using the set of equations above and the relation $\tilde\rho_{ij} = \tilde\rho_{ji}^{*}$ we obtain the corresponding equations for $\tilde\rho_{12}$, $\tilde\rho_{32}$, and $\tilde\rho_{13}$. For the incoherent population relaxation the decays can be written:

$$\Gamma_{11} = \gamma_a\rho_{22} + \gamma_d\rho_{33},\qquad \Gamma_{22} = -\gamma_a\rho_{22} + \gamma_c\rho_{33},\qquad \Gamma_{33} = -(\gamma_c + \gamma_d)\rho_{33} \qquad [10]$$

and for the coherence damping:

$$\begin{aligned}
\Gamma_{21} &= -\left(\tfrac{1}{2}\left[\gamma_a + \gamma_c\right] + \gamma^{\mathrm{col}}_{21}\right)\rho_{21}\\
\Gamma_{23} &= -\left(\tfrac{1}{2}\left[\gamma_a + \gamma_c + \gamma_d\right] + \gamma^{\mathrm{col}}_{23}\right)\rho_{23}\\
\Gamma_{31} &= -\left(\tfrac{1}{2}\left[\gamma_d + \gamma_c\right] + \gamma^{\mathrm{col}}_{31}\right)\rho_{31}
\end{aligned} \qquad [11]$$

where the $\gamma^{\mathrm{col}}_{ij}$ represent collisional dephasing terms which may be present.
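Equations of this type are readily integrated numerically. The sketch below is a minimal illustration, not the article's own computation: it integrates a reduced two-field version of the system (probe Ωd on |1⟩–|3⟩, coupling Ωc on |2⟩–|3⟩, two-photon field Ωa switched off) with the decay structure of eqns [10]–[11], using illustrative parameter values:

```python
import numpy as np

# Minimal sketch: reduced lambda system with a weak probe Om_d on |1>-|3>
# and a strong coupling Om_c on |2>-|3>, on resonance. |3> decays to |1>
# at g_d and to |2> at g_c. All values are illustrative assumptions.
g_d, g_c = 1.0, 1.0
Om_d, Om_c = 0.01, 4.0
H = np.zeros((3, 3), complex)          # rotating-frame Hamiltonian
H[0, 2] = H[2, 0] = Om_d / 2
H[1, 2] = H[2, 1] = Om_c / 2

rho = np.zeros((3, 3), complex)
rho[0, 0] = 1.0                        # all population starts in |1>
dt = 0.001
for _ in range(200_000):               # propagate to t = 200 (steady state)
    d = -1j * (H @ rho - rho @ H)      # coherent evolution
    p33 = rho[2, 2]
    d[0, 0] += g_d * p33               # population relaxation, cf. eqn [10]
    d[1, 1] += g_c * p33
    d[2, 2] -= (g_d + g_c) * p33
    gt = (g_d + g_c) / 2               # coherence damping, cf. eqn [11]
    for k in (0, 1):
        d[k, 2] -= gt * rho[k, 2]
        d[2, k] -= gt * rho[2, k]
    rho = rho + dt * d

print("steady-state |rho_13| =", abs(rho[0, 2]))  # ~0 on resonance: EIT
```

On resonance the coherence ρ13, and hence the probe absorption, is driven toward zero as the population settles into the dark superposition, which is precisely the transparency discussed below.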


This system of equations can be solved by various analytical or numerical methods to give the individual density matrix elements. Analytical solutions are possible if we can assume, for instance, that Ωc ≫ Ωa, Ωd and that the fields are continuous-wave, so that a steady-state treatment is valid. For pulsed fields, or in the case where the generated field may be significant, numerical solutions are in general required. We are specifically interested in the optical response to a probe field at ωd (close to resonance with the |1⟩–|3⟩ transition), which is governed by the magnitude of the coherence ρ13. We find ρ13 making the assumption that only the coupling field is strong, i.e., that Ωc ≫ Ωa, Ωd holds. From this quantity the macroscopic polarization is obtained, from which the susceptibility can be computed. The macroscopic polarization P at the transition frequency ω13 can be related to the microscopic coherence ρ13 via the expression:

$$P_{13} = N\mu_{13}\rho_{13} \qquad [12]$$

where N is the number of equivalent atoms in the ground state within the medium, and μ13 is the dipole matrix element associated with the transition. In this way the real and imaginary parts of the linear susceptibility χ at frequency ω can be directly related to ρ13 via the macroscopic polarization, since the latter can be defined as:

$$P_{13}(\omega) = \varepsilon_0\chi(\omega)E \qquad [13]$$

where E is the electric field amplitude at frequency ωd. The linear susceptibility (imaginary and real parts) is given by the following expressions:

$$\mathrm{Im}\,\chi^{(1)}_D(-\omega_d;\omega_d) = \frac{|\mu_{13}|^2N}{\varepsilon_0\hbar}\;\frac{8\Delta_{21}^2\gamma_d + 2\gamma_a\left(|\Omega_c|^2 + \gamma_d\gamma_a\right)}{\left(4\Delta_{31}\Delta_{21} - \gamma_d\gamma_a - |\Omega_c|^2\right)^2 + 4\left(\gamma_d\Delta_{21} + \gamma_a\Delta_{31}\right)^2} \qquad [14]$$

$$\mathrm{Re}\,\chi^{(1)}_D(-\omega_d;\omega_d) = \frac{|\mu_{13}|^2N}{\varepsilon_0\hbar}\;\frac{-4\Delta_{21}\left(|\Omega_c|^2 - 4\Delta_{21}\Delta_{31}\right) + 4\Delta_{31}\gamma_a^2}{\left(4\Delta_{31}\Delta_{21} - \gamma_d\gamma_a - |\Omega_c|^2\right)^2 + 4\left(\gamma_d\Delta_{21} + \gamma_a\Delta_{31}\right)^2} \qquad [15]$$

We now consider the additional two-photon resonant field ωa. This leads to a four-wave mixing process that generates a field at ωd (i.e., the probe frequency). The nonlinear susceptibility that describes the coupling of the fields in this four-wave mixing process is given by the expression:

$$\chi^{(3)}_D(-\omega_d;\omega_a,\omega_a,\omega_c) = \frac{\mu_{23}\mu_{13}N}{6\varepsilon_0\hbar^3\left[\left(\Delta_{21} - i\gamma_a/2\right)\left(\Delta_{31} - i\gamma_d/2\right) - |\Omega_c|^2/4\right]^{*}}\;\sum_i\mu_{i1}\mu_{i3}\left(\frac{1}{\omega_i - \omega_a} + \frac{1}{\omega_i - \omega_b}\right)$$

Figure 3 Electromagnetically induced transparency in the case of significant Doppler broadening. The Doppler-averaged values of the imaginary Im[χ(1)] and real Re[χ(1)] parts of the linear susceptibility and the nonlinear susceptibility χ(3) are plotted in the three frames against Δ/γd. The Doppler width is taken to be 20γd, Ωc = 100γd and γd = 50γa, γc = 0.

Nonlinear Optical Processes

The dependence of the susceptibilities upon the detuning is plotted in Figure 3. In this plot the effects of inhomogeneous (Doppler) broadening are


also incorporated by computing the susceptibilities over the inhomogeneous profile (see below for more details). By inspection of eqn [14] we see that the absorptive loss at the minimum of Im[χ(1)] varies as γa/Ω²c. This dependence is a consequence of the interference discussed above. In the absence of interference (i.e., just simple Autler–Townes splitting) the minimum loss would vary as γd/Ω²c. Since |3⟩–|1⟩ is an allowed decay channel (in contrast to |2⟩–|1⟩) it follows that γd ≫ γa, and so the absorption is much less as a consequence of EIT.

Re[χ(1)] in eqn [15] and Figure 3 is also modified significantly. The resonant value of the refractive index is equal to the vacuum value where the absorption is a minimum; the dispersion is normal in this region with a gradient determined by the strength of the coupling laser, a point we will return to shortly. For an unmodified system the refractive index is also unity at resonance, but in that case there is high absorption and steep anomalous dispersion. Reduced group velocities result from the steep normal dispersion that accompanies EIT.

Inspection of the expression for χ(3) above and Figure 3 shows that χ(3) is also modified by the coupling field. The nonlinear susceptibility depends upon 1/Ω²c, as is expected for a laser-dressed system; however, in this case there is not destructive but rather constructive interference between the field-split components. This result is of great importance: it ensures that the absorption can be minimized at frequencies where the nonlinear response remains large. As a consequence of constructive interference the nonlinear susceptibility remains resonantly enhanced whilst the linear susceptibility vanishes or becomes very small at resonance due to destructive interference. Large nonlinearity accompanying vanishing absorption (transparency) of course matches the conditions for efficient frequency mixing, as a large atomic density can then be used. Moreover the dispersion (controlling the refractive index) also vanishes at resonance; this leads to perfect phase matching (i.e., zero wavevector mismatch between the fields) in the limit of a simple three-level system. As a result of these features a large enhancement of the conversion efficiency in this type of scheme can be achieved.

To compute the generated field strength, Maxwell's equations must be solved using these expressions for the susceptibility to describe the absorption, refraction, and nonlinear coupling in the medium. We will treat this within the slowly varying envelope approximation since the fields are cw or nanosecond pulses. To be specific we assume that the susceptibilities are time independent, i.e., that we are in the steady-state (cw) limit. We make the assumptions also that there is no pump-field absorption and that we have plane waves. Under these assumptions the generated field amplitude Ad is given in terms of the other field amplitudes Ai by:

$$\frac{\partial}{\partial z}A_d = i\frac{\omega_d}{4c}\chi^{(3)}A_a^2A_c\,e^{-i\Delta k_d z} - \frac{\omega_d}{2c}\mathrm{Im}\!\left[\chi^{(1)}\right]A_d + i\frac{\omega_d}{2c}\mathrm{Re}\!\left[\chi^{(1)}\right]A_d \qquad [16]$$

where the wavevector mismatch is given by:

$$\Delta k_d = k_d + k_c - 2k_a \qquad [17]$$

The wavevector mismatch will be zero on resonance for the three-level atom considered in this treatment. In fact the contribution to the refraction from all the other atomic levels must be taken into account whilst computing Δkd, and it is implicit that these make a finite contribution to the wavevector mismatch. We can solve this first-order differential equation with the boundary condition Ad(z = 0) = 0 to find the generated intensity I(ωd) after a length z:

$$I(\omega_d) = \frac{3n\omega_d^2}{8Z_0c^2}\left|\chi^{(3)}\right|^2|A_a|^4|A_c|^2\;\frac{1 + e^{-\frac{\omega_d}{c}\mathrm{Im}[\chi^{(1)}]z} - 2e^{-\frac{\omega_d}{2c}\mathrm{Im}[\chi^{(1)}]z}\cos\!\left[\left(\Delta k + \frac{\omega_d}{2c}\mathrm{Re}[\chi^{(1)}]\right)z\right]}{\frac{\omega_d^2}{4c^2}\mathrm{Im}[\chi^{(1)}]^2 + \left(\Delta k + \frac{\omega_d}{2c}\mathrm{Re}[\chi^{(1)}]\right)^2}$$

where Z0 is the impedance of free space. This expression is quantitatively correct under the assumptions made. More generally the qualitative predictions and general scaling remain valid in the limit where the pulse duration is significantly longer than the time required to traverse the medium. Note that both real and imaginary parts of χ(1) and χ(3) play an important role, as we would expect for resonant frequency mixing. The influence of that part of the medium refraction which is due to other levels is contained in the terms with Δk. In the case of a completely transparent medium this becomes a major limit to the conversion efficiency.
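The role of the Δk terms can be made concrete by evaluating the expression for I(ωd) above numerically. In the sketch below every number (residual susceptibilities, mismatch, wavelength) is an illustrative assumption, not a value from the text:

```python
import numpy as np

# Sketch: growth of the generated intensity I(w_d) with length z for
# assumed residual susceptibilities and wavevector mismatch.
wd_c = 2 * np.pi / 300e-9           # w_d/c for a ~300 nm field, 1/m (assumed)
im_chi, re_chi = 1e-12, 5e-12       # residual linear susceptibility (assumed)
dk = 10.0                           # mismatch from other levels, 1/m (assumed)
z = np.linspace(0.0, 0.2, 400)      # medium length up to 20 cm

a = wd_c * im_chi                   # absorption term (w_d/c) Im[chi]
K = dk + 0.5 * wd_c * re_chi        # effective wavevector mismatch
num = 1 + np.exp(-a * z) - 2 * np.exp(-a * z / 2) * np.cos(K * z)
den = 0.25 * wd_c**2 * im_chi**2 + K**2
I = num / den                       # generated intensity, arbitrary units
print(f"oscillation period set by mismatch: 2*pi/K = {2*np.pi/K:.3f} m")
```

With absorption made negligible, the familiar sin²-like phase-mismatch oscillation dominates: the conversion grows until the accumulated phase slip ~π and then cycles, which is why residual refraction from other levels limits a fully transparent medium.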

Propagation and Wave-Mixing in a Doppler Broadened Medium

Doppler shifts arising from the Maxwellian velocity distribution of the atoms in the medium lead to a corresponding distribution in the detunings for


the various members of the atomic ensemble. The response of the medium, as characterized by the susceptibilities, must therefore include the Doppler effect by performing a weighted sum over the possible detunings. The weighting is determined by the Gaussian form of the Maxwellian velocity distribution. From this operation the effective values of the susceptibilities at a given frequency are obtained, and these quantities can be used to calculate the generated field. This step is of considerable practical importance, as in most up-conversion schemes it is not possible to achieve Doppler-free geometries, and the use of laser-cooled atoms, although in principle possible, limits the atomic density that can be employed. The interference effects persist in the dressed profiles providing the coupling laser Rabi frequency is comparable to or larger than the inhomogeneous width. This is because the Doppler profile follows a Gaussian distribution, which falls off much faster in the wings than the Lorentzian profile due to natural broadening. In the case considered, with a weak probe field, excited-state populations and coherences remain small. The two-photon transition need not be strongly driven (i.e., a small two-photon Rabi frequency can be used) but a strong coupling laser is required. The coupling laser must be intense enough that its Rabi frequency is comparable to or exceeds the inhomogeneous widths in the system (i.e., the Doppler width); a laser intensity above 1 MW cm⁻² is required for a typical transition. This is trivially achieved even for unfocused pulsed lasers, but does present a serious limit for cw lasers unless a specific Doppler-free configuration is employed. The latter is not normally suitable for a frequency up-conversion scheme if a significant up-conversion factor is required, e.g., to the vacuum ultraviolet (VUV); however, recent experiments report significant progress in cw frequency up-conversion using EIT, and likewise a number of other possibilities, e.g., laser-cooled atoms and standing-wave fields, have been proposed. A transform-limited single-mode laser pulse is essential for the coupling laser field, since a multimode field will cause an additional dephasing effect on the coherence, resulting in a deterioration of the quality of the interference. In contrast, whilst it is advantageous for the field driving the two-photon transition to be single mode (in order to achieve optimal temporal and spectral overlap with the EIT hole induced by the dressing laser), this is not essential, since this field does not need to drive the coherence responsible for interference.
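The weighted sum can be illustrated with a short sketch that evaluates the homogeneous Im[χ(1)] of eqn [14] for each velocity class and averages with a Gaussian weight. The parameter magnitudes loosely follow the Figure 3 caption; the prefactor is dropped, and the assumption that the probe and two-photon detunings shift together is a simplification of the sketch:

```python
import numpy as np

# Sketch: Doppler averaging of the EIT absorption profile. Each velocity
# class sees a shifted detuning; classes are weighted by a Gaussian.
g_a, g_d = 1.0, 50.0               # gamma_d = 50 gamma_a, gamma_c = 0
Om_c = 100 * g_d                   # coupling Rabi frequency
width = 20 * g_d                   # Doppler width (as in Figure 3)

def im_chi(d21, d31):              # Im[chi(1)] of eqn [14], prefactor = 1
    den = (4 * d31 * d21 - g_d * g_a - Om_c**2) ** 2 \
        + 4 * (g_d * d21 + g_a * d31) ** 2
    return (8 * d21**2 * g_d + 2 * g_a * (Om_c**2 + g_d * g_a)) / den

D = np.linspace(-1000, 1000, 2001) * g_d       # laboratory-frame detuning
shifts = np.linspace(-4 * width, 4 * width, 161)
w = np.exp(-shifts**2 / (2 * width**2)); w /= w.sum()

# assume (for the sketch) both detunings shift together with velocity
avg = sum(wi * im_chi(D - s, D - s) for s, wi in zip(shifts, w))
print("dip/peak absorption ratio:", avg[1000] / avg.max())
```

Because Ωc here exceeds the inhomogeneous width, the transparency dip at line center survives the averaging, in line with the discussion above.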

When a pulsed laser field is used, additional issues must be considered. The group velocity is modified for pulses propagating in the EIT medium; large reductions, e.g., by factors down to ~c/100, in the group velocity have been observed. Another consideration beyond that found in the simple steady-state case is that the medium can only become transparent if the pulse contains enough energy to dress all the atoms in the interaction volume. The minimum pulse energy to prepare a transparency is:

$$E_{\mathrm{preparation}} = \frac{f_{13}}{f_{23}}\,NL\hbar\omega \qquad [18]$$

where fij are the oscillator strengths of the transitions and NL is the product of the density and the length. Essentially the number of photons in the pulse must exceed the number of atoms in the interaction volume to ensure all atoms are in the appropriate dressed state. This puts additional constraints on the laser pulse parameters. Up-conversion to the UV and vacuum UV has been enhanced by EIT in a number of experiments. Only pulsed fields have so far been up-converted to the VUV with EIT enhancement. The requirement of a minimum value of Ωc > ΔDoppler constrains the conversion efficiency that can be achieved, because the 1/Ω²c factor in the expression for χ(3) ultimately leads to diminished values of χ(3). The use of gases of higher atomic weight at low temperatures is therefore highly desirable in any experiment utilizing EIT for enhancement of four-wave mixing to the VUV. Conversion efficiencies, defined in terms of the pulse energies by Ed/Ea or Ed/Ec, of a few percent have been achieved using the EIT enhancement technique. It is typically most beneficial to maximize the conversion efficiency defined by the first ratio, since ωa is normally in the UV and is the lower energy of the two applied pulses.
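An order-of-magnitude feel for eqn [18] can be had from a back-of-envelope evaluation per unit beam area; every number below is an illustrative assumption, not a value from the text:

```python
import math

# Sketch: preparation energy of eqn [18] per unit beam area.
hbar = 1.054e-34                        # J s
omega = 2 * math.pi * 3e8 / 500e-9      # rad/s, 500 nm coupling field (assumed)
NL = 1e19 * 0.1                         # density (m^-3) x length (m) (assumed)
f13_over_f23 = 1.0                      # comparable oscillator strengths (assumed)
E_per_area = f13_over_f23 * NL * hbar * omega   # J per m^2
print(f"preparation energy ~ {E_per_area * 1e-4 * 1e6:.0f} uJ per cm^2")
```

For these assumed numbers the result is a few tens of microjoules per cm², easily supplied by a pulsed laser but a significant demand on cw sources, consistent with the constraints discussed above.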

Nonlinear Optics with a Pair of Strong Coupling Fields in Raman Resonance

An important extension of the EIT concept occurs when two strong fields are applied in Raman resonance between a pair of states in a three-level system. Considering the system illustrated in Figure 1 we can imagine that both applied fields are now strong. Under appropriate adiabatic conditions the system evolves to produce the maximum possible value for the coherence ρ12 = 0.5. Adiabatic evolution into the maximally coherent state is achieved by adjusting either the Raman detuning or the pulse sequence (counter-intuitive order). The pair of fields may also be in single-photon resonance with a third level, in which case the EIT-like elimination of absorption will be important. This situation is equivalent to the formation of a dark state, since neither of the two strong fields is absorbed by the medium. For sufficiently strong fields the single-photon condition need not be satisfied and a maximum coherence will still be achieved. An additional field applied to the medium can participate in sum- or difference-frequency mixing with the two Raman-resonant fields. The importance of the large value of coherence is that it is the source polarization that drives the new fields generated in the frequency-mixing process. Complete conversion can occur over a short distance, which greatly alleviates the constraints usually set by phase-matching in nonlinear optics. Recently near-unity conversion efficiencies to the far-UV were reported in an atomic lead system where maximum coherence had been created. In a molecular medium large coherence between vibrational or rotational levels has also been achieved using adiabatic pulse pairs. Efficient multi-order Raman sideband generation has been observed to occur. This latter observation may lead the way to synthesizing very short duration light pulses, since the broadband Raman sideband spectrum has been proved to be phase-locked.

Pulse Propagation and Nonlinear Optics for Weak CW Fields

In a Doppler-free medium a new regime can be accessed. This is shown in Figure 4, where the possibility now arises of extremely narrow transparency dips, since very small values of Ωc are now sufficient to induce EIT. The widths of these features are typically subnatural and are therefore accompanied by very steep normal dispersion, which corresponds to a much reduced group velocity. The ultraslow propagation of pulses is one of the most dramatic manifestations of EIT in this regime. Nonlinear susceptibilities are now very large, as there is constructive interference controlling the value and the splitting of the two states is negligible compared to their natural width. Nonlinear optics at very low light levels, i.e., at the few-photon limit, is possible in this regime.

Figure 4 Electromagnetically induced transparency in the case where there is no significant Doppler broadening. The values of the imaginary Im[χ(1)] and real Re[χ(1)] parts of the linear susceptibility and the nonlinear susceptibility χ(3) are plotted in the three frames against Δ/γd. We take Ωc = γd/5 and γd = 50γa, γc = 0.

Propagation of pulses is significantly modified in the presence of EIT. Figure 4 shows the changes to Re[χ(1)] that arise. Within the transparency dip there exists a region of steep normal dispersion. In the vicinity of resonance this is almost linear, and it becomes reasonable to consider only the leading term, which describes the group velocity. An analysis of the

refractive changes has been provided by Harris, who expanded the susceptibilities (both real and imaginary parts) of the dressed atom around the resonance frequency to determine the various terms in Re[χ(1)]. The first term of the series (zero order), Re[χ(1)](ω13) = 0, corresponds to the vanishing dispersion at resonance. The next term, ∂[Re χ(1)](ω)/∂ω, gives the slope of the dispersion curve; at ω13 this takes the value:

$$\frac{\partial\,\mathrm{Re}\,\chi^{(1)}(\omega_{13})}{\partial\omega} = \frac{|\mu_{13}|^2N}{2\hbar\varepsilon_0}\;\frac{4\left(\Omega_c^2 - \gamma_a^2\right)}{\left(\Omega_c^2 + \gamma_a\gamma_d\right)^2} \qquad [19]$$


The slope of the dispersion profile leads to a reduced group velocity vg:

$$\frac{1}{v_g} = \frac{1}{c} + \frac{\pi}{\lambda}\frac{\partial\chi^{(1)}}{\partial\omega} \qquad [20]$$

From the expression for ∂χ/∂ω we see that this slope is steepest (and so vg a minimum) in the case where Ω²c ≫ γ²a and Ω²c ≫ γaγd while Ωc is still small compared to γd (i.e., Ωc < γd), and hence ∂χ/∂ω ∝ 1/Ω²c. In the limit of small Ωc the following expression for vg therefore holds:

$$v_g = \frac{\hbar c\,\varepsilon_0\,\Omega_c^2}{2\omega_d|\mu_{13}|^2N} \qquad [21]$$

Extremely low group velocities, down to a few meters per second, are achieved in this limit using excitation of the hyperfine-split ground states in either laser-cooled atomic samples or Doppler-free configurations in finite-temperature samples. Recently similar light slowing has been observed in solids. Storage of the optical pulse within the medium has also been achieved by adiabatically switching off the coupling field and thus trapping the optical excitation as an excitation within the hyperfine ground states, for which the storage time can be very long (>1 ms) due to very low dephasing rates. Since the storage scenario should be valid even for single photons, this process has attracted considerable attention recently as a means to enable quantum information storage. Extremely low values of Ωc are sufficient to induce complete transparency (albeit in a very narrow dip), and at this location the nonlinear response is resonantly enhanced. Very high-efficiency nonlinear frequency mixing and phase conjugation at low light levels have been reported under these conditions. It is expected that high-efficiency nonlinear optical processes will persist to ultralow intensities (the few-photon level) in an EIT medium of this type. One example of highly enhanced nonlinear interactions is the predicted large value of the Kerr-type nonlinearity (nonlinear refractive index). The origin of this can be seen by considering the steep dispersion profile in the region of the transparency dip in Figure 4. Imagine that we apply an additional field ωf, perhaps a very weak one, at a frequency close to resonance between state |2⟩ and a fourth level |4⟩. The ac Stark shift caused by this new field to the other three levels, although small, will have a dramatic effect upon the value of the refractive index because of the extreme steepness of the dispersion profile. Essentially even a very small shift of the resonant wavelength causes a large change in the refractive index for a field applied close to this frequency. It is predicted that strong cross-phase modulations will be created in this process between the fields ωf and ωd, even in the quantum limit for these fields. This is predicted to lead to improved schemes for quantum nondemolition measurements of photons through the measurement of the phase shifts they induce (through cross-phase modulation) on another quantum field. This type of measurement has direct application in quantum information processing.
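Equation [21] can be evaluated directly to see how slow such pulses become; the dipole moment, density, wavelength, and Rabi frequency below are illustrative alkali-atom magnitudes assumed for the sketch, not values from the text:

```python
import math

# Sketch: ultraslow group velocity from eqn [21]; all values assumed.
hbar, c, eps0 = 1.054e-34, 3.0e8, 8.85e-12
w_d = 2 * math.pi * c / 589e-9          # probe frequency (Na D-line scale)
mu13 = 2.5e-29                          # dipole moment, C m (assumed)
N = 1e20                                # atoms per m^3 (assumed)
Om_c = 2 * math.pi * 10e6               # weak coupling Rabi frequency, rad/s

v_g = hbar * c * eps0 * Om_c**2 / (2 * w_d * mu13**2 * N)
print(f"v_g ~ {v_g:.1f} m/s")           # a few metres per second
```

For these assumed numbers the result is of order a few metres per second, consistent with the ultraslow-light observations cited above.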

See also

Nonlinear Optics, Applications: Phase Matching; Raman Lasers. Scattering: Raman Scattering.

Further Reading

Arimondo E (1996) Coherent population trapping in laser spectroscopy. Progress in Optics 35: 257–354.
Harris SE (1997) Electromagnetically induced transparency. Physics Today 50: 36–42.
Harris SE and Hau LV (1999) Non-linear optics at low light levels. Physical Review Letters 82: 4611–4614.
Harris SE, Field JE and Imamoglu A (1990) Nonlinear optical processes using electromagnetically induced transparency. Physical Review Letters 64: 1107–1110.
Hau LV, Harris SE, Dutton Z and Behroozi CH (1999) Light speed reduction to 17 metres per second in an ultra-cold atomic gas. Nature 397: 594–598.
Marangos JP (2001) Electromagnetically induced transparency. In: Bass M, et al. (ed.) Handbook of Optics, vol. IV, ch. 23. New York: McGraw-Hill.
Merriam AJ, Sharpe SJ, Xia H, et al. (1999) Efficient gas-phase generation of vacuum ultra-violet radiation. Optics Letters 24: 625–627.
Schmidt H and Imamoglu A (1996) Giant Kerr nonlinearities obtained by electromagnetically induced transparency. Optics Letters 21: 1936–1938.
Scully MO (1991) Enhancement of the index of refraction via quantum coherence. Physical Review Letters 67: 1855–1858.
Scully MO and Suhail Zubairy M (1997) Quantum Optics. Cambridge, UK: Cambridge University Press.
Sokolov AV, Walker DR, Yavuz DD, et al. (2001) Femtosecond light source for phase-controlled multi-photon ionisation. Physical Review Letters 87: 033402-1.
Zhang GZ, Hakuta K and Stoicheff BP (1993) Nonlinear optical generation using electromagnetically induced transparency in hydrogen. Physical Review Letters 71: 3099–3102.


ENVIRONMENTAL MEASUREMENTS

Contents

Doppler Lidar
Hyperspectral Remote Sensing of Land and the Atmosphere
Laser Detection of Atmospheric Gases
Optical Transmission and Scatter of the Atmosphere

Doppler Lidar

R M Hardesty, National Oceanic and Atmospheric Administration, Boulder, CO, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Lidars developed for the measurement of atmospheric winds make use of the Doppler effect, in which radiation scattered or emitted from a moving object is shifted in frequency as a result of that motion. Atmospheric Doppler lidars irradiate a volume of atmosphere with a pulse of very narrowband, laser-produced radiation, then detect the change in frequency of the radiation backscattered from atmospheric aerosol particles or molecules present in the volume. Although Doppler lidars are conceptually similar to Doppler weather radars, lidar wavelengths are 3–4 orders of magnitude shorter than radar wavelengths, producing some important differences in propagation and scattering characteristics. Radiation at wavelengths in the optical region of the spectrum is efficiently scattered by aerosols and molecules; therefore Doppler lidars, unlike commonly used Doppler weather radars, do not require the presence of hydrometeors or insects to obtain useful results. Also, optical radiation can be tightly collimated, virtually eliminating effects of ground clutter and enabling lidar probing of small volumes to within a few meters of terrain or structures. The primary disadvantage of lidar techniques is the severe attenuation of the optical radiation by cloud water droplets and fog. Doppler lidars do not typically probe beyond the edge of most atmospheric clouds, the one exception being tenuous ice clouds such as cirrus, which often are characterized by low optical extinction and high backscatter, making them excellent lidar targets. Doppler lidars also do not have the extended range, which can exceed 100 km,

available from contemporary meteorological radars when scatterers are present. In clear air, however, the characteristics and capabilities of Doppler lidar are well suited for obtaining detailed wind and turbulence observations for a wide variety of applications. Lidar beams can be easily scanned to characterize motions within very confined three-dimensional spaces such as shallow atmospheric boundary layers, narrow canyons, and turbulent structures. The relative compactness of Doppler lidars makes deployment on aircraft or other moving platforms extremely viable, and a Doppler lidar deployed on a satellite has been proposed as a way to obtain global measurements of atmospheric wind fields. By scanning a lidar beam from an orbiting satellite and analyzing the returns backscattered from clouds, aerosols, and molecules, a satellite-based instrument could provide important wind information for numerical forecast models.

Doppler Lidar Fundamentals

Backscattered Light

As laser light propagates through the atmosphere, some of the incident energy is scattered by the atmospheric molecules and constituents through which the light passes. In lidar applications, the light backscattered (scattered directly back toward the source) is collected and analyzed. The backscattering properties of an atmospheric particle depend on its refractive index, shape, and size. In the special case when the particle is much smaller than the wavelength of the incident radiation, as is the case for laser radiation scattered by atmospheric molecules, the scattering process is characterized as Rayleigh scattering. Because, in the Rayleigh scattering regime, the energy backscattered by a particle increases proportionally as the inverse fourth power of the wavelength, Doppler lidar systems designed for molecular scatter operate at short wavelengths. Molecular Doppler lidar systems typically operate in the visible or ultraviolet spectral regions.


Aerosol particles, the other component of the atmosphere that scatters laser light, produce Mie scattering, which applies when the diameter of the scatterers is not orders of magnitude smaller than the incident wavelength. Aerosol particles present in the atmosphere include dust, soot, smoke, and pollen, as well as liquid water and ice. Although for Mie scattering the relationship between the power backscattered by an ensemble of aerosol particles and the incident wavelength is not simply characterized, most studies have shown that the energy backscattered increases roughly proportionally to the first or second power of the inverse of the incident wavelength, depending on the characteristics of the scattering aerosol particles. In a polluted environment, such as in the vicinity of urban areas, an abundance of large particles often results in a roughly linear relationship between the inverse of the incident wavelength and the backscattered energy, while in more pristine environments, such as the free troposphere, the inverse wavelength/backscatter relationship can approach or exceed a square-law relationship. The primary objective in Doppler lidar is to measure the Doppler frequency shift of the scattered radiation introduced by the movements of the scattering particles. Figure 1 shows a typical spectrum of the radiation collected at the lidar receiver for a volume of atmosphere irradiated by a monochromatic laser pulse. Molecular scattering produces the broadband distribution in Figure 1, where the broadening results from the Doppler shifts of the radiation backscattered from molecules moving at their thermal velocities. The width of the molecular velocity distribution in the atmosphere ranges from ~320 to 350 m s⁻¹, scaling as the square root of the

temperature. In the center of the spectrum is a much narrower peak resulting from scattering of the light by aerosol particles. Because the thermal velocity of the much larger aerosol particles is very low, the width of the distribution of the aerosol return is determined by the range of velocities of particles moved about by small-scale turbulence within the scattering volume; this is typically on the order of a few m s⁻¹. Also shown in Figure 1 is an additional broadband distribution, due to scattered solar radiation collected at the receiver. If the laser source is not monochromatic, the backscattered signal spectrum is additionally broadened, with the resulting spectrum being the convolution of the spectrum shown in Figure 1 with the spectrum of the laser pulse. As seen in the figure, the entire spectrum is Doppler-shifted in frequency relative to the frequency of the laser pulse. The objective for a Doppler lidar system is to measure this shift, given by δf = 2v_rad/λ, where v_rad is the component of the mean velocity of the particles in the direction of propagation of the lidar pulse and λ is the laser wavelength.
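The size of the shift to be measured follows directly from δf = 2v_rad/λ. A quick evaluation for a 10 m s⁻¹ radial wind at wavelengths representative of systems discussed later in this article:

```python
# Doppler shift df = 2 v_rad / lambda for a 10 m/s radial wind.
for lam in (0.355e-6, 2.0e-6, 10.6e-6):      # wavelengths, m
    df = 2 * 10.0 / lam                       # Hz
    print(f"lambda = {lam*1e6:5.2f} um -> shift = {df/1e6:6.2f} MHz")
```

The shift is tens of MHz in the ultraviolet but only a few MHz at 10.6 μm, which is one reason the choice of wavelength shapes the receiver design.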

Elements of a Doppler Lidar

Doppler lidar systems can be designed primarily to measure winds from aerosol-scattered radiation, molecule-scattered radiation, or both. The type of system places specific requirements on the primary components that comprise a Doppler lidar system. A Doppler lidar consists of a laser transmitter to produce pulses of energy that irradiate the atmospheric volume of interest, a receiver which collects the backscattered photons and estimates the energy and Doppler shift of the return, and a beam-pointing mechanism for directing the laser beam and receiver field of view together in various directions to probe different atmospheric volumes. Independent of whether the primary scatterers are molecules or aerosol particles, the system design criteria for a Doppler system are driven by a fundamental relationship between the error in the estimate of the mean frequency shift δf₁, the bandwidth of the return f₂ (proportional to the distribution of velocities), and the number of incident backscattered photons detected, N, as

$$\delta f_1 \propto \frac{f_2}{N^{1/2}} \qquad [1]$$

Figure 1 Distribution of wind velocities in a lidar return, showing the Doppler shift resulting from wind-induced translation of the scatterers. The ratio of aerosol to molecular signals increases with increasing wavelength.

Thus, the precision of the velocity estimate is improved by increasing the number of detected signal photons and/or decreasing the bandwidth of the backscattered signal. It is obvious from the equation that a significantly greater number of photons are required to achieve the same precision in a Doppler measurement from a molecular backscattered signal,


characterized by higher bandwidth f₂, compared to the number required to compute velocities from aerosol returns. The improved measurement precision gained from a narrow-bandwidth return also indicates that the laser transmitter in a Doppler lidar system should be characterized by narrow spectral width (considerably narrower than the spectral broadening caused by the distribution of velocities) and, to increase the number of photons detected, maximum transmit energy. Narrowband, single-frequency pulses are obtained by employing a low-power, precisely frequency-stabilized master-oscillator laser, whose radiation is either used to seed a stabilized, higher-energy laser cavity (injection seeding) or amplified in one or more optical amplifiers (master-oscillator power-amplifier, MOPA) to produce frequency-stabilized pulses. The lidar receiver gathers the backscattered photons and extracts the wind velocity as a function of the range to the scattering volume. This requires a telescope, to gather and focus the scattered radiation, and a receiver to detect the scattered radiation and determine the Doppler shift. Frequency analysis in a Doppler lidar receiver is carried out using one of two techniques: coherent detection (also known as heterodyne detection) or direct detection (alternately labeled incoherent detection). The techniques differ fundamentally. In a heterodyne receiver, frequency analysis is carried out on a digitized time series created by mixing laser radiation with the backscattered radiation, while in direct detection lidar an interferometer optically analyzes the backscattered radiation to produce a light pattern which contains the information on the frequency content.
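Returning to eqn [1], a rough photon budget makes the molecular/aerosol contrast concrete. Treating the proportionality as an equality is an assumption of this sketch; the velocity spreads are the illustrative magnitudes quoted earlier in the article:

```python
# Rough photon budget from eqn [1], with df1 = f2 / sqrt(N) assumed.
lam = 0.355e-6                       # 355 nm system (assumed)
dv_target = 1.0                      # desired velocity precision, m/s
df1 = 2 * dv_target / lam            # target frequency precision, Hz

for label, spread in (("aerosol", 2.0), ("molecular", 335.0)):  # m/s
    f2 = 2 * spread / lam            # return bandwidth, Hz
    N = (f2 / df1) ** 2              # detected photons required
    print(f"{label:9s}: N ~ {N:,.0f} detected photons")
```

A few photons suffice for the narrowband aerosol return, whereas the broadband molecular return demands of the order of 10⁵ photons for the same precision, consistent with the molecular-scatter discussion later in this article.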

Coherent (Heterodyne) Doppler Lidar

Description

Coherent or heterodyne lidar is implemented by optically mixing the backscattered laser light with radiation from a stable, continuous-wave, local oscillator (LO) laser whose frequency is precisely controlled to be at a known frequency offset, typically on the order of tens of MHz, from that of the laser transmitter (Figure 2). The mixing process at the face of an optical detector generates, after high-pass filtering, an electrical signal with amplitude proportional to the amplitude of the backscattered electromagnetic field and frequency equal to the difference between the backscattered field frequency and the LO laser field frequency. This signal is sampled, and then digitally processed to estimate the range-dependent mean frequency shift of the backscattered signal, from which the radial wind component can be derived.

Figure 2 Schematic of coherent detection of backscattered radiation.

A property of coherent lidar is that, because of constructive and destructive interference of the returns from individual scatterers, single-pulse returns are characterized by random fluctuations in the amplitude and phase of the detected time series and Fourier spectrum. Consequently, averaging (sometimes referred to as accumulation) over multiple pulses usually is necessary to produce a sufficiently precise estimate of the signal mean frequency. Because the phase of the detected time series is random within each pulse, averaging must be carried out in the power-spectrum or autocorrelation-function domain, rather than on the detected time series.
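A toy numerical model of this processing chain shows why the accumulation is done in the power-spectrum domain: each pulse carries a random speckle phase and amplitude, so averaging the time series would cancel the signal, while averaged spectra retain the Doppler peak. All parameter values below are illustrative assumptions:

```python
import numpy as np

# Toy coherent-lidar processing: accumulate power spectra over pulses.
rng = np.random.default_rng(0)
lam = 2e-6                           # 2 um system (assumed)
v_true = 5.0                         # radial wind, m/s (assumed)
fs, f_if, n = 100e6, 10e6, 512       # sample rate, IF offset, gate length
t = np.arange(n) / fs
f_sig = f_if + 2 * v_true / lam      # Doppler-shifted IF frequency

acc = np.zeros(n // 2 + 1)           # accumulated power spectrum
for _ in range(100):
    amp = rng.rayleigh(1.0)          # per-pulse speckle amplitude
    phi = rng.uniform(0, 2 * np.pi)  # per-pulse speckle phase
    x = amp * np.cos(2 * np.pi * f_sig * t + phi)
    x += rng.normal(0.0, 2.0, n)     # local-oscillator shot-noise floor
    acc += np.abs(np.fft.rfft(x)) ** 2

f_hat = np.fft.rfftfreq(n, 1 / fs)[np.argmax(acc[1:]) + 1]  # skip DC bin
print(f"estimated radial wind ~ {(f_hat - f_if) * lam / 2:.1f} m/s")
```

Inverting δf = 2v/λ on the peak of the accumulated spectrum recovers the assumed 5 m s⁻¹ wind to within the spectral bin width.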


Although measurements at wavelengths near 1.06 mm have been demonstrated, coherent Doppler lidar systems used for regular atmospheric probing operate in the eye-safe, infrared portion of the spectrum at wavelengths longer than 1.5 mm. Currently, the two most common system wavelengths for coherent lidar wind systems are in atmospheric window spectral regions around 2 and 10.6 mm. Early coherent Doppler lidar measurements, beginning in the 1970s, employed CO2 laser transmitters and local oscillators operating at wavelengths near 10.6 mm. Pulsed CO2 laser transmitters with as much as 10 J of energy have since been demonstrated, and systems with 1 J lasers have probed the atmosphere to ranges of 30 km or more. In the late 1980s, solid state laser transmitters operating near 2 mm wavelengths were introduced into coherent lidar wind-measuring systems. The compact size and potential reliability advantages of solid-state transmitters, in which the transmitter laser is optically pumped by an array of laser diodes, provide advantages over larger CO2 laser technology. Also, because for a given measurement accuracy the range-resolution obtainable is proportional to wavelength, 2 mm instruments have an enhanced capability to probe small-scale features. However, although development of higher energy, single-frequency coherent lidar sources operating at 2 mm region, as well as at 1.5 mm, is currently an active research area, solid state lasers with pulse energies greater than several tens of mJ have yet to be incorporated into lidar systems for atmospheric probing. Applications of Coherent Doppler Lidar

Coherent lidars have been used to measure winds for a variety of applications, and from an assortment of platforms, such as ships and aircraft. Since these

lidars operate in the infrared where aerosol scattering dominates molecular scattering, they require aerosol particles to be present at some level to obtain usable returns. Although clouds also provide excellent lidar targets, most of the more useful applications of coherent lidars involve probing the atmospheric boundary layer or lower troposphere where aerosol content is highest. Because of the capability to scan the narrow lidar beam directly adjacent to terrain, a unique application of lidar probing is the measurement of wind structure and evolution in complex terrain such as mountains and valleys. Over the past two decades, Doppler lidar studies have been used, for example, to study structure of damaging windstorms on the downwind side of mountain ranges, advection of pollution by drainage flows from valleys, and formation of mountain leeside turbulence as a potential hazard to landing aircraft. The interactions of wind flows with complex terrain can produce dangerous conditions for transportation, especially aircraft operations. Figure 3 shows a strong mountain wave measured by a Doppler lidar near Colorado Springs, during an investigation of the effects of downslope winds and turbulence on aircraft operations at the Colorado Springs airport. The mountains are on the right of the figure, where the strong wave with wavelength of about 6 km can be seen. Note the reversal in winds near the location where the wave descends to the surface. The Colorado Springs study was prompted by the crash of a passenger jet while landing at the airport during a period when the winds were down the slope of the Rocky Mountains just to the west of the airport. A Doppler lidar was recently deployed in an operational mode for wind shear detection at the new Hong Kong International Airport. Because winds flowing over mountainous terrain just south of the

Figure 3 Vertical scan of the winds in a mountain wave measured by a 10.6 mm coherent Doppler lidar near Colorado Springs, Colorado. Courtesy L. Darby, NOAA.

ENVIRONMENTAL MEASUREMENTS / Doppler Lidar 389

airport can produce mountain waves and channeled flows, wind shear is frequently encountered during landing and takeoff operations. Figure 4 shows a gust front situated just to the west of the airport, characterized by a sharp wind change (which would produce a corresponding change in airspeed) approaching 13 m s21 over a distance of less than one kilometer. Providing a capability to detect such events near the airport and warn pilots during approach or preparation for takeoff was the primary reason for deployment of the lidar. Within the first year of operation the Hong Kong lidar was credited with improving wind shear detection rates and providing more timely warnings for pilots of the potentially hazardous conditions. As of August 2003 this lidar had logged more than 10 000 hours of operation. Because coherent Doppler lidars are well matched to applications associated with probing small-scale, turbulent phenomena, they have also been deployed

at airports to detect and track wing tip vortices generated by arriving or departing aircraft on nearby parallel runways. In the future, a network of groundbased lidars could provide information on vortex location and advection speed as well as wind shear, leading to a potential decrease in congestion at major airports. Also, compact lidar systems looking directly ahead of research aircraft have shown the capability to detect wind changes associated with potentially hazardous clear-air turbulence. In the future, a Doppler lidar installed on the commercial aircraft fleet could potentially look ahead and provide a warning to passengers to fasten seat belts before severe turbulence is encountered. The high resolution obtainable in a scanning lidar enables visualization of finely structured wind and turbulence layers. Figure 5 shows an image of turbulence associated with a nocturnal low-level jet just 50 m above the surface obtained at night over flat

Figure 4 Horizontal depiction of the radial component of the wind field measured at Hong Kong Airport by a 2.02 μm Doppler lidar during a gust front passage, showing strong horizontal shear just west of the airport. Courtesy C. M. Shun, Hong Kong Observatory.


Figure 5 Turbulence structure within a nocturnal stable layer at a height of 50 m. Note the stretching of the vertical scale. Courtesy R. Banta, NOAA.

terrain in Kansas. Although a low-level jet was present on just about every evening during the experiment, similar images obtained at different times by the Doppler lidar illustrated markedly different characteristics, such as a wide variation in the observed mechanical turbulence along the interface. Such observations enable researchers to improve parameterizations of turbulence in models and better understand the conditions associated with vertical turbulent transport and mixing of atmospheric constituents such as pollutants. By deploying Doppler lidars on moving platforms such as ships and aircraft, the spatial coverage of the measurements can be greatly increased. Although removal of platform motion and changes in orientation are not trivial, aircraft-mounted lidar can map out such features as the low-level jet and the structure of boundary layer convective plumes. Figure 6 shows horizontal and vertical motions associated with convective plumes measured along an approximately 60 km path over the southern Great Plains of the United States. From such

measurements estimates of turbulence intensity, integral scale, and other boundary layer characteristics can be computed.

Direct Detection Doppler Lidar Description

Direct detection or incoherent Doppler lidar has received significant attention in recent years as an alternative to coherent lidar for atmospheric wind measurements. In contrast to coherent lidar, in which an electrical signal is processed to estimate Doppler shift, an optical interferometer, usually a Fabry –Perot etalon, serves as the principal element in a direct detection lidar receiver for determining the frequency shift of the backscattered radiation. One implementation of a direct-detection Doppler lidar receiver is the ‘fringe imaging’ technique. In this design, the interferometer acts as a spectrum analyzer. The backscatter radiation is directed through a

ENVIRONMENTAL MEASUREMENTS / Doppler Lidar 391

Figure 6 Vertical motion in convective plumes measured by a vertically pointing airborne coherent Doppler lidar. Total horizontal extent of the measurements is approximately 60 km, reddish colors correspond to upward motions.

Lens

Lens

Fiber optic input Etalon Figure 7 Schematic of Fabry– Perot etalon in a direct detection, fringe-imaging lidar receiver. Courtesy P. Hays, Michigan Aerospace Corporation.

Fabry –Perot interferometer, which produces a ring pattern in the focal plane (Figure 7). The spectral content information of the incident radiation is contained in the radial distribution of the light. Each ring corresponds to an order of the interferometer

and is equivalent to a representation of the backscattered signal frequency spectrum. As the mean frequency of the backscattered radiation changes, the rings move inward or outward from the center. To extract a spectrum of the backscattered light, one or


more of the rings are imaged onto a two-dimensional detector, and the resulting pattern analyzed. The rings can either be detected using a multi-element ring detector, or can be converted to a linear pattern with a circle-to-line converter optic and then imaged onto a charge-coupled device array. An alternative direct detection receiver configuration, generally called the double edge technique, is to employ two interferometers as bandpass filters, with the center wavelength of each filter set above and below the laser transmitter wavelength, as shown in Figure 8. The incoming radiation is split between the two interferometers, and the wavelength shift is computed by examining the ratio of the radiation transmitted by each interferometer. Both the double edge and fringe-imaging techniques have demonstrated wind measurements to heights well into the stratosphere. The major challenge associated with the double edge receiver is optimizing the instrument when both aerosol- and molecular-scattered radiation are present, since in general the change in transmission as a function of velocity is different for the aerosol and molecular signals. Double edge receivers optimized for both aerosol and molecular returns place the bandpass of the etalon filters at the precise wavelength where the change in transmission with change in Doppler shift is the same for both aerosol and molecular returns. For both types of direct detection receivers described above, much of the radiation incident on the interferometer is reflected out of the system, reducing the overall efficiency of the receiver. Recently, designs that incorporate fiber optics to collect a portion of the reflected radiation and 'recycle' it back into the etalon have been demonstrated as a method to improve the transmission efficiency of the etalon in a fringe-imaging receiver.

Figure 8 Spectrum of a molecular lidar return showing the placement of bandpass filters for a dual-channel (double edge) direct detection receiver. Courtesy B. Gentry, NASA.
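The double edge principle can be illustrated with a short sketch: the ratio of the powers transmitted by two filters straddling the laser frequency tracks the Doppler shift of the return. The Lorentzian filter shapes, widths, and placements below are assumptions of the sketch, not the design of any particular instrument:

```python
import numpy as np

# Sketch of the double-edge ratio measurement; all values illustrative.
def filt(f, f0, w=1.5e9):                    # Lorentzian edge filter
    return 1.0 / (1.0 + ((f - f0) / w) ** 2)

f = np.linspace(-6e9, 6e9, 2001)             # frequency from laser, Hz
for v in (-50.0, 0.0, 50.0):                 # radial wind, m/s
    shift = 2 * v / 0.355e-6                 # Doppler shift at 355 nm
    ret = np.exp(-((f - shift) / 2.0e9) ** 2)    # broad molecular return
    ratio = np.trapz(ret * filt(f, -2e9), f) / np.trapz(ret * filt(f, 2e9), f)
    print(f"v = {v:+6.1f} m/s -> channel ratio = {ratio:.4f}")
```

The ratio is unity for zero wind and changes monotonically with the shift, so a calibration curve of ratio versus velocity yields the wind without any optical mixing.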

Doppler Wind Measurements Based on Molecular Scatter

One of the primary advantages of direct detection Doppler lidar is its capability for measurements based on scatter from atmospheric molecules. Measurement of Doppler shifts from molecular-scattered radiation is challenging because of the large Doppler-broadened bandwidth of the return. Because one wants to measure a mean wind velocity with a precision of a few m s⁻¹ or better, of the order of 10⁵ photons are required. Some combination of multiple-pulse averaging, powerful lasers, and large receiver optics is required to obtain these high photon counts from backscattered returns. Molecular-scatter wind measurements have been demonstrated in the visible spectral region at 532 nm wavelength as well as at 355 nm in the near ultraviolet. The ultraviolet region has the dual advantages of enhanced molecular scatter and less restrictive laser eye-safety restrictions. Figure 9 shows the time series of a wind profile measured in the troposphere from both aerosols and clouds using a molecular-scatter, ground-based 355 nm wavelength Doppler fringe-imaging lidar. The figure shows measurements from receiver channels optimized for the wideband molecular signal and the narrowband aerosol return. For this measurement, backscattered photons were collected by a 0.5 m receiver aperture, averaged for 1 minute, and processed. In the absence of clouds, direct detection Doppler lidars have measured wind profiles continuously from the surface to beyond 15 km height. Figure 10 shows that the estimated wind error for the same 355 nm lidar is less than 1 m s⁻¹ to about 10 km, and less than 4 m s⁻¹ at 15 km height.

Heterodyne and Direct-Detection Doppler Trade-Offs

Lively debates within the lidar community have occurred over the past decade regarding the relative merits of heterodyne versus direct detection Doppler lidars. To a large extent, the instruments are complementary. Generally, heterodyne instruments are much more sensitive when significant aerosols are present. Coherent lidar processing techniques have been developed that can produce accurate wind measurements using only a few lidar pulses with as few as 100 detected photons, such that several wind observations per second can be obtained for nominal pulse rates. This inherent sensitivity has led to numerous applications in which a lidar beam has been scanned rapidly over a large volume to obtain time-varying, three-dimensional wind measurements.


Figure 9 Time series of radial wind speed profiles measured by a 355 nm fringe-imaging direct detection lidar aerosol channel (top) and molecular channel (bottom) at Mauna Loa Observatory, HI. The change in wind speed and the blocking of the return by clouds are clearly seen. Courtesy C. Nardell, Michigan Aerospace Corp.

The primary advantage of direct detection instruments is their demonstrated capability to measure winds from molecular-backscattered returns in the middle and upper troposphere. In very pristine air, direct detection offers the only method for long-range wind measurements, even though significant averaging may be required. Direct-detection lidars have the additional advantage of being unaffected by atmospheric refractive turbulence. Unlike heterodyne lidars, which require a very pure laser pulse and a diffraction-limited receiver field of view matched to and precisely aligned with the transmitted beam, direct detection systems can have a wider bandwidth transmitter and a receiver field of view several times diffraction-limited. In direct detection lidar design, the field of view is usually constrained by the need to limit background light during daytime operation. A major design challenge for direct detection instruments is holding the Fabry– Perot etalon plate spacing stable over a range of temperatures and in high vibration environments.

Global Wind Measurements

A satellite-based Doppler lidar has frequently been proposed as a way to measure wind fields over most of the Earth. At present, winds are the one major meteorological variable not well measured from orbiting platforms. Measurement of winds is especially important over regions of the Earth that are not currently well sampled, such as over Northern Hemisphere oceans, as well as over most of the tropics and Southern Hemisphere. Wind profile information is currently obtained from radiosondes and by tracking cloud and water vapor inhomogeneities using satellite imagers. Doppler lidar wind measurements would greatly augment the current data set by providing wind estimates throughout the troposphere under clear conditions, and highly height-resolved observations down to cloud tops when cloud decks are present. Observing system simulation experiments conducted in recent years indicate that satellite-based lidar global wind


Figure 10 Estimated error in the radial velocity estimate versus altitude for a 355 nm direct detection, fringe imaging Doppler lidar system at Mauna Loa observatory on Hawaii. The lidar provides data for assimilation into local mesoscale forecast models. Courtesy C. Nardell, Michigan Aerospace Corp.

measurements could lead to a significant improvement in long-term forecast skill, provided the wind fields can be observed with sufficient accuracy and spatial resolution. In a Doppler lidar wind mission, a satellite carrying a lidar system would orbit the Earth in a nearly polar orbit. The pulsed laser beam would be scanned conically about the nadir to obtain different components of the wind velocity. The scanning could be either continuous or ‘stop and stare’. After sufficient returns are averaged at a given pointing angle to produce an acceptable estimate, the radial component of the velocity would be computed and assimilated directly into numerical analysis and forecast models. Doppler lidar measurement of winds from space is theoretically feasible but technologically difficult. Depending on the orbital height, the scattering volume is anywhere from 450 to , 850 km from the satellite, which challenges the sensitivity of current system types. Because weight and power consumption are critical parameters for space systems, telescope diameter and laser power cannot be easily increased to obtain the necessary sensitivity. Similarly, the ability to average returns from multiple

pulses is also limited by the available time. Because a satellite moves at about 7 km s⁻¹, in order to obtain measurements over a horizontal distance of 300 km (the resolution of the radiosonde network) only about 45 seconds are available to make enough observations from multiple look angles to obtain a useful measurement. It should also be noted that, as a result of the high orbital velocity of the satellite, precise knowledge of beam pointing is extremely critical for measurements from satellites. For a lidar operating at a nadir angle of 45 degrees, an error in the knowledge of pointing angle of just 1 mrad results in an error of about 5 m s⁻¹ in the measured radial component of the wind. Despite the challenges of employing a satellite-based Doppler lidar, efforts are continuing to develop the appropriate technology and to assess the impact of the observations. The European Space Agency is planning a Doppler wind lidar demonstration mission for the late 2000s that would incorporate a non-scanning, direct-detection instrument with both aerosol and molecular channels. Doppler lidar technology research and numerical simulations aimed at satellite-based wind sensing are ongoing at several research centers within the United States, Europe, and Japan. One option currently being studied is a 'hybrid' lidar system that would combine a direct detection lidar for measurement of winds in the clear free troposphere with a low-energy coherent system that would obtain its observations from clouds and the aerosol-rich boundary layer. Although a hybrid instrument reduces some requirements on system power and aperture size, combining the two techniques on a single platform will likely require innovative engineering and new approaches to beam combining and scanning.
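The pointing-error figure quoted above follows from simple geometry, as a quick check shows:

```python
import math

# Radial-velocity error from pointing uncertainty: satellite at ~7 km/s,
# 45-degree nadir scan angle, 1 mrad pointing-knowledge error.
v_sat = 7000.0                  # m/s
dtheta = 1e-3                   # rad
err = v_sat * dtheta * math.cos(math.radians(45.0))
print(f"radial wind error ~ {err:.1f} m/s")   # ~5 m/s
```

The projection of the spacecraft velocity onto a slightly mispointed line of sight contaminates the wind estimate directly, which is why attitude knowledge at the microradian level is a driving requirement for space-based Doppler lidar.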

See also

Environmental Measurements: Optical Transmission and Scatter of the Atmosphere. Scattering: Scattering Theory.

Further Reading

Baker WE, Emmitt GD, Robertson F, et al. (1998) Lidar-measured wind from space: a key component for weather and climate prediction. Bulletin of the American Meteorological Society 79: 581–599.
Banta RM, Olivier LD and Levinson DH (1993) Evolution of the Monterey Bay sea breeze layer as observed by pulsed Doppler lidar. Journal of Atmospheric Science 50(24): 3959–3982.
Banta RM, Olivier LD, Gudiksen PH and Lange R (1996) Implications of small-scale flow features to modeling dispersion over complex terrain. Journal of Applied Meteorology 35: 330–342.


Banta RM, Darby LS, Kaufmann P, Levinson DH and Zhu C-J (1999) Wind flow patterns in the Grand Canyon as revealed by Doppler lidar. Journal of Applied Meteorology 38: 1069–1083.
Frehlich R (1997) Coherent Doppler lidar measurements of winds in the weak signal regime. Applied Optics 36: 3491–3499.
Gentry B, Chen H and Li SX (2000) Wind measurements with a 355 nm molecular Doppler lidar. Optics Letters 25: 1231–1233.
Grund CJ, Banta RM, George JL, et al. (2001) High-resolution Doppler lidar for boundary-layer and cloud research. Journal of Atmospheric and Oceanic Technology 18: 376–393.
Huffaker RM and Hardesty RM (1996) Remote sensing of atmospheric wind velocities using solid state and CO2 coherent laser systems. Proceedings of the IEEE 84: 181–204.
Menzies RT and Hardesty RM (1989) Coherent Doppler lidar for measurements of wind fields. Proceedings of the IEEE 77: 449–462.

Post MJ and Cupp RE (1990) Optimizing a pulsed Doppler lidar. Applied Optics 29: 4145–4158.
Rees D and McDermid IS (1990) Doppler lidar atmospheric wind sensor: reevaluation of a 355 nm incoherent Doppler lidar. Applied Optics 29: 4133–4144.
Rothermel J, Cutten DR, Hardesty RM, et al. (1998) The multi-center airborne coherent lidar atmospheric wind sensor. Bulletin of the American Meteorological Society 79: 581–599.
Skinner WR and Hays PB (1994) Incoherent Doppler lidar for measurement of atmospheric winds. Proceedings of SPIE 2216: 383–394.
Souprayen C, Garnier A and Hertzog A (1999) Rayleigh–Mie Doppler wind lidar for atmospheric measurements. II: Mie scattering effect, theory, and calibration. Applied Optics 38: 2422–2431.
Souprayen C, Garnier A, Hertzog A, Hauchecorne A and Porteneuve J (1999) Rayleigh–Mie Doppler wind lidar for atmospheric measurements. I: Instrumental setup, validation, first climatological results. Applied Optics 38: 2410–2421.

Hyperspectral Remote Sensing of Land and the Atmosphere

W H Farrand, Space Science Institute, Boulder, CO, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Hyperspectral remote sensing is a true marriage of imaging technology with spectroscopy. Hyperspectral remote sensing systems (also known as imaging spectrometer systems) fully sample the optical wavelength range of interest, whether it be the reflected solar range (0.35 to 2.5 µm) or the range of thermal emission from the Earth's surface (3.0 to 14 µm). A hyperspectral sensor views the Earth's surface in a series of contiguous and spectrally narrow image bands. Figure 1 presents the concept of hyperspectral remote sensing, wherein each spatial element of an image has an associated full-resolution spectrum. The calibrated reflectance or emittance spectra collected by a hyperspectral system are meant to be directly comparable with those of materials measured in the laboratory. With such high spectral resolution, it thus becomes possible to do reflectance or emittance spectroscopy of the Earth's surface from an overhead perspective.

Underlying Principles

Optical remote sensing is done over wavelength intervals, or windows, in which the atmosphere is

largely transparent. For the reflective solar portion of the spectrum, the atmosphere is mostly transparent from approximately 0.35 to 2.5 µm. Water, CO2, and other gases have absorptions of varying strength in this range. This window is subdivided into the range of human vision (the visible) and the region of slightly longer wavelengths known as the near infrared; together they are the visible/near infrared, or VNIR, and extend from approximately 0.35 to 1.0 µm. The region from approximately 1.0 to 2.5 µm is known as the short wavelength infrared, or SWIR. In the range of emitted terrestrial radiation there are two window regions. The first extends from approximately 3 to 5 µm and is known as the medium wavelength infrared, or MWIR, and the second, the long wavelength infrared, or LWIR, extends from approximately 8 to 14 µm.

The light that is intercepted by the entrance aperture of a sensor is quantified as radiance, which is measured in units of microwatts per square centimeter per nanometer per unit of solid angle. The physical quantities related to the material properties of Earth surface materials are reflectance and emittance. Both properties are ratios and so are unitless. One definition of reflectance is the ratio of radiance reflected from a material, divided by the radiance reflected from an identically illuminated, perfectly diffuse reflector. Likewise, a definition for emittance is the ratio of radiance emitted from a material, divided by the radiance


Figure 1 The hyperspectral remote sensing concept: images are taken simultaneously in 100–200 spectrally contiguous and spatially registered spectral bands spanning roughly 0.4 to 2.5 µm, so that each pixel has an associated continuous spectrum that can be used to identify the surface materials.

of radiance emitted from a perfectly emitting material of the same temperature.

Materials covering the Earth's surface, or gases in the atmosphere, can be identified in hyperspectral data on the basis of absorption features, or bands, in the spectra recorded by the sensor. Absorption bands result from the preferential absorption of energy in some wavelength interval. The principal types of processes by which light can be absorbed include, in order of decreasing energy of the process, electronic charge transfers, electronic crystal field effects, and molecular vibrations. The former two processes are observed in minerals and manmade materials that contain transition group metal cations, most usually iron. Charge transfer absorptions are the result of cation-to-anion or cation-to-cation electron transfers. Crystal field absorptions are explainable by the quantum mechanical theory of atomic structure, wherein an atom's electrons are contained in orbital shells. The transition group metals have incompletely filled d orbital shells. When a transition metal cation, such as iron, is surrounded by anions, and potentially other cations, an electric field, known as the crystal field, exists. Crystal field absorptions occur when radiant energy causes that cation's orbital shell energy levels to be split by interaction with the crystal field. The reflectance spectra of different iron-bearing minerals are unique, because the iron cation has a unique positioning with respect to the anions and other cations that compose the mineral.

A material's component molecules have bonds to other molecules, and as a result of interaction with radiant energy, these bonds can stretch or bend.

These molecular vibrations, and overtones of those vibrations, are less energetic processes than the electronic absorption processes described above, and so the resulting absorption features occur at longer wavelengths. Fundamental vibrational features occur in the MWIR and LWIR. Overtones and combination overtones of these vibrations are manifested as weaker absorption features in the SWIR. These overtone features in the SWIR include absorptions diagnostic of carbonate and certain clay minerals. Absorption features caused by vibrational overtones of C–H, O–H, and N–H stretches are also manifested in the SWIR, and these absorption features are characteristic of important vegetative biochemical components such as cellulose, lignin, starch, and glucose. An example of the effect of the changes in reflected energy recorded at a sensor, due to the absorption of incident energy by green vegetation and other surface materials, is provided in Figure 2.

Laboratory spectrometers have been used for many years to analyze the diagnostic absorptions of materials caused by the above effects. More recently, spectrometers were mounted on telescopes to determine the composition of the Moon and the other solid bodies in our solar system. Technology progressed to where profiling spectrometers (instruments that measure a successive line of points on the surface) could be mounted on aircraft. The next logical step was the construction of imaging spectrometers (hyperspectral sensors) that measured spectra in two spatial dimensions. In fact, the data produced by a hyperspectral sensor are often thought of as an image cube (Figure 3), because they consist of three dimensions: two spatial and one spectral.


Figure 2 Portion of a scene from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) from Utah that includes a basalt flow, circularly irrigated grasses, sparsely vegetated plains, and damp alluvial sediments. AVIRIS channels centered at 0.45, 0.55, 0.81, and 2.41 µm are shown. Green vegetation is brightest in the 0.81 µm band, which samples the peak of the near infrared plateau (see Figure 7) just beyond the 'red edge'. The green vegetation and damp sediments are darkest in the 2.41 µm band, where water has a very low reflectance (note also Figure 8). The bright patches in the basalt flow in the 2.41 µm image are occurrences of oxidized red cinders, which have a high reflectance in the SWIR.

Figure 3 Visualization of a hyperspectral image cube. A 1.7 µm image band from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Lunar Crater Volcanic Field, Nevada, is shown with each of the 224 bands for the rightmost columns and topmost lines arrayed behind it. In the lower right-hand corner, the low reflectance of ferric oxide-rich cinders in the blue and green shows up as dark in the stacking of columns, with the high reflectance in the infrared showing up as bright.


Advances in hyperspectral sensor technology have been accompanied by advances in associated technologies, such as computing power and the evolution of data processing algorithms.

Data Processing Approaches

An early impediment to the widespread use of hyperspectral data was the sheer volume of the data sets. For instance, a single scene from NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) is 614 samples by 512 lines by 224 bands; with its short-integer storage format, such a scene takes up 140.8 Mbytes. However, since the dawn of the hyperspectral era in the 1980s, computing power has expanded immensely, and computing tasks which once seemed prohibitive are now relatively effortless. Also, numerous processing algorithms have been developed which are especially well suited for use with hyperspectral data.

Spectral Matching

In one class of algorithms, the spectrum associated with each spatial sample is compared against one model spectrum, or a group of model spectra, and some similarity metric is applied. The model spectrum can be a laboratory or field-portable spectrometer measurement of pure materials, or it can be a single pixel spectrum, or an average of pixel spectra, covering a known occurrence of the material(s) to be mapped. A widely used similarity metric is the Spectral Angle Mapper (SAM). SAM is the angle, θ, obtained from the dot product of the image pixel spectrum, t, and the library spectrum, r, as expressed by

\theta = \cos^{-1}\left( \frac{\mathbf{t} \cdot \mathbf{r}}{\lVert \mathbf{t} \rVert \, \lVert \mathbf{r} \rVert} \right) \qquad [1]

SAM is insensitive to brightness variations and determines similarity based solely on spectral shape; lower θ values indicate more similar spectra.
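For concreteness, eqn [1] reduces to a few lines of code. A minimal sketch in Python/NumPy (the function and the example spectra are illustrative, not part of any standard package):

import numpy as np

def spectral_angle(t: np.ndarray, r: np.ndarray) -> float:
    # Spectral Angle Mapper of eqn [1]: the angle (radians) between an
    # image pixel spectrum t and a library spectrum r; smaller = more similar.
    cos_theta = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Brightness insensitivity: scaling a spectrum leaves the angle unchanged.
r = np.array([0.30, 0.35, 0.20, 0.25])  # hypothetical library spectrum
t = 0.5 * r                             # same shape, half as bright
print(spectral_angle(t, r))             # 0.0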

Figure 4 Parameters that describe an absorption band: the continuum, the band shoulders, the full width at half maximum (FWHM), and the band minimum.

In a related approach, matching is done not on the entire spectrum, but rather on characteristic absorption features of the materials of interest. The geometry of an absorption band is illustrated in Figure 4. To map a given mineral with a specific absorption feature, a high-resolution laboratory spectrum is resampled to the spectral resolution of the hyperspectral sensor. The spectrum is then subsampled to match the specific absorption feature (i.e., only bands from the short-wavelength shoulder of the absorption to the long-wavelength shoulder are included). A straight-line continuum (calculated from a line between the two band shoulders) is divided out of both the laboratory and the

sensor spectra. Contrast between the library and the sensor spectra is mitigated through the use of an additive constant. This constant is incorporated into a set of equations which are solved through the use of standard least squares. The result of such a band-fitting algorithm is two data numbers per spatial sample of the hyperspectral scene: first, the band depth (or, alternatively, band area) of the feature, and second, a goodness-of-fit parameter. These two parameters are most often combined in a band-depth times goodness-of-fit image. More sophisticated versions of the band-mapping algorithm target multiple absorption features and are, essentially, expert systems.
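The continuum-removal and least-squares steps just described can be sketched as follows (Python/NumPy; taking the band shoulders as the endpoints of the subsetted spectrum and using R² as the goodness-of-fit measure are simplifying assumptions, and all names are illustrative):

import numpy as np

def continuum_remove(wl, spectrum):
    # Divide out a straight-line continuum anchored at the two band
    # shoulders, taken here as the first and last channels of the subset.
    slope = (spectrum[-1] - spectrum[0]) / (wl[-1] - wl[0])
    continuum = spectrum[0] + slope * (wl - wl[0])
    return spectrum / continuum

def band_fit(wl, lab, pix):
    # Fit the continuum-removed pixel spectrum as pix ~ a*lab + b, the
    # additive constant b mitigating contrast differences, then report
    # band depth, goodness of fit, and their product.
    lab_cr = continuum_remove(wl, lab)
    pix_cr = continuum_remove(wl, pix)
    A = np.vstack([lab_cr, np.ones_like(lab_cr)]).T
    coef, *_ = np.linalg.lstsq(A, pix_cr, rcond=None)
    model = A @ coef
    fit = 1.0 - np.sum((pix_cr - model) ** 2) / np.sum((pix_cr - pix_cr.mean()) ** 2)
    depth = 1.0 - model.min()
    return depth, fit, depth * fit  # the last value maps to the output image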

Spectral Mixture Analysis

A fundamentally different methodology for analyzing hyperspectral data sets is to model the measured spectrum of each spatial sample as a linear combination of endmember spectra. This methodology, linear spectral mixture analysis (SMA), is based on the fact that the ground area corresponding to any given spatial sample will likely be covered by more than one material, with the consequence that the measured reflectance or emittance spectrum is a mixed-pixel spectrum. The objective of linear SMA is to determine the fractional abundances of the component materials, or endmembers. The basic equation for linear SMA is

\mathbf{r}(x, y) = \mathbf{a}\mathbf{M} + \boldsymbol{\epsilon} \qquad [2]

where r(x, y) is the relative reflectance spectrum for the pixel at position (x, y), a is the vector of endmember abundances, M is the matrix of endmember spectra, and ε is the vector of residuals between the modeled and the measured reflectances. Application


of SMA results in a series of fraction images, one for each endmember, wherein the data numbers ideally range between 0 and 1. Fraction-image pixels with a digital number (DN) of 0 are taken to be devoid of the endmember material; fraction-image pixels with a DN of 1.0 are taken to be completely covered by the endmember material. Techniques related to SMA have been developed which also map the abundance of a target material on a per-pixel basis. These include implementations highly similar to SMA, such as Orthogonal Subspace Projection (OSP), wherein, as with SMA, all endmembers must be determined a priori. Other approaches, such as Constrained Energy Minimization (sometimes referred to as 'Matched Filter'), do not require a priori knowledge of the endmembers. Instead, only the reflectance or emittance spectrum of the target need be known; the undesired spectral background is estimated, and ultimately compensated for, using the principal eigenvectors of the sample correlation or covariance matrix of the hyperspectral scene.
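A minimal unconstrained least-squares solution of eqn [2] can be sketched as follows (Python/NumPy; practical implementations usually add sum-to-one and non-negativity constraints on the abundances, which are omitted here, and the endmember spectra shown are hypothetical):

import numpy as np

def unmix(pixel, M):
    # Solve eqn [2], pixel ~ a @ M, for the abundance vector a.
    # M holds one endmember spectrum per row (n_endmembers x n_bands).
    a, *_ = np.linalg.lstsq(M.T, pixel, rcond=None)
    residual = pixel - a @ M  # the epsilon vector of eqn [2]
    return a, residual

M = np.array([[0.6, 0.5, 0.4, 0.3],   # hypothetical "soil" endmember
              [0.1, 0.4, 0.5, 0.2]])  # hypothetical "vegetation" endmember
pixel = 0.7 * M[0] + 0.3 * M[1]       # a perfectly mixed pixel
a, _ = unmix(pixel, M)
print(a)                              # ~[0.7, 0.3] fraction estimates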

Expert Systems and Artificial Neural Networks

Another means of extracting information from hyperspectral data sets is to use computer approaches that in some way mimic the human thought process. This can take the form of an expert system which applies a set of rules or tests as each pixel spectrum is successively analyzed. For example, if a spectrum has an absorption at 2.2 µm, it could tentatively be classified as being caused by a clay mineral; additionally, if the absorption is a doublet, the specific clay mineral is likely kaolinite. In practice, expert system codes are lengthy and complex, although excellent results can be demonstrated from their use, albeit at the expense of processing time. Another approach in this vein is the use of an artificial neural network (ANN) for data processing. The use of ANNs is motivated by their power in pattern recognition. ANN architectures are well suited to a parallel processing approach and thus have the potential for rapid data processing.

Hyperspectral Remote Sensing of the Land

Geology

Early efforts with airborne hyperspectral sensors focused on geologic remote sensing. This was, at least in part, a consequence of the mineralogic diversity of the Earth's surface and the fact that many minerals have absorption features which are unique and diagnostic of the mineral's identity. The expectation, which was soon borne out by results from these early sensors, was that, given exposure of the surface, mineralogic maps as detailed as the spatial resolution of the sensor could be derived from hyperspectral data. As some of the processing techniques discussed above have become more readily available, it has become possible to produce mineralogic maps from hyperspectral data far more rapidly and cost-effectively than geologic maps produced by standard means (e.g., by walking over the ground and manually noting which lithologies are underfoot).

The propensity for transition metals, most especially iron, to absorb energy through charge transfer and crystal field effects is noted above. Fe–O and Fe²⁺–Fe³⁺ charge transfers cause a profound absorption in the reflectance spectra of Fe-bearing minerals shortwards of 0.4 µm. The wing of this absorption causes the low blue and green reflectance of Fe-bearing minerals. Crystal field bands cause an absorption feature in the 0.9 to 1.0 µm region. The ability to discriminate subtle differences among Fe-bearing minerals in hyperspectral data has proven extremely valuable in applications such as mineral exploration, volcanology, and the characterization of abandoned mine lands.

In the SWIR, absorptions are caused by vibrational overtones of molecular bonds within minerals. These include absorptions in the 2.2 and 2.3 µm region which are characteristic of many Al- and Mg-bearing clay minerals. Absorptions in the SWIR are narrower in width than the Fe-generated crystal field bands discussed previously. In fact, the requirement to efficiently resolve these vibrational overtone absorption bands (which have full width at half maximum (FWHM) bandwidths of 20 to 40 nm) helped to drive the selection of the nominal 10 nm bandwidth of early and most current hyperspectral sensors. Certain clay minerals can be indicators of hydrothermal activity associated with economic mineral deposits; thus, the SWIR is an important spectral region for mineral exploration. Clay minerals, by definition, have planar molecular structures that are prone to failure if subjected to shearing stresses. The ability to uniquely identify and map these minerals using hyperspectral data has thus been used to good effect to identify areas on volcanoes and other hydrothermally altered terrains that could be subject to landslides. Minerals with adsorbed or molecularly bound OH⁻ and/or water have vibrational overtone absorptions near 1.45 and 1.9 µm, although these features are masked in remotely sensed data by prominent atmospheric water vapor bands at 1.38 and 1.88 µm. The reflectance of water- and OH-bearing minerals decreases to near zero at wavelengths longwards of

Figure 5 Reflectance spectra of representative minerals over 0.5–2.5 µm (spectra offset for clarity). Kaolinite represents clay minerals, with its diagnostic 2.2 µm band caused by an overtone vibration of the Al–OH bond. The 2.335 µm band of calcite is caused by an overtone vibration of the carbonate molecule. The features shortwards of 1.0 µm in the hematite spectrum are caused by crystal field and charge transfer absorptions due to Fe³⁺. The bands near 1.0 and longwards of 2.0 µm in the augite spectrum are the result of Fe²⁺ crystal field bands.

2.5 µm, due to the presence of water and OH fundamental vibrations near 3.0 µm. Minerals bearing the carbonate (CO₃²⁻) molecule are abundant on the Earth's surface due to the ubiquitous occurrences of limestone and dolomite. Calcite and dolomite have overtone absorptions centered at 2.335 and 2.315 µm, respectively. A good test of the spectral resolution and spectral calibration of a hyperspectral sensor is its ability to successfully discriminate calcite from dolomite on the basis of the aforementioned absorption features. Figure 5 shows representative spectra of carbonate, clay, and ferric and ferrous iron-bearing minerals.

Absorptions resulting from fundamental molecular vibrations are manifested in the LWIR. These absorptions are of great geologic interest because they include absorptions resulting from vibrations of the Si–O bond, and silicate minerals form the bulk of the Earth's crust. The wavelength at which the Si–O stretching feature occurs is dictated by the level of polymerization (or molecule-to-molecule bonding) of the silicate mineral. Highly polymerized framework silicate minerals, such as quartz and feldspar, have a shorter-wavelength absorption than do silicate minerals, such as olivine, which are composed of disconnected SiO4 molecules. In Figure 6, laboratory emission spectra of highly and poorly polymerized silicate minerals are shown.

Figure 6 Emissivity spectra of SiO4-bearing minerals (quartz, feldspar, and olivine, 1400–800 cm⁻¹, spectra offset for clarity), showing the shift in band minimum to higher wavenumber with increasing polymerization.

Vegetation and the Environment

Figure 7 Major spectral features in the reflectance spectrum of green vegetation (0.5–2.5 µm): the green peak, the red edge, and the water absorptions.

While different minerals are, by definition, generally composed of diverse component molecules, different species of vegetation represent variations of the same general biochemical constituents (e.g., chlorophyll, proteins, lignin, cellulose, sugar, starch, etc.). In the thermal IR, vegetation has a generally flat emittance; thus it is in the reflective solar spectrum that most vegetative remote sensing is performed. The major features in the reflectance spectrum of green vegetation are shown in Figure 7. In the visible, the major absorption features are caused by the presence of chlorophyll. Chlorophyll has strong absorptions in the blue and the red, leaving a reflectance maximum (the green peak) at 0.55 µm. In the NIR, scattering in the leaf structure causes high reflectance, leaving a profound absorption edge (the red edge) between

0.7 and 0.78 µm. Past the red edge and into the SWIR, the spectrum of green vegetation is dominated by water in the leaf, with leaf water absorptions occurring at 0.97, 1.19, 1.45, 1.93, and 2.50 µm.

Studies of vegetation using multispectral systems have made use of a number of broadband indices. Hyperspectral systems allow for the discrimination of individual absorption features and subtle spectral-shape differences in vegetation spectra which are unresolvable using broadband multispectral systems. For example, the absorptions caused by chlorophyll can be used for studies of chlorophyll content in vegetative canopies. The unique interaction of chlorophyll with leaf structures can give the green peak a morphology that is unique to a given species and thus mappable using a spectral feature fitting approach.

A desired goal in the study of vegetation and ecosystem processes is to be able to monitor the chemistry of forest canopies. The foliar biochemical constituents making up forest canopies have associated absorption features in the SWIR that result from vibrational overtones and combinations of C–O, O–H, C–H, and N–H molecular bonds. However, the absorptions from these biochemical components overlap, so that the chemical abundance of any one plant component cannot be directly related to any one absorption feature. An even more serious challenge is introduced by the strong influence of water in the SWIR in green vegetation spectra; water constitutes 40 to 80% of the weight of leaves. However, the unique characteristics of high-quality hyperspectral data (e.g., excellent radiometric and spectral calibration, high signal-to-noise ratio, high spectral resolution) have been used to detect even these subtle features imprinted on the stronger water features.

The amount of water present in plant leaves is, in itself, a valuable piece of information. Hyperspectral data can be used to determine the equivalent water thickness present in vegetative canopies. Conversely, the amount of dry plant litter and/or loose wood can be estimated using spectral mixture analysis and related techniques. Taken together, all these data (vegetation species maps, determinations of leaf water content, and the relative fractions of live vegetation versus litter) can be used to characterize woodlands for forest fire potential. The ability to map different species of vegetation has also proved useful in the field of agriculture, for assessing the impact of invasive weed species on farm and ranch land. Being able to assess the health of crops, in terms of leaf water content, is also important for agriculture.

Another subtle vegetative spectral feature that provides information on plant vitality is the aforementioned red edge. Shifts in the position of the red edge have been linked to vegetation stress. Studies have shown that vegetation stress, and consequent shifts in the position of the red edge, can be caused by a number of factors, including insufficient water intake or the intake of pernicious trace metals. Vegetation growing over mineral deposits can display red-edge shifts, and this can be used to assist in mineral exploration efforts; one simple estimator of the red-edge position is sketched below.
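A minimal sketch of such a red-edge position estimator (Python/NumPy; the window limits and the maximum-first-derivative criterion are illustrative assumptions, not a standard prescribed by the text):

import numpy as np

def red_edge_position(wl, refl, lo=0.68, hi=0.78):
    # Wavelength (micrometers) of the maximum first derivative of
    # reflectance within the nominal red-edge window lo..hi.
    sel = (wl >= lo) & (wl <= hi)
    d_refl = np.gradient(refl[sel], wl[sel])  # dR/dlambda in the window
    return wl[sel][np.argmax(d_refl)]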

Another application area in which hyperspectral remote sensing has found great utility is the analysis of snow-covered areas. Over the reflective solar spectrum, the reflectance of snow varies from values near zero in the SWIR to values near one in the blue. Differences in the grain size of snow result in differences in reflectance, as is illustrated in Figure 8. The spectral variability of snow makes it easily distinguishable from other Earth surface materials.

Figure 8 Modeled spectral reflectance of snow over 0.4–2.5 µm for snow grain radii of 50, 200, 500, and 1000 microns, showing the differences in snow reflectance spectra resulting from differences in snow grain size. Courtesy of Dr Anne Nolin of Oregon State University.


Linear spectral unmixing using hyperspectral data has been shown to successfully discriminate between snow, vegetation, rock, and clouds, and to accurately map snow cover fraction in mixed pixels. Accurate maps of snow cover, types of snow, and fractions of liquid water admixed with snow grains are required for forecasting snowmelt runoff and stream discharge in watersheds dominated by snow cover.

Hyperspectral Remote Sensing of the Atmosphere

The physical quantity most often sought in land remote sensing studies is surface reflectance. However, the quantity recorded by a hyperspectral sensor is radiance at the entrance aperture of the sensor. In order to obtain reflectance, the background energy level of the Sun and/or Earth must be removed, and the scattering and absorbing effects of the atmosphere must be compensated for. In the VNIR through SWIR, there are seven atmospheric constituents with significant absorption features: water vapor, carbon dioxide, ozone, nitrous oxide, carbon monoxide, methane, and oxygen. Molecular scattering (commonly called Rayleigh scattering) is strong in the blue but decreases rapidly with increasing wavelength; above 1 µm, its effect is negligible. Scattering caused by atmospheric aerosols, or Mie scattering, is also more prominent at shorter wavelengths and decreases with increasing wavelength, but the dropoff of effects from Mie scattering is not as profound as that for Rayleigh scattering. Consequently, effects from aerosol scattering can persist into the SWIR.

There are three primary categories of methods for removing the effects of the atmosphere and solar insolation from hyperspectral imagery in order to derive surface reflectance: atmospheric corrections can be considered as being either image-based, empirical, or model-based approaches. An image-based, or 'in-scene', approach uses only data measured by the instrument. Empirical methods make use of the remotely sensed data in combination with field measurements of reflectance to solve a simplified equation of at-sensor radiance such as eqn [3]:

L_s = A r + B \qquad [3]

where L_s is the at-sensor radiance, r is the reflectance of the surface, and A and B are quantities that incorporate, respectively, all multiplicative and additive contributions to the at-sensor radiance. All the quantities in eqn [3] can be considered to vary as a function of wavelength, λ. Approximations for A and B from eqn [3] constitute a set of gains and offsets derived from the empirical approach.
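Under that description, the empirical approach amounts to a per-band linear regression of at-sensor radiance against field-measured reflectance for a set of calibration targets. A minimal sketch (Python/NumPy; at least two targets per band are assumed, the minimum needed to solve for both A and B):

import numpy as np

def empirical_line(radiance_targets, reflectance_targets):
    # Solve eqn [3], Ls = A*r + B, band by band from calibration targets.
    # Inputs are (n_targets x n_bands) arrays; returns per-band A and B.
    n_bands = radiance_targets.shape[1]
    A = np.empty(n_bands)
    B = np.empty(n_bands)
    for band in range(n_bands):
        A[band], B[band] = np.polyfit(reflectance_targets[:, band],
                                      radiance_targets[:, band], 1)
    return A, B

def to_reflectance(radiance_cube, A, B):
    # Invert eqn [3] over a (rows x cols x bands) radiance cube.
    return (radiance_cube - B) / A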

In some empirical approaches, atmospheric path radiance is ignored and only a multiplicative correction is applied. Model-based approaches seek to model what the at-sensor radiance should be on a pixel-by-pixel basis, including the contribution of the atmosphere. The at-sensor radiance, L_s, at any wavelength, λ, can be expressed as

L_s = \frac{1}{\pi} \left( E r + M_T \right) \tau_\theta + L_\theta \qquad [4]

where E is the irradiance at the surface of the Earth, r is the reflectance of the Earth's surface, M_T is the spectral radiant exitance of the surface at temperature T, τ_θ is the transmissivity of the atmosphere at zenith angle θ, and L_θ is the spectral upwelling path radiance of the atmosphere. The ability to solve eqn [4] and do atmospheric correction on a pixel-by-pixel basis is appealing, in that it negates the shortcomings of an empirical approach, where a correction based on calibration targets in one part of a scene might not be appropriate for another part of the scene, due to differences in atmospheric pathlength or simple atmospheric heterogeneity. Model-based approaches are also able to take advantage of the greater spectral dimensionality of hyperspectral data sets for the derivation of atmospheric properties (e.g., the amount of column water vapor, CO2 band depths, etc.) directly from the data.

The main atmospheric component affecting imagery in the reflective solar portion of the spectrum is water vapor. Model-based atmospheric correction techniques determine the column water vapor for each pixel in the scene, based on the depth of the 940 and/or 1130 nm atmospheric water bands. Thus, for each pixel, an appropriate amount of water can be removed. The maps of atmospheric water vapor distribution are themselves of interest to atmospheric scientists. Absorption features caused by well-mixed gases, such as the 760 nm O2 band, can be used by model-based programs to produce images of approximate scene topography. More advanced model-based approaches also solve for the scattering effects of aerosols on the hyperspectral data.

As noted above, atmospheric constituents, including industrial effluents, can be detected and spatially mapped by hyperspectral sensors. The fundamental vibrational absorptions of gases of interest occur in the MWIR and the LWIR. Overtones of these gaseous molecular vibrations occur in the VNIR to SWIR, but are generally too weak to be detected. The ability to detect anthropogenically produced gases depends on a number of factors, including the


temperature difference between the gas cloud and the background atmosphere, the concentration of gas within the cloud, and the size of the cloud (e.g., the path length of light through the cloud).

See also

Imaging: Infrared Spectrometers. Instrumentation: Imaging.

Further Reading

Clark RN (1999) Spectroscopy of rocks and minerals, and principles of spectroscopy. In: Rencz AN (ed.) Remote Sensing for the Earth Sciences, pp. 3–57. New York: John Wiley and Sons.
Farrand WH and Harsanyi JC (1997) Mapping the distribution of mine tailings in the Coeur d'Alene River Valley, Idaho through the use of a Constrained Energy Minimization technique. Remote Sensing of Environment 59: 64–76.
Gao B-C and Goetz AFH (1995) Retrieval of equivalent water thickness and information related to biochemical components of vegetation canopies from AVIRIS data. Remote Sensing of Environment 52: 155–162.
Gao B-C, Heidebrecht KH and Goetz AFH (1993) Derivation of scaled surface reflectances from AVIRIS data. Remote Sensing of Environment 44: 165–178.
Goetz AFH (1989) Spectral remote sensing in geology. In: Asrar G (ed.) Theory and Applications of Optical Remote Sensing, pp. 491–526. New York: John Wiley & Sons.
Goetz AFH, Vane G, Solomon JE and Rock BN (1985) Imaging spectrometry for Earth remote sensing. Science 228: 1147–1153.
Hapke B (1993) Theory of Reflectance and Emittance Spectroscopy. Cambridge, UK: Cambridge University Press.
Kirkland LE, Herr KC and Salisbury JW (2001) Thermal infrared spectral band detection limits for unidentified surface materials. Applied Optics 40: 4852–4862.
Martin ME, Newman SD, Aber JD and Congalton RG (1998) Determining forest species composition using high spectral resolution remote sensing data. Remote Sensing of Environment 65: 249–254.
Painter TH, Roberts DA, Green RO and Dozier J (1998) The effect of grain size on spectral mixture analysis of snow-covered area with AVIRIS data. Remote Sensing of Environment 65: 320–332.
Pieters CM and Englert PAJ (eds) Remote Geochemical Analysis: Elemental and Mineralogical Composition. Cambridge, UK: Cambridge University Press.
Pinzon JE, Ustin SL, Castaneda CM and Smith MO (1997) Investigation of leaf biochemistry by hierarchical foreground/background analysis. IEEE Transactions on Geoscience and Remote Sensing 36: 1–15.
Roberts DA, Green RO and Adams JB (1997) Temporal and spatial patterns in vegetation and atmospheric properties using AVIRIS. Remote Sensing of Environment 62: 223–240.
Wessman CA, Aber JD and Peterson DL (1989) An evaluation of imaging spectrometry for estimating forest canopy chemistry. International Journal of Remote Sensing 10: 1293–1316.

Laser Detection of Atmospheric Gases

E V Browell, W B Grant and S Ismail, NASA Langley Research Center, Hampton, VA, USA

Published by Elsevier Ltd.

Introduction

Lidar (light detection and ranging) systems are able to measure profiles of atmospheric aerosols, clouds, and gases by transmitting a pulsed laser beam into the atmosphere and collecting the backscattered radiation from aerosols and molecules in the atmosphere with a receiver located near the transmitter. The differential absorption lidar (DIAL) approach is the most widely used technique for measuring a variety of gases. In this approach, two closely spaced laser wavelengths are used: one which is absorbed by the gas of interest, and the other which is only weakly, or not at all, absorbed by the gas. A differential with respect to range and wavelength is calculated to

determine the average gas concentration along any segment of the lidar path, using the familiar Beer–Lambert law for an absorbing medium. The DIAL equation can be expressed in its simple form as

N = \frac{1}{2 (R_2 - R_1)(\sigma_{\rm on} - \sigma_{\rm off})} \, \ln\!\left[ \frac{P_{\rm off}(R_2) \, P_{\rm on}(R_1)}{P_{\rm off}(R_1) \, P_{\rm on}(R_2)} \right] \qquad [1]

where N is the average gas concentration, R is the range, σ_on and σ_off are the absorption cross-sections at the on- and off-line wavelengths, and P_on(R) and P_off(R) are the powers received at the on- and off-line wavelengths.
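Eqn [1] translates directly into code. A minimal sketch (Python/NumPy; the variable names and units are illustrative):

import numpy as np

def dial_concentration(R1, R2, P_on, P_off, sigma_on, sigma_off):
    # Average gas number density between ranges R1 and R2 per eqn [1].
    # P_on and P_off are (power at R1, power at R2) pairs; with the
    # cross-sections in cm^2 and the ranges in cm, N is in molecules/cm^3.
    ratio = (P_off[1] * P_on[0]) / (P_off[0] * P_on[1])
    return np.log(ratio) / (2.0 * (R2 - R1) * (sigma_on - sigma_off))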


However, since the measurement is performed in the atmosphere and not in a laboratory, great care must be taken to ensure that the data are analyzed properly to minimize random and systematic errors. Random errors arise from noise in the backscattered signal, the solar background signal, and the inherent detector noise; this type of error can be reduced by signal averaging. Systematic errors arise from uncompensated instrument and atmospheric effects and must be carefully considered in system design, operation, and data processing.

Another approach, the Raman lidar approach, uses inelastic scattering (the scattered/emitted light has a different wavelength than the illuminating light) from gases, where the wavelength shift corresponds to vibrational or rotational energy levels of the molecules. Any illuminating laser wavelength can be used, but since the Raman scattering cross-section varies as λ⁻⁴, where λ is the wavelength, shorter laser wavelengths, such as in the near-UV spectral region, are preferred. While even shorter wavelengths in the solar-blind region below ~300 nm would permit operation in daytime, the atmospheric attenuation due to Rayleigh scattering and UV molecular absorption limits the measurement range. High-power lasers are used because of the low Raman scattering cross-sections. Raman lidar measurements have to be carefully calibrated against a profile measured using another approach, such as is done for water vapor (H2O) measurements with the launch of hygrometers on radiosondes, since the lidar system constants and atmospheric extinctions are not usually well known or modeled. In order to obtain mixing ratios of H2O with respect to atmospheric density, the H2O signals are ratioed to the nitrogen Raman signals. However, care must be exercised in processing these data, due to the spectral dependences of atmospheric extinction.
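The mixing-ratio step just described can be sketched as follows (Python/NumPy; the calibration constant and any differential-extinction correction are assumed to come from an external comparison, such as the radiosonde calibration mentioned above):

import numpy as np

def raman_mixing_ratio(S_h2o, S_n2, k_cal, ext_corr=1.0):
    # Water vapor mixing ratio profile from background-subtracted H2O and
    # N2 Raman signal profiles. k_cal is the calibration constant; the
    # optional ext_corr profile corrects for the wavelength dependence of
    # atmospheric extinction between the two Raman channels.
    return k_cal * (np.asarray(S_h2o) / np.asarray(S_n2)) * ext_corr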

Surface-Based Lidar Systems

Surface-based lidar systems measure the temporal evolution of atmospheric profiles of aerosols and gases. The first lidar systems used to remotely measure atmospheric gases were Raman lidar systems. The Raman lidar systems were thought to be very promising, since one high-power laser could be used to measure a variety of gases. They are easier to develop and operate than DIAL systems, because the laser does not have to be tuned to a particular gas absorption feature. However, the Raman scattering cross-section is low, and the weak, inelastically scattered signal is easily contaminated by daylight background radiation, resulting in greatly reduced performance during daytime. Raman lidar systems have been developed primarily for measuring H2O and retrieving atmospheric temperature.

Surface-based UV DIAL systems that are part of the Network for the Detection of Stratospheric Change (NDSC) have made important contributions to the understanding of stratospheric O3. There are DIAL systems at many locations around the Earth, with sites at Ny-Ålesund, Spitzbergen (78.9°N, 11.9°E), Observatoire de Haute Provence, France (43.9°N, 5.7°E), Table Mountain, California (34.4°N, 118.2°W), Mauna Loa, Hawaii (19.5°N, 155.6°W), and Lauder, New Zealand (45.0°S, 169.7°E). First established in the 1980s, these DIAL systems are strategically located so that O3 in different latitude bands can be monitored for signs of change. In addition, they can provide some profiles for comparison with space-based O3-measuring instruments. The parameters of a typical ground-based UV DIAL system used in the NDSC are given in Table 1.

Table 1 Parameters for the Jet Propulsion Laboratory's Mauna Loa DIAL systems for stratospheric O3 measurements

Lasers:                                   XeCl        Nd:YAG (3rd harmonic)
  Wavelength (nm)                         308 (on)    355 (off)
  Pulse energy (mJ)                       300         150
  Pulse repetition frequency (Hz)         200         100
Receiver:
  Area (m2)                               0.79
  Optical efficiency (%)                  ~40
  Wavelengths – Rayleigh (nm)             308         355
  N2 Raman (nm) (for aerosol correction)  332         387
System performance:
  Measurement range* (km)                 15–55
  Vertical resolution (km)                3 at bottom, 1 at O3 peak, 8–10 at top
  Measurement averaging time (hours)      1.5
  Measurement accuracy                    <5% at peak, 10–15% at 15 km, >40% at 45 km

*A chopper is added to block the beam until it reaches 15 km in order to avoid near-field signal effects.

Airborne DIAL Systems

Airborne lidar systems expand the range of atmospheric studies beyond those possible with surface-based lidar systems, by virtue of being located in aircraft that can be flown to high altitudes and to remote locations. Thus, they permit measurements at locations inaccessible to surface-based lidar systems. In addition, they can make measurements of large atmospheric regions in times that are short compared with atmospheric motions, so that the large-scale patterns are discernible. Another advantage of airborne lidar operation is that lidar systems perform well in the nadir (down) direction, since the atmospheric density and aerosol loading generally increase with decreasing altitude towards the surface, which helps to compensate for the R⁻² decrease in the lidar


signal with range. For the zenith direction, the advantage is that the airborne lidar system is closer to the region being measured.

Field Measurement Programs

Global O3 Measurements

The first airborne DIAL system, which was developed by the NASA Langley Research Center (LaRC), was flown for O3 and aerosol investigations in conjunction with the Environmental Protection Agency's Persistent Elevated Pollution Episodes (PEPE) field experiment, conducted over the east coast of the US in the summer of 1980. This initial system has evolved into the advanced UV DIAL system that will be described in the next section. Airborne O3 DIAL systems have also been developed and used in field measurement programs by several other groups in the United States, Germany, and France.

The parameters of the current NASA LaRC airborne UV DIAL system are given in Table 2. The on-line and off-line UV wavelengths of 288.2 and 299.6 nm are used for DIAL O3 measurements during tropospheric missions, and 301 and 310 nm are used for stratospheric missions. This system also transmits 1064 and 600 nm beams for aerosol and cloud measurements. The time delay between the on- and off-wavelength pulses is 400 µs, which is sufficient time for the return at the first set of wavelengths to end, but short enough that the same region of the

atmosphere is sampled. This system has a demonstrated absolute accuracy for O3 measurements of better than 10% or 2 ppbv (parts per billion by volume), whichever is larger, and a measurement precision of 5% or 1 ppbv, with a vertical resolution of 300 m and an averaging time of 5 minutes (about 70 km horizontal resolution at typical DC-8 ground speeds).

The NASA LaRC airborne UV DIAL systems have made significant contributions to the understanding of both tropospheric and stratospheric O3, aerosols, and clouds. These systems have been used in 18 international and 3 national field experiments over the past 24 years, and during these field experiments, measurements were made over, or near, all of the oceans and continents of the world. A few examples of the scientific contributions made by these airborne UV DIAL systems are given in Table 3.

The NASA LaRC airborne UV DIAL system has been used extensively in NASA's Global Tropospheric Experiment (GTE) program, which was started in the early 1980s and had as its primary mission the study of tropospheric chemistry in remote regions of the Earth, in part to study and gain a better understanding of atmospheric chemistry in the unperturbed atmosphere. A related goal was to document the Earth's atmosphere in a number of places during seasons when anthropogenic influences are largely absent, then return to these areas later to document the changes.
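The quoted horizontal resolution follows directly from the averaging time and the aircraft ground speed; a quick check (Python; the DC-8 ground speed used here is an assumed typical value):

ground_speed = 235.0       # m/s, assumed typical DC-8 cruise ground speed
averaging_time = 5 * 60.0  # s, the 5-minute averaging time quoted above
print(ground_speed * averaging_time / 1.0e3, "km")  # ~70 km, as quoted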

Table 2 Parameters of the NASA LaRC airborne UV DIAL system

Lasers: Nd:YAG-pumped dye lasers, frequency doubled into the UV
  Pulse repetition frequency (Hz)       30
  Pulse length (ns)                     8–12
  Pulse energy (mJ) at 1.06 µm          250–300
  Pulse energy (mJ) at 600 nm           50–70
  UV pulse energy (mJ):
    For troposphere at 288/300 nm       20
    For stratosphere at 301/310 nm      20
  Dimensions (l × w × h) (cm)           594 × 102 × 109
  Mass (kg)                             1735
  Power requirement (kW)                30

Receiver:
  Wavelength region (nm)                289–311      572–622      1064
  Area (m2)                             0.086        0.086        0.864
  Receiver optical efficiency (%)       30           40           30
  Detector quantum efficiency (%)       26 (PMT)     8 (PMT)      40 (APD)
  Field-of-view (mrad)                  ≤1.5         ≤1.5         ≤1.5

System performance:
  Measurement range (km)                up to 10–15 (nadir and zenith)
  Vertical resolution (m)               300–1500, depending on range
  Horizontal resolution (km)            ≤70
  Measurement accuracy                  ≤10% or 2 ppbv, whichever is greater


Table 3 Examples of significant contributions of airborne O3 DIAL systems to the understanding of tropospheric and stratospheric O3 (see also Figures 1–5)

Troposphere:
  Air mass characterizations in a number of remote regions
  Continental pollution plume characterizations (see Figures 1 and 2)
  Study of biomass burn plumes and the effect of biomass burning on tropospheric O3 production (see Figure 3)
  Case study of warm conveyor belt transport from the tropics to the Arctic
  Study of stratospheric intrusions and tropopause fold events (see Figure 4)
  Observation of the decay of a cutoff low
  Power plant plume studies

Stratosphere:
  Contributions to the chemical explanation of the behavior of Antarctic O3
  Quantification of O3 depletion in the Arctic (see Figure 5)
  Polar stratospheric clouds – particle characterizations
  Intercomparison with surface-based, airborne, and space-based instruments
  Quantification of O3 reduction in the tropical stratosphere after the June 1991 eruption of Mount Pinatubo and characterization of the tropical stratospheric reservoir edge
  Cross-vortex boundary transport

Figure 1 Latitudinal distribution of ozone over the western Pacific Ocean obtained during the second Pacific Exploratory Mission (PEM West B) in 1994.

The field missions have included campaigns in Africa, Alaska, the Amazon Basin, Canada, and North America, and over the Pacific Ocean from Antarctica to Alaska, with concentrated measurements off the east coast of Asia and in the tropics. One of the bigger surprises of the two decades of field missions was the discovery, in the mid-1990s, that the tropical and South Pacific Ocean had very high tropospheric O3 concentrations in plumes from biomass burning in Africa and South America during the austral spring. There were few indications from surface-based or space-based measurements that there were extensive biomass burn plumes in the area, primarily because the plumes were largely devoid of aerosols, these having been stripped out during cloud convective lofting. Once over the ocean, horizontal transport appears to proceed relatively unimpeded unless a storm system is encountered.

The NASA LaRC airborne UV DIAL system has also been flown in all the major stratospheric O3 campaigns, starting in 1987 with the Airborne Antarctic Ozone Experiment (AAOE) to determine the cause of the Antarctic ozone hole. The UV DIAL system documented the O3 loss across the ozone hole region. Later, when attention turned to the Arctic and the possibility of an ozone hole there, the system was used to produce an estimate of O3 loss during the winter season, as well as to better characterize the polar stratospheric cloud (PSC) particles. The UV DIAL system was also used to study O3 loss in the tropical stratospheric reservoir following the eruption of Mount Pinatubo in June 1991. The loss was spotted by the Microwave Limb Sounder (MLS) on the Upper Atmosphere Research Satellite (UARS), but the MLS


Figure 2 Pollution outflow from China over the South China Sea (right side of figure) with clean tropical air on south side of a front (left side of figure), observed during PEM West B in 1994.

was unable to study the loss in detail due to its low vertical resolution (5 km) compared to the small-scale (2–3 km) features of the O3 loss. Other traditional space-based O3-measuring instruments also had difficulty during this period, due to the high aerosol loading in the stratosphere following the eruption.

Space-Based O3 DIAL System

In order to obtain nearly continuous, global distributions of O3 in the troposphere, a space-based O3 DIAL system is needed. A number of key issues could be addressed by a space-based O3 DIAL system including: the global distribution of photochemical O3 production/destruction and transport in the troposphere; location of the tropopause;

and stratospheric O3 depletion and dynamics. High-resolution airborne O3 DIAL and other aircraft measurements show that, to study tropospheric processes associated with biomass burning, transport of anthropogenic pollutants, tropospheric O3 chemistry and dynamics, and stratosphere–troposphere exchange, a vertical profiling capability from space with a resolution of 2–3 km is needed, and this capability cannot currently be achieved using passive remote sensing satellite instruments. An example of the type of latitudinal O3 cross-section that could be provided by a space-based O3 DIAL system is shown in Figure 1. This figure shows many different aspects of O3 loss and production; vertical and horizontal transport; and stratosphere–troposphere exchange that occur from the tropics to high latitudes.


Figure 3 Biomass burning plume over the Atlantic Ocean arising from biomass burning in the central part of western Africa, observed during the TRACE-A mission in 1992.

This type of data would be available from just one pass from a space-based O3 DIAL system. A space-based O3 DIAL system optimized for tropospheric O3 measurements (see system description in Table 4a,b) would also permit high-resolution O3 measurements in the stratosphere (1 km vertical, 100 km horizontal), along with high-resolution aerosol measurements (100 m vertical, 10 km horizontal). In addition, these DIAL measurements will be useful in assisting in the interpretation of passive remote sensing measurements and in helping to improve their data processing algorithms.

Global H2O Measurements

H2O and O3 are important to the formation of OH in the troposphere, and OH is at the center of most of the chemical reactions in the lower atmosphere. In addition, H2O is an excellent tracer of vertical and horizontal transport of air masses in the troposphere, and it can be used as a tracer of stratosphere–troposphere exchange. Increased aerosol sizes, due to high relative humidities, can also affect heterogeneous chemical processes and radiation budgets in the boundary layer and in cloud layers. Knowledge of H2O is important to weather forecasting and climate


Figure 4 Ozone distribution observed on a flight across the US during the SASS (Subsonic Assessment) Ozone and Nitrogen Experiment (SONEX) in 1997. A stratospheric intrusion is clearly evident on the left side of the figure, and low-ozone air transported from the tropics to mid-latitudes can be seen in the upper troposphere on the right.

Figure 5 Ozone cross-sections in the stratosphere measured in the winter of 1999/2000 during the SOLVE mission. The change in ozone number density in the Arctic polar vortex due to chemical loss during the winter is clearly evident at latitudes north of 72°N.

predictions. Thus, H2O distributions can be used in several different ways to better understand chemical, transport, radiation and meteorological processes in the global troposphere.

H2O Raman Lidar Systems

The first Raman lidar measurements of H2O were made in the late 1960s, but not much progress in using the Raman approach for H2O was made until a


system using an abandoned searchlight mirror was employed by the NASA Goddard Space Flight Center in the mid-1980s to show that, if the signal collector (telescope) was large enough, useful Raman measurements could be made to distances of several kilometers. Until recently, Raman lidar measurements of H2O were largely limited to night-time, due to the high sunlight background interference. Measurements are now made during daytime using very narrow telescope fields of view and very narrowband filters in the receiver.

A good example of a Raman lidar system used to measure H2O is that at the Department of Energy's Atmospheric Radiation Measurement – Cloud and Radiation Test Bed (ARM-CART) site in Oklahoma. The site is equipped with a variety of instruments aimed at studying the radiation properties of the atmosphere. The system parameters are given in Table 4a,b.

Raman lidar systems have been used for important measurements of H2O from a number of ground-based locations. Some of the important early work dealt with the passage of cold and warm fronts; the arrival of a wedge-shaped cold front, pushing the warm air up, was one of these studies. Raman lidars have also been located at the ARM-CART site in Kansas and Oklahoma, where they could both provide a climatology of H2O and provide correlative measurements of H2O for validation of space-based instruments.

Table 4a Parameters for the US Department of Energy's surface-based Raman lidar system for measurements of H2O

Laser:
  Type                                  Nd:YAG
  Wavelength (nm)                       355
  Pulse energy (mJ)                     400
  Pulse repetition frequency (Hz)       30
Receiver:
  Area (m2)                             1.1
  Wavelengths (nm): water vapor         407
                    nitrogen            387
System performance:
  Measurement range (km): night-time    near surface to 12
                          daytime       near surface to 3
  Range resolution (m)                  39 at low altitudes, 300 above 9 km
  Measurement accuracy (%):
    Night-time                          <10 below 7 km (1 min avg); 10–30 at high altitudes (10–30 min avg)
    Daytime                             10 below 1 km; 5–15 for 1–3 km (10 min avg)

Table 4b Parameters for the NASA Goddard Space Flight Center's scanning Raman lidar system for measurements of H2O

                                        Day/Night    Night only
Laser:
  Type                                  Nd:YAG       XeF
  Wavelength (nm)                       355          351
  Pulse energy (mJ)                     300          30–60
  Pulse repetition frequency (Hz)       30           400
Receiver:
  Area (m2)                             1.8          1.8
  Wavelengths (nm): water vapor         407          403
                    nitrogen            387          382
System performance:
  Measurement range (km): night-time    near surface to 12
                          daytime       near surface to 4
  Range resolution (m)                  7.5 at low altitudes, 300 above 9 km
  Measurement accuracy (%):
    Night-time                          <10 below 5 km (10 s avg, 7.5 m range resolution); <10 below 7 km (1 min avg); 10–30 at high altitudes (10–30 min avg)
    Daytime                             <10 below 4 km (5 min avg)

H2O DIAL Systems

H2O was first measured with the DIAL approach using a temperature-tuned ruby laser lidar system in the mid-1960s. The first aircraft-based H2O DIAL system was developed at NASA LaRC and was flown in 1982 as an initial step towards the development of a space-based H2O DIAL system. This system was based on Nd:YAG-pumped dye laser technology, and it was used in the first airborne H2O DIAL atmospheric investigation, which was a study of the marine boundary layer over the Gulf Stream. This laser was later replaced with a flashlamp-pumped solid-state alexandrite laser, which had high spectral purity, i.e., little out-of-band radiation, a requirement since water vapor lines are narrow; this system was used to make accurate H2O profile measurements across the lower troposphere. A third H2O DIAL system, called LASE (Lidar Atmospheric Sensing Experiment), was developed as a prototype for a space-based H2O DIAL system, and it was completed in 1995. This was the first fully autonomously operating DIAL system. LASE uses a Ti:sapphire laser that is pumped by a double-pulsed, frequency-doubled Nd:YAG to produce laser pulses in the 815 nm absorption band of H2O (see Table 5). The wavelength of the Ti:sapphire laser is controlled by injection seeding with a diode laser that is frequency locked to a H2O line using an


Table 5 Parameters of the LASE H2O DIAL system

Laser:
  Type                                  Ti:sapphire
  Wavelength (nm)                       813–818
  Pulse energy (mJ)                     100
  Pulse-pair repetition frequency (Hz)  5 (on- and off-line pulses separated by 300 µs)
  Linewidth (pm)                        <0.25
  Stability (pm)                        <0.35
  Spectral purity (%)                   >99
  Beam divergence (mrad)                <0.6
  Pulse width (ns)                      35
Receiver:
  Area (m2)                             0.11
  Receiver optical efficiency (%)       50 (night), 35 (day)
  Avalanche photodiode (APD) detector quantum efficiency (%)   80
  Field-of-view (mrad)                  1.0
  Noise equivalent power (W Hz^-0.5)    2 × 10^-14
  Excess noise factor                   3
System performance:
  Measurement range (altitude) (km)     15
  Range resolution (m)                  300–500
  Measurement accuracy (%)              5

Table 6 Examples of significant contributions of airborne H2O DIAL systems to the understanding of H2O distributions (see also Figures 6–8)

NASA Langley H2O DIAL systems, including LASE:
  Study of marine boundary layer over the Gulf Stream
  Observation of H2O transport at a land/sea edge
  Study of large-scale H2O distributions across the troposphere (see Figure 6)
  Correlative in situ and remote measurements
  Observations of boundary layer development (see Figure 7)
  Cirrus cloud measurements
  Hurricane studies (see Figure 8)
  Relative humidity effects on aerosol sizes
  Ice supersaturation in the upper troposphere
  Studies of stratospheric intrusions
  H2O distributions over the remote Pacific Ocean
Other airborne H2O DIAL systems:
  Boundary layer humidity fluxes
  Lower-stratospheric H2O studies

absorption cell. Each pulse pair consists of an on-line and an off-line wavelength for the H2O DIAL measurements. To cover the large dynamic range of H2O concentrations in the troposphere (over three orders of magnitude), up to three line-pair combinations are needed. LASE uses a novel approach of operating from more than one position on a strongly absorbing H2O line. In this approach, the laser is electronically tuned to the line center, the side of the line, or the near wing of the line to achieve the required absorption cross-section pairs (on and off). LASE has demonstrated measurements of H2O concentrations across the entire troposphere using this 'side-line' approach. The accuracy of LASE H2O profile measurements was determined to be better than 6% or 0.01 g/kg, whichever is larger, over the full dynamic range of H2O concentrations in the troposphere. LASE has participated in over eight major field experiments since 1995. See Table 6 for a listing of topics studied using airborne H2O DIAL systems (Figures 6–8).
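Underlying all of these systems is the standard DIAL equation, which recovers the absorber number density from the range derivative of the logarithm of the on-line/off-line signal ratio. The following Python sketch is illustrative only (function names and numerical values are ours); real retrievals such as LASE's also correct for differential backscatter, spectral purity, and the side-line weighting described above.

```python
import numpy as np

def dial_number_density(p_on, p_off, dr_m, dsigma_cm2):
    """Range-resolved number density (molecules cm^-3) from DIAL returns.

    p_on, p_off : background-subtracted returns at the on- and off-line
                  wavelengths on a common range grid (spacing dr_m, m).
    dsigma_cm2  : on/off absorption cross-section difference (cm^2).
    All instrument constants cancel in the double ratio below.
    """
    p_on = np.asarray(p_on, dtype=float)
    p_off = np.asarray(p_off, dtype=float)
    ratio = (p_on[:-1] * p_off[1:]) / (p_on[1:] * p_off[:-1])
    dr_cm = 100.0 * dr_m
    return np.log(ratio) / (2.0 * dsigma_cm2 * dr_cm)

# Example: synthetic returns through a uniform absorber layer
r = np.arange(300.0, 3300.0, 30.0)       # range gates (m)
n_true = 2.5e17                          # H2O density (cm^-3), ~1% mixing ratio
dsig = 1.0e-23                           # illustrative cross-section difference
p_off = 1.0 / r**2
p_on = p_off * np.exp(-2.0 * dsig * n_true * 100.0 * r)
print(dial_number_density(p_on, p_off, 30.0, dsig)[:3] / n_true)  # ~1.0
```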

Space-Based H2O DIAL System

The technology for a space-based H2O DIAL system is rapidly maturing in the areas of: high-efficiency, high-energy, high-spectral-purity, long-life lasers with tunability in the 815 and 940 nm regions; low-weight, large-area, high-throughput, high-background-rejection receivers; and high-quantum-efficiency, low-noise, photon-counting detectors. With the expected advancements in lidar technologies leading to a 1 J/pulse capability, a space-based H2O DIAL system could be flown on a long-duration space mission this decade. Space-based DIAL measurements can provide a global H2O profiling capability which, when combined with passive remote sensing with limited vertical resolution, can lead to three-dimensional measurements of global H2O distributions. High vertical resolution H2O (≤1 km), aerosol (≤100 m), and cloud top (≤50 m) measurements from the lidar along the satellite ground-track can be combined with the horizontally contiguous data from nadir passive sounders to generate more complete high-resolution H2O, aerosol, and cloud fields for use in the various studies indicated above. In addition, the combination of active and passive measurements can provide significant synergistic benefits leading to improved temperature and relative humidity measurements. There is also strong synergy with aerosol and cloud imaging instruments and with future passive instruments being planned or proposed for missions addressing atmospheric chemistry, radiation, hydrology, natural hazards, and meteorology.

Tunable Laser Systems for Point Monitoring

Tunable diode laser (TDL) systems are also used to measure atmospheric gases on a global scale from aircraft and balloons. TDLs are small lasers that emit extremely narrowband radiation and can be tuned in the near-IR spectral region using a combination of temperature and current. TDLs can be built into very sensitive and compact systems, located on the surface or flown on aircraft or balloons, and used to measure such species as CO, HCl, CH4, and oxides of nitrogen such as N2O and NO2. They derive their


Figure 6 LASE measurements of water vapor (left) and aerosols and clouds (right) across the troposphere on an ER-2 flight from Bermuda to Wallops during the Tropospheric Aerosol Radiative Forcing Experiment (TARFOX) conducted in 1996.

Figure 7 Water vapor and aerosol cross-section obtained on a flight across a cold front during the Southern Great Plains (SGP) field experiment conducted over Oklahoma in 1997.


Figure 8 Measurements of water vapor, aerosols, and clouds in the inflow region of Hurricane Bonnie during the 1998 Convection and Moisture Experiment (CAMEX-3). A rain band can be clearly seen at the middle of Leg-AB on the satellite and LASE cross-sections.

high sensitivity to the fact that the laser frequency is modulated at a high frequency, permitting a small spectral region to be scanned rapidly. The second or fourth harmonic of the scan frequency is used in the data acquisition, effectively eliminating much of the low-frequency noise due to mechanical vibrations and laser power fluctuations. In addition, multipass cells are employed, thereby generating long paths for the absorption measurements.

TDL systems have been used in a number of surface-based measurement programs. One system was mounted on a ship doing a latitudinal survey in the Atlantic Ocean and monitored NO2, formaldehyde (HCHO), and H2O2. Another TDL system was located at the Mauna Loa Observatory and was used to monitor HCHO and H2O2 during a photochemistry experiment, finding much lower concentrations of both gases than models had predicted. The TDL system used to measure HCHO has been used in a number of additional ground-based measurement programs and has also been flown on an aircraft in a couple of tropospheric missions.
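The harmonic-detection scheme described above can be sketched as a digital lock-in, as below. This is a minimal illustration (the function and the synthetic signal are ours); fielded TDL instruments use analog lock-in amplifiers or carefully filtered digital demodulation.

```python
import numpy as np

def harmonic_signal(detector, t, f_mod, n=2):
    """Extract the nth-harmonic (e.g., 2f) component of a TDL signal.

    The detector record is multiplied by a reference at n times the
    modulation frequency and low-pass filtered (here, simply averaged).
    Returns in-phase and quadrature amplitudes; for optically thin
    lines the 2f magnitude is proportional to the absorption.
    """
    ref = 2.0 * np.pi * n * f_mod * t
    x = 2.0 * np.mean(detector * np.cos(ref))   # in-phase
    y = 2.0 * np.mean(detector * np.sin(ref))   # quadrature
    return x, y

# Example: weak 2f content sampled at 1 MHz with 10 kHz modulation
t = np.arange(0.0, 1e-2, 1e-6)
signal = 1.0 + 1e-4 * np.cos(2 * np.pi * 2 * 10e3 * t)
print(harmonic_signal(signal, t, f_mod=10e3))   # ~(1e-4, 0)
```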


One TDL system, called DACOM (Differential Absorption CO Measurement), has been used in a large number of NASA GTE missions. It makes measurements of CO in the 4.7 µm spectral region and CH4 in the 3.3 or 7.6 µm spectral region. This instrument is able to make measurements at a 1 Hz rate, with a precision of 0.5–2.0%, depending on the CO value, and an accuracy of ±2%. DACOM has been very useful in determining global distributions of CO, owing to the wide-ranging nature of the GTE missions. It has also contributed to the characterization and understanding of CO in air masses encountered during the GTE missions (Table 7).

A pair of TDL instruments, the Airborne Tunable Laser Absorption Spectrometer (ATLAS) and the Aircraft (ER-2) Laser Infrared Absorption Spectrometer (ALIAS), have been flown on several NASA missions to explore polar O3 chemistry and atmospheric transport. The ATLAS instrument measures N2O, and ALIAS is a 4-channel spectrometer that measures a large variety of gases, including HCl, N2O, CH4, NO2, and CO, and is currently configured to measure water isotopes across the tropopause.

A new class of lasers, tunable quantum-cascade (QC) lasers, is being added to the list of those available for in situ gas measurement systems. Using a cryogenically cooled QC laser during a series of 20 aircraft flights beginning in September 1999 and extending through March 2000, measurements were made of CH4 and N2O up to ~20 km in the stratosphere over North America, Scandinavia, and Russia, on the NASA ER-2. Compared with its companion lead salt diode lasers, which were also flown on these flights, the single-mode QC laser, cooled to 82 K, produced higher output power (10 mW), narrower laser linewidth (17 MHz), increased measurement precision (a factor of 3), and better spectral stability (<0.1 cm⁻¹ K⁻¹). The sensitivity of the QC laser channel was estimated to correspond to a minimum-detectable mixing ratio of approximately 2 ppbv of CH4.

Laser Long-Path Measurements

Laser systems can also be used in long-path measurements of gases. The most important of such programs entailed the measurement of the hydroxyl radical (OH) in the Rocky Mountains west of

Table 7 Examples of significant contributions of airborne tunable diode laser systems to the understanding of atmospheric chemistry and transport

Balloon-borne – stratosphere
• The first in situ measurements of the suite NO2, NO, O3, and the NO2 photolysis rate to test NOx (NO2 + NO) photochemistry
• The first in situ stratospheric measurements of NOx over a full diurnal cycle to test N2O5 chemistry
• In situ measurements of NO2 and HNO3 over the 20–35 km region to assess the effect of Mt. Pinatubo aerosol on heterogeneous atmospheric chemistry
• Measurements of HNO3 and HCl near 30 km
• Measurements of CH4, HNO3, and N2O for validation of several satellite instruments
• Intrusions from the midlatitude stratosphere to the tropical stratospheric reservoir

Aircraft mounted – troposphere
Carbon monoxide
• Measurement of CO from biomass burning from Asia, Africa, Canada, Central America, and South America
• Detection of thin layers of CO that were transported thousands of miles in the upper troposphere
• Observed very high levels of urban pollution in plumes off the Asian continent
• Emission indices for many gases have been calculated with respect to CO
Methane
• Determined CH4 flux over the Arctic tundra, which led to rethinking of the significance of tundra regions as a global source of CH4
• Found that biomass burning is a significant source of global CH4
• Found strong enhancements of CH4 associated with urban plumes

Aircraft mounted – stratosphere
Polar regions
• Extreme denitrification observed in the Antarctic winter vortex from a NOy:N2O correlation study
• Observed very low N2O in the Antarctic winter vortex, which helped refute the theory that the O3 hole is caused by dynamics
• Contributed to the study of transport out of the lower stratospheric Arctic vortex by Rossby wave breaking
• Measurement of concentrations of gases involved in polar stratospheric O3 destruction and production
• Activation of chlorine in the presence of sulfate aerosols
Mid-latitudes
• Measurements in aircraft exhaust plumes in the lower stratosphere, especially of reactive nitrogen species and CO
• Vertical profiles of CO in the troposphere and lower stratosphere
• Determination of the hydrochloric acid and chlorine budget of the lower stratosphere
• Measurement of NO2 for testing atmospheric photochemical models
• Trends in HCl/Cly in the stratosphere at ~21 km, 1992–1998
• Gas concentration measurements for comparison with a balloon-borne Fourier transform spectrometer observing the Sun
• Near-IR TDL laser hygrometers (ER-2, WB57, DC-8) for measuring H2O and total water in the lower stratosphere and upper troposphere
• Remnants of the Arctic winter vortex detected many months after breakup
Tropics
• Tropical entrainment time scales inferred from stratospheric N2O and CH4 observations


Boulder, Colorado. OH was measured using a XeCl excimer laser operating near 308 nm and transmitting a beam to a retroreflector 10.3 km away. The measurements were quite difficult to conduct, primarily because the OH abundance is very low (10⁵–10⁷ cm⁻³), yielding very low absorption (about 0.02% for an abundance of 10⁵ cm⁻³) over the 20.6 km path. In addition, the excimer laser could generate OH from ambient H2O and O3 unless the laser energy density was kept low. To help ensure good measurements, a white light source was also employed during the measurements to monitor H2O, O3, and other gases. The measurements were eventually very successful and led to new values for OH abundances in the atmosphere.

Laser-Induced Fluorescence (LIF)

The laser-induced fluorescence approach has been used to measure several molecular and ionic species in situ. The LIF approach has been used to measure OH and HO2 (HOx) on the ground and on aircraft platforms. One airborne LIF system uses a diode-pumped Nd:YAG-pumped, frequency-doubled dye laser to generate the required energy near 308 nm. The laser beam is sent into a White cell, where it can make 32–36 passes through the gas in the cell to increase the LIF signal strength. NO is used to convert HO2 to OH. The detection limit in 1 minute is about 2–3 ppqv (parts in 10¹⁵) above 5 km altitude, which translates into a concentration of about 4 × 10⁴ molec/cm³ at 5 km and 2 × 10⁴ molec/cm³ at 10 km altitude. One of the interesting findings from such measurements is that HOx concentrations are up to 5 times larger than model predictions based on NOx concentrations, suggesting that NOx emissions from aircraft could have a greater impact on O3 production than originally thought.

NO is detected using the LIF technique in a two-photon approach: electrons are pumped from the ground state using 226 nm radiation and from that state to an excited state using 1.06 µm radiation. The 226 nm radiation is generated by frequency doubling a dye laser to 287 nm and then mixing that with 1.1 µm radiation derived from H2 Raman shifting of frequency-mixed radiation from a dye laser and a Nd:YAG laser. From the excited level, 187–201 nm radiation is emitted. In order to measure NO2, it is first converted to the photofragment NO via pumping at 353 nm from a XeF excimer laser. One of the interesting findings from airborne measurements in the South Pacific is that there appeared to be a large missing source of NOx in the upper troposphere.

Summary

DIAL and Raman lidar systems have played important roles in studying the distribution of gases such as O3 and H2O on local, regional, and global scales, while TDL systems have played corresponding roles for such gases as CO, HCl, and oxides of nitrogen. It is anticipated that these approaches will continue to yield valuable information on these and other gases, as new and more capable systems are developed. Within the next decade, it is expected that a DIAL system will be placed in orbit to make truly global measurements of O3, H2O, and/or carbon dioxide.

See also

Imaging: Lidar. Scattering: Raman Scattering.

Further Reading

Browell EV (1994) Remote sensing of trace gases from satellites and aircraft. In: Calvert J (ed.) Chemistry of the Atmosphere: The Impact on Global Change, pp. 121–134. Cambridge, MA: Blackwell Scientific Publications.
Browell EV, Ismail S and Grant WB (1998) Differential absorption lidar (DIAL) measurements from air and space. Applied Physics B 67: 399–410.
Dabas A, Loth C and Pelon J (eds) (2001) Advances in Laser Remote Sensing. Selected papers presented at the 20th International Laser Radar Conference (ILRC), Vichy, France, 10–14 July 2000, pp. 357–360. Palaiseau, France: Editions de l'Ecole Polytechnique.
Goldsmith JEM, Bisson SE, Ferrare RA, et al. (1994) Raman lidar profiling of atmospheric water vapor: simultaneous measurements with two collocated systems. Bulletin of the American Meteorological Society 75: 975–982.
Goldsmith JEM, Blair FH, Bisson SE and Turner DD (1998) Turn-key Raman lidar for profiling atmospheric water vapor, clouds, and aerosols. Applied Optics 37: 4979–4990.
Grant WB (1995) Lidar for atmospheric and hydrospheric studies. In: Duarte F (ed.) Tunable Laser Applications, pp. 213–305. New York: Marcel Dekker.
Grant WB, Browell EV, Menzies RT, Sassen K and She C-Y (eds) (1997) Laser Applications in Remote Sensing. SPIE Milestone Series, 690 pp., 86 papers.
Grisar R, Böttner H, Tacke M and Restelli G (1992) Monitoring of Gaseous Pollutants by Tunable Diode Lasers. Proceedings of the International Symposium held in Freiburg, Germany, 17–18 October 1991, 372 pp. Dordrecht, The Netherlands: Kluwer Academic Publishers.
McDermid IS, Walsh TD, Deslis A and White M (1995) Optical systems design for a stratospheric lidar system. Applied Optics 34: 6201–6210.
Mégie G (1988) Laser measurements of atmospheric trace constituents. In: Measures RM (ed.) Laser Remote Chemical Analysis, pp. 333–408. New York: John Wiley & Sons.
Podolske JR and Loewenstein M (1993) Airborne tunable diode laser spectrometer for trace gas measurements in the lower stratosphere. Applied Optics 32: 5324–5330.
Pougatchev NS, Sachse GW, Fuelberg HE, et al. (1999) Pacific Exploratory Mission-Tropics carbon monoxide measurements in historical context. Journal of Geophysical Research 104: 26195–26207.
Sachse GW, Collins JE Jr, Hill GF, et al. (1991) Airborne tunable diode laser sensor for high-precision concentration and flux measurements of carbon dioxide and methane. Proceedings of SPIE 1443: 145–156.
Svanberg S (1994) Differential absorption lidar (DIAL). In: Sigrist MW (ed.) Air Monitoring by Spectroscopic Techniques, pp. 85–161. New York: John Wiley & Sons.
Webster CR, et al. (1994) Aircraft (ER-2) laser infrared absorption spectrometer (ALIAS) for in situ measurements of HCl, N2O, CH4, NO2, and HNO3. Applied Optics 33: 454–472.
Whiteman DN, Melfi SH and Ferrare RA (1992) Raman lidar system for the measurement of water vapor and aerosols in the Earth's atmosphere. Applied Optics 31: 3068–3082.
Zanzottera E (1990) Differential absorption lidar techniques in the determination of trace pollutants and physical parameters of the atmosphere. Critical Reviews in Analytical Chemistry 21: 279–319.

Optical Transmission and Scatter of the Atmosphere

S M Adler-Golden and A Berk, Spectral Sciences, Inc., Burlington, MA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The ability to understand and model radiative transfer (RT) processes in the atmosphere is critical for remote sensing, environmental characterization, and many other areas of scientific and practical interest. At the Earth's surface, the bulk of this radiation, which originates from the sun, is found in the ultraviolet to infrared range between around 0.3 and 4 µm. This article describes the most significant RT processes for these wavelengths, which are absorption (light attenuation along the line of sight, LOS) and elastic scattering (redirection of the light). Transmittance, T, is defined as one minus the fractional extinction (absorption plus scattering). At longer wavelengths (in the mid- and long-wave infrared) the major light source is thermal emission.

A few other light sources are mentioned here. The moon is the major source of visible light at night. Forest fires can be a significant source of mid-wave infrared radiation. Manmade light sources include continuum sources such as incandescent lamps and spectrally narrow sources such as fluorescent lamps and lasers. In general, spectrally narrow sources need a different RT treatment than continuum sources, due to the abundance of narrow spectral absorption lines in the atmosphere, as is discussed below.

The challenge of atmospheric RT modeling is essentially to solve the following equation that describes monochromatic light propagation along the LOS direction:

\[ \frac{1}{k}\frac{dI}{du} = -I + J \qquad [1] \]

where I is the LOS radiance (watts per unit area per unit wavelength per steradian), k is the extinction coefficient for the absorbing and scattering species (per unit concentration per unit length), u is the material column density (in units of concentration times length), and J is the radiance source function. I is a sum of direct (i.e., from the sun) and diffusely scattered components. The source function represents the diffuse light scattered into the LOS, and is the angular integral over all directions Ω_i of the product of the incoming radiance, I(Ω_i), and the scattering phase function, p(Ω_o, Ω_i):

\[ J(\Omega_o) = \int_{\Omega_i} p(\Omega_o, \Omega_i)\, I(\Omega_i)\, d\Omega_i \qquad [2] \]

The scattering phase function describes the probability density for incoming light from direction Ω_i scattering out into direction Ω_o, and is a function of the difference (scattering) angle θ. The direct radiance component, I_0, is described by eqn [1] with the source function omitted. Integrating along the LOS leads to the well-known Beer's Law


equation for transmittance:

\[ T = I_0 / I_0^0 = \exp(-ku) \qquad [3] \]

where \(I_0^0\) is the direct radiance at the boundary of the LOS. The quantity \(ku = \ln(1/T)\) is known as the optical depth.

Atmospheric Constituents

The atmosphere has a large number of constituents, including numerous gaseous species and suspended liquid and solid particulates. Their contributions to extinction are depicted in Figure 1. The largest category is gases, of which the most important are water vapor, carbon dioxide, and ozone. Of these, water vapor is the most variable and carbon dioxide the least, although the CO2 concentration is gradually increasing. In atmospheric RT models, carbon dioxide is frequently taken to have a fixed and altitude-independent concentration, along with other 'uniformly mixed gases' (UMGs). The concentration profiles of the three major gas species are very different. Water vapor is located mainly in the lowest 2 km of the atmosphere. Ozone has a fairly flat profile from the ground through the stratosphere (~30 km). The UMGs decline exponentially with altitude, with a scale height of around 8 km.

Gases

Gas molecules both scatter and absorb light. Rayleigh scattering by gases scales inversely with the fourth power of the wavelength, and is responsible for the sky's blue color. For typical atmospheric conditions, the optical depth for Rayleigh extinction is approximately \(0.009/\lambda^4\) per air mass (λ is in µm, and an air mass is defined by the vertical column from ground to space). The Rayleigh phase function has a \((1 + \cos^2\theta)\) dependence.

Absorption by gases may consist of a smooth spectral continuum (such as the ultraviolet and visible electronic transitions of ozone) or of discrete spectral lines, which are primarily rotation lines of molecular vibrational bands. In the lower atmosphere (below around 25 km altitude), the spectral shape of these lines is determined by collisional broadening and described by the normalized Lorentz line shape formula:

\[ f_{\mathrm{Lorentz}}(\nu) = \frac{\alpha_c/\pi}{\alpha_c^2 + (\nu - \nu_0)^2} \qquad [4] \]

Here ν is the wavenumber (in cm⁻¹) [ν = (10 000 µm/cm)/λ], ν₀ is the molecular line transition frequency, and α_c is the collision-broadened half-width (in cm⁻¹), which is proportional to pressure. The constant of proportionality, known as the pressure-broadening parameter, has a typical value on the order of 0.06 cm⁻¹ atm⁻¹ at ambient temperature. The extinction coefficient k(ν) is the product of f_Lorentz(ν) and the integrated line strength S, which is commonly in units of atm⁻¹ cm⁻².
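Eqns [3] and [4] combine directly into a monochromatic transmittance calculation. The short Python sketch below uses illustrative, not measured, parameter values.

```python
import numpy as np

def lorentz(nu, nu0, alpha_c):
    """Normalized Lorentz line shape of eqn [4] (units of cm)."""
    return (alpha_c / np.pi) / (alpha_c**2 + (nu - nu0)**2)

def transmittance(nu, nu0, S, u, alpha_c):
    """Monochromatic Beer's Law transmittance, eqn [3], with k = S*f(nu).

    Assumes a single pressure-broadened line of strength S
    (atm^-1 cm^-2) and column density u (atm cm).
    """
    k = S * lorentz(nu, nu0, alpha_c)      # extinction coefficient
    return np.exp(-k * u)

# Example: optical depth at line center for illustrative values
nu = np.linspace(999.0, 1001.0, 2001)      # wavenumber grid (cm^-1)
T = transmittance(nu, nu0=1000.0, S=10.0, u=0.05, alpha_c=0.06)
print("peak optical depth:", np.log(1.0 / T.min()))
```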

Figure 1 Spectral absorbance (1 – transmittance) for the primary sources of atmospheric extinction. The 12 nm resolution data were generated by MODTRAN for a vertical path from space with a mid-latitude winter model atmosphere.


At higher altitudes in the atmosphere the pressure is reduced sufficiently that Doppler broadening becomes competitive with collisional broadening, and the Lorentz formula becomes inaccurate. The general lineshape for combined collisional and Doppler broadening is the Voigt lineshape, which is proportional to the real part of the complex error (probability) function, w:

\[ f_{\mathrm{Voigt}}(\nu) = \frac{1}{\alpha_d\sqrt{\pi}}\, \mathrm{Re}\, w\!\left(\frac{\nu-\nu_0}{\alpha_d} + i\,\frac{\alpha_c}{\alpha_d}\right) \qquad [5] \]

Here, α_d is the Doppler 1/e half-width. Comprehensive spectral databases have been compiled of the transition frequencies, strengths, and pressure-broadened half-widths for atmospherically important molecules throughout the electromagnetic spectrum. Perhaps the most notable of these databases is HITRAN, which was developed by the US Air Force Research Laboratory and is currently maintained at the Harvard-Smithsonian Center for Astrophysics in Cambridge, MA.
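Eqn [5] maps directly onto the Faddeeva function w(z) available in scientific libraries; the sketch below uses SciPy's wofz, with illustrative half-widths.

```python
import numpy as np
from scipy.special import wofz   # complex error (Faddeeva) function w(z)

def voigt(nu, nu0, alpha_c, alpha_d):
    """Voigt line shape of eqn [5] via the complex error function.

    Evaluates Re[w(z)] at z = ((nu - nu0) + i*alpha_c)/alpha_d,
    normalized by alpha_d*sqrt(pi).  In the limits alpha_c >> alpha_d
    or alpha_d >> alpha_c this reduces to the Lorentz or Doppler shapes.
    """
    z = ((nu - nu0) + 1j * alpha_c) / alpha_d
    return np.real(wofz(z)) / (alpha_d * np.sqrt(np.pi))

# Example: normalization check for comparable Lorentz and Doppler widths
nu = np.linspace(-1.0, 1.0, 4001)
f = voigt(nu, 0.0, alpha_c=0.05, alpha_d=0.05)
print("area:", np.trapz(f, nu))   # ~1, apart from truncated Lorentz wings
```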

Liquids and Solids

The larger particulates in the atmosphere (greater than a few µm in radius) typically belong to clouds. Low-altitude clouds consist of nearly spherical water droplets, while high-altitude cirrus clouds are mainly a collection of ice crystals. Other large particulates include sand dust. Their light scattering is close to the geometric limit at visible and ultraviolet wavelengths. This means that the extinction is nearly wavelength-independent, and the scattering phase function and single-scattering albedo may be reasonably modeled with ray-tracing techniques that account for the detailed size and shape distributions of the particles. However, Mie scattering theory is typically used to calculate cloud optical properties because it is exact for spherical particles of any size.

The smaller particulates in the atmosphere belong to aerosols, which are very fine liquid particles, and dusts, which are solids such as minerals and soot. These particulates are concentrated mainly in the lower 2 km or so of the atmosphere; however, they are also present at higher altitudes in smaller concentrations. Their optical properties are typically modeled using Mie theory. The wavelength dependence of the scattering is approximately inversely proportional to a low power of the wavelength, typically between 1 and 2, as befits particulates intermediate in size between the molecular and geometric limits. The scattering phase functions have a strong forward-scattering peak; values of the asymmetry parameter g (the average value of cos θ) typically range from 0.6 to 0.8 at solar wavelengths.

Solution Methods

Geometry

Atmospheric properties are primarily a function of altitude, which determines pressure, temperature, and species concentration profiles. Accordingly, most RT methods define a stratified atmosphere. The most accurate treatments of transmission and scattering account for the spherical shape of the layers and for refraction; however, most RT models use a plane-parallel approximation for at least some computations, such as multiple scattering.

Spectral Resolution

Optical instruments have finite, and frequently broad, wavelength responses. Nevertheless, modeling their signals requires accounting for the variation of absorption on an extremely fine wavelength scale, smaller than the widths of the molecular lines.

'Exact' monochromatic methods

The most accurate RT solution method involves explicitly solving the RT problem for a very large number of monochromatic wavelengths. This line-by-line method is used in a number of RT models, such as FASCODE. It allows Beer's law to be applied to combine transmittances from multiple LOS segments, and provides an unambiguous definition of the optical parameters. It is suitable for use with spectrally structured light sources, such as lasers. The one major drawback of this method is that it is computationally intensive, and therefore may not be practical for problems where large wavelength ranges, multiple LOS views, and multiple atmospheric conditions need to be treated. To alleviate the computational burden of monochromatic calculations, some approximate methods have been developed that model RT in finite spectral intervals, as described below.

Statistical band models

The band model method represents spectral lines in a narrow interval, Δν, statistically, using such parameters as the total line strength, the mean pressure-broadening parameter, and the effective number of lines in the interval. An example of a popular band model-based RT algorithm is MODTRAN, which is described below. A key to the success of band models is the availability of approximate analytical formulas for the integrated absorption for an individual molecular transition of strength S, known as the single-line total


equivalent width, W_sl:

\[ W_{sl} = \int_{-\infty}^{\infty} \left[ 1 - \exp(-S\,u\,f_\nu) \right] d\nu \qquad [6] \]

In the optically thin (small absorption) limit, W_sl is proportional to the molecular species column density, while in the optically thick (large absorption) limit it scales as the square root of the column density. A further assumption made by MODTRAN is that the line centers are randomly located within the interval, i.e., spectrally uncorrelated. With this assumption, the net transmittance can be expressed as the product of the transmittances for each individual line, whether the line belongs to the same molecular species or a different species.

A particular challenge in band models is treatment of the inhomogeneous path problem – that is, the variations in path properties along a LOS and their effect on the statistical line parameters, which arise primarily from differences in pressure and hence line width. The Curtis–Godson path-averaging method provides a reasonable way to define 'equivalent' homogeneous path parameters for the band model. Another challenge is to define an effective extinction optical depth for each layer in order to solve the RT problem with scattering. One option in MODTRAN is to define it by computing the cumulative transmittance through successive layers in a vertical path.

Correlated-k model

Another well-known approximate RT algorithm for spectral intervals is the correlated-k method. This method starts with an 'exact' line-by-line calculation of extinction coefficients (k's) within the interval on a fine spectral grid, from which a k-distribution (vs. cumulative probability) is computed. A database of k values and probabilities summing to 1 is built from these k-distributions for a grid of atmospheric pressures and temperatures, and for all species contributing to the spectral interval. Inhomogeneous paths are handled by recognizing that the size order of the k's is virtually independent of pressure and typically only weakly dependent on temperature. LOS transmittances, LOS radiances, and fluxes are calculated by interpolating the database over temperature and pressure to define the k's for each LOS segment, solving the monochromatic RT equation at each fixed distribution location, and finally integrating over the distribution. The correlated-k method has been found to be quite accurate for atmospheric paths containing a single molecular species; however, corrections must be applied for spectral intervals containing multiple species.
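The core of the correlated-k idea, for a single homogeneous path and a single species, can be sketched in a few lines: sort the fine-grid extinction coefficients into a k-distribution and integrate over cumulative probability g with a short quadrature. The Python code below is a minimal illustration (the names are ours); the pressure/temperature database interpolation and multiple-species corrections described above are omitted.

```python
import numpy as np

def band_transmittance_ck(k_grid, u, n_g=16):
    """Band-averaged transmittance from a k-distribution.

    k_grid : line-by-line extinction coefficients on a fine spectral
             grid within one interval.
    u      : column density along a homogeneous path.
    Sorting k_grid gives k(g) versus cumulative probability g, so the
    band transmittance becomes a short sum of Beer's Law terms instead
    of an integral over thousands of spectral points.
    """
    k_sorted = np.sort(np.asarray(k_grid, dtype=float))
    g = (np.arange(k_sorted.size) + 0.5) / k_sorted.size
    g_quad = (np.arange(n_g) + 0.5) / n_g       # quadrature points in g
    k_g = np.interp(g_quad, g, k_sorted)        # k at each g point
    return np.mean(np.exp(-k_g * u))

# Example: synthetic interval containing a few random Lorentz lines
nu = np.linspace(0.0, 1.0, 5000)
k = sum(0.1 * (0.01 / np.pi) / (0.01**2 + (nu - c)**2)
        for c in (0.2, 0.45, 0.8))
print(band_transmittance_ck(k, u=5.0))   # compare np.mean(np.exp(-k*5.0))
```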

Scattering Methods

When the diffuse light field is of interest, scattering methods are used to calculate the source function J of eqns [1] and [2].

Single scattering

If scattering is weak, the approximation may be made that the solar radiation scatters only once. Thus the integral over the scattering phase function, eqn [2], is straightforwardly calculated using the direct radiance component, which is given by eqn [3]. The neglect of multiple scattering (i.e., the diffuse contribution to the source function) means that the diffuse radiance is underestimated; however, single scattering is sufficiently accurate for some atmospheric problems in clear weather and at infrared wavelengths.

Multiple scattering

A number of different methods have been developed to solve the multiple scattering problem. Two-stream methods, which are the simplest and fastest, resolve the radiance into upward and downward directions. These methods generally produce reasonably accurate values of hemispherically averaged radiance, which are also referred to as horizontal fluxes or irradiances. A much more accurate approach to the multiple scattering problem is the method of discrete ordinates. It involves expansion of the radiation field, the phase function, and the surface reflectance as a series of spherical harmonics, leading to a system of linear integral equations. Evaluation of the integrals by Gaussian quadrature leads to a solvable system of linear differential equations. An important approximation called delta-M speeds up convergence of the discrete ordinates method, especially when scattering phase functions are strongly forward peaked, by representing the phase function as the sum of a forward-direction δ-function and a remainder term. For most scattering problems, the solution is converged upon with a modest number (~8 to 16) of quadrature points (streams).

A very different type of multiple scattering technique, called the Monte Carlo method, is based on randomly sampling a large number of computer-simulated 'photons' as they travel through the atmosphere and are absorbed and scattered. The basic idea here is that sensor radiance can be expressed as a multiple path integral over the local source terms, and Monte Carlo methods solve integrals by sampling the integrand. The major advantage of this method is that it is flexible enough to allow for all of the complexity of a realistic atmosphere, often neglected by other methods. The major drawback is its


computational burden, as a large number of photons is required for reasonable convergence; the Gaussian error in the calculation declines with the square root of the number of photons. The convergence problem is moderated to a large extent by using the physics of the problem being solved to bias the selection of photon paths toward those trajectories which contribute most; mathematically, this is equivalent to requiring that the integrand be sampled most often where its contributions are most significant.
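A stripped-down Monte Carlo calculation makes the sampling idea concrete. The sketch below assumes a single homogeneous plane-parallel layer with isotropic scattering and no path biasing, so it illustrates only the baseline method whose convergence the biasing techniques described above are designed to accelerate; all names and values are ours.

```python
import numpy as np

def mc_slab_transmission(tau, omega0, n_photons=20_000, seed=1):
    """Monte Carlo estimate of total transmission through a uniform slab.

    Photons enter a plane-parallel layer of vertical optical depth tau
    with single-scattering albedo omega0.  Free paths are sampled from
    the Beer's Law distribution; scattering is taken as isotropic.
    """
    rng = np.random.default_rng(seed)
    transmitted = 0
    for _ in range(n_photons):
        t, mu = 0.0, 1.0                 # optical-depth position, direction cosine
        while True:
            t += mu * (-np.log(rng.random()))   # sample a free path
            if t >= tau:
                transmitted += 1                # exits bottom of slab
                break
            if t < 0.0 or rng.random() > omega0:
                break                           # escaped top or absorbed
            mu = 2.0 * rng.random() - 1.0       # isotropic scattering
    return transmitted / n_photons

print(mc_slab_transmission(tau=1.0, omega0=0.9))
```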

An Example Atmospheric RT Model: MODTRAN

MODTRAN, developed collaboratively by the Air Force Research Laboratory and Spectral Sciences, Inc., is the most widely used atmospheric radiation transport model. It defines the atmosphere using stratified layering and computes transmittances, radiances, and fluxes using a moderate-spectral-resolution band model with IR through UV coverage. The width of the standard spectral interval, or bin, in MODTRAN is 1 cm⁻¹. At this resolution, spectral correlation among extinction sources is well characterized as random. Thus, the total transmittance from absorption and scattering of atmospheric particulates and molecular gases is computed as the product of the individual components. Rayleigh, aerosol, and cloud extinction are all spectrally slowly varying and well represented by Beer's Law absorption and scattering coefficients on a 1 cm⁻¹ grid. Calculation of molecular absorption is more complex because of the inherent spectral structure and the large number of molecular transitions contributing to individual spectral bins. As illustrated in Figure 2, MODTRAN partitions the

Figure 2 Components of molecular absorption. The line center, line tail, and continuum contributions to the total absorption are illustrated for the central 1 cm⁻¹ spectral bin.

spectral bin molecular attenuation into three components:

• line center absorption, from molecular transitions centered within the spectral bin;
• line tail absorption, from the tails of molecular lines centered outside of the spectral bin but within 25 cm⁻¹; and
• H2O and CO2 continuum absorption, from distant (>25 cm⁻¹) lines.

Within the terrestrial atmosphere, only H2O and CO2 have sufficient concentrations and line densities to warrant inclusion of continuum contributions. These absorption features are relatively flat and accurately modeled using 5 cm⁻¹ spectral resolution Beer's Law absorption coefficients. Spectral bin contributions from neighboring line tails drop off, often rapidly, from their spectral bin edge values, but the spectral curves are typically simple, containing at most a single local minimum. For MODTRAN, these spectral contributions are pre-computed for a grid of temperature and pressure values, and fit with Padé approximants, specifically the ratio of quadratic polynomials in wavenumber. These fits are extremely accurate and enable line tail contributions to be computed on an arbitrarily fine grid. MODTRAN generally computes this absorption at a resolution equal to one-quarter the spectral bin width, i.e., 0.25 cm⁻¹ for the 1.0 cm⁻¹ band model.

The most basic ansatz of the MODTRAN band model is the stipulation that molecular line center absorption can be approximated by the absorption of n identical Voigt lines randomly distributed within the band model spectral interval, Δν. Early in the development of RT theory, Plass derived the expression for the transmittance from these n randomly distributed lines:

\[ T = \left( 1 - \frac{W'_{sl}}{\Delta\nu} \right)^{n} \qquad [7] \]

MODTRAN's evolution has resulted in a fine-tuning of the methodology used to define both the effective line number n and the in-band single-line equivalent width W′_sl. The effective line number is initially estimated from a relationship developed by Goody, in which lines are weighted according to the square root of their strength, but MODTRAN combines nearly degenerate transitions into single lines because these multiplets violate the random distribution assumption. The initial effective line number values are refined to insure a match with higher resolution transmittance predictions degraded to the band model resolution.
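Eqn [7] is simple to evaluate; the sketch below (with illustrative numbers of our choosing) also shows how, for many weak lines at fixed total absorption, it approaches the exponential form expected from Beer's Law statistics. MODTRAN's actual determination of n and W′_sl is, as described above, far more elaborate.

```python
import numpy as np

def plass_transmittance(n_lines, w_inband, delta_nu=1.0):
    """Band-model transmittance from eqn [7].

    n_lines  : effective number of identical lines randomly placed in a
               spectral bin of width delta_nu (cm^-1).
    w_inband : in-band single-line equivalent width W'_sl (cm^-1).
    For large n with n*W'_sl fixed this approaches
    exp(-n * W'_sl / delta_nu).
    """
    return (1.0 - w_inband / delta_nu) ** n_lines

# Example: eqn [7] versus its exponential limit
for n in (1, 4, 16):
    w = 0.4 / n                       # keep total absorption fixed
    print(n, plass_transmittance(n, w), np.exp(-0.4))
```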


The in-band equivalent width is computed for an off-centered Voigt line of strength S; Lorentz and Doppler half-widths are determined as strength-weighted averages. The off-center distance is fixed to insure that the weak-line Lorentz equivalent width exactly equals the random line center value.

MODTRAN scattering calculations are optimally performed using the DISORT discrete ordinates algorithm developed by Stamnes and co-workers. Methods for computing multiple scattering such as DISORT require additive optical depths, i.e., Beer's Law transmittances. Since the in-band molecular transmittances of MODTRAN do not satisfy Beer's Law, MODTRAN includes a correlated-k algorithm option. The basic band model ansatz is re-invoked to efficiently determine k-distributions; tables of k-data are pre-computed as a function of Lorentz and Doppler half-widths and effective line number, assuming spectral intervals contain n randomly distributed identical molecular lines. Thus, MODTRAN k-distributions are statistical, dependent only on band model parameter values, not on the exact distribution of absorption coefficients in each spectral interval.

Applications

Among the many applications of atmospheric transmission and scattering calculations, we briefly describe two complementary ones in the area of remote sensing, which illustrate many of the RT features discussed earlier as well as current optical technologies and problems of interest.

Earth Surface Viewing

The first example is Earth surface viewing from aircraft or spacecraft with spectral imaging sensors. These include hyperspectral sensors, such as AVIRIS, which typically have a hundred or more contiguous spectral channels, and multispectral sensors, such as Landsat, which typically have between three and a few tens of channels. These instruments are frequently used to characterize the surface terrain, materials, and properties for such applications as mineral prospecting, environmental monitoring, precision agriculture, and military uses. In addition, they are sensitive to properties of the atmosphere such as aerosol optical depth and column water vapor. Indeed, in order to characterize the surface spectral reflectance, it is necessary to characterize and remove the extinction and scattering effects of the atmosphere.

Figure 3 shows an example of data collected by the AVIRIS sensor at ~3 km altitude over thick vegetation. The apparent reflectance spectrum is the observed radiance divided by the Top-of-Atmosphere horizontal solar flux. Atmospheric absorption by water vapor, oxygen, and carbon dioxide is evident, as well as Rayleigh and aerosol scattering. Figure 3 also shows the surface reflectance spectrum inferred by modeling and then removing these atmospheric effects (this process is known as atmospheric removal, compensation, or correction). It has the characteristic smooth shape expected of vegetation, with strong chlorophyll absorption in the visible and water bands at longer wavelengths. A detailed analysis of such a spectrum may yield information on the vegetation type and its area coverage and health.

Figure 3 Vegetation spectrum viewed from 3 km altitude before (apparent reflectance) and after (reflectance) removal of atmospheric scatter and absorption.

Sun and Sky Viewing

The second remote sensing example is sun and sky viewing from the Earth's surface with a spectral radiometer, which can yield information on the aerosol content and optical properties as well as estimates of column concentrations of water vapor, ozone, and other gases. Figure 4 shows data from a Yankee Environmental Systems, Inc. multi-filter rotating shadow-band radiometer, which measures both 'direct flux' (the direct solar flux divided by the cosine of the zenith angle) and diffuse (sky) flux in narrow wavelength bands. The plot of ln(direct signal) versus the air mass ratio is called a Langley plot, and is linear for most of the bands, illustrating Beer's Law. The extinction coefficients (slopes) vary with wavelength consistent with a combination of Mie and Rayleigh scattering. The water-absorbing 940 nm band has the lowest values and a curved


Figure 4 Direct solar flux versus air mass at the surface measured by a Yankee Environmental Systems, Inc. multi-filter rotating shadow-band radiometer at different wavelengths (415, 500, 610, 665, 862, and 940 nm).

plot, in accordance with the square-root dependence of the equivalent width for optically thick lines. The diffuse fluxes, which arise from aerosol and Rayleigh scattering, have a very different dependence on air mass than the direct flux. In particular, the diffuse contributions often increase with air mass for high sun conditions. The ratio of the diffuse and direct fluxes is related to the total single-scattering albedo, which is defined as the ratio of the total scattering coefficient to the extinction coefficient. Results from a number of measurements showing lower than expected diffuse-to-direct ratios have suggested the presence of black carbon or some other continuum absorber in the atmosphere, which would have a significant impact on radiative energy balance at the Earth’s surface and in the atmospheric boundary layer.
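The Langley analysis described above amounts to a linear fit of ln(signal) against air mass. The following Python sketch (with synthetic data and illustrative names) recovers the vertical optical depth from the slope and the extraterrestrial calibration signal from the intercept.

```python
import numpy as np

def langley_fit(air_mass, direct_signal):
    """Fit a Langley plot to retrieve optical depth and calibration.

    For a stable atmosphere, Beer's Law gives ln V = ln V0 - tau*m,
    where m is the air mass ratio.  A linear fit of ln(signal) versus
    m yields the total vertical optical depth (-slope) and the
    extraterrestrial signal V0 (from the intercept).
    """
    slope, intercept = np.polyfit(air_mass, np.log(direct_signal), 1)
    return -slope, np.exp(intercept)

# Example: synthetic morning scan at 500 nm with tau = 0.35
m = np.linspace(1.0, 6.0, 25)
noise = 1.0 + 0.005 * np.random.default_rng(0).standard_normal(25)
v = 1.9 * np.exp(-0.35 * m) * noise
tau, v0 = langley_fit(m, v)
print(f"tau = {tau:.3f}, V0 = {v0:.3f}")
```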

List of Units and Nomenclature

f_ν: line shape function [cm]
f_Lorentz(ν): Lorentz line shape function [cm]
f_Voigt(ν): Voigt line shape function [cm]
g: scattering asymmetry parameter
I: direct plus diffuse radiance (W cm⁻¹ sr⁻¹ or W cm⁻² µm⁻¹ sr⁻¹)
I₀: direct radiance (W cm⁻¹ sr⁻¹ or W cm⁻² µm⁻¹ sr⁻¹)
I₀⁰: direct radiance at the LOS boundary (W cm⁻¹ sr⁻¹ or W cm⁻² µm⁻¹ sr⁻¹)
J: source function (W cm⁻¹ sr⁻¹ or W cm⁻² µm⁻¹ sr⁻¹)
k: extinction coefficient (cm⁻¹ atm⁻¹)
n: effective number of lines in bin
p: scattering phase function (sr⁻¹)
S: line strength (cm⁻² atm⁻¹)
T: transmittance
u: column density (atm cm)
w: complex error (probability) function
W_sl: single-line total equivalent width (cm⁻¹)
W′_sl: single-line in-band equivalent width (cm⁻¹)
α_c: collision-broadened half-width at half maximum (cm⁻¹)
α_d: Doppler half-width at 1/e (cm⁻¹)
Δν: spectral interval (bin) width (cm⁻¹)
θ: scattering angle (radian)
λ: wavelength (µm)
ν: wavenumber (cm⁻¹)
ν₀: molecular line transition frequency
Ω_i: direction of incoming light (sr)
Ω_o: direction of outgoing light (sr)


See also

Environmental Measurements: Laser Detection of Atmospheric Gases. Instrumentation: Spectrometers. Scattering: Scattering from Surfaces and Thin Films.

Further Reading

Berk A, Anderson GP, Acharya PK, Bernstein LS, Shettle EP, Adler-Golden SM, Lee J and Muratov L (2004) MODTRAN5 User's Manual. Air Force Research Laboratory, Hanscom AFB, MA, in press.
Chandrasekhar S (1960) Radiative Transfer. New York: Dover Publications. Originally published by Oxford University Press, London (1950).
Fenn RW, Clough SA, Gallery WO, Good RE, Kneizys FX, Mill JD, Rothman LS, Shettle EP and Volz FE (1985) Optical and infrared properties of the atmosphere. In: Jursa AS (ed.) Handbook of Geophysics and the Space Environment, chap. 18. US Air Force Geophysics Laboratory, Hanscom AFB, MA, AFGL-TR-85-0315; available from NTIS, ADA 167000.
Goody RM and Yung YL (1989) Atmospheric Radiation, Theoretical Basis, 2nd edn. New York: Oxford University Press.
HITRAN Special Issue (2003) Journal of Quantitative Spectroscopy and Radiative Transfer 82(1–4), 15 November–15 December.
Houghton J (2001) The Physics of Atmospheres, 3rd edn. Cambridge, UK: Cambridge University Press.
Killinger DK, Churnside JH and Rothman LS (1994) Atmospheric optics. In: Handbook of Optics, 2nd edn, 2-volume set, chap. 44. New York: McGraw-Hill.
Liou KN (1997) An Introduction to Atmospheric Radiation, 2nd edn. London: Academic Press.
Smith FG (ed.) (1993) The Infrared and Electro-Optical Systems Handbook, vol. 2, Atmospheric Propagation of Radiation. Ann Arbor: Environmental Research Institute of Michigan.
Thomas GE and Stamnes K (1999) Radiative Transfer in the Atmosphere and Ocean. Cambridge, UK: Cambridge University Press.
Van de Hulst HC (1981) Light Scattering by Small Particles. New York: Dover Publications. Originally published in 1957, New York: John Wiley & Sons.

F

FIBER AND GUIDED WAVE OPTICS

Contents

Overview
Dispersion
Fabrication of Optical Fiber
Light Propagation
Measuring Fiber Characteristics
Nonlinear Effects (Basics)
Nonlinear Optics
Optical Fiber Cables
Passive Optical Components

Overview

A Mickelson, University of Colorado, Boulder, CO, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

It is said that some in ancient Greece knew that light could be guided inside slabs of transparent material. From this purported little-known curiosity in the ancient world, guided wave optics has grown to be the technology of the physical layer of the systems which transfer most of the information about the world; the waveguide of the worldwide telecommunications network is the optical fiber. The optical fiber itself is but a passive waveguide, and guided wave optics is the technology which includes all of the passive and active components necessary to prepare optical signals for transmission, regenerate optical signals during transmission, route optical signals through systems, code information onto optical carriers, and decode the information from optical carriers back into more conventional forms. In this article, some introduction to this rather encompassing topic will be given.

This article will be separated into four parts. In the first section, discussion will be given to fiber optics, that is, the properties of the light guided in optical waveguides which allow the light to be guided in distinctly nonrectilinear paths over terrestrial distances. The second section will then turn to the components which can be used along with the fiber optical waveguides in order to form useful systems. These components include sources and detectors as well as optical amplifiers. In the third section, we will discuss the telecommunications network which has arisen due to the availability of fiber optics and fiber optic compatible components. The closing section will discuss integrated optics, the field of endeavor which has such great promise to form the future of optical technology.

Fiber Optics

Fiber optics is a term which generally refers to a technology in which light (actually infrared, visible, or ultraviolet radiation) is transmitted through the transparent cores of small threads of composite material. These threads, or fibers as they are called, when surrounded by a cladding material and coated with polymer for environmental protection (see Figure 1), can be coiled like conventional wire and


Figure 1 A depiction of the structure of an optical fiber. The innermost cylinder is the core region, in which the majority of the light is confined. The next concentric region is the cladding, which is still made of a pure material but one of a lower index of refraction than the innermost core region, such that the field of the lightwave decays exponentially with extension into this region. The outermost region is a coating which protects the fused silica of the core and cladding from environmental contaminants such as water. For telecommunications fibers, the core is comprised of fused silica doped with germanium and typically has an index of refraction of 1.45, whereas the cladding differs from this index by only about 1%. The coating is often a polyimide plastic which has a higher index (perhaps 1.6), but no guided light should see the cladding–coating interface.

when cabled can resemble (a very lightweight flexible version of) conventional transmission wire. Although most of the optical fiber in use today is fabricated by a process of gas phase chemical deposition of fused silica doped with various other trace chemicals, fiber can be made from a number of different material systems and in a number of different configurations for use with various types of sources. In what follows in this opening section, we will limit discussion to the basic properties of the light guided by the fiber and leave more technological discussion to following sections. There are two complementary mathematical descriptions of the propagation of light in an optical waveguide. In the ray description, light incident on a fiber endface is considered to be made up of a bundle of rays. In a uniform homogeneous medium, each ray is like an arrow that exhibits rectilinear propagation from its source to its next interface with a dissimilar material. These rays satisfy Snell’s laws of reflection and refraction at interfaces between materials with dissimilar optical properties that they encounter along their propagation path. That is, at an interface, a fraction of the light is reflected backwards at an angle equal to the incident angle and a portion of the light is transmitted in a direction which is more directed toward the normal to the interface when the index increases across the

Figure 2 Schematic depiction of the geometrical optics interpretation of the coupling of light into an optical fiber. Here a ray that is incident on the center of the fiber core at an upward angle is first refracted at the fused silica– air interface into the fused silica core. The ray is then totally internally reflected at the core cladding interface such that the ray power remains in the core. The ray picture of light coupling is applicable to single rays coupling to a multimode fiber. When there are multiple rays, interference between these rays must be taken into account which is difficult to do in the ray picture. When there is only one (diffraction limited) mode in the fiber, the interference between a congruence of (spatially coherent) incident rays is necessary to describe the coupling. This description is more easily effected by a quasi-monochromatic mode picture of the propagation.

boundary and is directed more away from the normal when the index decreases. At the input endface of a waveguide, a portion of the energy guided by each ray is refracted due to the change in refractive index at the guide surface and then exhibits a more interesting path within the fiber. In a step index optical fiber, where the index of refraction is uniformly higher in the fiber core than in a surrounding cladding, the rays will propagate along straight paths until encountering the core cladding interface. Guided rays (see Figure 2) are totally internally reflected back into the fiber core to again be totally internally reflected at the next core cladding interface and so on. Radiating rays (see Figure 3) will be only partially reflected at the core cladding interface and will rapidly die out in propagating down the fiber when it is taken into account that typical distances between successive encounters with the boundary may be sub-millimeter and total propagation distances may be many kilometers. In graded index fibers, the refractive index within the fiber core varies continuously from a maximum somewhere within the core to a minimum at which the core ends and attaches continuously to a cladding. The ray paths within such fibers are curved and the guided rays are characterized by the fact that once in the fiber they never encounter the cladding. Radiating rays encounter the cladding and are refracted out of the fiber. This description of fiber propagation is quite simple and pleasing but does not take into account that each ray actually is carrying a clock that remembers how long it has been following its given path. When two rays come together, they can


Figure 3 Schematic depiction of the existence of a cut-off angle for coupling into an optical fiber. A ray incident from air on the center of the fused silica fiber core is refracted at the air–fused silica interface and then split when it comes to the core–cladding interface within the fiber. Were the refracted angle of this ray zero (along the core–cladding interface), we would say the ray is totally internally reflected and no power escapes. As it is, there is refracted power into the cladding, and the ray will rapidly attenuate as it propagates along the fiber. Although most telecommunications fiber in this day and age is graded index (the rays will curve rather than travel along straight paths), a typical repeat period in such a fiber is a millimeter, whereas propagation distances may be 100 km. With a loss of even only 0.1% at each reflection from the cladding, the light at cut-off will be severely attenuated within one meter, to say nothing of one kilometer.

either add or subtract, depending on the reading on their respective clocks. When the light is nearly monochromatic, the clocks' readings are simply a measure of the phase of the ray when it was radiated from its source, and the interference between rays can be quite strong. This interference leads to conditions which only allow certain ray paths to be propagated. When the light has a randomly varying phase in both space and time (and is therefore polychromatic), the interference all but disappears. But the condition that allows coupling to a specific ray path (mode) is also obliterated by the phase randomness, and coupling to a single-mode guide, for example, becomes inefficient. When we need to preserve phase relations in order to preserve information while it propagates along a fiber path, we need to use single modes excited by nearly transform-limited sources, that is, sources whose time variation exhibits well-defined (nonrandom) phase variation. As a monochromatic wave carries no information, we will require a source which can be modulated. When modulation can be impressed on a carrier without loss of phase information of either the information or the carrier, information can be propagated over longer distances than when the frequency spectrum of the carrier is broadened by noise generated during the modulation. We refer to such a source that can be modulated without broadening as a coherent or nearly coherent source. We cannot easily adapt the ray picture to the description of propagation of such coherent radiation in an optical fiber. Instead we must resort to using the time harmonic form of Maxwell's equations,

the equations which describe electromagnetic phenomena, whereas rays can be described by a simplification of the time-dependent form of these equations. In the time harmonic approach, we assume that the source is monochromatic and then solve for a set of modes of the fiber at the assumed frequency of the source. These modes have a well-defined phase progression as a function of carrier frequency and possess a given shape in the plane transverse to the direction of propagation. Information can be included in the propagation by assuming quasi-monochromatic variation of the source; that is, one assumes that the source retains the phase relations between the various modulated frequency components even while its amplitude and phase are being varied externally. When one assumes that time harmonically (monochromatically) varying fields are propagating along the fiber axis, one obtains solutions of Maxwell's equations in the form of a summation of modes. These are guided modes, ones that propagate down the fiber axis without attenuation. There are also radiation modes that never couple into a propagating mode in the fiber. These modes are analogous to the types of rays we see in the ray description of fiber propagation. There is also another set of modes, called evanescent modes, which show up at discontinuities in the fiber or at junctions between the fiber and other components. In the modal picture of fiber propagation, each of these modes is given an independent complex coefficient (that is, a coefficient with both amplitude and phase). These coefficients are determined first by any sources in the problem and then must be recalculated at each discontinuity plane along the propagation path in the fiber. When the source(s) in the problem is not transform limited, the phases of the coefficients become smeared out and the coupling problems at discontinuities take on radically different solutions. In some limit, these randomized solutions must appear as the ray solutions.

Optical fibers are generally characterized by their numerical aperture (NA) as well as by a number which characterizes their transverse dimension. The numerical aperture is essentially the sine of the maximum angle into which the guide will radiate into free space. When the guide is multimoded, the transverse dimension is generally given as a diameter. When a guide is single-moded, the transverse dimension is generally given as a spotsize, that is, by a measure of the size of a unity magnification image of the fiber endface. Multimode guides can be excited by even poorly coherent sources so long as they radiate into the NA, or capture angle, at the input of the guide. Single-mode fibers require an excitation which matches the shape and size of their fundamental mode.
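For a step-index fiber, the NA follows from applying Snell's law at the endface together with the total internal reflection condition described earlier: NA = (n_core² − n_clad²)^1/2. The sketch below uses index values consistent with the roughly 1% core–cladding index difference mentioned in the caption of Figure 1; the function name and numbers are illustrative, not those of any particular fiber.

```python
import numpy as np

def fiber_na(n_core, n_clad):
    """Numerical aperture and acceptance half-angle of a step-index fiber.

    Total internal reflection at the core-cladding interface translates,
    through Snell's law at the air-core endface, into
    NA = sqrt(n_core^2 - n_clad^2); the NA is the sine of the maximum
    acceptance (or radiation) angle in air.
    """
    na = np.sqrt(n_core**2 - n_clad**2)
    theta_max = np.degrees(np.arcsin(na))
    return na, theta_max

# Example: telecommunications-like fiber, core index 1.45, ~1% index step
na, theta = fiber_na(1.45, 1.4355)
print(f"NA = {na:.3f}, acceptance half-angle = {theta:.1f} deg")
```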


Active Fiber Compatible Components

Our discussion of propagation in fiber optic waveguides would be a rather sterile one if there were not a number of fiber compatible components, able to generate, amplify, and detect light streams, available to us with which to construct fiber optic systems. In this section, we will try to discuss some of the theory of operation of active optical components, in order to help elucidate some of the general characteristics of such components and the conditions that the inclusion of such components imposes on the form that a fiber system must take.

In a passive component such as an optical fiber, we need only consider the characteristics of the light that is guided. Operation of active components requires that the optical fields interact with a medium in such a manner that energy can be transferred from the field, through an excitation of the medium, to the controlling electrical stream and/or vice versa. Generally one wants the exchange to go only one way, that is, from field to electrical stream or from electrical stream to field. In a detector this is not difficult to achieve, whereas in lasers and amplifiers it is quite hard to eliminate the back reaction of the field on the device. Whereas a guiding medium can be considered as a passive homogeneous continuum, the active medium has internal degrees of freedom of its own, as well as a granularity associated with the distribution of the microscopic active elements. At the least, we must consider the active medium as having an active index of refraction with a mind of its own (a set of governing differential equations, at any rate). If one also wants to consider noise characteristics, one needs to consider the lattice of active elements which convert the energy.

The wavelength of operation of an active medium is determined by the energy spacing between the upper and lower levels of a transition, which is in turn determined by the microscopic structure of the grains, or quanta, that make up the medium. In a semiconductor, we consider these active elements to be so uniformly distributed and ideally spaced that we can consider the quanta (electron-hole pairs) to be delocalized but numerous and labeled by their momentum vectors. In atomic media, such as the rare earth doped optical fibers which serve as optical amplifiers, we must consider the individual atoms as the players. In the case of the semiconductor, a current flowing in an external circuit controls the population of the upper (electron) and lower (hole) levels of each given momentum state within the semiconductor. When the momentum states are highly populated with electrons and holes, there is a flow of energy to the field (at the transition wavelength), and when the states are depopulated,

the field will tend to be absorbed, populating the momentum states by giving up its energy to the medium. The situation is similar with the atomic medium, except that the pump is generally optical (rather than electrical) and at a different wavelength than the wavelength of the field to be amplified. That is to say, one needs to use a minimum of three levels of the localized atoms in the medium in order to carry out an amplification scheme.

The composition of a semiconductor determines its bandgap, that is, the minimum energy difference between the electron and hole states of a given momentum value. A source, be it a light emitting diode (LED) or a laser diode (see Figure 4), will emit light at a wavelength which corresponds to an energy slightly above the minimum gap energy (the wavelength equals the velocity of light times Planck's constant divided by the energy, λ = hc/E), whereas a detector can detect almost any energy above the bandedge. Only semiconductors whose bandgaps exhibit a minimum energy at zero momentum transfer can be made to emit light strongly. Silicon does not exhibit a direct gap, and although it can be used as a detector, it cannot be used as a source material. The silicon laser is and has been the 'Holy Grail' of electronics because of the ubiquity of silicon electronics. To even believe that such a 'Holy Grail' exists requires a leap of faith. Weak luminescence has been

Figure 4 A schematic depiction of the workings of a semiconductor laser light source. The source is fabricated as a diode, and normal operation of this diode is in forward bias, that is, the p-side of the junction is biased positively with respect to ground which is attached to the n-side of the junction. With the p-side positively biased, current should freely flow through the junction. This is not quite true as there is a heterojunction region between the p- and n-regions in which electrons from the n-side may recombine with holes from the p-side. This recombination gives off a photon which is radiated. In light emitting diodes (LEDs), the photon simply leaves the light source as a spontaneously emitted photon. In a laser diode, the heterojunction layer serves as a waveguide and the endfaces of the laser as mirrors to provide feedback and allow laser operation in which the majority of the photons generated are generated in stimulated processes.


observed in certain silicon structures. This luminescence has been used for mid-infrared sources (where thermal effects mask strong luminescence) and, in concert with rare earth dopants, to make visible light emitting diodes. The silicon laser is still only a vision. Source materials are, therefore, all made of compound semiconductors. Detectors are in essence the inverse of sources (see Figure 5), but can only detect photons whose energies lie above their gap energy. For primarily historical reasons, as will be further discussed below, telecommunications employs essentially three windows of wavelengths for different applications: wavelengths centered about 0.85 microns in the first window, about 1.3 microns in the second window, and about 1.55 microns in the third window. Materials made of layers of different compositions of AlGaAs mounted on GaAs substrates are used in the first band, while the other bands require the increased degree of freedom allowed by the quaternary alloys of InGaAsP mounted on InP substrates. Other material mixes are possible, but these above-mentioned materials are the most common. Rare earth ions exhibit many different transitions. The most useful ones have proven to be the transitions of Er in the third telecommunications window, the one that covers the spectral region to either side of 1.55 microns. Although both Nd and Pr can be made to amplify near the 1.3 micron window, amplifiers made of these materials have not proven to be especially practical.
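As a numerical aside on the bandgap-wavelength relation λ = hc/E noted above: the GaAs gap of 1.42 eV is a standard figure, while the two quaternary gaps below are assumed round values standing in for compositions tuned to the second and third windows.

```python
# Emission wavelength from bandgap energy, lambda = h*c/E, as noted above.
h = 6.626e-34    # Planck's constant [J s]
c = 2.998e8      # vacuum light velocity [m/s]
eV = 1.602e-19   # one electronvolt in [J]

for name, Eg_eV in [("GaAs (first window)", 1.42),
                    ("InGaAsP (second window)", 0.95),
                    ("InGaAsP (third window)", 0.80)]:
    lam_um = h * c / (Eg_eV * eV) * 1e6
    print(f"{name:25s} Eg = {Eg_eV:.2f} eV -> lambda ~ {lam_um:.2f} um")
# ~0.87 um, ~1.31 um, and ~1.55 um: the three telecommunications windows.
```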

Figure 5 A schematic depiction of the workings of a semiconductor detector. As with a semiconductor source, the detector is fabricated to operate as a diode, that is, a p-n junction, but in the case of the detector this diode is to be operated only in reverse bias, that is, with the n-material being biased positively with respect to the ground which is attached to the p-material's contact. In this mode of operation, no current should flow through the junction, as there are no free carriers in the material; they have all been drawn out through the contacts. A photon of sufficiently high energy can change this situation by forming an electron hole pair in the material. The electron and hole are thereafter drawn to the n- and p-contacts, respectively, causing a detectable current to flow in the external circuit.

Telecommunications Technology

The physical transmission layer of the worldwide telecommunications network has come to be dominated by fiber optic technology, in particular by wavelength division multiplexed (WDM) single-mode fiber optics operating in the 1.55 micron telecommunications window. In the following, we will first discuss how this transformation to optical communications took place and then go on to discuss some of the specifics of the technology.

Already by the middle of the 1960s, it was clear that changes were going to have to take place in order that the exponential growth of the telephone system in the United States, as well as in Europe, could continue. The telephone system then, pretty much as now, consisted of a hierarchy of tree structures connected by progressively longer lines. A local office is used to connect a number of lines emanating in a tree structure to local users. The local offices are connected by trunk lines which emanate from a toll office. The lines from there on up the hierarchy are long distance ones, which are termed long lines, and can be regional and longer. The most pressing problem in the late 1960s was congestion in the so-called trunk lines which connect the local switching offices. The congestion was naturally most severe in urban areas. These trunk lines were generally one kilometer in length at that time. The problem was that there was no more space in the ducts that housed these lines. A solution was to use time division multiplexing (TDM) to increase the traffic that could be carried by each of the lines already buried in the conduit. The problem was that the twisted pair lines employed would smear out the edges of the time-varying bit streams carrying the information at the higher bitrates (aggregated rates due to the multiplexing), owing to the inherent dependence of signal propagation velocity on frequency known as dispersion.

After discussion, and even development, of a number of possible technologies, a 1975 demonstration of a fiber optic system, which employed multimode semiconductor lasers feeding multimode optical fibers all operating at 0.85 micron wavelength, proved to be the most viable model for trunk line replacement. The technology was successful, and already by 1980 advances in single-mode laser and single-mode fiber technology operating at the 1.3 micron wavelength had made the inclusion of fiber into the long lines viable as well. For roughly the decade from 1985 onward, single-mode fiber systems dominated long line replacement for terrestrial as well as transoceanic links. The erbium doped fiber amplifier, which operated in the 1.55 micron wavelength third telecommunication window, had proven


itself to be viable for extending the spacing between repeaters by around 1990. Development of the InGaAsP quaternary semiconductor system allowed reliable lasers to be manufactured for this window, while development of strained lattice laser technology in the InGaAs system allowed efficient pump lasers for the fiber amplifiers. As the optical amplifiers could amplify across a wavelength band that could be occupied by many aggregated channels of TDM signals, the move of the long line systems to the third telecommunications window was accompanied by the adoption of wavelength division multiplexing (WDM). The motivation is that electronics becomes more expensive as the analog bandwidth it must handle increases. Cost-effective digital rates are now limited to a few Gb/s, and analog rates to perhaps 20 GHz. Optical center frequencies are of the order of 2 × 10¹⁴ Hz, and purely optical filters can be made to cleanly separate information signals spaced as closely as 100 GHz apart on optical carriers. An optical carrier can then be made to carry hundreds of channels of 10 Gb/s separated by 100 GHz, and this signal can be propagated thousands of kilometers through the terrestrial fiber optics network. Such is today's World Wide Web.
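A back-of-envelope sketch of the numbers just quoted; the channel count of 200 is an assumed round figure standing in for "hundreds of channels".

```python
# Back-of-envelope WDM figures, from the quantities quoted above.
carrier_hz = 2e14      # optical carrier frequency, as in the text
spacing_hz = 100e9     # channel spacing from the text
rate_bps   = 10e9      # per-channel bit rate from the text
channels   = 200       # assumed round number for "hundreds of channels"

band_hz = channels * spacing_hz
print(f"occupied optical band : {band_hz/1e12:.0f} THz")            # 20 THz
print(f"fraction of carrier   : {band_hz/carrier_hz:.1%}")          # 10%
print(f"aggregate capacity    : {channels*rate_bps/1e12:.0f} Tb/s") # 2 Tb/s
# Even hundreds of channels occupy only a small fraction of the carrier
# frequency, which is why purely optical filtering can separate them.
```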

Integrated Optics as a Future of Guided Wave Optics

As electronics has been pushed to ever greater levels of integration, its power has increased and its price has dropped. A goal implicit in much of the development of planar waveguide technology has been that optics could become an integrated technology in the same sense as electronics. This has not occurred and probably will not occur. This is not to say that integrated optical processing circuits are not being developed or that they will not be employed in the future. They most definitely will be. Integrated optics was coined as a term when it was used in 1969 to name a new program at Bell Laboratories. This program was aimed at investigating all the possible technologies with which to fabricate optical integrated circuits. The original effort was in no way driven by fiber optics, as the Bell System only reluctantly moved to a fiber optic network solution in 1975. The original integrated optics efforts were based on technologies such as the inorganic crystal technologies that allow for large second-order optical nonlinearities, a necessity in order that a crystal also exhibit a large electro-optic coefficient. An electro-optic coefficient allows one to change the index of refraction of a crystal by applying a low frequency or DC electromagnetic field across the crystal.

Figure 6 A schematic depiction of an integrated optical device which is often used as a high speed modulator. The device is an integrated version of a Mach–Zehnder interferometer. The basic idea behind its operation is that spatially and temporally coherent light is input into a single mode optical channel. That channel is then split into two channels by a Y-junction. Although not depicted in the figure, the two arms need not have completely equivalent propagation paths. If the light in one of those paths propagates a slightly longer distance (as measured in terms of phase fronts of the wave), then one cannot fully recombine the power from the two arms: in the second Y-junction, a certain amount of the light which we are trying to combine from the two arms will be radiated out of the junction. In fact, if the light were temporally incoherent (phase fronts not well defined), exactly half of the light would be radiated from the junction. This is a statement of the brightness theorem, which was thought to be a law of propagation before there were coherent sources of light. In a high speed modulator, the substrate is an electro-optic crystal, and electrodes placed over the two arms apply the electrical signal to be impressed on the optical carrier to the channels.
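The on-off behavior described in the caption follows from simple two-arm interference; a minimal sketch, assuming ideal lossless 50/50 Y-junctions and equal arm amplitudes.

```python
import numpy as np

# Ideal Mach-Zehnder transfer: the two arms recombine with a relative phase
# delta set by the electro-optic electrodes (an idealization of Figure 6).
delta = np.linspace(0, 2 * np.pi, 9)        # applied differential phase [rad]
t_field = 0.5 * (1 + np.exp(1j * delta))    # sum of two equal-amplitude arms
p_out = np.abs(t_field)**2                  # = cos^2(delta/2)

for d, p in zip(delta, p_out):
    print(f"delta = {d:4.2f} rad   P_out/P_in = {p:.2f}")
# At delta = 0 all power recombines; at delta = pi all of it is radiated
# away at the second Y-junction, giving the 'off' state of the modulator.
```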

Lithium niobate technology is a technology that has lasted to the present as an optical modulator technology (see Figure 6) and an optical switch technology (see Figure 7), as well as a technology for parametric wavelength conversion. Unfortunately, although low loss passive waveguides can be fabricated in lithium niobate, the crystal is not amenable to any degree of monolithic integration. Glass was also investigated as an integration technology from the early days of integrated optics, but the lack of second-order nonlinearity in glass strictly limited its applicability. Attention for a period turned to monolithic integration in the semiconductor materials, which were progressing well for use as lasers, detectors, and integrated circuits. Semiconductor crystals, however, are too pure to allow for low loss light propagation, which requires the material defects to be so numerous that optical wavelengths cannot sample them individually. Passive waveguides in semiconductors incur huge losses that can only be mitigated by almost constant optical amplification, which, unfortunately, cannot track the rapid optical field variations necessary for information transfer.


Figure 7 A schematic depiction of an integrated optical device which is often used as an optical switch. This device is interferometric, but it has no real free space optics counterpart as the Mach–Zehnder interferometer does. In this interferometer, spatially and temporally coherent light is input into a single mode input channel. The two single optical mode input channels, which are initially so far apart from each other in terms of their mode sizes that they are effectively uncoupled, are then brought ever closer together until they are strongly coupled and their modes are no longer confined to their separate waveguides but are shared between the two channels. As the channels are symmetric, but the initial excitation, if from a single waveguide alone, is not, the pattern must evolve with propagation. Were both input channels excited, symmetry would still constrain the relative phases: if the phases were either equal to each other or completely out of phase, there would be no evolution, because we would have excited the symmetric or antisymmetric modes of the coupler itself. If the coupled region is the proper length, light input to one channel will emerge from the other after the channels are pulled apart. If the substrate is an electro-optic crystal and electrodes are placed over the channels, application of a voltage can affect the evolution of the interference in such a manner as to 'decouple' the channels and switch the light back to the original channel.
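The switching behavior described in the caption is captured by standard synchronous coupled-mode theory; the sketch below uses illustrative assumed values for the coupling coefficient and interaction length, chosen so that the unbiased coupler transfers all its power.

```python
import numpy as np

# Idealized coupled-mode model of the directional-coupler switch of Figure 7.
# kappa and L are illustrative assumed values with kappa*L = pi/2, the
# condition for complete crossover in the unbiased (synchronous) state.
L = 1e-3                       # interaction length [m]
kappa = np.pi / (2 * L)        # coupling coefficient [1/m]

for delta in (0.0, kappa, np.sqrt(3) * kappa):   # electro-optic mismatch [1/m]
    g = np.hypot(kappa, delta)
    p_cross = (kappa / g)**2 * np.sin(g * L)**2  # fraction of power crossing over
    print(f"delta/kappa = {delta/kappa:4.2f}   crossed power = {p_cross:.2f}")
# 0.00 -> 1.00 (full crossover); 1.73 -> 0.00: the applied voltage 'decouples'
# the channels and switches the light back to the original waveguide.
```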

Early on, attention turned to silicon optical bench technology, a technology which has presently spawned micro-electro-mechanical systems (MEMS) technology and still persists as an expensive vehicle for hybrid electro-optical integration. In silicon bench technology, individual components are micromounted, with solder or silicon pop-up pieces, on a silicon wafer which serves as a micro-optical bench. The technology is expensive and hybrid, but pervasive at the highest end of the cost classes of optical systems. The tremendous success of the fiber optical network spawned a movement in which groups tried to achieve every possible system functionality in fibers themselves. A notable success has been the optical fiber amplifier. Significantly less successful in the fiber optic network has been the fiber Bragg grating. Attempts to carry out WDM functionalities in all-fiber configurations lead to unwieldy arrangements requiring large amounts of fiber and significant propagation delays, although Bragg grating sensors have proven to be somewhat useful in sensing

applications. WDM functions have instead been implemented in a large number of different hybrid configurations, involving various types of glasses, fused silica on silicon, and simple hybrids of birefringent crystals.

The telecommunications network is not the driver that it once was. That this is so is perhaps the strongest driving force yet for integrated optics. The driver now is new applications that must be implemented in the most efficient manner, and the most efficient manner is quite generally the one with the maximum degree of integration. In the long run, polymer has always been seen as the winning technology for optics of any kind, due to the flexibility of the technology and the drop in cost that accompanies mass production. Indeed, polymer integrated optics progresses. There are also a number of researchers involved in investigating strongly guiding optics, so-called photonic crystal optics. That is, in order to achieve low loss as well as low cost, the fiber and integrated optic waveguides up to the present have used small variations in the optical properties of materials to achieve guidance, at the cost of having structures that are many wavelengths in size. In radio frequency (RF) and microwave technology, by contrast, guides are tiny fractions of a wavelength. This can lead to an impedance matching problem: RF and microwave impedance matching costs (versus circuit impedance matching costs) are generally quite high, precluding mass application. Because radio frequency systems such as cell phone transceivers link to the rest of the world by antenna, the overall circuit dimension can be kept to less than a single wavelength, and impedance matching can be foregone in this so-called circuit limit. It is usually hard to keep an overall microwave system to less than a wavelength in overall extent, except in the cell phone case or in a microwave oven. Optical miniaturization will require high index contrast guides and will require optical impedance matching, and there are few complete optical systems which will comprise less than an optical wavelength in overall extent. But this so-called photonic crystal technology will likely be polymer compatible, and there may be ways to find significant cost reduction. The future of integrated optics is unclear but bright.

See also

Fiber and Guided Wave Optics: Dispersion; Fabrication of Optical Fiber; Light Propagation; Measuring Fiber Characteristics; Nonlinear Effects (Basics); Nonlinear Optics; Optical Fiber Cables; Passive Optical Components. Optical Communication Systems: Wavelength Division Multiplexing.


Further Reading

Agrawal GP (1995) Fiber-Optic Communication Systems. New York: Wiley.
Betti S, De Marchis G and Iannone E (1995) Coherent Optical Communications Systems. New York: Wiley.
Haken H (1985) Light. New York: Elsevier North-Holland.
Hecht E (1998) Optics. Reading, MA: Addison-Wesley.
Klein MV and Furtak TE (1986) Optics. New York: Wiley.

Marcuse D (1982) Light Transmission Optics. New York: Van Nostrand Reinhold.
Marcuse D (1991) Theory of the Dielectric Optical Waveguide. Boston, MA: Academic Press.
Mickelson AR, Lee Y-C and Basavanhally NR (1997) Optoelectronic Packaging. New York: Wiley.
Saleh BEA and Teich MC (1991) Fundamentals of Photonics. New York: Wiley.
Yariv A (1997) Optical Electronics in Modern Communications. New York: Oxford University Press.

Dispersion

L Thévenaz, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Dispersion designates the property of a medium to propagate the different spectral components of a wave with different velocities. It may originate from the natural dependence of the refractive index of a dense material on the wavelength of light, and one of the most evident and magnificent manifestations of dispersion is the appearance of a rainbow in the sky. In this case dispersion gives rise to a different refraction angle for the different spectral components of the white light, resulting in an angular dispersion of the sunlight spectrum and explicating the etymology of the term dispersion. Despite this spectacular effect, dispersion is mostly seen as an impairment in many applications, in particular for optical signal transmission. In this case, dispersion causes a rearrangement of the signal spectral components in the time domain, resulting in temporal spreading of the information and eventually in severe distortion. There is a trend to designate all phenomena resulting in a temporal spreading of information as a type of dispersion, notably the polarization dispersion in optical fibers, which would more properly be named polarization mode delay (PMD). In this article only chromatic dispersion will be addressed: the dependence of the propagation velocity on wavelength, which corresponds to the strict etymology of the term.

Effect of Dispersion on an Optical Signal

An optical signal may always be considered as a sum of monochromatic waves through a normal Fourier expansion. Each of these Fourier components propagates with a different phase velocity if the optical medium is dispersive, since the refractive index n depends on the optical frequency ν. This phenomenon is linear and causal and gives rise to a signal distortion that may be properly described by a transfer function in the frequency domain.

Let n(ν) be the frequency-dependent refractive index of the propagation medium. When the optical wave propagates in a waveguiding structure, the effective wavenumber is properly described by a propagation constant β, which reads:

$$\beta(\nu) = \frac{2\pi\nu}{c_0}\, n(\nu) \qquad [1]$$

where c₀ is the vacuum light velocity. The propagation constant corresponds to an eigenvalue of the wave equation in the guiding structure and normally takes a different value for each solution or propagation mode. For free-space propagation this constant is simply equal to the wavenumber of the corresponding plane wave. The complex electric field E(z,t) of a signal propagating in the z direction may be properly described by the following expression:

$$E(z,t) = A(z,t)\, e^{i(2\pi\nu_0 t - \beta_0 z)} \qquad [2]$$

with ν₀ the central optical frequency and β₀ = β(ν₀), and where A(z,t) represents the complex envelope of the signal, supposed to be slowly varying compared to the carrier term oscillating at the frequency ν₀. Consequently, the signal spectrum will spread over a narrow band around the central frequency ν₀, and the propagation constant β can be conveniently approximated by a limited expansion to the second order:

$$\beta(\nu) = \beta_0 + \left.\frac{d\beta}{d\nu}\right|_{\nu=\nu_0} (\nu - \nu_0) + \frac{1}{2} \left.\frac{d^2\beta}{d\nu^2}\right|_{\nu=\nu_0} (\nu - \nu_0)^2 \qquad [3]$$

For a known signal A(0,t) at the input of the propagation medium, the problem consists in


determining the signal envelope A(z,t) after propagation over a distance z. The linearity and the causality of the system make possible a description using a transfer function H_z(ν) such that:

$$\tilde A(z,\nu) = H_z(\nu)\, \tilde A(0,\nu) \qquad [4]$$

where Ã(z,ν) is the Fourier transform of A(z,t). To make the transfer function H_z(ν) explicit, let us assume that the signal corresponds to an arbitrary harmonic function:

$$A(0,t) = A_0\, e^{i 2\pi f t} \qquad [5]$$

Since this function is arbitrary and the signal may always be expanded as a sum of harmonic functions through a Fourier expansion, there is no loss of generality. The envelope identified as a harmonic function actually corresponds to a monochromatic wave of optical frequency ν = ν₀ + f, as defined by eqn [2]. Such a monochromatic wave will experience the following phase shift through propagation:

$$E(z,t) = A_0\, e^{i[2\pi(\nu_0+f)t - \beta(\nu_0+f)z]} = A_0\, e^{i 2\pi f t}\, e^{i[2\pi\nu_0 t - \beta(\nu_0+f)z]} \qquad [6]$$

On the other hand, an equivalent expression may be found using the linear system described by eqns [2] and [4]:

$$E(z,t) = A(z,t)\, e^{i(2\pi\nu_0 t - \beta_0 z)} = A_z\, e^{i 2\pi f t}\, e^{i(2\pi\nu_0 t - \beta_0 z)} \qquad [7]$$

Since eqns [6] and [7] must represent the same quantities, using the definition in eqn [4] a simple comparison shows that the transfer function must take the following form:

$$H_z(\nu) = e^{-i[\beta(\nu) - \beta_0]z} \qquad [8]$$

The transfer function takes a more analytical form using the approximation in eqn [3]:

$$H_z(\nu) = e^{-i 2\pi (\nu-\nu_0) t_D}\; e^{-i \pi D_\nu (\nu-\nu_0)^2 z} \qquad [9]$$

In the transfer function interpretation, the first term represents a delay term. It means that the signal is delayed after propagation by the quantity:

$$t_D = \frac{1}{2\pi} \left.\frac{d\beta}{d\nu}\right|_{\nu_0} z = \frac{z}{V_g} \qquad [10]$$

where V_g represents the signal group velocity. This term therefore brings no distortion to the signal and simply states that the signal is replicated at the distance z with a delay t_D.

The second term is the distortion term, which is similar in form to a diffusion process and normally results in a time spreading of the signal. In the case of a light pulse, it will gradually broaden while propagating along the fiber, like a hot spot on a plate gradually spreading as a result of heat diffusion. The effect of this distortion is proportional to the distance z and to the coefficient D_ν, named the group velocity dispersion (GVD):

$$D_\nu = \frac{1}{2\pi} \frac{d^2\beta}{d\nu^2} = \frac{d}{d\nu}\left(\frac{1}{V_g}\right) \qquad [11]$$

It is important to point out that the GVD may be either positive (normal) or negative (anomalous), and the distortion term in the transfer function in eqn [9] may be exactly cancelled by propagating in a medium with D_ν of opposite sign. It means that the distortion resulting from chromatic dispersion is reversible, and this is widely used in optical links through the insertion of dispersion compensators. These are elements made of specially designed fibers or fiber Bragg gratings showing an enhanced GVD coefficient, with a sign opposite to the GVD in the fiber.

From the transfer function in eqn [9] it is possible to calculate the impulse response of the dispersive medium:

$$h_z(t) = \frac{1}{\sqrt{i |D_\nu| z}}\; e^{\,i\pi \frac{(t - t_D)^2}{D_\nu z}} \qquad [12]$$

so that the distortion of the signal may be calculated by a simple convolution of the impulse response with the signal envelope in the time domain.

The effect of dispersion on the signal can be more easily interpreted by evaluating the dispersive propagation of a Gaussian pulse. In this particular case the calculation of the resulting envelope can be carried out analytically. If the signal envelope takes the following Gaussian distribution at the origin:

$$A(0,t) = A_0\, e^{-t^2/\tau_0^2} \qquad [13]$$

with τ₀ the 1/e half-width of the pulse, the envelope at distance z is obtained by convolving the initial envelope with the impulse response h_z(t):

$$A(z,t) = h_z(t) \otimes A(0,t) = A_0 \sqrt{\frac{i z_0}{z + i z_0}}\; e^{\,i\pi \frac{(t - t_D)^2}{D_\nu (z + i z_0)}} \qquad [14]$$


where

$$z_0 = -\frac{\pi \tau_0^2}{D_\nu} \qquad [15]$$

represents the typical dispersion length, that is, the distance necessary to make the dispersion effect noticeable. The actual pulse spreading resulting from dispersion can be evaluated by calculating the intensity of the envelope at distance z:

$$|A(z,t)|^2 = A_0^2\, \frac{\tau_0}{\tau(z)}\; e^{-2\frac{(t - t_D)^2}{\tau^2(z)}} \qquad [16]$$

that is still a Gaussian distribution centered about the propagation delay time t_D, with 1/e² half-width:

$$\tau(z) = \tau_0 \sqrt{1 + (z/z_0)^2} \qquad [17]$$

The variation of the pulse width τ(z) is presented in Figure 1 and clearly shows that the pulse spreading starts to be nonnegligible from the distance z = z₀. This gives a physical interpretation for the dispersion length z₀. It must be pointed out that there is a direct formal similarity between the broadening of a Gaussian pulse in a dispersive medium and the spreading of a free-space Gaussian beam as a result of diffraction. Asymptotically, for distances z ≫ z₀, the pulsewidth increases linearly:

$$\tau(z) \simeq \frac{|D_\nu|}{\pi \tau_0}\, z \qquad [18]$$

It must be pointed out that the width increases proportionally to the dispersion D_ν, but also inversely proportionally to the initial width τ₀. This results from the larger spectral width corresponding to a narrower pulsewidth, giving rise to a stronger dispersive effect.

Chromatic dispersion, like any linear effect, does not modify the spectrum of the transmitted light. This can be straightforwardly demonstrated by evaluating the intensity spectrum of the signal envelope at any distance z, using eqns [4] and [8]:

$$|\tilde A(z,\nu)|^2 = |H_z(\nu)\, \tilde A(0,\nu)|^2 = |e^{-i[\beta(\nu)-\beta_0]z}|^2\, |\tilde A(0,\nu)|^2 = |\tilde A(0,\nu)|^2 \qquad [19]$$

It means that the pulse characteristics in the time and frequency domains are no longer Fourier-transform limited: after the broadening due to dispersion, a transform-limited pulse of the same duration would occupy a narrower spectral width.

This feature results from a rearrangement of the spectral components within the pulse, which can be highlighted by evaluating the distribution of instantaneous frequency through the pulse. The instantaneous frequency ω_i is defined as the time derivative of the wave phase factor φ(t) and is uniformly equal to the optical carrier pulsation ω₀ = 2πν₀ for the initial pulse, as can be deduced from the phase factor at z = 0 by combining eqns [2] and [13]. This constant instantaneous frequency means that all spectral components are uniformly present within the pulse at the origin. After propagation through the dispersive medium, the phase factor φ(t) can be evaluated by combining eqns [2] and [14] and taking the argument of the resulting expression. The instantaneous frequency ω_i is obtained after a simple time derivative of φ(t) and reads:

$$\omega_i(t) = \omega_0 + \frac{2\pi z}{D_\nu\,(z^2 + z_0^2)}\,(t - t_D) \qquad [20]$$

Figure 1 Variation of the 1/e² half-width of a Gaussian pulse, showing the pulse spreading effect of dispersion. The dashed line shows the asymptotic linear spreading for large propagation distance.

For z > 0 the instantaneous frequency varies linearly over the pulse, giving rise to a frequency chirp. The frequency components are rearranged in the pulse, so that the lower frequency components are in the leading edge of the pulse for a normal dispersion (D_ν > 0) and the higher frequencies in the trailing edge. For an anomalous dispersion (D_ν < 0) the arrangement is opposite, as can be seen in Figure 2.

Figure 2 Distribution of the instantaneous frequency through a Gaussian pulse. At the origin the distribution is uniform (left) and the dispersion induces a frequency chirp that depends on the sign of the GVD coefficient D_ν.

The effect of this frequency chirp can be visualized in Figure 3, showing that the effect of dispersion is equivalent to a frequency modulation over the pulse. The chirp is maximal at position z₀, and the pulsewidth takes its minimal value τ₀ when the chirp is zero. It is evident from this description that the pulse broadening may be entirely compensated through propagation in a medium of opposite group velocity dispersion; this is equivalent to reversing the time direction in Figure 3. Moreover, a pre-chirped pulse can be compressed to its Fourier-transform limited value after propagation in a medium with the proper dispersion sign. This feature is widely used for pulse compression, after pre-chirping through propagation in a medium subject to the optical Kerr effect.

Figure 3 The dispersion results in a pulse broadening together with a frequency chirp, here for a normal GVD, that can be seen as a frequency modulation.

The above description of the propagation of a Gaussian pulse implicitly assumes that the light source is perfectly coherent. In the early stages of optical communications this was not at all the case, since most sources were either light emitting diodes or multimode lasers. In such cases the spectral extent of the signal was much larger than actually required for the strict needs of modulation. Each spectral component may thus be considered as independently propagating the signal, and the total optical wave can be identified with light merged from discrete sources simultaneously emitting the same signal at different optical frequencies. The group velocity dispersion will cause a delay δt between different spectral components separated by a frequency interval δν that can be simply evaluated by a first-order approximation:

$$\delta t = \frac{d t_D}{d\nu}\,\delta\nu = \frac{d}{d\nu}\left(\frac{z}{V_g}\right)\delta\nu = D_\nu\, z\, \delta\nu \qquad [21]$$

where eqns [10] and [11] have been used. The delay δt is thus proportional to the GVD coefficient D_ν, to the propagation distance z, and to the frequency separation δν. This description can be extended to a continuous frequency distribution with a spectral width σ_ν, resulting in the following temporal broadening σ_t:

$$\sigma_t = |D_\nu|\, \sigma_\nu\, z \qquad [22]$$

Traditionally the spectral characteristics of a source are given in units of wavelength, and the GVD coefficient of optical fibers is expressed accordingly. Following the same description as above, the temporal broadening σ_t for a spectral width σ_λ in wavelength units reads:

$$\sigma_t = |D_\lambda|\, \sigma_\lambda\, z \qquad [23]$$

Since equal spectral widths must give equal broadenings, the value of the GVD in units of wavelength can be deduced from the natural definition in frequency units:

$$D_\lambda = \frac{d}{d\lambda}\left(\frac{1}{V_g}\right) = \frac{d}{d\nu}\left(\frac{1}{V_g}\right)\frac{d\nu}{d\lambda} = -\frac{c_0}{\lambda^2}\, D_\nu \qquad [24]$$

It must be pointed out that the coefficient D_λ takes a sign opposite to D_ν; in other words, a normal dispersion corresponds to a negative GVD coefficient D_λ. It is usually expressed in units of picoseconds of temporal broadening, per nanometer of spectral width and per kilometer of propagation distance, or ps/nm·km. For example, a pulse showing a spectral width of 1 nm propagating through a 100 km fiber having a dispersion D_λ of +10 ps/nm·km will experience, according to eqn [23], a pulse broadening σ_t of 10 × 1 × 100 = 1000 ps, or 1 ns.
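In code, the same worked example, together with the conversion of eqn [24] back to frequency units; the 1550 nm carrier wavelength is an illustrative assumption.

```python
c0 = 2.998e8                      # vacuum light velocity [m/s]
D_lam = 10e-6                     # +10 ps/(nm km) in SI units [s/m^2]
sigma_lam = 1e-9                  # 1 nm spectral width [m]
z = 100e3                         # 100 km [m]

sigma_t = abs(D_lam) * sigma_lam * z          # eqn [23]
print(f"broadening sigma_t = {sigma_t*1e9:.1f} ns")   # 1.0 ns, as in the text

lam = 1550e-9                                 # assumed carrier wavelength
D_nu = -D_lam * lam**2 / c0                   # eqn [24] inverted
print(f"D_nu = {D_nu:.2e} s^2/m")             # ~ -8e-26 s^2/m (anomalous)
```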

Material Group Velocity Dispersion

Any dense material shows a variation of its index of refraction n as a function of the optical frequency ν. This natural property is called material dispersion and is the dominant contribution in weakly guiding structures such as standard optical fibers. This natural dependence results from the noninstantaneous response of the medium to the presence of the electric field of the optical wave. In other words, the polarization field P(t) corresponding to the material response will vary with some delay, or inertia, in response to changes in the incident electric field E(t). This delay between cause and effect generates a memory-type response of the medium that may be described using a time-dependent medium susceptibility χ(t). The relation between the medium polarization at time t and the incident field results from the weighted superposition


of the effects of E(t′) at all previous times t′ < t. This takes the form of the following convolution:

$$P(t) = \varepsilon_0 \int_{-\infty}^{+\infty} \chi(t - t')\, E(t')\, dt' \qquad [25]$$

Through application of a simple Fourier transform, this relation reads in the frequency domain as:

$$P(\nu) = \varepsilon_0\, \chi(\nu)\, E(\nu) \qquad [26]$$

showing clearly that the noninstantaneous response of the medium results in a frequency-dependent refractive index, through the standard relationship with the susceptibility χ:

$$n(\nu) = \sqrt{1 + \chi(\nu)}\,, \quad \text{where} \quad \chi(\nu) = \mathrm{FT}\{\chi(t)\} \qquad [27]$$

This means that a beautiful natural phenomenon such as a rainbow originates, on a microscopic scale, from the sluggishness of the medium molecules in reacting to the presence of light. For signal propagation it results in a distortion of the signal, and in most cases in a pulse spreading, but the microscopic causes are in essence identical. This noninstantaneous response is tightly related to the molecular vibrations that also give rise to light absorption. For this reason it is convenient to describe propagation in an absorptive medium by adding an imaginary part to the susceptibility χ(ν):

$$\chi(\nu) = \chi'(\nu) + i\chi''(\nu) \qquad [28]$$

so that the refractive index n(ν) and the absorption coefficient α(ν) read, in a weakly absorbing medium:

$$n(\nu) = \sqrt{1 + \chi'(\nu)}\,, \qquad \alpha(\nu) = -\frac{2\pi\,\chi''(\nu)}{\lambda\, n(\nu)} \qquad [29]$$

Since the response of the medium, given by the time-dependent susceptibility χ(t) in eqn [25], is real and causal, the real and imaginary parts of the susceptibility in eqn [28] are not entirely independent: they are related by the famous Kramers–Kronig relations:

$$\chi'(\nu) = \frac{2}{\pi}\int_0^{\infty} \frac{s\,\chi''(s)}{s^2 - \nu^2}\, ds\,, \qquad \chi''(\nu) = \frac{2}{\pi}\int_0^{\infty} \frac{\nu\,\chi'(s)}{\nu^2 - s^2}\, ds \qquad [30]$$
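A quick numerical illustration of the first relation in eqn [30], assuming a hypothetical Lorentzian absorption line χ″(s); the principal value is approximated crudely by excluding a small window around the pole at s = ν.

```python
import numpy as np

# Kramers-Kronig sketch: compute chi'(nu) from an assumed Lorentzian chi''(s).
# All quantities are in arbitrary normalized frequency units (illustrative).
s0, gamma, amp = 10.0, 0.5, 1.0            # line center, width, strength
s = np.linspace(0.01, 40.0, 200_000)       # integration grid over s > 0
ds = s[1] - s[0]
chi2 = amp * gamma**2 / ((s - s0)**2 + gamma**2)    # chi''(s)

def chi1(nu, eps=0.05):
    """chi'(nu) = (2/pi) PV integral of s*chi''(s)/(s^2 - nu^2) ds,
    with the principal value handled by masking out |s - nu| < eps."""
    mask = np.abs(s - nu) > eps
    integrand = s[mask] * chi2[mask] / (s[mask]**2 - nu**2)
    return (2.0 / np.pi) * np.sum(integrand) * ds

for nu in (5.0, 9.0, 11.0, 20.0):
    print(f"nu = {nu:5.1f}   chi' = {chi1(nu):+.4f}")
# chi' is positive below the line and negative above it: the index rises
# towards an absorption line and drops on its high-frequency side, the
# normal/anomalous behavior sketched in Figure 4.
```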

Absorption and dispersion act in an interdependent way on the propagating optical wave and knowing either the absorption or the dispersion spectrum is theoretically sufficient to determine the entire optical response of the medium. The interdependence

Figure 4 Typical optical response of a transparent medium, showing the spectral interdependence between the absorption coefficient a and the index of refraction n.

between absorption and dispersion is typically illustrated in Figure 4. The natural tendency is to observe a growing refractive index for increasing frequency in low absorption regions. In this case the dispersion is called normal, and this is the most commonly observed situation in the transparency regions of a material, which obviously offer the largest interest for optical propagation. Within an absorption line the tendency is opposite, a diminishing index for increasing frequency, and such a response is called anomalous dispersion. The dispersion considered here is the phase velocity dispersion, represented by the slope of n(ν), which must not be confused with the group velocity dispersion (GVD) that alone matters for signal distortion. The difference between these two quantities is clarified below.

To demonstrate that a frequency-dependent refractive index n(ν) gives rise to a group velocity dispersion, and thus to a signal distortion, we use eqns [1] and [10], so that the group velocity V_g can be expressed as:

$$V_g = \frac{c_0}{N} \quad \text{with} \quad N = n + \nu\frac{dn}{d\nu} = n - \lambda\frac{dn}{d\lambda} \qquad [31]$$

N is called the group velocity index and differs from the phase refractive index n only if n shows a spectral dependence. In a region of normal dispersion, (dn/dν) > 0, the group index is larger than the phase index, and this is the situation observed in the great majority of transparent materials. From eqn [31] and using eqn [24], the GVD coefficient D_λ can be expressed as a function of the refractive index n(λ):

$$D_\lambda = \frac{d}{d\lambda}\left(\frac{1}{V_g}\right) = \frac{d}{d\lambda}\left(\frac{N}{c_0}\right) = -\frac{\lambda}{c_0}\frac{d^2 n}{d\lambda^2} \qquad [32]$$

The GVD coefficient is proportional to the second derivative of the refractive index n with respect to


wavelength and is therefore minimal close to a point of inflexion of n(λ). As can be seen in Figure 4, such a point of inflexion is always present where absorption is minimal, at the largest spectral distance from two absorption lines. It means that the conditions of low group velocity dispersion and high transparency are normally fulfilled in the same spectral region of an optical dielectric material. In this region the phase velocity dispersion is normal at any wavelength, but the group velocity dispersion is first normal for shorter wavelengths, then zero at a definite wavelength corresponding to the point of inflexion of n(λ), and finally becomes anomalous for longer wavelengths. The situation in which normal phase dispersion and anomalous group dispersion are observed simultaneously is in no way exceptional.

In pure silica the zero GVD wavelength is at 1273 nm, but it may be moderately shifted to larger wavelengths in optical fibers as a result of the presence of doping species used to raise the index in the fiber guiding core. This shift normally never exceeds 10 nm using standard dopings; larger shifts are observed as a result of waveguide dispersion, and this aspect will be addressed in the next section. The zero GVD wavelength does not strictly correspond to the minimum attenuation in silica fibers, because the dominant source of loss in this spectral region is Rayleigh scattering and not molecular absorption. This scattering results from fluctuations of the medium density, as observed in any amorphous material such as vitreous silica, and is therefore a collective effect of many molecules that does not impinge on the microscopic susceptibility χ(t). It has therefore no influence on the material dispersion characteristics, and this explains why the minimum attenuation wavelength, at 1550 nm, is quite distant from the zero material GVD at 1273 nm.

Material GVD in amorphous SiO₂ can be accurately described by using a three-term Sellmeier expansion of the refractive index:

$$n(\lambda) = \sqrt{1 + \sum_{j=1}^{3} \frac{C_j\, \lambda^2}{\lambda^2 - \lambda_j^2}} \qquad [33]$$

and performing the wavelength derivative twice. The coefficients C_j and λ_j are found in most reference handbooks and result in the GVD spectrum shown in Figure 5. Such a spectrum explains the absence of interest in propagation in the visible region through optical fibers, the dispersion being very large in this spectral region. It also explains the large development of optical fibers in the 1300 nm region, as a consequence of the minimal material dispersion there. It must be pointed out that it is quite easy to set

Figure 5 Material dispersion of pure silica. The visible region (0.4–0.7 µm) shows a strong normal GVD that decreases when moving into the infrared and eventually vanishes at 1273 nm. In the minimum attenuation window (1550 nm) the material GVD is anomalous.

up propagation in an anomalous GVD regime in optical fibers, since this regime is observed in the lowest attenuation spectral region. Anomalous dispersion makes possible interesting propagation features when combined with a third-order nonlinearity such as the optical Kerr effect, namely soliton propagation and efficient spectral broadening through modulation instability.
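As an illustration of eqns [32] and [33], the sketch below evaluates the material GVD of fused silica by numerical differentiation. The Sellmeier coefficients are the commonly quoted Malitson values for fused silica, supplied here as an assumption since the article itself gives none.

```python
import numpy as np

# Material GVD from the Sellmeier form of eqn [33] and eqn [32],
# D_lambda = -(lambda/c0) d^2 n / d lambda^2, by central differences.
C  = np.array([0.6961663, 0.4079426, 0.8974794])   # Malitson coefficients
lj = np.array([0.0684043, 0.1162414, 9.896161])    # resonance wavelengths [um]
c0 = 2.998e8                                       # [m/s]

def n(lam_um):
    lam2 = lam_um**2
    return np.sqrt(1 + np.sum(C * lam2 / (lam2 - lj**2)))

def D_lambda(lam_um, h=1e-3):
    d2n = (n(lam_um + h) - 2*n(lam_um) + n(lam_um - h)) / h**2  # [1/um^2]
    D_si = -(lam_um * 1e-6 / c0) * d2n * 1e12                   # [s/m^2]
    return D_si * 1e6                                           # ps/(nm km)

for lam in (0.85, 1.27, 1.31, 1.55):
    print(f"lambda = {lam:.2f} um   D = {D_lambda(lam):+7.1f} ps/(nm km)")
# The sign change near 1.27 um reproduces the zero material GVD quoted above;
# at 1.55 um the material GVD comes out anomalous, as in Figure 5.
```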

Waveguide Group Velocity Dispersion

Solutions of the wave equation in an optical dielectric waveguide such as an optical fiber are discrete and limited. These solutions, called modes, are characterized by an unchanged field distribution along the waveguide and by a uniform propagation constant β over the wavefront. This last feature is particularly important if one recalls that the field extends over regions presenting different refractive indices in a dielectric waveguide. For a given mode, the propagation constant β defines an effective refractive index n_eff for the propagation, by similarity to eqn [1]:

$$\beta = \frac{2\pi\nu}{c_0}\, n_{\mathrm{eff}} \qquad [34]$$

The value of this effective refractive index n_eff is always bounded by the values of the core index n₁ and of the cladding index n₂, so that n₂ < n_eff < n₁. For a given mode, the propagation constant β, and so the effective refractive index n_eff, depends only on a quantity called the normalized frequency V, which essentially scales the light frequency to the waveguide


optical parameters:

$$V = \frac{2\pi}{\lambda}\, a\, \sqrt{n_1^2 - n_2^2} \qquad [35]$$

where a is the core radius. Figure 6 shows the dependence of the propagation constant β of the fundamental mode LP01 on the normalized frequency V. The variation is nonnegligible in the single-mode region and gives rise to a chromatic dispersion, since V depends on the wavelength λ, even in the fictitious case of dispersion-free refractive indices in the core and the cladding materials. This type of chromatic dispersion is called waveguide dispersion. To find an expression for the waveguide dispersion as a function of the guiding properties, let us define another normalized parameter, the normalized phase constant b, such that:

$$\beta = \frac{2\pi}{\lambda}\sqrt{n_2^2 + b\,(n_1^2 - n_2^2)} \qquad [36]$$

The parameter b takes values in the interval 0 < b < 1; it is equal to 0 when n_eff = n₂, at the mode cutoff, and equal to 1 when n_eff = n₁. This latter situation is never observed and is only reached asymptotically for very large normalized frequencies V. Solving the wave equation provides the dispersion relation b(V), and it is important to point out that this relation between normalized quantities depends only on the shape of the refractive index profile. Step-index, triangular, or multiple-clad index profiles will result in different b(V) relations, independently of the actual values of the refractive indices n₁ and n₂ and of the core radius a. From the definitions in eqns [10] and [35], and in the fictitious case of an absence of material dispersion, the propagation delay per unit length reads:

$$\frac{1}{V_g} = \frac{1}{2\pi}\frac{d\beta}{d\nu} = \frac{1}{2\pi}\frac{d\beta}{dV}\frac{dV}{d\nu} = \frac{1}{2\pi}\frac{V}{\nu}\frac{d\beta}{dV} \qquad [37]$$

where the last step uses the fact that V is proportional to ν (eqn [35]), so that dV/dν = V/ν. The waveguide group velocity dispersion can then be expressed using the relation in eqn [24]:

$$D_\lambda^{w} = \frac{d}{d\lambda}\left(\frac{1}{V_g}\right) = \frac{d\nu}{d\lambda}\,\frac{d}{d\nu}\left(\frac{1}{V_g}\right) = -\frac{\nu}{\lambda}\,\frac{d}{d\nu}\left(\frac{1}{V_g}\right) = -\frac{1}{2\pi c_0}\, V^2 \frac{d^2\beta}{dV^2} \qquad [38]$$
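As a numerical aside on eqn [35], the following sketch computes V for an illustrative (assumed) fiber geometry; V < 2.405 is the single-mode limit indicated in Figure 6.

```python
import math

# Normalized frequency of eqn [35] for an assumed step-index geometry.
a = 4.1e-6              # core radius [m] (illustrative)
n1, n2 = 1.450, 1.445   # core and cladding indices (illustrative)

for lam in (0.85e-6, 1.31e-6, 1.55e-6):
    V = 2 * math.pi / lam * a * math.sqrt(n1**2 - n2**2)
    regime = "single-mode" if V < 2.405 else "multimode"
    print(f"lambda = {lam*1e6:.2f} um   V = {V:.2f}   ({regime})")
# Such a fiber is multimode at 0.85 um but single-mode in the 1.31 um
# and 1.55 um telecommunications windows.
```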

The calculation of the combined effect of material and waveguide dispersion results in very long expressions in which it is difficult to highlight the relative effect of each contribution. Nevertheless, by making the assumption of weak guidance:

$$\frac{n_1 - n_2}{n_2} \simeq \frac{N_1 - N_2}{N_2} \ll 1 \qquad [39]$$

where N₁ and N₂ are the group indices in the core and the cladding, respectively, defined as in eqn [31], the complete expression can be drastically simplified to obtain, for the delay per unit length:

$$\frac{1}{V_g} = \frac{1}{c_0}\left[\, N_2 + (N_1 - N_2)\,\frac{d(bV)}{dV} \,\right] \qquad [40]$$

and for the total dispersion:

$$D_\lambda = D_2 + (D_1 - D_2)\,\frac{d(bV)}{dV} - (N_1 - N_2)\,\frac{N_2}{n_1}\,\frac{1}{\lambda c_0}\, V\,\frac{d^2(bV)}{dV^2} \qquad [41]$$

where

$$D_1 = -\frac{\lambda}{c_0}\frac{d^2 n_1}{d\lambda^2} \quad \text{and} \quad D_2 = -\frac{\lambda}{c_0}\frac{d^2 n_2}{d\lambda^2} \qquad [42]$$

are the material GVD in the core and the cladding, respectively. The first two terms in eqn [41] represent the contribution of material dispersion, weighted by the relative importance of the core and cladding materials for the propagating mode. In optical fibers, the difference between D₁ and D₂ is small, so that this contribution can often be well approximated by D₂, independently of any guiding effects. The last term represents the waveguide dispersion and is scaled by two factors:

• The core–cladding index difference, n₁ − n₂ ≃ N₁ − N₂: the waveguide dispersion will be significantly enhanced by increasing the index difference between core and cladding.
• The shape factor, V(d²(bV)/dV²): this factor depends uniquely on the shape of the refractive index profile and may substantially modify the spectral dependence of the waveguide dispersion, making possible a great variety of dispersion characteristics.

Figure 6 Propagation constant β as a function of the normalized frequency V in a step-index optical fiber. The single-mode region is the range 0 < V < 2.405.


Due to this degree of freedom brought by waveguide dispersion, it is possible to shift the zero GVD wavelength to the region of minimal attenuation at 1550 nm in silica optical fibers. Figure 7a shows the total group velocity dispersion of a step-index core fiber, together with the separate contributions of material and waveguide GVD. In this case, the material GVD is clearly the dominating contribution, while the small waveguide GVD results in a shift of the zero GVD wavelength from 1273 nm to 1310 nm. A larger shift could be obtained by increasing the core-cladding index difference, but this also gives rise to an increased attenuation from the doping, and no real benefit can be expected from such a modification. In Figure 7b, the core shows a triangular index profile to enhance the shape factor in eqn [41], so that the contribution of waveguide GVD is significantly increased with no impairing attenuation due to excessive doping. This makes it possible to realize the ideal situation of an optical fiber showing a zero GVD at the wavelength of minimum attenuation. These dispersion-shifted fibers (DSF) have now become successful in modern telecommunication networks. Nevertheless, the absence of dispersion favors the efficiency of nonlinear effects, and several classes of fibers are now proposed showing a small but nonzero GVD at 1550 nm, with positive or negative sign: the nonzero dispersion-shifted fibers (NZDSF). By interleaving fibers with positive and negative GVDs it is possible to propagate along the optical link in a locally dispersive medium, and thus minimize the impact of nonlinearities, while maintaining the overall GVD of the link close to zero and canceling any pulse spreading accordingly.

Figure 7 Material, waveguide and total group velocity dispersions for: (a) step-index fiber; (b) triangular profile fiber. The waveguide dispersion can be significantly enhanced by changing the shape of the index profile, making possible a shift of the zero GVD to the minimum attenuation window.

List of Units and Nomenclature

Chromatic dispersion [ps nm⁻¹ km⁻¹]: D_λ
Electric field [V m⁻¹]: E
Group index: N
Group velocity [m s⁻¹]: V_g
Group velocity dispersion (GVD) [s² m⁻¹]: D_ν
Linear attenuation coefficient [m⁻¹]: α
Medium susceptibility: χ
Normalized frequency: V
Optical frequency [s⁻¹]: ν
Propagation constant [m⁻¹]: β
Propagation delay [s]: t_D
Polarization density field [A s m⁻²]: P
Refractive index: n
Vacuum light velocity [m s⁻¹]: c₀
Wavelength [m]: λ

See also

Dispersion Management. Fiber and Guided Wave Optics: Overview. Fourier Optics. Nonlinear Optics, Basics: Kramers–Kronig Relations in Nonlinear Optics.

Further Reading

Ainslie BJ and Day CR (1986) A review of single-mode fibers with modified dispersion characteristics. IEEE Journal of Lightwave Technology LT-4(8): 967–979.
Bass M (ed.) (1994) Handbook of Optics, sponsored by the Optical Society of America, 2nd edn, vol. II, chaps. 10 & 33. New York: McGraw-Hill.
Buck JA (1995) In: Goodman JW (ed.) Fundamentals of Optical Fibers, chaps. 1, 5, 6. New York: Wiley Series in Pure and Applied Optics.


Gloge D (1971) Dispersion in weakly guiding fibers. Applied Optics 10: 2442–2445.
Jeunhomme LB (1989) Single Mode Fiber Optics: Principles and Applications. Optical Engineering Series, No. 23. New York: Marcel Dekker.
Marcuse D (1974) Theory of Dielectric Optical Waveguides. Series on Quantum Electronics – Principles and Applications, chap. 2. New York: Academic Press.
Marcuse D (1980) Pulse distortion in single-mode fibers. Applied Optics 19: 1653–1660.
Marcuse D (1981) Pulse distortion in single-mode fibers. Applied Optics 20: 2962–2974.

Marcuse D (1981) Pulse distortion in single-mode fibers. Applied Optics 20: 3573–3579.
Mazurin OV, Streltsina MV and Shvaiko-Shvaikovskaya TP (1983) Silica Glass and Binary Silicate Glasses (Handbook of Glass Data, Part A), Series on Physical Sciences Data, vol. 15, pp. 63–75. Amsterdam: Elsevier.
Murata H (1988) Handbook of Optical Fibers and Cables. Optical Engineering Series, No. 15, chap. 2. New York: Marcel Dekker.
Saleh BEA and Teich MC (1991) In: Goodman JW (ed.) Fundamentals of Photonics, chaps. 5, 8 & 22. New York: Wiley Series in Pure and Applied Optics.

Fabrication of Optical Fiber

D Hewak, University of Southampton, Southampton, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The drawing of optical fibers from silica preforms has, over a short period of time, progressed from the laboratory to become a manufacturing process capable of producing millions of kilometers of telecommunications fiber a year. Modern optical fiber fabrication processes produce low-cost fiber of excellent quality, with transmission losses close to the intrinsic loss limit. Today, fiber with a transmission loss of 0.2 dB per kilometer is routinely drawn through a two-stage process that has been refined since the 1970s.

Although fibers of glass have been fabricated and used for hundreds of years, it was not until 1966 that serious interest in the use of optical fibers for communication emerged. At that time, it was estimated that the optical transmission loss in bulk glass could be as low as 20 dB km⁻¹ if impurities were sufficiently reduced, a level at which practical applications became possible. However, no adequate fabrication techniques were then available to synthesize glass of high purity, and fiber-drawing methods were crude. Over the next five years, efforts worldwide addressed the fabrication of low-loss fiber. In 1970, a fiber with a loss of 20 dB km⁻¹ was achieved; it consisted of a titania doped core and a pure silica cladding. This result generated much excitement, and a number of laboratories worldwide actively began researching optical fiber. New fabrication techniques were introduced, and by 1986 fiber loss had been reduced close to the theoretical limit.

All telecommunications fiber that is fabricated today is made of silica glass, the most suitable material for low-loss fibers. Early fiber research studied multicomponent glasses, which are perhaps more familiar optical materials; however, low-loss fiber could not be realized in them, partly due to the lack of a suitable fabrication method. Today other glasses, in particular the fluorides and sulfides, continue to be developed for speciality fiber applications, but silica fiber dominates in most applications.

Silica is a glass of simple chemical structure containing only two elements, silicon and oxygen. It has a softening temperature of about 2,000 °C, at which it can be stretched, i.e., drawn into fiber. An optical fiber consists of a high purity silica glass core, doped with suitable oxide materials to raise its refractive index (Figure 1). This core, typically on the order of 2–10 microns in diameter, is surrounded by silica glass of lower refractive index. This cladding layer extends the diameter to typically 125 microns. Finally, a protective coating covers the entire structure. It is the phenomenon of total internal reflection at the core-cladding interface that confines light to the core and allows it to be guided.

The basic requirements of an optical fiber are as follows:

1. The material used to form the core of the fiber must have a higher refractive index than the cladding material, to ensure the fiber is a guiding structure.

Figure 1 Structure of an optical fiber.


2. The materials used must be low loss, providing transmission with negligible absorption or scattering of light.

3. The materials used must have suitable thermal and mechanical properties to allow them to be drawn down in diameter into a fiber.

Silica (SiO2) can be made into a glass relatively easily. It does not easily crystallize, which means that scattering from unwanted crystalline centers within the glass is negligible. This has been a key factor in the achievement of a low-loss fiber. Silica glass has high transparency in the visible and near-infrared wavelength regions, and its refractive index can be easily modified. It is stable and inert, providing excellent chemical and mechanical durability. Moreover, the purification of the raw materials used to synthesize silica glass is quite straightforward.

In the first stage of achieving an optical fiber, silica glass is synthesized by one of three main chemical vapor processes. All use silicon tetrachloride (SiCl4) as the main precursor, with various dopants added to modify the properties of the glass. The precursors are reacted with oxygen to form the desired oxides. The end result is a high purity solid glass rod with the internal core and cladding structure of the desired fiber.

In the second stage, the rod, or preform as it is known, is heated to its softening temperature and stretched down to a diameter of the order of 125 microns. Tens to hundreds of kilometers of fiber are produced from a single preform, which is drawn continuously, with minimal diameter fluctuations; the draw-down ratio follows from conservation of glass volume, as sketched below. During the drawing process one or more protective coatings are applied, yielding long lengths of strong, low-loss fiber, ready for immediate application.
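The arithmetic behind the draw is simple conservation of glass volume, a standard relation stated here as an assumption since the text does not give it explicitly: the preform feed rate times the preform cross-section equals the draw speed times the fiber cross-section. A minimal sketch with illustrative numbers:

```python
# Draw-down arithmetic from conservation of glass volume (assumed relation):
# v_feed * D_preform^2 = v_draw * d_fiber^2. All values are illustrative.
D_preform = 25e-3      # preform diameter [m]
d_fiber   = 125e-6     # fiber diameter [m]
v_feed    = 5e-3 / 60  # preform feed rate: 5 mm/min, in [m/s]

ratio = (D_preform / d_fiber)**2          # cross-section ratio (200^2 = 40000)
v_draw = v_feed * ratio
print(f"draw speed ~ {v_draw:.1f} m/s")   # ~3.3 m/s

L = 1.0 * ratio                           # fiber length per meter of preform
print(f"fiber per meter of preform ~ {L/1e3:.0f} km")   # ~40 km
```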

The key step in preparing a low-loss optical fiber is to eliminate, as completely as possible, transition metal and OH ion contamination during the synthesis of the silica glass. The three methods most commonly used to fabricate a glass optical fiber preform are: the modified chemical vapor deposition (MCVD) process; the outside vapor deposition (OVD) process; and the vapor-axial deposition (VAD) process. In a typical vapor-phase reaction, halide precursors undergo a high-temperature oxidation or hydrolysis to form the desired oxides. The completed chemical reaction for the formation of silica glass is, for oxidation:

SiCl4 + O2 → SiO2 + 2Cl2   [1]

For hydrolysis, which occurs when the deposition takes place in a hydrogen-containing flame, the reaction is:

SiCl4 + 2H2O → SiO2 + 4HCl   [2]

These processes produce fine, spherical glass particles with a size of the order of 0.1 microns. These glass particles, known as soot, are deposited and subsequently sintered into a bulk transparent glass. The key to low-loss fiber is the difference in vapor pressure between the desired halides and the transition metal halides that cause significant absorption loss at the wavelengths of interest: the carrier gas picks up a pure vapor of, for example, SiCl4, and any impurities are left behind. Part of the process requires the formation of the desired core/cladding structure in the glass. In all cases, silica-based glass is produced, with variations in refractive index produced by the incorporation of dopants. Typical dopants are germania (GeO2), titania (TiO2), alumina (Al2O3), and phosphorus pentoxide (P2O5) for increasing the refractive index, and boron oxide (B2O3) and fluorine (F) for decreasing it. These dopants also allow other properties to be controlled, such as the thermal expansion of the glass and its softening temperature. In addition, other materials, such as the rare-earth elements, have been used to fabricate the active fibers from which optical fiber amplifiers and lasers are made.

MCVD Process

Optical fibers were first produced by the MCVD method in 1974, a breakthrough that completely solved the technical problems of low-loss fiber fabrication (Figure 2). As shown schematically in Figure 3, the halide precursors are carried in the vapor phase by an oxygen carrier gas into a pure silica substrate tube. An oxyhydrogen burner traverses the length of the tube, heating it externally to temperatures of about 1,400°C, which oxidizes the halide vapors. The deposition temperature is high enough to form a soot of glassy particles, which are deposited on the inside wall of the substrate tube, but low enough to prevent the softened silica substrate tube from collapsing. The process usually takes place on a horizontal glass-working lathe. During the deposition process, the repeated traversing of the burner forms multiple layers of soot. Changes to the precursors entering the tube, and thus to the resulting glass composition, are introduced for the layers that will form the cladding and then the core. The MCVD method allows germania to be doped into the silica glass and permits precise control of the refractive index profile of the preform.

Figure 2 Fabrication of an optical fiber preform by the MCVD method.

Figure 3 Schematic of preform fabrication by the MCVD method.

When deposition is complete, the burner temperature is increased and the hollow, multilayered structure is collapsed to a solid rod. A characteristic of fiber formed by this process is a refractive index dip in the center of the core. In addition, the deposition takes place in a closed system, which dramatically reduces contamination by OH⁻ ions and maintains low levels of other impurities. The high temperature allows high deposition rates compared to traditional chemical vapor deposition (CVD), and large preforms, built up from hundreds of layers, can be produced. The MCVD method is still widely used today, though it has some limitations, particularly on the preform size that can be achieved and thus on the manufacturing cost. The diameter of the final preform is determined by the size of the initial silica substrate tube, which, because of the high purity required, accounts for a significant portion of the cost of the preform. A typical preform fabricated by MCVD yields about 5 km of fiber. Its success has spurred improvements to the process, in particular to address the fiber yield from a single preform.

OVD Process

The outside vapor deposition (OVD) method, also known as outside vapor phase oxidation (OVPO), synthesizes the fine glass particles within a burner flame. The precursors, oxygen, and fuel for the burner are introduced directly into the flame. The soot is deposited onto a target rod that rotates and traverses in front of the burner. As in MCVD, the preform is built up layer by layer, though now by depositing the core glass first and then building up the cladding layers over it. After the deposition process is complete, the preform is removed from the target rod and is collapsed and sintered into a transparent glass preform. The center hole remains, but disappears during the fiber-drawing process. This technique has advantages in both the size of preform that can be obtained and the fact that a high-quality silica substrate tube is no longer required. These two advantages combine to make a more economical process. From a single preform, several hundred kilometers of fiber can be produced.

VAD Process

The VAD process is the most recent refinement of the fabrication process, again developed to aid the mass production of high-quality fibers. In the VAD process, both core and cladding glasses are deposited simultaneously. As in OVD, the soot is synthesized and deposited by flame hydrolysis: as shown in Figure 4, the precursors are blown from a burner, oxidized, and deposited onto a silica target rod. Burners for the VAD process consist of a series of concentric nozzles: the first delivers an inert carrier gas and the main precursor SiCl4; the second delivers an inert carrier gas and the dopants; the third delivers hydrogen fuel; and the fourth delivers oxygen. Gas flows are up to a liter per minute and deposition rates can be very high. The main advantage is that this is a continuous process: as the soot that forms the core and cladding glasses is deposited axially onto the end of the rotating silica rod, the rod is slowly drawn upwards into a furnace that sinters and consolidates the soot into a transparent glass. The upward motion is such that the end at which deposition occurs remains in a fixed position, and the preform is essentially grown from this base. The advantages of the VAD process are a preform without a central index dip and, most importantly, the mass production associated with a continuous process. The glass quality produced is uniform, and the resulting fibers have excellent reproducibility and low loss.


Figure 4 Schematic of preform fabrication by the VAD method.

Other Methods of Preform Fabrication

There are other methods of preform fabrication, though these are not generally used for today's silica glass based fiber. Some methods, such as the fabrication of silica through sol–gel chemistry, cannot provide the large low-loss preforms obtained by chemical vapor deposition techniques. Other methods, such as direct casting of molten glass into rods, or formation of rods and tubes by extrusion, are better suited to, and find application in, other glass chemistries and speciality fibers.

Fiber Drawing

Once a preform has been made, the second step is to draw it down in diameter on a fiber drawing tower. The basic components and configuration of the drawing tower have remained unchanged for many years, although furnace design has become more sophisticated and processes such as coating have become automated. In addition, fiber drawing towers have increased in height to allow faster pulling speeds (Figure 5). Essentially, the fiber drawing process takes place as follows. The preform is held in a chuck mounted on a precision feed assembly that lowers the preform into the drawing furnace at a speed that matches the volume of preform entering the furnace to the volume of fiber leaving the bottom of the furnace. Within the furnace, the fiber is drawn from the molten zone of glass down the tower to a capstan, which controls the fiber diameter, and then onto a drum. Immediately below the furnace, a noncontact device measures the diameter of the fiber. This information is fed back to the capstan, which speeds up to reduce the diameter or slows down to increase it. In this way, a fiber of constant diameter is produced. One or more coatings are also applied in-line to maintain the pristine surface quality as the fiber leaves the furnace, and thus to maximize the fiber strength. The schematic drawing in Figure 6 shows a fiber drawing tower and its key components. Draw towers are commercially available, ranging in height from 3 m to greater than 20 m, with the tower height increasing as the draw speed increases. Typical drawing speeds are of the order of 1 m s⁻¹. The increased height is needed to allow the fiber to cool sufficiently before entering the coating applicator, although forced air-cooling of the fiber is commonplace in a manufacturing environment.

Figure 5 A commercial scale optical fiber drawing tower.
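The matching of feed and draw speeds described above is a simple volume-conservation relation: the draw speed scales as the square of the preform-to-fiber diameter ratio. A minimal sketch of the arithmetic, with illustrative numbers that are not taken from this article:

```python
# Volume conservation in fiber drawing: preform volume in = fiber volume out,
# so v_draw = v_feed * (D_preform / d_fiber)**2.

D_preform = 50e-3    # preform diameter, m (illustrative)
d_fiber = 125e-6     # target fiber diameter, m
v_feed = 6e-6        # preform feed speed, m/s (illustrative)

ratio = (D_preform / d_fiber) ** 2          # draw-down ratio, here 160,000
v_draw = v_feed * ratio
print(f"draw speed = {v_draw:.2f} m/s")     # ~0.96 m/s, of the order of 1 m/s

# Fiber length obtainable from 1 m of preform at this diameter:
print(f"fiber per metre of preform = {1.0 * ratio / 1e3:.0f} km")  # ~160 km
```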


Figure 6 Schematic diagram of an optical fiber drawing tower and its components.

Furnaces

A resistance furnace is perhaps the most common and economical heating source used in a fiber drawing tower. A cylindrical furnace, with graphite or tungsten elements, heats the preform by blackbody radiation. These elements must be surrounded by an inert gas, such as nitrogen or argon, to prevent oxidation. Gas flow and control are critical to prevent variations in the diameter of the fiber. In addition, the high temperature of the elements may lead to contamination of the preform surface, which can then weaken the fiber. An induction furnace provides precise, clean heating through radio frequency (RF) energy that is inductively coupled to a zirconia susceptor ring. This type of furnace is cleaner than the graphite resistance type and is capable of running continuously for several months. It also has the advantage that it does not require a protective inert atmosphere, so the preform is drawn in a turbulence-free environment. These advantages result in high-strength fibers with good diameter control.

Diameter Measurement

A number of noncontact methods can be applied to measure fiber diameter, including laser scanning, interferometry, and light scattering techniques. To obtain precise control of the fiber diameter, the deviation between the desired and measured diameters is fed into the diameter control system. To cope with high drawing speeds, sampling rates as high as 1,000 measurements per second are used. Diameter control is strongly affected by the gas flow in the drawing furnace, and less so by furnace temperature variations. The furnace gas flow can be used to suppress fast diameter fluctuations; this is used in combination with drawing-speed control to suppress both fast and slow fluctuations. Current manufacturing processes are capable of producing several hundreds of kilometers of fiber with diameter variations of ±1 micron. There are two major sources of fiber diameter fluctuation: short-term fluctuations caused by temperature fluctuations in the furnace, and long-term fluctuations caused by variations in the outer diameter of the preform. Careful control of the furnace temperature, the length of the hot zone, and the flow of gas minimizes the short-term fluctuations. Through optimization of these parameters, diameter errors of less than ±0.5 microns can be realized. The long-term fluctuations are controlled by the feedback mechanism between the diameter measurement and the capstan.

Fiber Coating
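The feedback described above can be illustrated with a toy proportional loop: the capstan speeds up when the fiber runs thick and slows when it runs thin. This is only a sketch of the principle; the gain, gauge readings, and control law are invented for illustration, and production controllers are far more elaborate:

```python
# Toy proportional control of fiber diameter via capstan speed.
# Speeding up the capstan thins the fiber; slowing it down thickens it.

target = 125.0    # target diameter, microns
speed = 1.0       # capstan speed, m/s
k_p = 0.02        # proportional gain per micron of error (illustrative)

for measured in (126.0, 125.5, 125.2, 125.05):   # simulated gauge readings
    error = measured - target                    # positive -> fiber too thick
    speed *= 1.0 + k_p * error                   # increase speed to thin it
    print(f"measured {measured:6.2f} um -> capstan speed {speed:.4f} m/s")
```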

A fiber coating is primarily used to preserve the strength of a newly drawn fiber, and therefore must be applied immediately after the fiber leaves the furnace. The fiber coating apparatus is typically located below the diameter measurement gauge, at a distance determined by the speed of fiber drawing, the tower height, and whether there is external cooling of the fiber. Coating is usually one of the limiting factors on the speed of fiber drawing. To minimize any damage or contamination of the pristine fiber leaving the furnace, this portion of the tower, from the furnace to the application of the first coating, is enclosed in a clean, filtered-air chamber. A wide variety of coating materials has been applied; however, conventional commercial fiber generally relies on a UV-curable polymer as the primary coating. Coating thickness is typically 50–100 microns. Subsequent coatings can then be applied for specific purposes. A dual coating is often used, with an inner primary coating that is soft and an outer secondary coating that is hard. This combination of low and high elastic modulus can minimize stress on the fiber and reduce bending loss.


Alternative coatings include hermetic coatings of a low-melting-temperature metal, ceramics, or amorphous carbon. These can be applied in-line before the polymeric coating. Metallic coatings are applied by passing the fiber through a molten metal, while ceramic or amorphous coatings utilize an in-line chemical vapor deposition reactor.

Types of Fiber

The majority of silica fiber drawn today is single mode. This structure consists of a core whose diameter is chosen such that, for a given refractive index difference between core and cladding, only a single guided mode propagates at the wavelength of interest. With a discrete index step between core and cladding, such a fiber is often referred to as a step-index fiber. In typical telecommunications fiber, single-mode operation is obtained with core diameters of 2–10 microns and a standard outer diameter of 125 microns. Multimode fiber has considerably larger core diameters, typically 50, 62.5, 85, or 110 microns, again with a cladding diameter of 125 microns. Multimode fibers are often graded index; that is, the refractive index is a maximum in the center of the fiber and decreases smoothly in the radial direction until the lower cladding index is reached. Multimode fibers find use in nontelecommunication applications, for example optical fiber sensing and medicine. Single-mode fibers that are capable of maintaining a linear polarization launched into the fiber are known as polarization-preserving fibers. The structure of these fibers provides a birefringence that removes the degeneracy of the two possible polarization modes. This birefringence is a small difference in the effective refractive index of the two guided polarization modes, and it is commonly achieved in one of two ways: an elliptical core in the fiber, or stress rods, which modify the refractive index along one orientation. These fiber structures are shown in Figure 7. While the loss minimum of silica-based fiber is near 1.55 microns, step-index single-mode fiber has zero dispersion close to a wavelength of 1.3 microns, and the dispersion at the loss minimum is considerable. A modification of the structure of the fiber, in particular a segmented refractive index profile in the core, can be used to shift this dispersion minimum to 1.55 microns. This fiber, illustrated in Figure 7, is known as dispersion-shifted fiber. Similarly, fibers with relatively low dispersion over a wide wavelength range, known as dispersion-flattened fibers, can be obtained by the use of multiple cladding layers.

Figure 7 Structure of (a) dispersion shifted fiber and (b) two methods of achieving polarization preserving fiber.

List of Units and Nomenclature

Dopants: Elements or compounds added, usually in small amounts, to a glass composition to modify its properties.

Fiber drawing: The process of heating, and thus softening, an optical fiber preform and then drawing out a thin thread of glass.

Fiber loss: The transmission loss of light as it propagates through a fiber, usually measured in dB of loss per unit length of fiber. Loss can occur through the absorption of light in the core or the scattering of light out of the core.

Glass: An amorphous solid formed by cooling from the liquid state to a rigid solid with no long-range structure.

Modified chemical vapor deposition (MCVD): A process for the fabrication of an optical preform in which gases flow into the inside of a rotating tube, are heated, and react to form particles of glass, which are deposited onto the wall of the glass tube. After deposition, the glass particles are consolidated into a solid preform.

Outside vapor deposition (OVD): A process for the fabrication of an optical preform in which glass soot particles are formed in an oxy-hydrogen flame and deposited on a rotating rod. After deposition, the glass particles are consolidated into a solid preform.

Preform: A fiber blank; a bulk glass rod consisting of a core and cladding glass composite, which is drawn into fiber.

Refractive index: A characteristic property of glass, defined as the ratio of the speed of light in a vacuum to the speed of light in the material.

Silica: A transparent glass formed from silicon dioxide.

Vapor-axial deposition (VAD): A process similar to OVD, in which the core and cladding layers are deposited simultaneously.

See also

Detection: Fiber Sensors. Fiber and Guided Wave Optics: Dispersion; Light Propagation; Measuring Fiber Characteristics; Nonlinear Effects (Basics); Optical Fiber Cables. Fiber Gratings. Optical Amplifiers: Erbium Doped Fiber Amplifiers for Lightwave Systems. Optical Communication Systems: Architectures of Optical Fiber Communication Systems; Historical Development; Wavelength Division Multiplexing. Optical Materials: Optical Glasses.

Further Reading

Agrawal GP (1992) Fiber-Optic Communication Systems. New York: John Wiley & Sons.
Bass M and Van Stryland EV (eds) (2001) Fiber Optics Handbook: Fiber, Devices, and Systems for Optical Communications. New York: McGraw-Hill Professional Publishing.
Fujiura K, Kanamori T and Sudo S (1997) Fiber materials and fabrications. In: Sudo S (ed.) Optical Fiber Amplifiers, ch. 4, pp. 193–404. Boston, MA: Artech House.
Goff DR and Hansen KS (2002) Fiber Optic Reference Guide: A Practical Guide to Communications Technology, 3rd edn. Oxford, UK: Butterworth-Heinemann.
Hewak D (ed.) (1998) Glass and Rare Earth Doped Glasses for Optical Fibres. EMIS Datareview Series. London: The Institution of Electrical Engineers.
Keck DB (1981) Optical fibre waveguides. In: Barnoski MK (ed.) Fundamentals of Optical Fiber Communications. New York: Academic Press.
Li T (ed.) (1985) Optical Fiber Communications: Fiber Fabrication. New York: Academic Press.
Personick SD (1985) Fiber Optics: Technology and Applications. New York: Plenum Press.
Schultz PC (1979) In: Bendow B and Mitra SS (eds) Fiber Optics. New York: Plenum Press.

Light Propagation

F G Omenetto, Los Alamos National Laboratory, Los Alamos, NM, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Optical fibers have become a mainstay of our modern way of life. From phone calls to the internet, their presence is ubiquitous. The unmatched ability of optical fibers to transmit light is an amazing asset that, besides being firmly part of our present, is poised to shape the future.

The unique ability of optical fibers to transmit light rests on the well-known physical concept of total internal reflection. This phenomenon is described by considering the behavior of light traveling from one material (A) into a different material (B). When light traverses the interface between the two materials at a given angle, its direction of propagation is altered with respect to the normal to the surface according to the well-known Snell law, which relates the incident angle (onto the interface between A and B) to the refracted angle (away from the interface between A and B). This is written as nA sin i = nB sin r, where i is the angle of incidence, r the angle of refraction, and nA and nB are the refractive indices of materials A and B. Under appropriate conditions, a critical value of the incidence angle (iCR) exists for which the light does not propagate into B and is totally reflected back into A. This value is given by iCR = arcsin(nB/nA). It follows that total internal reflection occurs when light is traveling from a medium with a higher index of refraction into a medium with a lower index of refraction. To a first approximation, this is the basic idea behind light propagation in fibers. Optical fibers are glass strands comprising a concentric core with a cladding wrapped around it. The index of refraction of the core is slightly higher than the index of refraction of the cladding, thereby creating the conditions necessary for the confinement of light within the core. Such index mismatches between core and cladding are obtained by using different materials for each, by doping the glass matrix before pulling the fiber to create an index gradient, by introducing macroscopic defects such as air holes in the structure of the fiber cladding (as in the recently developed photonic crystal fibers), or by using materials other than glass, such as polymers or plastic.

The real physical picture of light guiding in fibers is more complex than the conceptual description stated above. The light traveling in the fiber is, more precisely, an electromagnetic wave with frequencies that lie in the optical range, and the optical fiber itself constitutes an electromagnetic waveguide. As such, the waveguide supports guided electromagnetic modes which can take on a multitude of shapes (i.e., of energy distribution profiles), depending on the core/cladding index of refraction profiles. These profiles are often complex and determine the physical boundary conditions that govern the field distribution within the fiber. The nature of the fiber modes has been treated extensively in the literature, where an in-depth description of the physical solutions can be found.

There are many distinguishing features that identify different optical fiber types. Perhaps the most important one, in relation to the electromagnetic modes that the fiber geometry supports, is the distinction between single-mode fibers and multimode fibers. As the name implies, a single-mode fiber is an optical fiber whose index of refraction profile between core and cladding is designed in such a way that there is only one electromagnetic field distribution that the fiber can support and transmit; that is to say, light is propagated very uniformly across the fiber. A very common commercial fiber of this kind is the Corning SMF-28 fiber, which is employed in a variety of telecommunication applications. Multimode fibers have larger cores and are easier to couple light into, yet the fact that many modes are supported implies that there is less control over the behavior of light during its propagation, since it is more difficult to control each individual mode. The condition for single-mode propagation in an optical fiber is determined during the design and manufacture of the fiber by its geometrical parameters and is specific to a certain operating wavelength.
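As a numerical illustration of the critical angle relation above, with index values chosen to be representative of a weakly guiding silica fiber (the numbers are assumptions, not taken from this article):

```python
import math

# Critical angle from Snell's law: light goes from A (core, higher index)
# toward B (cladding, lower index); TIR occurs for incidence beyond i_CR.

n_A = 1.450   # core index (illustrative)
n_B = 1.445   # cladding index (illustrative)

i_cr = math.degrees(math.asin(n_B / n_A))
print(f"critical angle = {i_cr:.2f} deg from the normal")  # ~85.2 deg
```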

A quantity V is defined as V = (2π/λ)·a·(n²core − n²cladding)^(1/2), where a is the core radius and λ is the wavelength of light. For example, a step-index fiber (i.e., a fiber where the transition from the index of refraction of the core to that of the cladding is abrupt) supports a single electromagnetic mode if V < 2.405.

Light propagation in optical fibers, however, is affected by numerous other factors besides the electromagnetic modes that the fiber geometry allows. First, the ability to transmit light efficiently in an optical fiber depends on the optical transparency of the medium of which the fiber is made. The optical transparency of the medium is a function of the wavelength (λ) of the light that is to be transmitted through the fiber. Optical attenuation is particularly high for shorter wavelengths, and the transmission losses at ultraviolet wavelengths are considerably higher than in the near infrared. The latter region, owing to this favorable feature, is the preferred region of operation for optical fiber based telecommunication applications. This ultimate physical limit is set by the process of light being scattered off the atoms that constitute the glass, a process called Rayleigh scattering. The losses due to Rayleigh scattering are inversely proportional to the fourth power of the wavelength (losses ∝ λ⁻⁴), and therefore become quite considerable as λ is decreased. The ability of optical fibers to transmit light over a certain set of wavelengths is quantified by their optical attenuation. This parameter is defined from the ratio of the input power to the output power and is usually measured in units of decibels (dB), i.e., attenuation (dB) = 10 log10(Pin/Pout). The values found in the literature relate optical attenuation to distance and express the attenuation per unit length (such as dB/km). Single-mode fibers used in telecommunications are usually manufactured with doped silica glasses designed to work in the near infrared, in a wavelength range between 1.3 and 1.6 µm (1 µm = 10⁻⁶ m), which provides the lowest levels of Rayleigh scattering and lies below the wavelength (~1.65 µm) at which the intrinsic infrared absorption of silica sets in. Such fibers, which have been developed and refined for decades, have attenuation losses of less than 0.2 dB/km at a wavelength of 1.55 µm. Many other materials have been used to meet specific optical transmission needs. While silica remains by far the most popular material for optical fiber manufacturing, other glasses are better suited to different portions of the spectrum: quartz glass can be particularly transmissive in the ultraviolet, whereas plastic is sometimes conveniently used for short-distance transmission in the visible.
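The single-mode condition lends itself to a quick numerical check. The sketch below uses illustrative values for the core radius and indices (assumptions, not values from this article), and also inverts V = 2.405 to estimate the cut-off wavelength above which the fiber is single mode:

```python
import math

# V = (2*pi/lambda) * a * sqrt(n_core**2 - n_clad**2); single mode if V < 2.405.

wavelength = 1.55e-6            # operating wavelength, m
a = 4.1e-6                      # core radius, m (illustrative)
n_core, n_clad = 1.450, 1.445   # illustrative indices

NA_term = math.sqrt(n_core**2 - n_clad**2)
V = (2 * math.pi / wavelength) * a * NA_term
print(f"V = {V:.2f} -> {'single mode' if V < 2.405 else 'multimode'}")

# Cut-off wavelength: the wavelength at which V = 2.405 for this geometry.
lam_c = 2 * math.pi * a * NA_term / 2.405
print(f"cut-off wavelength ~ {lam_c * 1e6:.2f} um")
```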


The ability to collect light at the input of the fiber represents another crucial, defining parameter. Returning to total internal reflection, this effect can only be achieved by sending light onto the interface at a suitable angle. In a fiber, the light needs to be coupled into a circular core so that it reaches the core-cladding interface at such an angle. The measure of the ability to couple light into the fiber is referred to as the coupling efficiency. The core dimension defines the geometry of the coupling process, which in turn determines the efficiency of the propagation of light inside the fiber. In order to be coupled efficiently, and therefore guided within the fiber, the light has to be launched within a range of acceptable angles. Such a range is usually defined by, or derivable from, the numerical aperture (NA) of the fiber, which is approximately NA = (n²core − n²cladding)^(1/2), or can alternatively be described as a function of the critical angle for total internal reflection within the fiber by NA = ncore sin iCR. Light also has to be focused into the core of the fiber, and therefore the interplay between core size, focusability, and NA (i.e., acceptance angle) of the fiber influences the coupling efficiency into the fiber (which in turn affects transmission). Typical telecommunication fibers have a 10 µm core (an NA of ~0.2) and have very high coupling efficiencies (of the order of 70–80%). Specialized single-mode fibers (such as photonic crystal fibers) can have core diameters down to a few microns, significantly lower coupling efficiencies, and high numerical apertures. Multimode fibers, in contrast, are easy to couple into, given their large core size.

Other losses during propagation arise from the physical path that the actual optical fiber follows. A typical optical fiber strand is rarely laid along a straight line. The resulting losses are called bending losses and can be understood intuitively: when the fiber is straight, the light meets the total internal reflection condition at the core-cladding interface, but when the fiber bends, the interface geometry is altered and the incidence conditions of the light at the interface change. If the bend is severe, light is no longer totally reflected and escapes from the fiber.

Another important issue influencing propagation arises when the light that travels through the fiber is sent in bursts, or pulses. Pulsed light is very important in fiber transmission because the light pulse is the carrier of optical information, most often representing a binary digit of data. The ability to transmit pulses reliably is at the heart of modern-day telecommunication systems. The features that affect the transmission of pulses through optical fibers depend, again, largely on the structure and constituents of the fiber.
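The numerical aperture expression above is easy to evaluate. Below, the indices are chosen (as an assumption) so that the NA comes out near the ~0.2 quoted for typical telecommunication fiber:

```python
import math

# NA = sqrt(n_core**2 - n_clad**2); the acceptance half-angle in air
# follows from sin(theta_max) = NA.

n_core, n_clad = 1.4630, 1.4495   # illustrative indices

NA = math.sqrt(n_core**2 - n_clad**2)
theta_max = math.degrees(math.asin(NA))
print(f"NA = {NA:.3f}, acceptance half-angle = {theta_max:.1f} deg")  # ~0.198, ~11.4 deg
```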

One of the most important effects on pulsed light is dispersion. This phenomenon causes the broadening in time of light pulses during their travel through the fiber. It can be readily explained by a Fourier analogy: a pulse that is short in time (such as the 10–20 picosecond pulses used for optical communications; 1 picosecond = 10⁻¹² seconds) has a large frequency content (i.e., is composed of a variety of 'colors'). Since the speed of light through bulk media, and therefore through the fiber constituents, depends on wavelength, different 'colors' travel at different speeds, arriving at the end of the fiber slightly delayed with respect to one another. This causes a net pulse broadening that depends on the properties of the fiber, which are captured in the dispersion curve of the fiber. This type of dispersion is called chromatic dispersion and is dependent on the distance the pulse travels in the fiber. A measure of the pulse broadening can be calculated by multiplying the dispersion coefficient of the fiber at the wavelength of operation, measured in ps/(nm·km), by the spectral width of the pulse and the distance in kilometers that the pulse travels. There are other types of dispersion that affect pulses traveling through fibers: light with different polarizations travels at different speeds, causing a problem analogous to chromatic dispersion, and in multimode fibers different modes travel with different group velocities. Dispersion issues are extremely important in the design of next-generation telecommunication systems. As a push is made toward higher transmission capacity, pulses become shorter and these issues have to be dealt with carefully. In particular, the transmission of light in optical fibers becomes more complex because the interaction between the light and the fiber material constituents becomes nonlinear. As pulses become shorter, their energy is confined in a smaller temporal interval, making their peak power (approximately the pulse energy divided by the pulse duration) increase. If the peak power becomes sufficiently high, the atoms that form the glass of which the fiber is made respond in a nonlinear fashion, adding effects that need to be addressed with caution. Energetic pulses at a certain wavelength change their shape during propagation and can become severely distorted or change wavelength (i.e., undergo a frequency conversion process), thereby destroying the information that they were meant to carry. Several physical manifestations of optical nonlinearity affect pulsed light propagation, giving rise to new losses but also providing new avenues for efficient transmission.
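A short worked example of the chromatic broadening estimate described above, with a dispersion coefficient, spectral width, and span length that are illustrative assumptions rather than values from this article:

```python
# Chromatic broadening: delta_t [ps] = D [ps/(nm km)] * delta_lambda [nm] * L [km].

D = 17.0            # dispersion coefficient near 1.55 um, ps/(nm km) (assumed)
delta_lambda = 0.1  # source/signal spectral width, nm
L = 100.0           # fiber length, km

delta_t = D * delta_lambda * L
print(f"pulse broadening ~ {delta_t:.0f} ps over {L:.0f} km")  # ~170 ps
```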


Perhaps the most successful example of a positive outcome of nonlinear effects in fibers is the optical soliton. This is an extremely stable propagating pulse that is transmitted through the fiber undistorted, the result of a peculiar balance in which linear and nonlinear distortion cancel each other when the pulse operates in a specific region of dispersion of the optical fiber (the anomalous dispersion region). Nonlinear effects in fibers are an extremely rich area of study which, despite several drawbacks, carries immense opportunities if the nonlinearities can be controlled and exploited.

See also

Dispersion Management. Nonlinear Optics, Applications: Pulse Compression via Nonlinear Optics. Solitons: Optical Fiber Solitons, Physical Origin and Properties.

Further Reading

Agrawal GP (2001) Nonlinear Fiber Optics. San Diego: Academic Press.
Hecht J (2001) Understanding Fiber Optics. New York: Prentice Hall.
Snyder AW and Love JD (1983) Optical Waveguide Theory. Boston: Kluwer Academic Press.

Measuring Fiber Characteristics

A Girard, EXFO, Quebec, Canada

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Optical fibers divide into two types: multimode and singlemode. Each type is used in different applications and wavelength ranges, and is consequently characterized differently; the corresponding test methods also vary. The optical fiber characteristics may be divided into four categories:

. the optical characteristics (transmission related);
. the dimensional characteristics;
. the mechanical characteristics; and
. the environmental characteristics.

These categories will be reviewed in the following sections, together with their corresponding test methods.

Fiber Optical Characteristics and Corresponding Test Methods

The following sections describe the following optical characteristics:

. attenuation;
. macrobending sensitivity;
. microbending sensitivity;
. cut-off wavelength;
. multimode fiber bandwidth;
. differential mode delay for multimode fibers;
. chromatic dispersion;
. polarization mode dispersion;
. polarization crosstalk; and
. nonlinear effects.

Attenuation

The spectral attenuation of an optical fiber follows an exponential power decay from the power level at a cross-section 1 to the power level at a cross-section 2, over a fiber length L, as follows:

P2(λ) = P1(λ)·e^(−γ(λ)L)   [1]

where P1(λ) is the optical power transmitted through the fiber core at cross-section 1, expressed in mW; P2(λ) is the optical power transmitted through the fiber core at cross-section 2, a distance L from cross-section 1, expressed in mW; γ(λ) is the spectral attenuation coefficient in linear units, expressed in km⁻¹; and L is the fiber length, expressed in km. Attenuation may be characterized at one or more specific wavelengths, or as a function of wavelength; in the latter case it is referred to as spectral attenuation. Figure 1 illustrates such power decay. Equation [1] may be expressed in relative units as follows:

log10 P2(λ) = log10 P1(λ) − γ(λ)·L·log10 e   [2]

P is expressed in dBm units using the following definition.

Figure 1 Power decay in an optical fiber due to attenuation.


The power in dBm is equal to 10 times the logarithm in base 10 of the power in mW, or:

0 dBm = 10 log10(1 mW)   [3]

Then:

γ(λ) [km⁻¹] = [(10 log10 P1) − (10 log10 P2)]/(10·L·log10 e)   [4]

α(λ) [dB/km] = [P1(dBm) − P2(dBm)]/L   [5]

A new relative attenuation coefficient α(λ), with units of dB/km, has been introduced; it is related to the linear coefficient γ(λ), with units of km⁻¹, as follows:

α(λ) = 10·(log10 e)·γ(λ) ≈ 4.343·γ(λ)   [6]

α(λ) is the spectral attenuation coefficient of a fiber and is illustrated in Figure 2.

Test methods for attenuation

The attenuation may be measured by the following methods:

. the cut-back method;
. the backscattering method; and
. the insertion loss method.

Cut-back method. The cut-back method is a direct application of the definition, in which the power levels P1 and P2 are measured at two points of the fiber without changing the input conditions. P2 is the power emerging from the far end of the fiber, and P1 is the power emerging from a point near the input after cutting the fiber. The output power P2 is recorded from the fiber under test (FUT) placed in the measurement setup. Keeping the launching conditions fixed, the FUT is cut to the cut-back length, for example 2 m from the launching point. The FUT attenuation, between the points where P1 and P2 have been measured, is calculated using P1 and P2 in the definition equations provided above.

Figure 2 Typical spectral attenuation of a singlemode fiber.

Backscattering method. The attenuation coefficient of a singlemode fiber is characterized using bidirectional backscattering measurements. This method is also used for:

. attenuation uniformity;
. optical continuity;
. physical discontinuities;
. splice losses; and
. fiber length.

An optical time domain reflectometer (OTDR) is used for performing such characterization. Adjustment of the laser pulsewidth and power is used to obtain a compromise between resolution (a shorter pulsewidth provides better resolution, but at lower power) and dynamic range/fiber length (higher power provides better dynamic range, but with a longer pulsewidth). An example of such an instrument is shown in Figure 3. The measurement is applicable to various FUT configurations (e.g., cabled fiber in production or deployed in the field, fiber on a spool, etc.). Two unidirectional backscattering curves are obtained, one from each end of the fiber (Figure 4). Each backscattering curve is recorded on a logarithmic scale, avoiding the parts at the two ends of the curves, which are affected by parasitic reflections.

Figure 3 The backscattering method (OTDR).

Figure 4 Unidirectional OTDR backscattering loss measurement.

The FUT length is found from the time interval between the two ends of the backscattering loss curve, together with the FUT group index of refraction, ng, as:

Lf = cfs·ΔTf/ng   [7]

where cfs is the velocity of light in free space.
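A worked example of eqn [7]. Whether the time interval read from the instrument is one-way or must be halved for the round trip depends on the OTDR's convention; a one-way interval is assumed here, to match the equation as written:

```python
# Fiber length from the OTDR time interval and the group index (eqn [7]).

c_fs = 2.99792458e8   # speed of light in free space, m/s
n_g = 1.468           # group index of a typical silica fiber (assumed)
delta_T = 250e-6      # time interval between trace ends, s (illustrative)

L_f = c_fs * delta_T / n_g
print(f"fiber length = {L_f / 1e3:.1f} km")   # ~51.1 km
```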

The bidirectional backscattering curve is obtained from the two unidirectional backscattering curves by calculating the average loss between them. The end-to-end FUT attenuation coefficient is obtained from the difference between two losses divided by the difference of their corresponding distances.

Insertion loss method. The insertion loss method consists of measuring the power loss due to the insertion of the FUT between a launching and a receiving system that were previously interconnected (the reference condition). The powers P1 and P2 are evaluated in a less straightforward way than in the cut-back method; therefore, this method is not intended for use in manufacturing. The insertion loss technique is less accurate than the cut-back method, but has the advantage of being non-destructive for the FUT. It is therefore particularly suitable in the field.

Macrobending Sensitivity

Macrobending sensitivity is the property by which a certain amount of light leaks into the cladding (a loss) when the fiber is bent and the bending angle is such that the condition of total internal reflection is no longer met at the core-cladding interface. Figure 5 illustrates such a case. Macrobending sensitivity is a direct function of the wavelength: the longer the wavelength and/or the smaller the bending diameter, the more loss the fiber experiences. It is recognized that 1625 nm is a wavelength that is very sensitive to macrobending (see Figure 6). The macrobending loss is measured by the power monitoring method (OTDR, especially for field assessment) or by the cut-back method.

Figure 5 Total internal reflection and macrobending effect on the light rays.

Figure 6 Macrobending sensitivity as a function of wavelength.

Microbending Sensitivity

Microbending is a fiber property by which the core-cladding concentricity changes randomly along the fiber length, causing the core to wobble inside the cladding. Four methods are available for characterizing microbending sensitivity in optical fibers:

. expandable drum, for singlemode fibers and optical fiber ribbons over a wide range of applied linear pressures or loads;
. fixed-diameter drum, for step-index multimode, singlemode, and ribbon fibers at a fixed linear pressure;
. wire mesh and applied loads, for step-index multimode and singlemode fibers over a wide range of applied linear pressures or loads; and
. 'basketweave' wrap on a fixed-diameter drum, for singlemode fibers.

The results from the four methods can only be compared qualitatively. The test is nonroutine for general evaluation of optical fiber.

Cut-off Wavelength

The cut-off wavelength is the shortest wavelength at which a single mode can propagate in a singlemode fiber. This parameter can be computed from the fiber refractive index profile (RIP). At wavelengths below the cut-off wavelength, several modes propagate and the fiber is no longer singlemode, but multimode. In optical fibers, the change from multimode to singlemode behavior does not occur at a single specific wavelength, but rather over a smooth transition as a function of wavelength. Consequently, from a fiber-optic network standpoint, the actual threshold wavelength for singlemode performance is more critical, and an effective cut-off wavelength is used: it is defined as the wavelength above which the ratio between the total power, including launched higher-order modes, and the fundamental mode power has decreased to less than 0.1 dB. According to this definition, the second-order mode LP11 undergoes 19.3 dB more attenuation than the fundamental LP01 mode when the modes are equally excited. Because the cut-off wavelength depends on the fiber length, bends, and strain, it is defined on the basis of the following three cases:

. fiber cut-off wavelength;
. cable cut-off wavelength; and
. jumper cable cut-off wavelength.

Fiber cut-off wavelength: The fiber cut-off wavelength λfc is defined for uncabled primary-coated fiber and is measured over 2 m with one loop of 140 mm radius loosely constrained, with the rest of the fiber kept essentially straight. The presence of a primary coating on the fiber usually will not affect λfc; however, the presence of a secondary coating may result in λfc being significantly shorter than that of the primary-coated fiber.

Cable cut-off wavelength: The cable cut-off wavelength is measured prior to installation on a substantially straight 22 m cable length prepared by exposing 1 m of primary-coated fiber at both ends, the exposed ends each incorporating a 40 mm radius loop. Alternatively, this parameter may be measured on 22 m of primary-coated uncabled fiber in the same configuration as for the λfc measurement.

Jumper cable cut-off wavelength: The jumper cable cut-off wavelength is measured over 2 m with one loop of 76 mm radius, or equivalent (e.g., a split mandrel), with the rest of the jumper cable kept essentially straight.

Multimode Fiber Bandwidth

The −3 dB bandwidth of a multimode optical fiber (or modal bandwidth) is defined as the lowest frequency at which the magnitude of the baseband frequency response in optical power has decreased by 3 dB relative to the power at zero frequency. Modal bandwidth is also called intermodal dispersion, as it takes into account the dispersion between the modes of propagation of the signal transmitted through the multimode fiber. Various ways of reporting the results are available, but the results are typically expressed in terms of the −3 dB (optical power) frequency. Figure 7 illustrates modal bandwidth. The bandwidth or pulse broadening may be normalized to a unit length, such as GHz·km or ns/km.

Figure 7 Determination of modal bandwidth.
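When the modal bandwidth is quoted as a normalized bandwidth-distance product, a common first-order estimate of the usable −3 dB bandwidth of a link divides that product by the link length. This linear scaling is an idealization (real multimode links can deviate from it), and the numbers below are purely illustrative:

```python
# Idealized length scaling of modal bandwidth: B(L) ~ B1 / L,
# where B1 is the bandwidth-distance product in GHz km.

B1 = 1.0    # normalized modal bandwidth, GHz km (illustrative)
L = 0.3     # link length, km (300 m)

B = B1 / L
print(f"approximate -3 dB modal bandwidth over {L * 1e3:.0f} m: {B:.1f} GHz")
```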


Two methods are available for determining the transmission capacity of multimode fibers:

. the frequency domain measurement method, in which the baseband frequency response is directly measured in the frequency domain by determining the fiber response to a sinusoidally modulated light source; or
. the optical time domain measurement method (pulse distortion), in which the baseband response is measured by observing the broadening of a narrow pulse of light.

Differential Mode Delay for Multimode Fibers

Differential mode delay (DMD) characterizes the modal structure of a graded-index glass-core multimode fiber. DMD is useful for assessing the bandwidth performance of a fiber when used with short-pulse, narrow-spectral-width laser sources. The output from a singlemode probe fiber excites the multimode FUT at the test wavelength. The probe spot is scanned across the FUT endface, and the optical pulse delay is determined at specified radial offset positions between an inner and an outer limit. DMD is the difference in optical pulse delay time between the fastest and slowest modes of the FUT excited over all radial offset positions between and including the inner and outer limits. The critical issues influencing DMD are the temporal width of the optical pulse, jitter in the timing, the finite bandwidth of the optical detector, and the mode broadening due to the source spectral width and the FUT chromatic dispersion. The test method is commonly used in production and research facilities, but is not easily accomplished in the field. DMD can be a good predictor of performance for given source launching conditions. DMD may be normalized to a unit length, such as ps/m.

Chromatic Dispersion

Chromatic dispersion in a singlemode fiber is a combination of material dispersion and waveguide dispersion (see Figure 8), and it contributes to pulse broadening and distortion in a digital signal. Material dispersion arises from the glass and the dopants used in it and is important in all fiber types. Waveguide dispersion is produced by the wavelength dependence of the mode's effective index of refraction, set by the waveguide structure, and is critical in singlemode fibers only. From the point of view of the transmitter, pulse broadening is due to two causes:

. The presence of a range of wavelengths in the source optical spectrum. Each wavelength has a different phase delay and group delay (different group velocities) along the fiber, because each travels under a different index of refraction (or phase index), the index varying as a function of wavelength, as shown in Figure 9.
. The modulation of the source, which itself has two effects:
– As bit-rates increase, the spectral width of the modulated signal increases and can become comparable to, or exceed, the spectral width of the source.
– Chirp occurs when the source wavelength spectrum varies during the pulse. By convention, positive chirp at the transmitter occurs when the spectrum during the rise/fall of the pulse shifts towards shorter/longer wavelengths, respectively. For a positive fiber dispersion coefficient, longer wavelengths are delayed relative to shorter wavelengths. Hence, if the sign of the product of chirp and dispersion is positive, the two processes combine to produce pulse broadening. If the product is negative, pulse compression can occur over an initial fiber length, until the pulse reaches a minimum width and then broadens again with increasing dispersion.

Figure 8 Contribution of the material and waveguide dispersions to the chromatic dispersion.

Figure 9 Difference between the phase and the group index of refraction.


The electric field propagating in the FUT may be described simply as follows:

E(t, z) = E0 sin(ωt − βz)   [8]

where ω = 2πc/λ [rad/s] is the angular frequency, and β = kn = (ω/c)n, with k the free-space propagation constant and n the effective index; as β has units of m⁻¹, it is often referred to as the wavenumber, or sometimes the propagation constant. The group delay per unit length, tg, is then given by:

dβ/dω = β1 = tg   [9]

An example of the group delay spectral distribution is shown in Figure 10. Assuming n is not complex, β1 is the first-order derivative of β. The group velocity vg is given by:

vg = (dβ/dω)⁻¹ = −(2πc/λ²)(dβ/dλ)⁻¹   [10]

The FUT input–output delay is given by:

td = L/vg   [11]

where L is the FUT length. The group index of refraction ng is given by:

ng = c/vg = n − λ(dn/dλ)   [12]

The dispersion parameter or dispersion coefficient D (ps/(nm·km)) is given by:

D = −(ω/λ)(dtg/dω) = −(2πc/λ²)(d²β/dω²) = −(λ/c)(d²n/dλ²)   [13]

d²β/dω² = β2   [14]

An example of the spectral distribution of D obtained from the group delay is shown in Figure 10. β2 (ps²/km) is the group velocity dispersion parameter, so D may be related to β2 as follows:

D = −(ω/λ)β2   [15]

When β2 is positive, D is negative, and vice versa. The region where β2 is positive is called the normal dispersion region, while the negative-β2 region is called the anomalous dispersion region. At λ0, β1 is minimum and β2 = 0, so D = 0. The dispersion slope S (ps/(nm²·km)), also called the differential dispersion parameter or second-order dispersion, is given by:

S = dD/dλ = (ω/λ)²β3 + (2ω/λ²)β2   [16]

where β3 = dβ2/dω = d³β/dω³. At λ0, β1 is minimum and β2 = 0, so D = 0, but S is not zero and depends on β3. An example of the spectral distribution of S and S0 is illustrated in Figure 10. Overall, the general expression for β is given by the Taylor expansion:

β(ω) = β0 + (ω − ω0)β1 + (1/2)(ω − ω0)²β2 + (1/6)(ω − ω0)³β3 + ···   [17]
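A worked example of eqn [15], converting an assumed group velocity dispersion β2 ≈ −21.7 ps²/km (representative of standard fiber near 1.55 µm; not a value given in this article) into the dispersion coefficient D:

```python
import math

# D = -(omega/lambda) * beta2, with omega/lambda = 2*pi*c/lambda**2 (eqn [15]).

c = 2.99792458e8      # m/s
lam = 1.55e-6         # m
beta2 = -21.7e-27     # s^2/m, i.e. -21.7 ps^2/km (assumed)

D_si = -(2 * math.pi * c / lam**2) * beta2   # s/m^2
D = D_si * 1e12 * 1e-9 * 1e3                 # convert to ps/(nm km)
print(f"D = {D:.1f} ps/(nm km)")             # ~17.0; D > 0, anomalous dispersion
```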

Figure 11 illustrates the difference between dispersion-unshifted fiber (ITU-T Rec. G.652), dispersion-shifted fiber (ITU-T Rec. G.653), and nonzero dispersion-shifted fiber (ITU-T Rec. G.655).

Test methods for the determination of chromatic dispersion

All methods measure the group delay at specific wavelengths over a range and use agreed fitting functions to evaluate λ0 and S0. In the phase shift method, the group delay is measured in the frequency domain, by detecting, recording, and processing the phase shift of a sinusoidal modulating signal between a reference (a secondary fiber path) and the channel signal. Setup variants exist, and some do not require the secondary reference path. For instance, by using a reference optical filter at the FUT output, it is possible to completely decouple the source from the phasemeter. With such an approach, chromatic dispersion may now be measured in the field over very long links using optical amplifiers (see Figure 12).

Figure 10 Relation between the pulse (group) delay and the (chromatic) dispersion.


Figure 11 Chromatic dispersion for various types of fiber.

Figure 12 Test results for field-related chromatic dispersion using the phase-shift technique.

In the differential phase shift method, two detection systems are used together with two wavelength sources at the same time. In this case, the chromatic dispersion may be determined directly from the two group delays. This technique usually offers faster and more reliable results, but costs much more than the phase-shift technique, which is usually preferred. In the interferometric method, the group delay between the FUT and a reference path is measured by a Mach–Zehnder interferometer. The reference delay line may be an air path or a singlemode fiber standard reference material (SRM). The method can be used to determine the following characteristics:

. longitudinal chromatic dispersion homogeneity; and
. the effect of overall or local influences, such as temperature changes and macrobending losses.

Polarization mode dispersion (PMD) causes an optical pulse to spread in the time domain and may impair the performance of a telecommunications system. The effect can be related to differential phase and group velocities and corresponding arrival time of different polarization components of the pulse signal. For a sufficiently narrowband source, the effect can be related to a differential group delay (DGD), Dt; between a pair of orthogonally polarized principal states of polarization (PSP) at a given wavelength or optical frequency (see Figure 13a).


Figure 13 PMD effect on pulse broadening and the possible resulting impairment. (a) The pulse is spread by the DGD Δτ, but the bit rate is too low to create an impairment. (b) The pulse is spread by the DGD Δτ, and the bit rate is high enough for an impairment to result. (c) The DGD Δτ is large enough, even at a low bit rate, to spread the pulse and create an impairment.


In an ideal circularly symmetric fiber, the two PSPs propagate with the same velocity. However:

. a real fiber is not perfectly circular;
. the core is not perfectly concentric with the cladding;
. the core may be subjected to microbending;
. the core may present localized clusters of dopants; and
. environmental conditions may stress the deployed cable and affect the fiber.

In each case the fiber undergoes local stresses and, consequently, birefringence. These asymmetries vary randomly along the fiber and in time, leading to a statistical behavior of PMD. For a deployed cabled fiber at a given time and optical frequency, there always exist two PSPs such that the pulse spreading due to PMD vanishes if only one PSP is excited. On the contrary, the maximum pulse spread due to PMD occurs when both PSPs are equally excited, and it is related to the difference in their group delays, the DGD associated with the two PSPs. For broadband transmission, the DGD varies statistically as a function of wavelength or frequency, and results in an output pulse that is spread out in the time domain (see Figures 13a–c). In this case, the spreading can be related to the RMS (root mean square) of the DGD values, ⟨Δτ²⟩^(1/2). However, if a known distribution, such as the Maxwell distribution, can be fitted to the DGD probability distribution, then a mean (or average) value of the DGD, ⟨Δτ⟩, may be correlated to the RMS value and used as a system performance predictor, in particular together with a maximum value of the DGD distribution associated with a low probability of occurrence. This maximum DGD may then be used to define the quality of service that would tolerate DGD values lower than this maximum.
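For Maxwellian-distributed DGD, the mean and RMS values obey the fixed ratio ⟨Δτ⟩ = (8/(3π))^(1/2)·⟨Δτ²⟩^(1/2) ≈ 0.921·⟨Δτ²⟩^(1/2). The Monte Carlo sketch below, with an invented scale parameter, illustrates this relation and the use of a low-probability quantile as the maximum DGD mentioned above:

```python
import numpy as np

# A Maxwell distribution is the norm of a 3-component Gaussian vector,
# which is how the DGD samples are generated here.

rng = np.random.default_rng(0)
sigma = 0.05   # per-component scale, ps (invented)
dgd = np.linalg.norm(rng.normal(0.0, sigma, size=(100_000, 3)), axis=1)

mean_dgd = dgd.mean()
rms_dgd = np.sqrt(np.mean(dgd**2))
print(f"mean = {mean_dgd:.4f} ps, rms = {rms_dgd:.4f} ps")
print(f"mean/rms = {mean_dgd / rms_dgd:.3f} (theory: {np.sqrt(8 / (3 * np.pi)):.3f})")

# A 'maximum DGD' for quality-of-service purposes: a low-probability quantile.
print(f"99.99th-percentile DGD = {np.percentile(dgd, 99.99):.3f} ps")
```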

Test methods for polarization mode dispersion

Three methods are generically used for measuring PMD; other methods or analyses may exist, but they are generally not standardized or are limited in their applications:

. Stokes parameter evaluation (SPE):
– Jones matrix eigenanalysis (JME);
– Poincaré sphere analysis (PSA).
. Fixed analyzer (FA):
– Extrema counting (EC);
– Fourier transform (FT).
. Interferometry (INTY):
– Traditional analysis (TINTY);
– General analysis (GINTY).

All methods use a linearly polarized source at the FUT input and are suitable for laboratory measurements of factory lengths of fiber and cable. However, the interferometric method is the only one appropriate for measurements of cabled fiber that may be moving or vibrating, such as is found in the field.

Stokes parameter evaluation. SPE determines PMD by measuring the response to a change of narrowband light (from a tuneable light source with a broadband detector for JME, or a broadband source with a filtered detector, such as an interferometer, for PSA) across a wavelength range. The Stokes vector of the output light is measured for each wavelength. The change of these Stokes vectors with angular optical frequency ω (wavelength), and with the change in input SOP (state of polarization), yields the DGD as a function of wavelength. For both the JME and PSA analyses, three distinct and known linear SOPs (orthogonal on the Poincaré sphere) must be launched for each wavelength. Figure 14 illustrates the test setup and examples of test results. The JME and PSA methods are mathematically equivalent.

Fixed analyzer. FA determines PMD by measuring the response to a change of narrowband light across a wavelength range. For each SOP, the change in output power filtered through a fixed polarization analyzer, relative to the power detected without the analyzer, is measured as a function of wavelength. Figure 15 illustrates a test setup and examples of test results. The resulting measured function can be analyzed in one of two ways:

. by counting the number of peaks and valleys (EC) of the curve and applying a formula; this analysis is considered a frequency-domain approach; and
. by taking the Fourier transform (FT) of the measured function; this FT is equivalent to the pulse spreading obtained by TINTY.

Interferometry. INTY uses a broadband light source and an interferometer. The fringe pattern, containing the source auto-correlation together with the PMD-related cross-correlation of the emerging electromagnetic field, is determined from the interference pattern of the output light, i.e., the interferogram. The PMD determination for the wavelength range associated with the source spectrum is based on the envelope of the fringe


Figure 14 PMD by Stokes parameter evaluation method.

Figure 15 PMD by fixed analyzer method.

pattern of the interferogram. Two analyses are available to obtain the PMD:

. TINTY uses a set of specific operating conditions for its successful application and a basic setup; and
. GINTY uses no limiting operating conditions but, in addition to the same basic setup, also uses a modified setup compared to TINTY.

Figure 16 illustrates the test setup for both approaches and examples of test results.

Polarization Crosstalk

Polarization crosstalk is a characteristic of energy mixing/transfer/coupling between the two PSPs in a PMF (polarization maintaining fiber) when their isolation is imperfect. It is the measure of the strength of mode coupling, or of the output power ratio, between the PSPs within a PMF. A PMF is an optical fiber capable of transmitting, under external perturbations such as bending or lateral pressure, both the HE₁₁ˣ and HE₁₁ʸ polarization modes, whose electric field vector directions are orthogonal to each other and which have different propagation constants βx and βy. Two methods are available for measuring the polarization crosstalk of a PMF:

. the power ratio method, which uses the maximum and minimum values of output power at a specified wavelength, and is applicable to fibers and connectors jointed to a PMF, and to two or more PMFs joined in series; and
. the in-line method, which uses an analysis of the Poincaré sphere, and is applicable to single or cascaded sections of PMF, and to PMF interconnected with optical devices.

Nonlinear Effects

When the power of the transmission signal is increased to achieve longer span lengths at high bit rates, nonlinear effects arise: the refractive index becomes intensity dependent and may be


Figure 17 Power output-to-power input relationship for production of nonlinear effects.

expressed as follows:

$$ n = n_0 + n_2 I \qquad [18] $$

where n is the intensity-dependent index; n₀ is the linear part of the index; n₂ is the nonlinear index, also called the Kerr nonlinear index (2.2 to 3.4 × 10⁻¹⁶ cm²/W); and I is the optical intensity inside the fiber. The field propagation over a distance L into the fiber is described by the following equation:

$$ E_{out}(L) = E_{in}(0)\,\exp\{[-\alpha/2 + i\beta + \gamma P(L,t)/2]\,L\} \qquad [19] $$

where α/2 is the attenuation, iβ is the phase of the wave, and γP(L, t)/2 is the nonlinearity term, with

$$ \gamma = 2\pi n_2/(\lambda A_{eff}) \qquad [20] $$

where γ is the nonlinearity coefficient (which may be a complex number); A_eff is the fiber core effective area; P(L, t) is the total power; λ is the signal wavelength; and t is the time variable. The nonlinear coefficient is defined as n₂/A_eff. This coefficient plays a critical role in the fiber and is closely related to the system performance degradation due to nonlinearities when very high power is used.

Methods for measuring the nonlinear coefficient. Two methods are available for measuring the nonlinear coefficient:

. continuous-wave dual-frequency (CWDF); and
. pulsed single-frequency (PSF).

In the CWDF method, light at two wavelengths is injected into the fiber. At higher power, the light from the two wavelengths beats due to the nonlinearity and produces an output spectrum that is spread. The relationship of the power level to a particular spreading is used to calculate the nonlinear coefficient. In the PSF method, pulsed light at a single wavelength is injected into the fiber. Very short pulses (<1 ns) and their input peak power must be measured and related to the nonlinear spreading of the output spectrum.

Stimulated Brillouin scattering. In an intensity-modulated system using a source with a narrow linewidth, significant optical power is transferred from the forward-propagating signal to a backward-propagating signal when the SBS power threshold is exceeded. At that point, periodic regions of refractive index form a grating traveling at the speed of sound away from the source. The associated sound waves (acoustic phonons) scatter part of the incident light back toward the source; phase matching (or momentum conservation) dictates that the scattered light preferentially travels in the backward direction. The scattered light is Doppler-shifted (downshifted, or Brillouin-shifted) by approximately 11 GHz (at 1550 nm, for G.652 fiber). The scattered light has a very narrow spectrum (it is highly coherent), lies very close to the carrier signal, and may be very detrimental.

Stimulated Raman scattering. SRS is an interaction between the light and the fiber molecular vibrations in which adjacent atoms vibrate in opposite directions (an 'optical phonon'). Some of the energy of the main carrier (the optical pump wave) is transferred to the molecules, thereby further increasing the amplitude of their vibrations. If the vibrations become large enough, a threshold is reached at which the local index of refraction changes. These local changes then scatter light in all directions, similar to Rayleigh scattering. However, unlike Rayleigh scattering, the wavelength of the Raman-scattered light is shifted to longer wavelengths by an amount that corresponds to the molecular vibration frequencies, and the Raman signal spreads over a large spectrum.

Self-phase modulation. SPM is the effect that a powerful pulse has on its own phase, considering that in eqn [18] I(t) varies in time:

. I(t) → n(t) = n₀ + n₂I(t) → modulates the phase β(t) of the pulse; and
. dλ/dt → dn/dt → dβ/dt (chirp) → broadening in the frequency domain → broadening in the time domain.

I(t) peaks at the center of the pulse (peak power) and consequently increases the index of refraction there. A higher index causes the wavelengths at the center of the pulse to accumulate phase more quickly than at the wings of the pulse:

. this causes wavelength stretching (a shift to longer wavelengths) at the pulse leading edge (risetime); and
. this causes wavelength compression (a shift to shorter wavelengths) at the pulse trailing edge (falltime).


The pulse will then broaden with negative (normal) dispersion and shorten with positive (anomalous) dispersion. SPM may then be used for dispersion compensation, considering that self-phase modulation imposes C > 0 (positive chirp); it can cancel the dispersion if properly managed as a function of the sign of the dispersion. SPM is one of the most critical nonlinear effects for the propagation of solitons, or very short pulses, over very long distances.

Cross-phase modulation. XPM is the effect that a powerful pulse has on the phase of an adjacent pulse from another WDM system channel traveling in phase or at nearly the same group velocity. It concerns spectral interference between two WDM channels:

. the increasing I(t) at the leading edge of the interfering pulse shifts the other pulse to longer wavelengths; and
. the decreasing I(t) at the trailing edge of the interfering pulse shifts the other pulse to shorter wavelengths.

This produces spectral broadening, which dispersion converts to temporal broadening, depending on the sign of the dispersion. The XPM effect is similar to SPM, except that it depends on the channel count.

Four-wave (four-photon) mixing. FWM is the production of by-products from two or more WDM channels. For two channels, I(t) modulates the phase of each signal (at ω₁ and ω₂). An intensity modulation appears at the beat frequency ω₁ − ω₂, and two sideband frequencies are created in a way similar to harmonics generation. New wavelengths are created in a number equal to N²(N − 1)/2, where N is the number of original wavelengths.
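As a quick check of this counting rule, the short sketch below (illustrative only; the channel grid values are assumed) enumerates the FWM mixing terms fᵢ + fⱼ − fₖ and compares their number with N²(N − 1)/2:

```python
import itertools

# Enumerate FWM mixing terms f_i + f_j - f_k among N channels:
# (i, j) unordered (i may equal j, the degenerate case), with k
# different from both. The term count reproduces N^2(N-1)/2; on an
# equally spaced grid many terms coincide, so the number of distinct
# new frequencies is smaller.
def fwm_terms(freqs):
    n = len(freqs)
    return [freqs[i] + freqs[j] - freqs[k]
            for i, j in itertools.combinations_with_replacement(range(n), 2)
            for k in range(n) if k not in (i, j)]

for n in (2, 4, 8, 16):
    chans = [193_100 + 100 * c for c in range(n)]        # assumed GHz grid
    terms = fwm_terms(chans)
    print(f"N={n:2d}: terms={len(terms):5d} (formula {n*n*(n-1)//2:5d}), "
          f"distinct new freqs={len(set(terms) - set(chans))}")
```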

Fiber Dimension Characteristics and Corresponding Test Methods

Table 1 provides a list of the various fiber dimensional characteristics and their corresponding test methods.

Table 1 Fiber dimensional characteristics

Attribute             Measured parameter
Fiber geometry        Core/cladding diameter; core/cladding noncircularity; core-cladding concentricity error
Numerical aperture
Mode field diameter
Coating geometry
Length

Fiber Geometry Characteristics

The fiber geometry is related to the core and cladding characteristics.

Core

The core center is the center of the circle that best fits the points at a constant level in the near-field intensity profile emitted from the central region of the fiber, using wavelengths above and/or below the cut-off wavelength. The RIP can be measured by the refracted near-field (RNF), transverse interferometry, or transmitted near-field (TNF) techniques. The core concentricity error is the distance between the core center and the cladding center. This definition applies very well to multimode fibers. The distance between the center of the near-field profile and the center of the cladding is also used for singlemode fibers. The mode field diameter (MFD) represents a measure of the transverse electromagnetic field intensity of the mode in a fiber cross-section, and it is defined from the far-field intensity distribution. The MF is the singlemode field distribution of the LP01 mode, giving rise to a spatial intensity distribution in the fiber. The MF concentricity error is the distance between the MF center and the cladding center. The core noncircularity is a measure of the core ellipticity. This parameter is one of the causes of birefringence in the fiber and, consequently, of PMD.

Cladding

The cladding is the outermost region of constant refractive index in the fiber cross-section. The cladding center is the center of the circle best fitting the outer limit (boundary) of the cladding. The cladding diameter is the diameter of the circle defining the cladding center. The cladding noncircularity is the difference between the diameters of the two circles defined by the cladding tolerance field, divided by the nominal cladding diameter.


Coating

The primary coating is one or more layers of protective material applied to the cladding during or after the drawing process to protect the cladding surface (e.g., a 250 µm protective coating). The secondary coating is one or more layers of protective material applied over the primary coating in order to give additional protection or to provide a particular structure.


Measurement of the fiber geometrical attributes. The fiber geometry is measured by the following methods:

. TNF;
. RNF;
. side-view technique/transverse interference;
. TNF image technique; and
. mechanical diameter.

Test instrumentation may incorporate two or more methods such as the one shown in Figure 18. Transmitted near-field technique. The cladding diameter, core concentricity error, and cladding noncircularity are determined from the near-field intensity distribution. Figure 19 provides a series of examples of test results from TNF measurements. Refracted near-field technique. The RIP across the entire fiber (core and cladding) can be directly obtained from the RNF measurement, as shown in Figure 20.

Figure 18 RNF/TNF combined instrumentation.

The geometrical characteristics of the fiber can be obtained from the refractive index distribution using suitable algorithms:

. core/cladding diameter;
. core/cladding concentricity error;
. core/cladding noncircularity;
. maximum numerical aperture (NA); and
. index and relative index of refraction difference.

Figure 21 illustrates the core geometry.

Side-view technique/transverse interference. The side-view method is applied to singlemode fibers to determine the core concentricity error, cladding diameter and cladding noncircularity by measuring the intensity distribution of light that is refracted inside the fiber. The method is based on an interference microscope focused on the side view of an FUT illuminated perpendicular to the FUT axis. The fringe pattern is used to determine the RIP.


Figure 19 MFD measurement by TNF.

Figure 20 RIP from RNF measurement.

TNF image technique. The TNF image technique, also called near-field light distribution, is used for the measurement of the geometrical characteristics of singlemode fibers. The measurement is based on the analysis of magnified images at the FUT output. Two subsets of the method are available:

. the grey-scale technique, which performs an x-y near-field scan using a video system; and
. the single near-field scanning technique, which performs a one-dimensional scan.

Mechanical diameter. This is a precision mechanical diameter measurement technique used to accurately determine the cladding diameter of silica fibers. The technique uses an electronic micrometer, such as one based on a double-pass Michelson interferometer, and is used for providing calibrated fibers to the industry as SRMs.

Numerical Aperture

The NA is an important attribute of multimode fibers, used to predict their launching efficiency, joint loss at splices, and micro/macrobending characteristics. A method is available for the measurement of the angular radiant intensity (far-field)


Figure 21 Core geometry by RNF measurement.

distribution, or of the RIP, at the output of an FUT. The NA can be determined from analysis of the test results.

Mode Field Diameter

The definition of the MFD is given in the section describing the core center, above. Four measurement methods are available:

. direct far-field scan, which determines the MFD from the far-field intensity distribution;

. variable aperture in far field, which determines the MFD from the complementary aperture transmission function a(x), with x = d tan θ the aperture radius and d the distance between the aperture and the FUT:

$$ \mathrm{MFD} = \frac{\lambda}{\pi d}\left[\int_0^{\infty} \frac{x\,a(x)}{(x^2 + d^2)^2}\,\mathrm{d}x\right]^{-1/2} \qquad [21] $$

Equation [21] is valid in the small-angle approximation;

. near-field scan, which determines the MFD from the near-field intensity distribution I_NF, with r the radial coordinate:

$$ \mathrm{MFD} = 2\left[\frac{2\int_0^{\infty} I_{NF}(r)\,r\,\mathrm{d}r}{\int_0^{\infty}\left[\mathrm{d}I_{NF}^{1/2}(r)/\mathrm{d}r\right]^2 r\,\mathrm{d}r}\right]^{1/2} \qquad [22] $$

Equation [22] is also valid in the small-angle approximation; and

. bidirectional backscattering, which uses an OTDR and bidirectional measurements to determine the MFD by comparing the FUT results with those of a reference fiber.

Effective Area

A_eff is a critical nonlinearity parameter and is defined as follows:

$$ A_{eff} = \frac{2\pi\left[\int_0^{\infty} I(r)\,r\,\mathrm{d}r\right]^2}{\int_0^{\infty} I(r)^2\,r\,\mathrm{d}r} \qquad [23] $$

where I(r) is the field intensity distribution of the fiber fundamental mode at radius r. The integration is carried out over the entire fiber cross-section. For a Gaussian approximation,

$$ I(r) = \exp\left[-\frac{2r^2}{(\mathrm{MFD}/2)^2}\right] \qquad [24] $$

which yields:

$$ A_{eff} = \frac{\pi\,\mathrm{MFD}^2}{4} \qquad [25] $$

Three methods are available for the measurement of A_eff:

. direct far-field;
. variable aperture in far-field; and
. near-field.
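The Gaussian shortcut of eqns [24] and [25] can be checked numerically against the defining integral of eqn [23]. The following sketch is illustrative only; the MFD value is an assumed, typical number for a standard singlemode fiber at 1550 nm:

```python
import numpy as np

# Effective area from a near-field intensity profile I(r) (eqn [23]),
# compared with the Gaussian closed form A_eff = pi * MFD^2 / 4 (eqn [25]).
mfd = 10.4e-6                                  # assumed MFD (m)
r = np.linspace(0.0, 40e-6, 20_001)            # radial grid (m)
I = np.exp(-2.0 * r**2 / (mfd / 2.0)**2)       # Gaussian profile, eqn [24]

num = 2.0 * np.pi * np.trapz(I * r, r) ** 2    # numerator of eqn [23]
den = np.trapz(I**2 * r, r)                    # denominator of eqn [23]
print(f"A_eff (numerical)       = {num / den * 1e12:.2f} um^2")
print(f"pi * MFD^2 / 4 (eqn 25) = {np.pi * mfd**2 / 4 * 1e12:.2f} um^2")
```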


Mechanical Measurement and Test Methods

Table 2 describes the various mechanical characteristics and their corresponding test methods.

Table 2 Fiber mechanical characteristics

Attribute
Proof stress
Residual stress
Stress corrosion susceptibility
Tensile strength
Stripability
Fiber curl

Proof Stressing

The proof stress level is the value of tensile stress or strain applied to a full fiber length over a short period of time. The method for fiber proof stressing is longitudinal tension, which describes procedures for applying tensile loads to a length of fiber. The fiber stress is calculated from the applied tension. The tensile load is applied over a short period of time, but not too short, in order for the fiber to experience the proof stress.

Residual Stress

Residual stress is the stress built up by the thermal expansion difference between core and cladding during the fiber drawing process or splicing. Methods are available for measuring residual stress based on polarization effects. A light beam produced by a rotating polarizer propagates in the x-axis direction while the beam polarization is in the y-z plane and the fiber longitudinal axis lies along the z-axis. The light experiences different phase shifts along the y- and z-axes owing to the FUT birefringence. The photoelastic effect gives the relationship between this phase change and the residual stress.

Stress Corrosion Susceptibility

The stress corrosion susceptibility is related to the dependence of crack growth on applied stress. It depends on the environmental conditions, and static and dynamic values may be observed.

Hydrogen Aging for Low-Water-Peak Single-Mode Fiber

Hydrogen aging of low-water-peak fibers, such as G.652.C fiber, is based on a test performed at 1.0 atmosphere of hydrogen pressure at room temperature over a period of one month. Other proportional combinations are possible.

Nuclear Gamma Irradiation

Nuclear radiation testing considers the steady-state response of optical fibers and cables exposed to gamma radiation, and determines the level of radiation-induced attenuation produced in singlemode or multimode, cabled or uncabled, fibers. The fiber attenuation generally increases when the fiber is exposed to gamma radiation. This is primarily due to the trapping of radiolytic electrons and holes at defect sites in the glass (i.e., the formation of 'color centers'). Two regimes are considered:

. the low dose rate, suitable for estimating the effect of environmental background radiation; and
. the high dose rate, suitable for estimating the effect of adverse nuclear environments.

The effects of environmental background radiation are measured by the attenuation (cut-back method). The effects of adverse nuclear environments are tested by power monitoring before, during, and after FUT exposure.

Environmental Characteristics

Table 3 lists the fiber characteristics related to the effect of the environment.

Table 3 Fiber environmental characteristics

Attribute
Hydrogen aging
Nuclear gamma irradiation
Damp heat
Dry heat
Temperature cycling
Water immersion

See also Fiber and Guided Wave Optics: Nonlinear Effects (Basics). Imaging: Interferometric Imaging. Interferometry: Gravity Wave Detection; Overview. Polarization: Introduction. Scattering: Scattering Phenomena in Optical Fibers.

Further Reading

Agrawal GP (1997) Fiber-Optic Communication Systems, 2nd edn. New York: John Wiley & Sons.
Agrawal GP (2001) Nonlinear Fiber Optics, 3rd edn. San Diego, CA: Academic Press.
Girard A (2000) Guide to WDM Technology and Testing. Quebec: EXFO.
Hecht J (1999) Understanding Fiber Optics, 3rd edn. Upper Saddle River, NJ: Prentice Hall.
Masson B and Girard A (2004) FTTx PON Guide, Testing Passive Optical Networks. Quebec: EXFO.
Miller JL and Friedman E (2003) Optical Communications Rules of Thumb. New York: McGraw-Hill.
Neumann E-G (1988) Single-Mode Fibers, Fundamentals. Berlin: Springer-Verlag.

Nonlinear Effects (Basics)

G Millot and P Tchofo-Dinda, Université de Bourgogne, Dijon, France

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Many physical systems in various areas, such as condensed matter or plasma physics, the biological sciences, or optics, give rise to localized large-amplitude excitations having a relatively long lifetime. Such excitations lead to a host of phenomena referred to as nonlinear phenomena. Of the many disciplines of physics, optics is probably the one in which practical applications of nonlinear phenomena have been the most fruitful, in particular since the discovery of the laser in 1960. This discovery has led to the advent of a new branch of optics, referred to as nonlinear optics. The applications of nonlinear phenomena in optics include the design of various kinds of laser sources, optical amplifiers, light converters, and light-wave communication systems for data transmission purposes, to name a few. In this article, we present an overview of some basic principles of the nonlinear phenomena that result from the interaction of light waves with dielectric waveguides such as optical fibers. These nonlinear phenomena can be broadly divided into two main categories, namely, parametric effects and scattering phenomena. Parametric interactions arise whenever the state of the dielectric matter is left unchanged by the interaction, whereas scattering phenomena imply transitions between energy levels in the medium. More fundamentally, parametric interactions originate from the electron motion under the electric field of a light wave, whereas scattering phenomena originate from the motion of heavy ions (or molecules).

Linear and Nonlinear Signatures

The macroscopic properties of a physical system can be obtained by analyzing the response of the system to an external excitation. For example, consider at time t the response of a system, such as an amplifier, to an input signal E₁ = A sin(ωt). In the low-amplitude limit of the output signal, the response R₁ of the system is proportional to the excitation:

$$ R_1 = a_1 E_1 \qquad [1] $$

where a₁ is a constant. This type of behavior corresponds to the so-called linear response. In general, a physical system executes a linear response when the superposition of two (or more) input signals E₁ and E₂ yields a response that is a superposition of the output signals, as schematically represented in Figure 1:

$$ R = a_1 R_1 + a_2 R_2 \qquad [2] $$

Now, in almost all real physical systems, if the amplitude of an excitation E₁ becomes sufficiently large, distortions will occur in the output signals. In other words, the response of the system will no longer be proportional to the excitation and, consequently, the law of superposition of states will no longer be observed. In this case, the response of the system may take the following form:

$$ R = a_1 E_1 + a_2 E_1^2 + a_3 E_1^3 + \cdots \qquad [3] $$

which involves not only a signal at the input frequency ω, but also signals at frequencies 2ω, 3ω, and so on. Thus, harmonics of the input signal are generated. This behavior, called the nonlinear response, is at the origin of a host of important phenomena in many


Figure 1 Schematic representation of linear and nonlinear responses of a system to two input signals.


branches of science, such as condensed matter physics, the biological sciences, or optics.

Physical Origin of Optical Nonlinearity

Optical nonlinearity originates fundamentally from the action of the electric field of a light wave on the charged particles of a dielectric waveguide. In contrast to conductors, where charges can move throughout the material, dielectric media consist of bound charges (ions, electrons) that can execute only relatively limited displacements around their equilibrium positions. The electric field of an incoming light wave will cause the positive charges to move in the polarization direction of the electric field, whereas the negative charges will move in the opposite direction. In other words, the electric field turns the pairs of charged particles into dipoles, as schematically represented in Figure 2. Each dipole then vibrates under the influence of the incoming light, thus becoming a source of radiation. The global light radiated by all the dipoles constitutes the scattered light. When the charge displacements are proportional to the excitation (the incoming light), i.e., in the low-amplitude limit of the scattered radiation, the output light vibrates at the same frequency as the excitation. This process corresponds to Rayleigh scattering. On the other hand, if the intensity of the excitation is sufficiently large to induce displacements that are not negligible with respect to atomic distances, then the charge displacements will no longer be proportional to the excitation. In other words, the response of the medium becomes nonlinear. In this case, the scattered waves are generated not only at the excitation frequency, say ω (the Kerr effect), but also at frequencies that differ from ω (e.g., 2ω, 3ω). It is also worth noting that when all induced dipoles vibrate coherently (that is, when their relative phase does not vary randomly), their individual radiation may, under certain conditions, interfere constructively and lead to a global field of high intensity.

Figure 2 Electric dipoles in a dielectric medium under an external electric field.

The condition of constructive interference is the phase-matching condition. In practice, the macroscopic response of a dielectric is given by the polarization, which corresponds to the total amount of dipole moment per unit volume of the dielectric. As the mass of an ion is much larger than that of an electron, the amplitude of the ion motion is generally negligible with respect to that of the electrons. As a consequence, the electron motion generally provides the dominant contribution to the macroscopic properties of the medium. The behavior of an electron under an optical electric field is similar to that of a particle embedded in an anharmonic potential. A very simple model (called the Lorentz model) that provides a deep insight into the dielectric response consists of an electron of mass m and charge −e connected to an ion by an elastic spring (see Figure 2). Under the electric field E(t), the electron executes a displacement x(t) with respect to its equilibrium position, which is governed by the following equation:

$$ \frac{d^2x}{dt^2} + 2\lambda\frac{dx}{dt} + \omega_0^2 x + \left(a^{(2)}x^2 + a^{(3)}x^3 + \cdots\right) = -\frac{e}{m}E(t) \qquad [4] $$

where a⁽²⁾, a⁽³⁾, and so on, are constant parameters, ω₀ is the resonance angular frequency of the electron, and λ is the damping coefficient resulting from the dipolar radiation. When the amplitude of the electric field is sufficiently large, the restoring force on the electron becomes a nonlinear function of x; hence the presence of terms such as a⁽²⁾x², a⁽³⁾x³, and so on, in eqn [4]. In this situation, the macroscopic response of the dielectric is the polarization

$$ P = -\sum e\,x(\omega, 2\omega, 3\omega, \ldots) \qquad [5] $$

where x(ω, 2ω, 3ω, …) is the solution of eqn [4] in the frequency domain, and the summation extends over all the dipole moments per unit volume. In terms of the electric field E, the polarization may be written as

$$ P = \varepsilon_0\left(\chi^{(1)}E + \chi^{(2)}E^2 + \chi^{(3)}E^3 + \cdots\right) \qquad [6] $$

where χ⁽¹⁾, χ⁽²⁾, χ⁽³⁾, and so on, represent the susceptibility coefficients. Figure 3 (top left) illustrates schematically the polarization as a function of the electric field. In particular, one can clearly observe that when the amplitude of the incoming electric field is sufficiently small (bottom left in Figure 3), the polarization (top right) is proportional to the electric field, thus implying that the electric dipole radiates a wave having the same frequency as that of the incoming light (bottom right). On the other hand, Figure 4


Figure 3 Polarization induced by an electric field of small amplitude. Nonlinear dependence of the polarization as a function of field amplitude (top left) and time dependence of the input electric field (bottom left). Time dependence of the induced polarization (top right) and corresponding intensity spectrum (bottom right).

Figure 4 Polarization induced by an incoming electric field of large amplitude.

shows that for an electric field of large amplitude, the polarization is no longer proportional to the electric field, leading to radiation at harmonic frequencies (see Figure 4, bottom right). This nonlinear behavior leads to a host of important phenomena in optical fibers, which are useful for many optical systems but detrimental for others.
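The harmonic radiation described above can be reproduced by direct numerical integration of the Lorentz model of eqn [4]. The sketch below is illustrative only: it truncates eqn [4] at the quadratic anharmonicity and uses arbitrary units, with parameter values chosen simply for stability:

```python
import numpy as np

# Integrate d2x/dt2 + 2*l*dx/dt + w0^2*x + a2*x^2 = -E0*sin(w*t)
# (Lorentz model, eqn [4], truncated; e/m folded into E0; arbitrary units)
# with a semi-implicit Euler scheme, then look for harmonic lines.
w0, l, a2 = 1.0, 0.01, 0.3      # resonance, damping, anharmonic strength
w, E0 = 0.3, 0.5                # drive frequency and amplitude
dt, n = 0.01, 2**17

x = v = 0.0
xs = np.empty(n)
for i in range(n):
    acc = -2*l*v - w0**2*x - a2*x**2 - E0*np.sin(w*i*dt)
    v += acc*dt
    x += v*dt
    xs[i] = x

spec = np.abs(np.fft.rfft(xs * np.hanning(n)))
freqs = np.fft.rfftfreq(n, dt) * 2*np.pi        # angular frequency axis
for k in (1, 2, 3):                             # lines at w, 2w, 3w
    idx = np.argmin(np.abs(freqs - k*w))
    print(f"|X({k}w)| ~ {spec[idx]:.3e}")
```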

Parametric Phenomena in Optical Fibers

In anisotropic materials, the leading nonlinear term in the polarization, i.e., the χ⁽²⁾ term, leads to phenomena such as harmonic generation or optical rectification. This χ⁽²⁾ term vanishes in homogeneous isotropic materials, such as cylindrical


optical fibers, and there the leading nonlinear term becomes the χ⁽³⁾ term. Thus, most of the outstanding nonlinear phenomena in optical fibers originate from the third-order nonlinear susceptibility χ⁽³⁾. Some of these phenomena are described below.

Optical Kerr Effect

The optical Kerr effect is probably the most important nonlinear effect in optical fibers. This effect induces an intensity dependence of the refractive index, which leads to a vast wealth of fascinating phenomena, such as self-phase modulation (SPM), cross-phase modulation (CPM), four-wave mixing (FWM), modulational instability (MI), and optical solitons. The Kerr effect can be conveniently described in the frequency domain through a direct analysis of the polarization, which takes the following form:

$$ P_{NL}(\omega) = \tfrac{3}{4}\,\varepsilon_0\,\chi^{(3)}(\omega)\,|E(\omega)|^2 E(\omega) \qquad [7] $$

The constant 3/4 comes from the symmetry properties of the tensor χ⁽³⁾. Setting P_NL(ω) = ε₀ε_NL E(ω), where ε_NL = (3/4)χ⁽³⁾|E|² is the nonlinear contribution to the dielectric constant, the total polarization takes the form

$$ P(\omega) = P_L + P_{NL} = \varepsilon_0\left[\chi^{(1)}(\omega) + \varepsilon_{NL}\right]E(\omega) \qquad [8] $$

As eqn [8] shows, the refractive index n, at a given frequency ω, is given by

$$ n^2 = 1 + \chi^{(1)} + \varepsilon_{NL} = (n_0 + \Delta n_{NL})^2 \qquad [9] $$

with n₀² = 1 + χ⁽¹⁾. In practice Δn_NL ≪ n₀, and then the refractive index is given by

$$ n(\omega, |E|^2) = n_0(\omega) + n_2^{e}|E|^2 \qquad [10] $$

where n₂ᵉ is the nonlinear refractive index, defined by n₂ᵉ = 3χ⁽³⁾/(8n₀). The linear polarization P_L is responsible for the frequency dependence of the refractive index, whereas the nonlinear polarization P_NL causes an intensity dependence of the refractive index, which is referred to as the optical Kerr effect. Knowing that the wave intensity I is given by I = a|E|², with a = ½ε₀cn₀, the refractive index can then be rewritten as

$$ n(\omega, I) = n_0(\omega) + n_2 I \qquad [11] $$

with n₂ = n₂ᵉ/a = 2n₂ᵉ/(ε₀cn₀). For fused silica fibers one has typically n₂ = 2.66 × 10⁻²⁰ m² W⁻¹. For example, an intensity of I = 1 GW cm⁻² leads to Δn_NL = 2.66 × 10⁻⁷, which is much smaller than n₀ ≈ 1.45.
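A back-of-the-envelope sketch of these magnitudes follows; the effective area and wavelength are assumed, typical values, not taken from the text:

```python
import math

# Order-of-magnitude sketch of the Kerr index change and of the usual
# fiber nonlinearity coefficient gamma = 2*pi*n2/(lambda*A_eff).
n0, n2 = 1.45, 2.66e-20          # silica: linear index, n2 in m^2/W
I = 1e9 * 1e4                    # 1 GW/cm^2 expressed in W/m^2
print(f"Delta n_NL = {n2 * I:.3g} (vs n0 = {n0})")   # -> 2.66e-07

lam, a_eff = 1550e-9, 80e-12     # assumed wavelength (m) and A_eff (m^2)
gamma = 2 * math.pi * n2 / (lam * a_eff)             # in 1/(W m)
print(f"gamma = {gamma * 1e3:.2f} /(W km)")          # ~1.35 /(W km)
```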

Four-Wave Mixing

The four-wave mixing (FWM) process is a third-order nonlinear effect in which four waves interact through an energy exchange process. Let us consider two intense waves, E₁(ω₁) and E₂(ω₂), with ω₂ > ω₁, called pump waves, propagating in an optical fiber. Hereafter we consider the simplest case, in which the waves propagate with the same polarization. In this situation the total electric field is given by

$$ E_{tot}(\mathbf{r}, t) = E_1 + E_2 = A_1(\omega_1)\exp[i(\mathbf{k}_1\cdot\mathbf{r} - \omega_1 t)] + A_2(\omega_2)\exp[i(\mathbf{k}_2\cdot\mathbf{r} - \omega_2 t)] \qquad [12] $$

where k₁ and k₂ are the wavevectors of the fields E₁ and E₂, respectively. Equation [7], which gives the nonlinear polarization induced by a single monochromatic wave, remains valid provided that the frequency spacing between the two waves is relatively small, i.e., |Δω| = |ω₂ − ω₁| ≪ ω₀ = (ω₁ + ω₂)/2. In this context, eqn [7] leads to P_NL

Using eqns [5] and [6], it can be shown that there can be multiple solutions to the above input-output relationship, which predicts the optical bistability. Figure 4 shows a plot of ΔI₁ versus ΔI₃, where Δ = 2/(T I_s), as a function of various C₀. As can be seen from Figure 4, for a high enough value of C₀ the system possesses multiple solutions.

Dispersive Bistability

The intensity dependence of the refractive index can be written as

$$ n = n_0 + n_2 I \qquad [7] $$

Figure 5 Plots of both sides of eqn [10] as a function of I₂, for increasing values of the input intensity I₁. The oscillatory curve represents the right-hand side of eqn [10] and the straight lines represent the left-hand side with increasing values of I₁.

If the refractive index n varies nonlinearly with intensity, then assuming α = 0 and using eqn [7], eqn [2] can be written as

$$ A_2 = \frac{tA_1}{1 - r^2 e^{-2ikl}} = \frac{tA_1}{1 - Re^{i\delta}} \qquad [8] $$

where δ = δ₀ + δ₂, and δ₀ and δ₂ are given by

$$ \delta_0 = \varphi + 2n_0\frac{\omega}{c}l, \qquad \delta_2 = 2n_2 I\,\frac{\omega}{c}l \qquad [9] $$

where φ is the phase associated with r, and the fact that k = nω/c has been used. After some manipulation, eqn [8] can be rewritten in terms of intensity as

$$ \frac{I_2}{I_1} = \frac{1/T}{1 + (4R/T^2)\sin^2(\delta/2)} \qquad [10] $$

and

$$ \delta = \delta_0 + 4n_2\frac{\omega}{c}\,l\,I_2 \qquad [11] $$

Equation [10] can be solved graphically to show the onset of bistable behavior as a function of input intensity. Figure 5 shows plots of both sides of eqn [10] as a function of I₂ and for increasing values of the input intensity I₁. The oscillatory curve represents the right-hand side of eqn [10] and the


straight lines represent the left-hand side with increasing values of I₁. It can be readily seen from Figure 5 that, at sufficiently high values of I₁, multiple solutions of eqn [10] are possible. This again gives rise to the bistable behavior.
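This graphical construction is easy to reproduce numerically. The sketch below scans I₂ for roots of eqn [10], with δ given by eqn [11]; the mirror and detuning parameters are illustrative assumptions, not values from the text:

```python
import numpy as np

# Count solutions of I2/I1 = (1/T)/(1 + (4R/T^2) sin^2(delta/2)) with
# delta = delta0 + c2*I2, by scanning I2 for sign changes of the
# mismatch. Several roots at one I1 signal bistability.
R, T = 0.9, 0.1            # assumed mirror reflectance/transmittance
delta0, c2 = -2.0, 0.05    # assumed detuning and nonlinear phase per I2

def airy(i2):
    delta = delta0 + c2 * i2
    return (1.0 / T) / (1.0 + (4.0 * R / T**2) * np.sin(delta / 2.0)**2)

i2 = np.linspace(1e-6, 400.0, 200_000)
for i1 in (2.0, 20.0, 60.0):
    mismatch = i2 / i1 - airy(i2)          # zero where eqn [10] holds
    roots = np.sum(np.sign(mismatch[1:]) != np.sign(mismatch[:-1]))
    print(f"I1 = {i1:5.1f}: {roots} solution(s)")
```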

Nonlinear Interference Filters

The optical bistability of nonlinear interference filters (or etalons) can be used to achieve optical logic gates. The transmission property of these etalons can be modified by another optical beam, known as a bias beam. A typical logic gate and its transmission property are shown in Figure 6. If the bias beam is such that P_switch − (P_a + P_bias) < P_b, then the transmitted beam shown in Figure 6b readily follows an AND gate. The truth table for such a configuration is shown in Figure 6d.

Four-Wave Mixing

Four-wave mixing (FWM) can be used to implement various computing functions. Using the photorefractive effect and FWM in a third-order nonlinear optical material, optical phase-conjugate beams can be obtained. A schematic of FWM and phase conjugation is shown in Figure 7. The forward pump beam interferes with the probe beam to produce a grating in the photorefractive material. The backward pump beam is then diffracted to produce a beam that retraces the path of the probe beam but has the opposite phase. It can be seen from Figure 7 that optical phase conjugation can readily be used as an AND logical operator. Using four-wave mixing, an optical counterpart of the transistor has been introduced. FWM has also been used in AND-based optical symbolic substitution (OSS) operations, and to produce matched-filter-based optical correlators.

Another device showing optical bistability is the self electro-optic effect device (SEED). SEEDs have been used to implement optical logic circuits. SEEDs are produced by putting multiple quantum wells (MQW) in the intrinsic region of a p-i-n structure. The MQWs are created by placing alternating thin layers of high-bandgap (barrier) and low-bandgap (well) materials. An electric field is applied to the SEED externally, and the absorption coefficient is a function of the applied bias voltage. When a light beam is incident on these devices, it creates a photocurrent, consequently lowering the bias voltage across the MQW. This, in turn, increases the absorption; when the input optical power is high enough, the system sees a peak absorption and suddenly switches to a lower state, thus showing bistable behavior. A schematic of a SEED is shown in Figure 8.

Truth table for the AND function obtained from the transmitted beam (P_A = input 1, P_B = input 2, P_T = output): 0, 0 gives 0; 0, 1 gives 0; 1, 0 gives 0; 1, 1 gives 1.

Figure 6 A typical optical logic gate configuration using nonlinear interference filter: (a) schematic; (b) the transmitted power output; (c) reflected power output (both as functions of inputs); and (d) the truth table for AND operation with beams A and B as inputs.



Figure 7 Four-wave mixing setup.

Figure 9 Symbolic substitution example; the two inputs are coded using dual rails. The second column shows the four combinations of inputs (00, 01, 10, and 11). They are shifted and overlapped, creating the fourth column. A mask detects the output as a dark pixel appearing through the transparent opening. The dark pixel is inverted to create a detection output.


Figure 8 Schematic of self electro-optic effect devices.

By replacing the resistive load with a photodiode, a diode-based SEED (D-SEED) can be constructed, which is analogous to the way diode-transistor logic was developed from resistor-transistor logic in electronics. By electrically connecting two MQW p-i-n structures, symmetric SEEDs (S-SEEDs) are constructed, which are less sensitive to optical power fluctuations and effectively provide isolation between input and output.

Pattern Coded Optical Computing: Symbolic Substitution or Cellular Logic

Symbolic substitution (SS) logic is equivalent to simple digital logic operations. In digital logic, a truth table describes the relationship between the input and output variables. In symbolic substitution logic, this relationship is expressed as a 2D pattern rule. In any specific rule (equivalent to a row in the truth table) of SS logic, a specific 2D pattern is identified and replaced by another pattern. For example, in the case of binary logic, we can express the addition of two binary bits with the following truth table:

Rule number   Input A   Input B   Output C   Output S
1             0         0         0          0
2             0         1         0          1
3             1         0         0          1
4             1         1         1          0

Here A and B are the two digits to be added, and C and S are the carry and the sum bit, respectively. To convert this operation to an SS operation, we can say that a two-bit binary pattern is identified and replaced by a sum bit and a carry bit. If we now express the above in terms of symbolic substitution, let 0 and 1 be represented by 2D symbols; the rule for carry bit generation is then shown in Figure 9. Note that each of these rules has to be implemented individually. However, one can implement them by generating only the 1 outputs or only the 0 outputs, whichever leads to the minimum number of rules. Here implementing the last rule is convenient: one can make a copy of the whole pattern and then overlap it with its left-shifted version, as sketched below. This results in a dark pixel at the origin, which is inverted to produce a detection output. A natural extension of SS is signed-digit computing, where carry-free addition can be achieved, allowing parallel implementation of numerical algorithms using free-space optics.
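The shift-and-overlap recognition step can be sketched in a few lines. The following toy model (an illustration, not the optical implementation; the bright/dark convention is simplified and the wrap-around is an artifact of the toy) applies the carry rule of the table above:

```python
import numpy as np

# Toy model of symbolic-substitution recognition for the "11" pattern:
# the two input digits sit in adjacent pixels (A B); the pattern is
# copied, shifted one pixel to the left, and overlapped with the
# original. Where the overlap reaches 2, both A and B are 1, so the
# carry rule fires (optically this is a dark pixel, then inverted).
def carry_rule(a, b):
    pattern = np.array([a, b])
    shifted = np.roll(pattern, -1)      # left-shifted copy (wraps; toy only)
    overlap = pattern + shifted
    return int(overlap[0] == 2)         # detection at the A position

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "-> carry", carry_rule(a, b))
```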

Optical Shadow Casting (OSC)

All digital logic is nonlinear; in optical shadow casting, the nonlinearity of the digital logic is converted into nonlinearity of the coding. The OSC system, originally proposed by Tanida and Ichioka, uses two-dimensional spatially encoded patterns as data input and light-emitting diodes (LEDs) as light sources. In the original shadow-casting system, the inputs were represented by vertical and horizontal stripes of opaque and transparent bars. With this system, programmability was attained by changing the LED pattern, and it was possible to realize 16 logical operations between two


variables, as well as a half addition as an example of an arithmetic operation. Later, polarized codes were introduced into shadow-casting systems. This improved technique, called polarization-encoded optical shadow casting (POSC), enabled complex combinational logic units, such as binary and trinary full adders, as well as image processing applications, to be realized. The lensless OSC system, as shown in Figure 10, consists of spatially encoded 2D binary pixel patterns as the inputs. The input patterns are kept in close contact at the input plane, and the resulting input overlapped pixel pattern is illuminated by a set of LEDs from the source plane. The light originating from each of the LEDs located at the source plane produces a shadow of the input overlap pixel pattern at the output plane. The overlap of these relatively displaced shadows results in an output overlap pixel pattern at the output plane. A decoding mask placed at the output plane is then used to spatially filter and detect the logical output. To prevent crosstalk, specific spacing among the different processing elements must be maintained.

Figure 10 An optical shadow-casting system.

Discrete Processors: Optical Matrix Processor

Most discrete optical processing schemes revolve around optical matrix-vector multiplication. Many numerically useful operations, such as: (a) linear algebraic operations; (b) signal processing algorithms; (c) Boolean logic; (d) digital multiplication by analog convolution; and (e) neural networks, can be formulated as matrix-vector multiplication. An example of digital multiplication by analog convolution (DMAC) can be understood from the following numerical example, the product 13 × 14 = 182:

Original number: 1 3; flipped number: 4 1 (the digits of 14 reversed).
First partial product: 1 × 1 = 1.
Second partial products: 4 × 1 + 1 × 3 = 4 + 3 = 7.
Third partial product: 4 × 3 = 12.
Convolution result: 1 7 12, a weighted (mixed-radix) number: 1 7 12 = 1 × 100 + 7 × 10 + 12 × 1 = 182.
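A compact numerical sketch of DMAC (assuming base-10 digit strings; the function name is ours) reproduces this example:

```python
import numpy as np

# Digital multiplication by analog convolution (DMAC): convolve the
# digit sequences, then resolve the mixed-radix result with the
# positional weights of the original base.
def dmac(x, y, base=10):
    xd = [int(d) for d in str(x)]
    yd = [int(d) for d in str(y)]
    mixed = np.convolve(xd, yd)              # 13 * 14 -> [1, 7, 12]
    weights = base ** np.arange(len(mixed) - 1, -1, -1)
    return mixed, int(np.dot(mixed, weights))

mixed, product = dmac(13, 14)
print(mixed, "->", product)                  # [ 1  7 12] -> 182
```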

The two numbers are convolved as shown above. This is done by reversing one of the numbers (4 1), while keeping the other number in the same order (1 3), then sliding the reversed number to the right and collecting the element-by-element products. The second partial products are 4 × 1 and 1 × 3, which are then added to give the convolution result. Remarkably, the weighted number in mixed radix (since numbers that have digits higher than 9 are no longer decimal) has the same value as the ordinary product. It is thus possible to perform fast parallel convolution optically, resulting in a fast parallel multiplier. However, an efficient electronic post-processor is necessary to convert the mixed-radix numbers to the radix of the numbers being multiplied.

Figure 11 A matrix-vector multiplier.

A basic optical matrix-vector multiplier is shown in Figure 11. Here the vector, represented by an expanding column of light, is projected onto the 2D matrix, possibly represented by a glass plate or an electronically addressable spatial light modulator. As the light representing the vector passes through the transmitting medium, each element of the vector (I) gets multiplied by a column of the matrix, represented by the transmittance of the matrix (T). The product (IT)



is integrated horizontally using a cylindrical lens. This leads to an addition operation over the partial products, resulting in the generation of the output product vector. Mathematically:

$$ y = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}\begin{pmatrix} b_1\\ b_2\\ b_3\\ b_4 \end{pmatrix} = \begin{pmatrix} a_{11}b_1 + a_{12}b_2 + a_{13}b_3 + a_{14}b_4\\ a_{21}b_1 + a_{22}b_2 + a_{23}b_3 + a_{24}b_4\\ a_{31}b_1 + a_{32}b_2 + a_{33}b_3 + a_{34}b_4\\ a_{41}b_1 + a_{42}b_2 + a_{43}b_3 + a_{44}b_4 \end{pmatrix} \qquad [12] $$

The above can be achieved if the vector b is expanded as

$$ \begin{pmatrix} b_1 & b_2 & b_3 & b_4\\ b_1 & b_2 & b_3 & b_4\\ b_1 & b_2 & b_3 & b_4\\ b_1 & b_2 & b_3 & b_4 \end{pmatrix} $$

and a point-by-point multiplication is performed between the expanded vector and the original matrix. The addition is then performed by the cylindrical lens.

Another simple application of the vector-matrix multiplier is in digital logic, known as the programmable logic array (PLA). A PLA is an electronic device that consists of a set of AND gates followed by a set of OR gates, which can be organized in an AND-OR format suitable for realizing any arbitrary general-purpose logic operation. AND logic is performed when light representing one logic variable is passed through a transparency representing the other variable; OR is achieved by converging light onto a common detector. Assume a function

$$ F = AC + \bar{A}\bar{B}D + A\bar{B}C \qquad [13] $$

where Ā represents the logical inverse of A. Using DeMorgan's theorem of digital logic, this can be expressed as

$$ F = \overline{(\bar{A}+\bar{C})} + \overline{(A+B+\bar{D})} + \overline{(\bar{A}+B+\bar{C})} = \bar{y}_1 + \bar{y}_2 + \bar{y}_3 \qquad [14] $$

The yᵢ's can now be generated by the vector-matrix product shown below:

$$ \begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0\\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1\\ 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} A\\ \bar{A}\\ B\\ \bar{B}\\ C\\ \bar{C}\\ D\\ \bar{D} \end{pmatrix} \qquad [15] $$

Note here that each AND has been converted to an OR, whose output needs to be inverted and then ORed to generate the function F. A setup such as that shown in Figure 11 can be used to accomplish this. In step 1, the yᵢ's are generated; in step 2 they are inverted and summed by a second pass through the same system with a different mask. Other applications of vector-matrix multipliers are in neural networks, where the interconnection weights are represented by the matrix and the inputs form the vector.
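The two-step PLA evaluation can be mimicked numerically. The sketch below uses the matrix of eqn [15] as reconstructed above (the bar placements in eqns [13]-[15] are a best-effort reading of the original, so treat the specific function as illustrative):

```python
import numpy as np

# Optical-PLA style evaluation: step 1 forms y_i by "converging" the
# selected literals (logical OR, via a 0/1 matrix product); step 2
# inverts the y_i and ORs them, giving F = AC + ~A~BD + A~BC.
M = np.array([[0, 1, 0, 0, 0, 1, 0, 0],     # y1 = ~A + ~C
              [1, 0, 1, 0, 0, 0, 0, 1],     # y2 = A + B + ~D
              [0, 1, 1, 0, 0, 1, 0, 0]])    # y3 = ~A + B + ~C

def pla(A, B, C, D):
    lits = np.array([A, 1-A, B, 1-B, C, 1-C, D, 1-D])
    y = (M @ lits) > 0                      # OR of selected literals
    return int(np.any(~y))                  # invert, then OR

for bits in range(16):
    A, B, C, D = (bits >> 3) & 1, (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
    ref = (A and C) or ((not A) and (not B) and D) or (A and (not B) and C)
    assert pla(A, B, C, D) == int(bool(ref))
print("PLA matrix evaluation matches F on all 16 inputs")
```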

Analog Optical Computing

Correlation is a way to detect the presence of a signal in additive noise. Optics offers a fast way of performing correlations. The practical motivation of this area comes from the fact that a lens performs a Fourier transform of a two-dimensional signal at its focal plane. The most famous analog optical computing operation is the optical matched filter. The theoretical basis of the optical matched filter is the Fourier theorem of matched filtering, which states that the cross-correlation between two signals can be formed by multiplying the Fourier transform of the first signal by the complex conjugate of the Fourier transform of the second signal, followed by a subsequent inverse Fourier transform operation. Mathematically:

$$ \mathrm{Correlation}(f, g) = \mathcal{F}^{-1}\left[\mathcal{F}\{f\}\cdot\mathrm{conjugate}(\mathcal{F}\{g\})\right] \qquad [16] $$

where F represents a Fourier transform operation and F⁻¹ represents an inverse transform. Since a lens can easily perform the Fourier transform, and multiplication can be performed by light passing through a medium, the only problem that has to be


solved is to represent the signal and the filter in the optical domain. Since an SLM can represent both the real signal and the complex filter, the whole operation can be performed in a 4-f setup, as shown in Figure 12. The input is displayed in the input plane by an SLM, and the first lens performs a Fourier transform. The filter (the conjugate of F{g}) is encoded using another SLM and placed at the filter plane. When the light representing F{f} passes through the filter, it performs the required multiplication. The second lens performs a second Fourier transform and produces the correlation output at the output plane.

Figure 12 An optical 4-f setup. Lenses perform a Fourier transform at their focal plane.
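A numerical (FFT-based) analogue of this 4-f matched filter, with an illustrative random scene and target, is sketched below; eqn [16] is applied directly:

```python
import numpy as np

# FFT analogue of the 4-f matched-filter correlator, eqn [16]:
# correlation(f, g) = IFFT( FFT(f) * conj(FFT(g)) ).
rng = np.random.default_rng(1)
scene = rng.random((128, 128)) * 0.2        # noisy input plane
target = np.zeros((128, 128))
target[60:68, 40:48] = 1.0                  # reference object g
scene += np.roll(np.roll(target, 17, 0), -9, 1)   # hidden, shifted copy of g

corr = np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(target)))
peak = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
print("correlation peak at shift:", peak)   # (17, 119), i.e. (+17, -9 mod 128)
```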

Variations of the basic setup exist, such as the joint transform correlator (JTC) and the binary phase-only filter (BPOF). While the JTC achieves the complex-domain product by adding the two signals and squaring, the BPOF is obtained by simplifying the complex-domain filter, numerically approximating it with its binarized phase. Other linear optical processing operations can be performed in the basic 4-f optical setup, where filters, such as low-pass or high-pass filters, can be placed in the filter plane. Hologram-based computing treats holograms as a legitimate way of transforming a signal from one form to another, just as a lens transforms from one domain to another. Holograms are complex representations that can act on a signal. A lens can be approximated by a hologram; in fact, the functions of many optical elements, such as lenses, gratings, and prisms, and other transformations, such as beamsplitting, can be combined into a single computing element known as a computer-generated hologram (CGH). The power of this type of system depends on how many operations have been replaced by a single CGH. It is envisioned that optical computing is better suited for special-purpose computing rather than general-purpose computing. However, the future of optical computing lies in the marriage of the appropriate device technology with the right architecture at the right applications. The development of quantum computing may hold new promises for optical computing.

List of Units and Nomenclature

DMAC  Digital multiplication by analog convolution
JTC   Joint transform correlator
OC    Optical computing
PLA   Programmable logic array
QC    Quantum computing
SLM   Spatial light modulator

See also Nonlinear Optics, Basics: Four-Wave Mixing; Nomenclature and Units. Quantum Optics: Quantum Computing with Atoms.

Further Reading

Arsenault H, Szoplik T and Macukow B (eds) (1989) Optical Processing and Computing. San Diego, CA: Academic Press.
Boyd RW (1992) Nonlinear Optics. San Diego, CA: Academic Press.
Gaskill J (1978) Linear Systems, Fourier Transforms, and Optics. New York: Wiley.
Gibbs HM (ed.) (1982) Optical Bistability II. New York: Plenum Press.
Goodman JW (1996) Introduction to Fourier Optics, 2nd edn. New York: McGraw Hill.
Ichioka Y and Tanida J (1984) Optical parallel logic gates using a shadow-casting system for optical digital computing. Proceedings of the IEEE 72: 787.
Karim MA and Awwal AAS (1992) Optical Computing: An Introduction. New York: Wiley.
Koren I (1993) Computer Arithmetic Algorithms. New Jersey: Prentice Hall.
McAulay A (1992) Optical Computing Architecture. New York: Wiley.
Reynolds GO, DeVelis JB, Parrent GB and Thompson BJ (eds) (1989) Physical Optics Notebook: Tutorials in Fourier Optics. Bellingham, WA: SPIE Optical Engineering Press.
Saleh BEA and Teich MC (1991) Fundamentals of Photonics. New York: John Wiley & Sons.
VanderLugt A (1992) Optical Signal Processing. New York: Wiley.
Yu FTS and Jutamulia S (1992) Optical Signal Processing, Computing and Neural Networks. New York: Wiley.


Incoherent Analog Optical Processors

S Jutamulia, University of Northern California, Petaluma, CA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

After low-cost lasers became widely available, it seemed that incoherent analog optical processing was no longer important. In general, incoherent optical processing has no apparent significant advantages compared with its coherent optical counterpart or with digital computer processing. However, although nobody now attempts to build an incoherent optical processor, such as an incoherent correlator, most instruments still operate using incoherent or natural light. These incoherent instruments relate to imaging, and include cameras, microscopes, telescopes, projection displays, and lithographic equipment. From this point of view, it is still very important to study the principles of incoherent analog optical processing, in order to further improve incoherent imaging.

Incoherent Image Formation

The image formation in a coherent optical system can be explained using linear system theory. A perfect point δ(x, y) in the input plane is imaged in the output plane as the impulse response of the system, h(x, y). When the input is only a single point, the output amplitude is simply h(x, y) and the output intensity is |h(x, y)|². Accordingly, the function |h(x, y)|² is called the point spread function (PSF). If we consider the input object to be a collection of a large number of very fine points, then under coherent light illumination the output image will be the collection of the same number of blurred spots, the shape of each blurred spot being h(x, y). If the amplitude function of the input object is f(x, y), the amplitude function of the output image is

$$ o(x, y) = \int\!\!\int_{-\infty}^{\infty} f(p, q)\,h(x-p, y-q)\,\mathrm{d}p\,\mathrm{d}q \qquad [1] $$

which is a convolution of f(x, y) and h(x, y). The intensity function of the output image is

$$ |o(x, y)|^2 = \left|\int\!\!\int_{-\infty}^{\infty} f(p, q)\,h(x-p, y-q)\,\mathrm{d}p\,\mathrm{d}q\right|^2 \qquad [2] $$

According to the convolution theorem, in the Fourier transform domain or frequency domain, we have

$$ O(u, v) = F(u, v)\,H(u, v) \qquad [3] $$

where O(u, v), F(u, v), and H(u, v) are the Fourier transforms of o(x, y), f(x, y), and h(x, y), respectively. H(u, v) is the transfer function of the coherent imaging system. To distinguish it from that of the incoherent imaging system, it is also called the coherent transfer function (CTF). Suppose the same optical imaging system is now illuminated with incoherent light instead of coherent light. The amplitude impulse response is still the same h(x, y), and the point spread function is also the same |h(x, y)|². Similarly, the input object is a collection of perfect points, and the output image is the collection of blurred images of each point. However, since the illuminating light is incoherent, light from any point in the input object is not coherent with light from any other point. Thus, there is no interference between light from different points. The output image is simply the addition of the intensity patterns generated by each point in the input object. The amplitude function of the output image is not computable; however, the intensity function of the output image is

$$ |o(x, y)|^2 = \int\!\!\int_{-\infty}^{\infty} |f(p, q)|^2\,|h(x-p, y-q)|^2\,\mathrm{d}p\,\mathrm{d}q \qquad [4] $$

In the frequency domain, we now have

$$ O_I(u, v) = F_I(u, v)\,H_I(u, v) \qquad [5] $$

where O_I(u, v), F_I(u, v), and H_I(u, v) are the Fourier transforms of |o(x, y)|², |f(x, y)|², and |h(x, y)|², respectively. The intensity impulse response |h(x, y)|² is the PSF. Note that the index I of O_I, F_I, and H_I denotes intensity. The function H_I(u, v) is now the transfer function of the incoherent imaging system, which is called the optical transfer function (OTF). Referring to Fourier analysis, if H(u, v) is the Fourier transform of h(x, y), the Fourier transform of |h(x, y)|² is the autocorrelation of H(u, v), as follows:

$$ H_I(u, v) = \int\!\!\int_{-\infty}^{\infty} H(p, q)\,H^{*}(p-u, q-v)\,\mathrm{d}p\,\mathrm{d}q \qquad [6] $$

The OTF (the function H_I(u, v)) is the autocorrelation of the CTF (the function H(u, v)), which is a complex function in general. It can be written as

$$ \mathrm{OTF} = |\mathrm{OTF}|\exp(i\phi) \qquad [7] $$

We can further write

$$ \mathrm{MTF} = |\mathrm{OTF}| \qquad [8] $$

where MTF stands for modulation transfer function, and

$$ \mathrm{PTF} = \phi \qquad [9] $$

where PTF stands for phase transfer function.

Measurement of MTF

The MTF of an incoherent imaging system can be measured empirically using an input that is a cosine grating with intensity transmittance function:

$$ |f(x)|^2 = 1 + \cos(2\pi a x) \qquad [10] $$

where a is the frequency of the grating. The output, or image intensity function, will be

$$ |o(x)|^2 = 1 + m\cos(2\pi a x + \psi) \qquad [11] $$

where m, the contrast, can be measured by

$$ m = \frac{|o(x)|^2_{\max} - |o(x)|^2_{\min}}{|o(x)|^2_{\max} + |o(x)|^2_{\min}} \qquad [12] $$

In the frequency domain, the Fourier transform of the input function is

$$ F_I(u) = \delta(u) + \tfrac{1}{2}\delta(u-a) + \tfrac{1}{2}\delta(u+a) \qquad [13] $$

The Fourier transform of the output function will be

$$ O_I(u) = F_I(u)\,\mathrm{OTF}(u) \qquad [14] $$

Substitution of eqn [13] yields

$$ O_I(u) = \delta(u)\,\mathrm{OTF}(0) + \tfrac{1}{2}\delta(u-a)\,\mathrm{OTF}(a) + \tfrac{1}{2}\delta(u+a)\,\mathrm{OTF}(-a) \qquad [15] $$

Notice that

$$ \mathrm{OTF}(0) = 1 \qquad [16] $$

due to normalization, and, because the OTF is the autocorrelation of the CTF, it satisfies the symmetry relation

$$ \mathrm{OTF}(-a) = \mathrm{OTF}^{*}(a) \qquad [17] $$

Substitution of eqns [16] and [17] into eqn [15] yields

$$ O_I(u) = \delta(u) + \tfrac{1}{2}\delta(u-a)\,\mathrm{OTF}(a) + \tfrac{1}{2}\delta(u+a)\,\mathrm{OTF}^{*}(a) \qquad [18] $$

Remembering that the OTF is the complex function given in eqn [7], we can write

$$ \mathrm{OTF}(a) = \mathrm{MTF}(a)\exp(i\phi(a)) \qquad [19] $$

Substitution of eqn [19] into eqn [18] yields

$$ O_I(u) = \delta(u) + \tfrac{1}{2}\delta(u-a)\,\mathrm{MTF}(a)\exp(i\phi(a)) + \tfrac{1}{2}\delta(u+a)\,\mathrm{MTF}(a)\exp(-i\phi(a)) \qquad [20] $$

The inverse Fourier transform of eqn [20] is

$$ |o(x)|^2 = 1 + \mathrm{MTF}(a)\cos(2\pi a x + \phi(a)) \qquad [21] $$

By comparing eqn [21] with eqn [11], we find

$$ \mathrm{MTF}(a) = m \qquad [22] $$

and the PTF

$$ \phi(a) = \psi \qquad [23] $$

Therefore, the MTF at frequency u = a of the incoherent imaging system can be measured using eqn [12]. To obtain the complete MTF(u), the measurement is repeated using cosine gratings of different frequencies. The PTF can be measured at the same time.
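This measurement procedure can be simulated directly. In the sketch below (illustrative: the PSF is assumed Gaussian, with an arbitrary width), a cosine grating is imaged incoherently and the output contrast is compared with the known Gaussian MTF:

```python
import numpy as np

# Simulated MTF measurement: image a cosine grating (eqn [10]) through
# an incoherent system with a Gaussian PSF, then read off the output
# contrast m of eqn [12], which equals MTF(a) by eqn [22].
nx, dx = 4096, 1e-3                           # samples, pitch (mm)
x = np.arange(nx) * dx
sigma = 5e-3                                  # assumed PSF width (mm)
psf = np.exp(-0.5 * ((x - x.mean()) / sigma) ** 2)
psf /= psf.sum()
otf = np.fft.fft(np.fft.ifftshift(psf))       # numerical system OTF

for a in (10.0, 30.0, 60.0):                  # grating frequencies (c/mm)
    grating = 1.0 + np.cos(2 * np.pi * a * x)             # eqn [10]
    out = np.real(np.fft.ifft(np.fft.fft(grating) * otf))
    mid = out[nx // 4: 3 * nx // 4]           # avoid wrap-around edges
    m = (mid.max() - mid.min()) / (mid.max() + mid.min())
    theory = np.exp(-2 * (np.pi * sigma * a) ** 2)        # Gaussian MTF
    print(f"a = {a:4.0f} c/mm: m = {m:.3f}, Gaussian MTF = {theory:.3f}")
```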

Incoherent Spatial Filtering

In the 4f coherent optical processor, the first lens performs a Fourier transform, and the second lens performs an inverse Fourier transform, as shown in Figure 1. The Fourier transform domain, or frequency domain, is materialized in the frequency plane, which is the back focal plane of the first lens.

Figure 1 A 4f coherent optical processor consists of two Fourier transform lenses.


Thus, the Fourier transform of the object can be visualized in the frequency plane. For example, if the object is a grating, we will see its frequency components in the frequency plane. However, if the coherent illumination (e.g., from a laser) is replaced with incoherent illumination (e.g., from a lightbulb), the pattern of the Fourier transform in the frequency plane disappears. Does this mean that we can no longer perform spatial filtering? No, we can still perform spatial filtering, although the pattern of the Fourier transform is no longer visualized. Consider a spatial filter in the frequency plane whose physical form can be expressed by the function H(u, v). For illustration, this spatial filter is simply a small one-dimensional window H(u), as shown in Figure 2a. If the 4f optical processor is illuminated with coherent light, the CTF of the system is H(u) itself, as shown in Figure 2b. However, if the 4f optical processor is illuminated with incoherent light, we must use the OTF instead of the CTF in the linear-system analysis. The OTF is the autocorrelation of the CTF, as graphically depicted in Figure 2c. In this example, since the OTF is real and positive, the MTF is identical to the OTF. For simplicity, we analyze and show only a one-dimensional filter in a two-dimensional graph; the analysis can certainly be extended to a two-dimensional filter, for which a three-dimensional graph is needed to show the two-dimensional autocorrelation.

Figure 2  (a) One-dimensional window in the frequency plane functions as a spatial filter. (b) CTF and (c) OTF (or MTF) produced by the spatial filter of (a).

It is important to note that, although the CTF is a high pass filter that passes frequencies in the band from u = a to u = a + b, the MTF is a low pass filter that passes frequencies from u = -b to u = b, independent of a. This indicates that an incoherent processor cannot enhance high frequencies; it is always a low pass processor. It is interesting to note that the OTF is independent of the location u = a of the filter; it is determined only by the width of the filter, Δu = b. Accordingly, an aperture made up of randomly distributed pinholes behaves as a low pass filter in an incoherent imaging system. The cutoff frequency of such a filter can be derived from the diameter of the pinhole, which is equivalent to Figure 2b. The procedure for using this filter is simple. If a camera is used to take a picture, the random pinhole filter can simply be attached to the camera lens. The picture taken will not contain high-frequency components, because the filter acts like a low pass filter. The low pass filtering could also be done by reducing the aperture of the camera; however, by doing so, the light entering the camera is also reduced. The use of a random pinhole filter removes high frequencies without reducing the light intensity. When the random pinhole filter is placed in the frequency plane of a 4f coherent optical processor, it can be used to reduce speckle noise. Note that speckle noise is the main drawback in coherent optical processing.
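This low pass behavior is simple to verify numerically. In the sketch below (an added illustration; the window position a and width b are arbitrary), the OTF is formed as the autocorrelation of a real one-dimensional bandpass CTF, per eqn [6], and its support is seen to depend only on b:

import numpy as np

def otf_from_ctf(ctf):
    """OTF = autocorrelation of the CTF (eqn [6]), normalized so that
    OTF(0) = 1; a complex CTF would additionally need the conjugate."""
    otf = np.correlate(ctf, ctf, mode="full")
    return otf / otf[len(ctf) - 1]          # zero lag sits at index N - 1

u = np.arange(512)
b = 20
for a in (50, 150, 300):                    # slide the passband around
    ctf = ((u >= a) & (u < a + b)).astype(float)
    otf = otf_from_ctf(ctf)
    lags = np.arange(-len(u) + 1, len(u))
    support = lags[np.abs(otf) > 1e-12]
    print(f"a = {a}: OTF nonzero for lags {support.min()} to {support.max()}")
# Prints -19 to 19 in every case: a low pass band of width 2b,
# independent of the CTF location a, exactly as stated above.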

Incoherent Complex Matched Spatial Filtering

Lohmann and Werlich pointed out that the Vander Lugt correlator can also be operated with incoherent light. In the coherent Vander Lugt correlator the correlation term is

o_3(\xi, \eta) = \iint_{-\infty}^{\infty} f(\alpha, \beta) \, g^{*}(\alpha - \xi, \beta - \eta) \, d\alpha \, d\beta   [24]


Its intensity distribution is

|o_3(\xi, \eta)|^2 = \left| \iint_{-\infty}^{\infty} f(\alpha, \beta) \, g^{*}(\alpha - \xi, \beta - \eta) \, d\alpha \, d\beta \right|^2   [25]

However, when the input object is illuminated with incoherent light instead of coherent light, the intensity distribution of the correlation term becomes

|o_3(\xi, \eta)|^2 = \iint_{-\infty}^{\infty} |f(\alpha, \beta)|^2 \, |g(\alpha - \xi, \beta - \eta)|^2 \, d\alpha \, d\beta   [26]

Equation [26] shows the correlation of |f(x, y)|^2 and |g(x, y)|^2. Therefore, when f(x, y) is identical to g(x, y), the correlation peak always appears, regardless of whether the illumination is coherent or incoherent. However, the detailed structures of the coherent and incoherent correlation outputs are different. Note that the joint transform correlator cannot be operated using incoherent light, since no joint Fourier spectra will be formed.
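As a numerical illustration of eqn [26] (a toy sketch added here; it assumes SciPy is available for the 2D correlation), the incoherent correlator operates on the intensities |f|^2 and |g|^2, and a correlation peak still appears in the matched case:

import numpy as np
from scipy.signal import correlate2d

def incoherent_correlation(f, g):
    """Eqn [26]: correlate the intensity patterns, not the amplitudes."""
    return correlate2d(np.abs(f) ** 2, np.abs(g) ** 2, mode="full")

rng = np.random.default_rng(0)
f = rng.random((32, 32))                    # arbitrary test object
out = incoherent_correlation(f, f)          # matched case, g = f
print(np.unravel_index(out.argmax(), out.shape))   # (31, 31): zero-shift peak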

Depixelization of Projection Display

Recently, a technique to remove the pixel structure from an image projected by a liquid crystal projection display has been reported. A liquid crystal projection display must have a pixel structure to form an image. Can we remove this pixel structure from the projected image to obtain a movie-like image? It is well known that a single hole in the frequency plane can perform coherent spatial filtering such that the pixel structure disappears from the output image. Because of the pixel structure, we get multiple Fourier spectra in the frequency plane. By passing only one spectrum order through a hole in the frequency plane, the pixel structure can be removed, as shown in Figure 3. It is interesting to note that selecting any one of the spectra will produce the image at the same location.

Figure 3  Removal of pixel structure using a single hole in the frequency plane. Most of the energy is wasted and the output intensity is dim.

However, if two spectra are passed, Young interference fringes are also produced on top of the resulting image. If all spectra are passed, the produced interference fringes are in fact the pixel structure itself. By passing only one spectrum order and blocking all the others, we lose most of the energy; the projected image will be very dim, although the pixel structure is removed. To overcome this problem, we may cover each spectrum order with a transparent material of a different thickness, as shown in Figure 4. Thus, every spectrum order is passed and delayed by a phase filter with a different thickness. If the delay difference produced by the phase filters is larger than the coherence length of the light, the spectrum orders are no longer coherent with one another. In other words, the resultant image is the sum of the intensities of all images. As a result, the pixel structure does not appear, and no intensity is lost. For a white light source, since the frequency bandwidth is large, the coherence length is typically on the order of several tens of μm. This technique can significantly improve the quality of the liquid crystal projection display. Figure 5 shows the experimental result of a depixelated projection image: Figure 5a shows the projected image of an input with pixel structure when no phase filter is applied in the frequency plane; Figure 5b shows the projected image of the same input when phase filters are applied in the frequency plane. The pixel structure is successfully removed in Figure 5b.

Figure 4  Removal of pixel structure using multiple phase filters with different thicknesses to cover each Fourier spectrum order. No energy is wasted and the output intensity is very bright.

Figure 5  (a) Projected image showing pixel structure when no phase filter is applied in the frequency plane. (b) Depixelated image produced when phase filters are applied in the frequency plane.
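The single-hole filtering of Figure 3 can be emulated with a discrete Fourier transform. The sketch below is an added illustration (the pixel-grid model, grid pitch, and hole radius are arbitrary assumptions): keeping only the spectral region around one order removes the grid replicas:

import numpy as np

def depixelate_single_hole(pixelated, hole_radius):
    """Keep only the spectrum within hole_radius of one Fourier order
    (here the zeroth order, at DC), discarding the pixel-grid replicas."""
    F = np.fft.fftshift(np.fft.fft2(pixelated))
    ky, kx = np.indices(F.shape)
    cy, cx = F.shape[0] // 2, F.shape[1] // 2
    hole = (ky - cy) ** 2 + (kx - cx) ** 2 <= hole_radius ** 2
    return np.abs(np.fft.ifft2(np.fft.ifftshift(F * hole)))

img = np.ones((128, 128))
img[::8, :] = 0.0                       # crude pixel grid: dark gap lines
img[:, ::8] = 0.0                       # every 8 samples in each direction
smooth = depixelate_single_hole(img, hole_radius=7)
print(smooth.std() / smooth.mean())     # far smaller than for img itself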

Computed Tomography

Computed tomography, or CT, using X-rays is usually considered beyond optical information processing. However, it is interesting to review the principle of X-ray CT, since the X-ray source is incoherent, and the CT image reconstruction utilizes the Fourier transformation that is commonly used in optical information processing.



The X-ray CT is used for taking cross-section pictures of the human body. Note that a conventional X-ray picture provides only a projection, not a cross-section picture. Consider the case when an X-ray penetrates an object characterized by its attenuation coefficient, or density function, f(x, y), as shown in Figure 6. The detected X-ray intensity is

I = I_0 \exp\left( -\int f(x, y) \, ds \right)   [27]

where I_0 is the original intensity. Equation [27] can be written as

\int f(x, y) \, ds = -\ln \frac{I}{I_0}   [28]

Figure 6  X-ray passes through an object having density function f(x, y).

Apparently, we cannot directly measure f(x, y), although we can obtain ∫ f(x, y) ds from the measurement of I and I_0. However, we want the cross-section picture, which is the density function f(x, y). The basic idea is to compute f(x, y) from its Fourier transform F(u, v), where F(u, v) is derived from eqn [28]. F(u, v) can be expressed as

F(u, v) = \iint_{-\infty}^{\infty} f(x, y) \exp\left[ -i 2\pi (ux + vy) \right] dx \, dy   [29]

For v = 0, eqn [29] becomes

F(u, 0) = \iint_{-\infty}^{\infty} f(x, y) \exp\left[ -i 2\pi u x \right] dx \, dy   [30]

which can be written

F(u, 0) = \int_{-\infty}^{\infty} p(x) \exp\left[ -i 2\pi u x \right] dx   [31]

where

p(x) = \int_{-\infty}^{\infty} f(x, y) \, dy   [32]

The projection function p(x) given by eqn [32] can be obtained using an X-ray parallel beam, as shown in Figure 7. Then F(u, 0) can be computed from eqn [31] using a computer; F(u, 0) is the Fourier transform along the u-axis. To get other data, we rotate the coordinates (u, v) to (u′, v′) by a small angle α. Correspondingly, the coordinates (x, y) are also rotated to (x′, y′) by the same angle α, while the object is not rotated, as shown in Figure 8. From the measurement shown in Figure 8, we can get p(x′), which, in turn, provides F(u′, v′ = 0). By increasing the rotation angle α, we get further data F(u″, v″ = 0). Completing a rotation of 180 degrees gives us the data in the frequency domain (u, v), as shown in Figure 9. After the data F(u, v) are collected, as shown in Figure 9, F(u, v) can be inverse Fourier transformed to produce f(x, y).


Figure 7  X-ray parallel beam passes through the object to produce the projection function p(x).

There are two approaches to the inverse Fourier transform. The first approach is to interpolate the computed values F(u, v) so that F(u, v) is defined on a regular grid before the inverse Fourier transform is taken. The second approach is to apply polar coordinates, instead of Cartesian coordinates, to the inverse Fourier transform, so that no interpolation in the frequency domain (u, v) is required. The inverse Fourier transform in Cartesian coordinates is

f(x, y) = \iint_{-\infty}^{\infty} F(u, v) \exp\left[ i 2\pi (ux + vy) \right] du \, dv   [33]

It can be written in polar coordinates as

f(x, y) = \int_0^{2\pi} \int_0^{\infty} F(r \cos\phi, r \sin\phi) \exp\left[ i 2\pi r (x \cos\phi + y \sin\phi) \right] |r| \, dr \, d\phi   [34]

where

u = r \cos\phi   [35]

v = r \sin\phi   [36]

Therefore f(x, y) can be obtained from the collected F(u, v), as shown in Figure 9, without further interpolation, using eqn [34]. Note that eqns [31], [33], and [34] are computed using a computer program.

Figure 8  Rotated X-ray parallel beam produces the rotated projection function p(x′).

Figure 9  F(u, v) computed from a set of rotated projection functions.
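The projection-slice relation of eqns [31] and [34] can be demonstrated in a few dozen lines. The sketch below is a toy construction added for illustration (the sampling, centering, and normalization are simplified assumptions, and the deliberately slow direct sum stands in for an optimized reconstruction): each projection's 1D FFT supplies one radial line of F, and eqn [34] is then evaluated directly in polar coordinates:

import numpy as np

def ct_reconstruct(projections, angles, x, y):
    """Toy direct evaluation of eqn [34]: each projection's 1D FFT is one
    radial line of F (eqn [31]); sum over all sampled (r, phi)."""
    n = projections.shape[1]
    rho = np.fft.fftfreq(n)                       # radial frequency samples
    f = np.zeros_like(x, dtype=complex)
    for p, ang in zip(projections, angles):
        # FFT, then shift the origin so that x' = 0 is the array center
        line = np.fft.fft(p) * np.exp(2j * np.pi * rho * (n // 2))
        arg = x * np.cos(ang) + y * np.sin(ang)
        for F_r, r in zip(line, rho):
            f += F_r * np.abs(r) * np.exp(2j * np.pi * r * arg)
    return f.real * (np.pi / len(angles)) / n     # d_phi and d_r weights

# Usage: a centered disk phantom and its (angle-independent) projections.
n, n_ang = 64, 60
angles = np.linspace(0.0, np.pi, n_ang, endpoint=False)
xs = np.arange(n) - n // 2
x, y = np.meshgrid(xs, xs)
disk = (x ** 2 + y ** 2 <= 10 ** 2).astype(float)
projections = np.array([disk.sum(axis=0) for _ in angles])
rec = ct_reconstruct(projections, angles, x, y)
print(rec[n // 2, n // 2], rec[0, 0])             # center value >> corner value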

See also

Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Bracewell RN (1965) The Fourier Transform and its Applications. New York: McGraw-Hill.
Hounsfield GN (1973) Computerized transverse axial scanning (tomography): Part I. British Journal of Radiology 46: 1016–1022.
Iizuka K (1985) Engineering Optics. New York: Springer-Verlag.
Jutamulia S, Toyoda S and Ichihashi Y (1995) Removal of pixel structure in liquid crystal projection display. Proceedings of SPIE 2407: 168–176.
Klein MV (1970) Optics. New York: Wiley.
Lohmann AW and Werlich HW (1971) Incoherent matched filtering with Fourier holograms. Applied Optics 10: 670–672.
Reynolds GO, DeVelis JB, Parrent GB Jr and Thompson BJ (1989) The New Physical Optics Notebook: Tutorials in Fourier Optics. Bellingham, WA: SPIE.
Vander Lugt A (1964) Signal detection by complex spatial filtering. IEEE Transactions on Information Theory IT-10: 139–145.
Yang X, Jutamulia S and Li N (1996) Liquid-crystal projection image depixelization by spatial phase scrambling. Applied Optics 35: 4577–4580.
Yu FTS and Wang EY (1973) Speckle reduction in holography by means of random spatial sampling. Applied Optics 12: 1656–1659.

Optical Bit-Serial Computing

A D McAulay, Lehigh University, Bethlehem, PA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Why Optical Bit-Serial Computing?

Optical bit-serial computing will lower cost and improve dataflow rates in optical telecommunications and associated computing. Moreover, the electromagnetic immunity of optics will enhance security for computing as well as for communications. Also, the absence of sparking with optics enables safe operation, even in hazardous environments. Currently, optical bit-serial communication involves passing terabits of data per second (10^12 bps) – thousands of encyclopedias per second – along an optical fiber by switching light of multiple colors on and off, representing '1' and '0' bits, respectively. Replacing electronic computing functions, constrained to tens of gigabits per second (10^9 bps), with integrated-optic ones of much higher speed enables faster flow rates – allowing continuation of the annual doubling of capacity–distance that has occurred since the 1970s. Cost is reduced by eliminating the demultiplexing and multiplexing for optic-electronic-optic (OEO) conversions. Two illustrative areas of optical bit-serial computing for replacing electronic computing functions are described: computing at up to 40 Gbps with semiconductor optical amplifiers (SOAs), and computing at higher bit rates with integrated optic components. Several books address the relevant technical background: free-space optical information processing, optical computer architectures, optical fiber communications, and integrated optics.


Computing with Semiconductor Optical Amplifiers

Semiconductor optical amplifiers (SOAs) allow nonlinear operations (switching, logic) at low power (mW) because of amplification – typically up to 1000. An SOA is a laser diode with antireflection coatings in place of reflecting facets. Light pulses at bit rates in excess of 40 Gbps are amplified by absorbing power from an electronic pump. SOAs are used in the following, at bit rates up to 40 Gbps, for nonlinear operations in cross-gain, cross-phase, or cross-polarization.

SOA-XGM for Logic

Figure 1 shows an SOA in cross-gain modulation (XGM) performing a NOR function. A '1' level at either the A or the B input drives the SOA, through couplers, into saturation, lowering its gain. A continuous wave at C, having a frequency λ3 different from that of A (λ1) and B (λ2), experiences the decrease in gain, resulting in an output '0' level at frequency λ3. The filter in Figure 1 can be avoided by counter-propagating A and B through the SOA from right to left. XOR gates use interference between two incoming streams of bits (locked to identical frequency, phase, and polarization) to correlate internet protocol (IP) address headers – similar to matching zip codes in postal mail routing machines. Bit-serial adders are constructed from combinations of XOR and NOR gates; subsequently the adders are used to construct ripple-carry adders for word addition. In addition, NOR gates are used to construct flip-flops, and hence registers, which form short-term memory.

Figure 1  Optical XGM-SOA NOR gate.
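A behavioral model makes the gate-level composition concrete. The sketch below is a logic-level abstraction only (it ignores gain dynamics and wavelengths); the five-NOR XOR is a standard textbook composition included to show how bit-serial logic can be built from the Figure 1 gate, whereas the interferometric XOR described above is a single-stage device:

def nor(a, b):
    """Figure 1 behavior: a '1' on A or B saturates the SOA and lowers
    its gain, so the probe at lambda_3 emerges low; with A = B = 0 the
    probe sees full gain and emerges high."""
    return 0 if (a or b) else 1

def xor(a, b):
    """XOR composed from five NOR gates (a four-NOR XNOR, inverted)."""
    n1 = nor(a, b)
    xnor = nor(nor(a, n1), nor(b, n1))
    return nor(xnor, xnor)

for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b}  NOR={nor(a, b)}  XOR={xor(a, b)}")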


Figure 2  Optical XGM-SOA 2R signal regeneration.

Figure 3  XPM-SOA in a Sagnac interferometer for optical manipulation of bits at high data rates.

SOA-XGM for Frequency Conversion

Ignoring input B, the bit information in Figure 1 has been transferred from carrier frequency A onto carrier frequency C, with inversion. This enables switching of signals in wavelength division multiplexing (WDM): for example, color reuse is simplified, and the color red is routed to the location where blue was previously routed. This is equivalent to the time-slot interchangers used for switching in time division multiplexing (TDM).

SOA-XGM for 2-R Signal Regeneration

Figure 2 shows how two SOAs may be used in sequence to restore signal levels distorted by propagation or computation: reamplification and reshaping (2-R). The '1' level at the output of the first SOA is clipped by gain saturation, removing noise; this becomes the '0' level at the output of the second (inverting) SOA, and the second SOA in turn clips the '1' level at its output.

SOA-XPM with Interferometer

In cross-phase modulation (XPM), the signal at A in Figure 1 changes the phase of the carrier of signal C passing through the SOA (signal levels may be lower than for XGM). Inclusion of an SOA in an interferometer, Mach–Zehnder or Sagnac, converts phase to intensity for frequency conversion, switching, or high-speed bit manipulation. Figure 3 shows a control pulse entering a coupler at the left and providing a control input to the SOA at one frequency. The bits circulating around the loop at a different frequency can be switched on and off by the control pulse, to synchronize or manipulate high-bit-rate signals or for frequency conversion.

Computing with Integrated Optics

Combining several discrete optical components into a single optical chip is more beneficial than in electronic very large-scale integration (VLSI) because optical connections require greater precision; moreover, bit rates are higher because photons, unlike electrons, have no mass or charge (only spin).

Integrated Optic Microring Resonator

Figure 4 shows a microring resonator in an integrated optic chip for optical filtering; light entering at the left couples into the loop. For loop lengths that are a multiple of the wavelength, light cycling the loop is in phase and resonance occurs. At resonant frequencies, light exiting the loop is out of phase with that in the linear guide, causing cancellation of the input. As the frequency varies away from resonance, light passes, and the device acts as a notch filter that blocks light at the resonant frequencies. With nonlinear Kerr material, low-level light intensity may be used to change a resonant frequency to make a switch.

Figure 4  Integrated optic Optiwave layout for microring resonator.
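The notch response can be illustrated with the standard transmission formula for a single ring coupled to one bus waveguide (a textbook expression, not derived in this article; the self-coupling r and round-trip amplitude a below are illustrative values):

import numpy as np

def through_port(phi, r=0.9, a=0.95):
    """Through-port amplitude versus round-trip phase phi for a single
    bus-coupled ring: t(phi) = (r - a e^{i phi}) / (1 - r a e^{i phi})."""
    e = a * np.exp(1j * phi)
    return (r - e) / (1.0 - r * e)

for phi in np.linspace(-np.pi, np.pi, 5):
    T = abs(through_port(phi)) ** 2
    print(f"phi = {phi:+.2f} rad  T = {T:.3f}")   # deep dip at phi = 0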

Bragg Integrated Optic Microring Resonator

Performance is improved by using a Bragg microring resonator, Figure 5, which confines light in the central (light-colored) ring by dielectric mirror reflection from Bragg gratings on either side. The refractive index is lower in the ring than in the surrounding regions, in contrast to a conventional waveguide (used in Figure 4), which guides by total internal reflection. The higher reflectivity, smaller ring radius, and lower refractive index help to reduce power loss, volume, and travel time around the ring. This increases filter selectivity, drives up peak power for enhanced nonlinearity for switching, decreases stored energy for fast response, and increases the frequency range (free spectral range, FSR). In this case, an equivalent photonic crystal may be more expensive because it requires higher-resolution lithography and greater refractive index contrast.

Figure 5  Integrated optic Optiwave layout for Bragg microring resonator.

Photonic Crystal Mach–Zehnder Switch

Recently developed photonic crystal technology allows a reduction in integrated optic chip size from centimeters to submillimeters. The 2D or 3D periodic structure of dielectric variations emulates crystals and provides dielectric mirror reflections in multiple dimensions for lossless waveguide bends, high-Q resonators, and slowing of wave propagation to enhance the effective nonlinearity of materials. Figure 6 shows a proposed 2D photonic crystal Mach–Zehnder interferometer switch in which light is split into two paths by a coupler, and propagation velocity is slowed, to enhance nonlinearity, by using coupled-resonator optical waveguides (CROWs), in which alternate posts of different dielectric constant remain in the waveguides to form coupled cavities.

Figure 6  Proposed photonic crystal Mach–Zehnder interferometer.

Conclusion

Optical bit-serial computing can reduce cost and increase flow rate for expanding telecommunication network capacity, while providing enhanced security against electromagnetic sabotage and tampering. Advances in integrated optics (microring resonators), progress in optoelectronic components (SOAs), and improved materials and fabrication techniques enable cost-effective optical bit-serial computing solutions for future telecommunication evolution.

See also

All-Optical Signal Regeneration. Information Processing: All-Optical Multiplexing/Demultiplexing.

Further Reading

Agrawal G (2002) Fiber-Optic Communication Systems. New York: Wiley.
Connelly MJ (2002) Semiconductor Optical Amplifiers. Boston, MA: Kluwer Academic Publishers.
Goodman JW (1996) Introduction to Fourier Optics, 2nd edn. New York: McGraw-Hill.


Heuring VP, Jordan HF and Pratt JP (1992) A bit serial architecture for optical computing. Applied Optics 31(17): 3213–3224.
Hunsperger RG (2002) Integrated Optics, 5th edn. New York: Springer.
Joannopoulos JD, Meade RD and Winn JN (1995) Photonic Crystals. Princeton, NJ: Princeton University Press.
McAulay AD (1991) Optical Computer Architectures. New York: Wiley.
McAulay AD (2004) Novel all-optical flip-flop using semiconductor optical amplifiers in innovating frequency-shifting inverse-threshold pairs. Optical Engineering 43(5): 1115–1120.
Scheuer J and Yariv A (2003) Two-dimensional optical ring resonators based on radial Bragg resonance. Optics Letters 28(17): 1528–1530.
Xu L, Glesk I, Baby V and Prucnal PR (2004) All-optical wavelength conversion using SOA at nearly symmetric position in a fiber-based Sagnac interferometric loop. IEEE Photonics Technology Letters 16(2): 539–541.
Yariv A (1997) Optical Electronics in Modern Communications, 5th edn. New York: Oxford University Press.

Optical Digital Image Processing

B L Shoop, United States Military Academy, West Point, NY, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Optical approaches to image processing provide a natural method by which to obtain the inherent advantages associated with optics, including massive parallelism, high-speed processing, and inherent compatibility with image formats. Early applications of optics employed analog processing techniques, the most common being the optical Fourier transform, matrix-vector processors, and correlators. Later, advances in digital signal processing (DSP), digital storage, and digital communication systems demonstrated the potential for higher resolution, improved flexibility and functionality, and increased noise immunity over analog techniques, and digital processing quickly became the preferred method for accurate signal processing. Consequently, optical processing architectures that incorporated digital processing techniques were explored. Advances in very large-scale integration (VLSI) electronic circuitry and optoelectronic devices later enabled smart pixel technology, which integrates the programmability and processing of electronic circuitry with the two-dimensional (2D) nature of an optical architecture, and made possible a particular realization of optical digital image processing. This chapter describes an application of optical digital image processing, based on smart pixel technology, called digital image halftoning. Natural images are, by definition, continuous in intensity and color. Halftoning is the process by which a continuous-tone, gray-scale image is converted to one containing only binary output pixels. Halftoning can be thought of as an image compression technique whereby a high-resolution image is transformed to a low-resolution image containing only black-and-white pixels. The transformation

from a continuous-tone, gray-scale image to one containing only binary-valued pixels is also similar, in principle, to low-resolution analog-to-digital (A/D) conversion. An early manual approach to printing continuoustone images, using only the presence or absence of ink, is the Mezzotint. Ludwig von Siegen (1609 – 1680), an amateur Dutch printmaker, first invented the mezzotint, or ‘half tone process’, in 1640. The process later came into prominence in England during the early eighteenth century. Mezzotints are produced on copper plates where the surface of the plate is roughed with a tool called the Mezzotint Rocker, shaped like a wide chisel with a curved and serrated edge. By rocking the toothed edge backwards and forwards over the plate, a rough burr is produced which holds the ink. The dark regions of an image were roughed in a random fashion, while the areas to be lightened were scraped and burnished. This process was found to be especially useful for the reproduction of paintings, due to its ability to capture the subtlest gradations of tone from rich, velvety blacks to glowing highlights. The mezzotint is, therefore, an early artistic predecessor to the modern day halftone. Optical halftoning has been in use by commercial printers for over 100 years. Commercial halftone screens are based on a discovery made by William Henry Fox Talbot (1800 – 1877), in 1852. He demonstrated the feasibility of optical halftoning by photographing an image through a loosely woven fabric or ‘screen’. In the 1890s, this process came into practical use when the halftone screen, consisting of two ruled glass plates cemented together, became commercially available. Commercial halftone screens produce an effect of variably sized dots on the photographic plate that gives the illusion of a continuous tone image. Digital image halftoning, sometimes referred to as spatial dithering, is the process of converting an


electronic continuous-tone, gray-scale image to one containing only binary-valued pixels, for the purpose of displaying, printing, or storing the image. The underlying concept is to provide the viewer of the image with the illusion of viewing a continuous-tone image when, in fact, only black and white pixel values are used in the rendering. This process is particularly important in applications such as laser printing, bilevel displays, xerography, and, more recently, facsimile. There are a number of different methods by which digital image halftoning can be accomplished; they are generally classified as either point or neighborhood processes. A point process is one that computes the output pixel value based strictly on some characteristic of the corresponding input pixel. In contrast, a neighborhood process computes a single output pixel based on a number of pixels in a neighborhood or region of the input image. Ordered dither, which is considered a point process, produces an output by comparing a single continuous-tone input value against a deterministic periodic array of threshold values. If the value of the input pixel under consideration is greater than the corresponding threshold value, the corresponding output pixel is rendered white; if the intensity is less than the threshold, it is rendered black. Dispersed-dot ordered dither is a subset of ordered dither in which the halftone dots are of a fixed size, while clustered-dot ordered dither uses variable-sized dots and simulates the variable-sized dots of a printer's halftone screen in the rendering. Among the advantages of point process halftoning in general, and ordered dither halftoning specifically, are simplicity and speed of implementation. The primary disadvantage is that ordered dither produces patterns in the halftoned image which are visually undesirable. In contrast, halftoning using the error diffusion algorithm, first introduced by Floyd and Steinberg in 1975, employs neighborhood operations and is currently the most popular neighborhood process. In this approach to halftoning, the output pixel value is determined not solely by the value of the input pixel and some deterministic pattern, but instead by a weighted average of values from pixels in a neighborhood surrounding the specific pixel being computed. Here, the error of the quantization process is computed and spatially distributed, or diffused, within a local neighborhood in order to influence future pixel quantization decisions within that neighborhood and thereby improve the overall quality of the halftoned image. In classical unidirectional error diffusion, the image is processed sequentially, proceeding from the upper-left to the lower-right. Starting in the corner of the image, the first pixel is

thresholded and the quantizer error is calculated. The error is then diffused to neighboring pixels that have not yet been processed, according to the scaling and interconnect defined by the error diffusion filter. The remainder of the pixels in the image are subsequently processed, with each quantizer input now being the algebraic sum of the corresponding original input pixel intensity and the weighted error from previously processed pixels. While error diffusion produces halftone images of superior quality, the unidirectional processing of the algorithm continues to introduce visual artifacts that are directly attributable to the algorithm itself. In an effort to overcome the implementation constraints of serial processing and improve overall halftone image quality, a number of different parallel architectures have been investigated, including several based on neural network algorithms. While neural network approaches can provide distinct advantages in improved halftone image quality, they also present challenges to hardware implementations, including speed of convergence and physical interconnect requirements. These challenges, in addition to the natural compatibility with imaging media, provide the motivation for developing optical image processing architectures for these types of applications.
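A minimal software sketch of the serial scan just described (an added illustration; the [0, 1] image range and the 0.5 threshold are assumed conventions) makes the data flow concrete:

import numpy as np

def floyd_steinberg(image):
    """Unidirectional error diffusion with the Floyd-Steinberg weights
    (7, 3, 5, 1)/16; image values in [0, 1]; returns a binary halftone."""
    x = image.astype(float).copy()
    rows, cols = x.shape
    y = np.zeros_like(x)
    for m in range(rows):
        for n in range(cols):
            y[m, n] = 1.0 if x[m, n] >= 0.5 else 0.0
            err = x[m, n] - y[m, n]                  # quantizer error
            if n + 1 < cols:
                x[m, n + 1] += err * 7 / 16          # right
            if m + 1 < rows:
                if n > 0:
                    x[m + 1, n - 1] += err * 3 / 16  # below left
                x[m + 1, n] += err * 5 / 16          # below
                if n + 1 < cols:
                    x[m + 1, n + 1] += err * 1 / 16  # below right
    return y

# A flat 50% gray patch halftones to roughly half 'on' pixels:
print(floyd_steinberg(np.full((64, 64), 0.5)).mean())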

The Error Diffusion Algorithm

Figure 1 shows a block diagram of the error diffusion architecture used in digital image halftoning. The input image is represented as x_{m,n}; w_{m,n} is the impulse response of a 2D causal, unity-gain error diffusion filter; and the output quantized image is described by y_{m,n} ∈ {−1, 1}. The quantizer q[u_{m,n}] provides the one-bit thresholding functionality necessary to convert each analog input pixel to a low-resolution digital output pixel. The quantizer error ε_{m,n} is computed as the difference between the output and input of the quantizer and is distributed to adjacent pixels according to the weighting and interconnection specified by the error diffusion filter. The unity-gain constraint ensures

Figure 1  Block diagram of the recursive error diffusion architecture.


that no amplification or attenuation of the error signal occurs during the error diffusion process, and it preserves the average intensity of the image. In this architecture, the error associated with the quantizer decision at spatial coordinates (m, n) is diffused within a local 2D region to influence adjacent quantization decisions. All of the state variables in this architecture are scalar values, since this architecture represents a recursive algorithm in which each pixel in the image is processed sequentially. In the general case, where w_{m,n} is assumed to be a 2D finite impulse response (FIR) filter with coefficients that span a region of support defined by R_{m,n}, the quantizer input state u_{m,n} can be written as

u_{m,n} = x_{m,n} - \sum_{i,j \in R_{m,n}} w_{i,j} \, \varepsilon_{m-i,\, n-j}   [1]

Here, w_{0,0} = 0, since the error signal is diffused spatially within the region of support R_{m,n} and does not influence the pixel under consideration; in this unidirectional error diffusion, the 2D error diffusion filter is therefore causal. Figure 2 shows the impulse responses of several popular error diffusion filters that have been used in rectangular grid architectures. Figure 2a shows the original error diffusion filter weights developed by Floyd and Steinberg, who argued that four weights were the smallest number of weights that would produce good halftone images. Figure 2b is a filter with a larger region of support, developed by Jarvis, Judice, and Ninke in 1976. Finally, the filter in Figure 2c, developed by Stucki, has the same region of support, with coefficients that are multiples of 2 for digital compatibility and computational efficiency. Here, '†' represents the pixel being processed, and the weights describe the local region over which the error is diffused. The normalization factors ensure that the filter coefficients sum to one and therefore meet the unity-gain criterion. To understand the effect of error diffusion and its impact on the quantizer error, it is instructive to

(a) 1/16:
           †   7
       3   5   1

(b) 1/48:
               †   7   5
       3   5   7   5   3
       1   3   5   3   1

(c) 1/42:
               †   8   4
       2   4   8   4   2
       1   2   4   2   1

Figure 2  Three common error diffusion filters used in rectangular grid architectures. (The '†' represents the origin.)

describe the effect of error diffusion mathematically. Consider the relationship between the quantizer error ε_{m,n} ≜ y_{m,n} − u_{m,n} and the overall quantization error ε̂_{m,n} ≜ y_{m,n} − x_{m,n} in the frequency domain. If we assume that the error is uncorrelated with the input and has statistical properties consistent with a white process, z-transform techniques can be applied to show that the feedback architecture provides the following noise-shaping characteristic:

H_{ns}(z_1, z_2) \triangleq \frac{\hat{E}(z_1, z_2)}{E(z_1, z_2)} = 1 - W(z_1, z_2)   [2]

Here, Ê(z_1, z_2), E(z_1, z_2), and W(z_1, z_2) represent the z-transforms of ε̂_{m,n}, ε_{m,n}, and w_{m,n}, respectively. Equation [2] shows that the noise-shaping characteristic of the error diffusion architecture is directly related to the spectral characteristics of W(z_1, z_2). Appropriate selection of W(z_1, z_2) can spectrally shape the quantizer noise in such a way as to minimize the effect of the low-resolution quantization process on the overall halftoning process. An error diffusion filter with low pass spectral characteristics produces an overall noise-shaping characteristic 1 − W(z_1, z_2) with the desired high pass frequency characteristics. This noise-shaping function suppresses the quantization noise within those frequencies occupied by the image and spectrally shapes the noise to high frequencies, which are less objectionable to the human visual system. Circular symmetry is another important characteristic of the error diffusion filter, since the human visual system is particularly sensitive to directional artifacts in the image.

In the following qualitative description of digital image halftoning, a 348 × 348 natural image of the Cadet Chapel at West Point was selected as the test image. This image provides a variety of important image characteristics, with regions of uniform gray scale, edges, and areas with fine detail. The image was scanned from a high-resolution black-and-white photograph at 150 dots per inch (dpi) and then printed using a 300 dpi laser printer. Figure 3 shows the test image of the Cadet Chapel rendered using dispersed-dot ordered dither, which results in 256 gray levels at 300 dpi. Figure 4 shows the halftone image of the Cadet Chapel using the error diffusion algorithm and the Floyd–Steinberg filter coefficients shown in Figure 2a. The unidirectionality of the processing and the causality of the diffusion filter result in undesirable visual artifacts in the halftone image. These include directional hysteresis, which is manifested as 'snakes' running from upper-left to lower-right, and transient behavior near boundaries, which appears as 'shadows' below and to the right of sharp intensity changes. The directional


hysteresis is particularly objectionable in uniform gray intensity areas, such as the cloud structure in the upper-left of Figure 4. Similar artifacts are also present in images halftoned using the error diffusion algorithm and other causal diffusion kernels. The logical conclusion to draw from this limited, qualitative analysis is that if we could diffuse the error symmetrically and simultaneously process the entire image in parallel, we could reduce some of these visual artifacts and thereby improve overall halftone image quality.

Figure 3  Original gray-scale image of the Cadet Chapel rendered using dispersed-dot ordered dither.

Figure 4  Halftone image of the Cadet Chapel produced using the error diffusion algorithm and the Floyd–Steinberg weights.

The Error Diffusion Algorithm and Neural Networks

The popularity of the neural network-based approach to signal and image processing lies in its ability to minimize a particular performance metric associated with a highly nonlinear system of equations. Specifically, the problem of creating a halftone image can be cast as a nonlinear quadratic optimization problem in which the performance measure to be minimized is the difference between the original and halftone images.

The Hopfield-Type Neural Network

Tank and Hopfield first proposed a mathematical description of the functionality of the neural processing network for signal processing applications. Figure 5 shows an electronic implementation of a four-neuron architecture. Here, a single neuron comprises both a standard and an inverting amplifier, and the synapses, or neural interconnections, are represented by the physical connections between the inputs and outputs of the amplifiers. If the input to amplifier i is connected to the output of amplifier j by a resistor with resistance R_{i,j}, the amplitude of the connection T_{i,j} is the conductance 1/R_{i,j}. The dynamic behavior of an N-neuron Hopfield-type neural network can be described by the following system of N nonlinear differential equations:

c \frac{du_i(t)}{dt} = -u_i(t) + \sum_j T_{i,j} \, I[u_j(t)] + x_i   [3]

where i = 1, 2, …, N; I[·] is a monotonically increasing sigmoid function; x_i is an input vector containing N elements; and c is a scaling factor.

Figure 5  Electronic realization of a four-neuron Hopfield-type neural network.


In equilibrium, eqn [3] implies that

u_i = x_i + \sum_j T_{i,j} \, I[u_j]   [4]

Hopfield showed that when the matrix of interconnection weights T is symmetric with zero diagonal elements and the high-gain limit of the sigmoid I[·] is used, the stable states of the N functions y_i(t) = I[u_i(t)] are the local minima of the energy function

E = -\tfrac{1}{2} y^T T y - x^T y   [5]

where y ∈ {−1, 1}^N is the N-vector of quantized states. As a result, if the neural network can be shown to be stable, the energy function is minimized as the network converges. In most neural network applications, this energy function is designed to be proportional to a performance metric of interest; therefore, as the network converges, the energy function, and consequently the performance metric, is also minimized.

The Error Diffusion Neural Network

An understanding of the operation and characteristics of the Hopfield-type neural network can be applied to the development and understanding of a 2D extension of the error diffusion algorithm. Figure 6 shows an electronic implementation of a four-neuron error diffusion-type neural network, where the individual neurons are represented as amplifiers and the synapses by the physical connections between the inputs and outputs of the amplifiers.

Figure 6  Four-neuron electronic implementation of the error diffusion neural network.

In equilibrium, the error diffusion neural network can be shown to satisfy

u = W(y - u) + x   [6]

Here, u is the state vector of neuron inputs, W is a matrix containing the interconnect weights, x is the input state vector, and y is the output state vector. For an N × N image, W is an N² × N² sparse, circulant matrix derived from the original error diffusion weights w_{m,n}. If we define the coordinate system such that the central element of the error diffusion kernel is (i, j) = (0, 0), then the matrix W is defined as W(i, j) = −w[(j − i) div N, (j − i) mod N], where

x \,\mathrm{div}\, y = \lfloor x/y \rfloor \text{ if } xy \ge 0, \quad \lceil x/y \rceil \text{ if } xy < 0   [7]

An equivalence to the Hopfield network can therefore be described by

u = A(W y + x)   [8]

where A = (I + W)^{-1}. Effectively, the error diffusion network includes a pre-filtering of the input image x by the matrix A, while still filtering the output image y, but now with a new matrix, AW. Recognizing that AW = I − A and adding the arbitrary constant k = y^T y + x^T A x, we can write the energy function of the error diffusion neural network as

\hat{E}(x, y) = y^T A y - 2 y^T A x + x^T A x   [9]

This energy function is a quadratic function, which can be factored into

\hat{E}(x, y) = [B(y - x)]^T [B(y - x)]   [10]

where A = B^T B; each factor is the (filtered) error between output and input. From eqn [10] we find that as the error diffusion neural network converges and the energy function is minimized, so too is the error between the output and input images. If the neurons update independently, the convergence of the error diffusion network is guaranteed if

[AW]_{k,k} \ge 0 \quad \forall k   [11]
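The equilibrium condition of eqn [6] suggests a simple software relaxation. The sketch below is an illustrative toy, not the hardware implementation described later: the 3 × 3 symmetric kernel, the damped update step, and the iteration count are all assumptions, and SciPy's convolve2d supplies the 2D filtering:

import numpy as np
from scipy.signal import convolve2d

# Symmetric, zero-center kernel with unity gain (coefficients sum to 1),
# a tiny stand-in for the 7 x 7 filter of Figure 7.
w = np.array([[0.05, 0.10, 0.05],
              [0.10, 0.00, 0.10],
              [0.05, 0.10, 0.05]]) / 0.60

def ed_network_halftone(x, w, iters=200, dt=0.2):
    """Damped synchronous relaxation toward the equilibrium of eqn [6],
    written with the sign convention of eqn [1]: u = x - w * (y - u).
    A hard threshold stands in for the high-gain sigmoid I[.]."""
    u = x.copy()
    for _ in range(iters):
        y = (u >= 0.5).astype(float)
        err = y - u                               # quantizer error field
        u += dt * (x - convolve2d(err, w, mode="same", boundary="wrap") - u)
    return (u >= 0.5).astype(float)

rng = np.random.default_rng(1)
x = 0.4 + 0.2 * rng.random((32, 32))              # textured mid-gray input
y = ed_network_halftone(x, w)
print(x.mean(), y.mean())   # unity-gain diffusion typically preserves the mean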

We find in practice that, even in a synchronous implementation, the halftoned images converge to a solution that results in significantly improved halftone image quality over other similar halftoning algorithms.

The Error Diffusion Filter

The purpose of the error diffusion filter is to spectrally shape the quantization noise so that the quantizer error is distributed to higher spatial frequencies, which are less objectionable to the human visual system. In this application, a feedback filter for the error diffusion neural network was designed using conventional 2D filter design techniques, resulting in the 7 × 7 impulse response shown in Figure 7. Figure 8 shows the same image of the Cadet Chapel halftoned using the error diffusion neural network algorithm and the 2D error diffusion filter shown in Figure 7. There is clear improvement in halftone image quality over the image produced using sequential error diffusion in Figure 4. Notice particularly the uniform distribution of pixels in the cloud formation in the upper-left of Figure 8. Also noteworthy is the improvement around the fine-detail portions of the tree branches and next to the vertical edges of the chapel.

0.0003   0.0019   0.0051   0.0068   0.0051   0.0019   0.0003
0.0019   0.0103   0.0248   0.0328   0.0248   0.0103   0.0019
0.0051   0.0248   0.0583   0.0766   0.0583   0.0248   0.0051
0.0068   0.0328   0.0766     †      0.0766   0.0328   0.0068
0.0051   0.0248   0.0583   0.0766   0.0583   0.0248   0.0051
0.0019   0.0103   0.0248   0.0328   0.0248   0.0103   0.0019
0.0003   0.0019   0.0051   0.0068   0.0051   0.0019   0.0003

Figure 7  Impulse response of a 7 × 7 symmetric error diffusion filter. ('†' represents the origin.)

Figure 8  Halftone image of the Cadet Chapel using the error diffusion neural network and the 7 × 7 symmetric error diffusion filter shown in Figure 7.

Smart Pixel Technology and Optical Image Processing

The concept underlying smart pixel technology is to integrate electronic processing and individual optical devices on a common chip, to take advantage of the complexity of electronic processing circuits and the speed of optical devices. Arrays of these smart pixels then allow the advantage of parallelism that optics provides. There are a number of different approaches to smart pixel technology, which differ primarily in the kind of opto-electronic devices, the type of electronic circuitry, and the method of integrating the two. Common opto-electronic devices found in smart pixels include light-emitting diodes (LEDs), laser diodes (LDs), vertical cavity surface emitting lasers (VCSELs), multiple quantum well (MQW) modulators, liquid crystal devices (LCDs), and photodetectors (PDs). The most common types of electronic circuitry are silicon-based semiconductors, such as complementary metal oxide semiconductor (CMOS) circuitry, and compound semiconductors, such as gallium arsenide (GaAs)-based circuitry. Monolithic integration, direct epitaxy, and hybrid integration are the three most common integration approaches in use today. Smart pixel technology provides a natural methodology by which to implement optical image processing architectures. The opto-electronic devices provide a natural optical interface, while the electronic circuitry provides the ability to perform either analog or digital computation. It is important to understand that any functionality that can be implemented in electronic circuitry can be integrated into a smart pixel architecture; the only limitations arise from physical space constraints imposed by the integration of the opto-electronic devices with the electronic circuitry. While smart pixels can be fabricated with digital or analog circuitry, the smart pixel architecture described subsequently uses mixed-signal circuitry.

A Smart Pixel Implementation of the Error Diffusion Neural Network


A smart pixel hardware implementation of the error diffusion neural network provides the potential to simultaneously achieve the computational complexity of electronic circuitry and the parallelism and high-speed switching of optics. The specific smart pixel architecture described here integrates MQW modulators, called self-electrooptic effect devices (SEEDs), with silicon CMOS VLSI circuitry, using a hybrid integration approach called flip-chip bonding. This type of smart pixel is commonly referred to as CMOS-SEED smart pixel technology. To provide an example of the application of smart pixel technology to digital image halftoning, a 5 × 5 CMOS-SEED smart pixel array was designed and fabricated. The CMOS circuitry was produced using a 0.5 μm silicon process, and the SEED modulators were subsequently flip-chip bonded to the silicon circuitry using a hybrid integration technique. The central neuron of this new smart pixel array consists of approximately 160 transistors, while the complete 5 × 5 array accounts for over 3,600 transistors. A total of 50 optical input/output channels are provided in this implementation. Figure 9 shows the circuitry associated with a single neuron of the smart pixel error diffusion neural network. All state variables in each circuit of this architecture are represented as currents. Beginning in the upper-left of the circuit and proceeding clockwise, the input optical signal incident on the

SEED is continuous in intensity and represents the individual analog input pixel intensity. The input SEED at each neuron converts the optical signal to a photocurrent, and current mirrors are subsequently used to buffer and amplify the signal. The width-to-length ratios of the metal oxide semiconductor field effect transistors (MOSFETs) used in the CMOS-SEED circuitry provide current gain to amplify the photocurrent. The first circuit produces two output signals: +I_u, which represents the state variable u_{m,n} as the input to the quantizer, and −I_e, which represents the state variable −ε_{m,n} as the input to the feedback differencing node. The function of the quantizer is to provide a smooth, continuous threshold functionality for the neuron that produces the output signal I_out, corresponding to the output state variable y_{m,n}. This second electronic circuit is called a modified wide-range transconductance amplifier and produces a hyperbolic tangent sigmoidal function when operated in the sub-threshold regime. The third circuit takes as its input I_out, produces a replica of the original signal, and drives

Figure 9  Circuit diagram of a single neuron and a single error weight of the 5 × 5 error diffusion neural network based on a CMOS-SEED-type smart pixel architecture. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81, Fig. 8.6, p. 222. Copyright 2001, Springer-Verlag GmbH & Co. KG.)


the output optical SEED. In this case, the output optical signal is a binary quantity, represented as the presence or absence of light; when the SEED is forward-biased, light is generated through electroluminescence. The last circuit, at the bottom of the schematic, implements a portion of the error weighting and distribution function of the error diffusion filter. The individual weights are implemented by scaling the width-to-length ratios of the MOSFETs to achieve the desired weighting coefficients. The neuron-to-neuron interconnections are accomplished using the four metalization layers of the 0.5 μm silicon CMOS process. In this design, the error diffusion filter was limited to a 5 × 5 region of support because of the physical constraints of the circuitry necessary to implement the complete error diffusion neural network in silicon. The impulse response of the 5 × 5 filter used in this particular smart pixel architecture is shown in Figure 10. The error weighting circuitry at the bottom of Figure 9 represents only the largest weight (0.1124), with interconnects to its four local neighbors (I_outA–I_outD).

0.0023   0.0185   0.0758   0.0185   0.0023
0.0185   0.0254   0.1124   0.0254   0.0185
0.0758   0.1124     †      0.1124   0.0758
0.0185   0.0254   0.1124   0.0254   0.0185
0.0023   0.0185   0.0758   0.0185   0.0023

Figure 10  Error diffusion filter coefficients used in the 5 × 5 CMOS-SEED error diffusion architecture. ('†' represents the origin.)

Figure 11 shows a photomicrograph of a 5 × 5 error diffusion neural network fabricated using CMOS-SEED smart pixels. Figure 12 shows a photomicrograph of a single neuron of the 5 × 5 CMOS-SEED neural network. The rectangular features are the MQW modulators, while the silicon circuits are visible between and beneath the modulators. The MQW modulators are approximately 70 μm × 30 μm and have optical windows of 18 μm × 18 μm. Simulations of this network predict individual neuron switching speeds of less than 1 μs, which corresponds to network convergence speeds capable of providing real-time digital image halftoning. Individual component functionality and dynamic operation of the full 5 × 5 neural array were both experimentally characterized. Figure 13 shows CCD images of the operational 5 × 5 CMOS-SEED smart pixel array; the white spots show SEED MQW modulators emitting light. Figure 13a shows the fully functioning array, while Figure 13b shows the 5 × 5 CMOS-SEED neural array under 50% gray-scale input. Here 50% of the SEED modulators are in the on-state, demonstrating correct operation of the network and of the halftoning architecture. Both the simulations and the experimental results demonstrate that this smart pixel implementation of the error diffusion neural network provides sufficient accuracy for the digital halftoning application. The individual neuron switching speeds also demonstrate the capability of this smart pixel hardware implementation to provide real-time halftoning of video images.

Figure 11  Photomicrograph of a smart pixel implementation of a 5 × 5 CMOS-SEED error diffusion neural network for digital image halftoning. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81, Fig. 8.5, p. 221. Copyright 2001, Springer-Verlag GmbH & Co. KG.)


Figure 12  Photomicrograph of a single neuron of the 5 × 5 error diffusion neural network. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81, Fig. 8.4, p. 220. Copyright 2001, Springer-Verlag GmbH & Co. KG.)

Figure 13  CCD images of the 5 × 5 CMOS-SEED array. (a) Fully operational array, and (b) under 50% gray-scale illumination. (Reprinted with permission from Shoop BL, Photonic Analog-to-Digital Conversion, Springer Series in Optical Sciences, Vol. 81, Figs. 8.19 and 8.20, p. 236. Copyright 2001, Springer-Verlag GmbH & Co. KG.)

Image Processing Extensions

Other image processing functionality is also possible by extending the fundamental concepts of 2D, symmetric error diffusion to other promising image processing applications. One important application is color halftoning; other extensions include edge enhancement and feature extraction. In the basic error diffusion neural network, the error diffusion filter was specifically designed to produce visually pleasing halftoned images. Care was taken to ensure that the frequency response of the filter was circularly symmetric and that the cutoff frequency was chosen in such a way as to preserve image content. An analysis of the error diffusion neural network shows that the frequency response of this filter directly shapes the frequency spectrum of the output halftone image and therefore directly controls halftone image content. Other filter designs with different spectral responses can provide other image processing features, such as edge enhancement, which could lead to feature extraction and automatic target recognition applications.

See also

Detection: Smart Pixel Arrays. Information Processing: Optical Neural Networks. Optical Processing Systems.


Further Reading

Dudgeon DE and Mersereau RM (1984) Multidimensional Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall.
Lau DL and Arce GR (2001) Modern Digital Halftoning. New York: Marcel Dekker.
Lim JS (1990) Two-Dimensional Signal and Image Processing. Englewood Cliffs, NJ: Prentice-Hall.
Mead CA (1989) Analog VLSI and Neural Systems. New York: Addison-Wesley.
Poon T-C and Banerjee PP (2001) Contemporary Optical Image Processing with Matlab. New York: Elsevier.
Russ JC (1998) The Image Processing Handbook. Boca Raton, FL: CRC Press.
Shoop BL (2001) Photonic Analog-to-Digital Conversion. Springer Series in Optical Sciences, vol. 81. Berlin: Springer-Verlag.
Shoop BL, Sayles AH and Litynski DM (2002) New devices for optoelectronics: smart pixels. In: De Cusatis (ed.) Fiber Optic Data Communication: Technological Trends and Advances, pp. 352–421. San Diego, CA: Academic Press.
Ulichney R (1987) Digital Halftoning. Cambridge, MA: The MIT Press.
Wax C (1990) The Mezzotint: History and Technique. New York: Harry N. Abrams, Inc.

Optical Neural Networks

H J Caulfield, Fisk University, Nashville, TN, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

An artificial neural network (ANN) is a nonlinear signal processing system based on the neural processes observed in animals. Usually ANNs have multiple inputs and often multiple outputs as well. Conventionally, each input sends its signal to many neurons, and each neuron receives signals from many inputs. The neuron forms an intermediate sum of weighted inputs and transforms that sum, according to some nonlinearity, to form an output signal. Often the output signal from one neuron is used as the input signal for another in feed-forward, feedback, or mixed-mode complex neural networks. ANNs are of value in helping us understand how biological neural networks perform, and in a variety of technical applications, most of which involve pattern association. An auto-associative neural network, which associates patterns with themselves, can have useful noise-cleanup or pattern-completion properties. A hetero-associative neural network is often used in pattern recognition to associate input patterns with preselected class labels.

Neural Networks – Natural and Artificial

The most primitive one-celled organisms have no nerve cells, but they do exhibit various chemical means for interacting with their environments, within their own cells, and among individuals of the same or different types. As organisms increased in size, a common chemical environment became impractical, and nerves arose as a means for long-distance communication within larger organisms. One thing seen in evolution is that widespread useful adaptations are often modified but seldom discarded. Thus, it should not be surprising to find that chemical interactions play a major role in the most complex thing we know of – the human brain. Chemicals play a huge role in making us who we are, setting our moods, and so forth. Realizing this, in the late twentieth century, psychiatrists started to make serious progress in treating mental diseases through the use of chemistry. There are a variety of processes that we use as humans, but the primary means is neural. Only some animals and no plants have neurons, but soon after neurons evolved, they began to concentrate in special regions of the body that developed into primitive brains. Not only were there many neurons, but they were also densely and complexly interconnected to form neural networks. Neural networks did not replace chemical interactions; they used and supplemented them. Chemical communication across the synaptic cleft dominates communication between networks, but the neural networks also control chemical production and distribution within the brain, and those chemicals change the behavior of the neurons that cause their secretion.

The functioning of real brains is incredibly complex, possibly irreducibly complex in terms of current mathematics and logic. These things are noted to avoid a confusion that is distressingly common. Technologists produce simplified ANNs and then try to interpret the behavior of brains in those terms. Quite often, ANNs yield useful insight into brain behavior but seldom offer a detailed account of it. Reality is far more complex than any ANN. Current mathematics provides no means to describe continuous sets of events, each of which both causes and is caused by the others.

ANNs themselves have grown much more complicated as well. Starting out as simple, feed-forward, single-layer systems, they rapidly became bidirectional and even multilayered. Higher-order neural networks processed the input signals before inserting them into the neural network. Pulse-coupled neural networks (PCNNs) arose that more closely modeled the pulsations in real brains. Learning was broken into learning with instruction, learning by example, and self-organizing, the latter leading to self-organized maps, adaptive resonance theory, and many related subjects. These are but crude pointers to the wealth and variety of concepts now part of the ANN field.

The history of ANNs is both complicated and somewhat sordid. This is not the occasion to retrace the unpleasant things various members of this community have done to others in their quests for stature in the field. One result of that history is a dramatically up-and-down pattern of interest in and support for ANNs. The first boom followed the development of a powerful single-layer neural network called the Perceptron by its inventor, Rosenblatt. Perceptrons were biologically motivated, fast, and reasonably effective, so interest in the new field of ANNs was high. That interest was destroyed by a book on the field arguing that Perceptrons were only linear discriminants and that most interesting problems are not linearly discriminable. Ironically, the same argument has been made about optical Fourier correlators, and it is true there as well. But optical processing folk largely ignored the problem and kept working on it. Sociologists of science might enjoy asking why reactions differed so much in the two fields. Funding and interest collapsed and the field went into hibernation. It was widely recognized that multilayer ANNs would comprise nonlinear discriminants, but training them presented a real problem. The simple rewarding and punishing of weights depending on performance was easy when each output signal came from a known set of weighted signals that could be rewarded or punished jointly, according to how the output differed from the sought-after output. With a multilayer perceptron, credit assignment for the inner layers is no longer simple. It was not until Werbos invented 'backward error propagation' that a solution to this problem was available and the field started to boom again. Also key to the second boom were enormously popular articles by Hopfield on rather simple auto- and hetero-associative neural networks. Basically, those networks are almost never used, but they were of great value in restarting the field. In terms of research and applications, we are still in that second boom, which began during the 1980s.

An ANN is a signal processor composed of a multiplicity of nodes, each of which receives one or more input signals and generates one or more output signals in a nonlinear fashion, with the nodes connected by one- or two-way signaling channels.

Note also that biological inspiration is no longer a part of the definition. Most ANNs used for association are only vaguely related to their biological counterparts.

Optical Processors

Optics is well suited for massive parallel interconnections, so it seems logical to explore optical ANNs. The first question to address is "Why bother to use optical processors at all?", as electronic digital computers seem to have everything:

• Between the writing of this article (on an electronic computer) and its publication, the speed of electronic computers will almost double, and they will become even cheaper.
• They can have arbitrary dynamic range and accuracy.
• Massively improved algorithms also improve their speed. For instance, the fast Fourier transform (FFT) that opticists have used to simulate optics for decades has been superseded by the even faster "Fastest Fourier Transform in the West" (FFTW). And for wavelets, the fast wavelet transform (FWT) is faster still.

Optical processors, on the other hand, have severely limited analog accuracy and few analog algorithms or architectures. They are specialized systems (equivalent in electronics, not to computers, but to application-specific integrated circuits (ASICs)). They are usually larger and clumsier than electronic systems and always cost more.

There are several reasons why optics may still have a role in some computations. First, many things on which we wish to perform computations are inherently optical. If we can do the computations optically, before detection, we can avoid some of the time, energy, and sensitivity penalties inflicted if we first detect the signals, then process them electronically, and then (sometimes) convert them back to optics. Examples include spectroscopy and optical communication. Second, pure optics is pure quantum mechanics. Input of data and setup of the apparatus is experiment preparation. Readout of data is classical measurement. Nature does the rest free of charge. There is a unitary transformation from input to output that requires no human intervention and is, in fact, destroyed by human intervention. Those intermediate (virtual) computations require no energy and no time, beyond the time taken for light to propagate through the system. Speed and power consumption are limited only by the input and output. This applies to everything from Fourier optics to digital optics to logic gates. Thus, optics can have speed and power-consumption advantages over electronics. Third, if quantum computers involving entanglement ever emerge, they may well be optical. Such computers would be used for nondeterministic solutions of hard problems. That is, they would explore all possible paths at once and see which one does best. This is something we can already do electronically or optically for maze running, using classical (not quantum) means. Fourth, most optical processors are analog, which is sometimes superior to digital. In those cases, we would still have to choose between analog optics and analog electronics, of course.

So, optical processing (like all other processing) is a niche market. But some of the niches may be quite large and important. It is not a universal solution, but neither is digital electronics.

The Major Branches of Optical Processing and Their Corresponding Neural Networks

Every branch of optical processing leads to its own special-purpose optical neural networks, so those branches become a good place to start an article such as this one.

Optical Linear Algebra

Many neural network paradigms use linear algebra followed by a point-by-point nonlinearity in each of multiple stages. The linear operation is a matrix–vector multiplication, a matrix–matrix multiplication, or even a matrix–tensor multiplication. In any case, the key operations are multiplication and addition – operations very amenable to performance with analog optics. With effort, it is possible to combine simple electronics with analog optics operating on non-negative signals (the amount of light emitted from a source, the fraction of light passing through a modulator, etc.) to expand to real or even complex numbers. With even more effort, we can use multiple analog non-negative signals to encode a digital signal, allowing digital optical processors.

Almost all optical neural networks using linear algebra use only real numbers. In most of these, each real number is encoded as a positive number and a negative number. Data are read in parallel using spatial light modulators (SLMs), source arrays, acousto-optic modulators, etc. They are operated upon by other SLMs, acousto-optic systems, and so forth. Finally, they are detected by individual detectors or detector arrays. The required nonlinearities are then applied electronically. In theory, any linear algebraic process of computational complexity up to O(N⁴) can be performed with temporal complexity O(N⁰) if the extra dimensions of complexity are absorbed in space (up to O(N²), through 2D arraying) and in fanin/fanout, the ability of multiple beams of light to be operated upon in parallel by the same physical modulator; fanin/fanout has the same complexity as the 2D array of modulators, namely O(N²). Thus, we can invert an N × N matrix, an order-O(N⁴) operation, in O(N⁰) time – that is, independently of the size of N – as long as the matrix is accommodated by the apparatus. This offers a speed electronics cannot match.

Most of the optical neural networks of the mid-1980s had optical vector–matrix multipliers at their heart. Most were a few layers of feed-forward systems, but by the 1990s feedback had been incorporated as well. These are typically N inputs connected to N outputs through an N × N matrix, with N varying from 100 to 1000. With diffuser interconnection and a pair of SLMs, it is possible to connect an N × N input with an N × N output using N⁴ arbitrary weights, but that requires integrating over a time of up to N² intervals. Most of the N⁴-dimensional matrices can be approximated well by fewer than N⁴ terms using singular value decomposition (SVD) and, as this version of the interconnection uses outer products, that reduces the number of integration times considerably in most cases. We have shown, for instance, that recognizing M target images can be accomplished well with only M terms in the SVD.
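The following sketch illustrates the singular-value reduction just described. It is a numerical illustration only: the matrix is random and the term count M is arbitrary, so it models the arithmetic, not any actual optical interconnection.

```python
import numpy as np

# A fully interconnected (N x N) -> (N x N) network needs N^4 weights;
# flattening the 2D arrays gives an (N^2 x N^2) matrix that the SVD
# decomposes into rank-one outer products, of which only the leading M
# may be needed.
N = 8
T = np.random.rand(N * N, N * N)          # stand-in interconnection matrix

U, s, Vt = np.linalg.svd(T, full_matrices=False)

M = 10                                    # keep only the leading M terms
T_approx = (U[:, :M] * s[:M]) @ Vt[:M, :]

rel_err = np.linalg.norm(T - T_approx) / np.linalg.norm(T)
print(f"relative error with {M} of {len(s)} terms: {rel_err:.3f}")
```

Each retained term corresponds to one outer product, and hence to one integration interval in the optical implementation, which is why truncating the SVD shortens the integration time.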

Coherent Optical Fourier Transformation

It took opticists several years, during the early 1960s, to realize that if Fourier mathematics was useful in describing optics, optics might be useful in performing Fourier mathematics. The first applications of optical Fourier transforms were in image filtering and particularly in pattern recognition. Such processors are attractive not only for their speed but also for their ability to locate the recognized pattern in 1D, 2D, or even 3D. It was soon obvious that a Fourier optical correlator was a special case of a single-layer perceptron. Of course, it inherits the weaknesses of a single-layer perceptron too. Soon, workers had added feedback to create a winner-takes-all type of auto-associative optical neural network, a system that associated images with themselves. This proved useful in restoring partial or noisy images. It was not until 2003 that we learned how to make powerful hetero-associative networks with Fourier optical systems. This creates a nonlinear discrimination system that maintains the target location and generalizes well from a few samples of multiple classes to classify new data quite accurately.

Imaging

Imaging is the central act of optics and might seem to be of little value in optical processing. But multiple imaging (with lenslets, self-focusing fibers, holograms, and the like) has been the basis of numerous optical neural networks. Multiple imaging is a nonrandom way of distributing information from one plane to the next. Diffusers have also been used to produce fully interconnected optical neural networks. If each broad image point contains the whole image (something easily arranged with a diffuse hologram), then we have an auto-associative memory. Different such auto-associative networks can be superimposed in a single hologram. Then feedback after nonlinear image filtering can be used to make a general auto-associative memory.

Even deliberately poor imaging has been used. A blurred or out-of-focus image can accomplish the lateral signal spreading needed to facilitate a PCNN. These networks are substantially more like real brain networks than those discussed so far. A PCNN starts with a 2D array of integrate-and-fire units. The firing rate of each depends only on the incident power, that is, on how long it takes the integrated signal to reach the firing threshold. But if the fired signal is spread to its neighbors, the firing of a neighbor can hasten any unit's firing. This causes firings to synchronize and results in 'autowaves' of firing patterns moving across the 2D plane. Thus an incident 2D image is converted to a 3D signal – the two spatial dimensions and time. This results, for all but some very simple cases, in a chaotic signal occupying a strange attractor. It is interesting but not very revealing in itself. But suppose we 'integrate out' either the two spatial dimensions or the time dimension. If we integrate out the spatial dimensions, we are left with a time signal. Its strange attractors are remarkably useful in pattern recognition, because they are more syntactic than statistical (in pattern-recognition terms). They describe shapes as similar independently of the size, rotation, perspective, or even shading of the pattern! If we integrate out the time dimension, we restore the original image – almost. In fact, we can use this aspect of PCNNs for optimum blind segmentation of images. The autowaves can be made to 'leave tracks' so they can be reversed, which allows them to solve maze problems nondeterministically, because they can take all the paths. When the first wave exits, we simply retrace it to find the shortest path through the maze – a prototypical nondeterministic polynomial operation.
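A toy numerical model can make the integrate-and-fire picture concrete. Everything in this sketch (coupling, decay, threshold, lattice size) is an illustrative assumption, not a parameterization of any real PCNN implementation:

```python
import numpy as np

def pcnn_step(P, U, coupling=0.2, decay=0.7, threshold=1.0):
    """One iteration of a toy pulse-coupled neural network.

    P: 2D incident power image; U: internal activity. Each unit integrates
    its input plus the lateral spread from the previous step's firings;
    firing resets the unit's activity. All constants are illustrative.
    """
    fired = (U >= threshold).astype(float)
    # lateral spread of firings to the four nearest neighbors
    spread = (np.roll(fired, 1, 0) + np.roll(fired, -1, 0) +
              np.roll(fired, 1, 1) + np.roll(fired, -1, 1))
    U = decay * U * (1 - fired) + P + coupling * spread
    return fired, U

P = np.random.rand(32, 32)            # stand-in input image
U = np.zeros_like(P)
time_signal = []
for _ in range(100):
    fired, U = pcnn_step(P, U)
    time_signal.append(fired.sum())   # 'integrate out' space -> 1D signature
```

Summing the firing map at each step is the "integrate out the spatial dimensions" operation described above; the resulting time series is the shape signature used for pattern recognition.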

Holograms

Holograms are the most general, complex, and flexible means of light manipulation so far invented. Naturally, they have many applications in optical neural networks. One such application is the interconnection of each of an N × N array of inputs to each of an N × N array of outputs with its own unique connection strength (weight, in the language of neural networks). The practical limit seems to be around N = 1000, yielding 10¹² weights – roughly the same number as in the human brain! The brain, however, is not fully interconnected, so this produces an entirely different type of neural network. The resulting network would have fixed weights, but it could perform a variety of useful operations and be both forward and backward connected. Special on-axis holograms, called diffractive optical elements, can be mass manufactured very effectively and inexpensively to perform tasks such as multiple imaging.

Photorefractive Neural Networks

A very complex thing happens when a pattern of fringes, such as those that form an interference pattern for hologram recording, strikes materials such as lithium niobate, strontium barium niobate, gallium arsenide, and many others that fall into this category of 'photorefractive' materials. Of course, 'photo' refers to light and 'refractive' refers to the speed of light in a material – the speed of light in a vacuum divided by the index of refraction. These materials are photoconductive, so the light releases charge carriers in what would otherwise be an electrical insulator. The periodic charge pattern sets up a periodic space charge within the material. But the material is also electro-optic, that is, its index of refraction changes with applied electric field, thus generating a periodic index-of-refraction grating in the material. Finally, that index-of-refraction pattern (a phase hologram) diffracts the incoming light. Usually, but not always, a steady-state situation is reached after a time that depends on the material and the photon density, although chaos can arise under some conditions. This makes for a dynamic hologram, one that can change or adapt over time as the input pattern changes. Once a satisfactory photorefractive hologram is formed, it can be stabilized (fixed) if desired. Most optical neural network uses envision a dynamic situation with changing interconnections. What has not been discussed above is that the index of refraction varies with the cube of the light's electric field, so photorefractives provide a convenient way to accomplish such nonlinear operations as winner-takes-all.

Sometimes, we can record holograms with photorefractives that could not be recorded conventionally in conventional materials. To explain this: a hologram is a transducer between two optical wavefronts – usually called the reference and object beams. If the reference beam is incident on the hologram, it is (fully or partially) converted into the object beam. If the precisely reversed (phase-conjugated) reference beam is incident on it, the precisely reversed (phase-conjugated) object beam is derived from it. Likewise, the object wavefront (or its phase conjugate) produces the reference wavefront (or its phase conjugate). The hologram is made by recording the interference pattern between the two wavefronts: the reference and object beams. The ability of two wavefronts to form a temporally stable interference pattern at some point in space is called the mutual coherence of those wavefronts. Usually, both wavefronts are derived from the same laser beam, to make achieving high mutual coherence possible. And, of course, both beams must be present to form an interference pattern. In some cases of photorefractive holograms, all of those conditions can be violated. The two wavefronts can derive from two sources that are different in wavelength, polarization, etc., and need not be simultaneously present. This phenomenon, called mutual phase conjugation, converts each wavefront into the phase conjugate of the other, but for effects due to any differences in wavelength and polarization. This allows new freedoms that have been exploited in optical neural networks.

Photorefractive materials also tend to be piezoelectric. It has been found that up to about 20 totally independent reflective holograms can be stored in lithium niobate, each with a different applied electric field. Changing the electric field so changes the fringe spacings and indices of refraction that only one of those holograms has any appreciable diffraction efficiency at any one field value. The result is an electrically selectable interconnection pattern. The same hologram can contain the information needed for many neural networks, or for multiple layers of fewer of them. There is no need, in general, to have more than an input layer, an output layer, and two intermediate or hidden layers.

Others have used photorefractive materials to implement learning in neural networks. In biological neural networks the synaptic strengths (the weights between neurons) continue to adapt. With photorefractive holograms, we can achieve this in an optical ANN. As previously discussed, a key development in the history of ANNs was a way to train multilayer neural networks and thus accomplish nonlinear discrimination. The first, and still most popular, way to do this is called backward error propagation. Several groups have implemented it in photorefractive optical neural networks.

Conclusions

Essentially every major type of neural network has been implemented optically, and every major tool of optics has found its way into these optical neural networks. Yet it remains the case that almost every neural network used for practical applications is electronic. A convincing existence proof of a use for optical neural networks may be what the field needs to move forward from the demonstration stage to the application stage. The tools exist, and there are niches where optics seems more appropriate than electronics. Hopefully, the next review of the field will include a number of commercial successes. Nothing spurs science better than the funds that follow successful applications.

List of Units and Nomenclature

Adaptive resonance theory (ART): A self-organized neural-net categorizer that works by mutual adjustment between bottom-up and top-down neural connections.

Computational complexity: The measure of how the number of required calculations scales with the problem size, usually denoted by N. The scaling is represented in big-O form as O(f(N)), where f(N) shows how the count scales asymptotically (large N). Problems that scale as O(N^p) are said to be of polynomial complexity. Unfortunately, some of the most important problems have exponential complexity.

Fanin/Fanout: Beams of light (unlike beams of electrons or holes) do not interfere with one another, so many of them can be modulated by the same physical elements quite independently, so long as they can be identified. This is usually done by sending the different beams in at different angles, giving rise to the terms fanin and fanout.

Self-organized maps (Kohonen networks): A means of self-organization of inputs into (normally 2D) maps, wherein every concept falls into some category and the categories themselves develop out of the application of Kohonen's algorithm and user-selected parameters to the given data.

Singular value decomposition (SVD): Any matrix can be decomposed into a weighted sum of simpler, rank-one matrices. The simplest such decomposition (in the sense of the fewest number of terms) is the SVD. The weights are the singular values and are indicators of the importance of each term in the decomposition.

Spatial light modulator (SLM): A 1D or 2D array of elements under electronic or optical control that change some property of the light incident on them. Usually it is the index of refraction in some direction that is changed. That change can be used to modulate the amplitude, the polarization, or the phase of the light.

See also

Fourier Optics. Holography, Techniques: Overview. Nonlinear Optics, Basics: Photorefraction. Optical Communication Systems: Local Area Networks. Optical Processing Systems.

Further Reading

Abu-Mostafa YS and Psaltis D (1987) Optical neural computers. Scientific American 256(3): 88–95.
Anderson DZ and Erie MC (1987) Resonator memories and optical novelty filters. Optical Engineering 26: 434–444.
Brady D, Psaltis D and Wagner K (1988) Adaptive optical networks using photorefractive crystals. Applied Optics 27: 1752–1759.
Caulfield HJ, Kinser J and Rogers SJ (1987) Optical neural networks. Proceedings of the IEEE 77: 1573–1583.
Chavel P, Lalanne P and Taboury J (1989) Optical inner-product implementation of neural network models. Applied Optics 28: 377–385.
Collins SA, Ahalt SC, Krishnamurthy AK and Stewart DF (1993) Optical implementation and applications of closest-vector selection in neural networks. Applied Optics 32: 1297–1303.
Farhat NH, Psaltis D, Prata A and Paek E (1985) Optical implementation of the Hopfield model. Applied Optics 24: 1469–1475.
Galstyan T, Frauel Y, Pauliat G, Villing A and Roosen G (1997) Topological map from a photorefractive self-organizing neural network. Optics Communications 135: 179–188.
Goodman JW, Dias AR and Woody JM (1978) Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms. Optics Letters 2: 1–3.
Johnson JL (1994) Pulse-coupled neural nets: translation, rotation, scale, distortion, and intensity signal invariance for images. Applied Optics 33: 6239–6253.
Jutamulia S (1994) Selected Papers on Optical Neural Networks. Bellingham: SPIE Press.
Kinser JM, Caulfield HJ and Shamir J (1988) Design for a massive all-optical bidirectional associative memory: the big BAM. Applied Optics 27: 3442–3444.
Neifeld MA and Psaltis D (1993) Optical implementations of radial basis classifiers. Applied Optics 32: 1370–1379.
Owechko Y, Dunning GJ, Marom E and Soffer BH (1987) Holographic associative memory with nonlinearities in the correlation domain. Applied Optics 26: 1900–1910.
Paek EG and Psaltis D (1987) Optical associative memory using Fourier-transform holograms. Optical Engineering 26: 428–433.
Psaltis D and Farhat NH (1985) Optical information processing based on associative-memory model of neural nets with thresholding and feedback. Optics Letters 10: 98–100.
Soffer BH, Dunning GJ, Owechko Y and Marom E (1986) Associative holographic memory with feedback using phase-conjugate mirrors. Optics Letters 11(2): 118–120.
Yu FTS (1993) Optical neural networks – architecture, design and models. In: Wolf E (ed.) Progress in Optics, II, pp. 63–144. Amsterdam: North-Holland.


INSTRUMENTATION

Contents

Astronomical Instrumentation
Ellipsometry
Photometry
Scatterometry
Spectrometers
Telescopes

Astronomical Instrumentation

J Allington-Smith, University of Durham, Durham, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

This article describes the requirements placed on astronomical instruments and the methods used to achieve them, making reference to generic techniques described elsewhere in this encyclopedia. It is limited in scope to electromagnetic radiation with wavelengths from the atmospheric cutoff in the ultraviolet (0.3 µm) to the point in the near-infrared where the technology changes radically (5 µm), taking in the transition from uncooled to cooled instruments at ~1.7 µm. Thus, it excludes the extreme-UV and X-ray regimes, where standard optical techniques become ineffective (e.g., requiring grazing-incidence optics), and all mid-infrared, sub-mm, THz, and radio-frequency instruments. Terrestrial telescopes are the main focus, but space-borne telescopes are subject to similar design considerations, except that the diffraction limit of the optics may be achieved at short wavelengths. Adaptive optics (AO), of increasing importance in astronomy and almost essential for the next generation of extremely large telescopes (ELTs), is covered elsewhere in this encyclopedia.

Astronomical Data

The purpose of astronomical instruments is to measure the following properties of photons arriving on or near to the Earth and brought to a focus by an astronomical telescope:

(i) direction of arrival;
(ii) energy;
(iii) polarization state; and
(iv) time of arrival.

In practice, these properties cannot be measured simultaneously, requiring the following generic classes of instrument:

• Imagers, to measure (i) and, if equipped with ultra-low-noise detectors, (iv);
• Spectrographs, to measure (ii), although some can be configured to do a good job on (i) as well. Integral field spectrographs can measure (i) and (ii) simultaneously; and
• Polarimeters, for (iii) plus (i), and spectropolarimeters, for (iii) plus (ii).

In principle, energy-resolving detectors, such as arrays of superconducting tunnel-junction devices (STJs), can replace (i), (ii), and (iv). These must interface to the telescope, which provides a focal surface with an image scale that is approximately constant over the field. Currently, almost all telescopes are based on the Cassegrain or Ritchey–Chrétien reflecting configurations, comprising a large curved primary mirror (M1), which defines the collecting aperture (diameter D_T), and a curved secondary mirror (M2). This feeds the Cassegrain focus via a hole in M1 in line with the central obstruction caused by M2, or the Nasmyth focus if a tertiary (usually plane) mirror is added to direct light from M2 along the elevation axis to a location in which the gravity vector changes only in rotation about the optical axis. For the Cassegrain focus, the full range of change in the gravity vector is experienced as the telescope tracks objects during long exposures or slews between targets.


Table 1  Optical interfaces to representative types of telescope and achievable angular resolution

Aperture        Primary focus                    Final focus                        Diffraction limit 2.44λ/D_T (mas)
D_T (m)         F/#     Scale (µm/″)  FOV (′)    F/#      Scale (µm/″)  FOV (′)     λ = 0.5 µm     λ = 5.0 µm
2.4 (i)         2.3     –             –          24       279           25          101            1007
4.2 (ii)        2.8     60            40         11       221           15          60             600
8.0 (iii)       1.8     (70)          (45)       16       621           10          32             315
30 (iv)         1–1.5   ~180          ~20        15–19    ~2500         ~20         8.4            84
100 (v)         ~1.4    –             –          ~6.5     ~3000         ~10         2.5            25

Obtainable angular resolution for terrestrial telescopes (mas):
  Free-atmosphere seeing at good site: best 250 (λ = 0.5 µm), 160 (λ = 5.0 µm); median 420 (λ = 0.5 µm), 270 (λ = 5.0 µm)
  For enclosed telescopes: 500–1500 (all wavelengths)
  With adaptive optics (approx.): 50–200 (λ = 0.5 µm); diffraction limitᵃ (λ = 5.0 µm)

ᵃ The diffraction limit listed in the column above for λ = 5.0 µm applies. Exemplars: (i) Hubble Space Telescope; (ii) Herschel Telescope, La Palma; (iii) Gemini telescopes, Hawaii/Chile; (iv) CELT/GSMT (USA); (v) OWL (Europe).

It is also possible to dispense with M2 and mount the instrument directly at the focus of M1 (prime focus), although this imposes severe restrictions on the mass and size of the instrument. An alternative mounting for instruments is via optical fibers. The input, with any fore-optics required to couple the telescope beam into the fiber, is mounted at the appropriate focus, while the output, possibly several tens of meters distant, is mounted where the instrument is not subjected to variations in the gravity vector. The useful field of view (FOV) of the telescope is determined by how rapidly the image quality degrades with radius in the field; by restrictions on the size of the Cassegrain hole; and by the space set aside for the star trackers required for accurate guiding and for auxiliary optics involved in target acquisition, AO, etc. Typical telescope parameters are shown in Table 1, of which the last two entries are speculative examples of ELTs. The table also lists the diffraction limit of the telescopes and the limits imposed by seeing, with and without the effects of the telescope structure and enclosure, and approximate limits for performance with AO, which approaches the diffraction limit in the infrared.

Instrument Requirements

Efficiency

Almost all astronomical observations are of very faint objects or require a very high signal-to-noise ratio (SNR) to extract the desired subtle information. The former, for example, might reveal the initial phase of galaxy or star formation, while the latter could reveal the mix of elements in the big bang or uncover changes in fundamental constants on cosmological timescales (~10¹⁰ years). The term 'efficiency' has several components:

(i) Throughput of the optical system, including the efficiency of the optical components (lenses or mirrors), dispersers, filters, and detectors; this also includes losses due to transmission/reflection, diffraction, scattering, and vignetting by internal obstructions;
(ii) Noise from all sources: the detector, including time-independent (read-noise) and time-dependent (dark-current) components; thermal emission from the instrument, telescope, enclosure, and sky; straylight (see above); and electronic interference;
(iii) Efficient use of the instrument and telescope, to maximize the fraction of the time spent accumulating useful photons instead of, say, calibrating, reconfiguring the instrument, or acquiring targets;
(iv) Parallelizing the observations, so that the same instrument records data from many objects at the same time. This is discussed further below.

Angular Resolution

The required angular resolution is summarized in Table 2 for key features in galaxies. The resolutions currently achievable, with AO systems (with severe restrictions on sky coverage pending routine availability of laser beacons) or without, are shown in bold. Finer resolution is possible using interferometric techniques, but is limited to a handful of bright stars.
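As a cross-check on the diffraction-limit columns of Table 1, the following short sketch evaluates 2.44λ/D_T in milliarcseconds; the telescope diameter and wavelengths are examples taken from that table.

```python
import numpy as np

def diffraction_limit_mas(wavelength_um, D_m):
    """Angular diffraction limit 2.44*lambda/D_T, converted to milliarcsec."""
    theta_rad = 2.44 * wavelength_um * 1e-6 / D_m
    return theta_rad * 180 / np.pi * 3600 * 1e3

# An 8 m telescope at 0.5 um and 5.0 um (compare the Table 1 entries of
# roughly 32 and 315 mas)
print(diffraction_limit_mas(0.5, 8.0), diffraction_limit_mas(5.0, 8.0))
```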


Table 2  Typical requirements for angular resolution and single-object field of view

Target                      Distanceᵃ   Star cluster   Active nucleus:           Active nucleus:            Visible extent
                                        (10 pc)        central engine (10 AU)    dynamical core (0.5 pc)
Milky Way                   8 kpc       4′             1 mas                     10″                        100″
Local group galaxy (M31)    800 kpc     3″             0.01 mas                  100 mas                    –
Distant galaxy              >1 Gpc      2 mas          10⁻⁵ mas                  0.1 mas                    0.5–2″
Cluster of galaxies         >1 Gpc      –              –                         –                          3′
Cosmological structure      >1 Gpc      –              –                         –                          ~1°

ᵃ 1 pc = 3.1 × 10¹⁶ m.

Field of View and Multiplexing

The FOV of an individual observation must be sufficient to cover the angular extent of the object with the minimum number of separate pointings of the telescope. As well as minimizing the total observation time, this reduces photometric errors caused by changes in the ambient conditions (e.g., flexure, thermal effects, and changes in the sky background). (Photometry is the measurement of the flux density received from an astronomical object, usually done with the aid of an imager employing filters with standardized broad passbands. Spectrophotometry is the same, but in the much narrower passbands obtained by dispersing the light.) The last column of Table 2 indicates the required size of FOV. It is also highly desirable to observe many objects simultaneously, since most studies require large, statistically homogeneous samples. The precise requirement depends on the surface number density of potential targets and the sampling strategy adopted to create the sample. Cosmological studies of the large-scale distribution of galaxies and clusters emphasize the size of the field (≥0.5°), since the targets may be sparsely sampled, while studies of clustered objects (galaxies or stars) aim to maximize the surface density of addressable targets (≤1000 arcmin⁻²) within a modest FOV (~5′).

Spectral Resolution

The required energy resolution is specified in terms of the spectral resolving power, R ≡ λ/δλ, where δλ is the resolution in wavelength, defined usually in terms of the Rayleigh criterion. One of the main data products required in astrophysics is the velocity along the line of sight (radial velocity), obtained through measurement of the wavelength of a known spectral feature (emission or absorption line). This may be determined to an accuracy of:

δv = Qc/R    [1]

where Q is the fraction of the width of the spectral resolution element, δλ, to which the centroid can be determined. This is limited either by the SNR of the observation of the line, via Q ≈ 1/SNR, or by the stability of the spectrograph, via:

Q = (Δx/χ)(dχ/dx)    [2]

where χ is the angular width of the slit measured on the sky and dχ/dx is the image scale at the detector. Δx is the amount of uncorrectable flexure, in units of the detector pixel size. This is the part of the motion of the centroid of the line which is not susceptible to modeling with the aid of frequent wavelength calibration exposures. Typically, Δx ≈ 0.2 pixels is achievable (roughly 3 µm at the detector, or 10 µm at the slit, for an 8 m telescope), implying Q ≈ 1/30 for a typical dχ/dx = 0.1–0.2″/pixel and χ = 0.5–1″. Table 3 gives typical requirements for R and the value of Q required to achieve them, given the spectral resolution obtainable for different types of current spectrograph. Extrasolar planet studies require very high stability. Some spectral observations require very accurate measurements of the shape of spectral features, to infer the kinematics of ensembles of objects which are individually not resolvable (e.g., stars in distant galaxies, gas in black-hole accretion disks). Other applications require accurate measurements of the relative flux of various spectral features within the same spectrum (e.g., to reveal abundances of elements or different types of star).
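A short numerical check of eqn [1], combining the Q ≈ 1/30 stability estimate above with an assumed resolving power of R = 3000, illustrates the achievable radial-velocity accuracy:

```python
c = 2.998e5           # speed of light (km/s)
R = 3000              # assumed spectral resolving power
Q = 1 / 30            # centroiding fraction from the stability estimate above

dv = Q * c / R        # eqn [1]
print(f"radial-velocity accuracy: {dv:.1f} km/s")   # ~3.3 km/s
```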

Optical Principles

Imaging

The simplest type of imager consists of a detector comprising a 2D array of light-sensitive pixels (Table 4), placed at a telescope focal surface without any intervening optics except for the filters required for photometry. Although this maximizes throughput, the arrangement removes flexibility in changing the image scale, since the physical size of the pixels (typically 15–25 µm) is fixed.
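The plate-scale arithmetic behind such numbers is simple; the sketch below converts a telescope focal length into an image scale and a per-pixel angle. The f/16, 8 m, and 15 µm values are illustrative, anticipating the Gemini example resumed below the tables.

```python
def plate_scale_um_per_arcsec(D_m, F_ratio):
    """Image scale at a telescope focus: focal length converted to um/arcsec."""
    focal_length_m = D_m * F_ratio
    return focal_length_m * 1e6 / 206265.0   # 206265 arcsec per radian

scale = plate_scale_um_per_arcsec(8.0, 16)   # Gemini-like 8 m, f/16
print(scale, 15 / scale)   # ~620 um/arcsec; a 15 um pixel subtends ~0.024 arcsec
```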


Table 3  Typical requirements for spectral resolution and stability

Target                      δv or required data product   Required λ/δλ   Spectral regime   Available R   Stability required = 1/Q   Simultaneous wavelength range
Cosmological parameters:
  galaxies                  1000 km/s (line centroid)     300             Low dispersion    300–3000      1                           ≳1 octave
  intra-cluster             100 km/s (line centroid)      3000            Low dispersion    300–3000      1–10                        100 nm
Element abundance           Ratio of line fluxes          1000–3000       Medium            1–3 × 10⁴     (1)                         1 octave
Stellar orbits              10 km/s (line shapes)         3 × 10⁴         Medium            1–3 × 10⁴     10–30                       10 nm
Individual stars            1 km/s (line centroid)        3 × 10⁵         High              ~10⁵          –                           10 nm
Extrasolar planets          ~1 m/s (line centroid)        >3 × 10⁸        High              ~10⁵          >3000                       ~1 nm

Table 4  Detectors in current use for astronomy

Material/type   Wavelength (µm)   Pixel size (µm)   Format, single (typical)   Format, mosaic (state-of-the-art)   Read noise (e⁻/pix)   Dark current (e⁻/pix/hr)   QE range (%)
CCD             0.3–1.0           10–20             4k × 4k                    12k × 8k                            2                     Negligible                 10–90ᵃ
CMOS            0.3–1.0           5–10              4k × 4k                    –                                   5–30                  Negligible                 10–90ᵃ
HgCdTe          1–2.5             15–20             2k × 2k                    4k × 4k                             10                    0.01–0.1                   70–90
InSb            0.6–5             20–30             2k × 2k                    4k × 4k                             10                    0.01–0.1                   70–90

ᵃ Quantum efficiency varies strongly over quoted wavelength range.

Thus, the Gemini telescopes (D_T = 8 m) provide an uncorrected image scale of 25–40 mas/pixel at their sole, F/16, focus. Although well suited to high-resolution observations with AO, this is inappropriate for most observations of faint galaxies, where 100–200 mas/pixel is required. This arrangement also provides limited facilities for defining the wavelength range of the observations, since there is no intermediate focal surface at which to place narrowband filters, whose performance is sensitive to field angle and which are, therefore, best placed at an image conjugate. There is also no image of the telescope pupil at which to place a cold stop to reject the external and instrumental thermal radiation which is the dominant noise source at λ > 1.5 µm.

The alternative is to use a focal reducer, consisting of a collimator and camera. The auxiliary components (filters, cold stops, Fabry–Perot etalons, etc.) then have an additional image and pupil conjugate available to them, and the image scale may be chosen by varying the ratio of the camera and collimator focal lengths. The extra complexity of optics inevitably reduces throughput but, through the use of modern optical materials, coatings such as Sol–Gel, and the avoidance of internally obstructed reflective optics, losses may be restricted to 10–20%, except at wavelengths shorter than ~0.35 µm. An imaging focal reducer is a special case of the generic spectrograph described in the next section.

Spectroscopy

Basic principles

Astronomical spectrographs generally employ plane diffraction gratings. The interference between adjacent ray paths as they propagate through the medium is illustrated in Figure 1. From this, we obtain the grating equation:

mρλ = n₁ sin α + n₂ sin β ≡ G    [3]

where α and β are the angles of incidence and diffraction, respectively, ρ = 1/a is the ruling density, λ is the wavelength, and m is the spectral order. The refractive indices of the media in which the incident and diffracted rays propagate are n₁ and n₂, respectively. The layout of a generic spectrograph employing a plane diffraction grating in a focal-reducer arrangement is shown in Figure 2. The spectrograph re-images the slit onto the detector via a collimator and camera. The disperser is placed in the collimated beam, close to a conjugate of the telescope pupil.
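As a worked illustration of eqn [3], this sketch solves the grating equation for the diffraction angle β; the grating density, order, and incidence angle are arbitrary example values, with both media taken to be vacuum.

```python
import numpy as np

def diffracted_angle(wavelength_um, alpha_deg, rho_per_mm=600, m=1,
                     n1=1.0, n2=1.0):
    """Solve eqn [3], m*rho*lambda = n1 sin(alpha) + n2 sin(beta), for beta."""
    G = m * (rho_per_mm * 1e3) * (wavelength_um * 1e-6)  # dimensionless
    sin_b = (G - n1 * np.sin(np.radians(alpha_deg))) / n2
    return np.degrees(np.arcsin(sin_b))

# e.g. a 600 lines/mm grating in first order at 0.5 um, 30 deg incidence
print(diffracted_angle(0.5, 30.0))   # about -11.5 deg
```

With these example numbers the diffracted angle comes out negative, consistent with the sign convention noted in the Figure 2 caption below.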


A field lens placed near the focus is often incorporated in the collimator for this purpose. The geometrical factor, G, defined in the grating equation is constrained by the basic angle of the spectrograph:

ψ = α − β    [4]

which is the (normally fixed) angle between the optical axes of the collimator and camera. ψ = nπ (for integer n) corresponds to the Littrow configuration.

Considering the case of a diffraction grating in vacuo, differentiation of the grating equation with respect to the diffracted angle yields the angular dispersion:

dλ/dβ = cos β/(mρ)    [5]

from which the linear dispersion is obtained by considering the projection of dβ on the detector, dx:

dλ/dx = (dλ/dβ)(dβ/dx) = cos β/(mρf₂)    [6]

where f₂ is the focal length of the camera (see Figure 2). In the diffraction-limited case, where the slit is arbitrarily narrow, the resolving power is given simply by the total number of rulings multiplied by the spectral order:

R* = mρW    [7]

where W is the length of the disperser as defined below. But in astronomy it is usually determined by the width of the image of the slit, s, projected on the detector, s′. By conservation of étendue:

s′ = s(F₂/F₁)    [8]

where F₁ and F₂ are the collimator and camera focal ratios, respectively.

Figure 1  Derivation of the grating equation by consideration of the optical path difference between neighboring ray paths AB and A′B′.

Figure 2  Generic spectrograph employing a plane diffraction grating. Note that according to the sign convention implicit in Figure 1, the diffraction angle, β, is negative in this configuration.


The spectral resolution of the spectrograph is simply the width of this expressed in wavelength units:

δλ = s′ cos β/(mρf₂) = (F₂/F₁) s cos β/(mρf₂) = sD₁ cos β/(mρD₂f₁)    [9]

The length of the intersection between the collimated beam and the plane of the grating (not necessarily the actual physical length of the grating) is

W = D₂/cos β = D₁/cos α    [10]

Using this to eliminate cos β, the spectral resolution becomes

δλ = s/(mρF₁W)    [11]

and the resolving power is

R ≡ λ/δλ = mρλF₁W/s = GF₁W/s    [12]

where G is the quantity defined in eqn [3]. Note that R is now independent of the details of the camera. This expression is useful for the laboratory, since it is expressed in terms of the parameters of the experimental apparatus; but for astronomy it is more useful to express the resolving power in terms of the angular slit width, χ, and the telescope aperture diameter, D_T, via:

s = χf_T = χF_T D_T = χF₁D_T    [13]

since the collimator and telescope focal ratios are the same if the spectrograph is directly beam-fed from the telescope. Note that even if the telescope focal surface is re-imaged onto the slit, the expression for the resolving power still holds, due to the conservation of étendue in the re-imaging optics. Thus the resolving power is

R = mρλW/(χD_T) = GW/(χD_T)    [14]

Note that the resolving power obtained with a nonzero slit width is always less than the theoretical maximum, R ≤ R*, for wavelengths:

λ < λ* = χD_T    [15]

Thus, long-wavelength applications may approach the theoretical limit, in which case they are said to be diffraction-limited. If so, the resolving power is independent of the slit width, simplifying the interface to different telescopes. For a nondiffraction-limited spectrograph, the resolving power obtained depends on the aperture of the telescope to which it is fitted.
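Equation [14] is easy to evaluate numerically. The sketch below does so for an illustrative set of parameters loosely resembling a spectrograph on an 8 m telescope; all values are assumptions for the example, not the specification of any particular instrument.

```python
import numpy as np

def resolving_power(rho_per_mm, m, wavelength_um, W_mm, chi_arcsec, D_T_m):
    """R = m*rho*lambda*W / (chi * D_T), eqn [14], evaluated in SI units."""
    chi_rad = chi_arcsec * np.pi / (180 * 3600)
    return (m * rho_per_mm * 1e3 * wavelength_um * 1e-6 * W_mm * 1e-3
            / (chi_rad * D_T_m))

# 600 lines/mm, first order, 0.5 um, illuminated grating length W = 160 mm,
# 0.5 arcsec slit, 8 m aperture
print(f"R = {resolving_power(600, 1, 0.5, 160, 0.5, 8.0):.0f}")   # ~2500
```

The result, R of order a few thousand with a 0.5″ slit, is in line with the capabilities quoted for existing 8 m telescope spectrographs such as GMOS (Table 5).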

Optical configuration

The non-Littrow configuration shown in Figure 2 introduces anamorphism into the recorded spectra, such that a monochromatic image of the slit mask is compressed in the dispersion direction by a factor:

A = D₂/D₁ = cos β/cos α    [16]

which varies depending on the choice of angles required to place the chosen wavelength on the detector using eqns [3] and [4]. Practical configurations that maximize R do so by increasing W = D₁/cos α. This implies maximizing cos β, since ψ = α − β, which results in A > 1. This is known as the grating-to-camera configuration, since the grating normal points more towards the camera than towards the collimator. The grating-to-collimator configuration may also be used, at the penalty of lower R, but does permit a smaller camera. In the grating-to-camera configuration the camera aperture must be oversized in the dispersion direction, since D₂ > D₁. The Littrow configuration is rarely used with reflective designs when a large FOV is required, because of internal vignetting and consequent light loss. However, it can be used effectively with transmissive dispersers.

If the disperser is replaced by a plane mirror (or if the disperser is removed, in a transmissive system), the system acts as a versatile imager, with A = 1. This also facilitates target acquisition, which requires that the telescope attitude be adjusted until the targets are aligned with apertures cut in an opaque mask placed at the input of the spectrograph. These apertures (normally slits) are usually oversized in the direction perpendicular to the dispersion, to allow the spectrum of the sky to be recorded directly adjacent to that of the target and thus permit accurate subtraction of the background. The slit width is chosen by trading off the desire to maximize the amount of light entering the instrument from the target (but not so much as to admit excessive background light) against the need to maximize the spectral resolving power, R, by reducing the slit width, χ, according to eqn [14].

Since |G| ≤ 2, the maximum attainable resolving power, according to eqn [14], is determined chiefly by the telescope aperture and the angular slit width. The only important parameter of the spectrograph is the illuminated grating length, W. Therefore, maintaining the same resolving power as the telescope aperture increases requires W to increase in direct proportion. Because of limits placed on the geometry by the need to avoid obstructions, etc., this means that both the collimated beam diameter and the size of the disperser must increase. This is one of the major problems in devising instruments for ELTs. Current instruments on 8 m telescopes feature collimated beam diameters of D₁ ≈ 150 mm. For ELTs, the implied sizes are 0.5–2 m, requiring huge optics, accurately co-aligned mosaics of dispersing elements, and enormous support structures to provide the required stability. A simple calculation indicates that the ruling stylus for a classically ruled diffraction grating would have to travel a thousand kilometers without significant wear, unless the disperser were made from smaller pieces.

Figure 3  Blazing of a ruled diffraction grating.

Choice of dispersing element

Astronomical spectrographs mostly use surface-relief (SR) diffraction gratings. These are ruled with groove densities 50 < ρ < 3000 mm⁻¹.

Blazing diffraction gratings. The intensity of the multibeam interference pattern from N rulings of finite width is given by

I ∝ (sin²(Nφ)/sin²φ)(sin²u/u²)    [17]

where 2φ is the phase difference between the centers of adjacent rulings and u is the phase difference between the center and edge of a single ruling. To be useful for astronomy, the peak in this pattern, determined by the second term, must coincide with a useful order, such as m = 1, rather than with zero order. This can be done by introducing a phase shift into the second term, which represents the diffraction of a single ruling, such that:

u = (π cos γ/ρλ)(sin i − sin r)    [18]

where the groove angle, γ, is the angle by which the facets are tilted with respect to the plane of the grating. With the aid of Figure 3, it can be seen that i = α − γ and r = γ − β, so that the maximum efficiency, which occurs when u = 0, corresponds to i = r, which is equivalent to simple reflection from the facets, and α + β = 2γ. Thus, through this process of blazing, the grating equation (eqn [3]), at the blaze wavelength, λ_B, becomes:

ρmλ_B = 2 sin γ cos(ψ/2)    [19]

where ψ is given by eqn [4], using the identity sin x + sin y ≡ 2 sin[(x + y)/2] cos[(x − y)/2]. Thus, the resolving power at blaze is obtained from eqns [10], [14], and [19] as:

R_B = 2D₁ sin γ cos(ψ/2)/[χD_T cos(γ + ψ/2)]    [20]

Use of grisms. An alternative configuration uses transmission gratings. The advantage here is that the collimator and camera can share the same optical axis in a straight-through configuration with unit anamorphism, A = 1, which does not require oversizing of the camera. The same phase shift can be applied as before, by making the facets into prisms with the required groove angle. However, this alone would only allow zero order to propagate undeviated into the camera, so an additional prism is required, with vertex angle φ = γ, to allow first-order light to pass undeviated into the camera. This composite of a blazed transmission grating and a prism is called a grism (Figure 4). The grating equation (eqn [3]) is modified for a grism in the blaze condition to mρλ_B = (n − 1)sin φ, since α = −β = φ and n₁ ≡ n, n₂ = 1. Noting that ψ = 0, the resolving power at blaze is then:

R_B = D₁(n − 1)tan φ/(χD_T)    [21]

where n is the refractive index of the prism and grating material. Due to problems such as groove shadowing (illustrated for reflection gratings in Figure 3), grisms lack efficiency when φ ≳ 30°, restricting their use to low-resolution spectroscopy.
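For comparison, the following sketch evaluates the blaze resolving powers of eqns [20] and [21] side by side; the beam diameter, angles, and refractive index are illustrative assumptions rather than the parameters of any real instrument.

```python
import numpy as np

ARCSEC = np.pi / (180 * 3600)   # radians per arcsecond

def R_blaze_grating(D1_mm, gamma_deg, psi_deg, chi_arcsec, D_T_m):
    """Eqn [20]: blaze resolving power of a reflection grating."""
    g, p = np.radians(gamma_deg), np.radians(psi_deg)
    return (2 * D1_mm * 1e-3 * np.sin(g) * np.cos(p / 2)
            / (chi_arcsec * ARCSEC * D_T_m * np.cos(g + p / 2)))

def R_blaze_grism(D1_mm, n, phi_deg, chi_arcsec, D_T_m):
    """Eqn [21]: blaze resolving power of a grism."""
    return (D1_mm * 1e-3 * (n - 1) * np.tan(np.radians(phi_deg))
            / (chi_arcsec * ARCSEC * D_T_m))

# illustrative comparison: 150 mm beam, 0.5 arcsec slit, 8 m telescope
print(R_blaze_grating(150, 17.5, 40, 0.5, 8.0))   # ~5500
print(R_blaze_grism(150, 1.5, 25, 0.5, 8.0))      # ~1800
```

The grism value coming out much lower reflects the restriction to low-resolution work noted above.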


Use of volume phase holographic gratings. A newer alternative to surface-relief (SR) dispersers is the volume phase holographic (VPH) grating, in which the interference condition is provided by a variation in the refractive index of a material such as dichromated gelatine. The index, n_g, depends harmonically on position inside the grating material as:

n_g(x, z) = n̄_g + Δn_g cos[2πρ_g(x sin γ + z cos γ)]    [22]

where ρ_g is the density of the lines of constant n_g, and n̄_g and Δn_g are the mean refractive index and its modulation amplitude, respectively (Figure 5). These gratings are described by the same grating equation (eqn [3]) as before, leading to an expression for the resolving power at blaze, where mρλ_B = 2 sin α, identical to that of a blazed plane diffraction grating used in the Littrow configuration (cf. eqn [20]):

R_B = 2D₁ tan α/(χD_T)    [23]

Unlike SR gratings, the blaze condition of VPH gratings may be varied by changing the basic angle of the spectrograph, ψ = 2(γ + α), although this is often mechanically inconvenient. They may also be sandwiched between prisms to provide the necessary incident and diffraction angles while retaining the advantage of a straight-through configuration, to form an analogue of a grism, sometimes termed a vrism. Unlike grisms, these are not constrained by groove shadowing and so can be efficient at relatively high dispersion, subject to the constraint imposed by the size of the device.
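The tunability of the VPH blaze condition can be illustrated numerically: for a fixed fringe density, retuning the incidence angle α shifts the blaze wavelength through mρλ_B = 2 sin α. The line density below is an arbitrary example value.

```python
import numpy as np

# VPH blaze tuning: lambda_B = 2 sin(alpha) / (m * rho), with rho converted
# from lines/mm to lines/um so the wavelength comes out in micrometers.
rho_per_mm, m = 600.0, 1
for alpha_deg in (10, 15, 20, 25):
    lam_B_um = 2 * np.sin(np.radians(alpha_deg)) / (m * rho_per_mm * 1e-3)
    print(alpha_deg, round(lam_B_um, 2))   # 0.58, 0.86, 1.14, 1.41 um
```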

Figure 4  Typical configuration of a grism. The prism vertex and groove angles are chosen for maximum efficiency on the assumption that the refractive indices n_G = n_R = n and n′ = 1. The blaze condition occurs when δ = 0. In this configuration, φ = γ = α.

Use of echelle gratings. Another option for increasing the resolving power is to use a coarse grating in high order, via the use of an echelle format. The resolving power in the near-Littrow configuration in which it is usually operated is given by eqn [20] with ψ = 0:

R_B = 2D₁ tan γ/(χD_T)    [24]

where γ is the groove angle.

Figure 5  Various configurations of volume phase holographic gratings. The blaze condition is when β = α. The inset shows how the blaze condition may be altered by changing the basic angle of the spectrograph.


Figure 6  (a) Basic layout of an echelle grating used in a near-Littrow configuration. (b) Illustrative cross-dispersed spectrogram showing a simplified layout on the detector. The numbers 10–16 label the different spectral orders. The numbers labeling the vertical axis are the wavelengths (nm) at the lowest end of each complete order. Other wavelengths are labeled for clarity. For simplicity the orders are shown evenly spaced in cross-dispersion.

In the configuration shown in Figure 6, R is maximized by increasing W. This also means that G = mρλ is large. In order to avoid excessively high ruling densities, the grating may be used in high orders. However, this presents the problem of order overlap, since the wavelength λ_m in order m falls in the same place as the wavelength λ_n = (m/n)λ_m in order n. Limiting the observation to a single order, via the use of order-sorting filters, is one solution; another is to use cross-dispersion, via a chain of prisms or a grating with dispersion perpendicular to that of the echelle, to separate the orders. This option severely curtails the use of such spectral formats in multiobject spectroscopy.

Increasing resolving power using immersed gratings. All the methods of using diffraction gratings discussed so far are subject to various geometric constraints, which ultimately limit the maximum obtainable resolving power by placing limits on W (e.g., in eqn [14]). These may be partly overcome by optically coupling the dispersers to immersing prisms. This can lead to a doubling of the resolving power in some cases.

Use of prisms. The final method of dispersion is to use a prism. This has the advantage that, since it does not work by multibeam interference, the light is not split into separate orders, which removes the problem of order overlap and improves efficiency. However, it does not produce the high spectral resolution required for some applications, and the dependence of the dispersion on λ is often markedly nonlinear.

From a consideration of Fermat's principle, it can be shown that the resolving power of a prism is:

R = (λt/χD_T)(dn/dλ)    [25]

where t is the baselength of the prism. Using modern materials, R & 300 may be obtained. The problem of the nonlinearity in dispersion can be alleviated by the use of composite materials with opposite signs of dispersion. Multiobject spectroscopy (MOS) As discussed above, there is great value in increasing the multiplex gain. This can be done in two ways: (i) Increasing the number of apertures in the slit mask (Multislits). This requires that each slit is carefully cut in the mask because, although the telescope attitude (translation and rotation) may be adjusted to suit one aperture, errors in the relative positions of slits cannot be compensated since the image scale is fixed. Each aperture produces a spectrum on the detector and the mask designer must ensure that the targets and their matching slits are chosen to avoid overlaps between spectra and orders. This has the effect of limiting the surface density of targets which can be addressed in one observation. Passband filters can be used to limit the wavelength range and reduce the overlap problem. This option requires superior optics able to accommodate both a wide FOV and large disperser. (ii) Using optical fibers (Multifibers). A number of fibers may be deployed at the telescope focus to

290 INSTRUMENTATION / Astronomical Instrumentation

direct light from the various targets to a remote spectrograph whose input focus consists of one or more continuous pseudoslits. These are made up of the fiber outputs arranged in a line. The continuous nature of the slit means that spectral overlap is avoided without restricting the surface density of addressable targets; although there is a limit imposed by the distance of closest approach of the fibers. The method of deploying the fiber inputs may be a plugplate consisting of pre-cut holes into which encapsulated fibers are manually plugged, or a pick-and-place robot which serially positions the fiber inputs at the correct location on a magnetized field plate. The latter is highly versatile but mechanically complex with a significant configuration time that may erode the actual on-target integration time. The two systems have contrasting capabilities (Figure 7). The multislit approach provides generally better SNR since the light feeds into the spectrograph directly, but is compromised in terms of addressable surface density of targets, by the need to avoid spectral overlap and in FOV by the difficulty of the wide-field optics. A current example of this type of instrument, GMOS, is shown in Figure 8 and described in Table 5. The multifiber approach is limited in SNR by modal noise in the fibers and

attendant calibration uncertainties, and by the lack of contiguous estimates of the sky background. However, it is easier to adapt to large fields, since the spectrograph field can be much smaller than the field over which the fiber inputs are distributed. In summary, multislit systems are best for narrow-but-deep surveys, while multifiber systems excel at wide-but-shallow surveys. Fiber-fed instruments may further prove their worth in ELTs, where technical problems may require that these bulky instruments are mounted off the telescope (see below for further discussion).

Of paramount importance in MOS is the quality of the background subtraction, as discussed above. Traditionally, this requires slits which sample the sky background directly adjacent to the object. An alternative is nod-and-shuffle (va-et-vient), in which nearby blank regions are observed alternately with the main field using the same slit mask by moving the telescope (the nod). In the case of CCDs, the photogenerated charge from the interleaved exposures is stored temporarily on the detector by moving it to an unilluminated portion of the device (the shuffle). After many repeats on a timescale less than that of variations in the sky background (a few minutes), the accumulated charge is read out, incurring the read-noise penalty only once. Although requiring an effective doubling of the exposure time and an

Figure 7 Illustration of the difference between the multislit and multifiber approaches to multi-object spectroscopy.


[Figure 8 component labels: Gemini instrument support structure; fore-optic support structure; on-instrument wavefront sensor; collimator; filter wheels; IFU/mask cassettes; grating turret and indexer unit; camera; main optical support structure; CCD unit; Dewar shutter. GMOS is shown without its enclosure and electronics cabinets.]

Figure 8 The Gemini Multiobject Spectrograph (GMOS), one of two built for the two Gemini 8 m telescopes by a UK–Canadian consortium. It includes an integral field capability provided by a fiber-lenslet module built by the University of Durham. The optical configuration is the same as shown in Figure 2. It is shown mounted on an instrument support unit which includes a mirror to direct light from the telescope into different instruments. The light path is shown by the red dashed line. The slit masks are cut by a Nd:YAG laser in 3-ply carbon fiber sheets. See Table 5 for the specification.

Table 5 Main characteristics of the Gemini multiobject spectrographs

Image scale: 72 mas/pixel
Detector: CCD with 13.5 μm pixels; format 3 × (4608 × 2048)
Wavelength range: total 0.4–1.0 μm; simultaneous ≤ 1 octave
Spectral resolving power: R ≤ 5000 with a 0.5″ slit
FOV: 5.5′ × 5.5′
Slit/mask configuration: ≤ a few × 100 slits of width ≥ 0.2″
Integral field unit: 1500 × 0.2″ samples over 50 arcsec²
Dispersion options: 3 gratings + mirror for imaging

increase in the size of the detector, this technique allows the length of the slits to be reduced since no contiguous sky sample is required, thereby greatly increasing the attainable multiplex gain if the number density of potential targets is sufficiently high.

Integral field spectroscopy (IFS)

IFS provides a spectrum of each spatial sample simultaneously within a contiguous FOV. Other approaches provide the same combination of imaging and spectroscopy but require a series of nonsimultaneous observations. Examples include imaging through a number of narrow passband filters with different effective wavelengths, and spectroscopy with a single slit which is stepped across the object.


Other such techniques are Fabry–Perot, Fourier-transform, and Hadamard spectroscopy. Since the temporal variation of the sky background is a major limitation in the astronomy of faint objects, IFS is the preferred technique for ground-based observations of faint objects; nonsimultaneous techniques are preferable in certain niche areas, and are relatively more important in space, where the sky background is reduced. The main techniques (Figure 9) are as follows:

(i) Lenslets. The field is subdivided by placing an array of lenslets at the telescope focus (or its conjugate following re-imaging fore-optics) to form a corresponding array of micropupils. These are re-imaged on the detector by a focal


Figure 9 Main techniques of integral field spectroscopy.

reducer using a conventional disperser. The spectra are, therefore, arranged in the same pattern as that of the lenslet array. Spectral overlap is reduced by angling the dispersion direction away from the symmetry axes of the lenslet array, and by the fact that the pupil images are smaller than the aperture of the corresponding lenslet.

(ii) Fibers + lenslets. The field is subdivided as in (i), but the pupil images are relayed to a remote spectrograph using optical fibers which reformat the field into linear pseudoslits. This avoids the problem of overlap, allowing a greater length of spectrum than in (i), but is more complex, since the arrays of fibers and lenslets must be precisely co-aligned, and is subject to modal noise in the fibers.

(iii) Image slicer. The field is subdivided in only one dimension by a stack of thin slicing mirrors placed at a conjugate of the telescope focus (Figure 10). Each mirror is angled so as to direct light to its own pupil mirror, which re-images the field to form part of a linear pseudoslit. Thus the slices into which the image is divided are rearranged end-to-end to form a continuous slit at the spectrograph input. Unlike the other techniques, this retains spatial information along the length of each slice. This is dispersed by conventional means via a focal reducer. An additional optic is often required at the image of each slice at the slit, to re-image the micropupil images produced by the slicing mirrors onto a common pupil inside the spectrograph. In principle this is the most efficient of the three

methods, since the one-dimensional slicing produces fewer diffraction losses in the spectrograph than the two-dimensional division used by the others, and minimizes the fraction of the detector surface which must be left unilluminated in order to avoid cross-talk between noncontiguous parts of the field. However, the micromirrors are difficult to make, since they require diamond-turning or grinding in metal or glass with a very fine surface finish (typically RMS < 1 nm for the optical and < 10 nm for the infrared). This sort of system is, however, well matched to cold temperatures, since the optical surfaces and mounts may be fabricated from the same material (e.g., Al) or from materials with similar thermal properties.

The data are processed into a datacube whose dimensions are given by the two spatial coordinates, x, y, plus wavelength. The datacube may be sliced up in ways analogous to tomography to understand the physical processes operating in the object.

Spectropolarimetry and Polarimetry

Spectropolarimetry and polarimetry are analogous to spectroscopy and imaging, but the polarization state of the light is measured instead of the total intensity. Like IFS, spectropolarimetry is a photon-starved (i.e., limited in SNR by photon noise) area which benefits from the large aperture of current and projected telescopes. The Stokes parameters of interest are I, Q, and U; V is generally very small for astronomical objects.


Figure 10 Principle of the advanced image slicer. Only three slices are shown here; real devices have many more, e.g., the IFU for the Gemini Near-Infrared Spectrograph has 21, to provide a 5″ × 3″ field with 0.15″ × 0.15″ sampling. Reproduced from Content R (1997) A new design for integral field spectroscopy with 8 m telescopes. Proceedings SPIE 2871: 1295–1305.

From these, the degree and azimuthal angle of linear polarization are obtained as:

p_L = [(Q² + U²)/I²]^(1/2),   θ = (1/2) arctan(U/Q)   [26]

The Stokes parameters may be estimated using a modified spectrograph with a rotatable achromatic half-wave retarder, characterized by its angle of rotation about the optical axis, θ, placed before the instrument, and a polarizing beamsplitter which separates the incident beam into orthogonal polarization states (Figure 11). The two states are recorded simultaneously on different regions of the detector. The separation is usually achieved through the angular divergence produced by a Wollaston prism placed before the disperser, or through the linear offset produced by a calcite block placed before the slit. The intensity as a function of the waveplate rotation angle, S(θ), is recorded for each wavelength and position in the spectrum or image. The Stokes

parameters for each sample are then given by:

Q/I = [S(0) − S(π/2)]/[S(0) + S(π/2)]
    = ⟨(A1 − v_AB B1)/(A1 + v_AB B1), (A2 − v_AB B2)/(A2 + v_AB B2)⟩

U/I = [S(π/4) − S(3π/4)]/[S(π/4) + S(3π/4)]
    = ⟨(C1 − v_CD D1)/(C1 + v_CD D1), (C2 − v_CD D2)/(C2 + v_CD D2)⟩   [27]

I = S(0) + S(π/2) = ⟨(A1 + v_AB B1)/h_A1, (A2 + v_AB B2)/h_A2⟩, etc.

where A and B are the signals (functions of position and/or λ) recorded with the waveplate at θ = 0 and θ = π/2, and the subscripts 1 and 2 refer to the two polarization states produced by the beamsplitter.


Figure 11 Schematic layout of a spectropolarimeter using a polarizing beamsplitter in the collimated beam (e.g., a Wollaston prism producing e- and o-states) to produce separate spectra for orthogonal polarization states on the detector. Alternative configurations in which the two polarizations have a linear offset (e.g., from a calcite block) may be used, in which case the beamsplitter is placed before the slit.

Likewise, C and D are a pair of observations taken with θ = π/4 and θ = 3π/4. The angled brackets indicate a suitable method of averaging the quantities inside. The factors v_AB and v_CD are estimated as:

v_AB = √(A1 A2 / B1 B2),   v_CD = √(C1 C2 / D1 D2)   [28]

and h_A1 is a single calibration factor for polarization state 1 of observation A, etc.

One problem is instrumental polarization, caused by oblique reflections from telescope mirrors upstream of the polarizing waveplate, such as the fold mirror external to GMOS (see Figure 8). This must be carefully accounted for by taking observations of stars with accurately known polarization. The efficiency of the dispersing element is also likely to depend strongly on the polarization state of the incident light, with widely different efficiency as a function of wavelength. Although this effect is cancelled out using the procedure described above, it exacts a heavy toll in terms of SNR.
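To make the reduction procedure concrete, the following Python sketch (an illustrative addition, not part of the original article) evaluates eqns [26]–[28] for dual-beam data. The function name stokes_from_dual_beam is hypothetical, and a plain mean is assumed for the averaging denoted by the angled brackets.

import numpy as np

def stokes_from_dual_beam(A1, A2, B1, B2, C1, C2, D1, D2):
    """Normalized Stokes parameters from dual-beam spectropolarimetry.
    A, B: signals with the half-wave plate at 0 and pi/2; C, D: at pi/4
    and 3*pi/4; subscripts 1 and 2 label the two beamsplitter states.
    Inputs may be scalars or arrays (one value per wavelength/position)."""
    v_ab = np.sqrt((A1 * A2) / (B1 * B2))                      # eqn [28]
    v_cd = np.sqrt((C1 * C2) / (D1 * D2))
    # A plain mean stands in for the averaging denoted by the brackets.
    q = 0.5 * ((A1 - v_ab * B1) / (A1 + v_ab * B1) +
               (A2 - v_ab * B2) / (A2 + v_ab * B2))            # Q/I, eqn [27]
    u = 0.5 * ((C1 - v_cd * D1) / (C1 + v_cd * D1) +
               (C2 - v_cd * D2) / (C2 + v_cd * D2))            # U/I, eqn [27]
    p_lin = np.sqrt(q**2 + u**2)                               # eqn [26]
    theta = 0.5 * np.arctan2(u, q)                             # eqn [26]
    return q, u, p_lin, theta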

Technology Issues

Use of Optical Fibers in Astronomy

As discussed above, optical fibers are commonly used in astronomy for coupling spectrographs to wide, sparsely sampled FOVs in the case of MOS, or to small, contiguously sampled FOVs in the case of IFS. For both applications, the most important characteristics are:

(i) Throughput. Near-perfect transmission is required for wavelength intervals of ~1 octave within the overall range of 0.3–5 μm. For MOS, a prime-focus fiber feed to an off-telescope spectrograph implies fiber runs of several tens of meters for 8 m telescopes, increasing proportionally for ELTs. For IFS, runs of only

~1 m are required for self-contained par-focal IFUs. For λ < 2.5 μm, fused silica is a suitable material, although it is not currently possible to go below ~0.35 μm, depending on the fiber length; newer techniques and materials may improve the situation. For λ > 2.5 μm, alternative materials are required, such as ZrF4 and, at longer wavelengths, chalcogenide glasses, but these are much more expensive and difficult to use than fused silica.

(ii) Conservation of Etendue. Fibers suffer from focal ratio degradation, a form of modal diffusion, which results in a speeding up of the output beam with respect to the input. Viewed as an increase in entropy, it results in a loss of information. In practice, this implies either reduced throughput, as the fiber output is vignetted by fixed-size spectrograph optics, or a loss in spectral resolution if the spectrograph is oversized to accept the faster beam from the fibers. This effect may be severe at the slow focal ratios produced by many telescopes (F > 10) but is relatively small for fast beams (F < 5). Thus, its effect may be mitigated by speeding up the input beam by attaching lenslets directly to the fibers, either individually (for MOS) or in a close-packed array (for IFS). If the spectrograph operates in both fiber- and beam-fed modes (e.g., Figure 8), it is also necessary to use lenslets at the slit to convert the beam back to the original speed.

(iii) Efficient coupling to the telescope. This requires that the fibers are relatively thick, to match the physical size of the sampling at the telescope focus, which is typically x_s = 0.1″ for IFS and x_s = 2″ for MOS. By conservation of Etendue, the physical size of the fiber aperture is:

d_f = x_s D_T F_f   [29]


where F_f is the focal ratio of the light entering the fiber and D_T is the diameter of the telescope aperture. Using the above values with F_f = 4, as required to maintain Etendue, implies 15 < d_f < 300 μm for an 8 m telescope, or 60 < d_f < 1200 μm for a 30 m telescope. Except at the lower end, this implies the use of multimode (step-index) fibers. Moreover, the need in IFS to oversize the fibers to account for manufacturing errors, and current requirements of x_s ≥ 0.2″ to produce sufficient SNR in a single spatial sample, give a practical limit of d_f > 50 μm, which is in the multimode regime. Recently, the use of photonic crystal fibers in astronomy has been investigated. These, together with other improvements in material technology, may improve the performance of optical fibers, which will be of special relevance to ELTs where, for example, there will be a need to couple very fast beams into fibers.
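As a worked check on eqn [29] (an illustrative addition, not part of the original article), the short Python sketch below converts the angular sample size from arcseconds to radians and reproduces the quoted fiber diameters; the helper name fiber_diameter_um is hypothetical.

import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond

def fiber_diameter_um(x_s_arcsec, D_T_m, F_f):
    # d_f = x_s * D_T * F_f (eqn [29]), with x_s in radians; returns microns.
    return x_s_arcsec * ARCSEC * D_T_m * F_f * 1e6

for D_T in (8.0, 30.0):
    d_ifs = fiber_diameter_um(0.1, D_T, 4.0)  # IFS sampling, 0.1 arcsec
    d_mos = fiber_diameter_um(2.0, D_T, 4.0)  # MOS sampling, 2 arcsec
    print(f"D_T = {D_T:.0f} m: d_f = {d_ifs:.0f}-{d_mos:.0f} um")
# Gives ~16-310 um (8 m) and ~58-1164 um (30 m), i.e., the quoted ranges
# of 15-300 um and 60-1200 um to within rounding.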

Cooling for the Near-Infrared

As shown in Figure 12, it is necessary to cool instruments for use in the infrared, to prevent thermal emission from the instrument becoming the dominant noise source. The minimum requirement is to place a cold stop at a conjugate of the telescope pupil. However, much of the rest of the instrument also requires cooling, because of the nonzero emissivity of the optics. The telescope itself also requires careful design.

For example, the Gemini telescopes are optimized for the NIR by undersizing M2, so that the only stray light entering the final focal surface is from the cold night sky. Irrespective of the thermal background and wavelength, all detectors (see Table 4) require cooling to reduce internally generated noise to acceptable levels. For the CCDs used in the optical, the cryogen is liquid nitrogen, but for the infrared, cooling with liquid helium is required.

Structure

As discussed above, the necessity for instruments to scale in size with the telescope aperture cannot be met simply by devising a rigid structure, since the required materials are generally not affordable (instruments on 8 m telescopes cost roughly €/$5–15M, including labor). Furthermore, the stability also depends critically on the mounting of the individual optical elements, where the choice of material is constrained by the need for compliance with the optical materials. The solution adopted for GMOS was to design the collimator and camera mountings so that variations in the gravity vector induce a translation of each component orthogonal to the optical axis, without inducing tilts or defocus. Therefore, the only effect of instability is a translation of the image of the slit on the detector. Since the slit area is rigidly attached to the telescope interface, the image of a celestial object does not move with respect

[Figure 12 plot: counts/pixel/s versus wavelength (μm), showing cumulative thermal backgrounds for temperatures of 100–300 K, the mean H-band sky at R = 300 and R = 3000, the inter-OH continuum, and a typical detector dark current.]

Figure 12 Illustration of the need for cooling in NIR instruments, modeled for the case of a spectrograph on an 8 m telescope with 0.2″ × 0.2″ sampling and with both throughput and total emissivity of 50%. The curves indicate the cumulative thermal background for wavelengths shortward of that shown on the scale. For comparison, a typical dark current is shown for NIR detectors (see Table 4). Also shown is the background from the night sky in the H-band, for two values of spectral resolution, R. The mean sky background level is shown for both values of R. For R = 3000, the continuum level found between the strong, narrow OH lines which make up most of the signal is also shown. To observe in the H-band, cooling to −40 °C is required for high spectral resolution, but is unnecessary at low resolution. At longer wavelengths, fully cryogenic cooling with liquid nitrogen is generally required.


to the slit, provided that the telescope correctly tracks the target. The motion of the image of the slit on the detector is then corrected by moving the detector orthogonally to the optical axis to compensate. Taking account of nonrepeatable motion in the structure and the detector mounting, it proved possible to attain the desired stability in open loop: the flexure was measured with the control system turned off, a look-up table was built to predict the required detector position for each attitude setting (elevation and rotation), and the correction was applied by counting pulses sent to the stepper motors which control the detector translation.

Some instruments require much greater stability than GMOS does. The flexure control outlined above may be augmented by operation in closed loop, so that nonrepeatable movements may be accounted for. However, this ideally requires a reference light source which illuminates the optics in the same way as the light from the astronomical target and which is recorded by the science detector without adversely impacting the observation. Alternatively, partial feedback may be supplied by repeated metrology of key optical components, such as fold mirrors.

One strategy for measuring radial velocities to very small uncertainties (a few meters per second) is to introduce a material (e.g., iodine) into the optical path, which is present throughout the observations. This absorbs light from the observed object's continuum at very well-defined wavelengths. Thus, instrument instability can be removed by measuring the centroid of the desired spectral line in the object relative to the fixed reference produced by the absorption cell. However, flexure must still be carefully controlled to avoid blurring the line during the course of a single exposure.

Mounting instruments remotely from the telescope via a fiber feed, at a location where they are not subjected to variations in the gravity vector, is a solution for applications where great stability is required. But, even here, care must be taken with the fore-optics and pickoff system attached to the telescope, and to account for modal noise induced by changes in the fiber configuration as the telescope tracks.

Finally, the instrument structure has other functions. For cooled instruments, a cryostat is required, inside which most of the optical components are mounted. For uncooled instruments, careful control of temperature is needed to avoid instrument motion due to thermal expansion. This requires the use of an enclosure which not only blocks out extraneous light, but provides a thermal buffer against changes in the

ambient temperature during observations and reduces thermal gradients by forced circulation of air.

See also Diffraction: Diffraction Gratings. Fiber and Guided Wave Optics: Fabrication of Optical Fiber. Imaging: Adaptive Optics. Instrumentation: Telescopes; Spectrometers. Spectroscopy: Fourier Transform Spectroscopy; Hadamard Spectroscopy and Imaging.

Further Reading

Allington-Smith JR, Murray G, Content R, et al. (2002) Integral field spectroscopy with the Gemini multiobject spectrograph. Publications of the Astronomical Society of the Pacific 114: 892.
Bjarklev A, Broeng J and Sanchez Bjarklev AS (2003) Photonic Crystal Fibres. Dordrecht: Kluwer Academic Publishers.
Bowen IS (1938) The image-slicer: a device for reducing loss of light at slit of stellar spectrograph. Astrophysical Journal 88: 113.
Carrasco E and Parry I (1994) A method for determining the focal ratio degradation of optical fibres for astronomy. Monthly Notices of the Royal Astronomical Society 271: 1.
Courtès G (1982) Instrumentation for astronomy with large optical telescopes, IAU Colloq. 67. In: Humphries CM (ed.) Astrophysics and Space Science Library, vol. 92, p. 123. Dordrecht: Reidel.
Glazebrook K and Bland-Hawthorn J (2001) Microslit nod-shuffle spectroscopy: a technique for achieving very high densities of spectra. Publications of the Astronomical Society of the Pacific 113: 197.
Lee D, Haynes R, Ren D and Allington-Smith JR (2001) Characterisation of microlens arrays for astronomical spectroscopy. Publications of the Astronomical Society of the Pacific 113: 1406.
McLean I (1997) Electronic Imaging in Astronomy. Chichester, UK: John Wiley and Sons Ltd.
Palmer C (2000) Diffraction Grating Handbook, 4th edn. Rochester, NY: Richardson Grating Laboratory.
Ridgway SH and Brault JW (1984) Astronomical Fourier transform spectroscopy revisited. Annual Review of Astronomy and Astrophysics 22.
Saha SK (2002) Modern optical astronomy: technology and impact of interferometry. Reviews of Modern Physics 74: 551.
Schroeder D (2000) Astronomical Optics, 2nd edn. San Diego, CA: Academic Press.
van Breugel W and Bland-Hawthorn J (eds) (2000) Imaging the Universe in Three Dimensions, ASP Conference Series, vol. 195. San Francisco, CA: Astronomical Society of the Pacific.
Weitzel L, Krabbe A, Kroker H, et al. (1996) 3D: the next generation near-infrared imaging spectrometer. Astronomy & Astrophysics Supplement 119: 531.


Ellipsometry

J N Hilfiker, J A Woollam & Co., Inc., Lincoln, NE, USA
J A Woollam, University of Nebraska, Lincoln, NE, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Ellipsometry measures a change in polarization as light reflects from or transmits through a material structure. The polarization change is represented as an amplitude ratio, Ψ, and a phase difference, Δ. The measured response depends on the optical properties and thickness of each material. Thus, ellipsometry is primarily used to determine film thickness and optical constants. However, it is also applied to the characterization of composition, crystallinity, roughness, doping concentration, and other material properties associated with a change in optical response.

Interest in ellipsometry has grown steadily since the 1960s, as it provided the sensitivity necessary to measure the nanometer-scale layers used in microelectronics. Today, the range of applications has spread to basic research in the physical sciences and to the semiconductor, data storage, flat panel display, communication, biosensor, and optical coating industries. This widespread use is due to the increased dependence on thin films in many areas and the flexibility of ellipsometry to measure most material types: dielectrics, semiconductors, metals, superconductors, organics, biological coatings, and composites of materials.

This article provides a fundamental description of ellipsometry measurements along with the typical data analysis procedures. The primary applications of ellipsometry are also surveyed.

Light and Materials

Ellipsometry measurements involve the interaction between light and material.

Light

Light can be described as an electromagnetic wave traveling through space. For ellipsometry, it is adequate to discuss the electric field behavior in space and time, also known as polarization. The electric field of a wave is always orthogonal to the propagation direction. Therefore, a wave traveling along the z-direction can be described by its x- and y-components. If the light has completely random orientation and phase, it is considered unpolarized. For ellipsometry, we are interested in the case where the electric field follows a specific path and traces out a distinct shape at any point. This is known as polarized light. When two orthogonal light waves are in phase, the resulting light is linearly polarized (Figure 1a); the relative amplitudes determine the resulting orientation. If the orthogonal waves are 90° out of phase and equal in amplitude, the resultant light is circularly polarized (Figure 1b). The most general polarization is 'elliptical', which combines orthogonal waves of arbitrary amplitude and phase (Figure 1c). This is how ellipsometry gets its name.

Materials

Two values are used to describe the optical properties which determine how light interacts with a material.

Figure 1 Orthogonal waves combine to demonstrate polarization: (a) linear; (b) circular; and (c) elliptical.


They are generally represented as complex numbers. The complex refractive index (ñ) consists of the index (n) and the extinction coefficient (k):

ñ = n + ik   [1]

Alternatively, the optical properties can be represented as the complex dielectric function:

ε̃ = ε1 + iε2   [2]

with the following relation between the conventions:

ε̃ = ñ²   [3]

The index describes the phase velocity of light as it travels through a material, compared to the speed of light in vacuum, c:

v = c/n   [4]

Light slows as it enters a material with higher index. Because the frequency remains constant, the wavelength will shorten. The extinction coefficient describes the loss of wave energy to the material. It is related to the absorption coefficient, α, as:

α = 4πk/λ   [5]

Light loses intensity in an absorbing material according to Beer's law:

I(z) = I(0) e^(−αz)   [6]

Thus, the extinction coefficient relates how quickly light vanishes in a material. These concepts are demonstrated in Figure 2, where a light wave travels from air into two different materials of varying properties.

The optical constants for TiO2 are shown in Figure 3, from the ultraviolet (UV) to the infrared (IR). The optical constants are wavelength dependent, with absorption (k > 0) occurring in both the UV and the IR due to different mechanisms that take energy from the light wave. IR absorption is commonly caused by molecular vibration, phonon vibration, or free carriers. UV absorption is generally due to electronic transitions, where light provides the energy to excite an electron to an elevated state. A closer look at the optical constants in Figure 3 shows that the real and imaginary optical constants are not independent: their shapes are mathematically coupled through Kramers–Krönig consistency. Further details are covered later in this article.

Interaction Between Light and Materials

Maxwell's equations must remain satisfied when light interacts with a material, which leads to boundary conditions at the interface. Incident light will reflect and refract at the interface, as shown in Figure 4. The angle between the incident ray and the sample normal

Figure 2 Wave travels from air into absorbing Film 1 and then transparent Film 2. The phase velocity and wavelength change in each material, depending on the index of refraction (Film 1: n = 4; Film 2: n = 2).

Figure 3 Complex dielectric function for a TiO2 film covering wavelengths from the infrared (low photon energy) to the ultraviolet (high photon energy).

Figure 4 Light reflects and refracts according to Snell's law.


(φi) will be equal to the reflected angle (φr). Light entering the material is refracted to an angle (φt) given by:

n0 sin(φi) = n1 sin(φt)   [7]

The same occurs at every interface, where a portion of the light reflects and the remainder transmits at the refracted angle. This is illustrated in Figure 5. The boundary conditions provide different solutions for electric fields parallel and perpendicular to the sample surface. Therefore, light can be separated into orthogonal components with relation to the plane of incidence. Electric fields parallel and perpendicular to the plane of incidence are considered p- and s-polarized, respectively. These two components are independent for isotropic materials and can be calculated separately. Fresnel described the amounts of light reflected and transmitted at an interface between two materials:

r_s = (E0r/E0i)_s = (ni cos θi − nt cos θt)/(ni cos θi + nt cos θt)   [8a]

r_p = (E0r/E0i)_p = (nt cos θi − ni cos θt)/(ni cos θt + nt cos θi)   [8b]

t_s = (E0t/E0i)_s = 2ni cos θi/(ni cos θi + nt cos θt)   [8c]

t_p = (E0t/E0i)_p = 2ni cos θi/(ni cos θt + nt cos θi)   [8d]

Thin film and multilayer structures involve multiple interfaces, with Fresnel reflection and transmission coefficients applicable at each. It is important to track the relative phase of each light component to correctly determine the overall reflected or transmitted beam. For this purpose, we define the film phase thickness as:

β = 2π (t1/λ) n1 cos θ1   [9]

The superposition of multiple light waves introduces interference that is dependent on the relative phase of each light wave. Figure 5 illustrates the combination of light waves in the reflected beam and their corresponding Fresnel calculations.
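As an illustration of eqns [7]–[9] (an addition, not part of the original article), the following Python sketch evaluates the Fresnel coefficients and the film phase thickness for complex indices; the function names are hypothetical, and the arcsine-based handling of complex refraction angles is one common convention.

import numpy as np

def fresnel(n_i, n_t, theta_i):
    """Fresnel amplitude coefficients at one interface (eqns [8a]-[8d]);
    n_i and n_t may be complex (n + ik), theta_i in radians."""
    theta_t = np.arcsin(n_i * np.sin(theta_i) / n_t + 0j)  # Snell's law, eqn [7]
    ci, ct = np.cos(theta_i), np.cos(theta_t)
    r_s = (n_i * ci - n_t * ct) / (n_i * ci + n_t * ct)    # eqn [8a]
    r_p = (n_t * ci - n_i * ct) / (n_i * ct + n_t * ci)    # eqn [8b]
    t_s = 2.0 * n_i * ci / (n_i * ci + n_t * ct)           # eqn [8c]
    t_p = 2.0 * n_i * ci / (n_i * ct + n_t * ci)           # eqn [8d]
    return r_s, r_p, t_s, t_p

def phase_thickness(t1, n1, theta1, wavelength):
    # beta = 2*pi*(t1/lambda)*n1*cos(theta1), eqn [9]; t1 and wavelength in the same units.
    return 2.0 * np.pi * (t1 / wavelength) * n1 * np.cos(theta1)

# Example: air to an absorbing film (n = 4 + 0.1j) at 70 degrees incidence.
r_s, r_p, t_s, t_p = fresnel(1.0, 4.0 + 0.1j, np.radians(70.0))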

Ellipsometry Measurements


For ellipsometry, the primary interest is in how the p- and s-components change upon reflection or transmission, relative to each other. In this manner, the reference beam is part of the experiment. A known polarization is reflected or transmitted from the sample and the output polarization is measured. The change in polarization is the ellipsometry measurement, commonly written as:

ρ = tan(Ψ) e^(iΔ)   [10]

Figure 5 Light reflects and refracts at each interface, which leads to multiple beams in a thin film. Interference between beams depends on relative phase and amplitude of the electric fields. Fresnel reflection and transmission coefficients can be used to calculate the response from each contributing beam.


An example ellipsometry measurement is shown in Figure 6. The incident light is linear with both p- and s-components. The reflected light has undergone amplitude and phase changes for both p- and s-polarized light, and ellipsometry measures their changes. The primary methods of measuring ellipsometry data all consist of the following components: light source, polarization generator, sample, polarization analyzer, and detector. The polarization generator and analyzer are constructed of optical components that manipulate the polarization: polarizers, compensators, and phase modulators. Common ellipsometer configurations include rotating analyzer (RAE), rotating polarizer (RPE), rotating compensator (RCE), and phase modulation (PME). The RAE configuration is shown in Figure 7. A light source produces unpolarized light, which is sent through a polarizer. The polarizer passes light of a preferred electric field orientation. The polarizer axis is oriented between the p- and s-planes, such that both arrive at the sample surface. The linearly polarized light reflects from the sample surface,

becoming elliptically polarized, and travels through a continuously rotating polarizer (referred to as the analyzer). The amount of light allowed to pass depends on the polarizer orientation relative to the electric field 'ellipse' coming from the sample. The detector converts the light from photons to an electronic signal to determine the reflected polarization. This information is used with the known input polarization to determine the polarization change caused by the sample reflection. This is the ellipsometry measurement of Ψ and Δ.

Single Wavelength Ellipsometry (SWE)

SWE uses a single frequency of light to probe the sample. This results in a single pair of data, Ψ and Δ, used to determine up to two material properties. The optical design can be simple, low-cost, and accurate. Lasers are an ideal light source, with well-known wavelength and significant intensity, and the optical elements can be optimized for the single wavelength. However, there is relatively low information content (two measured values). SWE is excellent for quick measurement of

Figure 6 Typical ellipsometry configuration, where linearly polarized light is reflected from the sample surface and the polarization change is measured to determine the sample response.

Figure 7 Rotating analyzer ellipsometer configuration uses a polarizer to define the incoming polarization and a rotating polarizer after the sample to analyze the outgoing light. The detector converts light to a voltage with the time-dependence leading to measurement of the reflected polarization.


nominally known films, like SiO2 on Si. Care must be taken when interpreting unknown films, as multiple solutions occur for different film thicknesses. The data from a transparent film will cycle through the same values as the thickness increases. From Figure 5, it can be seen that this is related to interference between the multiple light beams. The refracted light may travel through different thicknesses to emerge in phase with the first reflection, but is the returning light delayed by one wavelength or by several? Mathematically, the thickness cycle can be determined as:

Δf = λ / [2 (ñ1² − ñ0² sin²φ)^(1/2)]   [11]

This is demonstrated in Figure 8, where the complete thickness cycle is shown for SiO2 on Si at 75° angle of incidence and 500 nm wavelength. A bare silicon substrate would produce data at the position of the star. As the film thickness increases, the data move around this circle in the direction of the arrow. From eqn [11], the thickness cycle is calculated as 226 nm. Thus the cycle is completed and returns to the star when the film thickness reaches 226 nm. Any data point along the cycle represents a series of thicknesses which depend on how many times the cycle has been completely encircled. For example, the square represents a data point for 50 nm thickness; however, the same data point also represents 276 and 502 nm thicknesses. A second data point at a new wavelength or angle would help determine the correct thickness.
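A minimal numerical check of eqn [11] (an addition, not part of the original article), assuming n ≈ 1.46 for SiO2 at 500 nm:

import math

def thickness_cycle_nm(wavelength_nm, n_film, n_ambient=1.0, phi_deg=75.0):
    # eqn [11]: delta_f = lambda / (2*sqrt(n1^2 - n0^2*sin^2(phi)))
    s2 = (n_ambient * math.sin(math.radians(phi_deg))) ** 2
    return wavelength_nm / (2.0 * math.sqrt(n_film**2 - s2))

print(round(thickness_cycle_nm(500.0, 1.46)))  # ~228 nm; the text quotes
# 226 nm, the small difference reflecting the exact SiO2 index assumed.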

Spectroscopic Ellipsometry (SE)

Spectroscopic measurements solve the 'period' problem discussed for SWE. Data at surrounding wavelengths ensure that the correct thickness is determined, even though multiple solutions remain at any one wavelength in the spectrum. This is demonstrated in Figure 9, which shows the spectroscopic data for the first three thickness results. The SE data oscillations clearly distinguish each thickness; thus, a single thickness solution remains to match the data at multiple wavelengths.

SE provides additional information content for each new wavelength. While the film thickness remains constant regardless of wavelength, the optical constants change across the spectrum. The optical constant shape contains information regarding the material microstructure, composition, and more. Different wavelength regions provide the best information for each different material property. For this reason, SE systems have been developed to cover very wide spectral regions, from the UV to the IR.

Spectroscopic measurements require an additional component: a method of wavelength selection. The most common methods include the use of monochromators, Fourier-transform spectrometers, and gratings or prisms with detection on a diode array. Monochromators are typically slow, as they sequentially scan through wavelengths. Spectrometers and diode arrays allow simultaneous measurement at multiple wavelengths, which has become popular with the desire for high-speed SE measurements.

Variable Angle Ellipsometry

Ellipsometry measurements are typically performed at oblique angles, where the largest changes in polarization occur. The Brewster angle is defined as:

tan(φB) = n1/n0   [12]

Figure 8 Data from a single wavelength will cycle through the same points as the film thickness increases. The star represents the starting point, with no SiO2 film on Si. As the thickness increases, the data follow the cycle in the direction of the arrow. After the cycle is complete (thickness = Δf), the data repeat the cycle. Thus, any point along the curve represents multiple possible thicknesses; for example, the square at 50 nm thickness also corresponds to 276 and 502 nm.


Figure 9 Three spectroscopic ellipsometry measurements that match at 500 nm wavelength, but are easily distinguishable as a function of wavelength.


gives the angle where the reflection of p-polarized light goes through a minimum. Ellipsometry measurements are usually made near this angle, which can vary from 55° for low-index dielectrics to 75° or 80° for semiconductors and metals. It is also common to measure at multiple angles of incidence. This allows additional data to be collected for the same material structure under different conditions. The most important change introduced by varying the angle is the change in path length through the film, as the light refracts at a different angle. Multiple angles do not always add new information about a material structure, but the extra data build confidence in the final answer.

In Situ

Within the last decade, it has become common to employ optical diagnostics during processing. In situ ellipsometry allows the optical response to be monitored in real-time, which has also led to feedback control for many processes. While in situ measurement is generally restricted to a single angle of incidence, there is a distinct advantage to collecting data at different 'times' during processing to get unique 'glimpses' of the sample structure as it changes. In situ SE is commonly applied to semiconductors. Conventional SE systems measure from the UV to the NIR, which matches the spectral region where semiconductors absorb due to electronic transitions. The shape and position of the absorption can be very sensitive to temperature, composition, crystallinity, and surface quality, which allows SE to closely monitor these properties. In situ SE has also been

Figure 10 Flowchart for ellipsometry data analysis.

used to monitor numerous material types, including metals and dielectrics. A promising in situ application is for multilayer optical coatings, where thickness and index can be determined in real-time to allow correction of future layers, to optimize optical performance.

Data Analysis

Ellipsometry measures changes in polarization. However, it is used to determine the material properties of interest, such as film thickness and optical constants. In the case of a bulk material, the equations derived for a single reflection can be directly inverted to provide the 'pseudo' optical constants from the ellipsometry measurement, ρ:

⟨ε̃⟩ = sin²(φ) [1 + tan²(φ) ((1 − ρ)/(1 + ρ))²]   [13]

This equation assumes there are no surface layers of any type. However, there is typically a surface oxide or roughness for any bulk material and the direct inversion would incorporate these into the bulk optical constants. The more common procedure used to deduce material properties from ellipsometry measurements follows the flowchart in Figure 10. Regression analysis is required because an exact equation cannot be written. Often the answer is overdetermined with hundreds of experimental data points for a few unknowns. Regression analysis allows all of the measured data to be included when determining the solution.
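As a sketch of the direct inversion in eqn [13] (an illustrative addition, with made-up input values), using the definition of ρ from eqn [10]:

import numpy as np

def pseudo_dielectric(psi_deg, delta_deg, phi_deg):
    """Direct inversion of a single bulk reflection (eqn [13]); psi and
    delta are the measured ellipsometry angles, phi the incidence angle."""
    psi, delta, phi = np.radians([psi_deg, delta_deg, phi_deg])
    rho = np.tan(psi) * np.exp(1j * delta)                     # eqn [10]
    return np.sin(phi)**2 * (1.0 + np.tan(phi)**2 *
                             ((1.0 - rho) / (1.0 + rho))**2)

# Made-up illustrative angles only; real values come from the instrument.
print(pseudo_dielectric(10.0, 177.0, 75.0))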


Data analysis proceeds as follows. After the sample is measured, a model is constructed to describe the sample. The model is used to calculate the predicted response from Fresnel's equations; thus, each material must be described with a thickness and optical constants. If these values are not known, an estimate is given for the purpose of the preliminary calculation. The calculated values are compared to the experimental data. Any unknown material properties can then be varied to improve the match between experiment and calculation. The number of unknown properties should not exceed the information content contained in the experimental data. For example, a single-wavelength ellipsometer produces two data points (Ψ, Δ), which allows a maximum of two material properties to be determined. Finding the best match between model and experiment is typically done through regression. An estimator, like the mean squared error (MSE), is used to quantify the difference between curves. The unknown parameters are allowed to vary until the minimum MSE is reached. Care must be taken to ensure that the best answer is found, corresponding to the lowest MSE. For example, Figure 11a shows the MSE curve versus film thickness for a transparent film on silicon. There are multiple 'local' minima, but the lowest MSE value

occurs at a thickness of 749 nm, which corresponds to the correct film thickness. It is possible for the regression algorithm to fall mistakenly into a 'local' minimum, depending on the starting thickness and the structure of the MSE curve. Comparing the results by eye for the lowest MSE and a local minimum easily distinguishes the true minimum (see Figures 11b and c).
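The following Python sketch (an illustrative addition, not the algorithm of any particular instrument) shows how such an MSE curve arises: it models ρ for an ambient/film/substrate stack from eqns [7]–[10], synthesizes 'measured' spectra for a 749 nm film with crude dispersionless indices, and grid-searches the thickness. The helper name rho_film and the chosen e^(−2iβ) phase convention are assumptions of the sketch.

import numpy as np

def rho_film(t_nm, wl_nm, n0, n1, n2, phi0_deg):
    """rho = r_p/r_s for ambient/film/substrate, from the Fresnel
    coefficients (eqns [8a], [8b]) and the film phase (eqn [9])."""
    phi0 = np.radians(phi0_deg)
    phi1 = np.arcsin(n0 * np.sin(phi0) / n1 + 0j)   # Snell's law, eqn [7]
    phi2 = np.arcsin(n0 * np.sin(phi0) / n2 + 0j)
    c0, c1, c2 = np.cos(phi0), np.cos(phi1), np.cos(phi2)

    def r(ni, nt, ci, ct, pol):
        if pol == "s":
            return (ni * ci - nt * ct) / (ni * ci + nt * ct)
        return (nt * ci - ni * ct) / (ni * ct + nt * ci)

    phase = np.exp(-2j * 2 * np.pi * (t_nm / wl_nm) * n1 * c1)  # e^(-2i*beta)
    r_p = (r(n0, n1, c0, c1, "p") + r(n1, n2, c1, c2, "p") * phase) \
        / (1 + r(n0, n1, c0, c1, "p") * r(n1, n2, c1, c2, "p") * phase)
    r_s = (r(n0, n1, c0, c1, "s") + r(n1, n2, c1, c2, "s") * phase) \
        / (1 + r(n0, n1, c0, c1, "s") * r(n1, n2, c1, c2, "s") * phase)
    return r_p / r_s

# Synthesize "measured" spectra for a 749 nm film with crude,
# dispersionless indices, then grid-search the thickness.
wls = np.linspace(400.0, 800.0, 81)
n_f, n_sub = 1.46, 3.9 + 0.02j
rho_meas = rho_film(749.0, wls, 1.0, n_f, n_sub, 75.0)
grid = np.arange(0.0, 1500.0, 1.0)
mse = [np.mean(np.abs(rho_film(t, wls, 1.0, n_f, n_sub, 75.0) - rho_meas)**2)
       for t in grid]
print(grid[int(np.argmin(mse))])  # 749.0, with shallower local minima nearby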

Ellipsometry Characterization

The two most common material properties studied by ellipsometry are film thickness and optical constants. In addition, ellipsometry can characterize material properties that affect the optical constants. Crystallinity, composition, resistivity, temperature, and molecular orientation can all affect the optical constants and in turn be measured by ellipsometry. This section details many of the primary applications important to ellipsometry.

Film Thickness

The film thickness is determined by the interference between light reflecting from the surface and light traveling through the film. Depending on the phase of the rejoining light relative to the surface reflection, the interference can be constructive or destructive.

Figure 11 (a) MSE curve versus thickness, showing the 'global' minimum where the best match between model and experiment occurs, and 'local' minima that may be found by the regression algorithm but do not give the final result. (b) The experimental data and corresponding curves generated for the model at the 'global' minimum. (c) Similar curves at the 'local' minimum near 0.45 μm thickness are easily distinguishable as an incorrect result.


Figure 12 (a) Reflected intensity and (b) ellipsometric Δ for three thin oxides on silicon, showing the high sensitivity of Δ to nanometer-scale films that is not available from the intensity measurement.

The interference involves both amplitude and phase information. The phase information contained in Δ is very sensitive to films down to submonolayer thickness. Figure 12 compares reflected intensity and ellipsometry for the same series of thin SiO2 layers on Si: there are large variations in Δ, while the reflectance for each film is nearly the same. Ellipsometry is typically used for films with thicknesses ranging from subnanometer to a few microns. As films become thicker than several tens of microns, it becomes increasingly difficult to resolve the interference oscillations, except with longer infrared wavelengths, and other characterization techniques become preferred.

Thickness measurements also require a portion of the light to travel through the entire film and return to the surface. If the material is absorbing, thickness measurements by optical instruments will be limited to thin, semi-opaque layers. This limitation can be circumvented by measuring in a spectral region with lower absorption. For example, an organic film may strongly absorb UV and IR light, but remain transparent at mid-visible wavelengths. For metals, which strongly absorb at all wavelengths, the maximum layer for thickness determination is typically ~100 nm.

Optical Constants

Thickness measurements are not independent of the optical constants. The film thickness affects the path length of light traveling through the film, but the index determines the phase velocity and refracted angle. Thus, both contribute to the delay between surface reflection and light traveling through the film. Both n and k must be known or determined along with the thickness to get the correct results from an optical measurement. The optical constants for a material will vary for different wavelengths and must be described at all wavelengths probed with the ellipsometer. A table of

optical constants can be used to predict the response at each wavelength. However, it is less convenient to adjust unknown optical constants on a wavelength-by-wavelength basis. It is more advantageous to use all wavelengths simultaneously: a dispersion relationship often solves this problem by describing the optical constant shape versus wavelength. The adjustable parameters of the dispersion relationship allow the overall optical constant shape to match the experimental results. This greatly reduces the number of unknown 'free' parameters compared to fitting individual n, k values at every wavelength. For transparent materials, the index is often described using the Cauchy or Sellmeier relationship. The Cauchy relationship is typically given as:

n(λ) = A + B/λ² + C/λ⁴   [14]

where the three terms are adjusted to match the refractive index for the material. The Sellmeier relationship enforces Kramers–Krönig (KK) consistency, which ensures that the optical dispersion retains a realistic shape. The Cauchy relationship is not constrained by KK consistency and can produce unphysical dispersion. The Sellmeier relationship can be written as:

ε1 = A λ² λ0² / (λ² − λ0²)   [15]

Absorbing materials will often have a transparent wavelength region that can be modeled with the Cauchy or Sellmeier relationships. However, the absorbing region must account for both real and imaginary optical constants. Many dispersion relationships use oscillator theory to describe the absorption for materials, which include the Lorentz, Harmonic, and Gaussian oscillators. They all share similar attributes, where the absorption features are described with an amplitude, broadening, and center energy (related to frequency of light).


Figure 13 Lorentz oscillator, illustrating the primary oscillator parameters: amplitude (A), broadening (B), and center energy (Ec), which describe the shape of the imaginary dielectric function, and an offset (ε1,offset) to help match the real component after the KK transformation has defined its shape.

Kramers–Krönig consistency is used to calculate the shape of the real component after the imaginary behavior is described by the oscillator. An offset to the real component is added to account for extra absorption outside the measured spectral region. The Lorentz oscillator can be written as:

ε̃ = ε1,offset + A Ec / (Ec² − E² − iBE)   [16]

where the parameters for amplitude (A), broadening (B), center energy (Ec), and offset (ε1,offset) are also shown in Figure 13 for a typical Lorentz oscillator. The energy, E, is related to the frequency of the wave, ν:

E = hν ≈ 1240/λ_nm   [17]

where h is Planck's constant and the wavelength, λ, is given in nanometers. More advanced dispersion models, like the Tauc–Lorentz and Cody–Lorentz, include terms to describe the bandgap energy.
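A short numerical sketch of eqns [16] and [3] (an addition to the article, with illustrative parameter values only):

import numpy as np

def lorentz_eps(E, amp, broad, E_c, eps1_offset=1.0):
    # eqn [16]: eps = eps1_offset + A*Ec / (Ec^2 - E^2 - i*B*E)
    return eps1_offset + amp * E_c / (E_c**2 - E**2 - 1j * broad * E)

E = np.linspace(0.5, 6.5, 601)               # photon energy in eV (eqn [17])
eps = lorentz_eps(E, 10.0, 0.4, 4.0, 1.5)    # illustrative parameters only
nk = np.sqrt(eps)                            # back to n + ik via eqn [3]
n, k = nk.real, nk.imag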

Crystallinity

Semiconductors, such as Si, are widely used materials, but their properties depend strongly on crystallinity. The UV absorption features of crystalline silicon are broadened and shifted as the material becomes more amorphous. This change in optical properties with the degree of crystallinity has been exploited in ellipsometry measurements to monitor semiconductors and other films. Polysilicon films are used in both the display and semiconductor industries. The degree of crystallinity varies for different process conditions and can be monitored optically to ensure consistent material properties.

Composition

The composition of many alloys affects the optical constants. The strongest changes often occur in the absorbing region, with shifts in the position and amplitude of absorption features. For example, the electronic transitions in Hg1−xCdxTe move to higher energy as the Cd concentration increases (Figure 14). This material is used for IR detectors that require precise control of composition. Spectroscopic ellipsometry performs this task in real-time, with instant feedback to correct the composition during processing. Other applications include AlGaN and InGaN for optoelectronics.

Mixing Materials

When two or more materials are mixed on a microscopic level, an effective medium approximation (EMA) may be used to determine the resulting optical constants. In the case of the Bruggeman EMA, which is commonly used, the mixture optical constants (ε̃_eff) relate to those of the individual materials as:

f_a (ε_a − ε_eff)/(ε_a + 2ε_eff) + f_b (ε_b − ε_eff)/(ε_b + 2ε_eff) = 0   [18]

This can be interpreted as small particles of material A suspended in host material B. The length-scale of the mixed particles must satisfy certain electromagnetic criteria: typically smaller than one-tenth the wavelength of light. In practice, EMA theory is useful for studying very thin surface roughness or the interfacial intermixing of materials. Both cases are generally approximated by mixing the two surrounding materials in equal portions; in the case of surface roughness, the second material is void (n = 1). EMA theory has also been extended to porous materials, where the directional dependence of the inclusions is handled mathematically.
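Equation [18] is a quadratic in ε_eff once the denominators are cleared, so it can be solved directly. The following Python sketch (an illustrative addition; the root-selection rule is a common heuristic, not from the article) does this for a two-component mixture:

import numpy as np

def bruggeman_eps(eps_a, eps_b, f_a):
    """Solve the two-component Bruggeman EMA (eqn [18]) for eps_eff.
    Clearing denominators gives the quadratic
    -2*e^2 + [f_a*(2*eps_a - eps_b) + f_b*(2*eps_b - eps_a)]*e + eps_a*eps_b = 0."""
    f_b = 1.0 - f_a
    b = f_a * (2 * eps_a - eps_b) + f_b * (2 * eps_b - eps_a)
    roots = np.roots([-2.0, b, eps_a * eps_b])
    for root in roots:                  # heuristic choice of the physical root
        if root.real > 0 and root.imag >= -1e-12:
            return root
    return roots[0]

# 50/50 film/void mixture (eps = 4 and 1), the usual surface-roughness model:
print(bruggeman_eps(4.0 + 0j, 1.0 + 0j, 0.5))   # ~2.17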

Figure 14 Optical properties of Hg1−xCdxTe vary with changes in composition.


Figure 15 Experimental ellipsometry data and corresponding fit when the model is described as (a) a homogeneous single-layer and (b) a graded film with index variation through the film. (c) The best model to match subtleties in experimental data allows the index to increase toward the surface of the film, with a thin roughness layer on the surface.

Doping Concentration

Dopants in a material will introduce absorption in the infrared due to free-carriers. Spectroscopic ellipsometry measurements at long wavelengths can characterize this optical absorption, thus characterizing the doping concentration. This is common for highly doped semiconductors and transparent conductors such as indium tin oxide (ITO).

Optical Anisotropy

Many materials are optically anisotropic; i.e., their optical properties vary along different film directions. Typical ellipsometry measurements assume no cross-coupling between the p- and s-polarizations. This cannot be assumed for anisotropic materials, which has led to 'generalized' ellipsometry measurements. Generalized ellipsometry measures additional information regarding the p-to-s and s-to-p conversion upon reflection. This allows characterization of anisotropic materials, which have directionally dependent optical constants.

Optical Variation (Grading)

Many thin film properties change vertically throughout the film (along the direction perpendicular to the surface). This is most often induced by processing conditions, whether intentional or unintentional. Figure 15a shows the fit to experimental spectroscopic ellipsometry data taken from a single-layer film when it is modeled as a homogeneous layer. To improve the agreement between experimental and model-generated curves, the index was allowed to vary in a linear manner throughout the film. The best fit is shown in Figure 15b, where the model includes the index variation and a thin surface roughness layer. The sample structure is demonstrated in Figure 15c.

Conclusions

Ellipsometry is a common optical technique for measuring thin films and bulk materials. It relies on the polarization changes caused by reflection or transmission from a material structure to deduce material properties, like film thickness and optical constants. This technique continues to develop as the requirement for thin film characterization increases.

See also

Geometrical Optics: Lenses and Mirrors; Prisms. Optical Coatings: Thin-Film Optical Coatings. Optical Materials: Measurement of Optical Properties of Solids.


Polarization: Introduction; Matrix Analysis. Semiconductor Physics: Band Structure and Optical Properties. Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Aspnes DE (1985) The accurate determination of optical properties by ellipsometry. In: Palik ED (ed.) Handbook of Optical Constants of Solids, pp. 89–112. Orlando, FL: Academic Press.
Azzam RMA and Bashara NM (1987) Ellipsometry and Polarized Light. Amsterdam, The Netherlands: Elsevier Science B.V.
Boccara AC, Pickering C and Rivory J (eds) (1993) Spectroscopic Ellipsometry. Amsterdam: Elsevier Publishing.
Collins RW, Aspnes DE and Irene EA (eds) (1998) Proceedings from the Second International Conference on Spectroscopic Ellipsometry. Thin Solid Films, vols 313–314.
Gottesfeld S, Kim YT and Redondo A (1995) Recent applications of ellipsometry and spectroellipsometry in electrochemical systems. In: Rubinstein I (ed.) Physical Electrochemistry: Principles, Methods, and Applications. New York: Marcel Dekker.
Herman IP (1996) Optical Diagnostics for Thin Film Processing, pp. 425–479. San Diego, CA: Academic Press.
Johs B, Woollam JA, Herzinger CM, et al. (1999) Overview of variable angle spectroscopic ellipsometry (VASE), Part II: Advanced applications. Optical Metrology, vol. CR72, pp. 29–58. Bellingham, WA: SPIE.
Johs B, Hale J, Ianno NJ, et al. (2001) Recent developments in spectroscopic ellipsometry for in situ applications. In: Duparré A and Singh B (eds) Optical Metrology Roadmap for the Semiconductor, Optical, and Data Storage Industries II, vol. 4449, pp. 41–57. Bellingham, WA: SPIE.
Röseler A (1990) Infrared Spectroscopic Ellipsometry. Berlin: Akademie-Verlag.
Rossow U and Richter W (1996) Spectroscopic ellipsometry. In: Bauer G and Richter W (eds) Optical Characterization of Epitaxial Semiconductor Layers, pp. 68–128. Berlin: Springer-Verlag.
Tompkins HG (1993) A User's Guide to Ellipsometry. San Diego, CA: Academic Press.
Tompkins HG and Irene EA (eds) (in press) Handbook of Ellipsometry. New York: William Andrew Publishing.
Tompkins HG and McGahan WA (1999) Spectroscopic Ellipsometry and Reflectometry. New York: John Wiley & Sons, Inc.
Woollam JA (2000) Ellipsometry, variable angle spectroscopic. In: Webster JG (ed.) Wiley Encyclopedia of Electrical and Electronics Engineering, pp. 109–116. New York: John Wiley & Sons.
Woollam JA and Snyder PG (1992) Variable angle spectroscopic ellipsometry. In: Brundle CR, Evans CA and Wilson S (eds) Encyclopedia of Materials Characterization: Surfaces, Interfaces, Thin Films, pp. 401–411. Boston, MA: Butterworth-Heinemann.
Woollam JA, Johs B, Herzinger CM, et al. (1999) Overview of variable angle spectroscopic ellipsometry (VASE), Part I: Basic theory and typical applications. Optical Metrology, vol. CR72, pp. 3–28. Bellingham, WA: SPIE.

Photometry

J Schanda, University of Veszprém, Veszprém, Hungary

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Fundamentals of Photometry

Photometry is defined by the International Commission on Illumination (known internationally as the CIE, from the abbreviation of its French name, Commission Internationale de l'Eclairage) as the 'measurement of quantities referring to radiation as evaluated according to a given spectral efficiency function, e.g., V(λ) or V′(λ)'. A note to this definition states that in many languages the term is used in a broader sense, covering the science of optical

radiation measurement. We will restrict our treatise to the above fundamental meaning of photometry. By spectral efficiency function we understand the spectral luminous efficiency function of the human visual system. The internationally agreed symbols are V(λ) for photopic vision (daylight conditions) and V′(λ) for scotopic vision (nighttime conditions). The CIE standardized the V(λ) function in 1924 and the V′(λ) function in 1951; their spectral distributions are shown in Figure 1. The V(λ) function was determined mainly using the so-called flicker photometric technique, where the light of a reference stimulus and of a test stimulus of varying wavelength are shown alternately to the human observer at a frequency at which the observer is unable to perceive the difference in color (due to the different wavelengths of the two radiations), but still perceives a


flicker sensation if the luminances of the two stimuli are different. By adjusting the radiance of the test stimulus, we can reach the situation of minimum flicker sensation; in this case we state that we have set the two stimuli to equal luminance. Luminance is the quantity that the human observer perceives as brightness (in the case of near-white stimuli; see below); radiance is its physical counterpart, measured in W m⁻² sr⁻¹. For the V′(λ) function one can simply project the two stimuli side by side: under scotopic adaptation we cannot distinguish colors, only brightness differences are perceived, so one can set equal scotopic luminance for the two stimuli by adjusting the brightness of the test stimulus until it agrees with that of the reference stimulus.

Illuminating engineering was mainly interested in describing the effect of near-white light, and therefore a psychophysical correlate was selected that described the perceived brightness sensation reasonably well. For near-white stimuli additivity holds; i.e., if A, B, C, and D are four stimuli, the A stimulus matches the B stimulus, and the C stimulus matches the D stimulus, then the superposition of the A and C stimuli matches the superposition of the B and D stimuli. The word 'match' is used here to describe stimuli that produce the same perception. It can be shown that for these stimuli, if the radiance is weighted with the V(λ) function, the constructed luminances will be equal, i.e.:

if L_A = L_B and L_C = L_D,
then L_A + L_C = L_B + L_D, or L_A + L_D = L_B + L_C

where

L_x = K_m ∫_{380 nm}^{780 nm} S_{λx} V(λ) dλ   [1]

x refers to A, B, C, or D, and S_{λx} is the spectral radiance distribution that produces the stimulus,

S_λ = dS(λ)/dλ

Figure 1 Spectral efficiency functions of the human eye under photopic, V(λ), and scotopic, V′(λ), conditions (see CIE, The Basis of Physical Photometry, CIE 18.2:1983).

where SðlÞ is the spectral radiance. Km is the maximum value of the luminous efficacy of radiation, its value is 683 lm W21 (see the discussion of luminous flux in the sub-section on photometric quantities). To be precise the integration should go from 0 nm to infinity, but it is usual to define the lower and upper wavelength limits of the visible spectrum as 380 nm and 780 nm. VðlÞ is defined between 360 nm and 830 nm, and V 0 ðlÞ between 380 nm and 780 nm, see Figure 1. We have to stress that the concept of luminance – and the entire system of the present-day photometry – is not to quantify brightness perception; it is only a reasonable approximation for near white stimuli. For colored lights a brightness –luminance discrepancy exists (called the Helmhotz– Kohlrausch effect: saturated colors look brighter than predicted by luminance). Luminance is, however, a good description for the visibility of fine details, thus it is a good concept for lighting calculations. Photometry has a unique situation in the SI system of units: the candela (cd) is a base unit of the SI system, the only one that is connected to psychophysical phenomena. By 1979, in the definition of the 16th General Conference of Weights and Measures, the candela was traced back to radiation quantities; nevertheless it was kept as the base unit of photometry: The candela is the luminous intensity, in a given direction, of a source that emits monochromatic radiation of frequency 540 £ 1012 Hz and that has a radiant intensity in that direction of 1/683 W sr – 1. (540 £ 1012 Hz corresponds approximately to 555 nm.)

To be able to calculate a photometric quantity for radiation of other wavelengths, one of the psychophysical functions V(λ) or V′(λ) has to be used and eqn [1] applied. We will see later that, besides V(λ) and V′(λ), a number of further spectral luminous efficiency functions might be used in modern applications.

Photometric Quantities

As discussed in the previous section, from the point of view of vision, the most important quantity is

luminance. Looking, however, at the definition of the base unit, it is obvious that from the physical point of view the definition of a quantity corresponding to radiant power, measured in watts, can help to bridge the gap between photometry and radiometry: luminous flux, measured in lumens (lm), with the symbol Φ, is defined by eqn [1], with S_λ(λ) inserted in W m⁻¹. Based on this more practical quantity of luminous flux and its unit, lm, the different quantities used in photometry can be built up as follows.

Luminous flux:

Φ = K_m ∫_0^∞ [dΦ_e(λ)/dλ] V(λ) dλ   [2]

where Φ_e(λ) is the radiant flux measured in W, and

Φ_{e,λ} = dΦ_e(λ)/dλ

is the spectral distribution (or spectral concentration) of the radiant flux (W m⁻¹). K_m = 683 lm W⁻¹ is the maximum value of the luminous efficacy of radiation, reached at λ_m ≈ 555 nm. Similar equations can be written for scotopic luminous flux, with V(λ) exchanged for V′(λ), where K′_m = 1700 lm W⁻¹ at λ_m ≈ 507 nm. All further quantities can be defined both for photopic and scotopic conditions. Here we write them only as photopic quantities.

Luminous intensity:

I = dΦ/dΩ   [3]

where dΦ is the luminous flux traveling in an elementary solid angle dΩ, assuming a point source (see Figure 2). The unit of luminous intensity is the candela (cd = lm sr⁻¹).

Figure 2 Concept of point source, solid angle, and luminous intensity.

Luminance:

L = ∂²Φ/(∂A cos Θ ∂Ω)   [4]

where dΦ is the luminous flux traveling in an elementary solid angle dΩ, dA is the elementary surface area emitting the radiation, and Θ is the angle between the normal of dA and the direction of luminance measurement (see Figure 3). The unit of luminance is cd m⁻² (in some older literature called nit).

Figure 3 Geometry for the definition of luminance.

Illuminance:

E = dΦ/dA   [5]

where dΦ is the luminous flux incident on the dA element of a surface (see Figure 4). The unit of illuminance is the lux (lx = lm m⁻²).

Figure 4 Illuminance is the total luminous flux per unit area incident at a point coming from a hemispherical solid angle.

A remark on the use of SI units and related quantities

No prefixes can be added to the SI units: irrespective of whether one measures luminance based on the photopic V(λ) function or the scotopic V′(λ) function, the so-determined photopic luminance or scotopic luminance is measured in cd m⁻² (no photopic or scotopic lumen, candela, etc., exist!). This often creates confusion, because only for 555 nm monochromatic radiation will a 1 cd m⁻² photopic or scotopic luminance produce equal visual sensation; for every other wavelength the two are not commensurable. One can only relate photopic measurement results to photopic ones, and scotopic measurement results to scotopic ones.
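As a concrete illustration of eqn [2], the short Python sketch below evaluates the integral numerically by the trapezoidal rule. The Gaussian curve standing in for V(λ) and the flat 1 W test spectrum are illustrative assumptions only; a real calculation must use the tabulated CIE V(λ) values.

```python
import numpy as np

KM = 683.0  # lm/W, maximum luminous efficacy of radiation (photopic)

def luminous_flux(wl_nm, phi_e_per_nm, v_lambda):
    """Trapezoidal approximation of eqn [2].
    wl_nm        -- wavelengths in nm
    phi_e_per_nm -- spectral radiant flux at those wavelengths, W/nm
    v_lambda     -- V(lambda) sampled at the same wavelengths
    """
    return KM * np.trapz(phi_e_per_nm * v_lambda, wl_nm)

wl = np.arange(380.0, 781.0)                             # 380-780 nm, 1 nm steps
v_stand_in = np.exp(-0.5 * ((wl - 555.0) / 45.0) ** 2)   # crude stand-in for CIE V(lambda)
phi_e = np.full_like(wl, 1.0 / 400.0)                    # 1 W spread evenly over the band
print(f"{luminous_flux(wl, phi_e, v_stand_in):.0f} lm")  # about 190 lm with this stand-in
```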

Concepts of Advanced Photometry

The quantities and units described in the previous section are the quantities and units internationally agreed by the Meter Convention and the International Organization for Standardization (ISO). Modern photometric applications need, however, some further quantities and weighting functions. Thus the CIE, the international organization for the development of standards in the field of optical radiation measurement, has defined a series of further weighting functions and quantities. The most important ones are the following.

V_M(λ) function

In the blue part of the spectrum (below 460 nm) the values of the V(λ) function turned out to be too low. For decades this was of concern only to the vision research community, but with the introduction of blue LEDs and other short-wavelength emitting sources (e.g., the blue channel of displays) this shortcoming of the V(λ) function became of practical importance. The 'CIE 1988 2° spectral luminous efficiency function for photopic vision' corrects this anomaly (see Figure 5).

V₁₀(λ) function

The official V(λ) function is valid only for foveal vision (i.e., for targets that are smaller than 4° of visual angle). The foveal area of the retina is covered with a yellow pigmented layer (macula lutea) that absorbs in the blue part of the spectrum; at larger visual angles this screening is not effective anymore. The V₁₀(λ) function was determined for a visual angle of 10° (see Figure 5). Its international recommendation is still under consideration at the time of writing this article; if accepted, it will be recommended for targets seen at approximately 10° off-axis, e.g., for measuring the photometric properties of traffic signs and signals, where the driver has to observe the information in the periphery of his or her eye.

Figure 5 Spectral luminous efficiency (SPL) functions defined or under consideration for international adoption: V2: standard V(λ) function; VM2: CIE 1988 V_M(λ) function, which is equivalent to the brightness SPL for a point source; V10: 10° visual field SPL; Vb,2: 2° visual field brightness SPL, V_b,2(λ); Vb,10: 10° visual field brightness SPL, V_b,10(λ).

Brightness matching functions

As already mentioned in the Introduction, luminance is not a good correlate of brightness, which is a human perception. In seeking a better correlate of brightness, the problem to be addressed is that brightness is a nonadditive phenomenon, i.e., in eqn [1] one cannot add (integrate) the monochromatic radiations to get a brightness correlate for a nonmonochromatic radiation. Brightness-evaluating spectral luminous efficiency functions can be used only to compare monochromatic radiations. The CIE has compiled such functions for point sources and for 2° and 10° visual field sizes; Figure 5 also shows these functions. Because brightness perception is nonadditive with respect to the stimuli that produce it, no brightness photometry can be built on equations of the form of eqn [1]. For the description of brightness we have to rely on the concept of equivalent luminance, a term whose definition has recently been updated.

Equivalent luminance

For a field of given size and shape, and for radiation of arbitrary relative spectral distribution, the equivalent luminance L_eq is the luminance of a comparison field in which monochromatic radiation of frequency 540 × 10¹² Hz has the same brightness as the field considered under the specified photometric conditions of measurement; the comparison field must have a specified size and shape, which may be different from that of the field considered. To build an instrument that measures this quantity is a real challenge for the future.

Advanced Use of Photometry

Based on the above newly defined quantities, several attempts are under way to extend the usefulness of photometry in designing the human visual environment. The eye is an optical system and, as in every such system, the depth of focus and the different aberrations of the system decrease with decreasing pupil size. Pupil size decreases with increasing illumination and, at constant luminance, with a higher content of short-wavelength radiation. Thus, there exists a school of researchers who advocate that increased blue content in the light has beneficial effects on vision, and that one should extend the classical photopic-based photometry with a scotopic-based one to properly describe the visual effect of lighting.

Other investigations are concerned with visibility at low light levels, the range used in street lighting (from about a few thousandths of a candela per square meter to a few candelas per square meter; according to one definition, 10⁻³ cd m⁻² to 3 cd m⁻²). In this mesopic range both rods and cones contribute to vision, and their contributions change with lighting level and direction of view (for foveal vision, i.e., looking straight ahead, photopic photometry seems to hold even at low light levels). For peripheral visual angles, brightness perception and the perception of an object (a signal, sign, or obstacle in a nighttime driving situation) seem to have different spectral responsivities. In driving situations the necessary reaction time of the driver is an important parameter; thus experiments are going on to define a photometric system based on reaction-time investigations.

In indoor situations, apart from the necessary level of illumination, the observed glare determines whether an environment will be accepted as pleasing or annoying. Illuminating engineering distinguishes between two types of glare: disability glare reduces visibility, while discomfort glare is just an annoying experience without influencing short-term task performance. An interesting question is the spectral sensitivity to discomfort glare, as it can influence not only indoor but also outdoor activity. Preliminary experiments seem to show that luminance sensitivity and discomfort glare sensitivity have different spectral distributions; glare sensitivity seems to peak at shorter wavelengths.

The above might be related to a further question, lying already at the boundaries of photometry, but one that has to be considered in photometric design and measurement: the daily and yearly (circadian and seasonal) rhythms of human activity coupled to hormone levels. These are influenced by light, as, for example, melatonin production is influenced by light exposure. Physiological investigations have shown that the suppression of melatonin production has a maximum around 460 nm and might be coupled to a radiation-sensitive ganglion cell in the retina. Whether the discomfort sensation is mediated via the same neural pathway or via a visual one has not yet been decided. But photometry has to take these effects into consideration, and in the future measurement methods and instruments to determine them will have to be developed.

Advances in Photometric Measurements

Primary Standards

The main concern in photometry is that the uncertainty of photometric measurements is still much higher than in other branches of physics. This is partly due to the higher uncertainty in radiometry, and partly to the growth of uncertainty along the chain of uncertainty propagation from the national laboratory to the workshop-floor measurement. In national standards laboratories, very sophisticated systems are used to determine the power of the incoming radiation, and elaborate spectroradiometric techniques are then used to evaluate the radiation in the form of light, i.e., to perform photometric measurements (e.g., the NIST CIRCUS equipment, where multiple laser sources are used as power sources, together with highly sophisticated methods to produce a homogeneous, nonpolarized radiation field for the calibration of secondary photometric detectors).

There is, however, also a method to supply end users with absolute detectors for the visible part of the spectrum. Modern high-end Si photovoltaic cells have internal quantum efficiencies of over 99.9% in the visible part of the spectrum. Reflection losses at the silicon surface are minimized by using three or more detectors arranged in a trap configuration, where the light reflected from one detector is fed to the second one, from there to the third one, and eventually to some further ones. In a three-detector configuration, as shown in Figure 6, the light from the third detector is reflected back to the second and from there to the first one. As every detector reflects only a small amount of radiation, by the fifth reflection practically all the radiation has been absorbed and contributes to the electric signal. Such trap detectors have an almost 100% quantum efficiency in the visible part of the spectrum, and can be used as photometric detectors if a well-designed color-correcting filter is applied in front of the detector.

Figure 6 Schematic layout of a three Si-cell trap detector: light comes in from the left, is first partly absorbed and partly reflected on Detector 1, then on Detector 2, then on Detector 3, from where it is reflected back to Detectors 2 and 1.
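The absorption buildup in a trap can be captured in a one-line model. The sketch below is a minimal illustration under a strong simplifying assumption, namely that every silicon surface specularly reflects the same fraction of the light reaching it; the 30% reflectance used is an illustrative round number, not a measured value for a real cell.

```python
# Fraction of the incoming beam absorbed by a trap detector in which the
# light meets `reflections` successive surfaces, each reflecting `rho`.
def trap_absorptance(rho: float, reflections: int = 5) -> float:
    # After k reflections the unabsorbed remainder is rho**k, so the trap
    # as a whole absorbs 1 - rho**reflections of the incident power.
    return 1.0 - rho ** reflections

# Three-cell trap (five reflections) with an illustrative 30% per-surface
# reflectance: all but about 0.24% of the beam is absorbed.
print(trap_absorptance(0.30, 5))   # -> 0.99757
```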

Secondary Type Measurements

In practical photometry the three most important quantities to be measured are the total luminous flux of different lamps, the illuminance in a plane, and the luminance.

Light source measurement

Total luminous flux. The two methods of measuring total luminous flux are to use a goniophotometer or a photometer (Ulbricht) sphere. In goniophotometry, recent years have not brought major breakthroughs; the automation of the systems has improved, but the principles are unchanged. The integrating sphere photometer (a sphere with an inner white diffuse coating, where the lamp is in the middle of the sphere) used to be a simple piece of equipment for comparing the total luminous flux of lamps against flux standards. In recent years a new technique has been developed at NIST (USA). This enables the absolute measurement of luminous flux from an illuminance measurement; the fundamentals of the new arrangement are shown in Figure 7. The test lamp is, as usual, in the middle of the sphere, but now light from an external source is also introduced into the sphere. An illuminance meter measures the flux entering from this source. The sphere detector compares the two signals (y_i and y_e). Knowing the absolute characteristics of the sphere (a difficult measurement), one can determine the total luminous flux of the test lamp using the absolute illuminance value. As illuminance is easily determined from luminous intensity (from eqns [3] and [5] one gets, with dΩ = dA/r², that E = I/r², where r is the distance between the source and the illuminated surface, assumed perpendicular to the direction to the source), this technique permits the total luminous flux scale to be derived from illuminance or luminous intensity measurement using an integrating sphere.

Figure 7 Arrangement of the absolute integrating sphere system developed at NIST for detector-based total luminous flux calibration. By permission of IESNA, from Ohno Y and Bergman R (2003) Detector-referenced integrating sphere photometry for industry. J. IES Summer: 21–26.

Luminous intensity of LEDs. Another major breakthrough achieved during the past years was the unified measurement of LED intensity. Light-emitting diodes have become important light sources for large-scale signaling and signing, and it is foreseen that they will become important contributors in every field of light production (from car headlamps to general illumination). The most fundamental parameter of the light of an LED is its luminous intensity. The spatial power distribution of LEDs is usually collimated, but LEDs often 'squint', as seen in Figure 8. In the past, some manufacturers measured the luminous intensity in the direction of maximum emission, others used the direction of the optical axis for this quantity. The highly collimated character of the radiation made measurements in the far field rather difficult. Therefore, the CIE recommended a new term and measuring geometry: average LED intensity, measured under one of two standard conditions, as shown in Figure 9. The detector has to be set in the direction of the LED mechanical axis (discussions are still going on as to what the reference direction should be for modern surface-mounted LEDs, for which the mechanical axis is ill-defined; the normal to the base plane could be a better reference direction). The detector has to have an exactly 1.00 cm² circular aperture, and the distance between this aperture and the tip of the LED is d = 0.316 m for Condition A and d = 0.100 m for Condition B (these two distances, with the 1.00 cm² detector area, provide 0.001 sr and 0.01 sr opening angles, respectively). Recent international round-robins have shown that, based on the new recommendations, the disagreement between different laboratories has decreased from the 10–20% level to 1–2%. The remaining difference is mainly due to the fact that LEDs emit in narrow wavelength bands, and the transfer of the calibration value for the 100 mm² detector from a white (CIE Standard Illuminant A color temperature) incandescent lamp to the narrow-band LED emission is still uncertain, mainly due to stray-light effects in the spectral responsivity and emission measurements.

Figure 8 Spatial light distribution of an LED: (a) the distribution of a 'squinting' LED in a plane including the optical axis; (b) the light distribution in a plane perpendicular to the optical axis. By permission of the Commission Internationale de l'Eclairage, from the publication 'Measurement of LEDs', CIE 127-1997; CIE publications are obtainable from the CIE Central Bureau: Kegelgasse 27, A-1033 Wien, Austria.

Figure 9 Schematic diagram of the CIE standard conditions for the measurement of average LED intensity. Distance d = 0.316 m for Condition A and d = 0.100 m for Condition B. By permission of the Commission Internationale de l'Eclairage, from the publication 'Measurement of LEDs', CIE 127-1997.
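The two standard conditions translate into solid angles via Ω = A/d², and a measured illuminance E at the aperture gives the average LED intensity as I = E d². The sketch below reproduces these numbers; the 5 lx reading is a made-up value for illustration.

```python
A = 1.00e-4  # m^2, the 1.00 cm^2 CIE aperture

for name, d in (("Condition A", 0.316), ("Condition B", 0.100)):
    omega = A / d ** 2                 # solid angle subtended, Omega = A/d^2
    print(f"{name}: d = {d:.3f} m -> Omega = {omega:.4f} sr")

E = 5.0       # lx measured at the aperture (made-up reading)
d = 0.100     # m, Condition B distance
print(f"average LED intensity = {E * d ** 2:.3f} cd")   # -> 0.050 cd
```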

Luminance distribution measurement

The human observer sees luminance (and color) differences. Thus for every illuminating engineering design task the luminance distribution in the environment is of utmost importance. Traditionally this was measured using a spot-luminance meter, aiming the device in a few critical directions. The recent development of two-dimensionally sensitive charge-coupled device (CCD) arrays (and other systems, e.g., MOSFET, CMOS, charge injection device (CID), and charge imaging matrix (CIM) arrays; as the CCD technology provides the best performance at the time of writing, we will refer to two-dimensional electronic image-capture devices as CCD cameras) has opened the possibility of using an image-capturing camera for luminance distribution measurements. Such measurements are badly needed in display calibration, near-field photometry (photometry in planes nearer than those in which the inverse square law holds), testing of car headlamp light distribution, and glare and homogeneity studies indoors and outdoors. Solid-state cameras have the big advantage over older vacuum-tube image-capturing devices that the geometric position and alignment of the single pixels is well defined and stays constant. Nowadays CCD image detector chips are mass produced, and one can get a variety of such devices and cameras, from small-resolution (few thousand pixels) devices up to cameras with tens of megapixels. Detectors are now available with internal intensification, enabling measurements down to intensity levels of a few photons per second. The main problems with these detectors are as follows:

- Spectral and absolute nonuniformities of the single pixels (see Figure 10, where a spatial homogeneity map of a two-dimensional CCD array detector is shown); the irregular 3% sensitivity variation over the surface of the detector is negligible for imaging purposes, but has to be corrected for photometric measurements. Even if the receptor chip had an absolutely homogeneous sensitivity, there would be a drop in response from the middle of the imaging area to the borders: light reaching the edges of the detector arrives at an oblique angle, and this produces a decrease of sensitivity with cos⁴ α, where α is the angle of incidence, measured between the line from the middle of the lens to the given pixel and the surface normal of the detector.

- Aliasing effects, if the pixel resolution is not high enough to render straight lines as such when they are not in line with a pixel row or column.

- Nonlinearity and cross-talk among adjacent pixels. Figure 11 shows the so-called 'inverse gamma' characteristic of a CCD camera. The digital electronic output of the camera follows a Y = E^g type function, where Y is the output DAC (digital-to-analog converter) value, E is the irradiance of the pixel, and g is the exponent (this description comes from the film industry, where the film density depends exponentially on the irradiation). Display devices usually have a nonlinear input (DAC value)–output (luminance) characteristic, and the camera's inverse gamma corrects for this output-device characteristic. This is, however, not wanted if the camera is used for photometric measurements; the built-in nonlinearity has to be corrected in the evaluating software.

- To be able to perform photometric measurements, the camera has to have a spectral responsivity corresponding to the CIE V(λ) function. Many cameras have built-in filters to achieve this – eventually also red and blue filters to be able to capture color – but the color correction of cameras in which the correction is made by small adjacent filter chips is usually very poor. Figure 12 shows the spectral sensitivity curve of a commercial digital photographic camera; the output signal is produced by an internal matrix transformation of the signals produced by adjacent pixels equipped with different color filters. Figure 13 shows the spectral sensitivity of a CCD camera specially designed for photometric measurements. Naturally, meaningful photometric measurements can be made only with such a camera. Very often, however, an approximate luminance distribution is enough, and then a picture captured by a digital photographic camera, plus one luminance measurement of a representative object for absolute calibration, suffices.

The above nonspectral systematic errors can be corrected by appropriate software, so a CCD camera photometer, as shown schematically in Figure 14, is well suited to measuring display characteristics and indoor and outdoor luminance distributions. The present challenge for illuminating engineering is how the many millions of luminance values can be evaluated to arrive at meaningful light measurement data. The real challenge will come when visual science provides better hints as to how the human visual system evaluates the illuminance distribution on the retina, and instrument manufacturers become able to capture signals corresponding to those produced by the receptor cells and to provide the algorithms our brain uses to arrive at brightness, lightness, color, luminance contrast, and glare types of output information.

Figure 10 Spatial homogeneity of a two-dimensional CCD array.

Figure 11 'Inverse gamma' characteristic of a commercial digital photographic camera; measurement points are shown for different speed settings, and the curve is a model function representative of the camera response.

Figure 12 Spectral sensitivity of a commercial digital photographic camera.

Figure 13 Spectral sensitivity of a CCD camera specially designed for photometric measurements. Kindly supplied by INPHORA Inc.

Figure 14 Cross-section of a photometric CCD camera.
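The corrections listed above can be summarized in a few lines of code. The sketch below is a minimal illustration for a monochrome camera, assuming a fitted power-law exponent g, a measured flat-field map, and known per-pixel incidence angles; all names and values are hypothetical.

```python
import numpy as np

def dac_to_relative_luminance(dac, flat, alpha, g=0.45):
    """dac   : raw camera output (DAC values), 2-D array
    flat  : measured pixel-nonuniformity map, normalized to 1.0
    alpha : per-pixel angle of incidence (radians) from the lens center
    g     : fitted exponent of the camera's Y = E^g response curve
    """
    linear = dac.astype(float) ** (1.0 / g)  # undo the inverse-gamma encoding
    linear /= flat                           # pixel-to-pixel sensitivity map
    linear /= np.cos(alpha) ** 4             # cos^4 falloff toward the edges
    return linear

# One absolute spot-luminance reading L_ref of a reference patch then scales
# the whole image: L = L_ref * linear / linear[ref_row, ref_col]
```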

Concluding Remarks

Advances in optical instrumentation, in the field of light sources as well as detectors – coupled with the possibilities of modern digital computation (eventually, in the future, making increasing use of neural networks) – already provide many new measurement solutions, and further ones are certainly under way. The use of LEDs requires the rethinking of many classical illuminating engineering concepts, from visibility and glare evaluation and evenness of illumination to color rendering. All of these are coupled with problems in basic photometry. Thus, there is a need to re-evaluate the concepts used in design techniques. The new area-sensitive detectors provide methods of determining classical photometric quantities of entire visual fields in one shot, but they already foreshadow the development of new quantities that correlate better with visual perceptions.

List of Units and Nomenclature

Terms marked with an asterisk (*) refer to definitions published by the CIE in the International Lighting Vocabulary, CIE 17.4:1986, where further terms and definitions related to light and lighting are to be found.

Candela*: SI unit of luminous intensity: The candela is the luminous intensity, in a given direction, of a source that emits monochromatic radiation of frequency 540 × 10¹² hertz and that has a radiant intensity in that direction of 1/683 watt per steradian (16th General Conference on Weights and Measures, 1979). 1 cd = 1 lm sr⁻¹.


Equivalent luminance*: Luminance of a comparison field in which monochromatic radiation of frequency 540 × 10¹² Hz has the same brightness as the field considered under the specified photometric conditions of measurement; the comparison field must have a specified size and shape, which may be different from that of the field considered. Unit: cd m⁻². Notes: 1. Radiation at a frequency of 540 × 10¹² Hz has a wavelength in standard air of 555.016 nm. 2. A comparison field may also be used in which the radiation has any relative spectral distribution, if the equivalent luminance of this field is known under the same conditions of measurement.

Far-field photometry: Photometry where the inverse square law is valid.

Flicker photometer*: Visual photometer in which the observer sees either an undivided field illuminated successively, or two adjacent fields illuminated alternately, by the two sources to be compared, the frequency of alternation being conveniently chosen so that it is above the fusion frequency for colours but below the fusion frequency for brightnesses.

Fovea*: Central part of the retina, thin and depressed, which contains almost exclusively cones and forms the site of most distinct vision. Note: the fovea subtends an angle of about 0.026 rad (1.5 degrees) in the visual field.

Goniophotometer*: Photometer for measuring the directional light distribution characteristics of sources, luminaires, media, or surfaces.

Illuminance*: Quotient of the luminous flux dΦ_v incident on an element of the surface containing the point, by the area dA of that element. Equivalent definition: the integral, taken over the hemisphere visible from the given point, of the expression L_v cos θ dΩ, where L_v is the luminance at the given point in the various directions of the incident elementary beams of solid angle dΩ, and θ is the angle between any of these beams and the normal to the surface at the given point:

E_v = dΦ_v/dA = ∫_{2π sr} L_v cos θ dΩ

Unit: lx = lm m⁻².

Inverse square law: The illuminance at a point on a surface varies directly with the luminous intensity of the source, and inversely as the square of the distance between the source and the point, if the source is seen as a point source.

Lumen*: SI unit of luminous flux: luminous flux emitted in unit solid angle (steradian) by a uniform point source having a luminous intensity of 1 candela (9th General Conference on Weights and Measures, 1948). Equivalent definition: luminous flux of a beam of monochromatic radiation whose frequency is 540 × 10¹² hertz and whose radiant flux is 1/683 watt.

Luminance*: Quantity defined by the formula

L = ∂²Φ/(∂A cos θ ∂Ω)

where ∂Φ is the luminous flux transmitted by an elementary beam passing through the given point and propagating in the solid angle dΩ containing the given direction, dA is the area of a section of that beam containing the given point, and θ is the angle between the normal to that section and the direction of the beam. Unit: cd m⁻².

Luminous intensity*: Quotient of the luminous flux dΦ_v leaving the source and propagated in the element of solid angle dΩ containing the given direction, by the element of solid angle:

I_v = dΦ_v/dΩ

Unit: cd = lm sr⁻¹.

Lux*: SI unit of illuminance: illuminance produced on a surface of area 1 square meter by a luminous flux of 1 lumen uniformly distributed over that surface. 1 lx = 1 lm m⁻². Note: non-metric units: lumen per square foot (lm ft⁻²) or footcandle (fc) (USA) = 10.764 lx.

Near-field photometry: Photometry made in the vicinity of an extended source, so that the inverse square law is not valid.

Photometer (or Ulbricht) sphere*: A hollow sphere, whitened inside. Owing to the internal reflections in the sphere, the illuminance on any part of the sphere's inside surface is proportional to the luminous flux entering the sphere, or produced inside the sphere by a lamp. The illuminance of the internal sphere wall is measured via a small window.

Pixel: The individual picture elements of an image, or elements in a display, that can be addressed individually.

Radiance*: Quantity defined by the formula

L = ∂²Φ/(∂A cos θ ∂Ω)

where ∂Φ is the radiant flux transmitted by an elementary beam passing through the given point and propagating in the solid angle dΩ containing the given direction, dA is the area of a section of that beam containing the given point, and θ is the angle between the normal to that section and the direction of the beam. Unit: W m⁻² sr⁻¹.


Retina*: Membrane situated inside the back of the eye that is sensitive to light stimuli; it contains photoreceptors, the cones and the rods, and nerve cells that interconnect and transmit to the optic nerve the signals resulting from stimulation of the photoreceptors.

Spectral*: An adjective that, when applied to a quantity X pertaining to electromagnetic radiation, indicates either that X is a function of the wavelength λ, symbol X(λ), or that the quantity referred to is the spectral concentration of X, symbol X_λ = dX/dλ. X_λ is also a function of λ, and in order to stress this it may be written X_λ(λ) without any change of meaning.

Spectral luminous efficiency function (for photopic vision, V(λ); for scotopic vision, V′(λ))*: Ratio of the radiant flux at wavelength λ_m to that at wavelength λ such that both radiations produce equally intense luminous sensations under specified photometric conditions, λ_m being chosen so that the maximum value of this ratio is equal to 1. Unless otherwise indicated, the values used for the spectral luminous efficiency in photopic vision are the values agreed internationally in 1924 by the CIE (Compte Rendu 6e session, p. 67), completed by interpolation and extrapolation (Publications CIE No. 18 (1970), p. 43 and No. 15 (1971), p. 93), and recommended by the International Committee for Weights and Measures (CIPM) in 1972. For scotopic vision, the CIE in 1951 adopted, for young observers, the values published in Compte Rendu 12e session, Vol. 3, p. 37, ratified by the CIPM in 1976. These values define respectively the V(λ) and V′(λ) functions.

Total luminous flux: Luminous flux of a source emitted into 4π steradians.

Trap detector: Detector array prepared from detectors of high internal quantum efficiency, where the radiation reflected from the first detector is directed to the second one, and so on, so that practically all the radiation is absorbed by one of the detectors (the radiation is 'trapped' in the detector system).

See also

Displays. Incoherent Sources: Lamps.

Further Reading

Commission Internationale de l'Eclairage (1983) The Basis of Physical Photometry. CIE 18.2:1983.
Commission Internationale de l'Eclairage (2003) Proceedings of the CIE Symposium 2002 on Temporal and Spatial Aspects of Light and Colour Perception and Measurement, August 2002. Publ. CIE x025:2003.
Cromer CL, Eppeldauer G, Hardis JE, Larason TC and Parr AC (1993) National Institute of Standards and Technology detector-based photometric scale. Applied Optics 32: 2936–2948.
DeCusatis C (ed.) (1997) Handbook of Applied Photometry. Woodbury, NY: American Institute of Physics.
Kaiser PK and Boynton RM (1996) Human Color Vision, 2nd edn. Washington, DC: Optical Society of America.
McCluney WR (1994) Introduction to Radiometry and Photometry. Boston, MA and London: Artech House.
Ohno Y and Bergman R (2003) Detector-referenced integrating sphere photometry for industry. J. IES Summer: 21–26.
Rea MS (ed.) (2000) The IESNA Lighting Handbook: Reference and Application, 9th edn. New York: Illuminating Engineering Society of North America.
Walsh JWT (1953) Photometry. London: Constable.

Scatterometry

J C Stover, The Scatter Works, Inc., Tucson, AZ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Scattered light is a limiting source of optical noise in many advanced optical systems, but it can also be a sensitive indicator of optical component quality. Consider the simple case of a telescope used to image a dim star against a dark background. If light from a bright source (such as the moon, located well out of the field of view) enters the telescope, it will scatter from the interior walls and the imaging optics themselves. Some of this light eventually reaches the detector and creates a dim background haze that washes out the image of the distant star. A good telescope design accounts for these effects by limiting potential scatter propagation paths and by requiring that critical elements in the optical system meet scatter specifications. This requires a careful system analysis and a means to quantify the scattering properties of the telescope components. This article discusses modern techniques for quantifying, measuring, and analyzing scattered light, and reviews their development.


Like many scientific advances, the move of scatterometry from an art to a reliable metrology was made in a series of small hops (not always in the same direction), rather than in a single leap. It started in 1961, when a paper by Hal Bennett and Jim Porteous reported measurements made by gathering most of the light scattered from front-surface mirrors and normalizing this signal by the much larger specular reflection. They defined this ratio as the total integrated scatter (TIS) and, using a scalar diffraction theory result drawn from the radar literature, related it to the surface root mean square (rms) roughness. By the mid-1970s, several angle-resolved scatterometers had been built as research tools in university, government, and industry labs. Unfortunately, instrument operation and data manipulation were generally poor, and meaningful comparison measurements were virtually impossible due to instrument differences, sample contamination, and confusion over what parameters should be compared. Analysis of scatter data to characterize sample surface roughness was the subject of many publications. A derivation of what is commonly called the BRDF (bidirectional reflectance distribution function) was published by Nicodemus and co-workers at the National Bureau of Standards (now the National Institute of Standards and Technology, or NIST) in 1970, but it did not gain common acceptance as a way to quantify scatter measurements until the late 1980s, when the advent of small computers, combined with inspection requirements for defense-related optics, dramatically stimulated the development of scatter metrology. Commercial laboratory instrumentation became available that could measure and analyze as many as 50 to 100 samples a day, and the number (and sophistication) of measurement facilities increased dramatically. The first ASTM standards were published (TIS in 1987 and BRDF in 1991), but it was still several years before most publications correctly used these quantifying terms. Government defense funding decreased dramatically in the early 1990s, following the end of the Cold War, but the economic advantages of scatter metrology and analysis for space applications and in the rapidly advancing semiconductor industry continued to push the state of the art forward. The following sections detail how scatter is quantified when related to area-distributed (roughness) and localized (pit/particle) generating sources. Instrumentation and the use of scattering models are also briefly reviewed.

Quantifying Scattered Light

Scatter signals can be easily quantified as scattered light power per unit solid angle (in watts per steradian); however, in order to make the results more meaningful, these signals are usually normalized, in some fashion, by the light incident on the scatter source. The three ways commonly employed to do this are defined below.

If the scattering feature in question is uniformly distributed across the illuminated spot on the sample (such as surface roughness), then it makes sense to normalize the collected scattered power (in watts per steradian) by the incident power. This simple ratio, which has units of inverse steradians, was commonly referred to as 'the scattering function.' Although this term is occasionally still found in the literature, it has been generally replaced by the closely related BRDF, which is defined as the differential ratio of the sample radiance to its irradiance. After some simplifying assumptions are made, this reduces to the original scattering function with a cosine of the polar scattering angle in the denominator. The BRDF, defined in this manner, has become the standard way to report angle-resolved scatter from features that uniformly fill the illuminated spot. The cosine term results from the fact that NIST used radiometric terms to define the BRDF:

BRDF = (P_s/Ω) / (P_i cos θ_s)   [1]

The scatter function is often referred to as the 'cosine-corrected BRDF' and is simply equal to the BRDF multiplied by the cosine of the polar scattering angle. Figure 1 gives the geometry for the situation, and defines the polar and azimuthal scattering angles (θ_s and φ_s), as well as the solid collection angle (Ω). Other common abbreviations are BSDF, for the more generic bidirectional scatter distribution function, and BTDF, for quantifying transmissive scatter.

Figure 1 Scatter analysis uses standard spherical coordinates to define terms.
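As a small worked example of eqn [1], the sketch below converts a hypothetical measured scatter power into BRDF and the cosine-corrected scatter function; every number in it is invented for illustration.

```python
import math

def brdf(p_s, omega, p_i, theta_s_deg):
    """Eqn [1]: BRDF = (Ps/Omega) / (Pi * cos(theta_s)), in sr^-1."""
    return (p_s / omega) / (p_i * math.cos(math.radians(theta_s_deg)))

# Illustrative numbers only: 10 nW collected in a 0.001 sr aperture at a
# 30 degree polar scattering angle, with 1 mW incident on the sample.
b = brdf(p_s=1e-8, omega=1e-3, p_i=1e-3, theta_s_deg=30.0)
print(f"BRDF = {b:.3e} sr^-1")                                   # ~1.2e-2
print(f"cosine-corrected = {b * math.cos(math.radians(30.0)):.3e} sr^-1")
```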


Integration of the scatter signal over much of the scattering hemisphere allows calculation of the TIS, as the ratio of the scatter signal to the reflected specular power. This integration is usually carried out experimentally in such a way that both the incident beam and the reflected specular beam are excluded. In the most common TIS situation, the beam is incident at a small angle near the surface normal, and the integration is done from small values of θ_s to almost 90 degrees. If the fraction of light scattered from the specular reflection is small, and if the scatter is caused by surface roughness, then it can be related to the rms surface roughness of the reflecting surface. As a ratio of powers, the TIS is a dimensionless quantity. The normalization is by P_r (instead of P_i) because reductions in scatter caused by low reflectance do not influence the roughness calculation. The pertinent relationship is given below, where σ is the rms roughness and λ is the light wavelength:

TIS = P_s/P_r ≈ (4πσ/λ)²   [2]
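Inverting eqn [2] gives a quick roughness estimate from a TIS measurement. The sketch below is a minimal illustration; the numbers are invented, and the bandwidth-limit caveats discussed later in this article apply.

```python
import math

def rms_roughness_nm(tis, wavelength_nm):
    """Invert eqn [2], TIS ~ (4*pi*sigma/lambda)^2, for the rms roughness.
    Valid only for smooth, clean, front-surface reflectors; the result is
    bandwidth limited (see the discussion later in this article)."""
    return (wavelength_nm / (4.0 * math.pi)) * math.sqrt(tis)

# Illustrative: a mirror with TIS = 1e-4 measured at 633 nm.
print(f"sigma = {rms_roughness_nm(1e-4, 633.0):.2f} nm")   # -> 0.50 nm
```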

Of course, all scatter measurements are integrations over a detector collection aperture, but the TIS designation is reserved for situations where the aim is to gather as much scattered light as possible, while 'angle-resolved' designs are created to gain information from the distribution of the scattered light. Notice that TIS values become very large when measured from a diffuse surface, where the specular reflection is very small. Although the TIS can be measured for any surface, the diffuse reflectance (equal to P_s/P_i) would often be more appropriate for diffuse surfaces. The various restrictions associated with relating TIS to rms roughness are detailed below.

Scatter from discrete features, such as particles and pits, which do not completely fill the illuminated spot, must be treated differently. This is because changes in spot size, with no corresponding change in total incident power, will change the incident intensity (watts per unit area) at the feature and thus also change the scatter signal (and BRDF) without any corresponding change in the scattering feature. Clearly this is unacceptable if the object is to characterize the defect with scatter measurements. The solution is to define another quantification term, known as the differential scattering cross-section (DSC), where the normalization is the incident intensity at the feature (the units of the DSC are area per steradian). Because this quantity was not defined in radiometric terms, the cosine of the polar scattering angle is not in the definition. The same geometrical definitions, found in Figure 1, also apply for the DSC:

DSC = (P_s/Ω) / I_i   [3]

If the DSC is integrated over the solid angle associated with a collection aperture, then the value has units of area. Because relatively small-area focused laser beams are often used as a source, the area is most commonly given in micrometers squared. These three scatter parameters – the BRDF, the TIS, and the DSC – are obviously functions of system variables such as geometry, scatter direction (both in and out of the incident plane), incident wavelength, and polarization, as well as feature characteristics. It is the dependence of the scatter signal on these system parameters that makes scatter models useful for optimizing instrument designs. It is their dependence on feature characteristics that makes scatter measurement a useful metrology tool. A key point needs to be stressed: when applied appropriately, TIS, BRDF, and DSC are absolute terms, not relative terms. The DSC of a 100 nm polystyrene latex (PSL) sphere in a given direction for a given source is a fixed value, which can be repeatedly measured and even accurately calculated from models. The same is true for TIS and BRDF values associated with surface roughness of known statistics. Scatter measuring instruments, such as particle scanners or lab scatterometers, can be calibrated in terms of these quantities. As has already been pointed out, the user of a scanner will almost always be more interested in characterizing defects than in the resulting scatter values, but the underlying instrument calibration can always be expressed in terms of these three quantities. This is true even though designers and users may find it convenient to use other metrics (such as PSL spheres) as a way to relate calibration.
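A corresponding sketch for eqn [3]: the net (background-subtracted) scatter power from a discrete feature is normalized by the incident intensity at the feature. The spot size, powers, and aperture below are invented values for illustration.

```python
def dsc(p_s_net, omega, p_i, spot_area_um2):
    """Eqn [3]: DSC = (Ps/Omega)/Ii, with Ii = Pi / (illuminated spot area).
    Result in um^2 sr^-1. p_s_net should already have the roughness
    background subtracted, as described later in this article."""
    incident_intensity = p_i / spot_area_um2     # W per um^2 at the feature
    return (p_s_net / omega) / incident_intensity

# Illustrative: 0.1 nW net signal in a 0.01 sr aperture, with 1 mW focused
# into a 50 um diameter spot (~1963 um^2).
print(f"DSC = {dsc(1e-10, 1e-2, 1e-3, 1963.0):.2e} um^2/sr")   # ~2.0e-2
```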

Angle Resolved Scatterometers

The diagram in Figure 2 shows the most common scatterometer configuration. The source is fixed and the sample is rotated to the desired incident angle. The receiver is then rotated about the sample during scatter measurement. Most commonly, scatterometers operate just in the plane of incidence; however, instruments capable of measuring at virtually any location in either the reflective or transmissive hemisphere have been built. Although dozens of instruments have been built following this general design, other configurations are in use. For example, the source and receiver may be fixed and the sample rotated so that the scatter pattern moves past the receiver. This is easier mechanically than moving the receiver at the end of an arm, but it complicates analysis because the incident angle and the observation angle change simultaneously. Another combination is to fix the source and sample together, at constant incident angle, and rotate this unit (about the point of illumination on the sample) so that the scatter pattern moves past a fixed receiver. This has the advantage that a long receiver–sample distance can be used without motorizing a long (heavy) receiver arm. It has the disadvantage that heavy (or multiple) sources are difficult to deal with. Other configurations, with everything fixed, employ several receivers to merely sample the BSDF and display a curve fit of the resulting data. This is an economical solution if the BSDF is relatively uniform, without isolated diffraction peaks. The goniometer section of a real instrument, similar to that of Figure 2, is shown in Figure 3.

Figure 2 Basic elements of an incident-plane scatterometer are shown: laser, chopper, spatial filter, lens, reference detector, focusing mirror, sample, beam dump, and a receiver mounted on a goniometer; the dotted line indicates signature noise from the final mirror.

Figure 3 The author's scatterometer, which is similar to the diagram of Figure 2. In this case a final focusing lens is introduced to produce a very small illuminated spot on the silicon wafer sample. The white background was introduced to make the instrument easier to view.

Computer control of the measurement is essential to maximize versatility and minimize measurement time. The software required to control the measurement, plus the display and analysis of the data, can be expected to be a significant portion of the total instrument development cost. The following reviews typical design features (and issues) associated with the source, sample mount, and receiver components.

The source in Figure 2 is formed by a laser beam that is chopped, spatially filtered, expanded, and finally brought to a focus on the receiver path. The beam is chopped to reduce both optical and electronic noise. This is usually accomplished through the use of lock-in detection in the electronics package, which suppresses all signals except those at the chopping frequency. Low-noise, programmable-gain electronics are essential to reducing system noise. The reference detector is used to allow the computer to ratio out laser power fluctuations and, in some cases, to provide the necessary timing signal to the lock-in electronics. Polarizers, wave plates, and neutral density filters are also commonly placed prior to the spatial filter. The spatial filter removes source scatter from the laser beam and presents a point source, which is imaged by the final focusing element, in this case a mirror, to the detector zero position. Focusing the beam at this location allows near-specular scatter to be more easily measured. Lasers are convenient sources, but are not necessary. Broadband sources are often required to meet a particular application or to simulate the environment where a sample will be used. Monochromators and filters can be used to provide scatterometer sources of arbitrary wavelength. The noise floors with these tunable incoherent sources increase as the spectral bandpass is narrowed, but they have the advantage that the scatter pattern does not contain laser speckle.

The sample mount can be very simple or very complex. In principle, six degrees of mechanical freedom are required to fully adjust the sample. The order in which these stages are mounted affects the ease of use (and cost) of the sample holder. In practice, it often proves convenient to either eliminate, or occasionally duplicate, some of these degrees of freedom. In addition, some of these axes may be motorized to allow the sample area to be raster-scanned, to automate sample alignment, or to measure reference samples. As a general rule, the scatter pattern is insensitive to small changes in incident angle but very sensitive to small angular deviations from specular. Instrumentation should be configured to allow location of the specular reflection (or transmission) very accurately. Receiver designs vary, but changeable entrance apertures, bandpass filters, lenses, and field stops are generally positioned in front of the detector.

A serious measurement problem is getting light scattered by the instrument, called the instrument signature, confused with light scattered by the sample. An example of instrument signature is shown by the dotted line in Figure 2, which represents scatter from the final mirror. The signature is often measured in the straight-through (transmission) direction, multiplied by the measured specular reflectance, and then compared to the measured sample BRDF. Another issue is the fact that the measured BRDF is really the convolution of the receiver aperture with the actual (incremental) BRDF. When the scatter signal varies slowly across the aperture, the measurement is virtually identical to the true BRDF. Near the specular reflection, or at diffraction peaks, the differences between the measurement (or convolution) and the actual (incremental) BRDF can be huge. Measurements made using invisible sources, and measurements of curved samples, present additional problems. These problems and the issues of calibration and accuracy are covered in the Further Reading section at the end of this article.
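The signature comparison just described can be expressed as a simple per-angle check. The sketch below is a minimal illustration; the arrays and the 10× margin are illustrative assumptions, not a published criterion.

```python
import numpy as np

def signature_limited(sample_brdf, signature_brdf, reflectance, margin=10.0):
    """Flag scatter angles where the sample BRDF is not safely above the
    instrument signature (measured in transmission and scaled by the
    sample's specular reflectance, as described above)."""
    floor = reflectance * np.asarray(signature_brdf)
    return np.asarray(sample_brdf) < margin * floor   # True where suspect

# e.g. mask = signature_limited(brdf_meas, brdf_sig, reflectance=0.98)
```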

TIS Instruments

The two common methods of making TIS measurements are shown in Figures 4 and 5. The first is based on a hemispherical mirror (or Coblentz sphere) that gathers scattered light from the sample and images it onto the scatter detector. The specular beam enters and leaves the hemisphere through a small circular hole. The diameter of that hole defines the near-specular limit of the instrument. The reflected beam (not the incident beam) should be centered in the hole, because the BRDF will be symmetrical about it. Alignment of the hemispherical mirror is critical, and not trivial, in this approach. The second approach involves the use of an integrating sphere. A section of the sphere is viewed by a recessed detector. If the detector field of view (FOV) is limited to a section of the sphere that is not directly illuminated by scatter from the sample, then the signal will be proportional to the total scatter from the sample. Again, the reflected beam should be centered on the exit hole.

Figure 4 A diagram showing the Coblentz sphere approach to TIS measurements used in the early development of scatter instrumentation.

Figure 5 More modern TIS instruments make use of an integrating sphere approach, which is easier to align and does not suffer from problems associated with measuring high-angle scatter from the sample.

The Coblentz sphere method presents more signal to the detector; however, some of this signal is incident on the detector at very high angles. Thus, this approach tends to discriminate against high-angle scatter (which is not a problem for many samples). The integrating sphere is easier to align, but has a lower signal-to-noise ratio (less signal on the detector) and is more difficult to build in the IR, where uniform diffuse surfaces are harder to obtain. A common mistake with TIS measurements is to assume that, for near-normal incidence, the orientation between the source polarization and the sample is not an issue. TIS measurements made with a linearly polarized source on a grating at different orientations will quickly demonstrate this dependence. TIS measurements can be made very near the specular reflection by utilizing a diffusely reflecting plate with a small hole in it. A converging beam is reflected off the sample and through the hole. Scatter is diffusely reflected from the plate to a receiver designed to view the plate uniformly. The reflected power is measured by moving the plate so that the specular beam misses the hole, and then taking that measurement. The ratio of the two measurements gives the TIS. Measurements starting


closer than 0.1 degrees from specular can be made in this manner and it is an excellent way to check incoming optics or freshly coated optics for low scatter.

Analyzing Scatter from Surface Roughness

The preceding sections have concentrated on obtaining and quantifying accurate scatter data, but that leaves the question of what to do with the data once you have it. In rare situations you may be given a scatter (BRDF) specification – such as: the BRDF from the mirror must be less than 10⁻⁴ sr⁻¹ at 10 degrees from specular when measured at a wavelength of 633 nm incident at 5 degrees with an S-polarized source. Unfortunately this is very uncommon. If the issue is limiting scatter as a noise source, you will probably have to generate your own specification based on specific system requirements. More difficult, and often of more economic value, is the situation where scatter measurements are being used as a metrology to learn something about the sample characteristics – like roughness, or defect size and/or type. The relationship between the measured BRDF and reflector roughness statistics was a subject of intense interest from the mid-1970s through the early 1990s. Dozens of papers, and even some books, have been written on the subject, and it can only be outlined here. The relatively easy case of scatter from roughness on a clean, optically smooth, front-surface reflector was first published in 1975; however, it was several years before confirming experiments were completed. The deceptively simple relationship, based on vector perturbation theory, is shown below:

BRDF = (16π²/λ⁴) cos θ_i cos θ_s Q S(f_x, f_y)   [4]

Q is the polarization factor, determined by the material constants of the reflector as well as the system geometry. In many cases it is numerically about equal to the specular reflectance, and this approximation is often justified. Exact expressions are available in the literature. S(f_x, f_y) is the surface power spectral density function (or PSD). It may be thought of as roughness power (surface height variations squared) per unit spatial frequency (undulations per unit distance instead of per unit time). Integration of the PSD over spatial frequency space results in the mean square roughness over that band of frequencies. Taking the square root gives the root mean square (rms) roughness. Frequencies in both the x and y directions on the surface are involved, and they are defined by the well-known grating equations as:

f_x = (sin θ_s cos φ_s − sin θ_i)/λ  and  f_y = (sin θ_s sin φ_s)/λ   [5]

Thus eqn [4] becomes a model for surface roughness that allows BRDF measurement to be used to find and/or verify surface roughness specifications. Of course, there are exceptions. If the 'clean, optically smooth, front-surface reflector' limitations are violated, then the surface will have more than just roughness as a source of scatter, and the PSD found from eqn [4] will be too large. Obtaining the same PSD from BRDF measurements made at different wavelengths, or different polarizations, is an indication that the surface is scattering 'topographically' and that the PSD can be found using this technique. Because some scatter measurements can be made very rapidly, there are industry situations where scatter metrology offers a very fast means of monitoring surface quality. A little study of eqns [4] and [5] makes it clear that ranges of spatial frequencies in the PSD correspond directly to angular ranges of the BRDF. Scatter from a single spatial frequency corresponds to a single scatter direction.

Following the above discussion, it becomes clear why the pioneering integrated (TIS) scatter measurements could be used to produce surface rms values, as indicated by eqn [2]. Unfortunately, the result of eqn [4] was not available in 1961. Instead, eqn [2] was derived for the special case of a smooth surface with Gaussian statistics. Nobody was thinking about spatial bandwidths and angular limits. When striking differences were found between roughness measurements made by TIS and by profilometer, the Gaussian assumption became the 'whipping boy,' and TIS scatter measurements took an undeserved hit. In fact, integration of eqn [4] results in the TIS result given in eqn [2] under the generally true small-angle assumption that most of the scatter is close to the specular reflection. The differences make sense when the concept of spatial frequency bandwidths (or appropriate angular limits) is introduced. It becomes clear that different wavelengths, incident angles, and scatter collection angles will also generate different rms values (for the same surface), and why so much confusion resulted following the definition of TIS in terms of rms roughness. There is no such thing as a unique rms roughness for a surface, any more than there is a single spatial bandwidth for the PSD, or a single set of angles over which to integrate the BRDF. TIS measurements and rms measurements

INSTRUMENTATION / Scatterometry 323

should always be given with enough information to determine bandwidth limits.
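As a concrete illustration of how eqn [4] is used in practice, the following is a minimal sketch that inverts one measured BRDF point to a PSD value and computes a bandwidth-limited rms from a PSD. The function names and example numbers are ours, not from any standard; Q is approximated by the specular reflectance as suggested above, and an isotropic surface is assumed so that σ² = 2π ∫ f S(f) df.

import numpy as np

def psd_from_brdf(brdf, theta_i, theta_s, wavelength_um, Q):
    # Invert eqn [4]: S = BRDF * lambda^4 / (16 pi^2 cos(theta_i) cos(theta_s) Q).
    # Angles in radians; wavelength in micrometers, so S comes out in um^4.
    return (brdf * wavelength_um**4
            / (16 * np.pi**2 * np.cos(theta_i) * np.cos(theta_s) * Q))

def rms_roughness(f_um, S_iso_um4):
    # Bandwidth-limited rms: sigma^2 = 2*pi * integral of f*S(f) df over the band
    # (isotropic-surface assumption). Trapezoid rule written out for clarity.
    g = f_um * S_iso_um4
    sigma2 = 2 * np.pi * np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(f_um))
    return np.sqrt(sigma2)

# One BRDF point at 0.633 um, 5 deg incidence, 15 deg scatter, Q ~ reflectance 0.95:
S = psd_from_brdf(1e-2, np.radians(5), np.radians(15), 0.633, 0.95)
print(f"S = {S:.3e} um^4")

# A hypothetical isotropic PSD over a stated frequency band (1/um):
f = np.linspace(0.02, 2.0, 200)
S_band = 1e-7 * f**-2
print(f"rms over band = {rms_roughness(f, S_band) * 1e3:.2f} nm")

The point of the second function is exactly the bandwidth caveat above: change the integration limits and the reported rms changes, for the same surface.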

Measuring and Analyzing Scatter from Isolated Surface Features

Understanding scatter from discrete surface features has led to big changes in the entertainment business (CDs, DVDs, digitized music and films, etc.), as well as providing an important source of metrology for the semiconductor industry as it develops smaller, faster chips for a variety of modern uses. Thus, just about everybody in the modern world utilizes our understanding of scatter from discrete surface features. On the metrology side, the roughness signals described in the last section are a serious source of background noise that limits the size of the smallest defects that can be found.

Discrete surface features come in a dazzling array of types, sizes, materials, and shapes, and they all scatter differently. A 100 nm silicon particle scatters very differently from a 100 nm silicon oxide particle (even if they have the same shape), and a 100 nm diameter surface pit will have a different scatter pattern again. Models describing scatter from a variety of discrete surface features have been developed. Although many of these are kept confidential for competitive reasons, NIST offers some models publicly through a web site. Confirming a model requires knowing exactly what is scattering the light. To accomplish this, depositions of PSLs (polystyrene latex spheres) of known size are made on the surface. Scatter is measured from one or more spheres, a second measurement of the background scatter is subtracted, and the net BRDF is converted to DSC (differential scattering cross-section) units using the known (measured) illuminated spot size, as in the sketch below. Measurements of this type have been used to confirm discrete-feature scatter models. The model is then used to calculate scatter from different diameters and materials. Combined with the model for surface roughness to evaluate system noise, this capability allows signal-to-noise evaluation of different defect scanner designs.
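The BRDF-to-DSC conversion follows from the definitions: dividing the scattered power per solid angle by the incident irradiance, rather than by the incident power, multiplies the net BRDF by the illuminated spot area and the cos θ_s factor in the BRDF definition. A minimal sketch, with hypothetical names and illustrative numbers:

import math

def dsc_from_net_brdf(net_brdf_sr, spot_area_um2, theta_s_deg):
    # DSC (um^2/sr) = (feature BRDF - background BRDF) * A_spot * cos(theta_s),
    # which follows from BRDF = (dP/dOmega)/(P_i cos(theta_s)) with P_i = E * A_spot.
    return net_brdf_sr * spot_area_um2 * math.cos(math.radians(theta_s_deg))

# Hypothetical example: net BRDF of 5e-4 1/sr from a PSL deposition,
# a 50 um diameter illuminated spot, scatter measured 20 deg from normal.
spot_area = math.pi * (50.0 / 2) ** 2        # illuminated area in um^2
print(f"DSC = {dsc_from_net_brdf(5e-4, spot_area, 20.0):.3e} um^2/sr")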

Practical Industrial Instrumentation

Surface defects and particles smaller than 100 nm are now routinely found on silicon wafers using scatter instruments known as particle scanners. Thousands of these instruments (with price tags approaching a million dollars each, depending on type and use) are in daily use. These instruments are truly amazing: they inspect 200 mm wafers at a rate of one every thirty seconds, reporting feature location, approximate size, and in some cases even type (pit or particle) by analyzing scatter signals that last only about 100 nanoseconds. Similar systems are now starting to be used in the computer disk industry, and flat panel display inspection of surface features will follow.

Scanners are calibrated with PSLs of different sizes. Because it is impossible to identify all defect types, and because diameter has little meaning for irregularly shaped objects, feature 'size' is reported in 'PSL equivalent diameters.' This leads to confusing situations for multiple-detector systems, which (without some software help) would report different sizes for real defects that scatter much differently than PSLs. These difficulties have caused industry confusion similar to the 'Gaussian statistics/bandwidth limits' issues encountered before roughness scatter was understood. The publication of international standards relating to scanner calibration has reduced the level of confusion.

Scatter Related Standards

Early scatter-related standards for BRDF, TIS, and PSD calculations were written for ASTM (the American Society for Testing and Materials), but as the industrial need for these documents moved to the semiconductor industry, there was pressure to move the documents to SEMI (Semiconductor Equipment and Materials International). By 2004, the process of rewriting the ASTM documents in SEMI format was well underway in the Silicon Wafer Committee. Topics covered include: surface defect specification (M35), defect capture rate (M50), scanner specifications (M52), scanner calibration (M53), particle deposition testing (M58), BRDF measurement (ME1392), and PSD calculation (MF1811). This body of literature is probably the only place where all aspects of these related problems are brought together.

Conclusion

The bottom line is that scatter measurement and analysis has moved from an art to a working metrology. Dozens of labs around the world can now take the same sample and get about the same measured BRDF from it. Thousands of industrial surface scanners employing scattered-light signals are in use every day on every continent. In virtually every house in the modern world there is at least one entertainment device that depends on scatter signals. In short: scatter works.

See also

Scattering: Scattering Theory.

Further Reading

Bennett JM and Mattsson L (1995) Introduction to Surface Roughness and Scattering, 2nd edn. Washington, DC: Optical Society of America.
Bennett HE and Porteus JO (1961) Relation between surface roughness and specular reflectance at normal incidence. Journal of the Optical Society of America 51: 123.
Cady FM, Bjork DR, Rifkin J and Stover JC (1989) BRDF error analysis. Proceedings of SPIE 1165: 154–164.
Church EL and Zavada JM (1975) Residual surface roughness of diamond-turned optics. Applied Optics 14: 1788.
Doicu A, Eremin Y and Wriedt T (2000) Acoustic & Electromagnetic Scattering Analysis Using Discrete Sources. San Diego, CA: Academic Press.
Gu ZH, Dummer RS, Maradudin AA and McGurn AR (1989) Experimental study of the opposition effect in the scattering of light from a randomly rough metal surface. Applied Optics 28(3): 537.
Nicodemus FE, Richmond JC, Hsia JJ, Ginsberg IW and Limperis T (1977) Geometric Considerations and Nomenclature for Reflectance. NBS Monograph 160. Washington, DC: US Department of Commerce.
Scheer CA, Stover JC and Ivakhnenko VI (1998) Comparison of models and measurements of scatter from surface-bound particles. Proceedings of SPIE 3275.
Schiff TF, Stover JC and Bjork DR (1992) Mueller matrix measurements of scattered light. Proceedings of SPIE 1753: 269–277.
Stover JC (1975) Roughness characterization of smooth machined surfaces by light scattering. Applied Optics 14(8): 1796.
Stover JC (1995) Optical Scattering: Measurement and Analysis, 2nd edn. Bellingham, WA: SPIE.
Stover JC (2001) Calibration of particle detection systems. In: Diebold A (ed.) Handbook of Silicon Semiconductor Metrology, chap. 21. New York: Marcel Dekker.
Wolfe WL and Bartell FO (1982) Description and limitations of an automated scatterometer. Proceedings of SPIE 362-30.

Spectrometers

K A More, SenSyTech, Inc. Imaging Group, Ann Arbor, MI, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Spectrometers were developed after the discovery that glass prisms disperse light. Later, it was discovered that diffraction from multiple, equally spaced wires or fibers also disperses light. Huygens proposed his wave theory of light in the seventeenth century, and in the early nineteenth century Fraunhofer developed diffraction theory, which allowed the scientific development of diffraction gratings. These discoveries then led to the development of spectrometers. Light theory was sufficiently developed that spectrometer designs conceived over 100 years ago are still being used today. These theories and designs are briefly described below, along with comments on how current technology has improved upon them. This is followed by some examples of imaging spectrometers, which have wide spectral coverage, from 450 nm to 14 µm, and produce images in as many as 256 spectral bands.

The basic elements of a spectroscopic instrument are shown in Figure 1. The source, or more usually an image of the source, fills an entrance slit and the radiation is collimated by either a lens or mirror. The radiation is then dispersed, by either a prism or a grating, so that the direction of propagation of the radiation depends upon its wavelength. It is then brought to a focus by a second lens or mirror and the spectrum consists of a series of monochromatic images of the entrance slit. The focused radiation is detected, either by an image detector such as a photographic plate, or by a flux detector such as a photomultiplier, in which case the area over which the flux is detected is limited by an exit slit. In some cases the radiation is not detected at this stage, but passes through the exit slit to be used in some other optical system. As the exit slit behaves as a monochromatic source, the instrument can be regarded as a wavelength filter and is then referred to as a monochromator.

Prisms

The wavelength dependence of the index of refraction is used in prism spectrometers.


Figure 1 The basic elements of a spectroscopic instrument. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.

Figure 2 Elementary prism spectrometer schematic. W is the width of the entrance beam; Sp is the length of the prism face; and B is the prism base length. Reproduced with permission from The Infrared Handbook (1985). Ann Arbor, MI: Infrared Information Analysis Center.

Such an optical element disperses parallel rays or collimated radiation into different angles from the prism, according to wavelength. Distortion of the image of the entrance slit is minimized by the use of planewave illumination. Even with planewave illumination, the image of the slit is curved, because not all of the rays from the entrance slit can traverse the prism in its principal plane. The prism is shown in the position of minimum angular deviation of the incoming rays in Figure 2. At minimum angular deviation, maximum power can pass through the prism. For a prism adjusted to the position of minimum deviation:

r_1 = r_2 = A_p/2   [1]

and

i_1 = i_2 = (D_p + A_p)/2   [2]

where D_p is the angle of deviation, A_p the apex angle of the prism, r_1 and r_2 the internal angles of refraction, and i_1 and i_2 the angles of entry and exit. The angle of deviation D_p varies with wavelength. The resulting angular dispersion is defined as dD_p/dλ, while the linear dispersion is dx/dλ = F dD_p/dλ, where F is the focal length of the camera or imaging lens and x is the distance across the image plane.


Figure 3 Infrared spectrograph of the Littrow-type mount with a rock salt prism. Reproduced with permission from The Infrared Handbook (1985) Ann Arbor, MI: Infrared Information Analysis Center.

It can be shown that:

dD_p/dλ = [B/W][dn/dλ] = [dD_p/dn][dn/dλ]   [3]

where B is the base length of the prism, W the width of the illumination beam, and n the index of refraction, while

dx/dλ = F[B/W][dn/dλ]   [4]

One may define the resolving power, RP, of an instrument as the smallest resolvable wavelength difference, according to the Rayleigh criterion, divided into the average wavelength in that spectral region. Thus:

RP = λ/Δλ = [λ/dD_p][dD_p/dλ] = [λ/dD_p][B/W][dn/dλ]   [5]

The limiting resolution is set by diffraction due to the finite beamwidth, or effective aperture of the prism, which is rectangular. Thus:

RP = [λ/(λ/W)][B/W][dn/dλ]   [6]

so that:

RP = B[dn/dλ]   [7]

If the entire prism face is not illuminated, then only the illuminated base length must be used for B.

Littrow showed that aberrations would be minimized by making the angle of incidence equal to the angle of refraction β (the Littrow configuration). Littrow used a plane mirror behind the prism to give a double pass through the prism, as shown in Figure 3.
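To put numbers to eqn [7], the following short sketch estimates the diffraction-limited resolving power of a fully illuminated prism. The BK7 indices used are approximate catalog values and merely stand in for whatever glass is actually used:

# Approximate BK7 catalog indices at the F (486.1 nm) and C (656.3 nm) lines
n_F, n_C = 1.5224, 1.5143
dn_dlam = (n_F - n_C) / (486.1e-9 - 656.3e-9)   # per metre; negative (normal dispersion)

B = 0.050                  # illuminated base length: 50 mm
RP = abs(B * dn_dlam)      # eqn [7]
print(f"RP ~ {RP:.0f}; smallest resolvable d-lambda at 550 nm ~ {550 / RP:.2f} nm")

For this 50 mm prism the result is RP of roughly 2400, i.e., about 0.23 nm resolution near 550 nm, which shows why gratings displaced prisms for high-resolution work.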

Gratings

Rowland is credited with the development of a grating mount that reduced aberrations in the spectrogram. He found that a grating ruled on a concave surface of radius R, with the entrance slit and the focal point located on a circle of the same diameter (the Rowland circle), would give the least aberrations, as shown in Figure 4. Here r = R cos α is the distance to the entrance slit and r₁ = R cos β is the distance to the focal point for the exit slit. Rowland showed that:

cos α/R − cos²α/r + cos β/R − cos²β/r₁ = 0   [8]

One solution to this equation is for each pair of terms to vanish separately, giving r = R cos α and r₁ = R cos β; this condition is met when r and r₁ lie on the Rowland circle.

There are various ways in which the Rowland circle condition may be satisfied, and some of them are shown in Figure 5. The simplest mounting of all is that due to Paschen and Runge (Figure 5a), in which the entrance slit is positioned on the Rowland circle and a photographic plate (or plates) is constrained to fit the Rowland circle. Alternatively, for photoelectric detection, a series of exit slits is arranged around the Rowland circle, each having its own detector. In the latter case the whole spectrum is not recorded, only a series of predetermined wavelengths; but when used in this 'polychromator' form, it is very rugged and convenient for applications such as the routine analysis of samples of metals and alloys. In this case, slits and detectors are set up to measure the light in various spectral lines, each characteristic of a particular component or trace element.
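The Rowland condition of eqn [8] is easy to verify numerically. The few lines below (a sketch with illustrative angles) check that placing both distances on the Rowland circle, r = R cos α and r₁ = R cos β, zeroes the focal term for arbitrary angle pairs:

import math

R = 1.0                                        # grating radius of curvature (normalized)
for alpha_deg, beta_deg in [(20.0, 5.0), (35.0, 12.0)]:
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    r, r1 = R * math.cos(a), R * math.cos(b)   # points on the Rowland circle
    F = (math.cos(a) / R - math.cos(a) ** 2 / r
         + math.cos(b) / R - math.cos(b) ** 2 / r1)
    print(f"alpha={alpha_deg}, beta={beta_deg}: focal residual = {F:.1e}")  # ~0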

Figure 4 The construction of the Rowland Circle. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.

The oldest concave grating mount is that designed by Rowland himself, and it bears his name (Figure 5b). In this case, the grating and the photographic plate are fixed at opposite ends of a diameter of the Rowland circle by a movable rigid beam. The entrance slit remains fixed above the intersection of two rails at right angles to each other, along which the grating and plate holder (or the exit slit) are free to move. In this way, the entrance slit, grating, and plate holder are constrained always to lie on the Rowland circle, and the mounting has the advantage that the dispersion is linear, which is useful in the accurate determination of wavelengths. Unfortunately, this mounting is rather sensitive to small errors in the position of the entrance slit and in the orthogonality of the rails, and it is now very rarely used. A variation on this mounting was devised by Abney, who again mounted the grating and the plate holder on a rigid bar at opposite ends of a diameter of the Rowland circle (Figure 5c). The entrance slit is mounted on a bar of length equal to the radius of the Rowland circle. In this way, the slit always lies on the Rowland circle, but it has to rotate about its axis in order that the jaws remain perpendicular to the line from the grating to the slit.

Figure 5 Mountings of the concave grating: (a) Paschen–Runge; (b) Rowland; (c) Abney; (d) and (e) Eagle; (f) Wadsworth; (g) Seya–Namioka; (h) Johnson–Onaka. With permission from Hutley MC (1982) Diffraction Gratings, pp. 57–232. London: Elsevier.


It also has the disadvantage that the source must move with the exit slit, which can be inconvenient.

In the Eagle mounting (Figures 5d and 5e), the angles of incidence and diffraction are made equal, or very nearly so, as in the Littrow mounting for plane gratings. Optically, this system has the advantage that the astigmatism is generally less than that of the Paschen–Runge or Rowland mounting; on the other hand, the dispersion is nonlinear. From a mechanical point of view, it has the disadvantage that it is necessary, with great precision, both to rotate the grating and to move it nearer to the slits in order to scan the spectrum. However, it does have the practical advantage that it is much more compact than other mountings, which is of particular importance when we bear in mind the need to enclose the instrument in a vacuum tank. Ideally, the entrance slit and exit slit or photographic plate should be superimposed if we are to set α = β. In practice, of course, the two are displaced, either sideways, in the plane of incidence, as shown in Figure 5, or out of the plane of incidence, in which case the entrance slit is positioned just below the meridional plane and the plate holder just above it, as shown in Figure 5. The out-of-plane configuration is generally used for spectrographs and the in-plane system for monochromators. The penalty incurred in going out of plane is that coma is introduced in the image and slit curvature becomes more important; this limits the length of the ruling that can effectively be used.

The second well-known solution to the Rowland equation is the Wadsworth mounting (Figure 5f), in which the incident light is collimated, so r is set at infinity and the focal equation reduces to r₁ = R cos²β/(cos α + cos β). One feature of this mounting is that the astigmatism is zero when the image is formed at the center of the grating blank, i.e., when β = 0. It is a particularly useful mounting for applications in which the incident light is naturally collimated (for example, in rocket or satellite astronomy, in spectroheliography, and in work using synchrotron radiation). However, if the light is not naturally collimated, the Wadsworth mount requires a collimating mirror, so one has to pay the penalty of the extra light losses at this mirror. The distance from the grating to the image is about half that for a Rowland circle mounting, which makes the instrument more compact, and, since the grating subtends approximately four times the solid angle, there is a corresponding increase in the brightness of the spectral image.

Not all concave grating mountings are solutions to the Rowland equation; in some cases, other advantages may compensate for a certain defect of focus.

A particularly important example of this is the Seya–Namioka mounting (Figure 5g), in which the entrance slit and exit slit are kept fixed and the spectrum is scanned by a simple rotation of the grating. In order to achieve the optimum conditions for this mounting, we set α = θ + w and β = θ − w, where 2w is the angle subtended at the grating by the entrance and exit slits and θ is the angle through which the grating is turned. The amount of defocus is given by:

F(θ, w, r, r₁) = [cos²(θ + w)/r − cos(θ + w)/R] + [cos²(θ − w)/r₁ − cos(θ − w)/R]   [9]

and the optimum conditions are those for which F(θ, w, r, r₁) remains as small as possible as θ is varied over the required range. Seya set F and three derivatives of F with respect to θ equal to zero for θ = 0 and obtained the result:

w = sin⁻¹(1/√3) = 35°15′   [10]

and

r = r₁ = R cos w   [11]

which corresponds to the Rowland circle either in zero order or for zero wavelength. In practice, it is usual to modify the angle slightly so that the best focus is achieved in the center of the range of interest rather than at zero wavelength. The great advantage of the Seya–Namioka mounting is its simplicity: an instrument need consist only of a fixed entrance and exit slit, and a simple rotation of the grating is all that is required to scan the spectrum. It is, in fact, simpler than instruments using plane gratings. Despite the fact that at the ends of the useful wavelength range the resolution is limited by the defect of focus, and that the astigmatism is particularly bad, the Seya–Namioka mounting is very widely used, particularly for medium-resolution rather than high-resolution work.
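Eqns [10] and [11] are equally quick to check; this two-line sketch reproduces Seya's optimum half-angle and the corresponding slit distances:

import math

w = math.asin(1 / math.sqrt(3))                 # eqn [10]
deg = math.degrees(w)
print(f"w = {int(deg)} deg {(deg - int(deg)) * 60:.1f} min")   # 35 deg 15.9 min
print(f"r = r1 = {math.cos(w):.4f} R")                          # eqn [11]: ~0.8165 R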


Figure 6 Ebert mounting of the plane grating designed by Fastie. ‘Sl’ is the entrance slit; G is the grating; M is the concave mirror; and P is the photographic plate. The horizontal section is at the top and the vertical section is at the bottom. Reproduced with permission from The Infrared Handbook (1985) Ann Arbor, MI: Infrared Information Analysis Center.

A similar simplicity is a feature of the Johnson–Onaka mounting (Figure 5h). Here again the entrance and exit slits remain fixed, but the grating is rotated about an axis which is displaced from its center; in this way, it is possible to reduce the change of focus that occurs in the Seya–Namioka mounting. The system is set up so that at the center of the desired wavelength range the slits and grating lie on the Rowland circle, as shown in Figure 5h. The optimum radius of rotation, i.e., the distance GC, was found by Onaka to be:

GC_opt = [R sin ½(α + β)][1 − ½(tan β − tan α) tan(α + β)]   [12]

Another non-Rowland spectrometer is the Ebert–Fastie mount, which arranges the slit, flat grating, and detector array as shown in Figure 6. Ebert first developed the design using two separate concave mirrors, one to collimate the incident beam and the second to focus the diffracted spectrum. Fastie used a single, larger concave mirror, which simplified the mounting structure and produced a rugged, compact spectrometer that has been used in rocket flights and in space satellite observatories for astronomical and upper-atmosphere applications. The Czerny–Turner mount is similar to the Ebert mount, except that the flat grating is located in the same plane that contains the entrance and exit slits.

Advanced Spectrometers

While the above spectrometer designs are still used, major advances in implementation are now available. Ray tracing allows the designer to quantify the aberrations and determine solutions to remove them. Aspheric optical elements can now be fabricated to correct aberrations, and gratings can be ruled on aspherical surfaces so that they not only disperse the light beam but also act as elements of the optical design. Holography has been developed to etch gratings that work over a wide spectral range and, likewise, serve as optical elements as well as dispersers. Because holographic gratings are chemically etched, there are no machining burrs to scatter light, and the hologram is free of the periodic differences in groove width that create ghosts in ruled gratings.

Linear and array photodetectors have replaced film. An advantage of film was its ability to fit curved focal planes, whereas photodetectors are etched into flat wafers of the photodiode material. To adapt flat arrays to a curved focal plane, fiber-optic face plate couplers have been ground on one side to match the focal plane curvature, while the flat back side is either optically coupled to the detector or closely coupled for proximity focusing. Photodetector arrays are available in multiple materials to cover the spectral range from soft X-rays to the thermal infrared; cooling is often required to obtain low noise. They are now available with sensitivities never reached with film, and the ease of coupling the output data to a computer for real-time analysis has taken much of the labor out of analyzing the spectrograph. Another advance is precision motor drives with precision motion sensors, some using laser interferometry, to provide feedback on the automated movement of gratings, slits, and detectors as a spectrometer cycles through its wavelength range.

Imaging Spectrometers

Spectrometers that image the spectral characteristics of each pixel of an image, forming a data cube as shown in Figure 7, are called imaging spectrometers; if the spectra have high resolution and consist of blocks of many contiguous bands, the data are called hyperspectral. Imaging spectrometers have applications ranging from medicine to remote sensing of land use and the environment. This article covers examples of remote sensing imaging spectrometers, which require spectral coverage in all of the atmospheric windows from the UV to the thermal IR.

Imaging spectrometers require a combination of spectrometers, light-collecting optics, and scan mechanisms to scan the instantaneous field of view of the spectrometer over a scene. Remote sensing of the Earth requires an aerial platform: a helicopter, an aircraft, or an orbiting satellite. The platform motion is used as part of the scanning process. In one arrangement, the optics image a single point on the ground and a scanner (called a line scanner) sweeps a long line cross-track to the platform motion. Alternatively, the optics image a slit that is parallel to the platform track and covers many scan lines, and a scanner moves the slit cross-track to the platform motion; this scanner is called a whiskbroom scanner. Or the optics image a large slit, so that no scan mechanism is needed other than the platform


Figure 7 Hyperspectral data cube. Hyperspectral imagers divide the spectrum into many discrete narrow channels. This fine quantization of spectral information on a pixel-by-pixel basis enables researchers to discriminate the individual constituents in an area much more effectively. For example, the broad spectral bands of a multispectral sensor allow the user only to coarsely discriminate between areas of deciduous and coniferous forest, plowed fields, etc., whereas a hyperspectral imager provides characteristic signatures which can be correlated with specific spectral templates to help determine the individual constituents and possibly even reveal details of the natural processes which are affecting them.

Figure 8 Multispectral infrared and visible imaging spectrometer optical schematic.

motion to form an image; this scanner is called a pushbroom scanner. One important requirement is that all spectral measurements of a pixel be coregistered. Most airborne imaging spectrometers use a common aperture for a line scanner, or a common slit for whiskbroom and pushbroom scanners, so that platform instability from pitch, roll, and yaw, and from the inability to fly in a perfectly straight line, does not compromise the coregistration of the spectral data for each pixel. The image data may require geometric correction, but the spectral data are not compromised.


There is one other type of imaging spectrometer, which uses a linear variable filter (also known as a wedge filter) over a 2D array of photodetectors. Each frame images the scene with a different spectral band falling on each row of pixels, and the array is oriented so that the rows of spectral bands are perpendicular to the flight track. After the platform motion moves the image over one ground pixel, the whole array is read out; the frame is thereby shifted by one row of pixels, so the second frame adds a second spectral band to each row imaged in the first frame. This frame stepping is carefully timed to the platform velocity and is repeated until each row of pixels has been imaged in all spectral bands. This type of imaging spectrometer has been flown in aircraft and satellites. In aircraft, the platform motion corrupts the coregistration of the spectrum in each pixel, and extensive ground processing is required to geometrically correct each frame to improve the spectral coregistration. This is a difficult task, and this type of imaging spectrometer has lost favor for airborne use. Stabilized satellites have been a better platform, and the wedge type of imaging spectrometer has been used successfully in space.

As examples of airborne and ground-based imaging spectrometers, a line scanner imaging spectrometer and a ground-based stepped pushbroom spectrometer are described below.

MIVIS (multispectral infrared and visible imaging spectrometer) is a hyperspectral line scanner imaging spectrometer developed by the SenSyTech Imaging Group (formerly Daedalus Enterprises) for the CNR (Consiglio Nazionale delle Ricerche) of Italy. The optical schematic is shown in Figure 8. A rotating 45° mirror scans a line of pixels on the ground. Each pixel is collimated by a parabolic mirror in a Gregorian telescope mount, and the collimated beam is reflected by a Pfund assembly to an optical bench that houses four spectrometers. An aperture pinhole in the Pfund assembly defines a common instantaneous field of view for each pixel. The collimated beam is then split by either thin-metallic-coated or dielectric-coated dichroics to the four spectrometers: the thin metallic mirrors reflect long wavelengths and transmit short wavelengths, while the multiple dielectric layers cause interference such that light is reflected at short wavelengths and transmitted at long wavelengths. After this splitting into four wide bands (visible, near-infrared, mid-wave infrared, and thermal infrared), each band is dispersed in its own spectrometer. The wavelength coverage of each spectrometer is based on the wavelength sensitivity of a different photodetector array: the visible spectrometer uses a silicon photodiode array; the near-infrared spectrometer, an InGaAs array; the mid-infrared, an InSb array; and the thermal infrared, an MCT (mercury cadmium telluride) photoconductor array. Note the beam expander in spectrometer 3 for the mid-infrared (IR): since wavelength resolution increases with the size of the illuminated area of a diffraction grating, a beam expander was needed to meet the specified mid-IR resolution. The physical characteristics of MIVIS are given in Table 1 and the spectral coverage in Table 2.

Table 1  MIVIS physical properties

                               Height         Width          Depth^a        Weight
                               in     cm      in     cm      in     cm      lbs    kg
Scan head                      26.5   67.0    20.6   52.0    28.1   71.0    220    100
Electronics                    40.0   102.0   19.0   48.3    24.0   61.0
Total system weight (approx.)                                               460    209

^a Not including connectors or cable bends.

Table 2  MIVIS spectral coverage (µm)

Optical port 1 (visible):      channels 1–20;   band edges 0.43 to 0.83, in 0.02 µm steps
Optical port 2 (near IR):      channels 21–28;  band edges 1.15 to 1.55, in 0.05 µm steps
Optical port 3 (mid IR):       channels 29–92;  band edges 2.000 to 2.500, in ≈0.008 µm steps
Optical port 4 (thermal IR):   channels 93–102; band edges 8.20, 8.60, 9.00, 9.40, 9.80, 10.20, 10.70, 11.20, 11.70, 12.20, 12.70

An example of a pushbroom imaging spectrometer is the large area fast spectrometer (LAFS), a field-portable, tripod-mounted imaging spectrometer. LAFS uses a pushbroom spectrometer with a galvanometer-driven mirror in front of the entrance slit; the mirror is stepped to acquire a data cube of a 2D image by 256 spectral bands. LAFS was developed for the US Marine Corps under the direction of the US Navy Coastal Systems Station of the Dahlgren Division. The technical specifications are given in Table 3.

Table 3  LAFS technical specifications

Spatial resolution: 1.0 mrad IFOV vertical and horizontal (square pixels); optional lens for 0.5 mrad IFOV
Spatial coverage: 15° TFOV horizontal, nominal; a second, 7.5° TFOV, is an option
Focus range: 50 to infinity
Spectral resolution: 5.0 nm per spectral channel, sampled at 2.5 nm intervals
Spectral range: 400–1100 nm
Dynamic range: 8 bit (1 part in 256)
Illumination: solar illumination, from 10:00 a.m. to 2:00 p.m. under overcast conditions to full noon sunshine
Acquisition time: 1.5 s for standard illumination; option for longer times under low-light conditions
Viewfinder: near-real-time video view of the scene
Calibrations: flat fielding to compensate for CCD variations in responsivity; special calibration
Data displays: single-band imagery, selectable; waterfall chart (one spatial line by spectrum, as acquired by the CCD); single-pixel spectral display
Data format: convertible to formats compatible with image processors
Data storage: replaceable hard disk, 14 data cubes per disk

Various concepts for the imaging spectrometer were studied, and the only concept that could meet the above specification was a Littrow-configuration grating spectrometer that images one line in the scene and disperses the spectra perpendicular to the line image onto a CCD array. A galvanometer mirror steps the line image over the scene to generate a spectral image of the scene. A block diagram of LAFS is shown in Figure 9, a photograph of the prototype in Figure 10, and a photograph of the optical head in Figure 11.

Figure 9 LAFS system block diagram.

Figure 10 LAFS.

Figure 11 LAFS optical head.

Data from a single frame are stored in computer (RAM) memory and then transferred to a replaceable hard disk for bulk storage. Data are collected in a series of 256 CCD frames, each having a line of 256 pixels along one axis and 256 spectral samples along the second axis. The 256 CCD frames constitute a data cube, as shown in Figure 12. This storage order does not allow field viewing of an image in a single spectral band, so Earth View software is used in the field-portable computer to reorder the data cube into 256 spatial images, one for each spectral band, as shown in Figure 12.

Figure 12 Arrangement of image memory and construction of image displays.

The prototype is packaged in two chassis, a compact optical head and a portable electronics chassis. The size, weight, and power are given in Table 4.

Table 4  LAFS physical properties

                 Size (inches)      Weight (lbs)
Optical head     7 × 9 × 8          11
Electronics      12 × 15.5 × 9      30
Battery          4.5 × 16.5 × 2     7
Total                               48 + cables

Power: less than 150 W at 12 VDC; battery time: 45 min per charge.

LAFS was designed to also work as an airborne imaging spectrometer. The framing mirror can be locked so that it views the scene directly below the aircraft (pushbroom mode), or it can be programmed to sweep 256 lines cross-track to the aircraft flight path (whiskbroom mode). In the whiskbroom mode, the scan mirror rotation arc can be increased for a wider field of view than in the pushbroom mode. LAFS illustrates the decrease in size, weight, and power of a pushbroom imaging spectrometer that results from using a 256 × 256 pixel array rather than the single-pixel line arrays of a line scanner: the array increases the integration time for a line by a factor of 256, which allows a longer detector dwell time on each pixel and thus smaller light-collecting optics.
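The reordering step performed by the ground software can be pictured as a simple axis transpose of the stored cube. A minimal sketch in NumPy, with zero-filled data standing in for the 256 CCD frames:

import numpy as np

# As acquired: 256 frames; each frame is one scan line of 256 pixels x 256 bands.
frames = np.zeros((256, 256, 256), dtype=np.uint16)   # (line, pixel, band)

# Band-sequential order: 256 spatial images, one per spectral band,
# which is what single-band display requires.
band_images = frames.transpose(2, 0, 1)               # (band, line, pixel)

spectrum = frames[100, 50, :]    # full 256-sample spectrum of one pixel
print(band_images.shape, spectrum.shape)              # (256, 256, 256) (256,)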

List of Units and Nomenclature

Data cube: Multiple images of a scene in many spectral bands. The spectrum of each pixel can be obtained by plotting the value of the same pixel in each spectral band image.
Pixel: An image point (or small area) defined by the instantaneous field-of-view of either the optics, entrance aperture or slit, and the detector area, or by some combination of these components. A pixel is the smallest element in a scene that is resolved by the imaging system.
Instantaneous field-of-view (IFOV): Used in describing scanners to define the size of the smallest field-of-view (usually in milliradians or microradians) that can be resolved by a scanner system.
Field-of-view: The total size, in angular dimensions, of a scanner or imager.
Hyperspectral: A data cube with many (>48) spectral bands.
Line scanner: An optical/mechanical system that scans one pixel at a time along a line of a scene.
Whisk broom: An optical/mechanical system that scans two or more lines at a time.
Push broom: An optical system that images a complete line at a time.

See also

Diffraction: Diffraction Gratings; Fraunhofer Diffraction. Fiber Gratings. Geometrical Optics: Prisms. Imaging: Hyperspectral Imaging; Interferometric Imaging. Interferometry: Overview. Modulators: Acousto-Optics. Optical Materials: Color Filters and Absorption Glasses. Semiconductor Materials: Dilute Magnetic Semiconductors; Group IV Semiconductors, Si/SiGe; Large Gap II–VI Semiconductors; Modulation Spectroscopy of Semiconductors and Semiconductor Microstructures. Spectroscopy: Fourier Transform Spectroscopy; Raman Spectroscopy.

Further Reading

Baselow R, Silvergate P, Rappaport W, et al. (1992) HYDICE (Hyperspectral Digital Imagery Collection Experiment) Instrument Design. International Symposium on Spectral Sensing Research.
Bianchi R, Marino CM and Pignatti S. Airborne Hyperspectral Remote Sensing in Italy. Pomezia (Roma), Italy: CNR, Progetto LARA, 00040.
Borough HC, Batterfield OD, Chase RP, Turner RW and Honey FR (1992) A Modular Wide Field of View Airborne Imaging Spectrometer Preliminary Design Concept. International Symposium on Spectral Sensing Research.
Carmer DC, Horvath R and Arnold CB (1992) M7 Multispectral Sensor Testbed Enhancements. International Symposium on Spectral Sensing Research.
Hackwell JA, Warren DW, Bongiovi RP, et al. (1996) LWIR/MWIR Imaging Hyperspectral Sensor for Airborne and Ground-Based Remote Sensing. In: Descour M and Mooney J (eds) Imaging Spectrometry II. Proceedings of SPIE 2819, pp. 102–107.
Hutley MC (1982) Diffraction Gratings. London: Academic Press.
James JF and Sternberg RS (1969) The Design of Optical Spectrometers. London: Chapman and Hall.
Kulgein NG, Richard SP, Rudolf WP and Winter EM (1992) Airborne Chemical Vapor Detection Experiment. International Symposium on Spectral Sensing Research.
Lucey PG, Williams T, Horton K, Hinck K and Buchney C (1994) SMIFTS: A Cryogenically Cooled, Spatially Modulated, Imaging Fourier Transform Spectrometer for Remote Sensing Applications. International Symposium on Spectral Sensing Research.
Mourales P and Thomas D (1998) Compact Low-Distortion Imaging Spectrometer for Remote Sensing. SPIE Conference on Imaging Spectrometry IV 3438: 31–37.
Pagano TS (1992) Design of the Moderate Resolution Imaging Spectrometer – NADIR (MODIS-N). American Society of Photogrammetry & Remote Sensing, pp. 374–385.
Stanich CG and Osterwisch FG (1994) Advanced Operational Hyperspectral Scanners MIVIS and AHS. First International Airborne Remote Sensing Conference and Exhibition. Strasbourg, France: Environmental Research Institute of Michigan.
Sun X and Baker JJ (2001) A High Definition Hyperspectral Imaging System. Proceedings, Fifth International Airborne Remote Sensing Conference and Exhibition, Veridian.
Torr MR, Baselow RW and Torr DG (1982) Imaging spectroscopy of the thermosphere from the Space Shuttle. Applied Optics 21: 4130.
Torr MR, Baselow RW and Mounti J (1983) An imaging spectrometric observatory for Spacelab. Astrophysics and Space Science 92: 237.
Vane G, Green RO, Chrien TG, Enmark HT, Hansen EG and Porter WM (1993) The Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). Remote Sensing of Environment 44: 127–143.
Warren CP, Anderson RD, Velasco A, Lin PP, Lucas JR and Speer BA (2001) High Performance VIS and SWIR Hyperspectral Imagers for Airborne Applications. Fifth International Conference and Exhibition, Veridian.
Wolfe WL (1997) Introduction to Imaging Spectrometers. Bellingham, WA: SPIE Optical Engineering Press.
Wolfe WL and Zissis GJ (eds) (1985) The Infrared Handbook, revised edn. Washington, DC: Office of Naval Research, Department of the Navy.
Zywicki RW, More KA, Holloway J and Witherspoon N (1994) A Field Portable Hyperspectral Imager: The Large Area Fast Spectrometer. Proceedings of the International Symposium on Spectral Sensing Research.

Telescopes

M M Roth, Astrophysikalisches Institut Potsdam, Potsdam, Germany

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

In a broad sense, telescopes are optical instruments which provide an observer with an improved view of a distant object, where 'improved' may be defined in terms of magnification, angular resolution, and light-collecting power. Historically, the invention of the telescope was a breakthrough with an enormous impact on the fundamental sciences (physics, astronomy) and, as a direct consequence, on philosophy, but also on other technological, economical, political, and military developments.

It seems that the first refractive telescopes were built in the Netherlands at the beginning of the seventeenth century by Jan Lipperhey, Jakob Metius, and Zacharias Janssen. Based on reports of those first instruments, Galileo Galilei built his first telescope, which was later named after him, and made his famous astronomical observations of sunspots, the phases of Venus, Jupiter's moons, the rings of Saturn, and his discovery of the nature of the Milky Way as an assembly of very many stars. The so-called astronomical telescope was invented by Johannes Kepler, who also published the theory of this telescope in his Dioptrice of 1611. The disturbing effects of spherical and chromatic aberrations were soon realized by astronomers, opticians, and other users, but it was not before Moor Hall (1729) and John Dollond (1758) that at least the latter flaw was corrected, by the introduction of the achromat. Later, the achromat was significantly improved by Peter Dollond, Jesse Ramsden, and Josef Fraunhofer. The catadioptric telescope was probably introduced by Leonard Digges (1571) or Nicolas Zucchius (1608). In 1671, Sir Isaac Newton was the first to use a reflector for astronomical observations. Wilhelm Herschel improved this technique and began in 1766 to build much larger telescopes, with mirror diameters up to 1.22 m. When, in the nineteenth century, the conventional metal mirror was replaced by glass, it was Léon Foucault (1819–1868) who applied a silver coating in order to improve the reflectivity of its surface. Modern refractors can be built with superb apochromatic corrections, and astronomical reflectors have advanced to mirror diameters of up to 8.4 m (monolithic mirrors) and 10 m (segmented mirrors).

While the classical definition of a telescope involves an observer, i.e., the eye, in modern times electronic detectors have often replaced human vision. In the following we will, therefore, use a more general definition of a telescope as an optical instrument whose purpose is to image a distant object, either in a real focal plane or in an afocal projection for observation by eye.

Basic Imaging Theory

In the most simple case, a telescope may be constructed from a single lens or a single mirror, creating a real focal plane. Let us, therefore, introduce some basic principles of imaging using optical elements with spherical surfaces.

Refraction at a Spherical Surface

Figure 1 shows an example of refraction at a spherical surface. A ray emerging from object O hits the surface between two media of refractive index n and n′ at point P. After refraction, the ray continues at the deflected angle i′ to the normal and intersects the optical axis at point O′. In the ideal case, all rays emerging from O at different angles σ are collected in O′, thus forming a real image of object point O at O′. In the paraxial approximation we have sin σ ≈ tan σ ≈ σ, and P lies close to the vertex. In the triangle (O, O′, P): σ′ − σ = i − i′. Using Snell's law, n i = n′ i′, we obtain:

σ′ − σ = i (n′ − n)/n′   [1]

Since φ = σ′ + i′, in the paraxial approximation:

σ′ n′ − σ n = φ (n′ − n)   [2]

and, with σ = h/s, σ′ = h/s′, and φ = h/r,

(h/s′) n′ − (h/s) n = (h/r)(n′ − n)   [3]

we obtain finally:

n′/s′ − n/s = (n′ − n)/r   [4]

For an object at infinity, where s = ∞, s′ becomes the focal distance f′:

f′ = r n′/(n′ − n)  and  f̃ = −r n/(n′ − n)   [5]

Figure 1 Refraction at a spherical surface between media with refractive indices n and n′.

Lenses

Combining two or more refractive surfaces, as in the preceding section, allows us to describe a single lens or more complex optical systems with several lenses in series. For a single lens in air, we just give the lens equation (without deriving it in detail):

s′₂(F′) = [1/(n − 1)] · r₂[n r₁ − (n − 1)d] / [(n − 1)d + n(r₂ − r₁)]   [6]

where d is the lens thickness, measured between the vertices of its surfaces, r₁ and r₂ are the radii of the surfaces, n is the index of refraction of the lens material, and s′₂(F′) is the distance of the image of a distant object from the vertex of the last surface, facing the focal plane; this is the back focal distance. The focal length is given by:

f′ = [1/(n − 1)] · n r₁ r₂ / [(n − 1)d + n(r₂ − r₁)]   [7]

An important special case is where the lens thickness is small compared to the radii of the lens surfaces, i.e., when d ≪ |r₂ − r₁|. In this case we can neglect the terms in (n − 1)d and write:

f′ = [1/(n − 1)] · r₁ r₂/(r₂ − r₁)   [8]

Reflection at a Spherical Surface

Let us now consider the other simple case: imaging object O into O′ by reflection from a concave spherical surface, as depicted in Figure 2. In the paraxial case we have:

σ = y/s,  σ′ = y/s′,  i = φ − σ,  i′ = φ − σ′,  φ = y/r   [9]

With i = −i′, we obtain:

1/s′ − 1/s = −2/r   [10]

Again with s = ∞, the focal length becomes:

f′ = −r/2   [11]

Figure 2 Reflection at a spherical surface with radius r.

In what follows, we will assume that the distance of the object is always large (∞), which, in general, is the situation when a telescope is employed. The image is formed at a surface defined by O′, ideally a focal plane (paraxial case). We will consider deviations from the ideal case below.

Simple Telescopes

As we have observed above, a telescope is in principle nothing but an optical system which images an object at infinity. In the most simple case, a lens with positive power or a concave mirror will satisfy this condition. The principles of a refractor and of a reflecting telescope are shown in Figures 3 and 4. In both cases, parallel light coming from the distant object enters through the entrance pupil, experiences refraction (reflection) at the lens (mirror), respectively, and converges to form a real image in the focal plane.

Figure 3 Refractor, astronomical telescope for visual observations with eyepiece (Kepler telescope).

Figure 4 Reflector, astronomical telescope with prime focus imager.

Two point sources, for example two stars in the sky, separated by an angle δ, will be imaged in the focal plane as two spots with separation Δy. When f is the focal length of the telescope, we therefore have:

tan δ = Δy/f   [12]

The so-called plate scale m of the telescope, in units of arcsec/mm, is given by:

m = 3600 · atan(1/f)  (f in mm)   [13]

In the case of the refractor, we have also shown an eyepiece, illustrating that historically the instrument was invented to enhance human vision. The eye is located at the position of the exit pupil, thus receiving parallel light, i.e., observing an object at infinity, however (i) with a flux which is increased by a factor A, given by the ratio of the area of the telescope aperture with diameter D to the area of the eye's pupil d, with A = (D/d)²; and (ii) with a magnification G of the apparent angle between two separate objects at infinity, which is given by the ratio of the focal lengths of the objective and the eyepiece:

G = f_obj/f_ocl   [14]

The magnification can also be expressed as the ratio of the diameters of the entrance pupil and the exit pupil:

G = D_enpup/D_expup   [15]

The first astronomical telescopes with mirrors were equipped with an eyepiece for visual observation. The example in Figure 4 shows a configuration which is more common nowadays for professional astronomical observations, using a direct imaging detector in the focal plane (photographic plate, electronic camera).
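Eqns [13] and [14] translate directly into a back-of-the-envelope sketch; the focal lengths below are illustrative, not taken from any particular instrument:

import math

f_obj, f_ocl = 1200.0, 25.0      # objective and eyepiece focal lengths, mm
G = f_obj / f_ocl                # eqn [14]: magnification, here 48x
m = 3600 * math.degrees(math.atan(1.0 / f_obj))   # eqn [13]: plate scale, arcsec/mm
print(f"G = {G:.0f}x, plate scale = {m:.1f} arcsec/mm")   # ~171.9 arcsec/mm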

Aberrations

Reflector Surfaces

We shall now study the behavior of a real telescope beyond the ideal case, for a reflector. Figure 5 shows the situation of a mirror imaging an on-axis object at infinity into a focus at point O′. By Fermat's principle, the optical paths of all rays forming the image must have identical length l. Writing l_j for the distance from the point of reflection P_j = (y_j, Z_j) to O′, we must satisfy, for all j:

Z_j + l_j = l   [16]

From the geometry in Figure 5, we find also:

y_j² + (f − Z_j)² = l_j²   [17]

Combining eqns [16] and [17], we obtain the equation for a surface which satisfies the condition for perfect on-axis imaging:

y² = −4fz   [18]

Obviously only a paraboloid is capable of fulfilling the condition for all rays parallel to the optical axis, at whatever separation y. Also, we see that spherical mirrors are less than ideal imagers: rays at increasing distance y from the axis will suffer from increasing path length differences. The effect is called spherical aberration. We will investigate aberrations more quantitatively further below. Elementary axisymmetrical surfaces can be expressed as follows, with the parameter C being the conic constant:

y² − 2Rz + (1 + C)z² = 0   [19]

with the following characteristics:

C > 0: ellipsoidal, prolate
C = 0: spherical
−1 < C < 0: ellipsoidal, oblate
C = −1: paraboloidal
C < −1: hyperboloidal

Figure 5 Imaging an object at infinity with a reflector.

Let us now leave the paraxial approximation and determine the focal length of rays which are parallel to, but further away at a distance y from, the optical axis. Figure 5 shows the relevant geometrical relations. The intersection of the reflected ray with the optical axis at distance f from the vertex can be found from f = z + (f − z), where:

y/(f − z) = tan(2i)   [20]

We use the slope of the tangent at point P:

dz/dy = tan i   [21]

differentiate eqn [19] to get:

dz/dy = y/[R − (1 + C)z]   [22]

use the trigonometric relation:

tan 2φ = 2 tan φ/(1 − tan²φ)   [23]

and substitute for tan(2i) in eqn [20] to obtain:

f = R/2 + (1 − C)z/2 − y²/[2(R − (1 + C)z)]   [24]

In order to express eqn [24] solely in terms of y, we rewrite eqn [19]:

z = [R/(1 + C)] · [1 − sqrt(1 − (1 + C)y²/R²)]   [25]

Expanding z in a series and neglecting orders higher than 6:

z = y²/(2R) + (1 + C)y⁴/(8R³) + (1 + C)²y⁶/(16R⁵)   [26]

we obtain finally:

f = R/2 − [(1 + C)/(4R)]y² − [(1 + C)(3 + C)/(16R³)]y⁴   [27]

Transverse Spherical Aberration

Obviously, for any C ≠ −1, rays at different distances y from the optical axis are focused at different focal lengths f (spherical aberration). The net effect for a distant point source is that the image is blurred, rather than forming a sharp spot. The lateral deviation of nonparaxial rays intersecting the nominal focal plane at f₀ = R/2 is described by the transverse spherical aberration A_tr (Figure 6):

A_tr,sph = [y/(f − z)](f − f₀)   [28]

Using eqns [19] and [27], one obtains:

A_tr,sph = −[(1 + C)/(2R²)]y³ − [3(1 + C)(3 + C)/(8R⁴)]y⁵ − ···   [29]

Figure 6 Transverse spherical aberration.

Angular Spherical Aberration

It is sometimes more convenient to consider the angular spherical aberration, since a telescope naturally measures angles between objects at infinity. To this end, we compare an arbitrary reflector surface according to eqn [19] with a reference paraboloid, which is known to be free of on-axis spherical aberration (Figure 7). For any given ray, we observe how the reflected ray under study is tilted with respect to the reference ray, which would be reflected from the ideal surface of the paraboloid. While the former is measured at an exit angle of 2i to the optical axis (see Figure 5), we designate the latter 2i_p. The angular spherical aberration is defined as the difference of these two angles:

A_ang,sph = 2i − 2i_p = 2(i − i_p)   [30]

Figure 7 Angular aberration determined with reference to a paraboloid.

Again, we can use the slope of the tangent to determine the change of angles i → i_p:

A_ang,sph = 2(dz_p/dy − dz/dy) = −2 d(Δz)/dy   [31]

It is, therefore, sufficient to consider the difference Δz between the surface under study and the reference paraboloid. From eqn [26], we obtain as an approximation to third order:

A_ang,sph = −(1 + C) y³/R³   [32]
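To get a feel for the magnitude of eqn [32], the following sketch evaluates the angular spherical aberration of a spherical mirror (C = 0) with a 2 m radius of curvature at a few ray heights (illustrative numbers):

import math

R, C = 2000.0, 0.0             # radius of curvature in mm; spherical surface
for y in (25.0, 50.0, 100.0):  # ray heights in mm
    A_ang = -(1 + C) * y ** 3 / R ** 3           # eqn [32], in radians
    arcsec = math.degrees(abs(A_ang)) * 3600
    print(f"y = {y:5.1f} mm: |A_ang| = {arcsec:5.1f} arcsec")
# ~0.4, 3.2, and 25.8 arcsec: why larger spherical mirrors need parabolizing.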

Aberrations in the Field

Since the use of a telescope exclusively on-axis would be somewhat limited, let us finally investigate the behavior under oblique illumination, i.e., when an object at infinity is observed at an angle Θ to the optical axis of the system. This will allow us to assess aberrations in the field. For simplicity, we consider a paraboloid, remembering that on-axis the system is free of spherical aberration. Again, we employ an imaginary reference paraboloid, which is pointing on-axis towards the object at angle Θ (Figure 8). Its coordinate system (O′, z′, y′) is displaced and tilted with respect to the system under study (O, z, y). One can show that the offset Δz, which is used to derive the angular aberration in the same way as above, is given by:

−Δz = a₁ y³Θ/R² + a₂ y²Θ²/R + a₃ yΘ³   [33]

The angular aberration in the field is then:

A_ang,f = 3a₁ y²Θ/R² + 2a₂ yΘ²/R + a₃ Θ³   [34]

The coefficients are named after the aberrations they describe: a₁, coma; a₂, astigmatism; a₃, distortion.

Figure 8 Aberration in the field, oblique rays.

Chromatic Aberration

So far we have only considered optical aberrations of reflecting telescopes. Spherical aberration in the case of a lens is treated in an analogous way and will not be discussed again. However, the wavelength dependence of the index of refraction of optical media (dispersion) is an important factor and gives rise to chromatic aberrations, which are only encountered in refractive optical systems. We can distinguish two principal effects in the paraxial approximation: (i) longitudinal chromatic aberration, resulting in different axial image positions as a function of wavelength; and (ii) lateral chromatic aberration, which can be understood as an aberration of the principal ray as a function of wavelength, leading to a varying magnification (plate scale in the case of a telescope) as a function of wavelength.

As an example, let us determine the wavelength dependence of the focal length of a single thin lens. Opticians conventionally use discrete wavelengths of spectral line lamps of different chemical elements, e.g., the mercury e line at 546.1 nm, the cadmium F′ line at 480.0 nm, or the cadmium C′ line at 643.8 nm, covering more or less the visual wavelength range of the human eye. Differentiation of eqn [8] yields, at the wavelength of the e line:

df′/dn = −[1/(n_e − 1)²] · r₁r₂/(r₂ − r₁)   [35]

Substituting

r₁r₂/(r₂ − r₁) = f′_e (n_e − 1)   [36]

we get:

df′ = −[dn/(n_e − 1)] f′_e   [37]

Replacing the differentials by the differences Δf′ = f′_F′ − f′_C′ and Δn = n_F′ − n_C′, we obtain:

f′_F′ − f′_C′ = −[(n_F′ − n_C′)/(n_e − 1)] f′_e = −f′_e/ν_e   [38]

where ν_e is called the Abbe number:

ν_e = (n_e − 1)/(n_F′ − n_C′)   [39]

It is a tabulated characteristic value of the dispersion of any kind of glass and allows one to quickly estimate the difference in focal length between the extreme wavelengths F′ and C′ when the nominal focal length of a lens at e is known.
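As a quick application of eqn [38]: for a thin lens, the longitudinal chromatic focal shift is simply the focal length divided by the Abbe number. A sketch, using an approximate catalog value for BK7:

nu_e = 63.9       # Abbe number of BK7 at the e line (approximate catalog value)
f_e = 1000.0      # nominal focal length at e, in mm
delta_f = -f_e / nu_e           # eqn [38]: f'_F' - f'_C'
print(f"focal shift F' to C': {delta_f:.1f} mm")   # ~ -15.6 mm

A 15 mm focal shift across the visual band is enormous compared to the depth of focus of such a lens, which is why single-lens refractors were abandoned for achromats.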


Angular Resolution

In the previous sections we have exclusively employed geometrical optics for the description of telescope properties. With geometrical optics, and in the absence of aberrations, there is no limit to the angular resolution of a telescope, i.e., there is no limiting separation angle δ_lim below which two distant point sources can no longer be resolved. However, real optical systems experience diffraction, leading to a finite width of the image of a point source (the Airy disk), whose normalized intensity distribution in the focal plane, as a function of the radius r′ from the centroid, can for a circular aperture of diameter D be conveniently described by a Bessel function (λ: wavelength; f′: focal length of the objective):

I(r′) = I₀ [2J₁(ν)/ν]²,  where  ν = πDr′/(λf′)   [40]

The intensity distribution is plotted in Figure 9. The first four minima of the diffraction ring pattern are listed in Table 1. Two point sources are defined to be at the resolution limit of the telescope when the maximum of one diffraction pattern coincides with the first diffraction minimum of the other (Figure 9, right). As a rule of thumb, the angular resolution of a telescope with diameter D is given by:

δ_min = 1.22 λ/D → 138″/D   [41]

the latter in units of arcsec, for a wavelength of 550 nm and D in mm.
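Eqn [41] in two lines, checking the rule of thumb against the exact expression for a 100 mm aperture at 550 nm:

import math

D_mm, lam = 100.0, 550e-9
exact = math.degrees(1.22 * lam / (D_mm * 1e-3)) * 3600   # radians -> arcsec
print(f"exact: {exact:.2f} arcsec; rule of thumb 138/D: {138.0 / D_mm:.2f} arcsec")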

Types of Telescopes

There are very many different types of telescopes, developed for different purposes. In order to give an impression, we will briefly introduce a nonexhaustive selection of refractors and reflectors, which is by no means representative of the whole. For a more complete overview, the reader is referred to the literature.

Refractor Objectives

Refractors are first of all characterized by their objectives. Figure 10 shows a sequence of lenses, indicating the progress of lens design development from left to right. All objectives shown are groups of two or three lenses, designed to compensate for chromatic aberration through optimized combinations of crown and flint glasses. The first is a classical C achromat, e.g., using BK7 and SF2 glass, which is essentially the design introduced by Hall and Dollond. It is still a useful objective for small telescopes where a residual chromatic aberration is tolerable. The second is the Fraunhofer E type, which is characterized by an air gap between the two lenses (BK7, F2); the gap helps to optimize the image quality of the system, reducing spherical aberration and coma. As a disadvantage, the two additional glass-air interfaces lead to ghost images (reflected light) and Fresnel losses, and the objective is critical with regard to the alignment of tip and tilt. The third example is an improved AS type achromat (KzF2, BK7), which is similar to the E type in

Figure 9 Left: radial intensity distribution of the diffraction pattern (Airy disk). Right: two overlapping point source images of the same intensity at the limiting angular separation δ.

Table 1 Diffraction minima of the Airy disk

n    ν_n    r′D/(2f′λ)
1    3.83   0.61
2    7.02   1.12
3    10.2   1.62
4    13.3   2.12

Figure 10 Different types of refractor objectives, from left to right: classical C achromat, Fraunhofer E achromat with air gap, AS achromat, APQ apochromat.


that it also possesses an air gap, but with an improved chromatic aberration correction in comparison to the Fraunhofer objective. The lenses show a more pronounced curvature and are more difficult to manufacture. The beneficial use of KzF2 glass is an example of the successful research into new optical media that was pursued near the end of the nineteenth century in a collaboration between Otto Schott and Ernst Abbe. The last example is a triplet, employing ZK2, CaF2, and ZK2. This APQ type of objective is an apochromat, whose chromatic aberration vanishes at three wavelengths. It is notable for its use of a crystal (calcium fluoride, CaF2) as the optical medium of the central lens. The manufacture of the CaF2 lens is critical and expensive, but it provides for an excellent correction of the chromatic and spherical aberration of this high-quality objective. The triplet is also difficult to manufacture because the coefficients of thermal expansion of CaF2 and the glasses differ roughly by a factor of two. In order to avoid rupture of the cemented lens group, some manufacturers have used oil immersion to connect the triplet. Since CaF2 not only has excellent dispersion properties to correct for chromatic aberration, but also exhibits an excellent transmission beyond the visual wavelength regime, this type of lens is particularly useful for applications in the blue and UV. Finally, Figure 11 shows the complete layout of an astrograph, which is a telescope for use with photographic plates. The introduction of this type of instrument meant an important enhancement of astrophysical research, making available a significantly more efficient method compared to the traditional visual observation of individual stars, one by one. Using photographic plates, it became possible to measure the position and brightness of hundreds and thousands of stars, where previously one could investigate tens. Typical astrographs were built with apertures between 60 and 400 mm, and a field of view as large as 15 degrees. Despite the growing importance of space-based observatories for astrometry, some astrographs are still in use today.


Basic Oculars

While the general characteristics of a refractor are dominated by the objective, the eyepiece nevertheless plays an important role for the final appearance to the observer, with an effect on magnification, image quality, chromatism, field of view, and other properties of the entire system. From a large variety of different ocular types, let us briefly consider a few simple examples. The first eyepiece in Figure 12 is that of the Kepler telescope, which was already introduced in Figure 3. It has a quite simple layout with a single lens of positive power. The lens is located after the telescope focal plane at a distance which is equal to the ocular focal length f_oc. The telescope and ocular focal planes coincide, and as a result the outcoming beam is collimated, forming the system exit pupil after a distance f_oc. This is where ideally the observer's eye is located. The focal plane contains a real image, which is a convenient location for a reticle. The Kepler telescope is, therefore, a good tool for measurement and alignment purposes. We note that the image is inverted, which is relatively unimportant for a measuring telescope and astronomy, but very annoying for terrestrial use.

Figure 11 Astrograph.

Figure 12 Elementary eyepieces after Kepler, Galilei, Huygens, and Ramsden.


This latter property is avoided in the next example, the ocular of the Galilean telescope, which Galileo built and used for his famous observations of planets and moons in the solar system. It also uses a single lens, but, contrary to the Kepler type, one of negative power. The lens is located in the telescope beam in front of the focal plane of the objective, which means that the exit pupil is virtual and not accessible to the observer (indicated by dotted lines in Figure 12). The overall length of the telescope is somewhat shorter than Kepler's, however at the expense of the lack of a real focal plane, so no reticle can be applied. The third example is the two-lens ocular of Christiaan Huygens, who was able to prove with his design that it is possible to correct the lateral chromatic aberration paraxially. The eyepiece is still being built today. Another elementary ocular shown in Figure 12 is the eyepiece after Ramsden. This system is also corrected for lateral chromatic aberration. The real image is formed on the front surface of the second lens, providing a location for a reticle or a scale for measuring purposes. The exit pupil is interior to the ocular. In addition to these selected basic types, there are many more advanced multilens oculars in use today, with excellent correction and large fields of view (up to 60°).

Reflectors

Let us now describe several basic types of reflectors (see Figure 13). The first example is the Newton reflector, which uses a flat folding mirror to create an accessible exterior focus. It is conceptually simple and still in use today in amateur astronomy. When the main mirror is a paraboloid, the spherical aberration is corrected on the axis of the system. The second example represents an important improvement over the simple one-mirror design of Newton's telescope, in that the combination of a parabolic primary mirror (main mirror, M1) with a hyperbolic secondary mirror (M2) eliminates spherical aberration, but not astigmatism and coma. The Cassegrain telescope has a curved focal surface which is located behind the primary mirror. Due to the onset of coma and astigmatism, the telescope has a modest useful field of view. Besides the correction properties, the system has significant advantages in terms of mechanical size and weight, which play an important role in the construction of large telescopes. The basic type of this design has practically dominated the construction of the classical large astronomical telescopes with M1 diameters of up to 5 m and, more recently, the modern 8–10 m class telescopes. On the practical

Figure 13 Reflectors, from top to bottom: Newton, Cassegrain, Gregory, Schmidt.

side, the external focus of the Cassegrain at the bottom of the mirror support cell is a convenient location for mounting focal plane instruments (e.g., CCD cameras, spectrographs, and others). The more common variant of the Cassegrain type, which was actually used for the design of these telescopes, is the layout of Ritchey (1864–1945) and Chrétien (1879–1956), which combines two hyperbolic mirrors for M1 and M2 (RCC), but otherwise has the same appearance as the Cassegrain. Besides spherical aberration, it also corrects for coma, giving a useful field of view of up to 1°. As in the case of the Cassegrain telescope, the RCC exhibits some curvature of the focal plane.


Our third example is the Gregory reflector, which is also a two-mirror system, however with a concave elliptical secondary mirror. The correction of this telescope is similar to, but somewhat inferior to, that of the Cassegrain. The use of a concave M2 is advantageous for manufacture and testing. Nevertheless, the excessive length of the system has precluded the use of this layout for large astronomical telescopes, except for stationary beams, for example, in solar observatories. The last example is the design put forward by Bernhard Schmidt (1879–1935), who wanted to create a telescope with an extremely wide field of view. The first telescope of his new type achieved a field of view of 16°, something which had not been feasible with previous techniques. The basic idea is to use a fast spherical mirror, and to correct for spherical aberration by means of an aspherical corrector plate in front of the mirror, which also forms the entrance pupil of the telescope. The focal surface is strongly curved, making it necessary to employ a detector with the same radius of curvature (bent photographic plates), or to compensate the curvature by means of a field lens immediately in front of the focus.

The presence of the corrector plate and, if used, the field flattener, gives rise to significant chromatism. Nevertheless, the Schmidt telescope was a very successful design and dominated wide-angle survey work in astronomy for many decades.

Applications

General Purpose Telescopes

According to the original meaning of the Greek 'tele-skopein', the common conception of the telescope is that of an optical instrument which provides an enlarged view of distant objects. In addition, a telescope aperture that is large compared to the eye improves night vision and the observation of poorly illuminated scenes. There are numerous applications in nautical, security, military, hunting, and other professional areas. For all of these applications, an intuitive view is important, essentially ruling out the classical astronomical telescope (Kepler telescope, see Figure 3), which produces an inverted image. Binoculars with Porro prisms are among the most popular instruments with upright vision, presenting a compact outline due to the folded beam geometry (Figure 14), but single-tube telescopes with re-imaging oculars are also sometimes used.

Metrology

Figure 14 Principle of prism binoculars with folded beams.

Figure 15 Collimator and telescope.

As noted in the introduction, telescopes are instruments which transform the angle of incidence in the aperture into a distance from the origin in the focal plane. If the focal length of the objective is taken sufficiently long, and a high-quality eyepiece is used, very small angles down to the arcsec regime can be visually observed. Telescopes are, therefore, ideal tools to measure angles to very high precision. In Figure 15 we see a pair of telescopes facing each other, where one instrument is equipped with a light source, and the other one is used to observe the first one. The first telescope is also called a collimator. It possesses a lamp, a condenser system, a focal plane mask (usually a pinhole, a crosshair, or some other useful pattern imprinted on a transparent plate), and the telescope objective. The focal plane mask is projected to infinity. It can thus be observed with the


measuring telescope, such that an image of the collimator mask appears in the telescope focal plane. The telescope, too, is equipped with a mask in its focal plane, usually with a precision scale or another pattern for convenient alignment with the collimator mask. If the collimator is now tilted by a small angle δ against the optical axis of the telescope, the angle of incidence in the measuring telescope changes by this same amount, giving rise to a shift of the image by Δx = f tan δ. Note that the measurement is completely insensitive to parallel shifts between the telescope and the collimator, since it is parallel light which is transmitted between the two instruments.

Figure 16 Autocollimator.

Another special device, which combines the former two instruments into one, is the autocollimator. The optical principle is shown in Figure 16: in principle, the autocollimator is identical to a measuring telescope, except that a beamsplitter is inserted between the focal plane and the objective. The beamsplitter is used to inject light from the same kind of light source assembly as in the normal collimator. The axis of the folded light source beam and the illuminated collimator mask are aligned to match the focal plane mask of the telescope section: the optical axis of the emerging beam is identical to the optical axis of the telescope. Under this condition, the reflected light from a test object will be exactly centered on the telescope focal plane mask when the object is perfectly aligned to normal incidence. Normally, a high-quality mirror is attached to the device under study, but sometimes the test object itself possesses a plane surface of optical quality, which can be used for the measurement. Figure 17 shows schematically an example of such a measurement, where a mirror is mounted on a linear stage carriage for the purpose of measuring the flexure of the stage to very high precision. With a typical length of ≈1 m, this task is not a trivial problem for a mechanical measurement.

Figure 17 Using an autocollimator to measure the flexure of a linear stage.


Figure 18 The 80 cm + 60 cm 'Great Refractor' at the Astrophysical Observatory Potsdam. The correction of the 80 cm objective of this telescope led Johannes Hartmann to the development of the 'Hartmann Test', which was the first quantitative method to measure aberrations. This method is still in use today, and has been further developed to become the 'Shack–Hartmann' technique of wavefront sensing. Courtesy of Astrophysikalisches Institut Potsdam.

By measuring with the autocollimator the tilt δ of the mirror consecutively for a sufficiently large number of points x_i along the rail, we can now plot the slope of the rail at each point x_i as tan δ:

dy/dx = tan δ   [42]

Integrating the series of slope values reproduces the shape of the rail, y = f(x).
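A minimal numerical sketch of this reconstruction (the slope readings below are invented sample data, not measurements):

import numpy as np

# Reconstructing the rail shape y = f(x) from autocollimator slope readings,
# eqn [42]: dy/dx = tan(delta). The angle values are invented sample data.
x = np.linspace(0.0, 1.0, 11)                # positions x_i along the rail, in m
delta_arcsec = np.array([0.0, 1.1, 1.9, 2.4, 2.6, 2.5, 2.1, 1.5, 0.8, 0.2, -0.3])
slope = np.tan(np.deg2rad(delta_arcsec / 3600.0))   # arcsec -> rad -> slope

# Trapezoidal integration of the slopes gives the rail profile y = f(x).
y = np.concatenate(([0.0], np.cumsum(0.5 * (slope[1:] + slope[:-1]) * np.diff(x))))
print(y * 1e6)   # flexure in micrometers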

Other applications include multiple autocollimation, wedge measurements, and right-angle measurements. Multiple autocollimation is achieved by inserting a fixed plane-parallel and semi-transparent mirror between the autocollimator and the mirror under study. Light coming from the test mirror will be partially transmitted through the additional mirror towards the autocollimator, and partially reflected back to the test mirror, and so forth. After k reflections, a deflection of kδ is measured, where δ is the tilt of the test mirror. The wedge angle α of a plane-parallel plate with refractive index n_w is easily measured by observing the front and back surface reflections from the plate, keeping in mind that the beam bouncing back from the rear will also experience refraction in the plate. If δ is the measured angle difference between the front and rear beams, then α = δ/(2n_w).


Two surfaces of an object can be tested for orientation at right angles by attaching a mirror to surface 1 and observing the mirror with the fixed autocollimator. In a second step, an auxiliary pentaprism is inserted into the beam, deflecting the light towards surface 2, where the mirror is attached next. If a precision pentaprism is used, any deviation from a right angle between the two surfaces is seen as an offset between steps 1 and 2. These are only a few examples of the very many applications for alignment and angle measurements in optics and precision mechanics. Without going into much detail, we merely mention that the theodolite and the sextant also make use of the same principle of angular measurement, however without the need for a collimator or autocollimation.

Astronomical Telescopes

Despite its importance as a measurement and testing device for optical technology and manufacture, as described in the preceding paragraphs, it is the original invention and subsequent improvement of the telescope for astronomy which is probably the

most fascinating application of any optical system known to the general public. The historical development of the astronomical telescope spans from the seventeenth-century Galilean refractor with an aperture of just a few centimeters in diameter, over the largest refractors built for the Lick and Yerkes observatories with objectives of up to ≈1 m in diameter (see also Figures 18–20), and the 4-m-class reflectors of the second half of the twentieth century, e.g., the Hale Observatory 5 m, the Kitt Peak and Cerro Tololo 4 m, the La Silla 3.6 m, or the La Palma 4.2 m telescopes (to name only a few), to the current state of the art of 8–10 m class telescopes, prominent examples being the two Keck Observatory 10 m telescopes on Mauna Kea (Hawaii), or the ESO Very Large Telescope observatory (Paranal, Chile), consisting of four identical 8.2 m telescopes (Figures 21–24). This development has always been driven by the discovery of fainter and fainter celestial objects, whose quantitative study is the subject of modern astrophysics, involving many specialized disciplines such as atomic/quantum/nuclear physics, chemistry, general relativity, electrodynamics, hydrodynamics, magneto-hydrodynamics, and so forth. In fact, cosmic objects are often referred to as 'laboratories'

Figure 19 Pistor & Martin Meridian Circle at the former Berlin Observatory in Potsdam Babelsberg. The instrument, built in 1868, was one of the finest examples of the art, with ultra-high precision for astrometry. Courtesy of Astrophysikalisches Institut Potsdam.

Figure 20 Architect's sketch of the historical 'Einsteinturm' solar telescope, Astrophysical Observatory Potsdam. The telescope objective is mounted in the tower structure, facing upward to a siderostat which is mounted inside the dome. After deflection from a folding mirror, the light is focused in the basement onto the slit of a bench-mounted high-resolution spectrograph. The telescope represented state-of-the-art technology at the beginning of the twentieth century. Courtesy of Astrophysikalisches Institut Potsdam.



Figure 21 The Very Large Telescope Observatory (VLT) on Mt Paranal, Chile, with the VLT Interferometer (VLTI). The four large enclosures contain the active-optics-controlled VLT unit telescopes, which are operated either independently of each other or in parallel to form the VLTI. The structures on the ground are part of the beam combination optics/delay lines. Courtesy of European Southern Observatory.

Figure 22 Total view of Kueyen, one of four unit telescopes of the VLT observatory. Courtesy of European Southern Observatory.

with extreme conditions of density, temperature, magnetic field strength, etc., which would be impossible to achieve in any laboratory on Earth. Except for particles (neutrinos, cosmic rays), gravitational

Figure 23 8.2 m thin primary mirror for the VLT. The VLT design is based on thin meniscus mirrors, made out of ZERODUR, which have low thermal inertia and are always kept at ambient temperature, thus virtually removing the effect of thermal turbulence (mirror seeing). Seeing is the most limiting factor of ground-based optical astronomy. The optimized design of the VLT telescopes has brought a significant leap forward to a largely improved image quality from the ground and, consequently, improved sensitivity. The thin mirrors are constantly actively controlled in their mounting cells, using numerous piston actuators on the back surface of the mirror. Another key technology under development is 'adaptive optics', a method to compensate atmospheric wavefront deformations in real time, thus obtaining diffraction-limited images. Courtesy of European Southern Observatory.

waves, and the physical investigation of meteorites and other solar system material collected by space probes, telescopes (for the entire wavelength range of the electromagnetic spectrum) are the only tool to accomplish quantitative measurements of these cosmic laboratories. In an attempt to unravel the history of the birth and evolution of the universe, to discover and to measure the first generation of stars and galaxies, and to find evidence for traces of life outside of planet Earth, plans are currently underway to develop yet another new generation of giant optical telescopes with apertures of 30–100 m in diameter (ELT: extremely large telescope). The technological challenges in terms of precision and stability are outstanding. Along with the impressive growth of light collecting area, we must stress the importance of angular resolution, which has also experienced a significant evolution over the history of astronomical telescopes: from ≈5 arcsec of the Galilean telescope, which already provided an order of magnitude improvement over the naked eye, over roughly 1 arcsec of conventional telescopes in the twentieth century, and a few hundredths of an arcsec for the orbiting Hubble Space


Figure 25 Hubble Space Telescope (HST) in orbit. This 2.4 m telescope is probably the most productive telescope which was ever built. Due to the absence of the atmosphere, the telescope delivers diffraction-limited, extremely sharp images, which have revealed unprecedented details of stars, star clusters, nebulae and other objects in the Milky Way and in other galaxies. Among the most spectacular results, HST has provided the deepest look into space ever, revealing light from faint galaxies that was emitted when the universe was only at 10% of its present age. Courtesy of Space Telescope Science Institute, Baltimore.


Figure 24 Top: spin-cast 8.4 m honeycomb mirror #1 for the Large Binocular Telescope, the largest monolithic mirror of optical quality in the world. Shown in the process of applying a protective plastic coating to the end-polished surface. Bottom: LBT mirror #1, finally mounted in mirror cell. Courtesy of Large Binocular Telescope Observatory.

Telescope (Figure 25), down to the milli-arcsec resolution of the Keck and VLT Interferometers. The development of the aforementioned ELTs is thought to be meaningful only if they are operated in a diffraction-limited mode, using the emerging technique of adaptive optics, a method to correct in real time for the wavefront distortions caused by atmospheric turbulence (image blur). The combination of large optical/infrared telescopes with powerful detectors and focal plane instruments (see Instrumentation: Astronomical Instrumentation) has been a prerequisite to make astrophysics one of the most fascinating disciplines of fundamental sciences.

See also Geometrical Optics: Aberrations. Imaging: Adaptive Optics. Instrumentation: Astronomical Instrumentation.

Further Reading

Bely PY (ed.) (2003) The Design and Construction of Large Optical Telescopes. New York: Springer.
Born M and Wolf E (1999) Principles of Optics, 7th ed. Cambridge: Cambridge University Press.
Haferkorn H (1994) Optik, 3rd ed. Leipzig: Barth Verlagsgesellschaft.
Korsch D (1991) Reflective Optics. San Diego: Academic Press.
Laux U (1999) Astrooptik, 2nd ed. Heidelberg: Verlag Sterne und Weltraum.
McCray WP (2004) Giant Telescopes: Astronomical Ambition and the Promise of Technology. Cambridge: Harvard University Press.
Osterbrock DE (1993) Pauper and Prince: Ritchey, Hale, and Big American Telescopes. Tucson: University of Arizona Press.
Riekher R (1990) Fernrohre und ihre Meister, 2nd ed. Berlin: Verlag Technik GmbH.
Rutten HGJ and van Venrooij MAM (1988) Telescope Optics. Richmond, VA: Willman-Bell.
Schroeder DJ (1987) Astronomical Optics. San Diego: Academic Press.
Welford WT (1974) Aberrations of the Symmetrical Optical System. London: Academic Press.
Wilson RN (1996) Reflecting Telescope Optics I: Basic Design Theory and its Historical Development. Berlin: Springer.
Wilson RN (1999) Reflecting Telescope Optics II: Manufacture, Testing, Alignment, Modern Techniques. Berlin: Springer.


INTERFEROMETRY

Contents

Overview
Gravity Wave Detection
Phase-Measurement Interferometry
White Light Interferometry

Overview J C Wyant, University of Arizona, Tucson, AZ, USA


© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Interferometers are a powerful tool used in numerous industrial, research, and development applications, including measuring the quality of a large variety of manufactured items such as optical components and systems, hard disk drives and magnetic recording heads, lasers and optics used in CD and DVD drives, cameras, laser printers, machined parts, components for fiber optic systems, and so forth. Interferometers can also be used to measure distance, spectra, and rotations, etc. The applications are almost endless. There are many varieties of interferometers. Some interferometers form interference fringes using two beams and some use multiple beams. Some interferometers are common path; some use lateral shear; some use radial shear; and some use phase-shifting techniques. Examples of the various types are given below, but first we must give the basic equation for two-beam interference.

Two-Beam Interference

When two quasi-monochromatic waves interfere, the irradiance of the interference pattern is given by

I = I₁ + I₂ + 2√(I₁I₂) cos φ   [1]

where the I's represent the irradiances of the individual beams and φ is the phase difference between the two interfering beams. The maxima of the interference occur when φ = 2mπ, where m is an integer. In most applications involving interferometers, φ is the quantity of interest, since it is related to the quantity being measured. Figure 1 shows typical two-beam interference fringes. There are many two-beam interferometers and we will now look at a few of them.
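Eqn [1] is easy to explore numerically; the following is a minimal sketch (the unit beam irradiances are arbitrary example values):

import numpy as np

def two_beam_irradiance(I1, I2, phi):
    """Two-beam interference irradiance, eqn [1]."""
    return I1 + I2 + 2.0 * np.sqrt(I1 * I2) * np.cos(phi)

# Equal beams of unit irradiance: the fringes swing between 0 and 4,
# with maxima wherever phi = 2*m*pi.
phi = np.linspace(0.0, 4.0 * np.pi, 9)
print(two_beam_irradiance(1.0, 1.0, phi))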

Fresnel Mirrors

Fresnel mirrors form a very simple two-beam interferometer. A beam of light reflecting off two mirrors set at a slight angle to one another will produce an interference pattern. The fringes are straight and equally spaced. The fringes become more finely spaced as the angle between the two mirrors increases, d increases, and D becomes smaller (Figure 2).

Plane Parallel Plate

When a monochromatic source illuminates a plane parallel plate, as shown in Figure 3, the interference


Figure 1 Typical two-beam interference fringes.


Figure 2 Fresnel mirrors.

Figure 4 Michelson interferometer.

Figure 3 Plane parallel plate interferometer.

fringes are circles centered on the normal to the plate. The fringes are called Haidinger fringes, or fringes of equal inclination, since, for a given plate thickness, they depend on the angle of incidence. If the maximum or minimum occurs at the center, the radii of the fringes are proportional to the square roots of integers. If the plate has a slight variation in thickness and it is illuminated with a collimated beam, then the interference fringes are called Fizeau fringes of equal thickness, and they give the thickness variations in the plate.

Michelson Interferometer

A Michelson interferometer is shown in Figure 4. Light from the extended source is split into two beams by the beamsplitter. The path difference can be viewed as the difference between the mirror M1 and the image of mirror M2, M2′. With M1 and M2′ parallel, the fringes are circular and localized at infinity. With M1 and M2′ at a slight angle, the interference fringes are straight lines parallel to the equivalent intersection of the mirrors and localized approximately at the intersection. When the mirrors are only a few wavelengths apart, white light fringes

Figure 5 Twyman–Green interferometer for testing a concave spherical mirror.

appear, and can be used to determine their coincidence.

Twyman–Green Interferometer

If a point source is used in a Michelson interferometer, it is generally called a Twyman–Green interferometer. Twyman–Green interferometers are often used to test optical components such as flat mirrors, curved mirrors, windows, lenses, and prisms. Figure 5 shows a Twyman–Green interferometer for the testing of a concave spherical mirror. The interferometer is aligned such that the focus of the diverger lens is at the center of curvature of the spherical mirror. If the mirror under test is perfect, straight equi-spaced fringes are obtained. Figure 6 shows two interference fringes and the relationship between the surface height error of the mirror being tested and the fringe deviation; λ is the wavelength of the light.

Fizeau Interferometer

The Fizeau interferometer is a simple device used for testing optical surfaces, especially flats and spheres


(Figure 7). The fringes are fringes of equal thickness. It is sometimes useful to tilt the reference surface a little to get several nearly straight interference fringes. The relationship between surface height error and fringe deviation from straightness is the same as for the Twyman – Green interferometer.

Mach–Zehnder Interferometer

A Mach–Zehnder interferometer is sometimes used to look at samples in transmission. Figure 8 shows a

typical Mach–Zehnder interferometer for looking at a sample in transmission.

Murty Lateral Shearing Interferometer

Lateral shearing interferometers compare a wavefront with a shifted version of itself. While there are many different lateral shear interferometers, one that works very well with a nearly collimated laser beam is the Murty shearing interferometer shown in Figure 9. The two laterally sheared beams are produced by reflecting a coherent laser beam off a plane parallel plate. The Murty interferometer can be used to measure the aberrations in the lens, or it can be used to determine if the beam leaving the lens is collimated. If the beam incident upon the plane parallel plate is collimated, a single fringe results, while if the beam is not perfectly collimated, straight equi-spaced fringes result, where the number of fringes gives the departure from collimation.

Radial Shear Interferometer

Radial shear interferometers compare a wavefront with a magnified or demagnified version of itself. While there are many different radial shear

Figure 6 Relationship between surface height error and fringe deviation.

Figure 8 Mach–Zehnder interferometer.

Figure 7 Fizeau interferometer.

Figure 9 Murty lateral shear interferometer.


interferometers, one that works very well with a nearly collimated beam is shown in Figure 10. It is essentially a Mach–Zehnder interferometer where, in one arm, the beam is magnified and, in the other arm, the beam is demagnified. Interference fringes result in the region of overlap. The sensitivity of the interferometer depends upon the amount of radial shear. If the two interfering beams are approximately the same size there is little sensitivity, while if the two beams greatly differ in size there is large sensitivity. Sometimes radial shear interferometers are used to test the quality of optical components.

Scatterplate Interferometer

A scatterplate interferometer, invented by Jim Burch in 1953, is one of the cleverest interferometers. Almost any light source can be used and no high-quality optics are required to do precision interferometry. The critical element in the interferometer is the scatterplate, which looks like a piece of ground glass; the important point is that the scattering points are arranged so the plate has inversion symmetry

as shown in Figure 11. The light that is scattered the first time through the plate, and unscattered the second time through the plate, interferes with the light unscattered the first time and scattered the second time. The resulting interference fringes show errors in the mirror under test. This type of interferometer is insensitive to vibration because it is what is called a common path interferometer, in that the test beam (scattered–unscattered) and the reference beam (unscattered–scattered) travel along almost the same paths in the interferometer.

Smartt Point Diffraction Interferometer

Another common path interferometer used for the testing of optics is the Smartt point diffraction interferometer (PDI) shown in Figure 12. In the PDI the beam of light being measured is focused onto a partially transmitting plate containing a pinhole. The pinhole removes the aberration from the light passing through it (reference beam), while the light passing through the plate (test beam) does not have the aberration removed. The interference of these two beams gives the aberration of the lens under test.

Sagnac Interferometer

Figure 10 Radial shear interferometer.

Figure 11 Scatterplate interferometer.

The Sagnac interferometer has beams traveling in opposite directions, as shown in Figure 13. The interferometer is highly stable and easy to align. If the interferometer is rotated with an angular velocity there will be a delay between the transit times of the clockwise and the counterclockwise beams. For this reason, the Sagnac interferometer is used in laser gyros.


Fiber Interferometers

Fiber interferometers were first used for rotation sensing by replacing the ring cavity in the Sagnac interferometer with a multiloop made of a single-mode fiber. Fiber-interferometer rotation sensors are attractive because they are small and low-cost. Since the optical path in a fiber is affected by its temperature and also changes when the fiber is stretched, fiber interferometers can be used as sensors for mechanical strain, temperature, and pressure. They can also be used for the measurement of magnetic fields by bonding the fiber to a magnetostrictive element. Electric fields can be measured by bonding the fiber to a piezoelectric film.

Multiple Beam Interferometers

In general, multiple beam interferometers provide sharper interference fringes than a two-beam interferometer. If light is reflected off a plane parallel plate there are multiple reflections. If the surfaces of the plane parallel plate have a low reflectivity, the higher-order reflections have a low intensity, the multiple reflections can be ignored, and the resulting interference fringes have a sinusoidal intensity profile; but if the surfaces of the plane parallel plate have a high reflectivity, the fringes become sharper. Figure 14 shows the profile of the fringes for both the reflected

and the transmitted light for different values of surface reflectivity. Multiple beam interference fringes are extremely useful for measuring the spectral content of a source. One such multiple beam interferometer, often used for measuring the spectral distribution of a source, is the Fabry–Perot interferometer, which consists of two plane-parallel plates separated by an air space, as shown in Figure 15.

Phase-Shifting Interferometry

The equation for the intensity resulting from two-beam interference contains three unknowns: the two individual intensities and the phase difference between the two interfering beams. If three or more measurements are made of the intensity of the two-beam interference as the phase difference between the two beams is varied in a known manner, all three unknowns can be determined. This technique, called phase-shifting interferometry, is an extremely powerful tool for measuring phase distributions. Generally the phase difference is

Figure 12 Smartt point diffraction interferometer.

Figure 14 Multiple beam interference fringes for the transmitted and reflected light for a plane parallel plate.

Figure 13 Sagnac interferometer.

Figure 15 Fabry–Perot interferometer.


changed by 90 degrees between consecutive measurements of the interference intensity, in which case the three intensities can be written as

I_a(x, y) = I₁ + I₂ + 2√(I₁I₂) cos[φ(x, y) − 90°]
I_b(x, y) = I₁ + I₂ + 2√(I₁I₂) cos[φ(x, y)]
I_c(x, y) = I₁ + I₂ + 2√(I₁I₂) cos[φ(x, y) + 90°]

and the phase distribution is given by

φ(x, y) = arctan{[I_a(x, y) − I_c(x, y)] / [−I_a(x, y) + 2I_b(x, y) − I_c(x, y)]}

Phase-shifting interferometry has greatly enhanced the use of interferometry in metrology since it provides a fast, low-noise way of getting interferometric data into a computer.
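A minimal sketch of this three-step phase recovery on synthetic data (equal unit beam irradiances are assumed; arctan2 is used so the correct quadrant is kept):

import numpy as np

def three_step_phase(Ia, Ib, Ic):
    """Recover phi from intensities taken at -90, 0, and +90 degree shifts,
    phi = arctan[(Ia - Ic) / (-Ia + 2*Ib - Ic)], as given above."""
    return np.arctan2(Ia - Ic, -Ia + 2.0 * Ib - Ic)

# Round trip on synthetic data with I1 = I2 = 1:
phi_true = 0.7
Ia = 2.0 + 2.0 * np.cos(phi_true - np.pi / 2.0)
Ib = 2.0 + 2.0 * np.cos(phi_true)
Ic = 2.0 + 2.0 * np.cos(phi_true + np.pi / 2.0)
print(three_step_phase(Ia, Ib, Ic))   # -> 0.7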

Vibration-Insensitive Phase-Shifting Interferometer

While phase-shifting interferometry has greatly enhanced the use of interferometry in metrology, there are many applications where it cannot be used because of the environment, especially vibration. One recent phase-shifting interferometer that works well in the presence of vibration is shown in Figure 16. In this interferometer, four phase-shifted frames of data are captured simultaneously. The test and reference beams have orthogonal polarization. After the test and reference beams are combined, they are passed through a holographic element to produce four identical beams. The four beams are then transmitted through wave retardation plates to cause 0, 90, 180, and 270 degree phase differences between the test and reference beams. After passing through a polarizer, the four phase-shifted interferograms fall on the detector array. Not only can the effects of vibration be eliminated, but, by making short exposures to freeze the vibration, the vibrational modes can be measured. Movies can be made showing the vibration. Likewise, flow fields can be measured. Combining modern electronics, computers, and software with old interferometric techniques provides very powerful measurement capabilities.

Figure 16 Vibration-insensitive phase-shifting interferometer.

List of Units and Nomenclature

I   irradiance (intensity)
R   reflectance
λ   wavelength
φ   phase

See also

Coherence: Speckle and Coherence. Detection: Fiber Sensors. Holography, Techniques: Computer-Generated Holograms; Digital Holography; Holographic Interferometry. Imaging: Adaptive Optics; Interferometric Imaging; Wavefront Sensors and Control (Imaging Through Turbulence). Interferometry: Phase Measurement Interferometry; White Light Interferometry; Gravity Wave Detection. Microscopy: Interference Microscopy. Tomography: Optical Coherence Tomography.

Further Reading

Candler C (1951) Modern Interferometers. London: Hilger and Watts.
Francon M (1966) Optical Interferometry. New York: Academic Press.
Hariharan P (1991) Selected papers on interferometry. SPIE, vol. 28.
Hariharan P (1992) Basics of Interferometry. San Diego: Academic Press.
Hariharan P (2003) Optical Interferometry. San Diego: Academic Press.
Hariharan P and Malacara D (1995) Selected papers on interference, interferometry, and interferometric metrology. SPIE, vol. 110.
Malacara D (1990) Selected papers on optical shop metrology. SPIE, vol. 18.
Malacara D (1992) Optical Shop Testing. New York: Wiley.
Malacara D (1998) Interferogram Analysis for Optical Testing. New York: Marcel Dekker.
Robinson D and Reid G (1993) Interferogram Analysis. Bristol, UK: Institute of Physics.
Steel WH (1985) Interferometry. Cambridge: Cambridge University Press.
Tolansky S (1973) An Introduction to Interferometry. New York: Longman.


Gravity Wave Detection

N Christensen, Carleton College, Northfield, MN, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

In the latter part of the nineteenth century Albert Michelson performed extraordinary experiments that shook the foundations of the physics world. Michelson's precise determination of the speed of light was an accomplishment that takes great skill to reproduce even today. Edward Morley teamed up with Michelson to measure the velocity of the Earth with respect to the aether. The interferometer that they constructed was exquisite, and through amazing experimental technique the existence of the aether was disproved. The results of Michelson and Morley led to a revolution in physics, and provided evidence that helped Albert Einstein to develop the general theory of relativity. Now the Michelson interferometer may soon provide dramatic confirmation of Einstein's theory through the direct detection of gravitational radiation. An accelerating electric charge produces electromagnetic radiation: light. It should come as no surprise that an accelerating mass produces the gravitational analog of light, namely gravitational radiation. In 1888 Heinrich Hertz had the luxury of producing and detecting electromagnetic radiation in his laboratory. There will be no such luck with gravitational radiation, because gravity is an extremely weak force. Albert Einstein postulated the existence of gravitational radiation in 1916, and in 1989 Joe Taylor and Joel Weisberg indirectly confirmed its existence through observations of the orbital decay of the binary pulsar PSR 1913+16. Direct detection of gravity waves will be difficult. For humans to have any chance of detecting gravity waves there must be extremely massive objects accelerating up to relativistic velocities. The only practical sources are astrophysical: supernovae, pulsars, neutron star or black hole binary systems, black hole formation, or even the Big Bang. The observation of these types of events would be extremely significant for contributing to knowledge in astrophysics and cosmology. Gravity waves from the Big Bang would provide information about the Universe at its earliest moments. Observations of supernovae will yield a gravitational snapshot of these extreme cataclysmic events. Pulsars are neutron stars that can spin on their axes at frequencies in the

hundreds of hertz, and the signals from these objects will help to decipher their characteristics. Gravity waves from the final stages of coalescing binary neutron stars could help to accurately determine the size of these objects and the equation of state of nuclear matter. The observation of black hole formation from these binary systems would be the coup de grâce for the debate on the existence of black holes, and the ultimate triumph for general relativity. Electromagnetic radiation has an electric field transverse to the direction of propagation, and a charged particle interacting with the radiation will experience a force. Similarly, gravitational radiation will produce a transverse force on massive objects, a tidal force. Explained via general relativity, it is more accurate to say that gravitational radiation will deform the fabric of space-time. Just like electromagnetic radiation, there are two polarizations for gravity waves. Let us imagine a linearly polarized gravity wave propagating in the z-direction, h(z, t) = h₀₊ e^{i(kz−ωt)}. The fabric of space is stretched due to the strain created by the gravity wave. Consider a length L₀ of space along the x-axis. In the presence of the gravity wave the length oscillates like

L(t) = L₀ + (h₀₊/2) L₀ cos ωt

hence there is a change in its length of

ΔL_x = (h₀₊/2) L₀ cos ωt

A similar length L₀ of the y-axis oscillates like

L(t) = L₀ − (h₀₊/2) L₀ cos ωt

or

ΔL_y = −(h₀₊/2) L₀ cos ωt

One axis stretches while the perpendicular one contracts, and then vice versa, as the wave propagates through. The other polarization (h₀ₓ) produces a strain on axes 45° from (x, y). Imagine some astrophysical event produces a gravity wave that has amplitude h₀₊ on Earth; in order to detect a small distance displacement ΔL one should have a detector that spans a large length L. A supernova within our own Galaxy might possibly produce a gravity wave of size h ∼ 10⁻¹⁸ with characteristic frequencies around


1 kHz, but the occurrence of an event such as this will be extremely rare. More events will come from novae in other galaxies, but due to the great distances, and the fact that the magnitude of the gravity wave falls off as 1/r, such events will be substantially diminished in magnitude. A Michelson interferometer, with arms aligned along the x- and y-axes, can measure small phase differences between the light in the two arms. Therefore, this type of interferometer can turn the length variations of the arms produced by a gravity wave into changes in the interference pattern of the light exiting the system. This was the basis of the idea from which modern laser interferometric gravitational radiation detectors have evolved. Imagine a gravity wave of amplitude h is incident on an interferometer. The change in the arm length will be h ∼ ΔL/L₀, so in order to optimize the sensitivity it is advantageous to make the interferometer arm length L₀ as large as possible. Detectors coming on-line are attempting to measure distance displacements that are of order ΔL ∼ 10⁻¹⁸ m or smaller, less than the size of an atomic nucleus! If the interferometers can detect such distance displacements, and directly detect gravitational radiation, it will be one of the most spectacular accomplishments in experimental physics. The first implementation of a laser interferometer to detect a gravity wave was by Forward, who used earphones to listen to the motion of the interference signal. The engineering of signal extraction for modern interferometers is obviously far more complex. Numerous collaborations are building and operating advanced interferometers in order to detect gravitational radiation. In the United States there is the Laser Interferometer Gravitational-Wave Observatory (LIGO), which consists of two 4-km interferometers located in Livingston, Louisiana and Hanford, Washington. In addition, there is a further 2-km interferometer within the vacuum system at Hanford. An Italian–French collaboration (VIRGO) has a 3-km interferometer near Pisa, Italy. GEO, a German–British collaboration, is a 600-m detector near Hanover, Germany. TAMA is the Japanese 300-m interferometer in Tokyo. The Australians (AIGO) are constructing a 500-m interferometer in Western Australia. All of the detectors will be attempting to detect gravity waves with frequencies from about 50 Hz to 1 kHz. As will be described below, there are a number of terrestrial noise sources that will inhibit the performance of the interferometric detectors. The sensitivity of detection increases linearly with interferometer arm length, which implies that there could be advantages to constructing a gravity wave detector

in space. This is the goal of the Laser Interferometer Space Antenna (LISA) collaboration. The plan is to deploy three satellites in a heliocentric orbit with a separation of about 5 × 10⁶ km. The launch for LISA is planned for around 2015, and the detector will be sensitive to gravity waves within the frequency range of 10⁻³ to 10⁻¹ Hz. Due to the extremely long baseline, LISA is not strictly an interferometer, as most light will be lost as the laser beams expand while traveling such a great distance. Instead, the phase of the received light will be detected and used to lock the phase of the light that is re-emitted by another laser. Joe Weber pioneered the field of gravitational radiation detection in the 1960s. He used a 1400-kg aluminum cylinder as an antenna. Gravitational waves would hopefully excite the lowest-order normal mode of the bar. Since the gravity wave is a strain on space-time, experimentalists win by making their detectors longer in length. This provides a long-term advantage for laser interferometers, which can scale in length relatively easily. However, as of 2002, interferometers and bars (now cooled to 4 K) have comparable sensitivities.
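As a quick numerical illustration of the relation h ∼ ΔL/L₀ given above (a minimal sketch; the strain amplitude is an assumed design-sensitivity-scale value, not a prediction):

# Peak arm-length change for a linearly polarized gravity wave,
# Delta L = (h0/2) * L0 (from the Delta L_x expression above).
h0 = 1e-21    # strain amplitude, assumed design-sensitivity scale
L0 = 4.0e3    # interferometer arm length in m (LIGO)

dL = 0.5 * h0 * L0
print(f"peak arm-length change: {dL:.1e} m")   # ~2e-18 m

This is the origin of the ΔL ∼ 10⁻¹⁸ m displacement scale quoted earlier for kilometer-long arms.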

Interferometer Styles

The Michelson interferometer is the tool to be used to detect a gravitational wave. Figure 1 shows a basic system. The beamsplitter and the end mirrors would be suspended by wires, and effectively free to move in the plane of the interferometer. The arms have lengths L₁ and L₂ that are roughly equal, on a kilometer scale. With a laser of power P and wavelength λ incident on the beamsplitter, the light

Figure 1 A basic Michelson interferometer. The photodetector receives light exiting the dark port of the interferometer and hence the signal.


exiting the dark port of the interferometer is

P_out = P sin²[(2π/λ)(L₁ − L₂)]

The interferometer operates with the condition that in the absence of excitation the light exiting the dark port is zero. This would be the case for a simple and basic interferometer. However, interferometers like those in LIGO will be more sophisticated, and will use a heterodyne detection strategy. If E₀ is the amplitude of the electric field from the laser, and assuming the use of a 50–50 beamsplitter, the electric field (neglecting unimportant common phase shifts) for the light incident on the photodetector would be

E_out = (E₀/2)(e^{iδφ₁} − e^{iδφ₂}) ≈ i(E₀/2)(δφ₁ − δφ₂) = iE₀(2π/λ)(L₁ − L₂)

The laser light will be phase modulated at a frequency in the MHz regime. As such, the demodulated current from the photodiode that detects light at the interferometer output will be proportional to the phase acquired by the light, namely I ∝ (2π/λ)(L₁ − L₂). A gravity wave of optimal polarization normally incident upon the interferometer plane will cause one arm to decrease in length while the other increases. The Michelson interferometer acts as a gravity wave transducer; the change in arm lengths results in more light exiting the interferometer dark port. The mirrors in the interferometer are suspended via wires so that they are free to move under the influence of the gravity wave. An interferometer's sensitivity increases with arm length, but geographical and financial constraints will limit the size of the arms. If there were some way to bounce the light back and forth to increase the effective arm length, it would increase the detector performance. Fabry–Perot cavities on resonance have a light storage time of 2L/[c(1 − √(R₁R₂))]. Figure 2 shows the system of a Michelson interferometer with Fabry–Perot cavities. This gravity wave interferometer design was proposed and tested in the late 1970s by Ron Drever. The far mirror R₂ has a very high reflectivity (R₂ ≈ 1) in order to ultimately direct the light back towards the beamsplitter. The front mirror reflectivity R₁ is such that LIGO's effective arm length increases to L ∼ 300 km. The optical properties of the mirrors of the Fabry–Perot cavities must be exquisite in order to achieve success. LIGO's mirrors were tested, and the root mean squared surface uniformity is less than 1 nm, scattered light is less than 50 parts per million (ppm), absorption is

Figure 2 A Michelson interferometer with Fabry–Perot cavities in each arm. The front cavity mirrors have reflectivity R₁ while the end mirrors have R₂ ≈ 1. By using Fabry–Perot cavities LIGO will increase the effective arm length by a factor of 75.

Figure 3 A picture of a mirror and test mass for LIGO. The fused silica component is 10.7 kg in mass and 25 cm in diameter. Photograph courtesy of Caltech/LIGO.

less than 2 ppm, and the radii of curvature of the mirrors are matched to less than 3%. A LIGO test mass (and therefore a Fabry–Perot mirror) can be seen in Figure 3. In 1887 Michelson and Morley, with their interferometer, had a sensitivity that allowed the measurement of 0.02 of a fringe, or about 0.126 radian. Prototype interferometers constructed by the LIGO science team have already demonstrated a phase noise spectral density of φ(f) = 10⁻¹⁰ radian/√Hz for frequencies above 500 Hz. Assuming a 1 kHz signal with 1 kHz bandwidth, this implies a phase sensitivity of Δφ = 3.2 × 10⁻⁹ radian. This is about the phase sensitivity that LIGO hopes to accomplish in the 4-km Fabry–Perot system. There has been quite an evolution in interferometry since Michelson's time. The noise sources that inhibit the interferometer performance are discussed below.
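The storage-time expression above is easy to evaluate; in this minimal sketch the front-mirror reflectivity R₁ = 0.947 is an assumed value, chosen only so that the result reproduces the ≈300 km effective arm length quoted above:

import math

# Fabry-Perot storage time, tau = 2L / (c * (1 - sqrt(R1*R2))),
# and the corresponding effective (stored) path length c*tau.
c = 2.998e8          # m/s
L = 4.0e3            # LIGO arm length in m
R1, R2 = 0.947, 1.0  # R1 assumed for illustration; R2 ~ 1 as in the text

tau = 2.0 * L / (c * (1.0 - math.sqrt(R1 * R2)))
print(f"storage time ~ {tau*1e3:.1f} ms, stored path ~ {c*tau/1e3:.0f} km")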


Figure 4 A power recycled Michelson interferometer with Fabry–Perot cavities in each arm. Normally light would exit the interferometer through the light port and head back to the laser. Installation of the recycling mirror with reflectivity Rr sends the light back into the system. A Fabry–Perot cavity is formed between the recycling mirror and the first mirror (R1) of the arms. For LIGO this strategy will increase the power circulating in the interferometer by a factor of 50.

Figure 5 A signal recycled and power recycled Michelson interferometer with Fabry– Perot cavities in each arm. Normally light containing the gravity wave signal would exit the interferometer through the dark port and head to the photodetector. Installation of the signal recycling mirror with reflectivity Rs sends the light back into the system. The phase of the light acquired from the gravity wave will build up at a particular frequency determined by the reflectivity Rs.

However, let us consider one's ability to measure the relative phase between the light in the two arms. The Heisenberg uncertainty relation for light with phase φ and photon number N is ΔφΔN ∼ 1. For a measurement lasting a time t using laser power P and wavelength λ, the photon number is N = Pλt/hc, and with Poisson statistics describing the light, ΔN = √N = √(Pλt/hc). Therefore

ΔφΔN = (2πΔL/λ)√(Pλt/hc) = 1

implies that

ΔL = (1/2π)√(hcλ/(Pt))
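Plugging in representative numbers (a minimal sketch; the Nd:YAG-class wavelength is an assumption, while the ~10 W power is taken from the text):

import math

# Shot-noise-limited displacement sensitivity, DL = (1/2pi)*sqrt(h*c*lam/(P*t)).
h_planck = 6.626e-34   # J s
c = 2.998e8            # m/s
lam = 1.064e-6         # m, assumed Nd:YAG-class laser wavelength
P = 10.0               # W, laser power quoted in the text
t = 1.0                # s, example integration time

dL = math.sqrt(h_planck * c * lam / (P * t)) / (2.0 * math.pi)
print(f"DL ~ {dL:.1e} m")   # ~7e-17 m for these numbers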

With more light power the interferometer can measure smaller distance displacements and achieve better sensitivity. LIGO will use about 10 W of laser power, and will eventually work towards 100 W. However, there is a nice trick one can use to produce more light circulating in the interferometer, namely power recycling. Figure 4 displays the power recycling interferometer design. The interferometer operates such that virtually none of the light exits the interferometer dark port, and the bulk of the light returns towards the laser. An additional mirror, R_r in Figure 4, recycles the light. For LIGO, recycling will increase the effective light power by another factor of 50. The higher circulating light power therefore improves the sensitivity. There is one additional modification to the interferometer system that can further improve sensitivity, but only at a particular frequency. A further Fabry–Perot system can be made by installing what is called a signal recycling mirror; this would be mirror R_s in Figure 5. Imagine light in arm 1 of the interferometer that acquires phase as the arm expands due to the gravity wave. The traveling gravity wave's oscillation will subsequently cause arm 1 to contract while arm 2 expands. If the light that was in arm 1 could be sent to arm 2 while it is expanding, then the beam would acquire additional phase. This process could be repeated over and over. Mirror R_s serves this purpose, with its reflectivity defining the storage time for light in each interferometer arm. The storage time defined by the cavity formed by the signal recycling mirror, R_s, and the mirror at the front of the interferometer arm cavity, R₁, determines the resonance frequency. Signal recycling will give a substantial boost to interferometer sensitivity at a particular frequency, and will eventually be implemented in all the main ground-based interferometric detectors. The LIGO interferometers are infinitely more complex than the relatively simple systems displayed in the figures of this article. Figure 6 presents an aerial view of the LIGO site at Hanford, Washington. The magnitude of the 4-km system is apparent.


Figure 6 Aerial view of the LIGO Hanford, Washington site. The vacuum enclosure at Hanford contains both 2-km and 4-km interferometers. Photograph courtesy of Caltech/LIGO.

Noise Sources and Interferometer Sensitivity

If the interferometers are to detect distance displacements of order ΔL ∼ 10⁻¹⁸ m then they must be isolated from a host of deleterious noise sources. Seismic disturbances should not shake the interferometers. Thermal excitation of components will affect the sensitivity of the detector and should be minimized. The entire interferometer must be in an adequate vacuum in order to avoid fluctuations in gas density that would cause changes in the index of refraction and hence modification of the optical path length. The laser intensity and frequency noise must be minimized. The counting statistics of photons influences accuracy. If ever there was a detector that must avoid Murphy's law this is it; little things going wrong cannot be permitted if such small distance displacements are to be detected. The expected noise sensitivity for the initial LIGO interferometers is displayed in Figure 7. In the best of all worlds the interferometer sensitivity will be limited by the counting statistics of the photons. A properly functioning laser will have its photon number described by Poisson statistics, or shot noise; if the mean number of photons arriving per unit time is N then the uncertainty is ΔN = √N, which, as noted above, implies an interferometer displacement sensitivity of

ΔL = (1/2π)√(hcλ/(Pt))

or a noise spectral density of

ΔL(f) = (1/2π)√(hcλ/P)   (in units of m/√Hz)

Figure 7 The target spectral density of the noise for the initial LIGO system. LIGO will be dominated by seismic noise at low frequencies (10–100 Hz), thermal noise (from the suspension system and internal modes within the mirrors) in the intermediate regime (100–300 Hz), and photon shot noise thereafter. Other sources of noise are also noted, specifically gravity gradients (gravitational excitation of masses from the seismic motion of the ground), radiation pressure of photons on the mirrors, stray (scattered) light, and index of refraction fluctuations from residual gas in the vacuum.

Note also that the sensitivity increases as the light power increases. The reason for this derives from the statistics of repeated measurements. The relative lengths of the interferometer arms could be measured, once, by a single photon. However, the relative positions are measured repeatedly with every photon from the laser, and the uncertainty of the mean decreases as $\sqrt{N}$, where $N$ is the number of measurements (or photons) involved. The uncertainty in the difference of the interferometer arm lengths is therefore inversely proportional to the square root of the photon number, and hence of the laser power. In terms of strain sensitivity this would imply

$$h(f) = \frac{1}{2\pi L}\sqrt{\frac{hc\lambda}{P}} \quad \left(\text{in units of } 1/\sqrt{\mathrm{Hz}}\right)$$

This assumes the light just travels down the arm and back once. With Fabry–Perot cavities the light is stored, and the typical photon takes many trips back and forth before exiting the system. In order to maximize light power the end mirrors are made highly reflecting ($R_2 \approx 1$), and the strain sensitivity is improved to

$$h(f) = \frac{1}{2\pi L'}\sqrt{\frac{hc\lambda}{P}}$$

where $L' = 4L/(1 - R_1)$. As the frequency of gravity waves increases the detection sensitivity will decrease. If the gravity wave causes the interferometer arm length to increase, then decrease, while the photons are still in the arm cavity, then the phase acquired from the gravity wave will be washed away. This is the reason why interferometer sensitivity decreases as frequency increases, and explains the high-frequency behavior seen in Figure 7. Taking this into account, the strain sensitivity is

$$h(f) = \frac{1}{2\pi L'}\sqrt{\frac{hc\lambda}{P}}\left[1 + \left(\frac{2\pi L' f}{c}\right)^2\right]^{1/2}$$

where $L' = 4L/(1 - R_1)$ and $f$ is the frequency of the gravity wave.

If the gravitational wave is to change the interferometer arm length then the mirrors that define the arm must be free to move. In systems like LIGO wires suspend the mirrors; each mirror is like a pendulum. While allowing the mirrors to move under the influence of the gravity wave is a necessary condition, the pendulum itself is the first component of an elaborate vibration isolation system. Seismic noise will be troublesome for the detector at low frequencies. The spectral density of the seismic noise is about $(10^{-7}/f^2)\ \mathrm{m}/\sqrt{\mathrm{Hz}}$ for frequencies above 1 Hz. A simple pendulum, by itself, acts as a motion-filtering device. Above its resonance frequency the pendulum filters motion with a transfer function $T(f) \propto (f_0/f)^2$. Detectors such as LIGO will have pendulums with resonant frequencies of about $f_0 \approx 1$ Hz, thus providing an isolation factor of $10^4$ when looking for signals at $f = 100$ Hz. The various gravity wave detector collaborations have different vibration isolation designs. The mirrors in these interferometers will be suspended in elaborate vibration isolation systems, which may include multiple pendulums and isolation stacks. Seismic noise will be the limiting factor for interferometers seeking to detect gravity waves in the tens of hertz range, as can be seen in the sensitivity curve presented in Figure 7.

Due to the extremely small distance displacements that these systems are trying to detect, it should come as no surprise that thermal noise is a problem. This noise enters through a number of components in the system. The two most serious thermal noise sources are the wires suspending the mirrors in the pendulum, and the mirrors themselves. Consider the wires: there are a number of modes that can oscillate (i.e. violin modes). At temperature T each mode will have energy of kBT, but distributed over a band of frequencies determined by the quality factor (or Q) of the material. Low-loss (or high-Q) materials work best; for the violin modes of the wires there will be much noise concentrated at particular frequencies (in the hundreds of hertz). The mirror is a cylindrical object, which will have normal modes of oscillation that can be thermally excited. The first generation of LIGO will have these masses composed of fused silica, which is typical for optical components. The Qs for the internal modes are greater than $2 \times 10^6$. A switch may eventually be made to sapphire mirrors, which have better thermal properties. The limitation to the interferometers' sensitivity due to the thermal noise internal to the mirrors can be seen in Figure 7, and will be a worrying noise source for the first-generation LIGO in the frequency band around 100–400 Hz.

The frequency noise of the laser can couple into the system to produce length displacement sensitivity noise in the interferometer. With arm lengths of 4 km, it will be impossible to hold the length of the two arms absolutely equal. The slightly differing arm spans will mean that the light sent back from each of the two Fabry–Perot cavities will have slightly differing phases. As a consequence, great effort is made to stabilize the frequency of the light entering the interferometer. The LIGO laser can be seen in Figure 8. The primary laser is a Lightwave Model 126 nonplanar ring oscillator. High power is generated from this stabilized laser through the use of optical amplifiers. The beam is sent through four optical amplifiers, and then retro-reflected back

Figure 8 A picture of the Nd:YAG laser and amplifier system that produces 10 W of light for LIGO. Photograph courtesy of Caltech/LIGO.


Figure 9 The optical system for the LIGO system. Light moves from the frequency-stabilized laser to a mode cleaner, and then to the interferometer. Presently 8 W of TEM00 laser power is delivered to the interferometer.

through the amplifiers again. For LIGO, the laser is locked and held to a specific frequency by use of signals from a reference cavity, a mode cleaner cavity, and the interferometer. For low-frequency stabilization the temperature of the ring oscillator is adjusted. At intermediate frequencies adjustment is made by signals to a piezo-electric transducer within the ring oscillator cavity. At high frequencies the noise is reduced with the use of an electro-optic crystal. The LIGO lasers currently have a frequency noise of $2 \times 10^{-2}\ \mathrm{Hz}/\sqrt{\mathrm{Hz}}$ at frequencies above 1 kHz. It will prove important to worry about the stability of the laser power for the interferometric detectors. The hope is to be shot noise limited at frequencies above a few hundred hertz. The Nd:YAG power amplifiers used are pumped with an array of laser diodes, so the light power is controlled through feedback to the laser diodes. The LIGO requirements for the fluctuations on the power $P$ are $\Delta P/P < 10^{-8}/\sqrt{\mathrm{Hz}}$. The spatial quality of the light is ensured through the use of a mode-cleaning cavity. LIGO uses a triangular array of mirrors separated by 15 m. LIGO's input optical system can be seen in Figure 9. The current LIGO optical system yields 8 W of 1.06 µm light in the TEM00 mode.

Conclusion

The attempt to measure gravitational radiation with laser interferometers could possibly be the most difficult optical experiment of our time. Over a hundred years ago Michelson succeeded in carrying off experiments of amazing difficulty as he measured the speed of light and disproved the existence of the aether. Gravity wave detection is an experiment worthy of Michelson, and there are hundreds of physicists striving to make it a reality. Great success has already been achieved. The TAMA 300-m interferometer is operational and has achieved a sensitivity of $h(f) = 5 \times 10^{-21}/\sqrt{\mathrm{Hz}}$ in the 700 Hz to 1.5 kHz frequency band. The LIGO

interferometers are now operating, and scientists are presently de-bugging the system in order to achieve the target sensitivity. The first scientific data taking for LIGO commenced in 2002, and is continuing at the time of writing. This will be a new telescope to peer into the heavens. With every new means of looking at the sky there have come unexpected discoveries. Physicists do know that there will be signals that they can predict. Binary systems of compact objects (neutron stars or black holes) will produce chirp signals that may be extracted by matched filtering techniques. A supernova will produce a burst that will hopefully rise above the noise. Pulsars, or neutron stars spinning about their axes at rates sometimes exceeding hundreds of revolutions per second, will produce continuous sinusoidal signals that can be seen by integrating for sufficient lengths of time. Gravity waves produced by the Big Bang will produce a stochastic background that can possibly be extracted by correlating the outputs from two or more detectors. These are exciting physics results that will come through tremendous experimental effort.

See also

Imaging: Interferometric Imaging. Interferometry: Overview.

Further Reading

Barish BC (1997) Gravitational wave detection. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 155–161. Tokyo: Universal Academic.
Blair DG, Taniwaki M, Zhao CN, et al. (1997) Progress in the development of technology for advanced laser interferometric gravitational wave detectors. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 75–94. Tokyo: Universal Academic.
Brillet A (1997) Virgo status report. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 163–173. Tokyo: Universal Academic.
Danzmann K (1997) LISA: A gravitational wave observatory in heliocentric orbit. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 111–116. Tokyo: Universal Academic.
Drever RWP, Ford GM, Hough J, et al. (1980) A gravity-wave detector using optical cavity sensing. In: Schmutzer E (ed.) Proceedings of the Ninth International Conference on General Relativity and Gravitation, pp. 265–267. Berlin: Springer.
Forward RL (1978) Wideband laser-interferometer gravitational-radiation experiment. Physical Review D 17: 379.


Hough J and the GEO 600 Team (1997) GEO 600: current status and some aspects of the design. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 175–182. Tokyo: Universal Academic.
Robertson NA (2000) Laser interferometric gravitational wave detectors. Classical and Quantum Gravity 17: R19–R40.
Saulson PR (1994) Fundamentals of Interferometric Gravitational Wave Detectors. Singapore: World Scientific.
Taylor JH and Weisberg JM (1989) Further experimental tests of relativistic gravity using the binary pulsar PSR 1913+16. Astrophysical Journal 345: 434–450.
Thorne KS (1987) Gravitational radiation. In: Hawking S and Israel W (eds) 300 Years of Gravitation, pp. 330–458. Cambridge, UK: Cambridge University Press.
Tsubono K and the TAMA collaboration (1997) TAMA project. In: Tsubono K, Fujimoto MK and Kuroda K (eds) Gravitational Wave Detection, pp. 183–191. Tokyo: Universal Academic.
Weiss R (1972) Electromagnetically coupled broadband gravitational antenna. MIT Quarterly Progress Report no. 105.

Phase-Measurement Interferometry

K Creath, Optineering and University of Arizona, Tucson, AZ, USA
J Schmit, Veeco Instruments, Inc., Tucson, AZ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Phase-measurement interferometry is a way to measure information encoded in interference patterns generated by an interferometer. Fringe patterns (fringes) created by interfering beams of light are analyzed to extract quantitative information about an object or phenomenon. These fringes are localized somewhere in space and require a certain degree of spatial and temporal coherence of the source to be visible. Before these techniques were developed in the late 1970s and early 1980s, fringe analysis was done either by estimating fringe deviation and irregularity by eye or by manually digitizing the centers of interference fringes using a graphics tablet. Digital cameras and desktop computers have made it possible to easily obtain quantitative information from fringe patterns. The techniques described in this article are independent of the type of interferometer used. Interferometric techniques using fringe analysis can measure features as small as a micron wide or as large as a few meters. Measurable height ranges vary from a few nanometers up to tens of millimeters, depending upon the interferometric technique employed. Measurement repeatability is very good: repeatability of 1/100 of a fringe rms is easily obtainable, while 1/1000 rms is possible. Actual measurement precision depends on what is being measured and how the technique is implemented, while accuracy depends upon comparison to a sanctified standard.

There are many different types of applications for fringe analysis. For example, optical surface quality can be determined using a Twyman–Green or Fizeau interferometer. In addition, wavefront quality measurements of a source or an optical system can be made in transmission, and the index of refraction and homogeneity of optical materials can be mapped out. Many nonoptical surfaces can also be measured. Typically, surface topography information at some specific spatial frequency scale is extracted. These measurements are limited by the resolution of the optical system and the field of view of the imaging system. Lateral and vertical dimensions can also be measured. Applications for nonoptical surfaces include disk and wafer flatness, roughness measurement, and distance and range sensing. Phase measurement techniques used in holographic interferometry, TV holography, speckle interferometry, moiré, grating interferometry, and fringe projection are used for nondestructive testing to measure surface structure as well as displacements due to stress and vibration. This article outlines the basics of phase measurement interferometry (PMI) techniques as well as the types of algorithms used. The bibliography lists a number of references for further reading.

Background

Basic Parts of a Phase-Measuring Interferometer

A phase-measuring interferometer consists of a light source, an illumination system (providing uniform illumination across the test surface), a beamsplitter (usually a cube or pellicle so both beams have the same optical path), a reference surface (which needs to be of high quality because these techniques measure the difference between the reference and test surfaces), a sample


fixture, an imaging system (which images a plane in space where the surface is located onto the camera), a camera (usually a monochrome CCD), an image digitizer/frame grabber, a computer system, software (to control the measurement process and calculate the surface map), and often a spatial or temporal phase shifter to generate multiple interferograms.

Steps of the Measurement Process

To generate a phase map: a sample is placed on the fixture and aligned; illumination levels are adjusted; the sample image is focused onto the camera; the fringes are adjusted for maximum contrast; the phase shifter or fringe spacing is adjusted or calibrated as necessary; a number of images is obtained and stored with the appropriate phase differences; and the optical path difference (OPD) is calculated as the modulo 2π phase and then unwrapped at each pixel to determine the phase map. To make consistent measurements, some interferometers need to be on vibration isolation systems and away from heavy airflows or possible acoustical coupling. It helps to cover any air paths that are longer than a few millimeters. Consideration also needs to be given to consistency of temperature and humidity. The human operating the interferometer is also a factor in the measurement. Does this person always follow the same procedure? Are the measurements sampled consistently? Are the test surfaces clean? Is the sample aligned the same and correctly? Many different factors can affect a measurement. To obtain repeatable measurements it is important to have a consistent procedure and regularly verify measurement consistency.

Common Interferometer Types

One of the most common interferometers used in optical testing is the Twyman–Green interferometer (Figure 1). Typically, a computer controls a mirror pushed by a piezo-electric transducer (PZT). The test surface is imaged onto the camera and the computer has a frame grabber that takes frames of fringe data.

The most common commercially available interferometer for optical testing is the Fizeau interferometer (Figure 2). This versatile instrument is very insensitive to vibrations due to the large common path for both interfering wavefronts. The reference surface (facing the test object) is moved by a PZT to provide the phase shift.

Interference microscopes (see Microscopy: Interference Microscopy. Interferometry: White Light Interferometry) are used for looking at surface roughness and small structures (see Figure 3). These instruments can employ Michelson, Mirau, and Linnik interference objectives with a laser or white light source, or Fizeau-type interference objectives that typically use a laser source because of the unequal paths in the arms of the interferometer. The phase shift is accomplished by moving the sample, the reference surface, or parts of the objective relative to the sample. Figure 3 shows a schematic of a Mirau-type interferometric microscope for phase measurement.

Figure 2 Fizeau interferometer.

Figure 1 Twyman–Green interferometer.

Figure 3 Mirau interference microscope.


Determination of Phase

The Interference Equation

Interference fringes from a coherent source (e.g., a laser) are theoretically sinusoidal (Figure 4a), while fringes from an incoherent source (e.g., white light) are localized in a wavepacket at a point in space (Figure 4b) where the optical paths of the arms of the interferometer are equal, marked in Figure 4 as scanner position z = 0 (see Interferometry: White Light Interferometry). For generality, this analysis considers the determination of phase within the wave packet for fringes localized in space. Interference fringes at any point in the wavepacket can be written in the following form:

$$I(x,y) = I_0\{1 + \gamma(x,y,z)\cos[\phi(x,y)]\} \qquad [1]$$

where $I_0$ is the dc irradiance, $\gamma$ is the fringe visibility (or contrast), $2I_0\gamma$ is the modulation (irradiance amplitude, or the ac part of the signal), and $\phi$ is the phase of the wavefront, as shown in Figure 5. For simplicity, this drawing assumes that the interference fringe amplitude is constant. Fringe visibility can be determined by calculating

$$\gamma = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} \qquad [2]$$

where $I_{\max}$ is the maximum value of the irradiance for all phase values and $I_{\min}$ is the minimum value. The fringe visibility has a real value between 0 and 1 and varies with the position along the wavepacket. The fringe visibility, as defined here, is the real part of the complex degree of coherence.

Types of Phase Measurement Techniques

Phase can be determined from either a number of interference fringe patterns or from a single interferogram with appropriate fringe spacing. Temporal techniques require an applied phase shift between the

test and reference beams as a function of time while multiple frames of interference fringe data are obtained. Spatial techniques can obtain data from a single interferogram that requires a carrier pattern of almost straight fringes, either to compare phases of adjacent pixels or to separate orders while performing operations in the Fourier domain. Spatial techniques may also simultaneously record multiple interferograms, spatially separated, with appropriate relative phase-shift differences. Multiple-frame techniques require more data than single-frame techniques. Temporal techniques require that the interference fringes be stable over the time period it takes to acquire the number of images. While single-frame spatial techniques require less data and can be done with a single image, they generally have reduced resolution and less precision than temporal techniques. There are literally hundreds of algorithms and techniques for extracting phase data from interference fringe data. The references listed in the Further Reading offer more details of these techniques.

Temporal Phase Measurement

Temporal techniques use data taken as the relative phase between the test and reference beams is modulated (shifted). The phase (or OPD) is calculated at each measured point in the interferogram. As the phase shifter is moved, the phase at a single point in the interferogram changes. The effect looks like the fringes are moving across the interferogram, and because of this these techniques are sometimes called fringe scanning or fringe shifting techniques. However, the fringes are not really moving; rather, the irradiance at a single detector point is changing (ideally sinusoidally) in time (see Figure 6). A 180° or π phase shift causes a bright fringe to become a dark fringe.

Phase Modulation Techniques

There are many ways to introduce a phase modulation (or shift). These include moving a mirror or the

Figure 4 Interference fringes for coherent and incoherent sources as observed at a point (x, y); z = 0 corresponds to equal optical path lengths of the reference and object beams.


sample, tilting a glass plate, moving a diffraction grating, rotating a polarizer, analyzer, or half-wave plate, using a two-frequency (Zeeman) laser source, modulating the source wavelength, or switching an acousto-optic Bragg cell or a magneto-optic/electro-optic cell. While any of these techniques can be used for coherent sources, special considerations need to be made for temporally incoherent sources (see Interferometry: White Light Interferometry). Figure 7 shows how moving a mirror introduces a relative phase shift between object and reference beams in a Twyman–Green interferometer.

Extracting Phase Information

Including the phase shift, the interference equation is written as

$$I(x,y) = I_0(x,y)\{1 + \gamma_0(x,y)\cos[\phi(x,y) + \alpha(t)]\} \qquad [3]$$

where $I(x,y)$ is the irradiance at a single detector point, $I_0(x,y)$ is the average (dc) irradiance, $\gamma_0(x,y)$ is the fringe visibility before detection, $\phi(x,y)$ is the phase of the wavefront being measured, and $\alpha(t)$ is the phase shift as a function of time.

Since the detector has to integrate for some finite time, the detected irradiance at a single point becomes an integral over the integration time $\Delta$ (Figure 8), where the average phase shift for the $i$th frame of data is $\alpha_i$:

$$I_i(x,y) = \frac{1}{\Delta}\int_{\alpha_i - \Delta/2}^{\alpha_i + \Delta/2} I_0(x,y)\{1 + \gamma_0(x,y)\cos[\phi(x,y) + \alpha(t)]\}\, \mathrm{d}\alpha(t) \qquad [4]$$

After integrating over $\alpha(t)$, the irradiance of the detected signal becomes

$$I_i(x,y) = I_0(x,y)\left\{1 + \gamma_0(x,y)\,\mathrm{sinc}\!\left(\frac{\Delta}{2}\right)\cos[\phi(x,y) + \alpha_i]\right\} \qquad [5]$$

Figure 5 Variable definitions for interference fringes.

where $\mathrm{sinc}(\Delta/2) = \sin(\Delta/2)/(\Delta/2)$, which reduces the detected visibility.

Ramping Versus Stepping

There are two different ways of shifting the phase: either the phase shift can be changed in a constant and linear fashion (ramping) (see Figure 8) or it can be stepped in increments. Ramping provides a continuous smooth motion without any jerking motion. This option may be preferred if the motion does not wash out the interference fringes. However, ramping requires good synchronization of the camera/digitizer and the modulator to get the correct OPD (or phase) changes between data frames. Ramping allows faster data taking but requires electronics that are more sophisticated. In addition, it takes a finite time for a mass to move linearly. When ramping, the first frame or two of data usually needs to be discarded because the shift is not correct until the movement is linear. The major difference between ramping and stepping the phase shift is a reduction in the modulation of the interference fringes after detection (the $\mathrm{sinc}(\Delta/2)$ term in eqn [5]). When the phase shift is stepped ($\Delta = 0$), the sinc term has a value of one. When the phase shift is ramped ($\Delta = \alpha$) for a phase

Figure 6 Interference fringe patterns corresponding to different relative phase shifts between test and reference beams.

Figure 7 Moving a mirror to shift relative phase by λ/4.

Figure 8 Change in OPD integrating over a Δ phase shift with sample separation α.


shift of $\alpha = 90° = \pi/2$, this term has a value of 0.9. Therefore, ramping slightly reduces the detected fringe visibility.
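The sinc reduction is easy to evaluate numerically; a minimal sketch (using NumPy, whose sinc is the normalized sin(πx)/(πx), hence the rescaling):

```python
import numpy as np

# Detected-visibility factor sinc(delta/2) = sin(delta/2)/(delta/2) from eqn [5].
# numpy.sinc(x) computes sin(pi*x)/(pi*x), so pass x = delta/(2*pi).
def sinc_reduction(delta):
    return np.sinc(delta / (2 * np.pi))

print(sinc_reduction(0.0))        # stepped shift (delta = 0): 1.0, no reduction
print(sinc_reduction(np.pi / 2))  # ramped pi/2 shift: ~0.90, as quoted above
```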

Signal Modulation

Phase measurement techniques assume that the irradiance of the interference fringes at the camera covers as much of the detector's dynamic range as possible and that the phase of the interference fringe pattern is modulating at each individual pixel as the phase between beams is modulated. If the irradiance at a single detector point does not modulate as the relative phase between beams is shifted, the height of the surface cannot be calculated. Besides the obvious reductions in irradiance modulation due to detector sampling and pixel size or a bad detector element, scattered light within the interferometer and defects or dirt on the test object can also reduce signal modulation. Phase measurement techniques are designed to take into account the modulation of the signal at each pixel. If the signal does not modulate enough at a given pixel, then the data at that pixel are considered unusable, flagged as 'bad', and often left blank. Phase values for these points may be interpolated from surrounding pixels if there are sufficient data.

Three-Frame Technique

The simplest method to determine phase uses three frames of data. With three unknowns, three sets of recorded fringe data are needed to reconstruct a wavefront providing a phase map. Using phase shifts of $\alpha_i = \pi/4$, $3\pi/4$, and $5\pi/4$, the three fringe measurements at a single point in the interferogram may be expressed as

$$I_1 = I_0\left[1 + \gamma\cos\left(\phi + \frac{\pi}{4}\right)\right] = I_0\left[1 + \frac{\sqrt{2}}{2}\gamma(\cos\phi - \sin\phi)\right] \qquad [6]$$

$$I_2 = I_0\left[1 + \gamma\cos\left(\phi + \frac{3\pi}{4}\right)\right] = I_0\left[1 + \frac{\sqrt{2}}{2}\gamma(-\cos\phi - \sin\phi)\right] \qquad [7]$$

$$I_3 = I_0\left[1 + \gamma\cos\left(\phi + \frac{5\pi}{4}\right)\right] = I_0\left[1 + \frac{\sqrt{2}}{2}\gamma(-\cos\phi + \sin\phi)\right] \qquad [8]$$

Note that the $(x,y)$ dependencies are still implied. The choice of the specific phase shift values is to make the math simpler. The phase at each detector point is

$$\phi = \tan^{-1}\left(\frac{I_3 - I_2}{I_1 - I_2}\right) \qquad [9]$$

In most fringe analysis techniques, we are basically trying to solve for terms such that we end up with the tangent function of the phase. The numerator and denominator shown above are respectively proportional to the sine and cosine of the phase. Note that the dc irradiance and fringe visibility appear in both the numerator and denominator. This means that variations in fringe visibility and average irradiance from pixel to pixel do not affect the results. As long as the fringe visibility and average irradiance at a single pixel are constant from frame to frame, the results will be good. If the different phase shifts are multiplexed onto multiple cameras, the results will be dependent upon the gain of corresponding pixels. Bad data points with low signal modulation are determined by calculating the fringe visibility at each data point using

$$\gamma = \frac{\sqrt{(I_3 - I_2)^2 + (I_1 - I_2)^2}}{\sqrt{2}\, I_0} \qquad [10]$$

It is simpler to calculate the ac signal modulation $(2I_0\gamma)$ and then set a threshold on the modulation at a typical value of about 5–10% of the dynamic range. If the modulation is less than this value at any given data point, the data point is flagged as bad. Bad points are usually caused by noisy pixels and can be due to scratches, pits, dust and scattered light. Although three frames of data are enough to determine the phase, this and other 3-frame algorithms are very sensitive to systematic errors due to nonsinusoidal fringes or nonlinear detection, phase shifter miscalibration, vibrations and noise. In general, the larger the number of frames of data used to determine the phase, the smaller the systematic errors.
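A minimal NumPy sketch of the three-frame calculation of eqns [9] and [10]; the 5–10% modulation threshold from the text is exposed as a parameter, and the function name and threshold default are illustrative assumptions:

```python
import numpy as np

def three_frame_phase(I1, I2, I3, threshold=0.05):
    """Wrapped phase (eqn [9]) and ac modulation from three frames
    taken at phase shifts of pi/4, 3*pi/4, and 5*pi/4."""
    num = I3 - I2                    # proportional to sin(phi)
    den = I1 - I2                    # proportional to cos(phi)
    phase = np.arctan2(num, den)     # modulo-2*pi phase in (-pi, pi]
    modulation = np.sqrt(2.0) * np.sqrt(num**2 + den**2)  # equals 2*I0*gamma
    # flag pixels whose modulation falls below a fraction of the dynamic range
    bad = modulation < threshold * max(I1.max(), I2.max(), I3.max())
    return np.where(bad, np.nan, phase), modulation
```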


Phase Unwrapping

The removal of phase ambiguities is generally called phase unwrapping, and is sometimes known as integrating the phase. The phase ambiguities owing to the modulo 2π arctangent calculation can simply be removed by comparing the phase difference between adjacent pixels. When the phase difference between adjacent pixels is greater than π, a multiple of 2π is added or subtracted to make the difference less than π. For the reliable removal of discontinuities, the phase must not change by more than π (λ/2 in optical path (OPD)) between adjacent pixels. Figure 9 shows an example of wrapped and unwrapped phase values. Given a trouble-free wrapped phase map, it is enough to search row by row (or column by column) for phase differences of more than π between neighboring pixels. However, fringe patterns usually are not perfect and are affected by systematic errors. Some of the most frequently occurring error sources are noise, discontinuities in the phase map, violation of the sampling theorem, and invalid data points (e.g., due to holes in the object or low modulation regions). Some phase maps are not easily unwrapped. In these cases different techniques, like wave packet peak sensing in white light interferometry, are used.
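For a trouble-free map, the row-by-row search described above reduces to the following sketch (equivalent to NumPy's built-in unwrap; real data usually need the more robust methods cited in the Further Reading):

```python
import numpy as np

def unwrap_row(wrapped):
    """Unwrap one row of modulo-2*pi phase by keeping neighboring
    differences below pi (same idea as numpy.unwrap)."""
    unwrapped = [wrapped[0]]
    offset = 0.0
    for prev, curr in zip(wrapped[:-1], wrapped[1:]):
        step = curr - prev
        if step > np.pi:        # a downward 2*pi wrap was crossed
            offset -= 2 * np.pi
        elif step < -np.pi:     # an upward 2*pi wrap was crossed
            offset += 2 * np.pi
        unwrapped.append(curr + offset)
    return np.array(unwrapped)
```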


Figure 10 Angles of illumination and viewing.

From Wavefront to Surface

Once the phase of the wavefront is known, surface shape can be determined from the phase map. Surface height $H$ of the test surface relative to the reference surface at a location $(x,y)$ is given by

$$H(x,y) = \frac{\phi(x,y)\,\lambda}{2\pi(\cos\theta + \cos\theta')} \qquad [11]$$

where $\lambda$ is the wavelength of illumination, and $\theta$ and $\theta'$ (the angles of illumination and viewing with respect to the surface normal) are shown in Figure 10. For interferometers (e.g., Twyman–Green or Fizeau) where the illumination and viewing angles are normal to the surface ($\theta = \theta' = 0$), the surface height is simply

$$H(x,y) = \frac{\lambda\,\phi(x,y)}{4\pi} \qquad [12]$$
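A one-line conversion following eqn [12], with an illustrative He-Ne wavelength (the function name is an assumption for this sketch):

```python
import numpy as np

def height_from_phase(phase, wavelength):
    """Surface height from unwrapped phase at normal incidence (eqn [12])."""
    return wavelength * phase / (4 * np.pi)

# One fringe (2*pi of phase) corresponds to a lambda/2 height step:
print(height_from_phase(2 * np.pi, 632.8))  # 316.4 (nm) for a He-Ne source
```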

Figure 9 Fringes, wrapped, and unwrapped phase maps.

Since the wavefront measured represents the relative difference between the interfering reference and test wavefronts, this phase map only directly corresponds to the surface under test when the reference wavefront is perfectly flat. In practice, the shape of the reference surface needs to be accounted for by measuring it using a known test surface and subtracting this reference measurement from subsequent measurements of the test surface.

Phase Change on Reflection

Phase shifting interferometry measures the phase of reflected light to determine the shape of objects. The reflected wavefront will represent the object surface (within a scaling factor) if the object is made of a single material. If the object is comprised of multiple materials that exhibit different phase changes on reflection, the measured wavefront needs to be corrected for these phase differences (see Interferometry: White Light Interferometry).

Overview of Phase Measurement Algorithms and Techniques

There are literally hundreds of published algorithms and techniques. The optimal algorithm depends on


the application. Most users prefer fast algorithms using a minimal amount of data that are as accurate and repeatable as possible, immune to noise, adaptable, and easy to implement. In practice, there are trade-offs that must be considered when choosing a specific algorithm or technique. This section provides an overview of the types of algorithms to aid the reader in sifting through published algorithms.

Synchronous Detection

One of the first techniques for temporal phase measurement utilized methods of communication theory to perform synchronous detection. To synchronously detect the phase of a noisy sinusoidal signal, the signal is first correlated (or multiplied) with sinusoidal and cosinusoidal reference signals (signals in quadrature) of the same frequency and then averaged over many periods of oscillation. This method of synchronous detection as applied to phase measurement can be extracted from the least squares estimation result when the phase shifts are chosen such that N measurements are equally spaced over one modulation period. With phase shifts $\alpha_i$ such that

$$\alpha_i = \frac{2\pi}{N}\, i, \quad \text{with } i = 1, \ldots, N \qquad [13]$$

the phase can be calculated from

$$\phi(x,y) = \tan^{-1}\left[\frac{\sum_i I_i(x,y)\sin\alpha_i}{\sum_i I_i(x,y)\cos\alpha_i}\right] \qquad [14]$$

Note that N can be any number of frames (or samples). The more frames of data, the smaller the systematic errors. This technique does not take large amounts of memory for a large number of frames, because only the running sums of the fringes multiplied by the sine and cosine of the phase shift need to be remembered. The 4-frame algorithm from Table 1 is an example of a direct adaptation of synchronous detection where simple values of 1, −1 or 0 for every π/2 phase shift can be assigned to the sine and cosine functions.
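A minimal sketch of eqns [13] and [14] for N equally spaced frames; the running sums keep memory use independent of N, and the function name is an assumption for this example:

```python
import numpy as np

def synchronous_phase(frames):
    """Wrapped phase by N-frame synchronous detection (eqns [13]-[14]).
    frames: N fringe images taken at phase shifts alpha_i = 2*pi*i/N."""
    N = len(frames)
    num = 0.0
    den = 0.0
    for i, I in enumerate(frames, start=1):  # running sums only
        alpha = 2 * np.pi * i / N
        num += I * np.sin(alpha)
        den += I * np.cos(alpha)
    return np.arctan2(num, den)
```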

Algorithm Design

In the last ten years, a lot of work has been done to generalize the derivation of fringe analysis algorithms. This work has enabled the design of algorithms for specific applications, which are insensitive to specific systematic error sources. Most algorithms use polynomials for the numerator and denominator. Given fringe data

$$I_i = I_0[1 + \gamma\cos(\phi + \alpha_i)] \qquad [15]$$

the phase is calculated using

$$\phi = \tan^{-1}\left[\frac{\sum_i n_i I_i}{\sum_i d_i I_i}\right] \qquad [16]$$

The numerator and denominator of the arctangent argument are both polynomials. The numerator is a sum proportional to the sine (imaginary part) and the denominator is a sum proportional to the cosine (real part):

$$\mathrm{num} = 2kI_0\gamma\sin\phi = \sum_i n_i I_i \qquad [17]$$

$$\mathrm{den} = 2kI_0\gamma\cos\phi = \sum_i d_i I_i \qquad [18]$$

where the constant $k$ depends on the values of the coefficients. From this the fringe visibility is given by

$$\gamma = \frac{\sqrt{(\mathrm{num})^2 + (\mathrm{den})^2}}{2kI_0} \qquad [19]$$

The coefficient vectors for the numerator and denominator are window functions. For an algorithm such as the 4-frame technique, the weights of all samples are equal, [1, 1, 1, 1]. This makes the coefficients all equal. For other algorithms, such as the 5-frame technique, the weights are larger on the middle frames than on the outer frames. The weights for the 5-frame technique are [1, 2, 2, 2, 1]. A property of the coefficient vectors is that the sum of the coefficients for each, the numerator and the denominator, should be zero. Examples of coefficient vectors for a few selected algorithms are given in Table 1.

Table 1 Sampling function weights for a few selected algorithms

N frames  Phase shift  Coefficients (numerator; denominator)
3         π/2          1, −1, 0;  0, 1, −1
4         π/2          0, −1, 0, 1;  1, 0, −1, 0
5         π/2          0, −2, 0, 2, 0;  1, 0, −2, 0, 1
7         π/3          √3 [0, 1, 1, 0, −1, −1, 0];  −1, −1, 1, 2, 1, −1, −1
8         π/2          1, 5, −11, −15, 15, 11, −5, −1;  1, −5, −11, 15, 15, −11, −5, 1
12        π/3          √3 [0, −3, −3, 3, 9, 6, −6, −9, −3, 3, 3, 0];  2, 1, −7, −11, −1, 16, 16, −1, −11, −7, 1, 2
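As an illustration of reading Table 1, the 5-frame (π/2-step) row translates directly into code; the sign convention of the resulting phase depends on the chosen phase origin, so treat this as a sketch:

```python
import numpy as np

def five_frame_phase(I):
    """Phase from five frames at pi/2 steps, using the Table 1 coefficients:
    numerator [0, -2, 0, 2, 0], denominator [1, 0, -2, 0, 1].
    Both coefficient sets sum to zero, as required."""
    num = -2 * I[1] + 2 * I[3]
    den = I[0] - 2 * I[2] + I[4]
    return np.arctan2(num, den)
```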


Heterodyne Interferometry

Historically, heterodyne techniques were developed and used before temporal phase-shifting techniques. These techniques generally determine the phase electronically at a single point by counting fringes and fractions of fringes. Areas are analyzed by scanning a detector. Phase shifts are usually obtained using two slightly different frequencies in the reference and test beams. The beat frequency produced by the interference between the reference and test beams is compared to a reference sinusoidal signal, which may be produced either optically or electronically. The time delay (or distance traveled) between the crossing of the zero phase points of the test and reference sinusoidal signals is a measure of the phase. Every time the test signal passes through another zero in the same direction as the test surface is moved, another fringe can be counted. This is how fringe orders are counted. If the beam is interrupted as the detector is scanned across the interferogram, the fringe count is corrupted and the measurement needs to be started again. Frequency multiplication (harmonics) can also be used to determine fractions of fringes. Today, heterodyne techniques are used mainly in distance measuring interferometers. The precision and accuracy of distance measuring interferometers are at least on the order of 1 part in $10^6$.

Fourier-Transform Technique

The Fourier-transform technique is a way to extract phase from a single interferogram. It is widely used in nondestructive testing and stellar interferometry, where it is difficult to get more than a single interferogram. The basic technique is shown schematically in Figure 11. The recorded interferogram distribution is Fourier transformed, and one order (usually the + or − first order) is either isolated and shifted to zero frequency or filtered out using a

rectangular window. After an inverse Fourier transform, the result is the phase. To illustrate this technique mathematically, the interference equation is rewritten as

$$I(x,y) = I_0(x,y) + c(x,y)\exp(i2\pi f_0 x) + c^*(x,y)\exp(-i2\pi f_0 x) \qquad [20]$$

where $c(x,y) = I_0(x,y)\,\gamma(x,y)\exp[i\phi(x,y)]$ and the $*$ indicates a complex conjugate. The term $c(x,y)$ contains the phase information we wish to extract. After performing a one-dimensional Fourier transform:

$$I(\xi,y) = I_0(\xi,y) + c(\xi - f_0, y) + c^*(\xi + f_0, y) \qquad [21]$$

where $\xi$ is the spatial frequency in the $x$ direction, and italics indicate Fourier transforms. The next step is to filter out and isolate the second term, and then inverse Fourier transform to yield $c(x,y)$. The wavefront modulo 2π is then given by

$$\phi(x,y) = \tan^{-1}\left\{\frac{\mathrm{Im}[c(x,y)]}{\mathrm{Re}[c(x,y)]}\right\} \qquad [22]$$

where Re and Im refer to the real and imaginary parts of the function. This technique has limitations. If the fringes are nonsinusoidal, there will not be a simple distribution in the frequency space; there will be many orders. Another problem is overlapping orders. There needs to be a carrier frequency present that ensures that the orders are separated in frequency space. This carrier frequency is produced by adding tilt fringes until the orders are separated. This means that the aberration (fringe deviation) has to be less than the fringe spacing. Another problem is aliasing. If the interferogram is not sampled sufficiently, there will be aliasing and it will not be possible to separate the orders in frequency space. Finally, large variations in average fringe irradiance and fringe visibility across the interferogram can also cause problems.
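A compact sketch of eqns [20]–[22] using NumPy's FFT; the band-pass limits here are illustrative assumptions, and because the +1 order is not shifted to zero frequency the returned phase still contains the 2πf₀x carrier tilt:

```python
import numpy as np

def fourier_phase(I, f0):
    """Wrapped phase from a single fringe image with carrier frequency f0
    (cycles per pixel along x), following eqns [20]-[22]."""
    spectrum = np.fft.fft(I, axis=1)                 # 1-D transform along x
    freqs = np.fft.fftfreq(I.shape[1])
    mask = (freqs > f0 / 2) & (freqs < 3 * f0 / 2)   # rectangular window, +1 order
    c = np.fft.ifft(spectrum * mask, axis=1)         # isolated term c(x, y)
    return np.angle(c)                               # tan^-1(Im/Re), modulo 2*pi
```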

Figure 11 Fourier transform technique.

Spatial Carrier-Frequency Technique

This is essentially the equivalent of the Fourier transform technique but is performed in the spatial domain. It is also used when there is only one interferogram available, and its major applications include nondestructive testing and measurement of large optics. These techniques relate closely to the temporal phase-measurement methods; however, instead of using a number of interferograms they can obtain all the information from a single interferogram. As an example, let's assume that the fringes are vertical and parallel to the columns on the detector array.


The carrier frequency (i.e., the number of tilt fringes) is adjusted so that there is an α phase change from the center of one pixel to the next. As long as there is not much aberration (deviation), the phase change from pixel to pixel across the detector array will be approximately constant. When the fringes are set up this way, the phase can be calculated using adjacent pixels (see Figure 12). If one fringe takes up 4 pixels, the phase shift α between pixels will be 90°. An algorithm such as the three-frame, four-frame, or five-frame technique can be used with adjacent pixels as the input. Therefore, 3, 4, or 5 pixels in a row will yield a single phase point. Then, the analysis window is shifted sideways one pixel and the phase is calculated at the next point. This technique assumes that the dc irradiance and fringe visibility do not change over the few pixels used to calculate each phase value.
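A sketch of the sliding-window idea, reusing the 5-frame coefficients of Table 1 on five adjacent pixels along each row; it assumes one fringe per four pixels, i.e., a 90° step per pixel:

```python
import numpy as np

def spatial_carrier_phase(I):
    """Wrapped phase from a single image with vertical tilt fringes,
    five adjacent pixels per point (coefficients from Table 1)."""
    num = 2 * (I[:, 3:-1] - I[:, 1:-3])            # ~ sin(phi)
    den = I[:, :-4] - 2 * I[:, 2:-2] + I[:, 4:]    # ~ cos(phi)
    return np.arctan2(num, den)                    # one value per 5-pixel window
```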

Spatial Multichannel Phase-Shift Techniques

These techniques detect all of the phase-shifted interferograms simultaneously, multiplexing the phase shift using static optical elements. This can be done by using either separate cameras, as illustrated below, or different detector areas to record each of the interferograms used to calculate the phase. The phase is usually calculated using the same techniques that are used for the temporal phase techniques.

As an example, a four-channel interferometer can be made using the setup shown in Figure 13 to record four interferograms with 90° phase shifts between them. Camera 1 will yield fringes shifted 180° with respect to camera 2, and cameras 3 and 4 will have phase shifts of 90° and 270°. The optical system may also utilize a holographic optical element to split the beam to multiplex the four phase shifts on four quadrants of a single camera.

Signal Demodulation Techniques

The task of determining phase can be broadened by looking toward the field of signal processing. For communication via radio, radar, and optical fibers, electrical engineers have developed a number of ways of compressing and encoding a signal as well as decompressing and decoding it. An interference fringe pattern looks a lot like an AM radio signal. Thus, it can be demodulated in similar ways. In recent years many new algorithms have been developed by drawing on techniques from communication theory and applying them to interferogram processing. Many of these use different types of transforms, such as Hilbert transforms for straight fringes or circular transforms for closed fringes.

Extended Range Phase Measurement Techniques

Figure 12 Spatial phase shifting. Different relative phase shifts are on adjacent detector elements.

A major limitation of phase measurement techniques is that they cannot determine surface discontinuities larger than λ/4 (λ/2 in optical path (OPD)) between adjacent pixels. One obvious solution is to use longer wavelength sources in the infrared, where optically rough surfaces look smooth and their shape can be measured. An alternative is two-wavelength interferometry, where two measurements at different wavelengths are taken and the measurable height limitation is now determined by the equivalent wavelength:

$$\lambda_{\mathrm{eq}} = \frac{\lambda_1 \lambda_2}{|\lambda_1 - \lambda_2|} \qquad [23]$$
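Eqn [23] evaluated for an illustrative pair of wavelengths (the two He-Ne lines used here are assumptions for the example):

```python
def equivalent_wavelength(lam1, lam2):
    """Equivalent wavelength for two-wavelength interferometry (eqn [23])."""
    return lam1 * lam2 / abs(lam1 - lam2)

# Illustrative values: He-Ne lines at 633 nm and 612 nm
print(equivalent_wavelength(633.0, 612.0))  # ~18447 nm, i.e., ~18.4 um
```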

Figure 13 Spatial phase shifting using four cameras.

Another method, which allows for measurement of smooth surfaces with large height discontinuities and is limited only by the working distance of the objective, combines a white light and a phase measurement interferometric technique in one long scan. The position of the wave packet resolves the 2π ambiguities that result from the arctangent function (see the Phase Unwrapping section above). The phase of the fringe close to the wave packet maximum is determined without ambiguity. Sometimes it is not the step height discontinuity that is a problem but rather the high slope of the measured surface. If we know that the surface is continuous, then the


unwrapping procedure can take advantage of this a priori information to look for a continuous surface.

Techniques for Deformation Measurement

Some of the techniques for deformation measurement have already been described earlier (see the sections on single-frame techniques and spatial multichannel phase-shift techniques). However, fringes in any interferometer can be analyzed not only in the x, y plane but also in the x, z or y, z planes for which the carrier frequency of the fringes is introduced. Analysis of fringes in any plane has the same restrictions. If the initial static object shape is measured using conventional phase measurement techniques, not only can the motion of each object point be determined but also the deformation of the whole object in time. If the motion of the object is periodic, then the object motion can be 'frozen' by using stroboscopic illumination of the same frequency as the object's motion. Once the motion is frozen, any temporal technique can be used to measure object shape at some phase of its motion. By changing the time offset between the stroboscopic illumination and the periodic signal driving the motion, the object can be 'frozen' and thus measured at different phases of its periodic motion.

Systematic Errors

Noise Sensitivity

Measurement noise mostly arises from random fluctuations in the detector readout and electronics. This noise reduces precision and repeatability. Averaging multiple measurements can reduce effects of these random fluctuations.

Phase Shifter Errors

Phase shifter errors can be due to both miscalibration of the system and nonlinearities in the phase shift. It is possible to purchase very linear phase shifters. It is also possible to correct nonlinearities by determining the voltage signal that makes the phase shifter provide a linear phase shift. Linear phase shifter errors (miscalibration of the phase shift) have an error signature at twice the frequency of the fringes. If there are two fringes across the field of view, the error signature will have four across the field of view. Figure 14 shows the difference between a calibrated and an uncalibrated phase shifter as well as the difference in error for two different phase measurement algorithms. Some algorithms are obviously more sensitive than others to this type of error.

Figure 14 Comparison of phase maps for calibrated and uncalibrated phase shifts using two different phase measurement algorithms.

Other Types of Systematic Errors to Consider

Other types of errors to consider are detector nonlinearities, quantization errors due to analog-to-digital converters, and dissimilar materials (see Phase Change on Reflection above). For temporal phase measurement techniques, errors due to vibration and air turbulence need to be considered as well.


Spatial phase measurement techniques are sensitive to miscalibrated tilt (wrong carrier frequency), unequally spaced fringes, and sampling and windowing in Fourier transform techniques.

Choosing an Algorithm or Technique

Each algorithm and type of measurement technique is sensitive to different types of systematic errors. Choosing a proper algorithm or technique for a particular type of measurement depends on the specific conditions of the test itself and on reducing the systematic errors for that type of measurement. This is the reason that so many algorithms exist. The references in the bibliography will help the reader determine what type of algorithm will work best for a specific application.

Examples of Applications

Phase shifting interferometry can be used for measurements such as hard disk flatness, quality of optical elements, lens curvature, dimensions and quality of air-bearing surfaces of magnetic read/write heads, cantilevers, and semiconductor elements. Figure 15 shows results for measurements of a hard disk substrate and a roughness grating.

Figure 15 Measurements of (a) a hard disc substrate with a vertical range of 2.2 micrometers and (b) a binary grating roughness standard.

Conclusions

Phase measurement interferometry techniques have increased measurement range and precision, enabling the production of more complex and more precise components. As work continues on the development of interferometric techniques, phase measurement techniques will continue to become more robust and less sensitive to systematic errors. Anticipated advances will enable measurements of objects that were unimaginable 20 or 30 years ago.

See also

Interferometry: White Light Interferometry. Microscopy: Interference Microscopy.

Further Reading

Bruning JH (1978) Fringe scanning interferometers. In: Malacara D (ed.) Optical Shop Testing, pp. 409–438. New York: Wiley.
Creath K (1988) Phase-measurement interferometry techniques. In: Wolf E (ed.) Progress in Optics, vol. 26, pp. 349–393. Amsterdam: Elsevier.
Creath K (1993) Temporal phase measurement methods. In: Robinson DW and Reid GT (eds) Interferogram Analysis, pp. 94–140. Bristol: IOP Publishing.
Creath K and Schmit J (1996) N-point spatial phase-measurement techniques for non-destructive testing. Optics and Lasers in Engineering 24: 365–379.
Ghiglia DC and Pritt MD (1998) Two-Dimensional Phase Unwrapping. New York: Wiley.
Greivenkamp JE and Bruning JH (1992) Phase shifting interferometry. In: Malacara D (ed.) Optical Shop Testing, 2nd edn, pp. 501–598. New York: Wiley.
Hariharan P (1992) Basics of Interferometry. New York: Academic Press.
Kerr D (ed.) (2004) FASIG special issue on fringe analysis. Optics and Lasers in Engineering 41: 597–686.
Kujawinska M (1993) Spatial phase measurement methods. In: Robinson DW and Reid GT (eds) Interferogram Analysis, pp. 141–193. Bristol, UK: IOP Publishing.
Larkin KG (1996) Neat nonlinear algorithm for envelope detection in white light interferometry. Journal of the Optical Society of America A 13: 832–842.
Malacara D (1992) Optical Shop Testing, 2nd edn. New York: Wiley.
Malacara D, Servin M and Malacara Z (1998) Interferogram Analysis for Optical Testing. New York: Marcel Dekker.
Robinson DW and Reid GT (1993) Interferogram Analysis: Digital Processing Techniques for Fringe Pattern Measurement. Bristol, UK: IOP Publishing.
Schmit J and Creath K (1996) Window function influence on phase error in phase-shifting algorithms. Applied Optics 35: 5642–5649.
Surrel Y (2000) Fringe analysis. Photo-Mechanics: Topics in Applied Physics 77: 55–102.


White Light Interferometry

J Schmit, Veeco Instruments, Inc., Tucson, AZ, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

In 1960 the development of the laser injected new possibilities into an old discipline, interferometry (see Interferometry: Overview). The excellent spatial and temporal coherence (ideally a single point and a single wavelength source) of the laser allowed for the formation of nonlocalized interference fringes. With lasers, researchers could easily generate good contrast fringes practically anywhere the beams overlapped. These excellent-quality fringes from laser interferometers enabled high-resolution distance and displacement measurements on the order of meters and made possible noncontact surface probing with nanometer and subnanometer resolution. This kind of precision, which was supported by the advent of computers and the advancement of detectors, was previously unimaginable. However, interference phenomena were used in metrology long before lasers and can be observed without any complicated device or source. On a rainy day we can observe the colorful interference patterns that form on a thin layer of gasoline in a puddle. The relationship between colors and the small distance between reflecting surfaces, observed as far back as 1665 by Hooke, is still used to approximate layer thicknesses. The colors in interference patterns have been used to determine the spectral components of beams. By the end of the nineteenth century, Michelson and Benoit were using interference phenomena to determine distances, long before the invention of the computer and the laser. While the application of lasers in interferometry has advanced the science, white light interferometry (WLI), which uses a spatially and temporally incoherent source and creates fringes localized over a few microns in space, has also benefited from technological advancements. The development of solid state detectors, fast computers, electronic signal processing, and precise scanning stages has allowed for incredibly fast analysis of white light interferograms. In fact, WLI is used in many applications, from film thickness and surface measurement, through spectroscopy, to astronomy. This article focuses on the formation of white light fringes. We first examine the existence of white light fringes in the everyday world. A description of different interferometric setups and their uses follows.

The bulk of this article details the formation of white light fringes and examines their different applications with an emphasis on the analysis of object structure. Suggestions for further reading, both within this encyclopedia and from outside sources, can be found at the end of this article. A basic theory of interferometry and coherence is to be found in corresponding articles of this encyclopedia.

White Light Interference in Everyday Life

The effects of interference display spectacular colors when observed in white light, such as sunlight. We often see these effects in oil or gasoline spills, in soap bubbles, or between two smooth pieces of glass in contact (separated by a very thin layer of air film), which are the simplest everyday white light interferometers. The colors we see in these displays are interference colors, and their origin is interference rather than dispersion, as in the colors of a rainbow, or diffraction, as in light reflected from a CD. The picture in Figure 1 is of a gasoline spill under cloudy sky illumination; we see beautiful interference colors that are hard to find in a rainbow: iron-gray, magenta, grayish blue, whitish green, and brownish yellow. Interference colors will differ not only with the thickness of the layer but also with the layer's absorption and dispersion, the relative indices of refraction of the film and the surrounding media, and the illumination. Interference colors can be observed for layers from a fraction of a micron to a few microns thick.

Interferometers for White Light Observation

In a white light interferometer, either the colors or the intensity distribution of the fringes is typically analyzed to retrieve the necessary encoded information, such as the film thickness, birefringence, index of refraction, dispersion, spectral properties, and surface structure. White light interference can be observed only in interferometric designs where the optical paths in both arms of the interferometer are (nearly) equal and the system is (nearly) compensated for dispersion.

Interference in Thin Film

Beams of light of different wavelengths incident on a transparent thin film (such as the ones in the puddle) are partially reflected from the top air/film interface


Figure 1 Interference colors in a gasoline spill in a puddle. This picture would closely resemble a picture from an interference microscope, if not for the leaf in the top left corner.

and partially from the bottom film/substrate interface, and then beams of the same wavelength interfere with each other. The optical path difference (OPD) traveled by the interfering beams is related to the film thickness, the index of refraction for the given wavelength, and the angle of incidence. Fringes of equal thickness will be observed if the film thickness (distance between interfering wavefronts) varies. These fringes represent points of equal distance between the two surfaces of the film and are formed close to the film (see Figure 2). This situation is typical in any interferometer since rarely are wavefronts perfectly plane and parallel. For any individual wavelength $\lambda_i$ where the OPD is equal to $m\lambda_i$, and where $m$ is an integer, a bright fringe of the color of the wavelength $\lambda_i$ will be observed due to constructive interference. For white light illumination, the color at a given point will be dominated by the color of the wavelength for which the interference is constructive. This color can be used to estimate the optical thickness of the film. An additional factor needs to be taken into account when color is used for estimating optical thickness, namely the phase change on reflection. Different colors will be observed if the ratio of indices of refraction for both film interfaces is <1 (or >1) as opposed to when the ratio of indices is <1 for one interface and >1 for the other.
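As a rough illustration of the constructive-interference condition, the sketch below finds visible wavelengths reinforced by a thin film; it assumes normal incidence and a single π phase change on reflection (the condition then becomes 2nt = (m − 1/2)λ), which holds only for particular index combinations:

```python
def constructive_wavelengths(n, t_nm):
    """Visible wavelengths (nm) reflected constructively by a film of
    refractive index n and thickness t_nm, assuming normal incidence
    and one half-wave phase change on reflection."""
    opd = 2 * n * t_nm          # optical path difference at normal incidence
    waves, m = [], 1
    while True:
        lam = opd / (m - 0.5)   # 2nt = (m - 1/2) * lambda
        if lam < 380:           # below the visible range: stop
            break
        if lam <= 750:
            waves.append(round(lam, 1))
        m += 1
    return waves

print(constructive_wavelengths(1.4, 500))  # e.g., a ~0.5 um gasoline film
```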

Figure 2 Formation of fringes of equal thickness for a film with a wedge.

Polarization Microscope

Interference colors that correspond to equal thickness fringes can represent the birefringence of an object; these colors are often observed using a polarization microscope (see Microscopy: Interference Microscopy). The polarization microscope is a conventional microscope with a set of polarizers with crossed polarization axes, one placed above and the other placed below the tested birefringent object.


Interference colors have been tabulated for many years and for many different purposes. Newton devised his color scale to describe interference fringes for two pieces of glass in contact with each other. Michel-Lévy and Lacroix, in 1889, created a color scale to help recognize different rock-forming minerals. For more information about colors in white light interference see the sections on White Light Interference and Spectral Interference below.

Michelson Interferometer

A Michelson interferometric setup, shown in Figure 3, is often used to analyze white light interference; in this configuration the intensity of the fringes is recorded as one mirror is scanned, rather than observing the color of the fringes. Two plane mirrors (equivalent to the top and bottom surfaces of a thin film) return the beams to the beamsplitter, which recombines parts of the returned beams and directs them towards the detector where the interference is observed. Beam S1 travels through the beamsplitter plate three times while beam S2 passes through the plate only once, leaving the system poorly compensated (balanced) for dispersion; thus, for observation of white light interference, an identical plate is placed in the path of beam S2. As one of the plane mirrors moves along the optical axis to change the OPD, the detector records the irradiance. Fringes will be observed if the system is well compensated both for dispersion and for optical path length. Any spectral changes or changes in optical path lengths in the interferometer affect the shape or position of the fringes, and the interferometer measures these changes. Balanced designs, such as the Mach–Zehnder or Jamin interferometers, are naturally compensated and can be used for measurement of dispersion in gases. Other interferometers, such as the Twyman–Green or Fizeau interferometers, make use of their unequal arms and their nonlocalized fringes from laser sources for testing different optical elements and systems in reflection and transmission (see Interferometry: Phase Measurement Interferometry).

Figure 3 Michelson interferometer.

White Light Interference

A white light source consists of a wide spectrum of wavelengths in the visible spectrum, from about 380 nm (violet) up to 750 nm (red). However, the principles of WLI described in this article apply whenever any low coherence source is used; low coherence refers to a source of not only low temporal but also low spatial coherence, and WLI can also be referred to as low coherence interferometry. We will be concerned mainly with the temporal effect of the source, i.e., the source spectrum. Since different wavelengths from the source spectrum are mutually incoherent, we first look at interference between two waves of a selected monochromatic illumination with wave number k = 2π/λ, where λ is the wavelength. The intensity of the interference fringes at a point (x, y) (these coordinates are omitted in all equations), as one of the mirrors is scanned along the optical axis z, can be described as

I(k, z) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,|g(z)|\cos(kz)   [1]

or can be written in the form

I(k, z) = I_0\left[1 + \frac{2\sqrt{I_1 I_2}}{I_1 + I_2}\,|g(z)|\cos(kz)\right]   [2]

where I_1 and I_2 are the intensities of each of the beams, I_0 = I_1 + I_2, and |g(z)| is the modulus of the complex mutual coherence function (see Coherence: Overview), assumed here to equal 1 (perfect coherence for each wavelength). The optical path difference z equals z_1 - z_2, where z_1 and z_2 are the optical path lengths that the interfering waves have traveled; the difference in the traveled paths, z, corresponds to the position of the scanning mirror. White light interference (polychromatic interference) is the overlaying of all the monochromatic fringes created for each wavelength of the source spectrum (Figure 4a). A detector observes the sum of all the fringe intensities. In mathematical form, this interference can be described as the integral of the fringes I(k, z) over all wavenumbers k:

I(z) = \int_{k_1}^{k_2} S(k)\,D(k)\,I(k, z)\,dk   [3]

where S(k) is the spectral distribution of the light source, with S(k) = 0 outside the k_1 to k_2 range of wave numbers, and D(k) is the spectral response of the detector; we assume that D(k) equals 1 over the whole spectrum. Because the spacing of the fringes for each wavelength emitted by the white light source is different, the maxima of the fringes align around only one point for a well-compensated interferometer, as shown in Figure 4a. This alignment around a single point occurs because in a well-compensated interferometer there is one point for which the OPD is zero for all wavelengths. Away from this point, the observed sum of the intensities quickly falls off, as shown in Figure 4b. The maximum fringe, the fringe that marks the zero OPD, is called the zero-order fringe, and each successive fringe of smaller amplitude on either side is called the +1 and −1, +2 and −2 order fringe, and so on. Because white light fringes are localized and can only be found within microns or tens of microns of the zero OPD, white light interferometers are excellent distance and 3D position sensors. However, this same characteristic makes them more difficult to align than interferometers with nonlocalized fringes.

Figure 4 Formation of white light fringes: (a) fringes for individual wavelengths; (b) fringes for white light.
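Equation [3] is easy to evaluate numerically. The short sketch below overlays monochromatic fringe patterns weighted by an assumed Gaussian source spectrum (all spectral parameters are illustrative) and reproduces the localized fringe packet of Figure 4b.

```python
import numpy as np

# Assumed Gaussian source spectrum S(k); detector response D(k) = 1 as in the text
z = np.linspace(-5.0, 5.0, 2000)              # OPD scan, um
k = np.linspace(7.0, 14.0, 400)               # wave numbers, rad/um
S = np.exp(-((k - 10.5) / 1.2)**2)            # S(k), centered near lambda ~ 0.6 um

# Eqn [3] with I1 = I2 and |g| = 1: I(z) = integral of S(k) * [1 + cos(kz)] dk
I = (S[:, None] * (1.0 + np.cos(np.outer(k, z)))).sum(axis=0) * (k[1] - k[0])
print("fringe modulation collapses away from zero OPD: I(0)=%.1f, I(5 um)=%.1f" % (I[1000], I[-1]))
```

Plotting I against z shows fringes only within roughly ±2 µm of the zero OPD, as described above.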

Envelope of Fringes Due to Source Spectrum

The resultant intensity of interference from a broad spectrum source (see eqn [3]) can be described in general form as

I(z) = I_0[1 + V(z)\cos(k_0 z)]   [4]

where I_0 = I_1 + I_2 is the background intensity, V(z) is the fringe visibility function or coherence envelope, and k_0 = 2π/λ_0 is the central wave number for the fringes under the envelope. V(z) is proportional to the modulus of the Fourier transform of the source spectrum S(k). Generally, if the light source has a Gaussian spectrum S(k), then the envelope of the fringes V(z) is also a Gaussian function; if the spectrum of the source is rectangular, then the envelope of the fringes will be a sinc function. The wider the spectrum of the source, the narrower the envelope, as shown in Figure 5. The width of the fringe envelope determines the coherence length of the source (see Coherence: Overview); for a white light source this width is on the order of 1–2 microns. Different white light sources, such as tungsten-halogen, incandescent, or arc lamps, have different spectra and thus create different coherence envelopes, as shown in Figure 5. The spectra of semiconductor light sources, such as LEDs and SLDs, are similar in shape to a Gaussian function. The fact that the coherence envelope is related to the spectrum of the source by its Fourier transform is commonly used in Fourier transform spectroscopy, where the Fourier transform of the detected fringes is calculated to find the spectral components of the beams.
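The envelope V(z) can be recovered from a sampled interferogram via the analytic signal, a standard trick sketched below; the synthetic signal and its parameters are assumed for illustration, not taken from the text. For the Gaussian spectrum used here, the extracted envelope is itself close to Gaussian, in line with the Fourier-transform relationship just stated.

```python
import numpy as np
from scipy.signal import hilbert

# Synthesize a white light fringe signal (same assumptions as the previous sketch)
z = np.linspace(-5.0, 5.0, 2000)                  # OPD, um
k = np.linspace(7.0, 14.0, 400)                   # wave numbers, rad/um
S = np.exp(-((k - 10.5) / 1.2)**2)                # assumed Gaussian spectrum
I = (S[:, None] * (1.0 + np.cos(np.outer(k, z)))).sum(axis=0)

ac = I - I.mean()                                 # remove the background I0 term
V = np.abs(hilbert(ac))                           # envelope = |analytic signal|
half = V > 0.5 * V.max()
print("coherence envelope FWHM ~ %.2f um" % (z[half][-1] - z[half][0]))
```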

Figure 5 Spectrum and interferogram for (a) a tungsten-halogen lamp and (b) a red LED source.

Position of Fringes Under Envelope Due to Reflection of Dielectric

Thus far we have assumed that in WLI the interferometer is compensated for all wavelengths; for this case the maximum of the fringes aligns with the maximum of the envelope, i.e., there is a zero phase shift φ_0 = 0 between the fringe and envelope maxima. If there is an odd number of reflections from dielectric surfaces in one arm of the interferometer and an even number in the other, the fringes will be shifted by φ_0 = 180° under the coherence envelope and the minimum of the fringe will align with the maximum of the coherence envelope. Thus eqn [4] can be expressed in a more general form:

I(z) = I_0[1 + V(z)\cos(k_0 z + \varphi_0)]   [5]

Figure 6 shows color fringes for such a case; the dark fringe marks the zero OPD, and this fringe is surrounded by the greenish blue colors of shorter wavelengths. In contrast, a bright fringe at the maximum envelope position marking the zero OPD would be surrounded by the reddish colors of longer wavelengths. In a real system the fringes may be shifted with respect to the envelope by any amount φ_0, and this shift may be due to any number of factors, such as the phase shift on reflection from nondielectric surfaces and dispersion, which we consider next.

Figure 6 Formation of white light fringes with destructive interference for OPD = 0.

Changes in Envelope and Fringes Due to Reflection of Nondielectric Materials

The relative positions of the envelope maximum and the fringes will differ if the beam is reflected from different nondielectric materials. This difference exists because the phase change on reflection from a test surface, such as a metal or a heavily doped semiconductor, varies with wavelength. This variation in fringe and peak coherence position may be predicted and corrected for in surface height calculations. The component of the phase change on reflection that is linear in wave number shifts the location of the coherence envelope peak and the position of the fringes by different amounts; the constant phase change on reflection and higher-order terms only shift the fringes underneath the coherence envelope. As long as the phase change on reflection has a small dependence on the second and higher orders of the wave number, the shape of the coherence envelope is preserved.

Changes in Envelope and Fringes Due to Dispersion

Dispersion in an interferometer that is not balanced, perhaps because a dispersive element was placed in one arm or the compensating plate has an incorrect thickness, will influence fringe formation. The phase delay between interference patterns for individual wavelengths is proportional to the product of the geometrical path and the index of refraction, d × n(k). The intensity may be described as:

I(z) = \int_{k_1}^{k_2} \left[1 + V(z)\cos\{kz - kd\,n(k)\}\right] dk   [6]


The dependence of the refractive index on the wave number k can be described by a linear expansion:

n(k) = n(k_0) + \frac{dn}{dk}(k - k_0)   [7]

The linear dispersion shifts the envelope by the group index of refraction times the thickness of the dispersive element; it also shifts the fringes under the envelope slightly. In other words, the linear phase delay for a range of wavelengths causes a group delay of the whole envelope (wave packet). Higher-order dispersion, absorption in the elements, or effects due to thin films can cause the envelope to widen or even become asymmetrical, the position of the fringes under the envelope to shift, the fringes to lose contrast, and the period of the fringes to change or vary with the z position. Dispersion effects will be stronger for sources that have a wider spectrum; however, the observed changes will differ for different shapes of spectra. The phase of the fringes under the envelope and the position of the envelope itself are parameters often used in astronomy: highly accurate white light fringe estimation, using the optical path-length delay between the two arms of the interferometer, is a cornerstone of stellar interferometry.

Fourier Analysis of White Light Interferograms

Wavelength-dependent changes in a white light interferogram can be more easily analyzed in the spectral rather than the spatial domain. The Fourier transform of a white light interferogram yields two symmetrical side lobes at the mean wavelength of the interferogram; analysis of just one of these side lobes is sufficient. The spectral amplitude of the side lobe contains information about the spectral components of the interfering beams, while the spectral phase in regions with appreciable spectral amplitude supplies information about any unbalanced dispersion in the interferometer, as shown in Figure 7. Fourier transform analysis is extensively used in Fourier transform spectrometry (see Spectroscopy: Fourier Transform Spectroscopy). For a dispersion-balanced interferometer, the interferogram is symmetrical around the zero OPD position. For a symmetrical interferogram, the spectral phase will be zero if the zero OPD position of the sampled interferogram is in the middle of the sampling range; otherwise a linear factor appears in the spectral phase. This linear term, while useful for surface profiling because it determines the position of the coherence signal with respect to the scanner sampling, is unwanted in Fourier spectroscopy and needs to be corrected for. Dispersion and possible thin-film effects will commonly introduce a nonlinearity in the spectral phase.
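The Fourier-domain analysis described above can be sketched in a few lines of Python; the signal, scan step, and packet position below are all assumed for illustration. The positive-frequency side lobe is isolated, and a linear fit to its phase recovers the position of the coherence packet along the scan; the residual from that fit would expose any dispersion-induced nonlinearity.

```python
import numpy as np

dz = 0.05                                      # scan step, um (assumed)
z = np.arange(0.0, 20.0, dz)
k0, dk, z0 = 10.5, 1.2, 5.3                    # carrier, bandwidth, packet position (assumed)
I = 1.0 + np.exp(-((z - z0) * dk / 2)**2) * np.cos(k0 * (z - z0))

spec = np.fft.rfft(I - I.mean())
f = 2 * np.pi * np.fft.rfftfreq(z.size, d=dz)  # spatial frequencies = wave numbers
lobe = np.abs(spec) > 0.1 * np.abs(spec).max() # region of appreciable spectral amplitude
phase = np.unwrap(np.angle(spec[lobe]))
slope = np.polyfit(f[lobe], phase, 1)[0]       # the linear spectral phase term
print("coherence packet located at z ~ %.2f um" % -slope)
```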

Controlled Phase Shift of Fringes Under the Envelope – Geometric Phase Shift

Many techniques in interferometry depend on shifting the phase of the interfering wavefronts. Mechanical shifters used in white light interferometry introduce the same shift for all wavelengths when the shift is measured in nanometers; however, when measured in degrees or radians, the shift varies with wavelength. Geometric phase shifters (achromatic phase shifters operating on the principle of the geometric phase) introduce the same shift for all wavelengths when measured in degrees; they are based on polarization elements, such as a rotating wave plate or a rotating polarizer in a circularly polarized beam. The fringes underneath the coherence envelope shift, as shown in Figure 8, while the coherence envelope stays in place. The advantage of these techniques is that only the phase of the fringes changes, not the fringe contrast (see Interferometry: Phase Measurement Interferometry).


Figure 8 Fringes for two different geometric phase shifts.

Figure 9 Polarization interferometer with geometric phase shift.

Figure 7 Spectral amplitude (a) and spectral phase (b) for input from unbalanced interferometer (c).

A geometrical phase shifter can be very useful in polarization microscopes (Figure 9), white light shearing interferometers, or any system where the phase of the white light fringes needs to be measured.

Spectral Interference

White light fringes, because they are made up of fringes of many wavelengths, are observed only over very small path differences. If we filter only a narrow band from the white light spectrum, fringes will be visible over a much larger scan, and for different wavelengths we will observe fringes of different color and frequency; this is simply the reverse of the process described in the White Light Interference section. If we place a spectrometer in the observation plane of the white light interferometer (Figure 10), we will observe fringes with continuously changing wavelengths in dispersed light. These fringes are called

fringes of equal chromatic order or channeled spectrum fringes, and they find application in film thickness measurement and absolute distance measurement in the range up to 1 mm. Channeled spectra have been used for dispersion analysis, thin film, and spectroscopic measurements. The number of fringes observed over a given wavelength range is directly proportional to the measured optical path difference. The optical path difference can be determined if the difference in fringe numbers is determined for two well-known wavelengths (this is equivalent to two-wavelength interferometry). The optical path difference can also be quickly estimated from the frequency of the fringes around a given wavelength: the larger the optical path difference, the denser the fringes.
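Numerically, the OPD can be read off a channeled spectrum as the fringe frequency along the wavenumber axis. The sketch below assumes an idealized flat source spectrum and a 120 µm path difference:

```python
import numpy as np

opd = 120.0                                   # assumed optical path difference, um
k = np.linspace(8.0, 16.0, 4096)              # wave number axis, rad/um
I = 1.0 + np.cos(k * opd)                     # idealized channeled spectrum

spec = np.fft.rfft(I - I.mean())
f = 2 * np.pi * np.fft.rfftfreq(k.size, d=k[1] - k[0])
print("estimated OPD ~ %.1f um" % f[np.argmax(np.abs(spec))])
```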

Surface Topography and Object Structure Measurement

Although WLI has many applications, this section focuses on white light interference as applied to surface topography measurement. Interference microscopes that use white light illumination are often based on the Michelson setup shown in Figure 11.


Figure 11 Michelson interferometric objective.

Figure 10 Interferometers with prism spectrometer to observe fringes of equal chromatic order for OPD measurement.

A beamsplitter cube is placed underneath a bright field objective; one mirror is placed to the side, at the focus plane of the objective, while the other mirror is replaced by the measured surface, with respect to which the whole interference objective is scanned. During the scan, narrowly localized fringes are observed at the best focus position for each point of the surface corresponding to a pixel on the CCD camera. Because the design of the Michelson objective cannot accommodate the constraints of higher magnification objectives, other interference setups, such as the Mirau and Linnik, were developed (see Microscopy: Interference Microscopy). However, these designs are still based on the Michelson interferometer, being likewise compensated interferometers with equal lengths and amounts of glass in each arm. These interference microscopes typically have two complementary modes of operation; one mode uses

monochromatic illumination and the other employs a white light source. Monochromatic illumination provides excellent vertical resolution but is limited in its range: it cannot correctly assign fringe order across surface discontinuities larger than a quarter of the wavelength, about 160 nanometers. Figure 12a shows monochromatic fringes for the surface of the profile shown in Figure 13. We see that with monochromatic illumination the height of the grooves remains unknown, because it is impossible to assign order numbers to these fringes. To resolve this ambiguity, white light illumination is employed, because it allows for easy identification of the zero-order fringe. Figure 12b shows white light fringes created for the same surface. We see the zero-order fringe for the zero OPD as well as the next orders, with decreasing fringe contrast. The position of the zero-order fringe can be followed and approximate heights can be determined visually. Thus, white light illumination permits the measurement of a broader range of surfaces, including those that are rough or have large height variations up to a few millimeters. The white light interferometer acts as an excellent focus detector (Figure 14) at each point of the field of view; the principle behind WLI profiling is finding these individual focus positions using the localized fringes observed during the surface scan. WLI interferometers provide high lateral resolution and a large vertical range, from hundreds of nanometers up to hundreds of micrometers (Figure 15). White light interferometers are commonly used to measure magnetic heads, MEMS devices, binary optics, and machined surfaces.

Signal Processing of White Light Interferograms

Figure 12 Fringes in quasi-monochromatic and white light for an object similar to the one presented in Figure 13.

Figure 13 3D object profile – binary grating.

For WLI topographic measurements, each pixel registers an interferogram whose position varies with the surface height. Figure 16 shows two interferograms for two positions in the field of view of a measured step height surface; the first shows the top surface of the step and the second shows the bottom surface. Algorithms have been developed to analyze the signal in different ways: some focus on finding the position of the fringe envelope, and others examine the position of the fringes underneath the envelope. These algorithms work under the assumption that the white light fringes can be described by the same function over the whole field of view of the interferometer. Algorithms are applied to the signal to find the position of the coherence envelope for each pixel; this process is called coherence gating, and all the algorithms perform it in some form. Some algorithms look for the position of the envelope's maximum: first the coherence envelope of the fringes is found using signal filtering, then a curve is fit to a few points around the envelope's maximum, and finally the position of the maximum is found. Other algorithms calculate the center of mass of the coherence signal, as described in eqn [8]:

H(x, y) = \frac{\sum_{i=1}^{N-1} [I_i(x, y) - I_{i+1}(x, y)]^2\, z_i}{\sum_{i=1}^{N-1} [I_i(x, y) - I_{i+1}(x, y)]^2}   [8]

This method is very fast and computationally efficient. Center of mass calculations are equivalent to calculations of the envelope maximum position, but only for a symmetrical signal; for an asymmetrical signal a piston (a constant height offset) is introduced at each point which, being common to the whole field, does not affect the measured topography. Still another method calculates the spectral phase slope in the Fourier domain to determine the position of the coherence envelope, in a process that is equivalent to finding the center of mass. Finally, a different group of algorithms tracks the position of the bright or dark fringe closest to the envelope's maximum.
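A minimal implementation of the center-of-mass calculation of eqn [8], applied to a synthetic single-pixel signal (scan step, carrier, and envelope width all assumed):

```python
import numpy as np

def center_of_mass_height(I, z):
    # Eqn [8]: squared differences of successive samples weight the scan positions
    w = np.diff(I)**2
    return np.sum(w * z[:-1]) / np.sum(w)

z = np.arange(0.0, 20.0, 0.075)               # scan positions, um (assumed step)
h_true = 8.3                                   # surface height under this pixel, um
I = 1.0 + np.exp(-((z - h_true) / 1.0)**2) * np.cos(10.5 * (z - h_true))
print("recovered height ~ %.2f um" % center_of_mass_height(I, z))
```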

Scanner Speed – Sampling Rate

A practical sampling rate for the white light interference signal is around four samples per fringe. Because in most algorithms it is not the position of the fringes that we want to determine but rather the position of the envelope of the fringes, sampling the envelope with four samples per fringe is usually sufficient; however, this comes at the expense of lower repeatability (higher noise). The advantage of this sampling rate is short measurement time; higher sampling rates increase measurement time. Measurement speed can be increased by up to 25 times when the envelope of the fringes is widened by filtering light of a bandwidth of about 20–40 nm from the white light source.
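For orientation, the scan step implied by four samples per fringe follows directly from the fringe period, which in a reflection setup is half the wavelength; all numbers below are assumed:

```python
lam0, dlam = 0.60, 0.30     # mean wavelength and bandwidth, um (assumed white light)
step = (lam0 / 2.0) / 4.0   # four samples per fringe; fringe period is lam0/2 in height
l_c = lam0**2 / dlam        # rough coherence length (envelope width), um
print("scan step = %.0f nm, envelope width ~ %.1f um, ~%d frames across the envelope"
      % (1000 * step, l_c, l_c / step))
```

Filtering the source to a 20–40 nm bandwidth stretches the envelope roughly tenfold, which is what permits the coarser sampling and faster measurement mentioned above.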

Figure 14 Operation of white light interferometer for surface topography measurement.

Figure 15 WLI measurement of a Fresnel lens in high-resolution mode. Courtesy of Glimmerglass.

Scanner Nonlinearity

Scanner motion is assumed to be linear, but nonlinearity in this motion impairs measurement accuracy. To account for this, simple algorithms can calculate the average scan steps along each interferogram in the scanning direction (Figure 17), and the measured steps can then be used in the coherence detection algorithms.

In order to have continuous information about the scan steps, the fringes should be visible somewhere in the field of view at each moment of the scan. This may require introducing a large tilt for samples with large discontinuities, such as stepped surfaces. For large discontinuities where introducing sample tilt may not be sufficient, the envelope of the fringes can be extended by increasing the spatial or temporal coherence of the light source. An alternative solution involves measuring the scanner motion with a distance-measuring interferometer, or triggering the camera to collect intensity data at every equal scan step, which can be determined by using the zero-crossing technique, as is commonly done in Fourier transform spectroscopy. Equal scan steps are more suitable for techniques based on transforms, which assume equal sampling; other peak detection algorithms, such as center of mass calculations, can use the measured but not necessarily equal steps directly. Observing the phase along the interferogram can also provide information that is important in optical fiber sensing. These observed changes, which assume that the sampling rates of the interferogram are known, enable correction for changes in the wave number value k_0, which can be due to such things as changes in the working voltage of the bulb, introduced higher-order dispersion, or a large tilt of the sample.

Figure 16 White light fringes as seen by two pixels during a scan through focus for an object in the form of a step.

Figure 17 Phase calculated along the white light interferogram. This calculated phase can be used to determine scanner motion.

Figure 18 Fringes for (a) thin and (b) thick film.

Increased Resolution White Light Interferometry

Interferometric methods that employ a monochromatic light source to detect the phase of the fringes (see Interferometry: Phase Measurement Interferometry) can achieve about 10 times better vertical resolution (0.3 nanometers versus 3 nanometers) than the WLI methods described so far. Combining the coherence position detection of WLI, to determine the fringe order, with the phase detection of phase shifting techniques allows for the measurement of samples with height discontinuities larger than 160 nanometers at the resolution and accuracy of phase shifting interferometry (PSI). This combination is particularly well suited for determining the shape of smooth surfaces with large height differences, such as binary diffractive optics or micro-electro-mechanical systems (MEMS) (see Figure 15). Using this combined method we obtain both a lower-resolution map of the envelope position and a higher-resolution map of the phase (position) of the zero-order fringe. These maps may differ slightly due to effects similar to those discussed in the sections Changes in Envelope and Fringes Due to Reflection of Nondielectric Materials and Changes in Envelope and Fringes Due to Dispersion above. In interference microscopes, shifts in the envelope and fringe position may also be introduced by field-dependent and chromatic aberrations of the system and by the system's effective numerical aperture. These shifts can vary for different points on the tested surface but, for simplicity, are assumed to be constant over the field. Some correction for these effects can be applied.
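One way to picture the combination (a sketch with assumed numbers): the envelope position gives a coarse height good to a fraction of a fringe, and the fringe phase refines it once the coarse value fixes the fringe order.

```python
import numpy as np

lam0 = 0.60                        # mean wavelength, um (assumed)
k0 = 2 * np.pi / lam0              # carrier wave number, rad/um
period = lam0 / 2.0                # one fringe corresponds to lam0/2 of surface height

h_coarse = 4.213                   # height from coherence (envelope) detection, um
phi = 1.87                         # measured fringe phase at this pixel, rad (assumed)

# pick the fringe order m that puts the phase-derived height nearest the coarse height
m = np.round((h_coarse - phi / (2 * k0)) / period)
h_fine = m * period + phi / (2 * k0)
print("coarse %.3f um -> refined %.4f um" % (h_coarse, h_fine))
```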

Film Thickness Measurement

White light interferograms obtained in an optical profiler can be used to measure transparent film thicknesses, because the character of the interferogram changes with the thickness of the film (Figure 18). Two different techniques, a thin film or a thick film technique, are used depending on the range of the film thickness. A thick film technique is used if the two sets of best-contrast white light fringes from each interface are clearly separated, meaning that no interference occurs between the wavefronts reflected from the top and bottom surfaces of the film. A thin film technique is employed when the two sets of fringes overlap.

Figure 19 Thick film (see Figure 18b) irradiance.

Thick film measurement

A simple technique for finding the relative position of the peaks of the fringe envelopes can be used to find the thickness of a film. Figure 19 shows two clearly separated sets of fringes formed at the air/film and film/substrate interfaces. Whether these two sets of fringes are separated or not depends mainly on the geometrical thickness of the film and its group index of refraction, which is determined by the dispersion of the film. The typical range of measurable film thicknesses runs from 3 to 150 microns, depending on the dispersion of the film. This measurement also allows for the detection of flaws on the surface and at the interface of the film. Similar principles based on finding the positions of the coherence envelopes are used for distance sensing, for thickness measurement of plates and of the cornea of the eye in low coherence reflectometry, and for structure measurements in biological samples in optical coherence tomography.

Thin film measurement

For thicknesses from five microns down to tens of nanometers, the white light interferogram is created from the interference between the beam reflected from the reference mirror and the two beams reflected from the thin film layer. Once the interferogram is registered, while the objective is scanned vertically with a constant velocity, the spectral phase is calculated by applying a Fourier transform to the measured signal at each pixel, as described in the section Fourier Analysis of White Light Interferograms above. The phase slope is subtracted, and the dispersion of the system needs to be known. The spectral phase for the thin film interference has the form of a polynomial; thus, the polynomial for the chosen film model (n and k) is fitted, and regression analysis is used to find the best fit and, therefore, the film thickness (Figure 20).

Figure 20 Thin film (see Figure 18a): (a) irradiance, (b) spectral amplitude and (c) spectral phase.
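The thick film case reduces to locating two coherence packets; a sketch of the peak-separation idea (synthetic signal, assumed indices):

```python
import numpy as np
from scipy.signal import hilbert

n_g, d = 1.46, 20.0                        # assumed group index and film thickness, um
z = np.arange(0.0, 80.0, 0.05)             # scan positions, um
packet = lambda zc: np.exp(-((z - zc) / 1.2)**2) * np.cos(10.5 * (z - zc))
I = 1.0 + packet(25.0) + 0.5 * packet(25.0 + n_g * d)   # top and bottom reflections

V = np.abs(hilbert(I - I.mean()))          # coherence envelope
i1 = np.argmax(V)                          # stronger packet (air/film interface)
V2 = V.copy()
V2[np.abs(z - z[i1]) < 5.0] = 0.0          # mask it out, then find the second packet
i2 = np.argmax(V2)
print("film thickness ~ %.1f um" % (np.abs(z[i2] - z[i1]) / n_g))
```

The packet separation along the scan is the group optical thickness n_g·d, so dividing by the (assumed known) group index recovers the geometrical thickness.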

Spatial Coherence Effects in the Interference Microscope

So far we have been discussing temporal coherence effects but, because the source is extended as well as having a wavelength bandwidth, spatial coherence may also play an important role. In an interference microscope for surface topography measurement, the size of the aperture of the condenser is on the order of the aperture of the objective, which illuminates the surface with a wide range of angles. For large angles, determined by the numerical aperture (NA) of the objective, with NA = 0.5–0.95, the combined influence of spatial and temporal coherence is clearly visible. The additional spatial effects include a reduction of the fringe envelope width and an increase in the fringe spacing, i.e., a decrease in the corresponding effective wave number k_0. Calibration of the fringe spacing is typically done on the system; it accounts for the spatial effects as well as for any uncertainty in k_0 due to the source spectrum, the working temperature of the source, the spectral response of the detector, and the influence of other factors such as the intensity distribution in the illuminating aperture. The spatial influences can be reduced by stopping down the condenser, which causes an increase in the contrast of the fringes.

Rough Surfaces

Rough surfaces are difficult to measure using interferometric techniques, but under certain coherence conditions white light interference can do the job. For rough surfaces, if the microstructure of the object is not resolved by the imaging system, speckles, rather than fringes, are observed. Each speckle has a random phase which is approximately constant in the whole speckle area. If the rough surface is scanned through focus, each individual speckle exhibits the intensity modulation that is typical for WLI. These speckles enable the measurement, but they also introduce noise proportional to the roughness of the measured surface. Despite the noise that speckles introduce into the WLI measurement, WLI has an advantage because it rejects the light that has undergone scattering outside of a small sample volume, thus allowing precise noninvasive measurement of object structure, even in dense media.

Applications

WLI is used in many disciplines and instruments such as:

- Fourier transform spectroscopy – source and material properties;
- Michelson stellar interferometer – angular size of stars, binary stars, delay measurement in the optical paths of the interferometer;
- Shearing interferometry – structure measurement;
- DIC Nomarski interferometry – structure measurement;
- Speckle interferometry – structure measurement;
- Holography – structure measurement;
- Optical sensors – temperature, pressure, distance;
- Optical coherence tomography – structure measurement.

See also Coherence: Coherence and Imaging; Overview. Holography, Techniques: Holographic Interferometry. Interferometry: Phase Measurement Interferometry. Microscopy: Interference Microscopy. Spectroscopy: Fourier Transform Spectroscopy.

Further Reading

Born M and Wolf E (1997) Principles of Optics. New York: Cambridge University Press.
Creath K (1997) Sampling requirements for white light interferometry. Fringe'97, Proceedings of the 3rd International Workshop on Automatic Processing of Fringe Patterns, pp. 52–59.
Danielson BL and Boisvert CY (1991) Absolute optical ranging using low coherence interferometry. Applied Optics 30: 2975–2979.
Davis SP, Abrams MC and Brault JW (2001) Fourier Transform Spectrometry. London: Academic Press.
de Groot P and Deck L (1995) Surface profiling by analysis of white light interferograms in the spatial frequency domain. Journal of Modern Optics 42: 389–401.
Dresel T, Häusler G and Venzke H (1992) Three-dimensional sensing of rough surfaces by coherence radar. Applied Optics 31(7): 919–925.
Harasaki A, Schmit J and Wyant JC (2001) Offset envelope position due to phase change on reflection. Applied Optics 40: 2102–2106.
Hariharan P (1992) Basics of Interferometry. New York: Academic Press.
Hariharan P, Larkin KG and Roy M (1994) The geometric phase: interferometric observation with white light. Journal of Modern Optics 41: 663–667.
Larkin KG (1996) Efficient nonlinear algorithm for envelope detection in white light interferometry. Journal of the Optical Society of America A 13: 832–843.
Michelson AA and Benoît JR (1895) Détermination expérimentale de la valeur du mètre en longueurs d'ondes lumineuses. Travaux et Mémoires du Bureau International des Poids et Mesures 11: 1.
Olszak AG and Schmit JG (2003) High stability white light interferometry with reference signal for real time correction of scanning errors. Optical Engineering 42(1): 54–59.
Park M-C and Kim S-W (2000) Direct quadratic polynomial fitting for fringe peak detection of white light scanning interferograms. Optical Engineering 39: 952–959.
Pavlicek P and Soubusta J (2004) Measurement of the influence of dispersion on white light interferometry. Applied Optics 43: 766–770.
Schmit JG and Olszak A (2002) High precision shape measurement by white light interferometry with real time scanner error correction. Applied Optics 41(28): 5943–5950.
Sheppard CJR and Larkin KG (1995) Effects of numerical aperture on interference fringe spacing. Applied Optics 34: 4731–4734.
Sheppard CJR and Roy M (2003) Low-coherence interference microscopy. In: Torok P and Kao F-J (eds) Optical Imaging and Microscopy, pp. 257–273. Berlin, Heidelberg, New York: Springer-Verlag.
Steel W (1987) Interferometry. New York: Cambridge University Press.
Turyshev SG (2003) Analytical modeling of the white light fringe. Applied Optics 42(1): 71–90.

LASERS

Contents

Carbon Dioxide Laser
Dye Lasers
Edge Emitters
Excimer Lasers
Free Electron Lasers
Metal Vapor Lasers
Noble Gas Ion Lasers
Optical Fiber Lasers
Organic Semiconductors and Polymers
Planar Waveguide Lasers
Semiconductor Lasers
Up-Conversion Lasers

Carbon Dioxide Laser

C R Chatwin, University of Sussex, Brighton, UK

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

This article gives a brief history of the development of the laser and goes on to describe the characteristics of the carbon dioxide laser and the molecular dynamics that permit it to operate at comparatively high power and efficiency. It is these commercially attractive features, together with its low cost, that have led to its adoption as one of the most popular industrial power beams. This outline also describes the main types of carbon dioxide laser and briefly discusses their characteristics and uses.

Brief History

In 1917 Albert Einstein developed the concept of stimulated emission, the phenomenon exploited in lasers. In 1954 the MASER (Microwave Amplification by Stimulated Emission of Radiation) became the first device to use stimulated emission. In 1958 Townes and Schawlow suggested that stimulated emission could be used in the infrared and optical portions of the electromagnetic spectrum. The device was originally termed the optical maser, this term being dropped in favor of LASER, standing for Light Amplification by Stimulated Emission of Radiation. Working against the wishes of his manager at Hughes Research Laboratories, the electrical engineer Ted Maiman created the first laser on 16 May 1960. Maiman's flash-lamp-pumped ruby laser produced pulsed red electromagnetic radiation at a wavelength of 694.3 nm. During the most active period of laser systems discovery, Bell Labs made a very significant contribution. In 1960, Ali Javan, William Bennett and Donald Herriott produced the first helium–neon laser, which was the first continuous wave (CW) laser, operating at 1.15 μm. In 1961, Boyle and Nelson developed a continuously operating ruby laser and, in 1962, Kumar Patel, Faust, McFarlane and Bennett discovered five noble gas lasers and lasers using oxygen mixtures. In 1964, C.K.N. Patel created the high-power carbon dioxide laser operating at 10.6 μm. In the same year, J.F. Geusic and R.G. Smith produced the first Nd:YAG laser, using neodymium-doped yttrium aluminum garnet crystals and operating at 1.06 μm.

Characteristics

Due to its operation between low-lying vibrational energy states of the CO2 molecule, the CO2 laser has a high quantum efficiency, ~40%, which makes it extremely attractive as a high-power industrial materials processing laser (1 to 20 kW), where energy and running costs are a major consideration. Due to the cooling required to retain the population inversion, the efficiency of electrical pumping, and optical losses, commercial systems have an overall efficiency of approximately 10%; whilst this may seem low, for lasers it is still a high efficiency. The CO2 laser is widely used in other fields, for example, surgical applications, remote sensing, and measurement. It emits infrared radiation with a wavelength that can range from 9 μm up to 11 μm. Laser action may occur on one of two transitions: (00⁰1) → (10⁰0), λ = 10.6 μm, or (00⁰1) → (02⁰0), λ = 9.6 μm; see Figure 1. The 10.6 μm transition has the maximum probability of oscillation and gives the strongest output; hence, this is the usual wavelength of operation, although for specialist applications the laser can be forced to operate on the 9.6 μm line. Figure 1 illustrates an energy level diagram with four vibrational energy groupings that include all the significantly populated energy levels. The internal relaxation rates within these groups are considered to be infinitely fast when compared with the rate of energy transfer between the groups; in reality the internal relaxation rates are at least an order of magnitude greater than the rates between groups. Excitation of the upper laser level is usually provided by an electrical glow discharge. However, gas dynamic lasers have been built in which expanding a hot gas through a supersonic nozzle creates a nonequilibrium region in the downstream gas flow with a large population inversion, producing a very high-power output beam (135 kW – Avco Everett Research Lab). For some time the gas dynamic laser was seriously considered for use in the space-based Strategic Defence Initiative (SDI-USA). The gas mixture used in a CO2 laser is usually a mixture of carbon dioxide, nitrogen, and helium. The proportions of these gases vary from one laser system to another; however, a typical mixture is 10% CO2, 10% N2, 80% He.

Figure 1 Six level model used for the theoretical description of CO2 laser action.


Helium plays a vital role in the operation of the CO2 laser in that it maintains the population inversion by depopulating the lower laser level via nonradiative collision processes. Helium is also important for stabilization of the gas discharge; furthermore, it greatly improves the thermal conductivity of the gas mixture, which assists in the removal of waste heat via heat exchangers. Small quantities of other gases are often added to commercial systems in order to optimize particular performance characteristics or stabilize the gas discharge; for brevity, we concern ourselves here only with this simple gas mixture.

Molecular Dynamics

Direct Excitation and De-excitation

It is usual for excitation to be provided by an electrical glow discharge. The direct excitation of carbon dioxide (CO2) and nitrogen (N2) ground state molecules proceeds via inelastic collisions with fast electrons. The rates of kinetic energy transfer are α and γ, respectively, and are given by eqns [1] and [2]:

\alpha = \frac{F_{CO_2} \times IP}{E_{00^0 1} \times n_0} \quad (s^{-1})   [1]

\gamma = \frac{F_{N_2} \times IP}{E_{v=1} \times n_4} \quad (s^{-1})   [2]

where F_{CO_2} is the fraction of the input power (IP) coupled into the excitation of the energy level E_{00^0 1}, n_0 is the CO2 ground level population density, F_{N_2} is the fraction of the input power (IP) coupled into the excitation of the energy level E_{v=1}, and n_4 is the N2 ground level population density.

The reverse process occurs when molecules lose energy to the electrons and the electrons gain an equal amount of kinetic energy; the direct de-excitation rates are given by η and β, eqns [3] and [4], respectively:

\eta = \alpha \exp\left(\frac{E_{00^0 1}}{E_e}\right) \quad (s^{-1})   [3]

\beta = \gamma \exp\left(\frac{E_{v=1}}{E_e}\right) \quad (s^{-1})   [4]

where E_e is the average electron energy in the discharge. E_e, F_{CO_2}, and F_{N_2} are obtained by solution of the Boltzmann transport equation (BTE); the average electron energy can be optimized to maximize the efficiency (F_{CO_2}, F_{N_2}) with which electrical energy is utilized to create a population inversion. Hence, the discharge conditions required to maximize efficiency can be predicted from the transport equation. Figure 2 shows one solution of the BTE for the electron energy distribution function.

Figure 2 Electron energy distribution function.

Resonant Energy Transfer

Resonant energy transfer between the CO2 (00⁰1) and N2 (v = 1) energy levels (denoted 1 and 5 in Figure 1) proceeds via excited molecules colliding with ground state molecules. A large percentage of the excitation of the upper laser level takes place via collisions between excited N2 molecules and ground state CO2 molecules. The generally accepted rates for this energy transfer are given by eqns [5] and [6]:

K_{51} = 19\,000\, P_{CO_2} \quad (s^{-1})   [5]

K_{15} = 19\,000\, P_{N_2} \quad (s^{-1})   [6]

where P_{CO_2} and P_{N_2} are the respective gas partial pressures in Torr. Hence, CO2 molecules are excited into the upper laser level both by electron impact and by impact with excited N2 molecules. The contribution from N2 molecules can be greater than 40%, depending on the discharge conditions.


Collision-Induced Vibrational Relaxation of the Upper and Lower Laser Levels

The important vibrational relaxation processes are illustrated in Figure 1 and can be evaluated from eqns [7]–[10], where the subscripts refer to the rates between energy levels 1 and 32, 21 and 31, 22 and 31, and 32 and 0, respectively:

K_{1\,32} = 367\, P_{CO_2} + 110\, P_{N_2} + 67\, P_{He} \quad (s^{-1})   [7]

K_{21\,31} = 6 \times 10^5\, P_{CO_2} \quad (s^{-1})   [8]

K_{22\,31} = 5.15 \times 10^5\, P_{CO_2} \quad (s^{-1})   [9]

K_{32\,0} = 200\, P_{CO_2} + 215\, P_{N_2} + 3270\, P_{He} \quad (s^{-1})   [10]

K_{1\,32}, K_{21\,31}, and K_{22\,31} are vibration/vibration transfer rates and K_{32\,0} is a vibration/translation transfer rate. Note the important effect of helium in eqn [10]: helium plays a major role in depopulating the lower laser level, thus enhancing the population inversion. P_{He} is the partial pressure of helium in Torr.

Radiative Relaxation

Spontaneous radiative decay is not a major relaxation process in the CO2 laser, but it is responsible for starting laser action via spontaneous emission. The Einstein 'A' coefficient for the laser transition is given by eqn [11]:

A = (t_{sp})^{-1} = 0.213 \quad (s^{-1})   [11]

Gain

The gain (g) is evaluated from the product of the absorption coefficient (s) and the population inversion, eqn [12]:

g = s\left(n_{00^0 1} - \frac{g_1}{g_2}\, n_{10^0 0}\right) \quad (cm^{-1})   [12]

For most commercial laser systems the absorption coefficient is that for high-pressure collision broadening (P > 5.2 Torr), where the intensity distribution function describing the line shape is Lorentzian. The following expression describes the absorption coefficient, eqn [13]:

s = \frac{692.5}{T\, n_{CO_2}\left(1 + 1.603\,\dfrac{n_{N_2}}{n_{CO_2}} + 1.4846\,\dfrac{n_{He}}{n_{CO_2}}\right)} \quad (cm^2)   [13]

where T is the absolute temperature and n refers to the population density of the gas designated by the subscript. This expression takes account of the different constituent molecular velocity distributions and the different collision cross-sections for CO2 → CO2, N2 → CO2 and He → CO2 type collisions. Equation [13] also takes account of the significant line broadening effect of helium. Neglecting the unit change in rotational quantum number, the energy level degeneracies g_1 and g_2 may be dropped. n_{10^0 0} is partitioned such that n_{10^0 0} = 0.1452 n_2, and eqn [12] can be re-cast as eqn [14]:

g = s(n_1 - 0.1452\, n_2) \quad (cm^{-1})   [14]

where n_1 and n_2 are the population densities of energy groups '1' and '2', respectively.
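To make eqns [5]–[10] and [14] concrete, the sketch below evaluates the transfer and relaxation rates for the typical 10% CO2 / 10% N2 / 80% He mixture at an assumed total pressure of 100 Torr, together with the gain for assumed population densities (all numbers illustrative):

```python
p_total = 100.0                                  # total pressure, Torr (assumed)
p_co2, p_n2, p_he = 0.10 * p_total, 0.10 * p_total, 0.80 * p_total

K51  = 19_000 * p_co2                            # eqn [5]: N2* -> CO2 upper level
K132 = 367 * p_co2 + 110 * p_n2 + 67 * p_he      # eqn [7]: upper-level V-V relaxation
K320 = 200 * p_co2 + 215 * p_n2 + 3270 * p_he    # eqn [10]: lower-group V-T relaxation

s = 3.0e-18                                      # absorption coefficient, cm^2 (assumed)
n1, n2 = 2.0e15, 1.0e15                          # group populations, cm^-3 (assumed)
g = s * (n1 - 0.1452 * n2)                       # eqn [14]
print("K51 = %.2e s^-1, K132 = %.2e s^-1, K320 = %.2e s^-1" % (K51, K132, K320))
print("gain g = %.4f cm^-1" % g)
```

Note how the helium term dominates K320: this is the quantitative expression of helium's role in emptying the lower laser level.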

Stimulated Emission

Consider a laser oscillator with two plane mirrors, one placed at either end of the active gain medium, with one mirror partially transmitting (see Figure 4). Laser action is initiated by spontaneous emission that happens to produce radiation whose direction is normal to the end mirrors and falls within the resonant modes of the optical resonator. The rate of change of the photon population density (I_p) within the laser cavity can be written as eqn [15]:

\frac{dI_p}{dt} = I_p c g - \frac{I_p}{T_0}   [15]

where the first term on the right-hand side accounts for the effect of stimulated emission and the second term represents the number of photons that decay out of the laser cavity. T_0 is the photon decay time, given by eqn [16], and is defined as the average time a photon remains inside the laser cavity before being lost, either through the laser output window or due to dispersion; if dispersion is ignored, I_p/T_0 is the laser output:

T_0 = \frac{2L}{c \ln\left(\dfrac{1}{R_B R_F}\right)}   [16]

where L is the distance between the back and front mirrors, which have reflectivities R_B and R_F, respectively. The dominant laser emission occurs on a rotational–vibrational P branch transition, P(22), that is, the (J = 21) → (J = 22) line of the (00⁰1) → (10⁰0), λ = 10.6 μm transition, where J is the rotational quantum number. The rotational level relaxation rate is so rapid that equilibrium is maintained between the rotational levels, so that they feed all their energy through the P(22) transition. This model simply assumes constant intensity, basing laser performance on the performance of an average unit volume. By introducing the stimulated emission term into the molecular rate equations, which describe the rate of transfer of molecules between the various energy levels illustrated in Figure 1, a set of molecular rate equations [17]–[21] can be written that permits simulation of the performance of a carbon dioxide laser:

\frac{dn_1}{dt} = \alpha n_0 - \eta n_1 + K_{51} n_5 - K_{15} n_1 - K_{sp} n_1 - K_{1\,32}\left[n_1 - \left(\frac{n_1}{n_{32}}\right)_e n_{32}\right] - I_p c g   [17]

\frac{dn_2}{dt} = K_{sp} n_1 + I_p c g - K_{21\,31}\left[n_{21} - \left(\frac{n_{21}}{n_{31}}\right)_e n_{31}\right] - K_{22\,31}\left[n_{22} - \left(\frac{n_{22}}{n_{31}}\right)_e n_{31}\right]   [18]

\frac{dn_3}{dt} = 2K_{21\,31}\left[n_{21} - \left(\frac{n_{21}}{n_{31}}\right)_e n_{31}\right] + 2K_{22\,31}\left[n_{22} - \left(\frac{n_{22}}{n_{31}}\right)_e n_{31}\right] + K_{1\,32}\left[n_1 - \left(\frac{n_1}{n_{32}}\right)_e n_{32}\right] - K_{32\,0}\left[n_{32} - \left(\frac{n_{32}}{n_0}\right)_e n_0\right]   [19]

\frac{dn_5}{dt} = \gamma n_4 - \beta n_5 - K_{51} n_5 + K_{15} n_1   [20]

\frac{dI_p}{dt} = I_p c g - \frac{I_p}{T_0}   [21]

The terms in square brackets ensure that the system maintains thermodynamic equilibrium; the subscript 'e' indicates that the populations in the square brackets are the values for thermodynamic equilibrium. The set of five simultaneous differential equations can be solved using a Runge–Kutta method. They can provide valuable performance prediction data that is helpful in optimizing laser design, especially for operation in the pulsed mode. Figure 3a illustrates some simulation results for the transverse flow laser shown in Figure 9. The results illustrate the effect of altering the gas mixture and how this can be used to control the gain-switched spike, which would result in unwelcome workpiece plasma generation if allowed to become too large. Figure 3b shows an experimental laser output pulse from the high pulse repetition frequency (prf – 5 kHz) laser illustrated in Figure 9. This shows that even a quite basic physical model can give a good prediction of laser output performance.
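The structure of eqns [17]–[21] can be integrated numerically with any Runge–Kutta style solver. The sketch below is a deliberately stripped-down version: the lower group is lumped into a single level, the equilibrium-ratio terms are collapsed into simple decay terms, the thermal group n3 is omitted, and every rate or density marked 'assumed' is an illustrative number rather than a value from the article.

```python
import numpy as np
from scipy.integrate import solve_ivp

P_CO2, P_N2, P_He = 10.0, 10.0, 80.0          # partial pressures, Torr
alpha = gamma_ = 1500.0                       # electron pumping rates, s^-1 (assumed)
eta = beta = 1.0                              # direct de-excitation, s^-1 (assumed)
K51, K15 = 19_000 * P_CO2, 19_000 * P_N2      # eqns [5], [6]
K1 = 367 * P_CO2 + 110 * P_N2 + 67 * P_He     # upper-level relaxation, eqn [7]
K2 = 200 * P_CO2 + 215 * P_N2 + 3270 * P_He   # lower-group relaxation, eqn [10]
s, c = 3.0e-18, 3.0e10                        # cross-section cm^2 (assumed), cm/s
L, RB, RF = 100.0, 1.0, 0.7                   # cavity length (cm) and mirror reflectivities
T0 = 2 * L / (c * np.log(1.0 / (RB * RF)))    # photon decay time, eqn [16]

def rhs(t, y):
    n0, n1, n2, n4, n5, Ip = y                # CO2 ground/upper/lower, N2 ground/excited, photons
    g = s * (n1 - 0.1452 * n2)                # gain, eqn [14]
    dn1 = alpha * n0 - eta * n1 + K51 * n5 - K15 * n1 - K1 * n1 - Ip * c * g
    dn2 = Ip * c * g + K1 * n1 - K2 * n2      # lumped lower group
    dn5 = gamma_ * n4 - beta * n5 - K51 * n5 + K15 * n1     # eqn [20]
    dIp = Ip * c * g - Ip / T0                # eqn [21]
    return [-(dn1 + dn2), dn1, dn2, -dn5, dn5, dIp]   # crude particle bookkeeping

y0 = [2.4e17, 0.0, 0.0, 2.4e17, 0.0, 1.0]     # ground densities for ~10 Torr; one seed photon
sol = solve_ivp(rhs, (0.0, 3e-4), y0, method="LSODA", rtol=1e-6)
ipk = np.argmax(sol.y[5])
print("gain-switched spike: %.2e photons/cm^3 at t = %.1f us" % (sol.y[5][ipk], 1e6 * sol.t[ipk]))
```

Even this crude model reproduces the gain-switched spike of Figure 3: the photon density stays near zero while the inversion builds, then bursts once the gain crosses the cavity-loss threshold 1/(cT0).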

Optical Resonator

Figure 4 shows a simple schematic of an optical resonator. This simple optical system consists of two mirrors (the full reflector is often a water-cooled, gold-coated copper mirror), which are aligned to be orthogonal to the optical axis that runs centrally along the length of the active gain medium in which there is a population inversion. The output coupler is a partial reflector (usually dielectrically coated zinc selenide, ZnSe, which may be edge cooled) so that some of the electromagnetic radiation can escape as an output beam. The ZnSe output coupler has a natural reflectivity of about 17% at each air–solid interface. For high power lasers (2 kW), 17% is sufficient for laser operation; however, depending on the required laser performance, the inside surface is often given a reflective coating. The reflectivity of the inside face depends on the balance between the gain (eqn [14]), the output power, and the power stability requirements. The outside face of the partial reflector must be anti-reflection (AR) coated. Spontaneous emission occurs within the active gain medium and radiates randomly in all directions; a fraction of this spontaneous emission will be in the same direction as the optical axis, perpendicular to the end mirrors, and will also fall into a resonant mode of the optical resonator. Spontaneous emission photons interact with CO2 molecules in the excited upper laser level (00⁰1), stimulating these molecules to give up a quantum of vibrational energy as a photon via the radiative transition (00⁰1) → (10⁰0), λ = 10.6 μm. The radiation given up has exactly the same phase and direction as the stimulating radiation and is thus coherent with it. The reverse process of absorption also occurs but, so long as there is a population inversion, there will be a net positive output. This process is called light amplification by stimulated emission of radiation (LASER). The mirrors continue to redirect the photons parallel to the optical axis and, so long as the population inversion is not depleted, more and more photons are generated; stimulated emission dominates the process, although spontaneous emission remains important for initiating laser action. Light emitted by lasers contains several optical frequencies, which are a function of the different modes of the optical resonator; these are simply the standing wave patterns that can exist within the resonator structure.


Figure 3 (a) Predicted output pulses for a transverse flow CO2 laser for different gas mixtures; (b) experimental output pulse from a transverse flow CO2 laser.

Figure 4 Optical resonator.

There are two types of resonator modes: longitudinal and transverse. Longitudinal modes differ from one another in their frequency of oscillation, whereas transverse modes differ from one another in their oscillation frequency and in their field distribution in a plane orthogonal to the direction of propagation. Typically CO2 lasers have a large number of longitudinal modes; in CO2 laser applications these are of less interest than the transverse modes, which determine the transverse beam intensity and the nature of the beam when focused. In cylindrical coordinates the transverse modes are labelled TEM_pl, where the subscript 'p' is the number of radial nodes and 'l' is the number of angular nodes. The lowest order mode is the TEM00, which has a Gaussian-like intensity profile with its maximum on the beam axis. A light beam emitted from an optical resonator with a Gaussian profile is said to be operating in the 'fundamental mode', or the TEM00 mode. The decrease in irradiance I with distance r from the axis (where the irradiance is I_0) of a Gaussian beam is described by eqn [22]:

I(r) = I_0 \exp(-2r^2/w^2)   [22]

where w is the radial distance at which the power density has decreased to 1/e² of its axial value. Ideally, a commercial laser should be capable of operation in the fundamental mode as, with few exceptions, this results in the best performance in applications. Laser cutting benefits from operation in the fundamental mode; however, welding or heat treatment applications may benefit from operation with higher-order modes. Output beams are usually controlled to be linearly or circularly polarized, depending upon the requirements of the application. For materials processing applications the laser beam is usually focused via a water-cooled ZnSe lens or, for very high power lasers, a parabolic gold-coated mirror. Welding applications will generally use a long focal length lens, and cutting applications will use a short focal length, which generates a higher irradiance at the workpiece than that necessary for welding. The beam delivery optics are usually incorporated into a nozzle assembly that can deliver cooling water and assist gases for cutting, and anti-oxidizing shroud gases for welding or surface engineering applications.
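A quick corollary of eqn [22]: integrating the profile over the beam cross-section shows that the fraction of total power inside radius r is 1 − exp(−2r²/w²), so the 1/e² radius w encircles about 86.5% of the power. A short numerical check:

```python
import numpy as np

w = 1.0                                     # 1/e^2 beam radius (arbitrary units)
r = np.linspace(0.0, 3.0 * w, 1000)
I = np.exp(-2 * r**2 / w**2)                # eqn [22] with I0 = 1
P = np.cumsum(I * 2 * np.pi * r) * (r[1] - r[0])   # power enclosed within radius r
print("power within r = w: %.1f%%" % (100 * P[np.searchsorted(r, w)] / P[-1]))
```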

Laser Configuration

CO2 lasers are available in many different configurations and tend to be classified on the basis of their physical form and the gas flow arrangement, both of which greatly affect the available output power and the beam quality. The main categories are: sealed-off lasers, waveguide lasers, slow axial flow, fast axial flow, diffusion cooled, transverse flow, transversely excited atmospheric lasers, and gas dynamic lasers.

Sealed-Off and Waveguide Lasers

Depopulation of the lower laser level is via collisions with the walls of the discharge tube, so the attainable output power scales with the length of the discharge column and not its diameter. Output powers are in the range 5 W to 250 W. Devices may be constructed from concentric glass tubes, with the inner tube providing the discharge cavity and the outer tube containing water-cooling of the inner discharge tube; the inner tube walls act as a heat sink for the discharge thermal energy (see Figure 5). The DC electrical discharge is provided between a cathode and an anode situated at either end of the discharge tube. A catalyst must be provided to ensure regeneration of CO2 from CO. This may be accomplished by adding about 1% of H2O to the gas mixture; alternatively, recombination can be achieved via a hot (300 °C) Ni cathode, which acts as a catalyst. RF-excited, all-metal, sealed-off tube systems can deliver lifetimes greater than 45 000 hours. Diffusion-cooled slab laser technology will also deliver reliable sealed operation for 20 000 hours. Excitation of the laser medium occurs via RF excitation between two water-cooled electrodes, which dissipate the heat generated in the gas discharge (diffusion cooling). An unstable optical resonator provides the output coupling for such a device (see Figure 6). Output powers are in the range 5 W to 300 W and can be pulsed from 0 to 100 kHz. These lasers are widely used for marking, rapid prototyping, and cutting of nonmetals (paper, glass, plastics, ceramics) and metals. Waveguide CO2 lasers use small bore tubes (2–4 mm) made of BeO or SiO2, in which the laser radiation is guided by the tube walls. Due to the small tube diameter, a total gas pressure of 100 to 200 Torr is necessary; hence the gain per unit length is high. This type of laser will deliver 30 W of output power from a relatively short (50 cm long) compact sealed-off design; such a system is useful for microsurgery and scientific applications. Excitation can be provided by a longitudinal DC discharge or by an RF source transverse to the optical axis; RF excitation avoids the requirement for an anode and cathode and results in a much lower electrode voltage.

Figure 5 Schematic of sealed-off CO2 laser, approximately 100 W per meter of gain length, gas cooled by diffusion to the wall.

Slow Axial Flow Lasers

Figure 6 Schematic for sealed-off slab laser and diffusion cooled laser with RF excitation (courtesy of Rofin).

In slow flow lasers the gas mixture flows slowly through the laser cavity. This is done to remove the products of dissociation that would reduce laser efficiency or prevent the laser from operating at all; the main contaminant is CO. The dissociated gases (mainly CO and O2) can be recombined using a catalyst pack and then reused via continuous recirculation. Heat is removed via diffusion through the walls of the tube containing the active gain medium. The tube is frequently made of Pyrex glass with a concentric outer tube to facilitate water-cooling of the laser cavity (see Figure 7). Slow flow lasers operate in the power range 100 W to 1500 W and tend to use a longitudinal DC electrical discharge, which can be made to run continuously, or pulsed if a thyratron switch is built into the power supply; alternatively, electrical power can be supplied via transverse RF excitation. The power scales with length; hence high power slow flow lasers have long cavities and require multiple cavity folds in order to reduce their physical size.

Figure 7 Slow flow CO2 laser, approximately 100 W per meter of gain length, gas cooled by diffusion to the wall.

Fast Axial Flow Lasers

The fast axial flow laser, Figure 8, can provide output powers from 1 kW to 20 kW; it is this configuration that dominates the use of CO2 lasers for industrial applications. Industrial lasers are usually in the power range 2– 4 kW. The output power from these devices scales with mass flow, hence the gas mixture is recycled through the laser discharge region at sonic or supersonic velocities. Historically this was achieved using Rootes blowers to compress the gas upstream of the laser cavity. Rootes compressors are inherently inefficient and the more advanced laser systems utilize turbine compressors, which deliver greater efficiency and better laser stability. Rootes compressors can be a major source of vibration. With this arrangement heat exchangers are required to remove heat after the laser discharge region and also after the compressor stage, as the compression process heats up the laser gases. Catalyst packs are used to regenerate gases but some gas replacement is often required. These laser systems have short cavities and use folded stable resonator designs to achieve higher output powers with extremely high-quality beams that are particularly suitable for cutting applications. They also give

Figure 7 Slow flow CO2 laser, approximately 100 W per meter of gain length, gas cooled by diffusion to the wall.

excellent results when used for welding and surface treatments. Fast axial flow lasers can be excited by a longitudinal DC discharge or a transverse RF discharge; both types of electrical excitation are common. For materials processing applications it is often important to be able to run a laser in continuous wave (CW) mode or as a high pulse-repetition-frequency (prf) pulsed laser, and to be able to switch between CW and pulsed operation in real time; for instance, laser cutting of accurate internal corners is difficult using CW operation but very easy using the pulsed mode of operation. Both methods of discharge excitation can provide this facility.

Diffusion Cooled Laser

The diffusion-cooled slab laser is RF-excited and gives an extremely compact design capable of delivering 4.5 kW, pulsed from 8 Hz to 5 kHz prf or CW, with good beam quality (see Figure 6). The optical resonator is formed by the front and rear mirrors and two parallel water-cooled RF electrodes. Diffusion cooling is provided by the RF electrodes, removing the requirement for conventional gas recirculation via Roots blowers or turbines.


Figure 8 Fast axial flow carbon dioxide laser.

This design of laser results in a device with an extremely small footprint and low maintenance and running costs. Applications include cutting, welding, and surface engineering.

Fast Transverse Flow Laser

In the fast transverse flow laser (Figure 9a) the gas flow, electrical discharge, and output beam are at right angles to each other (Figure 9b). The transverse discharge can be high-voltage DC, RF, or pulsed up to 8 kHz (Figure 9c). Very high output power per unit discharge length can be obtained at an optimum total pressure (P) of ~100 Torr; systems are available delivering 10 kW of output power, CW or pulsed (see Figures 3a and b). The increase in total pressure requires a corresponding increase in the gas discharge electric field, E, because the ratio E/P must remain constant: this ratio determines the temperature of the discharge electrons, which have an optimum mean value (optimum energy distribution, Figure 2) for efficient excitation of the population inversion. With this high value of electric field, a longitudinal-discharge arrangement is impractical (500 kV for a 1 m discharge length); hence the discharge is applied perpendicular to the optical axis. Fast transverse flow gas lasers provided the first multikilowatt outputs but tend to be expensive to maintain and operate. In order to obtain a reasonable beam quality, the output coupling is often obtained using a multipass unstable resonator. The population inversion is available over a wide rectangular cross-section, which is a disadvantage of this arrangement, and beam quality is not as good as that obtainable from fast axial flow designs. For this reason this type of laser is suitable for a wide range of welding and surface treatment applications.
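The constant-E/P argument can be made concrete with a short sketch. The E/P value below is inferred from the 500 kV across 1 m at ~100 Torr example above; the 2 cm transverse gap is an assumed, illustrative electrode spacing:

```python
# Why high-pressure discharges are excited transversely: the ratio E/P is
# held constant, so the required voltage scales with pressure times gap.
E_OVER_P = 500e3 / (1.0 * 100.0)  # V per (m * Torr), inferred from the text

def discharge_voltage_v(pressure_torr: float, gap_m: float) -> float:
    """Voltage sustaining the optimum E/P across a given electrode gap."""
    return E_OVER_P * pressure_torr * gap_m

print(f"longitudinal, 100 Torr, 1 m gap:  {discharge_voltage_v(100, 1.0) / 1e3:6.0f} kV")
print(f"transverse,   100 Torr, 2 cm gap: {discharge_voltage_v(100, 0.02) / 1e3:6.0f} kV")
```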

Transversely Excited Atmospheric (TEA) Pressure Lasers

If the gas total pressure is increased above ~100 Torr it is difficult to sustain a stable glow discharge, because above this pressure instabilities degenerate into arcs within the discharge volume. This problem can be overcome by pulsed excitation: with submicrosecond pulse durations, instabilities do not have sufficient time to develop; hence the gas pressure can be increased above atmospheric pressure and the laser operated in a pulsed mode. In a mode-locked format, optical pulses shorter than 1 ns can be produced. This is called a TEA laser and, with a transverse gas flow, it is capable of producing short high-power pulses at up to a few kHz repetition frequency. In order to prevent arc formation, TEA lasers usually employ ultraviolet or e-beam preionization of the gas discharge just prior to the main current pulse being applied via a thyratron switch. Output coupling is usually via an unstable resonator. TEA lasers are used for marking, remote sensing, range-finding, and scientific applications.

Conclusions

It is 40 years since Patel operated the first high-power CO2 laser. This led to the first generation of lasers, which were quickly exploited for industrial laser materials processing, medical applications, defense,


Figure 9 (a) Transverse flow carbon dioxide laser gas recirculator; (b) transverse flow carbon dioxide electrodes; (c) transverse flow carbon dioxide gas discharge as seen from the output window.

and scientific research applications; however, the first generation of lasers was quite unreliable and temperamental. After many design iterations, the CO2 laser has now matured into a reliable, stable laser source available in many different geometries and power ranges. The low cost of ownership of the latest generation of CO2 lasers makes them a very attractive commercial proposition for many industrial and scientific applications. Commercial

lasers incorporate many novel design features that are beyond the scope of this article and are often peculiar to the particular laser manufacturer. These include gas additives and catalysts that may be required to stabilize the gas discharge of a particular laser design; it is this optimization of the laser design that has produced such reliable and controllable low-cost performance from the CO2 laser.


List of Units and Nomenclature

α   direct excitation rate of carbon dioxide (CO2) ground-state molecules (s⁻¹)
β   direct de-excitation rate of nitrogen (N2) (s⁻¹)
γ   direct excitation rate of nitrogen (N2) ground-state molecules (s⁻¹)
η   direct de-excitation rate of carbon dioxide (CO2) (s⁻¹)
λ   wavelength (m)
σ   absorption coefficient (cm²)
τ   electrical current pulse length (ms)
τ_sp   spontaneous emission lifetime of the upper laser level (s)
A = (τ_sp)⁻¹ = 0.213 s⁻¹   the Einstein A coefficient for the laser transition
c   velocity of light (cm s⁻¹)
C_c   coupling capacitance (nF)
CO2   carbon dioxide
e   subscript indicating that the populations in the square brackets are the values for thermodynamic equilibrium
E   electric field (V cm⁻¹)
F_CO2   fraction of the input power (IP) coupled into the excitation of the energy level E(0001)
F_N2   fraction of the input power (IP) coupled into the excitation of the energy level E(v = 1)
g   gain (cm⁻¹)
g1, g2   energy level degeneracies of levels 1 and 2
He   helium
I   beam irradiance (W cm⁻²)
I0   beam irradiance at the center of a Gaussian laser beam (W cm⁻²)
Ip   photon population density (photons cm⁻³)
Ip   input current (A)
IP   electrical input power (W cm⁻³)
J   rotational quantum number
K51, K15   resonant energy transfer rates between the CO2(0001) and N2(v = 2) energy levels; transfer proceeds via excited molecules colliding with ground-state molecules (s⁻¹)
K1,32; K21,31; K22,31   vibration/vibration transfer rates (s⁻¹) between energy levels 1 and 32, 21 and 31, and 22 and 31, respectively (see Figure 1)
K32,0   vibration/translation transfer rate (s⁻¹) between energy levels 32 and 0 (see Figure 1)
K_sp, A   spontaneous emission rate (s⁻¹)
L   distance between the back and front mirrors, which have reflectivities RB and RF (cm)
n   molecular population (molecules cm⁻³)
N2   nitrogen
P   pressure (Torr)
P_CO2, P_He, P_N2   respective gas partial pressures (Torr)
P_in   electrical input power (kW)
r   radius of laser beam (cm)
RB   back mirror reflectivity
RF   front mirror reflectivity
t   time (s)
T   temperature (K)
T0   photon decay time (s)
w   radial distance at which the power density has decreased to 1/e² of its axial value

See also

Fiber and Guided Wave Optics: Overview. Lasers: Noble Gas Ion Lasers.

Further Reading

Anderson JD (1976) Gasdynamic Lasers: An Introduction. New York: Academic Press.
Chatwin CR, McDonald DW and Scott BF (1991) Design of a high p.r.f. carbon dioxide laser for processing high damage threshold materials. In: Selected Papers on Laser Design, SPIE Milestone Series, pp. 425–433. Washington: SPIE Optical Engineering Press.
Cool AC (1969) Power and gain characteristics of high speed flow lasers. Journal of Applied Physics 40(9): 3563.
Crafer RC, Gibson AF, Kent MJ and Kimmitt MF (1969) Time-dependent processes in CO2 laser amplifiers. British Journal of Applied Physics 2(2): 183.
Gerry ET and Leonard AD (1966) Measurement of 10.6-μm CO2 laser transition probability and optical broadening cross sections. Applied Physics Letters 8(9): 227.
Gondhalekar A, Heckenberg NR and Holzhauer E (1975) The mechanism of single-frequency operation of the hybrid CO2 laser. IEEE Journal of Quantum Electronics QE-11(3): 103.
Gordiets BF, Sobolev NN and Shelepin LA (1968) Kinetics of physical processes in CO2 lasers. Soviet Physics JETP 26(5): 1039.
Herzberg G (1945) Molecular Spectra and Molecular Structure, vol. 2: Infrared and Raman Spectra of Polyatomic Molecules. New York: Van Nostrand.


Hoffman AL and Vlases GC (1972) A simplified model for predicting gain, saturation and pulse length for gas dynamic lasers. IEEE Journal of Quantum Electronics 8(2): 46.
Johnson DC (1971) Excitation of an atmospheric pressure CO2–N2–He laser by capacitor discharges. IEEE Journal of Quantum Electronics QE-7(5): 185.
Koechner W (1988) Solid State Laser Engineering. Berlin: Springer-Verlag.
Kogelnik H and Li T (1966) Laser beams and resonators. Applied Optics 5: 1550–1567.
Levine AK and De Maria AJ (1971) Lasers, vol. 3, chapter 3. New York: Marcel Dekker.
Moeller G and Rigden JD (1965) Applied Physics Letters 7: 274.
Moore CB, Wood RE, Bei-Lok Hu and Yardley JT (1967) Vibrational energy transfer in CO2 lasers. Journal of Chemical Physics 11: 4222.
Patel CKN (1964) Physical Review Letters 12: 588.

Siegman AE (1986) Lasers. Mill Valley, CA: University Science Books.
Smith K and Thomson RM (1978) Computer Modeling of Gas Lasers. New York and London: Plenum Press.
Sobolev NN and Sokovikov VV (1967) CO2 lasers. Soviet Physics Uspekhi 10(2): 153.
Svelto O (1998) Principles of Lasers, 4th edn. New York: Plenum.
Tychinskii VP (1967) Powerful lasers. Soviet Physics Uspekhi 10(2): 131.
Vlases GC and Money WM (1972) Numerical modelling of pulsed electric CO2 lasers. Journal of Applied Physics 43(4): 1840.
Wagner WG, Haus HA and Gustafson KT (1968) High rate optical amplification. IEEE Journal of Quantum Electronics QE-4: 287.
Witteman WJ (1987) The CO2 Laser. Springer Series in Optical Sciences, vol. 53. Berlin: Springer.

Dye Lasers

F J Duarte, Eastman Kodak Company, New York, NY, USA
A Costela, Consejo Superior de Investigaciones Científicas, Madrid, Spain

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

Background

Dye lasers are the original tunable lasers. Discovered in the mid-1960s, these tunable sources of coherent radiation span the electromagnetic spectrum from the near-ultraviolet to the near-infrared (Figure 1). Dye lasers spearheaded and sustained the revolution in

atomic and molecular spectroscopy and have found use in many and diverse fields, from medical to military applications. In addition to their extraordinary spectral versatility, dye lasers have been shown to oscillate from the femtosecond pulse domain to the continuous-wave (cw) regime. For microsecond pulse emission, energies of up to hundreds of joules per pulse have been demonstrated. Further, operation at high pulse repetition frequencies (prfs), in the multi-kHz regime, has provided average powers at kW levels. This unrivaled operational versatility is summarized in Table 1. Dye lasers are excited by coherent optical energy from an excitation, or pump, laser or by optical energy from specially designed lamps called flashlamps. Recent advances in semiconductor laser

Figure 1 Approximate wavelength span from the various classes of laser dye molecules. Reproduced with permission from Duarte FJ (1995) Tunable Laser Handbook. New York: Academic Press.


Table 1 Emission characteristics of liquid dye lasers

Dye laser class                   Spectral coverage^a   Energy per pulse^b   Prf^b      Power^b
Laser-pumped pulsed dye lasers    350–1100 nm           800 J^c              13 kHz^d   2.5 kW^d
Flashlamp-pumped dye lasers       320–900 nm            400 J^e              850 Hz^f   1.2 kW^f
CW dye lasers                     370–1000 nm           –                    –          43 W^g

^a Approximate range.
^b Refers to maximum values for that particular emission parameter.
^c Achieved with an excimer-laser-pumped coumarin dye laser by Tang and colleagues, in 1987.
^d Achieved with a multistage copper-vapor-laser-pumped dye laser using rhodamine dye by Bass and colleagues, in 1992.
^e Reported by Baltakov and colleagues, in 1974. Uses rhodamine 6G dye.
^f Reported by Morton and Dragoo, in 1981. Uses coumarin 504 dye.
^g Achieved with an Ar⁺ laser-pumped folded-cavity dye laser using rhodamine 6G dye by Baving and colleagues, in 1982.

technology have made it possible to construct very compact all-solid-state excitation sources that, coupled with new solid-state dye laser materials, should bring the opportunity to build compact tunable laser systems for the visible spectrum. Further, direct diode-laser pumping of solid-state dye lasers should prove even more advantageous, enabling the development of fairly inexpensive tunable narrow-linewidth solid-state dye laser systems for spectroscopy and other applications requiring low powers. Work on electrically excited organic gain media might also provide new avenues for further progress. The literature on dye lasers is very rich, and many review articles have been written describing and discussing traditional dye lasers utilizing liquid gain media. In particular, the books Dye Lasers, Dye Laser Principles, High Power Dye Lasers, and Tunable Lasers Handbook provide excellent sources of authoritative and detailed description of the physics and technology involved. In this article we offer only a survey of the operational capabilities of dye lasers using liquid gain media, in order to examine in more detail the field of solid-state dye lasers.

Brief History of Dye Lasers

1965: Quantum theory of dyes is discussed in the context of the maser (R. P. Feynman).
1966: Dye lasers are discovered (P. P. Sorokin and J. R. Lankard; F. P. Schäfer and colleagues).
1967: The flashlamp-pumped dye laser is discovered (P. P. Sorokin and J. R. Lankard; W. Schmidt and F. P. Schäfer).
1967–1968: Solid-state dye lasers are discovered (B. H. Soffer and B. B. McFarland; O. G. Peterson and B. B. Snavely).
1968: Mode-locking, using saturable absorbers, is demonstrated in dye lasers (W. Schmidt and F. P. Schäfer).

1970: The continuous-wave (cw) dye laser is discovered (O. G. Peterson, S. A. Tuccio, and B. B. Snavely).
1971: The distributed feedback dye laser is discovered (H. Kogelnik and C. V. Shank).
1971–1975: Prismatic beam expansion in dye lasers is introduced (S. A. Myers; E. D. Stokes and colleagues; D. C. Hanna and colleagues).
1972: Passive mode-locking is demonstrated in cw dye lasers (E. P. Ippen, C. V. Shank, and A. Dienes).
1972: The first pulsed narrow-linewidth tunable dye laser is introduced (T. W. Hänsch).
1973: Frequency stabilization of cw dye lasers is demonstrated (R. L. Barger, M. S. Sorem, and J. L. Hall).
1976: Colliding-pulse mode-locking is introduced (I. S. Ruddock and D. J. Bradley).
1977–1978: Grazing-incidence grating cavities are introduced (I. Shoshan and colleagues; M. G. Littman and H. J. Metcalf; S. Saikan).
1978–1980: Multiple-prism grating cavities are introduced (T. Kasuya and colleagues; G. Klauminzer; F. J. Duarte and J. A. Piper).
1981: Prism pre-expanded grazing-incidence grating oscillators are introduced (F. J. Duarte and J. A. Piper).
1982: Generalized multiple-prism dispersion theory is introduced (F. J. Duarte and J. A. Piper).
1983: Prismatic negative dispersion for pulse compression is introduced (W. Dietel, J. J. Fontaine, and J-C. Diels).
1987: Laser pulses as short as six femtoseconds are demonstrated (R. L. Fork, C. H. Brito Cruz, P. C. Becker, and C. V. Shank).
1994: First narrow-linewidth solid-state dye laser oscillator (F. J. Duarte).
1999–2000: Distributed feedback solid-state dye lasers are introduced (Wadsworth and colleagues; Zhu and colleagues).


Molecular Energy Levels

Dye molecules have large molecular weights and contain extended systems of conjugated double bonds. These molecules can be dissolved in an adequate organic solvent (such as ethanol, methanol, ethanol/water, or methanol/water) or incorporated into a solid matrix (organic, inorganic, or hybrid). These molecular gain media have a strong absorption, generally in the visible and ultraviolet regions, and exhibit large fluorescence bandwidths covering the entire visible spectrum. The general energy level diagram of an organic dye is shown in Figure 2. It consists of electronic singlet and triplet states, with each electronic state containing a multitude of overlapping vibrational–rotational levels giving rise to broad continuous energy bands. Absorption of visible or ultraviolet pump light excites the molecules from the ground state S0 into some rotational–vibrational level belonging to an upper excited singlet state, from where the molecules decay nonradiatively to the lowest vibrational level of the first excited singlet state S1 on a picosecond time-scale. From S1 the molecules can decay radiatively, with a radiative lifetime on the nanosecond time-scale, to a higher-lying vibrational–rotational level of S0. From this level they rapidly thermalize into the lowest vibrational–rotational levels of S0. Alternatively, from S1, the molecules can experience nonradiative

relaxation either to the triplet state T1 by an intersystem crossing process or to the ground state by an internal conversion process. If the intensity of the pumping radiation is high enough, a population inversion between S1 and S0 may be attained and stimulated emission occurs. Internal conversion and intersystem crossing compete with the fluorescence decay mode of the molecule and therefore reduce the efficiency of the laser emission. The rate of internal conversion to the electronic ground state is usually negligibly small, so that the most important loss process is intersystem crossing into T1, which populates the lower metastable triplet state. Thus, absorption on the allowed triplet–triplet transitions can cause considerable losses if these absorption bands overlap the lasing band, inhibiting or even halting the lasing process. This triplet loss can be reduced by adding small quantities of appropriate chemicals that favor nonradiative transitions that shorten the effective lifetime of the T1 level. For pulsed excitation with nanosecond pulses, the triplet–triplet absorption can be neglected because, for a typical dye, the intersystem crossing rate is not fast enough to build up an appreciable triplet population in the nanosecond time domain. Dye molecules are large (a typical molecule incorporates 50 or more atoms) and are grouped into families with similar chemical structures. A survey of the major classes of laser dyes is given later. Solid-state laser dye gain media are also considered later.
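A minimal rate-equation sketch makes the last point quantitative: under a nanosecond pump pulse the triplet population remains a small fraction of the singlet population. All rate constants below are assumed, order-of-magnitude values typical of laser dyes, not data from this article:

```python
# Minimal singlet/triplet rate-equation sketch for a dye pumped by a
# nanosecond pulse, illustrating why triplet-triplet absorption is
# negligible for ns excitation. All rate constants are assumed.
TAU_F = 4e-9      # s, S1 fluorescence lifetime (assumed)
K_ISC = 1e7       # 1/s, intersystem-crossing rate S1 -> T1 (assumed)
TAU_T = 1e-6      # s, effective T1 lifetime (assumed)
PUMP_RATE = 1e24  # molecules cm^-3 s^-1 promoted to S1 (assumed)
T_PULSE = 10e-9   # s, square pump pulse duration
DT = 1e-11        # s, integration step

s1 = t1 = 0.0
for _ in range(int(T_PULSE / DT)):
    ds1 = PUMP_RATE - s1 / TAU_F - K_ISC * s1  # pump in, decay + ISC out
    dt1 = K_ISC * s1 - t1 / TAU_T              # fed by ISC, slow decay
    s1 += ds1 * DT
    t1 += dt1 * DT

print(f"S1 after {T_PULSE * 1e9:.0f} ns: {s1:.2e} cm^-3")
print(f"T1 after {T_PULSE * 1e9:.0f} ns: {t1:.2e} cm^-3")
print(f"T1/S1 ratio: {t1 / s1:.3f}")  # small => triplet losses negligible
```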

Liquid Dye Lasers

Laser-Pumped Pulsed Dye Lasers

Figure 2 Schematic energy level diagram for a dye molecule. Full lines: radiative transitions; dashed lines: nonradiative transitions; dotted lines: vibrational relaxation.

Laser-pumped dye lasers use a shorter-wavelength, or higher-frequency, pulsed laser as the excitation or pump source. Typical pump lasers for dye lasers are gas lasers such as the excimer, nitrogen, or copper lasers. One of the most widely used solid-state laser pumps is the frequency-doubled Nd:YAG laser, which emits at 532 nm. In a laser-pumped pulsed dye laser the active medium, or dye solution, is contained in an optical cell often made of quartz or fused silica, which provides an active region typically some 10 mm in length and a few mm in width. The active medium is then excited either longitudinally or transversely, via a focusing lens, using the pump laser. In the case of transverse excitation the pump laser is focused to a beam ~10 mm in width and ~0.1 mm in height. Longitudinal pumping requires focusing of the excitation beam to a diameter in the 0.1–0.15 mm range. For lasers operated at low prfs (a few pulses per


second), the dye solution might be static. However, for high-prf operation (a few thousand pulses per second) the dye solution must be flowed at speeds of up to a few meters per second in order to dissipate the heat. A simple broadband optically pumped dye laser can be constructed using just the pump laser, the active medium, and two mirrors to form a resonator. In order to achieve tunable, narrow-linewidth emission, a more sophisticated resonator must be employed. This is called a dispersive tunable oscillator and is depicted in Figure 3. In a dispersive tunable oscillator the exit

Figure 3 Copper-vapor-laser pumped hybrid multiple-prism near grazing incidence (HMPGI) grating dye laser oscillator. Adapted with permission from Duarte FJ and Piper JA (1984) Narrow linewidth high prf copper laser-pumped dye-laser oscillators. Applied Optics 23: 1391–1394.

side of the cavity is comprised of a partial reflector, or output coupler, and the other end of the resonator is composed of a multiple-prism grating assembly. It is the dispersive characteristics of this multiple-prism grating assembly, and the dimensions of the emission beam produced at the gain medium, that determine the tunability and the narrowness, or spectral purity, of the laser emission. In order to selectively excite a single vibrational–rotational level of a molecule such as iodine (I2), at room temperature, one needs a laser linewidth of Δν ≈ 1.5 GHz (or Δλ ≈ 0.0017 nm at λ = 590 nm). The hybrid multiple-prism near-grazing-incidence (HMPGI) grating dye laser oscillator illustrated in Figure 3 yields laser linewidths in the 400 MHz ≤ Δν ≤ 650 MHz range at 4–5% conversion efficiencies whilst excited by a copper-vapor laser operating at a prf of 10 kHz. Pulse lengths are ~10 ns at full-width half-maximum (FWHM). The narrow-linewidth emission from these oscillators is said to be single-longitudinal-mode lasing because only one electromagnetic mode is allowed to oscillate. The emission from oscillators of this class can be amplified many times by propagating the tunable narrow-linewidth laser beam through single-pass amplifier dye cells under the excitation of pump lasers. Such amplified laser emission can reach enormous average powers. Indeed, a copper-vapor-laser pumped dye laser system at the Lawrence Livermore National Laboratory (USA), designed for the laser isotope separation program, was reported to

Figure 4 Flashlamp-pumped multiple-prism grating oscillators. From Duarte FJ, Davenport WE, Ehrlich JJ and Taylor TS (1991) Ruggedized narrow-linewidth dispersive dye laser oscillator. Optics Communications 84: 310–316. Reproduced with permission from Elsevier.


yield average powers in excess of 2.5 kW, at a prf of 13.2 kHz, at a better than 50% conversion efficiency, as reported by Bass and colleagues in 1992. Besides high conversion efficiencies, excitation with copper-vapor lasers, at λ = 510.554 nm, has the advantage of inducing little photodegradation in the active medium, thus allowing very long dye lifetimes.
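The conversion behind these linewidth figures is the standard relation Δλ ≈ λ²Δν/c. A short sketch reproducing the numbers quoted above (illustrative only):

```python
# Convert a frequency linewidth to a wavelength linewidth using
# dlambda = lambda^2 * dnu / c, reproducing the figures quoted in the text
# (1.5 GHz at 590 nm for I2 excitation; 400-650 MHz for the HMPGI oscillator).
C = 2.998e8  # speed of light, m/s

def dnu_to_dlambda(dnu_hz: float, wavelength_m: float) -> float:
    """Wavelength linewidth (m) equivalent to a frequency linewidth (Hz)."""
    return wavelength_m ** 2 * dnu_hz / C

lam = 590e-9
for dnu in (1.5e9, 400e6, 650e6):
    print(f"dnu = {dnu / 1e9:5.2f} GHz at 590 nm -> "
          f"dlambda = {dnu_to_dlambda(dnu, lam) * 1e9:.5f} nm")
```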

Flashlamp-Pumped Dye Lasers

Flashlamps utilized in dye laser excitation emit at black-body temperatures in the 20 000 K range, thus yielding intense ultraviolet radiation centered around 200 nm. One further requirement for flashlamps, and their excitation circuits, is to deliver light pulses with a fast rise time; for some flashlamps this rise time can be less than a few nanoseconds. Flashlamp-pumped dye lasers differ from laser-pumped pulsed dye lasers mainly in the pulse energies and pulse lengths attainable. This means that flashlamp-pumped dye lasers, using relatively large volumes of dye, can yield pulses of very large energy. Excitation geometries use either coaxial lamps, with the dye flowing in a quartz cylinder at the center of the lamp, or two or more linear lamps arranged symmetrically around the quartz tube containing the dye solution. Using a relatively weak dye solution of rhodamine 6G (2.2 × 10⁻⁵ M), a coaxial lamp, and an active region defined by a quartz tube 6 cm in diameter and 60 cm in length, Baltakov and colleagues, in 1974, reported energies of 400 J in pulses 25 μs long at FWHM. Flashlamp-pumped tunable narrow-linewidth dye laser oscillators described by Duarte and colleagues, in 1991, employ a cylindrical active region 6 mm in diameter and 17 cm in length. The dye solution is made of rhodamine 590 at a concentration of 1 × 10⁻⁵ M. This active region is excited by a coaxial flashlamp. Using multiple-prism grating architectures (see Figure 4), these authors achieved a diffraction-limited TEM00 laser beam and laser linewidths of Δν ≈ 300 MHz at pulse energies in the 2–3 mJ range. The laser pulse duration is reported to be Δt ≈ 100 ns. The laser emission from this class of multiple-prism grating oscillator is reported to be extremely stable. The tunable narrow-linewidth emission from these dispersive oscillators is either used directly in spectroscopic or other scientific applications, or is utilized to inject large flashlamp-pumped dye laser amplifiers to obtain multi-joule pulse energies with the laser linewidth characteristics of the oscillator.

Continuous Wave Dye Lasers

CW dye lasers use dye flowing at linear speeds of up to 10 meters per second, which are necessary to

remove the excess heat and to quench the triplet states. In the original cavity reported by Peterson and colleagues, in 1970, a beam from an Ar⁺ laser was focused onto an active region contained within the resonator. The resonator comprised dichroic mirrors that transmit the blue-green radiation of the pump laser and reflect the red emission from the dye molecules. Using a pump power of about 1 W, in a TEM00 laser beam, these authors reported a dye laser output of 30 mW. Subsequent designs replaced the dye cell with a dye jet, introduced external mirrors, and integrated dispersive elements into the cavity. Dispersive elements, such as prisms and gratings, are used to tune the wavelength output of the laser. Frequency-selective elements, such as etalons and other types of interferometers, are used to induce frequency narrowing of the tunable emission. Two typical cw dye laser cavity designs are described by Hollberg, in 1990, and are reproduced here in Figure 5. The first design is a linear three-mirror folded cavity. The second one is an eight-shaped ring dye laser cavity comprised of mirrors M1, M2, M3, and M4. Linear cavities exhibit the effect of spatial hole burning, which allows the cavity to lase in more than one longitudinal mode. This problem can be overcome in ring cavities (Figure 5b), where the laser emission is in the form of a traveling wave. Two aspects of cw dye lasers are worth emphasizing. One is the availability of relatively high powers in single-longitudinal-mode emission and the other is the

Figure 5 CW laser cavities: (a) linear cavity and (b) ring cavity. Adapted from Hollberg LW (1990) CW dye lasers. In: Duarte FJ and Hillman LW (eds) Dye Laser Principles, pp. 185–238. New York: Academic Press. Reproduced with permission from Elsevier.


demonstration of very stable laser oscillation. First, Johnston and colleagues, in 1982, reported 5.6 W of stabilized laser output in a single longitudinal mode at 593 nm at a conversion efficiency of 23%. In this work eleven dyes were used to span the spectrum continuously from ~400 nm to ~900 nm. In the area of laser stabilization and ultra-narrow-linewidth oscillation it is worth mentioning the work of Hough and colleagues, in 1984, who achieved laser linewidths of less than 750 Hz employing an external reference cavity.

Ultrashort-Pulse Dye Lasers

Ultrashort-pulse, or femtosecond, dye lasers use the same type of technology as cw dye lasers, configured to incorporate a saturable absorber region. One such configuration is the ring cavity depicted in Figure 6. In this cavity the gain region is established between mirrors M1 and M2, whilst the saturable absorber is deployed in a counter-propagating arrangement. This arrangement is necessary to establish a collision between two counter-propagating pulses at the saturable absorber, thus yielding what is known as colliding-pulse mode locking (CPM), as reported by Ruddock and Bradley, in 1976. This has the effect of creating a transient grating, due to interference, at the absorber, thus shortening the pulse. Intracavity prisms

were incorporated by Dietel and colleagues, in 1983, in order to introduce negative dispersion, thus subtracting dispersion from the cavity and ultimately providing the compensation needed to produce femtosecond pulses. The shortest pulse obtained from a dye laser, 6 fs, was reported by Fork and colleagues, in 1987, using extra-cavity compression. In that experiment, a dye laser incorporating CPM and prismatic compensation was used to generate pulses that were amplified by a copper-vapor laser at a prf of 8 kHz. The amplified pulses, of 50 fs duration, were then propagated through two grating pairs and a four-prism sequence for further compression.

Solid-State Dye Laser Oscillators

In this section the principles of linewidth narrowing in dispersive resonators are outlined. Although the discussion focuses on multiple-prism grating solid-state dye laser oscillators in particular, the physics is applicable to pulsed high-power dispersive tunable lasers in general.

Multiple-Prism Dispersion Grating Theory

The spectral linewidth in a dispersive optical system is given by

$$\Delta\lambda \approx \Delta\theta\,(\nabla_\lambda\theta)^{-1} \qquad [1]$$

where $\Delta\theta$ is the light beam divergence, $\nabla_\lambda = \partial/\partial\lambda$, and $\nabla_\lambda\theta$ is the overall dispersion of the optics. This identity can be derived either from the principles of geometrical optics or from the principles of generalized interferometry, as described by Duarte, in 1992. The cumulative single-pass generalized multiple-prism dispersion at the mth prism of a multiple-prism array, as illustrated in Figure 7, was given by Duarte and Piper, in 1982:

$$\nabla_\lambda\phi_{2,m} = H_{2,m}\,\nabla_\lambda n_m + (k_{1,m}k_{2,m})^{-1}\left[H_{1,m}\,\nabla_\lambda n_m \pm \nabla_\lambda\phi_{2,(m-1)}\right] \qquad [2]$$

In this equation

Figure 6 Femtosecond dye laser cavities: (a) linear femtosecond cavity and (b) ring femtosecond cavity. Adapted from Diels J-C (1990) Femtosecond dye lasers. In: Duarte FJ and Hillman LW (eds) Dye Laser Principles, pp. 41–132. New York: Academic Press. Reproduced with permission from Elsevier.

$$k_{1,m} = \cos\psi_{1,m}/\cos\phi_{1,m} \qquad [3a]$$

$$k_{2,m} = \cos\phi_{2,m}/\cos\psi_{2,m} \qquad [3b]$$

$$H_{1,m} = \tan\phi_{1,m}/n_m \qquad [4a]$$

$$H_{2,m} = \tan\phi_{2,m}/n_m \qquad [4b]$$


Figure 7 Generalized multiple-prism arrays in (a) additive and (b) compensating configurations. From Duarte FJ (1990) Narrow-linewidth pulsed dye laser oscillators. In: Duarte FJ and Hillman LW (eds) Dye Laser Principles, pp. 133–183. New York: Academic Press. Reproduced with permission from Elsevier.

Here, $k_{1,m}$ and $k_{2,m}$ represent the physical beam expansion experienced by the incident and the exit beams, respectively. Equation [2] indicates that $\nabla_\lambda\phi_{2,m}$, the cumulative dispersion at the mth prism, is a function of the geometry of the mth prism, the position of the light beam relative to this prism, the refractive index of the prism, and the cumulative dispersion up to the previous prism, $\nabla_\lambda\phi_{2,(m-1)}$. For an array of r identical isosceles, or equilateral, prisms arranged symmetrically in an additive configuration, so that the angles of incidence and emergence are the same, the cumulative dispersion reduces to

$$\nabla_\lambda\phi_{2,r} = r\,\nabla_\lambda\phi_{2,1} \qquad [5]$$

Under these circumstances the dispersions add in a simple and straightforward manner. For configurations incorporating right-angle prisms, the dispersions need to be handled mathematically in a more subtle form. The generalized double-pass, or return-pass, dispersion for multiple-prism beam expanders was introduced by Duarte, in 1985:

$$\nabla_\lambda\Phi_P = 2M_1M_2\sum_{m=1}^{r}(\pm 1)H_{1,m}\left(\prod_{j=m}^{r}k_{1,j}\prod_{j=m}^{r}k_{2,j}\right)^{-1}\nabla_\lambda n_m \;+\; 2\sum_{m=1}^{r}(\pm 1)H_{2,m}\left(\prod_{j=1}^{m}k_{1,j}\prod_{j=1}^{m}k_{2,j}\right)\nabla_\lambda n_m \qquad [6]$$

Here, $M_1$ and $M_2$ are the beam magnification factors given by

$$M_1 = \prod_{m=1}^{r}k_{1,m} \qquad [7a]$$

$$M_2 = \prod_{m=1}^{r}k_{2,m} \qquad [7b]$$

For a multiple-prism expander designed for an orthogonal beam exit and Brewster's angle of incidence, eqn [6] reduces to the succinct expression given by Duarte, in 1990:

$$\nabla_\lambda\Phi_P = 2\sum_{m=1}^{r}(\pm 1)(n_m)^{m-1}\,\nabla_\lambda n_m \qquad [8]$$

Equation [6] can be used either to quantify the overall dispersion of a given multiple-prism beam expander or to design a prismatic expander yielding zero dispersion, that is, $\nabla_\lambda\Phi_P = 0$, at a given wavelength.
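As an illustration of how eqns [7a] and [8] are used in practice, the following sketch evaluates the beam magnification and return-pass dispersion of an additive array of r identical prisms at Brewster-angle incidence with orthogonal beam exit (so that $k_{1,m} = n$ and $k_{2,m} = 1$ for every prism). The refractive-index values are assumed, fused-silica-like numbers, not data from this article:

```python
# Sketch of eqns [7a] and [8] for an additive multiple-prism beam expander
# made of r identical Brewster-incidence prisms with orthogonal beam exit.
N = 1.458        # refractive index (assumed, fused-silica-like)
DN_DL = -3.5e4   # dn/dlambda in m^-1 (assumed order of magnitude)

def magnification(r: int) -> float:
    """Eqn [7a]: M1 is the product of the k1 factors, here n^r."""
    return N ** r

def return_pass_dispersion(r: int) -> float:
    """Eqn [8] with all (+1) signs: 2 * sum over m of n^(m-1) * dn/dlambda."""
    return 2.0 * sum(N ** (m - 1) for m in range(1, r + 1)) * DN_DL

for r in (1, 2, 4):
    print(f"r = {r}: M = {magnification(r):5.2f}, "
          f"grad Phi_P = {return_pass_dispersion(r):.3e} rad/m")
```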

Physics and Architecture of Solid-State Dye-Laser Oscillators

The first high-performance narrow-linewidth tunable laser was introduced by Hänsch in 1972. This laser yielded a linewidth of Δν ≈ 2.5 GHz (or Δλ ≈ 0.003 nm at λ ≈ 600 nm) in the absence of an intracavity etalon. Hänsch demonstrated that the laser linewidth from a tunable laser was narrowed significantly when the beam incident on the tuning grating was expanded using an astronomical


telescope. The linewidth equation including the intracavity beam magnification factor can be written as

$$\Delta\lambda \approx \Delta\theta\,(M\,\nabla_\lambda\Theta_G)^{-1} \qquad [9]$$

From this equation it can be deduced that a narrow Δλ is achieved by reducing Δθ and increasing the overall intracavity dispersion $(M\,\nabla_\lambda\Theta_G)$. The intracavity dispersion is optimized by expanding the size of the intracavity beam incident on the diffractive surface of the tuning grating until the grating is totally illuminated. Hänsch used a two-dimensional astronomical telescope to expand the intracavity beam incident on the diffraction grating. A simpler beam expansion method consists of the use of a single-prism beam expander, as disclosed by several authors (Myers, 1971; Stokes and colleagues, 1972; Hanna and colleagues, 1975). An extension and improvement of this approach was the introduction of multiple-prism beam expanders, as reported by Kasuya and colleagues, in 1978, Klauminzer, in 1978, and Duarte and Piper, in 1980. The main advantages of multiple-prism beam expanders over two-dimensional telescopes are simplicity, compactness, and the fact that the beam expansion is reduced from two dimensions to one dimension. Physically, as explained previously, prismatic beam expanders also introduce a dispersion component that is absent in the case of the astronomical telescope. Advantages of multiple-prism beam expanders over single-prism beam expansion are higher transmission efficiency, lower amplified spontaneous emission levels, and the flexibility to either augment or reduce the prismatic dispersion. In general, for a pulsed multiple-prism grating oscillator, Duarte and Piper, in 1984, showed that the return-pass dispersive linewidth is given by

$$\Delta\lambda = \Delta\theta_R\left(MR\,\nabla_\lambda\Theta_G + R\,\nabla_\lambda\Phi_P\right)^{-1} \qquad [10]$$

where R is the number of return-cavity passes. The grating dispersion in this equation, $\nabla_\lambda\Theta_G$, can be either from a grating in Littrow or in near grazing-incidence configuration. The multiple-return-pass equation for the beam divergence was given by Duarte, in 2001:

$$\Delta\theta_R = \frac{\lambda}{\pi w}\left[1 + \left(\frac{L_R}{B_R}\right)^2 + \left(\frac{A_R L_R}{B_R}\right)^2\right]^{1/2} \qquad [11]$$

Here, $L_R = \pi w^2/\lambda$ is the Rayleigh length of the cavity, $w$ is the beam waist at the gain region, while $A_R$ and $B_R$ are the corresponding multiple-return-pass elements derived from propagation matrices.

Figure 8 MPL grating solid-state dye laser oscillator: optimized architecture. The physical dimensions of the optical components of this cavity are shown to scale. The length of the solid-state dye gain medium, along the optical axis, is 10 mm. Reproduced with permission from Duarte FJ (1999) Multiple-prism grating solid-state dye laser oscillator: optimized architecture. Applied Optics 38: 6347–6349.
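A numerical sketch of eqns [10] and [11] follows. Every cavity parameter below (beam waist, matrix elements, magnification, grating dispersion, number of return passes) is an assumed, merely plausible value, so the resulting linewidth is illustrative rather than a figure from this article; optimized oscillators reach narrower linewidths than this sketch suggests:

```python
# Illustrative evaluation of eqns [10] and [11]; all parameters assumed.
import math

LAM = 590e-9                  # m, wavelength
W = 100e-6                    # m, beam waist at the gain region (assumed)
L_R = math.pi * W ** 2 / LAM  # m, Rayleigh length of the cavity
A_R, B_R = 1.0, 0.10          # multiple-return-pass matrix elements (assumed)
M, R = 100.0, 3               # beam magnification and return passes (assumed)
GRAD_THETA_G = 2.0e6          # rad/m, grating dispersion (assumed)
GRAD_PHI_P = 0.0              # prisms set for zero dispersion

# Eqn [11]: multiple-return-pass beam divergence
dtheta_r = (LAM / (math.pi * W)) * math.sqrt(
    1 + (L_R / B_R) ** 2 + (A_R * L_R / B_R) ** 2)

# Eqn [10]: return-pass dispersive linewidth
dlambda = dtheta_r / (M * R * GRAD_THETA_G + R * GRAD_PHI_P)
dnu = 2.998e8 * dlambda / LAM ** 2
print(f"dtheta_R = {dtheta_r * 1e3:.2f} mrad, "
      f"dlambda = {dlambda * 1e12:.1f} pm (dnu = {dnu / 1e9:.1f} GHz)")
```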

Figure 9 HMPGI grating solid-state dye laser. Schematics to scale. The length of the solid-state dye gain medium, along the optical axis, is 10 mm. Reproduced with permission from Duarte FJ (1997) Multiple-prism near-grazing-incidence grating solid-state dye laser oscillator. Optics and Laser Technology 29: 513–516.


At present, very compact and optimized multiple-prism grating tunable laser oscillators are found in two basic cavity architectures. These are the multiple-prism Littrow (MPL) grating laser oscillator, reported by Duarte in 1999 and illustrated in Figure 8, and the hybrid multiple-prism near-grazing-incidence (HMPGI) grating laser oscillator, introduced by Duarte in 1997 and depicted in Figure 9. In early MPL grating oscillators the individual prisms integrating the multiple-prism expander were deployed in an additive configuration, thus adding the cumulative prismatic dispersion to that of the grating and contributing to the overall dispersion of the cavity. In subsequent architectures the prisms were deployed in compensating configurations so as to yield zero dispersion, thus allowing the tuning characteristics of the cavity to be determined by the grating exclusively. In this approach the principal role of the multiple-prism array is to expand the beam incident on the grating, thus augmenting significantly the overall dispersion of the cavity, as described in eqns [9] or [10]. In this regard, it should be mentioned that beam magnification factors of up to 100, and beyond, have been reported in the literature. Using solid-state laser dye gain media, these MPL and HMPGI grating laser oscillators deliver tunable single-longitudinal-mode emission at laser linewidths 350 MHz ≤ Δν ≤ 375 MHz and pulse lengths in the 3–7 ns (FWHM) range. Long-pulse operation of this class of multiple-prism grating oscillators has been reported by Duarte and colleagues, in 1998. In these experiments laser linewidths of Δν ≈ 650 MHz were achieved at pulse lengths in excess of 100 ns (FWHM) under flashlamp-pumped dye laser excitation. The dispersive cavity architectures described here have been used with a variety of laser gain media in the gas, the liquid, and the solid state. Applications to tunable semiconductor lasers have also been reported by Zorabedian, in 1992, Duarte, in 1993, and Fox and colleagues, in 1997. Concepts important to MPL and HMPGI grating tunable laser oscillators include the emission of a single-transverse-mode (TEM00) laser beam in a compact cavity, the use of multiple-prism arrays, the expansion of the intracavity beam incident on the grating, the control of the intracavity dispersion, and the quantification of the overall dispersion of the multiple-prism grating assembly via generalized dispersion equations. Sufficiently high intracavity dispersion leads to the achievement of return-pass dispersive linewidths close to the free spectral range of the cavity. Under these circumstances single-longitudinal-mode lasing is readily achieved as a result of multipass effects.

A Note on the Cavity Linewidth Equation

So far we have described how the cavity linewidth equation $\Delta\lambda \approx \Delta\theta(\nabla_\lambda\theta)^{-1}$ and the dispersion equations can be applied to achieve highly coherent, or very narrow-linewidth, emission. It should be noted that the same physics can be applied to achieve ultrashort, or femtosecond, laser pulses. It turns out that intracavity prisms can be configured to yield negative dispersion, as described by Duarte and Piper, in 1982, Dietel and colleagues, in 1983, and Fork and colleagues, in 1984. This negative dispersion can reduce significantly the overall dispersion of the cavity. Under those

circumstances, eqn [1] predicts broadband emission which, according to the uncertainty principle in the form of

$$\Delta\nu\,\Delta t \approx 1 \qquad [12]$$

can lead to very short pulse emission, since $\Delta\nu$ and $\Delta\lambda$ are related by the identities

$$\Delta\lambda \approx \lambda^{2}/\Delta x \qquad [13]$$

and

$$\Delta\nu \approx c/\Delta x \qquad [14]$$
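A short numerical companion to eqns [12]–[14]: for a transform-limited pulse the bandwidth follows directly from the pulse duration. The 620 nm carrier wavelength below is an assumed value, typical of CPM dye lasers, used only for illustration:

```python
# Time-bandwidth sketch for eqns [12]-[14]: dnu ~ 1/dt for a
# transform-limited pulse, converted to a wavelength bandwidth.
C = 2.998e8  # m/s

for dt in (10e-9, 50e-15, 6e-15):
    dnu = 1.0 / dt                 # eqn [12]
    dlam = 620e-9 ** 2 * dnu / C   # eqns [13] and [14] combined
    print(f"dt = {dt:7.1e} s -> dnu = {dnu:8.2e} Hz, "
          f"dlambda = {dlam * 1e9:8.4f} nm at 620 nm")
```

The 6 fs case implies a bandwidth of well over a hundred nanometers, which is why only broadband gain media such as dyes could support such pulses.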

Distributed-Feedback Solid-State Dye Lasers

Recently, narrow-linewidth laser emission from solid-state dye lasers has also been obtained using a distributed-feedback (DFB) configuration by Wadsworth and colleagues, in 1999, and Zhu and colleagues, in 2000. As is well known, in a DFB laser no external cavity is required, the feedback being provided by Bragg reflection from a permanently or dynamically written grating structure within the gain medium, as reported by Kogelnik and Shank, in 1971. By using a DFB laser configuration in which the interference of two pump beams induces the required periodic modulation in the gain medium, laser emission with linewidths in the 0.01–0.06 nm range has been reported. Specifically, Wadsworth and colleagues reported a laser linewidth of 12 GHz (0.016 nm at 616 nm) using Perylene Red doped PMMA at a conversion efficiency of 20%. It should also be mentioned that the DFB laser is a dye laser development that has found wide and extensive applicability in the semiconductor lasers employed in the telecommunications industry.
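For orientation, the first-order Bragg condition λ = 2 n_eff Λ can be inverted to estimate the grating period such a structure would need for the 616 nm emission cited above; the PMMA effective index used here is an assumed value:

```python
# First-order Bragg condition for a DFB laser, lambda = 2 * n_eff * Lambda,
# inverted to estimate the required grating period. n_eff ~ 1.49 (PMMA)
# is an assumed value, not a parameter from the article.
def grating_period_m(lase_wavelength_m: float, n_eff: float) -> float:
    """Grating period Lambda satisfying lambda = 2 * n_eff * Lambda."""
    return lase_wavelength_m / (2.0 * n_eff)

print(f"Period for 616 nm at n_eff = 1.49: "
      f"{grating_period_m(616e-9, 1.49) * 1e9:.0f} nm")
```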

Laser Dye Survey

Cyanines

These are red and near-infrared dyes which present long conjugated methine chains (–CH=CH–) and are useful in the spectral range longer than 800 nm, where no other dyes compete with them. An important laser dye belonging to this class is 4-dicyanomethylene-2-methyl-6-(p-dimethylaminostyryl)-4H-pyran (DCM, Figure 10a), with laser emission in the 600–700 nm range depending on the pump and the solvent, liquid or solid, used.

Xanthenes

These dyes have a xanthene ring as the chromophore and are classified into rhodamines, incorporating


Figure 10 Molecular structures of some common laser dyes: (a) DCM; (b) Rh6G; (c) PM567; (d) Perylene Orange (R = H) and Perylene Red (R = C4H6O); (e) Coumarin 307 (also known as Coumarin 503); (f) Coumarin 153 (also known as Coumarin 540A).

amino radical substituents, and fluoresceins, with hydroxyl (OH) radical substituents. They are generally very efficient and chemically stable, and their emission covers the wavelength region from 500 to 700 nm. Rhodamine dyes are the most important group of all laser materials, with rhodamine 6G (Rh6G, Figure 10b), also called rhodamine 590 chloride, being probably the best known of all laser dyes. Rh6G exhibits an absorption peak, in ethanol, at 530 nm and a fluorescence peak at 556 nm, with laser emission typically in the 550–620 nm region. It has been demonstrated to lase efficiently both in liquid and in solid solutions.

Pyrromethenes

Pyrromethene.BF2 complexes are a newer class of laser dyes, synthesized and characterized more recently, as reported by Shah and colleagues, in 1990, Pavlopoulos and colleagues, in 1990, and Boyer and colleagues, in 1993. These laser dyes exhibit reduced triplet–triplet absorption over their fluorescence and lasing spectral region while retaining a high fluorescence quantum yield. Depending on the substituents on the chromophore, these dyes present laser emission over the spectral region from the green/yellow to the red, competing with the rhodamine dyes, and have been demonstrated to lase with good performance when incorporated into solid hosts. Unfortunately, they are relatively unstable because of the aromatic amine groups in their structure, which render them

vulnerable to photochemical reactions with oxygen, as indicated by Rahn and colleagues, in 1997. The molecular structure of a representative and well-known dye of this class, the 1,3,5,7,8-pentamethyl-2,6-diethylpyrromethene-difluoroborate complex (pyrromethene 567, PM567), is shown in Figure 10c. Recently, analogs of dye PM567 substituted at position 8 with an acetoxypolymethylene linear chain or a polymerizable methacryloyloxy polymethylene chain have been developed. These new dipyrromethene.BF2 complexes have demonstrated improved efficiency and better photostability than the parent compound, both in the liquid and in the solid state.

Conjugated Hydrocarbons

This class of organic compounds includes perylenes, stilbenes, and p-terphenyl. Perylene and perylimide dyes are large, nonionic, nonpolar molecules (Figure 10d) characterized by their extreme photostability and negligible singlet–triplet transfer, as discussed by Seybold and Wagenblast and colleagues, in 1989, and by a high quantum efficiency resulting from the absence of nonradiative relaxation. The perylene dyes exhibit limited solubility in conventional solvents but dissolve well in acetone, ethyl acetate, and methyl methacrylate, with emission wavelengths in the orange and red spectral regions. Stilbene dyes are derivatives of acyclic unsaturated hydrocarbons such as ethylene and butadiene, with emission wavelengths in the 400–500 nm range. Most of them terminate in phenyl radicals. Although they


are chemically stable, their laser performance is inferior to that of the coumarins. p-Terphenyl is a valuable and efficient p-oligophenylene dye with emission in the ultraviolet.

Coumarins

Coumarin derivatives are a popular family of laser dyes with emission in the blue-green region of the spectrum. Their structure is based on the coumarin ring (Figure 10e,f) with different substituents that strongly affect its chemical characteristics and allow coverage of an emission spectral range between 420 and 580 nm. Some members of this class rank among the most efficient laser dyes known, but their photobleaching is rapid compared with the xanthene and pyrromethene dyes. An additional class of blue-green dyes is the tetramethyl derivatives of coumarin dyes, introduced by Chen and colleagues, in 1988. The emission from these dyes spans the spectrum in the 453–588 nm region. These coumarin analogs exhibit improved efficiency, higher solubility, and better lifetime characteristics than the parent compounds.

Azaquinolone Derivatives

The quinolone and azaquinolone derivatives have a structure similar to that of the coumarins, but with their laser emission range extended toward the blue by 20–30 nm. The quinolone dyes are used when laser emission in the 400–430 nm region is required. Azaquinolone derivatives, such as 7-dimethylamino-1-methyl-4-methoxy-8-azaquinolone-2 (LD 390), exhibit laser action below 400 nm.

Oxadiazole Derivatives

These are compounds which incorporate an oxadiazole ring with aryl radical substituents and which lase in the 330–450 nm spectral region. The dye 2-(4-biphenyl)-5-(4-t-butylphenyl)-1,3,4-oxadiazole (PBD), with laser emission in the 355–390 nm region, belongs to this family.

Solid-State Laser Dye Matrices

Although the first attempts to incorporate dye molecules into solid matrices were made in the early days of dye laser development, it was not until the late 1980s that host materials with the required optical quality and high damage threshold to laser radiation began to be developed. In subsequent years, the synthesis of new high-performance dyes and the implementation of new ways of incorporating the organic molecules into the solid matrix resulted in significant advances towards the development of practical tunable

solid-state dye lasers. A recent detailed review of the work done in this field has been given by Costela et al. Inorganic glasses, transparent polymers, and inorganic–organic hybrid matrices have been successfully used as host matrices for laser dyes. Inorganic glasses offer good thermal and optical properties but present the difficulty that the high melting temperature employed in the traditional process of glass making would destroy the organic dye molecules. It was not until the development of the low-temperature sol-gel technique for the synthesis of glasses that this limitation was overcome. The sol-gel process, based on inorganic polymerization reactions performed at about room temperature and starting from metallo-organic compounds such as alkoxides or salts, as reported by Brinker and Scherer, in 1989, provides a safe route for the preparation of rigid, transparent, inorganic matrix materials incorporating laser dyes. Disadvantages of these materials are the possible presence of impurities embedded in the matrix and the occurrence of interactions between the dispersed dye molecules and the inorganic structure (hydrogen bonds, van der Waals forces) which, in certain cases, can have a deleterious effect on the lasing characteristics of the material. The use of matrices based on polymeric materials to incorporate organic dyes offers some technical and economic advantages. Transparent polymers exhibit high optical homogeneity, which is extremely important for narrow-linewidth oscillators, as explained by Duarte, in 1994, and good chemical compatibility with organic dyes, and they allow control over structure and chemical composition, making possible the modification, in a controlled way, of relevant properties of these materials such as polarity, free volume, molecular weight, or viscoelasticity. Furthermore, polymeric materials are amenable to inexpensive fabrication techniques, which facilitates miniaturization and the design of integrated optical systems. The dye molecules can be either dissolved in the polymer or linked covalently to the polymeric chains. In the early investigations on the use of solid polymer matrices for laser dyes, the main problem to be solved was the low resistance to laser radiation exhibited by the then-existing materials. In the late 1980s and early 1990s new modified polymeric organic materials began to be developed, with a laser-radiation damage threshold comparable to that of inorganic glasses and crystals, as reported by Gromov and colleagues, in 1985, and Dyumaev and colleagues, in 1992. This, together with the above-mentioned advantages, made the use of polymers in solid-state dye lasers both attractive and competitive.


An approach that aims to produce materials in which the advantages of both inorganic glasses and polymers are preserved, while their difficulties are avoided, is the use of inorganic–organic hybrid matrices. These are silicate-based materials, with an inorganic Si–O–Si backbone, prepared from organosilane precursors by sol-gel processing in combination with organic cross-linking of polymerizable monomers, as reported by Novak, in 1993, Sanchez and Ribot, in 1994, and Schubert, in 1995. In one procedure, laser dyes are mixed with organic monomers, which are then incorporated into the porous structure of a sol-gel inorganic matrix by immersing the bulk in the solution containing monomer and catalyst or photoinitiator, as indicated by Reisfeld and Jorgensen, in 1991, and Bosch and colleagues, in 1996. Alternatively, hybrids can be obtained from organically modified silicon alkoxides, producing the so-called ORMOCERS (organically modified ceramics) or ORMOSILS (organically modified silanes), as reported by Reisfeld and Jorgensen, in 1991. An inconvenient aspect of these materials is the appearance of optical inhomogeneities in the medium due to the difference in refractive index between the organic and inorganic phases, as well as to the difference in density between monomer and polymer, which causes stresses and the formation of optical defects. As a result, spatial inhomogeneities appear in the laser beam, thereby decreasing its quality, as discussed by Duarte, in 1994.

Organic Hosts

The first polymeric materials with improved resistance to damage by laser radiation were obtained by Russian workers, who incorporated rhodamine dyes into modified poly(methyl methacrylate) (MPMMA), obtained by doping PMMA with low-molecular-weight additives. Some years later, in 1995, Maslyukov and colleagues demonstrated lasing efficiencies in the range 40–60% with matrices of MPMMA doped with rhodamine dyes pumped longitudinally at 532 nm, with a useful lifetime (measured as a 50% efficiency drop) of 15 000 pulses at a prf of 3.33 Hz. In 1995, Costela and colleagues used an approach based on adjusting the viscoelastic properties of the material by modifying the internal plasticization of the polymeric medium by copolymerization with appropriate monomers. Using dye Rh6G dissolved in a copolymer of 2-hydroxyethyl methacrylate (HEMA) and methyl methacrylate (MMA), and transversal pumping at 337 nm, they demonstrated laser action with an efficiency of 21% and a useful lifetime of 4500 pulses (20 GJ/mol in terms of total

input energy per mole of dye molecules when the output energy is down to 50% of its initial value). The useful lifetime increased to 12 000 pulses when the Rh6G chromophore was linked covalently to the polymeric chains, as reported by Costela and colleagues, in 1996. Comparative studies on the laser performance of Rh6G incorporated either in copolymers of HEMA and MMA or in MPMMA were carried out by Giffin and colleagues, in 1999, demonstrating higher efficiency of the MPMMA materials but superior normalized photostability (up to 240 GJ/mol) of the copolymer formulation. Laser conversion efficiencies in the range 60–70% for longitudinal pumping at 532 nm were reported by Ahmad and colleagues, in 1999, with PM567 dissolved in PMMA modified with 1,4-diazabicyclo[2.2.2]octane (DABCO) or perylene additives. When using DABCO as an additive, a useful lifetime of up to 550 000 pulses, corresponding to a normalized photostability of 270 GJ/mol, was demonstrated at a 2 Hz prf. Much lower efficiencies and photostabilities have been obtained with dyes emitting in the blue-green spectral region. Costela and colleagues, in 1996, performed detailed studies on the dyes coumarin 540A and coumarin 503 incorporated into methacrylate homopolymers and copolymers, and demonstrated efficiencies of at most 19% with useful lifetimes of up to 1200 pulses, at a 2 Hz prf, for transversal pumping at 337 nm. In 2000, Somasundaram and Ramalingam obtained useful lifetimes of 5240 and 1120 pulses, respectively, for coumarin 1 and coumarin 490 dyes incorporated into PMMA rods modified with ethyl alcohol, under transversal pumping at 337 nm and a prf of 1 Hz. A detailed study of the photophysical parameters of R6G-doped HEMA-MMA gain media was performed by Holzer and colleagues, in 2000. Among the parameters determined by these authors are quantum yields, lifetimes, absorption cross-sections, and emission cross-sections. A similar study on the photophysical parameters of PM567 and two PM567 analogs incorporated into solid matrices of different acrylic copolymers, and in corresponding mimetic liquid solutions, was performed by Bergmann and colleagues, in 2001. Preliminary experiments with rhodamine 6G-doped MPMMA at high prfs were conducted using a frequency-doubled Nd:YAG laser at 5 kHz by Duarte and colleagues, in 1996. In those experiments, involving longitudinal excitation, the pump laser was allowed to fuse a region at the incidence window of the gain medium so that a lens was formed. This lens and the exit window of the gain medium comprised a


short unstable resonator that yielded broadband emission. In 2001, Costela and colleagues demonstrated lasing in rhodamine 6G- and pyrromethene-doped polymer matrices under copper-vapor-laser excitation at prfs in the 1.0–6.2 kHz range. In these experiments, the dye-doped solid-state matrix was rotated at 1200 rpm. Average powers of 290 mW were obtained at a prf of 1 kHz and a conversion efficiency of 37%. For a short period of time average powers of up to 1 W were recorded at a prf of 6.2 kHz. In subsequent experiments by Abedin and colleagues the prf was increased to 10 kHz by using as pump laser a diode-pumped, Q-switched, frequency-doubled, solid-state Nd:YLF laser. Initial average powers of 560 mW and 430 mW were obtained for R6G-doped and PM567-doped polymer matrices, respectively. Lasing efficiency was 16% for R6G and the useful lifetime was 6.6 min (or about 4.0 million shots).
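The normalized photostability quoted throughout this section can be reproduced with a one-line calculation. The pulse energy, dye concentration, and pumped volume below are hypothetical values, chosen only to land near the 20 GJ/mol scale mentioned above for the HEMA:MMA result:

```python
# Normalized photostability as defined in the text: total pump energy
# delivered per mole of dye in the pumped volume by the time the output
# has fallen to 50% of its initial value. All input numbers hypothetical.
def photostability_gj_per_mol(n_pulses: int, pulse_energy_j: float,
                              conc_mol_per_l: float,
                              pumped_volume_l: float) -> float:
    moles_of_dye = conc_mol_per_l * pumped_volume_l
    return n_pulses * pulse_energy_j / moles_of_dye / 1e9

# 4500 pulses of 4.4 mJ into a 1e-4 M dye in a 10 microliter pumped volume:
print(f"{photostability_gj_per_mol(4500, 4.4e-3, 1e-4, 10e-6):.1f} GJ/mol")
```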

Inorganic and Hybrid Hosts

In 1990, Knobbe and colleagues and McKiernan and colleagues, of the University of California at Los Angeles, incorporated different organic dyes, via sol-gel techniques, into silicate (SiO2), aluminosilicate (Al2O3–SiO2), and ORMOSIL host matrices, and investigated the lasing performance of the resulting materials under transversal pumping. Lasing efficiencies of 25% and useful lifetimes of 2700 pulses at a 1 Hz repetition rate were obtained from Rh6G incorporated into aluminosilicate gel and ORMOSIL matrices, respectively. When the dye in the ORMOSIL matrices was coumarin 153, the useful lifetime was still 2250 pulses. Further studies by Altman and colleagues, in 1991, demonstrated useful lifetimes of 11 000 pulses at a 30 Hz prf when rhodamine 6G perchlorate was incorporated into ORMOSIL matrices. The useful lifetime increased to 14 500 pulses when the dye used was rhodamine B. When incorporated into xerogel matrices, pyrromethene dyes lase with efficiencies as high as the highest obtained in organic hosts, but with much higher photostability: a useful lifetime of 500 000 pulses at a 20 Hz repetition rate was obtained by Faloss and colleagues, in 1997, with PM597 using a pump energy of 1 mJ. The efficiency of PM597 dropped to 40%, but its useful lifetime increased to over 1 000 000 pulses when oxygen-free samples were prepared, in agreement with previous results that documented the strong dependence of the laser parameters of pyrromethene dyes on the presence of oxygen. A similar effect was observed with perylene dyes: deoxygenated xerogel samples of Perylene Red exhibited a useful lifetime of 250 000 pulses at a prf

of 5 Hz and 1 mJ pump energy, to be compared with a useful lifetime of only 10 000 pulses obtained with samples prepared in normal conditions. Perylene Orange incorporated into polycom glass and pumped longitudinally at 532 nm gave an efficiency of 72% and a useful lifetime of 40 000 pulses at a prf of 1 Hz, as reported by Rahn and King, in 1998. A direct comparison study of the laser performance of Rh6G, PM567, Perylene Red, and Perylene Orange in organic, inorganic, and hybrid hosts was carried out by Rahn and King in 1995. They found that the nonpolar perylene dyes had better performance in partially organic hosts, whereas the ionic rhodamine and pyrromethene dyes performed best in the inorganic sol-gel glass host. The most promising combinations of dye and host for efficiency and photostability were found to be Perylene Orange in polycom glass and Rh6G in sol-gel glass. An all-solid-state optical configuration, where the pump was a laser diode array side-pumped, Q-switched, frequency-doubled Nd:YAG slab laser, was demonstrated by Hu and colleagues, in 1998, with the dye DCM incorporated into ORMOSIL matrices. A lasing efficiency of 18% at a wavelength of 621 nm was obtained, and 27 000 pulses were needed for the output energy to decrease by 10% of its initial value at a 30 Hz repetition rate. After 50 000 pulses the output energy decreased by 90%, but it could be recovered after waiting for a few minutes without pumping. Violet and ultraviolet laser dyes have also been incorporated into sol-gel silica matrices and their lasing properties evaluated under transversal pumping by a number of authors. Although reasonable efficiencies have been obtained with some of these laser dyes, the useful lifetimes are well below 1000 pulses. More recently, Costela and colleagues have demonstrated improved conversion efficiencies in dye-doped inorganic–organic matrices using pyrromethene 567 dye. The solid matrix in this case is poly(trimethylsilyl methacrylate) cross-linked with ethylene glycol and copolymerized with methyl methacrylate. Also, Duarte and James have reported on dye-doped polymer-nanoparticle laser media where the polymer is PMMA and the nanoparticles are made of silica. These authors report TEM00 laser beam emission and improved dn/dT coefficients of the gain media, which result in reduced beam divergence Δθ.
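The normalized photostability metric used throughout this section follows directly from its definition: accumulated pump energy delivered per mole of dye molecules when the output has fallen to 50% of its initial value. The short Python sketch below illustrates the arithmetic; the pulse energy, dye concentration, and pumped volume are illustrative assumptions, not values taken from the studies cited above.

```python
# Normalized photostability: accumulated pump energy per mole of dye
# in the pumped volume when laser output has fallen to 50% of its
# initial value. All inputs are illustrative assumptions, not data
# from the studies cited in the text.
pulse_energy = 1.0e-3      # J per pump pulse (assumed)
useful_lifetime = 550_000  # pulses to the 50%-output point (assumed)
concentration = 5.0e-4     # mol/L of dye in the matrix (assumed)
pumped_volume = 4.0e-6     # L actually illuminated by the pump (assumed)

total_pump_energy = pulse_energy * useful_lifetime   # J delivered
moles_of_dye = concentration * pumped_volume         # mol of dye pumped
photostability = total_pump_energy / moles_of_dye    # J/mol

print(f"normalized photostability = {photostability / 1e9:.0f} GJ/mol")
# ~275 GJ/mol with these inputs, the same order as the values quoted above.
```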

Dye Laser Applications

Dye lasers generated a renaissance in diverse applied fields such as isotope separation, medicine,


photochemistry, remote sensing, and spectroscopy. Dye lasers have also been used in many experiments that have advanced the frontiers of fundamental physics. Central to most applications of dye lasers is their capability of providing coherent narrow-linewidth radiation that can be tuned continuously from the near ultraviolet, throughout the visible, to the near infrared. These unique coherent and spectral properties have made dye lasers particularly useful in the field of high-resolution atomic and molecular spectroscopy. Dye lasers have been successfully and extensively used in absorption and fluorescence spectroscopy, Raman spectroscopy, selective excitations, and nonlinear spectroscopic techniques. Well documented is their use in photochemistry, that is, the chemistry of optically excited atoms and molecules, and in biology, in studying the biochemical reaction kinetics of biological molecules. By using nonlinear optical techniques, such as harmonic, sum frequency, and difference frequency generation, the properties of dye lasers can be extended to the ultraviolet. Medical applications of dye lasers include cancer photodynamic therapy, treatment of vascular lesions, and lithotripsy, as described by Goldman in 1990. The ability of dye lasers to provide tunable sub-GHz linewidths in the orange-red portion of the spectrum, at kW average powers, made them particularly suited to atomic vapor laser isotope separation of uranium, as reported by Bass and colleagues, in 1992, Singh and colleagues, in 1994, and Nemoto and colleagues, in 1995. In this particular case the isotopic shift between the two uranium isotopes is a few GHz. Using dye lasers yielding Δν ≤ 1 GHz, the ²³⁵U can be selectively excited in a multistep process by tuning the dye laser(s) to specific wavelengths, in the orange-red spectral region, compatible with a transition sequence leading to photoionization. Also, high-prf narrow-linewidth dye lasers, as described by Duarte and Piper, in 1984, are ideally suited for guide star applications in astronomy. This particular application requires a high-average-power diffraction-limited laser beam at λ ≈ 589 nm to propagate some 95 km above the surface of the Earth and illuminate a layer of sodium atoms. Ground detection of the fluorescence from the sodium atoms allows measurements of the turbulence of the atmosphere. These measurements are then used to control the adaptive optics of the telescope, thus compensating for the atmospheric distortions. A range of industrial applications is also amenable to the wavelength agility and high-average-power output characteristics of dye lasers. Furthermore,

tunability coupled with narrow-linewidth emission, at large pulsed energies, makes dye lasers useful in military applications such as directed energy and damage of optical sensors.

The Future of Dye Lasers

Predicting the future is rather risky. In the early 1990s, articles and numerous advertisements in commercial laser magazines predicted the demise and even the oblivion of the dye laser in a few years. It did not turn out this way. The high power and wavelength agility available from dye lasers have ensured their continued use as scientific and research tools in physics, chemistry, and medicine. In addition, the advent of narrow-linewidth solid-state dye lasers has sustained a healthy level of research and development in the field. Today, some 20 laboratories around the world are engaged in this activity. The future of solid-state dye lasers depends largely on synthetic and manufacturing improvements. There is a need for strict control of the conditions of fabrication and a careful purification of all the compounds involved. In the case of polymers, in particular, stringent control of thermal conditions during the polymerization step is obligatory to ensure that adequate optical uniformity of the polymer matrix is achieved, and that intrinsic anisotropy developed during polymerization is minimized. From an engineering perspective, what are needed are dye-doped solid-state media with improved photostability characteristics and better thermal properties. By better thermal properties is meant better dn/dT factors. To a certain degree, progress in this area has already been reported. As in the case of liquid dye lasers, the lifetime of the material can be improved by using pump lasers at compatible wavelengths. In this regard, direct diode laser pumping of dye-doped solid-state matrices should lead to very compact, long-lifetime, tunable lasers emitting throughout the visible spectrum. An example of such a device would be a green laser-diode-pumped rhodamine 6G-doped MPMMA tunable laser configured in a multiple-prism grating architecture. As far as liquid lasers are concerned, there is a need for efficient water-soluble dyes. Successful developments in this area were published in the literature with coumarin-analog dyes by Chen and colleagues, in 1988, and some encouraging results with rhodamine dyes have been reported by Ray and colleagues, in 2002. Efficient water-soluble dyes could lead to significant improvements and simplifications in high-power tunable lasers.


See also

Coherent Control: Experimental. Nonlinear Optics, Applications: Pulse Compression via Nonlinear Optics. Quantum Optics: Entanglement and Quantum Information. Relativistic Nonlinear Optics.

Further Reading

Costela A, García-Moreno I and Sastre R (2001) In: Nalwa HS (ed.) Handbook of Advanced Electronic and Photonic Materials and Devices, vol. 7, pp. 161–208. San Diego, CA: Academic Press.
Diels J-C and Rudolph W (1996) Ultrashort Laser Pulse Phenomena. New York: Academic Press.
Demtröder W (1995) Laser Spectroscopy, 2nd edn. Berlin: Springer-Verlag.

Duarte FJ (ed.) (1991) High Power Dye Lasers. Berlin: Springer-Verlag.
Duarte FJ (ed.) (1995) Tunable Lasers Handbook. New York: Academic Press.
Duarte FJ (ed.) (1995) Tunable Laser Applications. New York: Marcel Dekker.
Duarte FJ (2003) Tunable Laser Optics. New York: Elsevier Academic.
Duarte FJ and Hillman LW (eds) (1990) Dye Laser Principles. New York: Academic Press.
Maeda M (1984) Laser Dyes. New York: Academic Press.
Schäfer FP (ed.) (1990) Dye Lasers, 3rd edn. Berlin: Springer-Verlag.
Radziemski LJ, Solarz RW and Paisner JA (eds) (1987) Laser Spectroscopy and its Applications. New York: Marcel Dekker.
Weber MJ (2001) Handbook of Lasers. New York: CRC.

Edge Emitters

J J Coleman, University of Illinois, Urbana, IL, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

The semiconductor edge emitting diode laser is a critical component in a wide variety of applications, including fiber optics telecommunications, optical data storage, and optical remote sensing. In this section, we describe the basic structures of edge emitting diode lasers and the physical mechanisms for converting electrical current into light. Laser waveguides and cavity resonators are also outlined. The power, efficiency, gain, loss, and threshold characteristics of laser diodes are presented along with the effects of temperature and modulation. Quantum well lasers are outlined and a description of grating coupled lasers is provided. Like all forms of laser, the edge emitting diode laser is an oscillator and has three principal components: a mechanism for converting energy into light, a medium that has positive optical gain, and a mechanism for obtaining optical feedback. The basic edge emitting laser diode is shown schematically in Figure 1. Current flow is vertical through a pn junction and light is emitted from the ends. The dimensions of the laser in Figure 1 are distorted, in proportion to typical real laser diodes, to reveal more detail. In practical laser diodes, the width of the stripe contact is actually much smaller than shown and the thickness is smaller still. Typical dimensions are (stripe width × thickness × length) 2 × 100 × 1000 μm.

Light emission is generated only within a few micrometers of the surface, which means that 99.98% of the volume of this small device is inactive. Energy conversion in diode lasers is provided by current flowing through a forward-biased pn junction. At the electrical junction, there is a high density of injected electrons and holes which can recombine in a direct energy gap semiconductor material to give an emitted photon. Charge neutrality requires equal numbers of injected electrons and holes. Figure 2 shows simplified energy versus momentum (E–k) diagrams for (a) direct and (b) indirect semiconductors. In a direct energy gap material, such as GaAs, the energy minima have the same momentum vector, so recombination of an electron in the conduction band and a hole in the valence band takes place directly. In an indirect material, such as silicon, momentum conservation requires the participation of a third particle, a phonon. This third-order process is far less efficient than direct recombination and, thus far, too inefficient to support laser action. Optical gain in direct energy gap materials is obtained when population inversion is reached.

Figure 1 Schematic drawing of an edge emitting laser diode.


Figure 2 Energy versus momentum for (a) direct, and (b) indirect semiconductors.

Figure 3 Density of states versus energy for conduction and valence bands.

Population inversion is the condition where the probability for stimulated emission exceeds that of absorption. The density of excited electrons n in the conduction band is given by

$$n = \int_{E_c}^{\infty} \rho_n(E)\, f_n(E)\, \mathrm{d}E \;\;\; \mathrm{cm}^{-3} \qquad [1]$$

where ρn(E) is the conduction band density of states function and fn(E) is the Fermi occupancy function. A similar equation can be written for holes in the valence band. Figure 3 shows the density of states ρ versus energy as a dashed line for conduction and valence bands. The solid lines in Figure 3 are the ρ(E)f(E) products, and the areas under the solid curves are the carrier densities n and p. The Fermi functions are occupation probabilities based on the quasi-Fermi levels EFn and EFp, which are, in turn, based on the injected carrier densities δn and δp. Population inversion implies that the difference between the quasi-Fermi levels must exceed the emission energy which, for semiconductors, must be greater than the bandgap energy:

$$E_{Fn} - E_{Fp} \geq \hbar\omega > E_g \qquad [2]$$

Charge neutrality requires that δn = δp. Transparency is the point where these two conditions are just met and any additional injected current results in optical gain. The transparency current density Jo is given by

$$J_o = \frac{q n_o L_z}{\tau_{spon}} \qquad [3]$$

where no is the transparency carrier density (δn = δp = no at transparency), Lz is the thickness of the optically active layer, and τspon is the spontaneous carrier lifetime in the material (typically 5 nsec).

A resonant cavity for optical feedback is obtained simply by taking advantage of the natural crystal formation in single crystal semiconductors. Most III–V compound semiconductors form naturally into a zincblende lattice structure. With the appropriate choice of crystal planes and surfaces, this lattice structure can be cleaved such that nearly perfect plane-parallel facets can be formed at arbitrary distances from each other. Since the refractive index of these materials is much larger (~3.5) than that of air, there is internal optical reflection for normal incidence of approximately 30%. This type of optical cavity is called a Fabry–Perot resonator.

Effective lasers of all forms make use of optical waveguides to optimize the overlap of the optical field with material gain and cavity resonance. The optical waveguide in an edge emitting laser is best described by considering it in terms of a transverse waveguide component, defined by the epitaxial growth of multiple thin heterostructure layers, and a lateral waveguide component, defined by conventional semiconductor device processing methods. One of the simplest, and yet most common, heterostructure designs is shown in Figure 4. This five-layer separate confinement heterostructure (SCH) offers excellent carrier confinement in the active layer as well as a suitable transverse waveguide. Clever choice of different materials for each layer results in the energy band structure, index of refraction profile, and optical field mode profile shown in Figure 5.

Figure 4 Schematic cross-section of a typical five-layer separate confinement heterostructure (SCH) laser.
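As a numerical check on eqn [3], the sketch below (Python) evaluates the transparency current density for a single quantum well; the transparency carrier density and well thickness are assumed textbook-scale values, not quantities given in this article.

```python
# Transparency current density, eqn [3]: Jo = q * no * Lz / tau_spon
q = 1.602e-19      # C, electron charge
no = 1.5e18        # cm^-3, transparency carrier density (assumed typical value)
Lz = 1.0e-6        # cm (10 nm), quantum well thickness (assumed)
tau_spon = 5e-9    # s, spontaneous carrier lifetime (typical value from the text)

Jo = q * no * Lz / tau_spon   # A/cm^2
print(f"Jo = {Jo:.0f} A/cm^2")  # ~48 A/cm^2, a plausible single-well value
```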


Figure 5 Energy band structure, index of refraction profile, and optical field mode profile of the SCH laser of Figure 4.

The outer confining layers are the thickest (~1 μm) and have the largest energy bandgap and lowest refractive index. The inner barrier layers are thinner (~0.1 μm), have intermediate bandgap energy and index of refraction, and serve both an optical role in defining the optical waveguide and an electronic role in confining carriers to the active layer. The active layer is the narrowest bandgap, highest index material, and is the only layer in the structure designed to provide optical gain. It may be a single layer or, as in the case of a multiple quantum well laser, a combination of layers. The bandgap energy E of this active layer plays a key role in determining the emission wavelength λ of the laser:

$$E = \frac{hc}{\lambda} \qquad [4]$$

where h is Planck's constant and c is the speed of light. The bandgap energy of the SCH structure is shown in Figure 5. The lowest energy active layer collects injected electrons and holes, and the carrier confinement provided by the inner barrier layers allows the high density of carriers necessary for population inversion and gain. The layers in the structure also have the refractive index profile shown in Figure 5, which forms a five-layer slab dielectric waveguide. With appropriate values for thicknesses and indices of refraction, this structure can be a very strong fundamental mode waveguide with the field profile shown. A key parameter of edge emitting diode lasers is Γ, the optical confinement factor. This parameter is the areal overlap of the optical field with the active layer, shown as dashed lines in the figure, and can be as little as a few percent, even in very high performance lasers.

Lateral waveguiding in a diode laser can be accomplished in a variety of ways. A common example of an index guided laser is the ridge waveguide laser shown in Figure 6. After the appropriate transverse waveguide heterostructure sample is grown, conventional semiconductor processing methods are used to form parallel etched stripes from the surface near, but not through, the active layer. An oxide mask is patterned such that electrical contact is formed only to the center (core) region between the etched stripes, the core stripe being only a few microns wide. The lateral waveguide that results arises from the average index of refraction (effective index) in the etched regions outside the core being smaller than that of the core. Figure 7 shows the asymmetric near field emission profile of the ridge waveguide laser and cross-sectional profiles of the laser emission and effective index of refraction.

Figure 6 Schematic diagram of a real index guided ridge waveguide diode laser.

Figure 7 Near field emission pattern, cross-sectional emission profiles (solid lines), and refractive index profiles (dashed lines) from an index guided diode laser.

The optical spectrum of an edge emitting diode laser below and above threshold is the superposition of the material gain of the active layer and the resonances provided by the Fabry–Perot resonator. The spectral variation of material gain with injected carrier density (drive current) is shown in Figure 8. As the drive is increased, the intensity and energy of peak gain both increase. The Fabry–Perot resonator has resonances every half wavelength, given by

$$L = m\,\frac{\lambda}{2n} \qquad [5]$$

where L is the cavity length, λ/n is the wavelength in the medium (n is the refractive index), and m is an integer. For normal length edge emitter cavities, these cavity modes are spaced only 1 or 2 Å apart.
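Equations [4] and [5] are easy to exercise numerically. The sketch below converts a bandgap energy to an emission wavelength and estimates the spacing between adjacent Fabry–Perot modes; the GaAs bandgap and the cavity parameters are assumed, typical values rather than figures from this article.

```python
# Eqn [4]: E = h*c/lambda, plus the mode spacing implied by eqn [5].
h = 6.626e-34          # J*s, Planck's constant
c = 2.998e8            # m/s, speed of light
q = 1.602e-19          # C, to convert eV to J

E_gap = 1.424          # eV, GaAs bandgap at room temperature (assumed example)
lam = h * c / (E_gap * q)          # m, emission wavelength from eqn [4]

n = 3.5                # refractive index (typical value from the text)
L = 1000e-6            # m, cavity length (typical value from the text)
dlam = lam**2 / (2 * n * L)        # m, spacing between adjacent modes

print(f"wavelength   = {lam*1e9:.0f} nm")          # ~871 nm
print(f"mode spacing = {dlam*1e10:.1f} Angstrom")  # ~1.1 A: '1 or 2 A apart'
```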


The result is optical spectra, near laser threshold and above, that look like the spectra of Figure 9. Many cavity modes resonate up to threshold, while above threshold, the superlinear increase in emission intensity with drive current tends to greatly favor one or two of the Fabry–Perot longitudinal cavity modes.

Figure 8 Gain versus emission energy for four increasing values of carrier (current) density. Peak gain is shown as a dashed line.

Figure 9 Emission spectra from a Fabry–Perot edge emitting laser structure at, and above, laser threshold.

Laser threshold can be determined from a relatively simple analysis of the round trip gain and losses in the Fabry–Perot resonator. If the initial intensity is Io, after one round trip the intensity will be given by

$$I = I_o R_1 R_2\, e^{(g - \alpha_i) 2L} \qquad [6]$$

where R1 and R2 are the reflectivities of the two facets, g is the optical gain, αi are losses associated with optical absorption (typically 3–15 cm⁻¹), and L is the cavity length. At laser threshold, g = gth and I = Io, i.e., the round trip gain exactly equals the losses. The result is

$$g_{th} = \alpha_i + \frac{1}{2L} \ln \frac{1}{R_1 R_2} \qquad [7]$$

where the second term represents the mirror losses αm. Of course, mirror losses are desirable in the sense that they represent useable output power. Actually, the equation above does not take into account the incomplete overlap of the optical mode with the gain in the active layer. The overlap is defined by the confinement factor Γ, described above, and the equation must be modified to become

$$\Gamma g_{th} = \alpha_i + \frac{1}{2L} \ln \frac{1}{R_1 R_2} \qquad [8]$$

For quantum well lasers, the material peak gain, shown in Figure 8, as a function of drive current is given approximately by

$$g = b J_o \ln \frac{J}{J_o} \qquad [9]$$

where Jo is the transparency current density. The threshold current density Jth, which is the current density where the gain exactly equals all losses, is given by

$$J_{th} = \frac{J_o}{\eta_i} \exp\!\left[ \frac{\alpha_i + \frac{1}{2L}\ln\frac{1}{R_1 R_2}}{\Gamma b J_o} \right] \qquad [10]$$
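Chaining eqns [7]–[10] gives a complete threshold estimate, as in the Python sketch below. The ~30% facet reflectivity and the quoted ranges for internal loss, confinement factor, and internal quantum efficiency follow the text; the transparency current density and gain parameter are illustrative assumptions, and the final stripe geometry anticipates the geometry factor formalized just below.

```python
import math

# Facet and cavity parameters
R1 = R2 = 0.30        # facet reflectivities (~30% for a cleaved facet, per text)
L = 0.10              # cm, cavity length (1000 um)
alpha_i = 10.0        # cm^-1, internal loss (within the 3-15 cm^-1 range quoted)
Gamma = 0.03          # optical confinement factor (a few percent, per the text)
eta_i = 0.90          # internal quantum efficiency (typically at least 90%)

# Quantum well gain model, eqn [9]: g = b*Jo*ln(J/Jo) (assumed parameters)
Jo = 50.0             # A/cm^2, transparency current density (assumed)
b = 30.0              # cm/A, gain parameter (assumed illustrative value)

alpha_m = (1.0 / (2 * L)) * math.log(1.0 / (R1 * R2))   # mirror loss, eqn [7]
g_th = (alpha_i + alpha_m) / Gamma                      # threshold gain, eqn [8]
J_th = (Jo / eta_i) * math.exp(g_th / (b * Jo))         # eqn [10]
w = 2e-4              # cm, effective stripe width (assumed 2 um)
I_th = w * L * J_th   # threshold current for this geometry

print(f"alpha_m = {alpha_m:.1f} cm^-1")   # ~12 cm^-1
print(f"g_th    = {g_th:.0f} cm^-1")      # ~730 cm^-1 of material gain
print(f"J_th    = {J_th:.0f} A/cm^2")
print(f"I_th    = {I_th*1e3:.1f} mA")     # ~2 mA with these assumptions
```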

Note that an additional parameter ηi has appeared in this equation. Not all of the carriers injected by the drive current participate in optical processes; some are lost to other nonradiative processes. This is accounted for by including the internal quantum efficiency term ηi, which is typically at least 90%. The actual drive current, of course, must include the geometry of the device and is given by

$$I_{th} = w L J_{th} \qquad [11]$$

where L is the cavity length, and w is the effective width of the active volume. The effective width is basically the stripe width of the core but also includes current spreading and carrier diffusion under the stripe. The power–current characteristic (P–I) for a typical edge emitting laser diode is shown in Figure 10. As the drive current is increased


from zero, the injected carrier densities increase and spontaneous recombination is observed. At some point the gain equals the internal absorption losses, and the material is transparent. Then, as the drive current is further increased, additional gain eventually equals the total losses, including mirror losses, and laser threshold is reached. Above threshold, nearly all additional injected carriers contribute to stimulated emission (laser output). The power generated internally is given by

$$P = w L (J - J_{th})\, \eta_i\, \frac{h\nu}{q} \qquad [12]$$

where wL is the area and hν is the photon energy. Only a fraction of the power generated internally is extracted from the device:

$$P_o = w L (J - J_{th})\, \eta_i\, \frac{h\nu}{q}\, \frac{\alpha_m}{\alpha_i + \alpha_m} \qquad [13]$$

Clearly a useful design goal is to minimize the internal optical loss αi.

Figure 10 Power versus current (P–I) curve for a typical edge emitting laser diode.

The slope of the P–I curve of Figure 10 above threshold is described by an external differential quantum efficiency ηd, which is related to the internal quantum efficiency by

$$\eta_d = \eta_i \left( \frac{\alpha_m}{\alpha_i + \alpha_m} \right) \qquad [14]$$

Other common parameters used to characterize the efficiency of laser diodes include the power conversion efficiency ηp

$$\eta_p = \frac{P_o}{V_F I} \qquad [15]$$

and the wall plug efficiency ηw

$$\eta_w = \frac{P_o}{I} \qquad [16]$$
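The output power and efficiency definitions of eqns [13]–[16] can be exercised together, as in the sketch below; the drive point, forward voltage, and loss values are assumed, illustrative numbers in the ranges discussed in the text.

```python
# Output power and efficiencies, eqns [13]-[16], for an assumed drive point.
w, L = 2e-4, 0.10          # cm, stripe width and cavity length (assumed)
J, J_th = 500.0, 90.0      # A/cm^2, drive and threshold current densities (assumed)
eta_i = 0.90               # internal quantum efficiency
alpha_i, alpha_m = 10.0, 12.0   # cm^-1, internal and mirror losses (assumed)
h_nu = 1.42                # eV photon energy, so h*nu/q = 1.42 V
V_F = 1.8                  # V, forward voltage (assumed; exceeds h*nu/q because
                           # of series resistance, as noted in the text)

I = w * L * J                                   # A, drive current
P_o = w * L * (J - J_th) * eta_i * h_nu * (alpha_m / (alpha_i + alpha_m))  # W, [13]
eta_d = eta_i * alpha_m / (alpha_i + alpha_m)   # external differential QE, [14]
eta_p = P_o / (V_F * I)                         # power conversion efficiency, [15]
eta_w = P_o / I                                 # wall plug efficiency, W/A, [16]

print(f"I = {I*1e3:.0f} mA, Po = {P_o*1e3:.1f} mW")
print(f"eta_d = {eta_d:.2f}, eta_p = {eta_p:.2f}, eta_w = {eta_w:.2f} W/A")
```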

The power conversion efficiency ηp recognizes that the forward bias voltage VF is larger than the photon energy by an amount associated with the series resistance of the laser diode. The wall plug efficiency ηw has the unusual units of W A⁻¹ and is simply a shorthand term that relates the common input unit of current to the common output unit of power. In high performance low power applications, the ideal device has minimum threshold current, since the current used to reach threshold contributes little to light emission. In high power applications, the threshold current becomes insignificant and the critical parameter is external quantum efficiency.

Temperature effects may be important for edge emitting lasers, depending on a particular application. Usually, concerns related to temperature arise under conditions of high temperature operation (T > 50 °C), where laser thresholds rise and efficiencies fall. It became common early in the development of laser diodes to define a characteristic temperature To for laser threshold, which is given by

$$I_{th} = I_{th_o} \exp\!\left( \frac{T}{T_o} \right) \qquad [17]$$

where a high To is desirable. Unfortunately, this simple expression does not describe the temperature dependence very well over a wide range of temperatures, so the appropriate temperature range must also be specified. For applications where the emission wavelength of the laser is important, the change in wavelength with temperature dλ/dT may also be specified. For conventional Fabry–Perot cavity lasers, dλ/dT is typically 3–5 Å °C⁻¹.

In order for a diode laser to carry information, some form of modulation is required. One method for obtaining this is direct modulation of the drive current in a semiconductor laser. The transient and temporal behavior of lasers is governed by rate equations for carriers and photons, which are necessarily coupled equations. The rate equation for carriers is given by

$$\frac{dn}{dt} = \frac{J}{q L_z} - \frac{n}{\tau_{sp}} - (c/n)\,\beta\,(n - n_o)\,\varphi(E) \qquad [18]$$

where J/qLz is the supply, n/τsp is the spontaneous emission rate, and the third term, (c/n)β(n − no)φ(E), is the stimulated emission rate, where β is the gain coefficient and φ(E) is the photon density.


The rate equation for photons is given by

$$\frac{d\varphi}{dt} = (c/n)\,\beta\,(n - n_o)\,\varphi + \frac{\theta n}{\tau_{sp}} - \frac{\varphi}{\tau_p} \qquad [19]$$

where θ is the fraction of the spontaneous emission that couples into the mode (a small number), and τp is the photon lifetime (~1 psec) in the cavity. These rate equations can be solved for specific diode laser materials and structures to yield a modulation frequency response. A typical small signal frequency response, at two output power levels, is shown in Figure 11. The response is typically flat to frequencies above 1 GHz, rises to a peak value at some characteristic resonant frequency, and quickly rolls off. The characteristic small signal resonant frequency ωr is given by

$$\omega_r^2 = \frac{(c/n)\,\beta\,\varphi}{\tau_p} \qquad [20]$$

where φ is the average photon density. Direct modulation of edge emitting laser diodes is limited by practicality to less than 10 GHz. This is in part because of the limits imposed by the resonant frequency and in part by chirp. Chirp is the frequency modulation that arises indirectly from direct current modulation. Modulation of the current modulates the carrier densities which, in turn, results in modulation of the quasi-Fermi levels. The separation of the quasi-Fermi levels is the emission energy (wavelength). Most higher-speed lasers make use of external modulation schemes.
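From eqn [20], the relaxation resonance that ultimately caps direct modulation can be estimated directly; in the sketch below the differential gain and photon density are assumed illustrative values, while the ~1 psec photon lifetime comes from the text.

```python
import math

# Small-signal resonance, eqn [20]: wr^2 = (c/n) * beta * phi / tau_p
c = 3.0e10        # cm/s, speed of light
n = 3.5           # refractive index (typical value from the text)
beta = 5.0e-16    # cm^2, gain coefficient (assumed typical differential gain)
phi = 2.0e14      # cm^-3, average photon density (assumed; rises with power)
tau_p = 1.0e-12   # s, photon lifetime (~1 psec, per the text)

wr = math.sqrt((c / n) * beta * phi / tau_p)   # rad/s
fr = wr / (2 * math.pi)                        # Hz
print(f"fr = {fr/1e9:.1f} GHz")
# A few GHz here; doubling phi (output power) raises fr by sqrt(2), which is
# why the resonance peak in Figure 11 moves out at the higher power level.
```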

Figure 11 Direct current modulation frequency response for an edge emitting diode laser at two output power levels.

The discussion thus far has addressed aspects of semiconductor diode edge emitting lasers that are common to all types of diode lasers irrespective of the choice of materials or the details of the active layer structure. The materials of choice for efficient laser devices include a wide variety of III–V binary compounds and ternary or quaternary alloys. The particular choice is based on such issues as emission wavelength, a suitable heterostructure lattice match, and the availability of high-quality substrates. For example, common low-power lasers for CD players (λ ≈ 820 nm) are likely to be made on a GaAs substrate using a GaAs–AlxGa1−xAs heterostructure. Lasers for fiberoptic telecommunications systems (λ ≈ 1.3–1.5 μm) are likely to be made on an InP substrate using an InP–InxGa1−xAsyP1−y heterostructure. Red laser diodes (λ ≈ 650 nm) are likely to be made on a GaAs substrate using a GaAs–InxGa1−xAsyP1−y heterostructure. Blue and ultraviolet lasers, a relatively new technology compared to the others, make use of AlxGa1−xN–GaN–InyGa1−yN heterostructures formed on sapphire or silicon carbide substrates. An important part of many diode laser structures is a strained layer. If a layer is thin enough, the strain arising from a modest amount of lattice mismatch can be accommodated elastically. In thicker layers, lattice mismatch results in an unacceptable number of dislocations that affect quantum efficiency, optical absorption losses, and, ultimately, long-term failure rates. Strained layer GaAs–InxGa1−xAs lasers are a critical component in rare-earth doped fiber amplifiers. The structure of the active layer in edge emitting laser diodes was originally a simple double heterostructure configuration with a single active layer of 500–1000 Å in thickness. In the 1970s, however, advances in the art of growing semiconductor heterostructure materials led to growth of high-quality quantum well active layers. These structures, having thicknesses that are comparable to the electron wavelength in a semiconductor (~200 Å), revolutionized semiconductor lasers, resulting in much lower threshold current densities, higher efficiencies, and a broader range of available emission wavelengths. The energy band diagram for a quantum well laser active region is shown in Figure 12. This structure is a physical realization of the particle-in-a-box problem in elementary quantum mechanics. The quantum well yields quantum states in the conduction band at discrete energy levels, with odd and even electron wavefunctions, and breaks the degeneracy in the valence band, resulting in separate energy states for light holes and heavy holes. The primary transition for recombination is from the n = 1 electron state to the h = 1 heavy hole state and takes place at a higher energy than the bulk bandgap energy. In addition, the density of states for quantum wells becomes a step-like function and optical gain is enhanced.
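A quick way to see why the n = 1 transition lies above the bulk bandgap is the elementary infinite-well estimate below; the well width and GaAs effective mass are assumed values, and a real (finite) well gives a somewhat smaller confinement energy.

```python
# Infinite-well estimate of the n = 1 confinement energy:
# E_n = n^2 * h^2 / (8 * m_eff * Lz^2)
h = 6.626e-34        # J*s, Planck's constant
m0 = 9.109e-31       # kg, free electron mass
q = 1.602e-19        # C, for J -> eV conversion

m_eff = 0.067 * m0   # GaAs conduction-band effective mass (assumed)
Lz = 10e-9           # m, well width (assumed, ~100 A)

E1 = h**2 / (8 * m_eff * Lz**2) / q   # eV, n = 1 electron level
print(f"E1 = {E1*1e3:.0f} meV above the conduction band edge")
# ~56 meV here; the emission energy is raised by roughly E1 (electron)
# plus the corresponding heavy-hole confinement energy.
```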


Figure 12 Energy band diagram for a quantum well heterostructure.

The advantages in practical edge emitting lasers that are the result of one-dimensional quantization are such that virtually all present commercial laser diodes are quantum well lasers. These kinds of advantages are driving development of laser diodes with additional degrees of quantization, including two dimensions (quantum wires) and three dimensions (quantum dots). The Fabry–Perot cavity resonator described here is remarkably efficient and relatively easy to fabricate, which makes it the cavity of choice for many applications. There are many applications, however, where the requirements for linewidth of the laser emission or the temperature sensitivity of the emission wavelength make the Fabry–Perot resonator less desirable. An important laser technology that addresses both of these concerns involves the use of a wavelength selective grating as an integral part of the laser cavity. A notable example of a grating coupled laser is the distributed feedback laser shown schematically in Figure 13a. This is similar to the SCH cross-section of Figure 4 with the addition of a Bragg grating above the active layer. The period Λ of the grating is chosen to fulfill the Bragg condition for mth-order coupling between forward- and backward-propagating waves, which is

$$\Lambda = \frac{m\lambda}{2n} \qquad [21]$$

where λ/n is the wavelength in the medium. The index step between the materials on either side of the grating is small and the amount of reflection is also small. Since the grating extends throughout the length of the cavity, however, the overall effective reflectivity can be large. Another variant is the distributed Bragg reflector (DBR) laser resonator, shown in Figure 13b. This DBR laser has a higher index contrast, deeply etched surface grating located at one or both ends of the otherwise conventional SCH laser heterostructure. One of the advantages of this structure is the opportunity to add a contact for tuning the Bragg grating by current injection. Both the DBR and DFB laser structures are particularly well suited for telecommunications or other applications where a very narrow single laser emission line is required. In addition, the emission wavelength temperature dependence for this type of laser is much smaller, typically 0.5 Å °C⁻¹.

Figure 13 Schematic diagram of distributed feedback (DFB) and distributed Bragg reflector (DBR) edge emitting laser diodes.
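Eqn [21] fixes the grating period once a target wavelength and modal index are chosen. The sketch below evaluates it for an assumed 1.55 μm telecommunications design; both input numbers are illustrative.

```python
# Bragg grating period, eqn [21]: Lambda = m * lambda / (2 * n)
lam = 1.55e-6    # m, target emission wavelength (assumed telecom example)
n = 3.2          # effective modal index (assumed)
for m in (1, 2):  # first- and second-order gratings
    period = m * lam / (2 * n)
    print(f"m = {m}: grating period = {period*1e9:.0f} nm")
# m = 1 gives ~242 nm; a second-order grating is coarser and easier to pattern.
```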

See also

Semiconductor Physics: Quantum Wells and GaAs-based Structures.

Further Reading

Agrawal GP (ed.) (1995) Semiconductor Lasers: Past, Present, and Future. Woodbury, NY: American Institute of Physics.
Agrawal GP and Dutta NK (1993) Semiconductor Lasers. New York: Van Nostrand Reinhold.
Bhattacharya P (1994) Semiconductor Optoelectronic Devices. Upper Saddle River, NJ: Prentice-Hall.
Botez D and Scifres DR (eds) (1994) Diode Laser Arrays. Cambridge, UK: Cambridge University Press.
Chuang SL (1995) Physics of Optoelectronic Devices. New York: John Wiley & Sons.


Coleman JJ (1992) Selected Papers on Semiconductor Diode Lasers. Bellingham, WA: SPIE Optical Engineering Press.
Einspruch NG and Frensley WR (eds) (1994) Heterostructures and Quantum Devices. San Diego, CA: Academic Press.
Iga K (1994) Fundamentals of Laser Optics. New York: Plenum Press.

Kapon E (1999) Semiconductor Lasers I and II. San Diego, CA: Academic Press.
Nakamura S and Fasol G (1997) The Blue Laser Diode. Berlin: Springer.
Verdeyen JT (1989) Laser Electronics, 3rd edn. Englewood Cliffs, NJ: Prentice-Hall.
Zory PS Jr (ed.) (1993) Quantum Well Lasers. San Diego, CA: Academic Press.

Excimer Lasers

J J Ewing, Ewing Technology Associates, Inc., Bellevue, WA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

Introduction

We present a summary of the fundamental operating principles of ultraviolet excimer lasers. After a brief discussion of the economics and application motivation, the underlying physics and technology of these devices is described. Key issues and limitations in the scaling of these lasers are presented.

Background: Why Excimer Lasers?

Excimer lasers have become the most widely used source of moderate-power pulsed ultraviolet light in laser applications. The range of manufacturing applications is also large. Production of computer chips, using excimer lasers as the illumination source for lithography, has by far the largest commercial manufacturing impact. In the medical arena, excimer lasers are used extensively in what is becoming one of the world's most common surgical procedures, the reshaping of the cornea to correct vision problems. Taken together, the annual production rate for these lasers is a relatively modest number compared to the production of semiconductor lasers. Current markets are in the range of $0.4 billion per year (Figure 1). More importantly, the unique wavelength, power, and pulse energy properties of the excimer laser enable a systems and medical procedure market, based on the excimer laser, to exceed $3 billion per year. The current average sales price for these lasers is in the range of $250 000 per unit, driven primarily by the production costs and reliability requirements of the lasers for lithography for chip manufacture. Relative to semiconductor diode lasers, excimer lasers have always been, and will continue to be, more expensive per unit device by a factor of order 10⁴. UV power and pulse energy, however, are considerably higher.

The fundamental properties of these lasers enable the photochemical and photophysical processes used in manufacturing and medicine. Short pulses of UV light can photo-ablate materials, which makes them useful in medicine and micromachining. Short wavelengths enable photochemical processes. The ability to provide a narrow linewidth using appropriate resonators enables precise optical processes, such as lithography. We trace out some of the underlying properties of these lasers that lead to this utility. We also discuss the technological problems and solutions that have evolved to make these lasers so useful.

Figure 1 The sales of excimer lasers have always been measured in small numbers relative to other, less-expensive lasers. For many years since their commercial release in 1976 the sales were to R&D workers in chemistry, physics and ultimately biology and medicine. The market for these specialty but most useful R&D lasers was set to fit within a typical researcher’s annual capital budget, i.e., less than $100K. As applications and procedures were developed and certified or approved, sales and average unit sales price increased considerably. Reliance on cyclical markets like semiconductor fabrication led to significant variations in production rates and annual revenues as can be seen.


The data in Figure 1 provide a current answer to the question of 'why excimer lasers?' At the dawn of the excimer era, the answer to this question was quite different. In the early 1970s, there were no powerful short wavelength lasers, although IR lasers were being scaled up to significant power levels. However, visible or UV lasers could in principle provide better propagation and focus to a very small spot size at large distances. Solid state lasers of the time offered very limited pulse repetition frequency and low efficiency. Diode pumping of solid state lasers, and the diode arrays to excite such lasers, was a far removed development effort. As such, short-wavelength lasers were researched over a wide range of potential media and lasing wavelengths. The excimer concept was one of those candidates. We provide, in Figure 2, a 'positioning' map showing the range of laser parameters that can be obtained with commercial or developmental excimer lasers. The map has coordinates of pulse energy and pulse rate, with diagonal lines expressing average power. We show where both development goals and

Figure 2 The typical energy and pulse rate for certain applications and technologies using excimer lasers. For energy under a few J, a discharge laser is used. The earliest excimer discharge lasers were derivatives of CO2 TEA (transversely excited atmospheric pressure) and produced pulse energy in the range of 100 mJ per pulse. Pulse rates of order 30 Hz were all that the early pulsed power technology could provide. Early research focused on generating high energy. Current markets are at modest UV pulse energy and high pulse rates for lithography and low pulse rates for medical applications.

current markets lie on this map. Current applications are shown by the boxes in the lower right-hand corner. Excimer lasers have been built with clear apertures in the range of 0.1 to 10⁴ cm², with single pulse energies covering a range of 10⁷. Lasers with large apertures require electron beam excitation. Discharge lasers correspond to the more modest pulse energy used for current applications. In general, for energies lower than ~5 J, the excitation method of choice is a self-sustained discharge, clearly providing growth potential for future uses, if required. The applications envisaged in the early R&D days were in much higher energy uses, such as laser weapons, laser fusion, and laser isotope separation. The perceived need for lasers for isotope separation, laser-induced chemistry, and blue-green laser-based communications from satellites drove research in high pulse rate technology and reliability extension for discharge lasers. This research resulted in prototypes, with typical performance ranges as noted on the positioning map, that well exceed current market requirements. However, the technology and underlying media physics developed in these early years made possible the advances needed to serve the ultimate real market applications. The very important application of UV lithography requires high pulse rates and high reliability. These two features differentiate the excimer laser used for lithography from the one used in the research laboratory. High pulse rates, over 2000 Hz in current practice, and the cost of stopping a computer chip production line for servicing, drive the reliability requirements toward 10¹⁰ shots per service interval. In contrast, the typical laboratory experiments are more often in the range of 100 Hz and, unlike a lithography production line, do not run 24 hours a day, 7 days a week. The other large market currently is in laser correction of myopia and other imperfections in the human eye. For this use the lasers are more akin to the commercial R&D style laser in terms of pulse rate and single pulse energy. Not that many shots are needed to ablate tissue to make the needed correction, and shot life is a straightforward requirement. However, the integration of these lasers into a certified and accepted medical procedure and an overall medical instrument drove the growth of this market. The excimer concept, and the core technology used to excite these lasers, is applicable over a broad range of UV wavelengths, with utility at specific wavelengths from 351 nm in the near UV to 157 nm in the vacuum UV (VUV). There were many potential excimer candidates in the early R&D phase (Table 1).


Table 1 Key excimer wavelengths

Excimer emitter | λ (nm) | Comment
Ar2 | 126 | Very short emission wavelength, deep in the vacuum UV; inefficient as a laser due to absorption, see Xe2.
Kr2 | 146 | Broadband emitter and early excimer laser with low laser efficiency.
F2 | 157 | Very efficient converter of electric power into vacuum UV light. The molecule itself is not an excimer, but uses lower-level dissociation; a candidate for next-generation lithography.
Xe2 | 172 | The first demonstrated excimer laser. Very efficient formation and emission, but excited state absorption limits laser efficiency. Highly efficient as a lamp.
ArF | 193 | Workhorse for corneal surgery and lithography.
KrCl | 222 | Too weak compared to KrF and not as short a wavelength as ArF.
KrF | 248 | Best intrinsic laser efficiency; numerous early applications but found its major market in lithography. Significant use in other materials processing.
XeI | 254 | Never a laser, but high formation efficiency and excellent for a lamp.
XeBr | 282 | Historically the first rare gas halide excimer whose emission was studied at high pressure. First rare gas halide to be shown as a laser; inefficient as a laser due to excited state absorption, but an excellent fluorescent emitter; a choice for lamps.
Br2 | 292 | Another halogen molecule with excimer-like transitions that has lased but had no practical use.
XeCl | 308 | Optimum medium for laser discharge excitation.
Hg2 | 335 | Hg vapor is very efficiently excited in discharges and forms excimers at high pressure. Subject of much early research but, as in the case of the rare gas excimers such as Xe2, suffers from excited state absorption. Despite the high formation efficiency, this excimer was never made to lase.
I2 | 342 | Iodine UV molecular emission and lasing was the first of the pure halogen excimer-like lasers demonstrated. This species served as the kinetic prototype for the F2 'honorary' excimer that has important use as a practical VUV source.
XeF | 351, 353 | This excimer laser transition terminates on a weakly bound lower level that dissociates rapidly, especially when heated. The focus for early defense-related laser development, as this wavelength propagates best of this class in the atmosphere.
XeF | 480 | This very broadband emission of XeF terminates on a different lower level than the UV band. Not as easily excited in a practical system, so has not had significant usage.
HgBr | 502 | A very efficient green laser that has many of the same kinetic features of the rare gas halide excimer lasers. But the need for moderately high temperature for the needed vapor pressure made this laser less than attractive for UV operation.
XeO | 540 | An excimer-like molecular transition on the side of the auroral lines of the O atom. The optical cross-section is quite low, and as a result the laser never matured in practice.

Some of these candidates can be highly efficient, >50% relative to deposited energy, at converting electrical power into UV or visible emission, and have found a parallel utility as sources for lamps, though with differing power deposition rates and configurations. The key point in the table is that these lasers are powerful and useful sources at very short wavelengths. For lithography, the wavelength of interest has consistently shifted to shorter and shorter wavelengths as the printed feature size decreased. Though numerous candidates were pursued, as shown in Table 1, the most useful lasers are those using rare gas halide molecules. These lasers are augmented by dye laser conversion and Raman shifting to provide a wealth of useful wavelengths in the visible and UV. All excimer lasers utilize molecular dissociation to remove the lower laser level population. This lower

level dissociation is typically from an unbound or very weakly bound ground-state molecule. However, halogen molecules also provide laser action on excimer-like transitions that terminate on a molecular excited state that dissociates quickly. The vacuum UV transition in F2 is the most practical method for making very short wavelength laser light at significant power. Historically, the rare gas dimer molecules, such as Xe2, were the first to show excimer laser action, although self absorption, low gain, and poor optics in the VUV limited their utility. Other excimer-like species, such as metal halides, metal rare gas continua on the edge of metal atom resonance lines, and rare gas oxides, were also studied. The key differentiators of rare gas halides from other excimer-like candidates are strong binding in the excited state, lack of self absorption, and a relatively high optical cross-section for the laser transition.
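As an illustration of the Raman-shifting route to additional wavelengths, the sketch below applies the well-known 4155 cm⁻¹ vibrational Stokes shift of H2 to three principal rare gas halide lines; the choice of H2 and the number of Stokes orders shown are illustrative, not specifics from this article.

```python
# Stokes-shifted wavelengths from Raman conversion in H2.
# The H2 vibrational shift (4155 cm^-1) is a standard value; pairing it
# with these pump lines is an illustrative calculation.
H2_SHIFT = 4155.0  # cm^-1

pumps_nm = {"ArF": 193.0, "KrF": 248.0, "XeCl": 308.0}
for name, lam in pumps_nm.items():
    wavenumber = 1e7 / lam                 # cm^-1 of the pump photon
    for order in (1, 2):                   # first and second Stokes lines
        shifted = wavenumber - order * H2_SHIFT
        print(f"{name} Stokes {order}: {1e7/shifted:.0f} nm")
# KrF, for example, yields ~277 nm (first Stokes) and ~312 nm (second Stokes).
```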


Excimer Laser Fundamentals

Excimer lasers utilize lower-level dissociation to create and sustain a population inversion. For example, in the first excimer laser demonstrated, Xe2, the lasing species is comprised of two xenon atoms, which do not form a stable molecule in the ground state. In the lowest energy state, that with neither of the atoms electronically excited, the interaction between the atoms in a collision is primarily one of repulsion, save for a very weak van der Waals attraction at larger internuclear separations. This is shown in the potential energy diagram of Figure 3. If one of the atoms is excited, for example by an electric discharge, there can be a chemical binding in the electronically excited state of the molecule relative to the energy of one atom with an electron excited and one atom in the ground state. We use the shorthand notation * to denote an electronically excited atomic or molecular species, e.g., Xe*. The excited dimer, excimer for short, will recombine rapidly at high pressure from atoms into the Xe2* electronically excited molecule. Radiative lifetimes of the order 10 nsec to 1 μsec are typical before photon emission returns the molecule to the lower level. Collisional quenching often reduces the lifetime below the radiative rate. For Xe2* excimers, the emission is in the vacuum ultraviolet (VUV). There are a number of variations on this theme, as noted in Table 1. Species other than rare gas pairs can exhibit broadband emission. The excited state binding energy can vary significantly, changing the

fraction of excited states that will form molecules. The shape of the lower potential energy curve can vary as well. From a semantic point of view, heteronuclear diatomic molecules, such as XeO, are not excimers, as they are not made of two of the same atoms. However, the more formal name 'exciplex' did not stick in the laser world, 'excimer laser' being preferred. Large binding energy is more efficient for capturing excited atoms into excited excimer-type molecules in the limited time they have before emission. Early measurements of the efficiency of converting deposited electrical energy into fluorescent radiation showed that up to 50% of the energy input could result in VUV light emission. Lasers were not this efficient due to absorption. The diagram shown in Figure 3 is simplified because it does not show all of the excited states that exist for the excited molecule. There can be closely lying excited states that share the population and radiate at a lower rate, or perhaps absorb to higher levels, also not shown in Figure 3. Such excited state absorption may terminate on states derived from higher atomic excited states, noted as Xe** in the figure, or yield photo-ionization, producing the diatomic ion-molecule plus an electron, such as:

Xe2* + hν → Xe2+ + e−   [1]

Figure 3 A schematic of the potential energy curves for an excimer species such as Xe2. The excited molecule is bound relative to an excited atom and can radiate to a lower level that rapidly dissociates on psec timescales, since the ground state is not bound, save for weak binding due to van der Waals forces. The emission band is intrinsically broad.

Such excited state absorption limited the efficiency for the VUV rare gas excimers, even though they have very high formation and fluorescence efficiency. Rare gas halide excimer lasers evolved from 'chemi-luminescence' chemical kinetics studies which were looking at the reactive quenching of rare gas metastable excited states in flowing afterglow experiments. Broadband emission in reactions with halogen molecules was observed in these experiments, via reactions such as:

Xe* + Cl2 → XeCl* + Cl   [2]

At low pressure the emission from molecules such as XeCl* is broad in bandwidth. Shortly after these initial observations, the laser community began examination of these species in high-pressure, electron-beam excited mixtures. The results differed remarkably from the low-pressure experiments and those for the earlier excimers such as Xe2*. The spectrum was shifted well away from the VUV emission. Moreover, the emission was much sharper at high pressure, though still a continuum with a bandwidth of the order of 4 nm. A basis for understanding is sketched in the potential energy curves of Figure 4. In this schematic we identify the rare gas atoms by Rg, excited rare gas atoms by


Figure 4 The potential energy curves for rare gas halides are sketched here. Relatively sharp continuum emission is observed along with broad bands corresponding to different lower levels. The excited ion pair states consist of three distinct levels: two that are very close in energy and one that is shifted up from the upper laser level (the B state) by the spin-orbit splitting of the rare gas ion.

Rg*, and halogen atoms by X. Ions play a very important role in the binding and reaction chemistry, leading to excited laser molecules. The large shift in wavelength from the VUV of the rare gas dimer excimers, ~308 nm in XeCl* versus 172 nm in Xe2*, is due to the fact that the binding in the excited state of the rare gas halide is considerably stronger. The sharp and intense continuum at high pressure (above ~100 torr total pressure) is due to the fact that the 'sharp' transition terminates on a 'flat' portion of the lower-level potential energy curve. Indeed, in XeCl and XeF, the sharp band laser transition terminates on a slightly bound portion of the lower-level potential energy curve. Both the sharp emissions and broad bands are observed, due to the fact that some transitions from the excited levels terminate on a second, more repulsive potential energy curve. An understanding of the spectroscopy and kinetic processes in rare gas plus halogen mixtures is based on the recognition that the excited state of a rare gas halide molecule is very similar to the ground state of the virtually isoelectronic alkali halide. The binding energy and mechanism of the excited state is very close to that of the ground state of the most similar alkali halide. Reactions of excited state rare gases are very similar to those of alkali atoms. For example, Xe* and Cs are remarkably similar, from a chemistry and molecular physics point of view, as they have the same outer shell electron configuration and similar ionization potentials. Cs, and all the other

alkali atoms, react rapidly with halogen molecules. Xenon metastable excited atoms likewise react rapidly with halogen molecules. The alkali atoms all form ionically bonded ground states with halogens. The rare gas halide excited states are effectively ion pairs, Xe(+)X(−), that are very strongly bonded relative to the excited states that yield them. The difference between, for example, XeCl* and CsCl is that XeCl* radiates in a few nanoseconds while CsCl is a stable ground-state molecule. The binding energy for these ionically bonded molecules is much greater than the more covalent type of bonding that is found in the rare gas excimer excited states. Thus XeI*, which is only one electron different (in the I) than Xe2*, emits at 254 nm instead of 172 nm. The first observed and most obvious connection to alkali atoms is in formation chemistry. Mostly, rare gas excited states react with halogen molecules rapidly. The rare gas halide laser analog reaction sequence is shown below, where some form of electric discharge excitation kicks the rare gas atom up to an excited state, eqn [3], so that it can react like an alkali atom, eqn [4]:

e(fast) + Kr → Kr* + e(slow)   [3]

Kr* + F2 → KrF*(v,J)high + F   [4]

There are subtle and important cases where the neutral reaction, such as that shown in eqn [2] or [4], is not relevant as it is too slow compared to other processes. Note that the initial reaction does not yield the specific upper levels for the laser but states that are much higher up in the potential well of the excited rare gas halide excimer molecule. Further collisions with a buffer gas are needed to relax the states to the vibrational levels that are the upper levels for the laser transition. Such relaxation at high pressure can be fast, which leads the high-pressure spectrum to be much sharper than those first observed in flowing afterglow experiments. The excited states of the rare gas halides are ion pairs bound together for their brief radiative lifetime. This opens up a totally different formation channel, indeed the one that often dominates the mechanism: ion–ion recombination. Long before excimer lasers were studied, it was well known that halogen molecules would react with 'cold' electrons (those with an effective temperature less than a few eV) in a process called attachment. The sequence leading to the upper laser level is shown below for the Kr/F2 system, but the process is generically identical in other rare gas halide mixtures.


KrF* formation kinetics via the ion channel:

Ionization: e + Kr → Kr+ + 2e   [5]

Attachment: e + F2 → F(−) + F   [6]

Ion–ion recombination: Kr+ + F(−) + M → KrF*(v,J)high + M   [7]

Relaxation: KrF*(v,J)high + M → KrF*(v,J)low + M   [8]

Finally, there is one other formation mechanism, displacement, that is unique to the rare gas halides. In this process, a heavier rare gas will displace the lighter rare gas ion in an excited molecule to form a lower-energy excited state:

ArF* + Kr → KrF*(v,J)high + Ar   [9]

In general, the most important excimer lasers tend to have all of these channels contributing to excited state formation for optimized mixtures. These kinetic formation sequences express some important points regarding rare gas halide lasers. The halogen molecule, the source of the halogen atoms in the upper laser level, disappears via both attachment reactions with electrons and by reaction with excited states. The halogen atoms eventually recombine to make molecules, but at a rate too slow for sufficient continuous wave (cw) laser gain. For excimer-based lamps, however, laser gain is not a criterion and very efficient excimer lamps can be made. Another point to note is that forming the inversion requires transient species (and in some cases the halogen fuels) that absorb the laser light. Net gain and extraction efficiency are a trade-off between the pumping rate, an increase in which often forms more absorbers, and the absorption itself. Table 2 provides a listing of some of the key absorbers. Finally, the rare gas halide excited states can be rapidly quenched in a variety of processes. More often than not, the halogen molecule and the electrons in the discharge react with the excited states of the rare gas halide, removing the contributors to gain. We show, in Table 3, a sampling of some of the types of formation and quenching reactions with their kinetic rate constant parameters. Note that one category of reaction, three-body quenching, is unique to excimer lasers. Note also that the three-body recombination of ions has a rate coefficient that is effectively pressure dependent. At pressures above ~3 atm, the diffusion of ions toward each other is slowed and the effective recombination rate constant decreases. Though the rare gas halide excited states can be made with near-unity quantum yield from the primary excitation, the intrinsic laser efficiency is not this high. The quantum efficiency is only ~25% (recall rare gas halides all emit at much longer wavelengths than rare gas excimers); there is inefficient extraction due to losses in the gas and windows. In practice there are also losses in coupling the power into the laser, and wasted excitation during the finite start-up time. The effect of losses is shown in Table 4 and Figure 5, where the pathways are charted and losses identified. In KrF, it takes ~20 eV to make a rare gas ion in an electron beam excited mixture, somewhat less in an electric discharge.

Table 2 Absorbers in rare gas halide lasers

Species | XeF | XeCl | KrF | ArF
Laser λ (nm) | 351, 353 | 308 | 248 | 193
F2 absorption σ (cm²) | 8 × 10⁻²¹ | NA | 1.5 × 10⁻²⁰ | …
Cl2 absorption σ (cm²) | NA | 1.7 × 10⁻¹⁹ | NA | NA
HCl absorption σ (cm²) | NA | Nil | NA | NA
F(−) absorption σ (cm²) | 2 × 10⁻¹⁸ | NA | 5 × 10⁻¹⁸ | 5 × 10⁻¹⁸
Cl(−) absorption σ (cm²) | NA | 2 × 10⁻¹⁷ | NA | NA
Rg2(+) diatomic ion absorption σ (cm²)ᵃ | Ne: ~10⁻¹⁸; Ar: ~3 × 10⁻¹⁷; Xe: ~3 × 10⁻¹⁷ | Ne: ~10⁻¹⁷; Ar: ~4 × 10⁻¹⁷; Xe: ~3 × 10⁻¹⁸ | Ne: ~2.5 × 10⁻¹⁷; Ar: ~2 × 10⁻¹⁸; Kr: ~10⁻¹⁸ | Ne: ~10⁻¹⁸; Ar: Nil
Other transient absorbers | Excited states of rare gas atoms and rare gas dimer molecules, σ ~ 10⁻¹⁹–10⁻¹⁷ (all four lasers)
Other | Lower laser level absorbs unless heated | Weak lower level absorption, σ ~ 4 × 10⁻¹⁶ | Windows and dust can be an issue | Windows and dust can be an issue

ᵃ Note that the triatomic rare gas halides of the form Rg2X will exhibit similar absorption as the diatomic rare gas ions, as the triatomic rare gas halides of this form are essentially ion pairs of an Rg2(+) ion and the halide ion.

LASERS / Excimer Lasers 427

Table 3

Typical formation and quenching reactions and rate coefficients

Formation Rare gas plus halogen molecule

Krp þ F2 ! KrFp (v ; J)þF k , 7 £ 10210 cm3/sec; Branching ratio to KrFp , 1 Arp þ F2 ! ArFp (v ; J)þF k , 6 £ 10210 cm3/sec; Branching ratio to ArFp , 60% Xep þ F2 ! XeFp (v; J)þF k , 7 £ 10210 cm3/sec; Branching ratio to XeFp , 1 Xep þ HCl(v ¼ 0) ! Xe þ H þ Cl k , 6 £ 10210 cm3/sec; Branching ratio to XeClp , 0 Xep þ HCl(v ¼ 1) ! XeClp þ H k , 2 £ 10210 cm3/sec; Branching ratio to XeClp ,1

Ion– ion recombination

Ar(þ) þ F(2) þ Ar ! ArFp(v,J ) k , 1 £ 10225 cm6/sec at 1 atm and below k , 7:5 £ 10226 cm6/sec at 2 atm; Note effective 3-body rate constant decreases further as pressure goes over ,2 atm Kr(þ) þ F(2) þ Kr ! KrFp (v ; J) k , 7 £ 10226 cm6/sec at 1 atm and below; rolls over at higher pressure

Displacement

ArFp (v ; J)þKr ! KrFp (v; J)þAr k , 2 £ 10210 cm3/sec; Branching ratio to KrFp ,1

Quenching Halogen molecule

XeClp þ HCl ! Xe þ Cl þ HCl k , 5:6 £ 10210 cm3/sec; KrFp þ F2 ! Kr þ F þ F k , 6 £ 10210 cm3/sec; All halogen molecule quenching have very high reaction rate constants

3-body inert gas

ArFp þ Ar þ Ar ! Ar2 Fp þ Ar k , 4 £ 10231 cm6/sec KrFp þ Kr þ Ar ! Kr2 Fp þ Ar k , 6 £ 10231 cm6/sec XeClp þ Xe þ Ne ! Xe2 Clp þ Ne k , 1 £ 10233 cm6/sec; 4 £ 10231 cm6/sec with Xe as 3rd body These reactions yield a triatomic excimer at lower energy that can not be recycled back to the desired laser states

Electrons

XeClp þ e ! Xe þ Cl þ e k , 2 £ 1027 cm3/sec In 2-body quenching by electrons, the charge of the electron interacts with the dipole of the rare gas halide ion pair at long range; above value is typical of others

2-body inert gas

KrFp þ Ar ! Ar þ Kr þ F k , 2 £ 10212 cm3/sec XeClp þ Ne ! Ne þ Xe þ Cl k , 3 £ 10213 cm3/sec 2-body quenching by rare gases is typically slow and less important than the reactions noted above

somewhat less in an electric discharge. The photon release is ,5 eV; the quantum efficiency is only 25%. Early laser efficiency measurements (using electron beam excitation) showed laser efficiency in the best cases slightly in excess of 10%, relative to the deposited energy. The discrepancy relative to the quantum efficiency is due to losses in the medium. For the sample estimate in Table 4 and Figure 5 we show approximate density of key species in the absorption chain, estimated on a steady-state approximation of the losses in a KrF laser for an excitation rate of order 1 MW/cc, typical of fast discharge lasers. The net

small signal gain is of the order of 5%/cm. The key absorbers are the parent halogen molecule, F2, the fluoride ion F(2), and the rare gas excited state atoms. A detailed pulsed kinetics code will provide slightly different results due to transient kinetic effects and the intimate coupling of laser extraction to quenching and absorber dynamics. The ratio of gain to loss is usually in the range of 7 to 20, depending on the actual mixture used and the rare gas halide wavelength. The corresponding extraction efficiency is then limited to the range of ,50%. The small signal gain, go ; is related to the power deposition

428 LASERS / Excimer Lasers

Table 4

Typical ground and excited state densities shown for KrF at Pin , 1 MW/cm3

Species

Amount particles/cm3

Absorption s (cm)2

Loss %/cm

F2 Kr Buffer F(2) Rgp, Rgpp

,1.2 £ 1017 ,3 £ 1018 4 £ 1019 ,1014 1 £ 1015

0.18 – – 0.05 0.1

KrF (before dissociating) Rg2Fp

,1012 8 £ 1014, Note this species density decreases as flux grows 2.5 £ 1014

1.5 £ 10220 – – 5 £ 10218 k10218l depends on ratio of Rgpp /Rgp 2 £ 10216 1 £ 10218 2 £ 10216

5 (gain)

KrFp (B)

0.02 0.08

Figure 5 The major decay paths and absorbers that impact the extraction of excited states of rare gas halide excimers. Formation is via the ion –ion recombination, excited atom reaction with halogens and displacement reactions, not shown. We note for a specific mixture of KrF the approximate values of the absorption by the transient and other species along with the decay rates for the major quenching processes. Extraction efficiency greater than 50% is quite rare.

rate per unit volume, Pdeposited , by eqn [10]. Pdeposited ¼ g0 Ep ðtupper hpumping hbranching f sÞ21

½10

Ep is the energy per excitation, ,20 eV or 3.2 £ 10218 J/excitation, f is the fraction of excited molecules in the upper laser level, ,40% due to KrFp in higher vibrational states and the slowly radiating ‘C’ state. The laser cross-section is s, and the h terms are efficiencies for getting from the primary excitation down to the upper laser level. The upper state lifetime, tupper , is the sum of the inverses of radiative and quenching lifetimes. When the laser cavity flux is sufficiently high, stimulated emission decreases the flow of power into the quenching processes. The example shown here gives a ratio of net gain to the nonsaturating component of loss of 13/1. Of the

excitations that become KrFp, only ,55% or so will be extracted in the best of cases. For other less ideal rare gas halide lasers, the gain to loss ratio, the extraction efficiency, and the intrinsic efficiency are all lower. Practical discharge lasers have lower practical wall plug efficiency due to a variety of factors, discussed below.

Discharge Technology Practical discharge lasers use fast-pulse discharge excitation. In this scheme, an outgrowth of the CO2 TEA laser, the gas is subjected to a rapid high-voltage pulse from a low-inductance pulse forming line or network, PFL or PFN. The rapid pulse serves to break down the gas, rendering it conductive with an electron density of . ,1014 electrons/cm3. As the

LASERS / Excimer Lasers 429

gas becomes conductive, the voltage drops and power is coupled into the medium for a short duration, typically less than 50 nsec. The discharge laser also requires some form of ‘pre-ionization’, usually provided by sparks or a corona-style discharge on the side of the discharge electrodes. The overall discharge process is intrinsically unstable, and pump pulse duration needs to be limited to avoid the discharge becoming an arc. Moreover, problems with impedance matching of the PFN or PFL with the conductive gas lead to both inefficiency and extra energy that can go into postdischarge arcs, which both cause component failure and produce metal dust which can give rise to window losses. A typical discharge laser has an aperture of a few cm2 and a gain length of order 80 cm and produces outputs in the range of 100 mJ to 1 J. The optical pulse duration tends to be ,20 nsec, though the electrical pulse is longer due to the finite time needed for the gain to build up and the stimulated emission process to turn spontaneously emitted photons into photons in the laser cavity. The pre-ionizer provides an initial electron density on the order of 108 electrons/cm3. With spark preionization or corona pre-ionization, it is difficult to make a uniform discharge over an aperture greater than a few cm2. For larger apertures, one uses X-rays for pre-ionization. X-ray pre-ionized discharge excimer lasers have been scaled to energy well over 10 J per pulse and can yield pulse durations over 100 nsec. The complexity and expense of the X-ray approach, along with the lack of market for lasers in this size range, resulted in the X-ray approach remaining in the laboratory. Aside from the kinetic issues of the finite time needed for excitation to channel into the excimer upper levels and the time needed for the laser to ‘start up’ from fluorescent photons, these lasers have reduced efficiency due to poor matching of the pump source to the discharge. Ideally, one would like to have a discharge such that when it is conductive the voltage needed to sustain the discharge is less than half of that needed to break the gas down. Regrettably for the rare gas halides (and in marked distinction to the CO2 TEA laser) the voltage that the discharge sustains in a quasi-stable mode is ,20% of the breakdown voltage, perhaps even less, depending on the specific gases and electrode shapes. Novel circuits, which separate the breakdown phase from the fully conductive phase, can readily double the efficiency of an excimer discharge laser, though this approach is not necessary in many applications. The extra energy that is not dissipated in the useful laser discharge often results in arcing or ‘hot’ discharge areas after the main laser pulse,

leading to electrode sputtering, dust that finds its way to windows, and the need to service the laser after ,108 pulses. When one considers the efficiency of the power conditioning system, along with the finite kinetics of the laser start-up process in short pulses, and the mismatch of the laser impedance to the PFN/PFL impedance, lasers that may have 10% intrinsic efficiency in the medium in e-beam excitation may only have 2% to 3% wall plug efficiency in a practical discharge laser. A key aspect of excimer technology is the need for fast electrical circuits to drive the power into the discharge. This puts a significant strain on the switch that is used to start the process. While spark gaps, or multichannel spark gaps, can handle the required current and current rate of rise, they do not present a practical alternative for high-pulse rate operation. Thyratrons and other switching systems, such as solid state switches, are more desirable than a spark gap, but do not offer the desired needed current rate of rise. To provide the pumping rate of order 1 MW/cm3 needed for enough gain to turn the laser on before the discharge becomes unstable, the current needs to rise to a value of the order of 20 kA in a time of ,20 nsec; the current rate of rise is ,1012 A/sec. This rate of rise will limit the lifetime of the switch that drives the power into the laser gas. As a response to this need, magnetic switching and magnetic compression circuits were developed so that the lifetime of the switch can be radically increased. In the simplest cases, one stage of magnetic compression is used to make the current rate of rise to be within the range of a conventional thyratron switch as used in radar systems. Improved thyratrons and a simple magnetic assist also work. For truly long operating lifetimes, multiple stages of magnetic switching and compression are used, and with a transformer one can use a solid state diode as the start switch for the discharge circuit. A schematic of an excimer pulsed power circuit, that uses a magnetic assist with a thyratron and one stage of magnetic compression, is shown in Figure 6. The magnetic switching approach can also be used to provide two separate circuits to gain high efficiency by having one circuit to break down the gas (the spiker) and a second, low-impedance ‘sustainer’ circuit, to provide the main power flow. In this approach, the laser itself can act as part of the circuit by holding off the sustaining voltage for a m-sec or so during the charge cycle and before the spiker breaks down the gas. A schematic of such a circuit is shown in Figure 7. Using this type of circuit, the efficiency of a XeCl laser can be over 4% relative to the wall plug. A wide variety of technological twists on this approach have been researched.

430 LASERS / Excimer Lasers

Figure 6 One of the variants on magnetic switching circuits that have been used to enhance the lifetime of the primary start switch. A magnetic switch in the form of a simple 1 turn inductor with a saturable core magnetic material such as a ferrite allows the thyratron to turn on before major current flows. Other magnetic switches may be used in addition to peak up the voltage rate of rise on the laser head or to provide significant compression. For example one may have the major current flow in the thyratron taking place on the 500 nsec time scale while the voltage pulse on the discharge is in the 50 nsec duration. A small current leaks through and is accommodated by the charging inductor.

Figure 7 For the ultimate in efficiency the circuit can be arranged to provide a leading edge spike that breaks the gas down using a high impedance ‘spiker’ and a separate circuit that is matched to the discharge impedance when it is fully conductive. High coupling efficiency reduces waste energy that can erode circuit and laser head components via late pulse arcs. The figure shows a magnetic switch as the isolation but the scheme can be implemented with a second laser head as the isolation switch, a rail gap (not appropriate for long life) or a diode at low voltage.

Excimer lasers present some clear differences in resonator design compared to other lasers. The apertures are relatively large, so one does not build a TEMoo style cavity for good beam quality and expect to extract any major portion of the energy available. The typical output is divergent and multimode. When excimer lasers are excited by a short pulse discharge, ,30 nsec, the resulting gain duration is also short, ,20 nsec, limiting the number of passes a photon in a resonator can have in the gain. The gain duration of ,20 nsec only provides ,3.5 round trips of a photon in the cavity during the laser lifetime, resulting in the typical high-order multimode beam. Even with an unstable resonator, the gain duration is

not long enough to collapse the beam into the lowestorder diffraction-limited output. If line narrowing is needed, and reasonable efficiency is required, an oscillator amplifier configuration is often used with the oscillator having an appropriate tuning resonator in the cavity. If both narrow spectral linewidth and excellent beam quality are needed one uses a seeded oscillator approach where the narrowband oscillator is used to seed an unstable resonator on the main amplifier. By injecting the seed into the unstable resonator cavity, a few nsec before the gain is turned on, the output of the seeded resonator can be locked in wavelength, bandwidth, and provide good beam quality. Long pulse excimer lasers, such as e-beam excited lasers or long pulse X-ray pre-ionized discharge devices can use conventional unstable resonator technology with good results. Excimer lasers are gas lasers that run at both high pressure and high instantaneous power deposition rates. During a discharge pulse, much of the halogen bearing ‘fuel’ is burned out by the attachment and reactive processes. The large energy deposition per pulse means that pressure waves are generated. All of these combine with the characteristic device dimensions to require gas flow to recharge the laser mixture between pulses. For low pulse rates, less than ,200 Hz, the flow system need not be very sophisticated. One simply needs to flush the gas from the discharge region with a flow having a velocity of the order of 5 m/sec for a typical discharge device. The flow is transverse to the optical and discharge directions. At high pulse rates, the laser designer needs to consider flow in much more detail to minimize the power required to push the gas through the laser head. Circulating pressure waves need to be damped out so that the density in the discharge region is controlled at the time of the next pulse. Devices called acoustic dampers are placed in the sides of the flow loop to remove pressure pulses. A final subtlety occurs in providing gas flow over the windows. The discharge, post pulse arcs, and reactions of the halogen source with the metal walls and impurities creates dust. When the dust coats the windows, the losses in the cavity increase, lowering efficiency. By providing a simple gas flow over the window, the dust problem can be ameliorated. For truly long life, highreliability lasers, one needs to take special care in the selection of materials for electrodes and insulators and avoid contamination, by assembling in clean environments.

See also Lasers: Carbon Dioxide Lasers. Laser-Induced Damage of Optical Materials. Nonlinear Optics,

LASERS / Free Electron Lasers 431

Applications: Pulse Compression via Nonlinear Optics; Raman Lasers. Scattering: Raman Scattering.

Further Reading Ballanti S, Di Lazzaro P, Flora F, et al. (1998) Ianus the 3-electrode laser. Applied Physics B 66: 401 – 406. Brau CA (1978) Rare gas halogen lasers. In: Rhodes CK (ed.) Excimer Lasers, Topics of Applied Physics, vol. 30. Berlin: Springer Verlag. Brau CA and Ewing JJ (1975) Emission spectra of XeBr, XeCl, XeF, and KrF. Journal of Chemical Physics 63: 4640– 4647. Ewing JJ (2000) Excimer laser technology development. IEEE Journal of Selected Topics in Quantum Electronics 6(6): 1061–1071.

Levatter J and Lin SC (1980) Necessary conditions for the homogeneous formation of pulsed avalanche discharges at high pressure. Journal of Applied Physics 51: 210 – 222. Long WH, Plummer MJ and Stappaerts E (1983) Efficient discharge pumping of an XeCl laser using a high voltage prepulse. Applied Physics Letters 43: 735– 737. Rhodes CK (ed.) (1979) Excimer Lasers, Topics of Applied Physics, vol. 30. Berlin: Springer-Verlag. Rokni M, Mangano J, Jacob J and Hsia J (1978) Rare gas fluoride lasers. IEEE Journal of Quantum Electronics QE-14: 464 – 481. Smilansksi I, Byron S and Burkes T (1982) Electrical excitation of an XeCl laser using magnetic pulse compression. Applied Physics Letters 40: 547 – 548. Taylor RS and Leopold KE (1994) Magnetic-spiker excitation of gas discharge lasers. Applied Physics B, Lasers and Optics 59: 479 – 509.

Free Electron Lasers A Gover, Tel-Aviv University, Tel-Aviv, Israel q 2005, Elsevier Ltd. All Rights Reserved.

Introduction The Free Electron Laser (FEL) is an exceptional kind of laser. Its active medium is not matter, but charged particles (electrons) accelerated to high energies, passing in vacuum through a periodic undulating magnetic field. This distinction is the main reason for the exceptional properties of FEL: operating at a wide range of wavelengths – from mm-wave to X-rays with tunability, high power, and high efficiency. In this article we explain the physical principles of FEL operation, the underlying theory and technology of the device and various operating schemes, which have been developed to enhance performance of this device. The term ‘Free Electron Laser’ was coined by John Madey in 1971, pointing out that the radiative transitions of the electrons in this device are between free space (more correctly – unbound) electron quantum states, which are therefore states of continuous energy. This is in contrast to conventional atomic and molecular lasers, in which the electron performs radiative transition between bound (and therefore of distinct energy) quantum states. Based on these theoretical observations, Madey and his colleagues in Stanford University demonstrated FEL operation first as an amplifier (at l ¼ 10:6 mm) in 1976, and subsequently as an oscillator (at l ¼ 3:4 mm) in 1980.

From the historical point of view, it turned out that Madey’s invention was essentially an extension of a former invention in the field of microwave-tubes technology – the Ubitron. The Ubitron, a mm-wave electron tube amplifier based on a magnetic undulator, was invented and developed by Philips and Enderbry who operated it at high power levels in 1960. The early Ubitron development activity was not noticed by the FEL developers because of the disciplinary gap, and largely because its research was classified at the time. Renewed interest in high-power mm-wave radiation emission started in the 1970s, triggered by the development of pulsed-line generators of ‘Intense Relativistic Beams’ (IRB). This activity, led primarily by plasma physicists in the defense establishment laboratories of Russia (mostly IAP in Gorky – Nizhny Novgorod) and the US (mostly N.R.L. – DC) led to development of highgain high-power mm-wave sources independently of the development of the optical FEL. The connection between these devices and between them to conventional microwave tubes (as Traveling Wave Tubes – TWT) and other electron beam radiation schemes, like Cenenkov and Smith-Purcell radiation that may also be considered FELs, was revealed in the mid-1970s, starting with the theoretical works of P. Spangle, A. Gover and A. Yariv who identified that all these devices satisfy the same dispersion equation as the TWT derived by John Pierce in the 1940s. Thus, the optical FEL could be conceived as a kind of immense electron tube, operating with a highenergy electron beam in the low gain regime of the Pierce TWT dispersion equation.

432 LASERS / Free Electron Lasers

The extension of the low-gain FEL theory to the general ‘electron-tube’ theory is important because it led to development of new radiation schemes and new operating regimes of the optical FEL. This was exploited by physicists in the discipline of accelerator physics and synchrotron radiation, who identified, starting with the theoretical works of C. Pellegrini and R. Bonifacio in the early 1980s, that high-current, high-quality electron beams, attainable with further development of accelerators technology, could make it possible to operate FELs in the high-gain regime, even at short wavelengths (vacuum ultra-violet – VUV and soft X-ray) and that the high-gain FEL theory can be extended to include amplification of the incoherent synchrotron spontaneous emission (shot noise) emitted by the electrons in the undulator. These led to the important development of the ‘self (synchrotron) amplified spontaneous emission (SASE) FEL’, which promised to be an extremely high brightness radiation source, overcoming the fundamental obstacles of X-ray lasers development: lack of mirrors (for oscillators) and lack of high brightness radiation sources (for amplifiers). A big boost to the development of FEL technology was given during the period of the American ‘strategic defense initiative – SDI’ (Star-Wars) program in the mid-1980s. The FEL was considered one of the main candidates for use in a ground-based or space-based ‘directed energy weapon – DEW’, that can deliver megawatts of optical power to hit attacking missiles. The program led to heavy involvement of major American defense establishment laboratories (Lawrence – Livermore National Lab, Los-Alamos National Lab) and contracting companies

(TRW, Boeing). Some of the outstanding results of this effort were demonstration of the high-gain operation of an FEL amplifier in the mm-wavelength regime, utilizing an Induction Linac (Livermore, 1985), and demonstration of enhanced radiative energy extraction efficiency in FEL oscillator, using a ‘tapered wiggler’ in an RF-Linac driven FEL oscillator (Los-Alamos, 1983). The program has not been successful in demonstrating the potential of FELs to operate at the high average power levels needed for DEW applications. But after the cold-war period, a small part of the program continues to support research and development of medical FEL application.

Principles of FEL Operation Figure 1 displays schematically an FEL oscillator. It is composed of three main parts: an electron accelerator, a magnetic wiggler (or undulator), and an optical resonator. Without the mirrors, the system is simply a synchrotron undulator radiation source. The electrons in the injected beam oscillate transversely to their propagation direction z, because of the transverse magnetic Lorenz force: F’ ¼ 2evz e^ z £ B’

½1

In a planar (linear) wiggler, the magnetic field on axis is approximately sinusoidal: B’ ¼ Bw e^ y cos kw z

½2

In a helical wiggler: B’ ¼ Bw ð^ey cos kw z þ e^ x sin kw zÞ

½3

Figure 1 Components of a FEL-oscillator. (Reproduced from Benson SV (2003) Free electron lasers push into new frontiers. Optics and Photonics News 14: 20–25. Illustration by Jaynie Martz.)

LASERS / Free Electron Lasers 433

In either case, if we assume constant (for the planar wiggler – only on the average) axial velocity, then z ¼ vz t: The frequency of the transverse force and the mechanical oscillation of the electrons, as viewed transversely in the laboratory frame of reference, is: v v0s ¼ kw vz ¼ 2p z ½4 lw where lw ¼ 2p=kw is the wiggler period. The oscillating charge emits an electromagentic radiation wavepacket. In a reference frame moving with the electrons, the angular radiation pattern looks exactly like dipole radiation, monochromatic in all directions (except for the frequency-line-broadening due to the finite oscillation time, i.e., the wiggler transit time). In the laboratory referenceframe the radiation pattern concentrates in the propagation ðþzÞ direction, and the Doppler upshifted radiation frequency depends on the observation angle Q relative to the z-axis: v0s ½5 v0 ¼ 1 2 bz cos Q On axis (Q ¼ 0), the radiation frequency is:

v0 ¼

ckw bz ¼ ð1 þ bz Þbz gz2 ckw ø 2gz2 ckw 1 2 bz

½6

where bz ; vz =c, gz ; ð1 2 b2z Þ21=2 are the axial (average) velocity and the axial Lorenz factor, respectively, and the last part of the equation is valid only in the (common) highly relativistic limit gz q 1: Using the relations b2z þ b2’ ¼ b2 , b’ ¼ aw =g, one can express gz : g ½7 gz ¼ 1 þ a2w =2 (this is for a linear wiggler, in the case of a helical wiggler the denoninator is 1 þ a2w ): 1 g ; ð1 2 b2 Þ21=2 ¼ 1 þ k2 mc  ½8 ¼ 1 þ 1k ½MeV 0:511 and aw – (also termed K) ‘the wiggler parameter’ is the normalized transverse momentum: aw ¼

eBw ¼ 0:093Bw ½KGausslw ½cm kw mc

½9

Typical values of Bw in FEL wigglers (undulators) are of the order of Kgauss’, and lw of the order of CMs, and consequently aw , 1: Considering that electron beam accelerator energies are in the range of MeV to GeV, one can appreciate from eqns [6] – [8], that a significant relativistic Doppler shift factor 2g 2z , in the range of tens to millions, is possible. It, therefore,

provides incoherent synchrotron undulator radiation in the frequency range of microwave to hard X-rays. Synchrotron undulator radiation was studied in 1951 and since then has been a common source of VUV radiation in synchrotron facilities. From the point of view of laser physics theory, this radiation can be viewed as ‘spontaneous synchrotron radiation emission’ in analogy to spontaneous radiation emission by electrons excited to higher bound-electron quantum levels in atoms or molecules. Alternatively, it can be regarded as the classical shot noise radiation, associated with the current fluctuations of the randomly injected discrete charges comprising the electron beam. Evidently this radiation is incoherent, and the fields it produces average in time to zero, because the wavepackets emitted by the randomly injected electrons interfere at the observation point with random phase. However, their energies sum up and can produce substantial power. Based on fundamental quantum-electrodynamical principles or Einstein’s relations, one would expect that any spontaneous emission scheme can be stimulated. This principle lies behind the concept of the FEL, which is nothing but stimulated undulator synchrotron radiation. By stimulating the electron beam to emit radiation, it is possible, as with any laser, to generate a coherent radiation wave and extract more power from the gain medium, which in this case is an electron beam, that carries an immense amount of power. There are two kinds of laser schemes which utilize stimulation of synchrotron undulator radiation: (i) A laser amplifier. In this case the mirrors in the schematic configuration of Figure 1 are not present, and an external radiation wave at frequencies within the emission range of the undulator is injected at the wiggler entrance. This requires, of course, an appropriate radiation source to be amplified and availability of sufficiently high gain in the FEL amplifier. (ii) A laser oscillator. In this case an open cavity (as shown in Figure 1) or another (waveguide) cavity is included in the FEL configuration. As in any laser, the FEL oscillator starts building up its radiation from the spontaneous (synchrotron undulator) radiation which gets trapped in the resonator and amplified by stimulated emission along the wiggler. If the threshold condition is satisfied (having single path gain higher than the round trip losses), the oscillator arrives to saturation and steady state coherent operation after a short transient period of oscillation build-up.

434 LASERS / Free Electron Lasers

Because the FEL can operate as a high-gain amplifier (with a long enough wiggler and an electron beam of high current and high quality), also a third mode of operation exists: self amplified spontaneous emission (SASE). In this case, the resonator mirrors in Figure 1 are not present and the undulator radiation generated spontaneously in the first sections of the long undulator is amplified along the wiggler and emitted at the wiggler exit at high power and high spatial coherence. The Quantum-Theory Picture

A free electron, propagating in unlimited free space, can never emit a single photon. This can be proven by examining the conservation of energy and momentum conditions: 1 k i 2 1 k f ¼ "v

½10

ki 2 kf ¼ q

½11

that must be satisfied, when an electron in an initial free-space energy and momentum state ð1ki ; "ki Þ makes a transition to a final state ð1kf ; "kf Þ, emitting a single photon of energy and momentum ð"v; qÞ: In free space: qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1k ¼ c ð"kÞ2 þ ðmc2 Þ2 ½12 q¼

v e^ c q

½13

and eqns [10] – [13] have only one solution, v ¼ 0, q ¼ 0. This observation is illustrated graphically in the energy –momentum diagram of Figure 2a in the framework of a one-dimensional model. It appears that if both eqns [10] and [11] can be satisfied, then the phase velocity of the emitted radiation wave vph ¼ v=q (the slope of the chord) will equal the electron wavepacket group velocity vg ¼ vz at some intermediate point kp ¼ pp =":  v 1ki 2 1kf ›1  ¼ ¼ vph ¼ ½14  ¼ vg q "ðki 2 kf Þ › p  pp For a radiation wave in free space (eqn [13]), this results in c ¼ vg , which contradicts special relativity. The reason for the failure to conserve both energy and momentum in the transition is that the photon momentum "q it too small to absorb the large momentum shift of the electron, as it recoils while releasing radiative energy "v: This observation leads to ideas on how to make a radiative transition possible: (i) Limit the interaction length. If the interaction length is L, the momentum conservation

Figure 2 Conservation of energy and momentum in forward photon emission of a free electron: (a) The slope of the tangent to the curve at intermediate point k p, ›1k(k p)/›k may be equal to the slope of the chord "v/q which is impossible in free space. (b) electron radiative transition made possible with an electromagnetic pump (Compton Scattering). (c) The wiggler wavenumber – kw conserves the momentum in electron radiative transition of FEL.

LASERS / Free Electron Lasers 435

condition in eqn [11] must be satisfied only within an uncertainty range ^p=L: This makes it possible to obtain radiative emission in free electron radiation effects like ‘Transition Radiation’ and in microwave tubes like the Klystron. (ii) Propagate the radiation wave in a ‘slow wave’ structure, where the phase velocity of the radiation wave is smaller than the speed of light, and satisfaction of eqn [14] is possible. For example, in the Cerenkov effect, charged particles pass through a medium (gas) with index of refraction n . 1: Instead of eqn [13], q ¼ nðvÞðv=cÞ^eq , and consequently qz ¼ nðvÞðv=cÞcos Qq , where we assume radiative emission at an angle Qq relative to the electron propagation axis z: Substitution in eqn [14] results in the Cerenkov radiation condition vg nðvÞcos Qq ¼ 1: Another example for radiation emission in a slow-wave structure is the Traveling Wave Tube (TWT). In this device, a periodic waveguide of periodicity lw permits (via the Floquet theorem) propagation of slow partial waves (space harmonics) with increased wavenumber qz þ mkw (m ¼ 1, 2, …), and again eqn [14] can be satisfied. (iii) Rely on a ‘two-photon’ radiative transition. This can be ‘real photon’ Compton scattering of an intense radiation beam (electromagnetic pump) off an electron beam, or ‘virtual photon’ scattering of a static potential, as is the case in bremsstrahlung radiation and in synchrotron – undulator radiation. The latter radiation scheme may be considered as a ‘magnetic brehmsstrahlung’ effect or as ‘zero frequency pump’ Compton scattering, in which the wiggler contributes only ‘crystal momentum’ "kw , to help satisfy the momentum conservation condition in eqn [11]. The Compton scattering scheme is described schematically for the one-dimensional (back scattering) case in Figure 3, and its conservation of energy and momentum diagram is depicted in Figure 2b (a ‘real photon’ (vw , kw ) free-space pump wave is assumed with kw ¼ vw =c). The analogous diagram of a static wiggler ðvw ¼ 0, kw ¼ 2p=lw Þ is shown in Figure 2c. It is worth noting that the effect of the incident scattered wave or the wiggler is not necessarily a small perturbation. It may modify substantially the electron energy-dispersion diagram of the free electron and a more complete ‘Brillouin diagram’ should be used in Figure 2c. In this sense, the wiggler may be viewed as the analogue of a one-dimensional crystal, and its period lw analogous to the crystal lattice constant. The momentum conservation during a radiation

Figure 3 The scheme of backward scattering of an electromagnetic wave off an electron beam (Doppler shifted Compton scattering).

transition, with the aid of the wiggler ‘crystal momentum’ "kw is quite analogous to the occurrence of vertical radiative transitions in direct bandgap semiconductors, and thus the FEL has, curiously enough, some analogy to microscopic semiconductor lasers. All the e-beam radiation schemes already mentioned can be turned into stimulated emission devices, and thus may be termed ‘free electron lasers’ in the wide sense. The theory of all of these devices is closely related, but most of the technological development was carried out on undulator radiation (Magnetic brehmsstrahlung) FELs, and the term FEL is usually reserved for this kind (though some developments of Cerenkov and Smith –Purcell FELs are still carried out). When considering a stimulated emission device, namely enhanced generation of radiation in the presence of an external input radiation wave, one should be aware, that in addition to the emission process described by eqns [10] and [11] and made possible by one of the radiation schemes described above, there is also a stimulated absorption process. Also, this electronic transition process is governed by the conservation of energy and momentum conditions, and is described by eqns [10] and [11] with ki and kf exchanged. Focusing on undulator-radiation FEL and assuming momentum conservation in the axial ðzÞ dimension by means of the wiggler wavenumber kw , the emission and absorption quantum transition levels and radiation frequencies are found from the solution of equations: 1kzi 2 1kzf ¼ "ve kzi 2 kzf ¼ qze þ kw 1kza 2 1kzi ¼ "va kza 2 kzi ¼ qza þ kw

½15a ½15b ½16a ½16b

For fixed kw , fixed transverse momentum and given e-beam energy 1kzi and radiation emission angle Qq ðqz ¼ ðv=cÞcos Qq Þ, eqns [15] and [16] have separately

436 LASERS / Free Electron Lasers

Figure 4 The figure illustrates that the origin of difference between the emission and absorption frequencies is the curvature of the energy dispersion line, and the origin of the homogeneous line broadening is momentum conservation uncertainty ^p=L in a finite interaction length. (Reproduced with permission from Friedman A, Gover A, Ruschin S, Kurizki G and Yariv A (1988) Spontaneous and stimulated emission from quasi-free electrons. Reviews of Modern Physics 60: 471–535. Copyright (1988) by the American Physical Society.)

distinct solutions, defining the electron upper and lower quantum levels for radiative emission and absorption respectively. The graphical solutions of these two set of equations are shown in Figure 4, which depicts also the ‘homogeneous’ frequency-line broadening "Dve , "Dva of the emission and absorption lines due to the uncertainty in the momentum conservation ^p=L in a finite interaction length. In the quantum limit of a cold (monoenergetic) e-beam and a long interaction length L, the absorption line center va is larger than the emission line center ve , and the linewidths Dve ø Dva ¼ DvL are narrower than the emission and absorption lines spacing va 2 ve , as shown in Figure 5a. The FEL then behaves as a 3-level quantum system, with electrons occupying only the central level, and the upper level is spaced apart from it more than the lower level (Figure 4). In the classical limit " ! 0, one can Taylor-expand 1kz around kzi : Using:

Figure 5 Net gain emission/absorption frequency lines of FEL: (a) in the quantum limit: va 2 ve q DvL , (b) in the classical limit: va 2 ve p DvL :

one obtains:

ve ø va ø v0 ¼ vz ðqz0 þ kw Þ vz ¼

1 ›1kz ; " ›kz

1

g 2z g m

2

¼

1 › 1kz "2 ›k2z

½17

which for qz0 ¼ ðv=cÞcos Qq reproduces the classical synchronism condition in eqn [5]. The homogeneous

LASERS / Free Electron Lasers 437

broadening linewidth is found to be: DvL 1 ¼ Nw v0

magnetic field in eqn [2] (Figure 1): ½18

where Nw ¼ L=lw is the number of wiggler periods. The classical limit condition requires that the difference between the emission and absorption line centers will be smaller than their width. This is expressed in terms of the ‘recoil parameter 1’ (for Qq ¼ 0): 1¼

va 2 ve 1 þ b "v0 ¼ Nw p 1 DvL b2z g mc2

½19

This condition is satisfied in all practical cases of realizable FELs. When this happens, the homogeneous line broadening dominates over the quantum-recoil effect, and the emission and absorption lines are nearly degenerate (Figure 5b). The total quantum-electrodynamic photonic emission rate expression:   dnq ¼ Gsp ðnq þ 1ÞFðv 2 ve Þ 2 nq Fðv 2 va Þ dt

½20

reduces then into: dnq d ¼ nq Gsp 1DvL Fðv 2 ve Þ dv dt þ Gsp Fðv 2 v0 Þ

½21

vx ¼ vw cos kw ze ðtÞ

½22

x ¼ xw sin kw ze ðtÞ

½23

where vw ¼ caw =g and xw ¼ vw =ðvz kw Þ: An electromagnetic wave Ex ðz; tÞ ¼ E0 cosðvt2kz zÞ propagates collinearly with the electron. Figure 6 displays the electron and wave ‘snap-shot’ positions as they propagate along one wiggler period lw : If the electron, moving at average axial velocity vz, enters the interaction region z ¼ 0 at t ¼ 0, its axial position is ze ðtÞ ¼ vz t, and the electric force it experiences is 2eEx ðze ðtÞ; tÞ ¼ 2eE0 cosðv 2 kz vz Þt: Clearly, this force is (at least initially at t ¼ 0) opposite to the transverse velocity of the electron vx ¼ vw cosðkw vz Þt (imply in deceleration) and the power exchange rate 2eve ·E ¼ 2evx Ex corresponds to transfer of energy into the radiation field on account of the electron kinetic energy. Because the phase velocity of the radiation mode is larger than the electron velocity, vph ¼ v=kz . vz , the electron phase we ¼ ðv 2 kz vz Þt grows, and the power exchange rate 2evx Ex changes. However, if one synchronizes the electron velocity, so that while the electron traverses one wiggler period ðt ¼ lw =vz Þ, the electron phase advances by 2p : ðv 2 kz vz Þ·lw =vz ¼ 2p, then the power exchange rate from the electron to the wave remains non-negative through the entire interaction length, because then the electron transverse velocity

Here nq is the number of photons in radiation mode q, Gsp – the spontaneous emission rate, and Fðv 2 v0 Þ is the emission (absorption) lineshape function. Figure 5b depicts the transition of the net radiative emission/absorption rate into a gain curve which is proportional to the derivative of the spontaneous emission lineshape function (first term in eqn [21]). Equation [21] presents a fundamental relation between the spontaneous and stimulated emission of FELs, which was observed first by John Madey (Madey’s theorem). It can be viewed as an extension of Einstein’s relations to a classical radiation source. The Classical Picture

The spontaneous emission process of FEL (synchrotron undulator radiation) is nothing but dipole radiation of the undulating electrons, which in the laboratory frame of reference is Doppler shifted to high frequency. The understanding of the stimulated emission process requires a different approach. Consider a single electron, following a sinusoidal trajectory under the effect of a planar undulator

Figure 6 ‘Snapshots’ of an electromagnetic wave period slipping relative to an undulating electron along one wiggler period lw : The energy transfer to the wave – ev·E remains non-negative all along.

438 LASERS / Free Electron Lasers

vx and the wave electric field Ex reverse sign, at each period exactly at the same points ðlw =4; 3lw =4Þ: This situation is depicted in Figure 6, which shows the slippage of the wave crests relative to the electron at five points along one wiggler period. The figure describes the synchronism condition, in which the radiation wave slips one optical period ahead of the electron, while the electron goes through one wiggle motion. In all positions along this period, v·E $ 0 (in a helical wiggler and a circularly polarized wave this product is constant and positive v·E . 0 along the entire period). Substituting lw ¼ 2p=kw , this phase synchronism condition may be written as: v ¼ kz þ kw ½24 vz which is the same as eqns [17] and [5]. Figure 6 shows that a single electron (or a bunch of electrons of duration smaller than an optical period) would amplify a co-propagating radiation wave, along the entire wiggler, if it satisfies the synchronism condition in eqn [24] and enters the interaction region ðz ¼ 0Þ at the right (decelerating) phase relative to the radiation field. If the electron enters at the opposite phase it accelerates (on account of the radiation field energy which is then attenuated by ‘stimulated absorption’). Thus, when an electron beam is injected into a wiggler at the synchronism condition with electrons entering at random times, no net amplification or absorption of the wave is expected on the averages. Hence, some more elaboration is required, in order to understand how stimulated emission gain is possible then. Before proceeding, it is useful to define the ‘pondermotive force’ wave. This force originates from the nonlinearity of the Lorenz force equation: d ðg mvÞ ¼ 2eðE £ v £ BÞ dt

½25

At zero order (in terms of the radiation fields), the only field force on the right-hand side of eqn [25] is due to the strong wiggler field (eqns [2] and [3]), which results in the transverse wiggling velocity (eqn [22] for a linear wiggler). When solving next eqn [25] to first order in terms of the radiation fields:   Es ðr; tÞ ¼ Re E~ s eiðkz z2vtÞ   ~ s eiðkz z2vtÞ Bs ðr; tÞ ¼ Re B

½26

the cross product v £ B between the transverse components of the velocity and the magnetic field generates a longitudinal force component:   Fpm ðz; tÞ ¼ Re e^ z F~ pm eiðkz þkw Þz2ivt

½27

that varies with the beat wavenumber ks þ kw at slow phase velocity ðvph ¼ v=ðkz þ kw Þ , cÞ: This slow force-wave is called the pomdermotive (PM) wave. Assuming the signal radiation wave in eqn [26] is polarization-matched to the wiggler (linearly polarized or circularly polarized for a linear or helical wiggler respectively), the PM force amplitude is given by:  ~ s’ law gbz lF~ pm l ¼ elE ½28 With large enough kw , it is always possible to slow down the phase velocity of the pondermotive wave until it is synchronized with the electron velocity: vph ¼

v ¼ vz kz þ kw

½29

and can apply along the interaction length a decelerating axial force, that will cause properly phased electrons to transfer energy to the wave on account of their longitudinal kinetic energy. This observation is of great importance. It reveals that even though the main components of the wiggler and radiation fields are transverse, the interaction is basically longitudinal. This puts the FEL on an equal footing with the slow-wave structure devices as the TWT and the Cerenkov – Smith – Purcell FELs, in which the longitudinal interaction takes place with the longitudinal electric field component of a slow TM radiation mode. The synchronism condition in eqn [29] between the pondermotive wave and the electron, which is identical with the phase-matching condition in eqn [24], is also similar to the synchronism condition between an electron and a slow electromagnetic wave (eqn [14]). Using the pondermotive wave concept, we can now explain the achievement of gain in the FEL with a random electron beam. Figure 7 illustrates the interaction between the pondermotive wave and electrons, distributed at the entrance ðz ¼ 0Þ randomly within the wave period. Figure 7a shows ‘snap-shots’ of the electrons in one period of the pondermotive wave lpm ¼ 2p=ðkz þ kw Þ at different points along the wiggler, when it is assumed that the electron beam is perfectly synchronous with the pondermotive wave vph ¼ v0 : As explained before, some electrons are slowed down, acquiring negative velocity increment Dv: However, for each such electron, there is another one, entering the wiggler at an accelerating phase of the wave, acquiring the same positive velocity increment Dv: There is then no net change in the energy of the e-beam or the wave, however, there is clearly an effect of ‘velocity-bunching’ (modulation), which turns along the wiggler into ‘density-bunching’ at the same

LASERS / Free Electron Lasers 439

Figure 7 ‘Snapshots’ of a pondermotive wave period interacting L with an initially uniformly distributed electron beam taking place respectively along the interaction length 0 , z , L b : (a) Exact synchronism in a uniform wiggler (bunching). (b) Energy bunching, density bunching and radiation are in the energy buncher, dispersive magnet Lb , z , Lb þ Ld and radiating wiggler Lb þ Ld , z , Lb þ Ld þ Lr sections of an Optical– Klystron. (c) Slippage from bunching phase to radiating phase at optimum detuning off synchronism in a uniform wiggler FEL.

frequency v and wavenumber kz þ kw as the modulating pondermotive wave. The degree of density-bunching depends on the amplitude of the wave and the interaction length L: In the nonlinear limit the counter propagating (in the beam reference frame) velocity modulated electrons may over-bunch namely cross over, and debunch again. Bunching is the principle of classical stimulated emission in electron beam radiation devices. If the e-beam had been prebunched in the first place, we would have injected it at a decelerating phase relative to the wave and obtained net radiation gain right away (super radiant emission). This is indeed the principle behind the ‘Optical-Klystron’ (OK) demonstrated in Figure 7b. The structure of the OK is described ahead in Figure 19. The electron beam is velocity (energy) modulated by an input electromagnetic radiation wave in the first ‘bunching-wiggler section’ of length Lb : It then passes through a drift-free ‘energydispersive magnet section’ (chicane) of length Ld , in which the velocity modulation turns into density bunching. The bunched electron beam is then injected back into a second ‘radiating-wiggler section’, where it co-propagates with the same electromagnetic wave but with a phase advance of p=2 2 m2p, m ¼ 1, 2, …

(spatial lag of lpm =4 2 mlpm in real space) which places the entire bunch at a decelerating phase relative to the PM-wave and so amplifies the radiation wave. The principle of stimulated-emission gain in FEL, illustrated in Figure 7c, is quite similar. Here the wiggler is uniform along the entire length L, and the displacement of the electron bunches into a decelerating phase position relative to the PM-wave is obtained by injecting the electron beam at a velocity vz0 , slightly higher than the wave vph (velocity detuning). The detailed calculation shows that detuning corresponding to a phase shift of DCðLÞ ¼   ðv=vz0 Þ 2 ðkz þ kw Þ L ¼ 22:6 (corresponding to spatial bunch advance of 0:4lpm along the wiggler length), provides sufficient synchronism with the PMwave in the first half of the wiggler to obtain bunching, and sufficient deceleration-phasing of the created bunches in the second part of the wiggler to obtain maximum gain.

Principles of FEL Theory The 3D radiation field in the interaction region can be expanded in general in terms of a complete set of

440 LASERS / Free Electron Lasers

free-space or waveguide modes {1q ðx; yÞHq ðx; yÞ}: hX i Eðr; tÞ ¼ Re ½30 cq ðzÞ1q ðx; yÞeiðkzq z2vtÞ

Thus, one should substitute in eqn [34]: ~ pm ðkz þ kw ; vÞ F~ z ðkz þ kw ; vÞ ¼ 2e½E ~ sc ðkz þ kw ; vÞ þE

q

The mode amplitudes Cq ðzÞ may grow along the wiggler interaction length 0 , z , L, according to the mode excitation equation: d 1 2ikzq ð ð Cq ðzÞ ¼ 2 e Jðx; y; zÞ·1pq ðx; yÞdx dy ½31 dz 4P ÐÐ 1q £ Hq ·^ez dx dy is the mode where P ¼ 2 12 Re normalization power, and J~ is the bunching current component at frequency v , that is phase matched to the radiation waves, and needs to be calculated consistently from the electron force equations. The FEL Small Signal Regime

We first present the basic formulation of FEL gain in the linear (small signal) regime, namely the amplified radiation field is assumed to be proportional to the input signal radiation field, and the beam energy loss is negligible. This is done in the framework of a one-dimensional (single transverse radiation mode) model. The electron beam charge density, current density, and velocity modulation are solved in the framework of a one-dimensional plasma equations model (kinetic or fluid equations). The longitudinal PM-force in eqn [27] modulates the electron beam velocity via the longitudinal part of the force eqn [25]. This brings about charge modulation rðz; tÞ ¼ Re½r~ðkz þ kw ; vÞeiðkz þkw Þz2ivt  and consequently, also longitudinal space-charge field E~ sc ðkzq þ kw ; vÞ and longitudinal current density modulation J~z ðkz þ kw ; vÞ, related through the Poison and continuity equations: iðkz þ kw ÞE~ sc ðkz þ kw ; vÞ ¼ r~ðkz þ kw ; vÞ=1

½32

ðkz þ kw ÞJ~z ðkz þ kw ; vÞ ¼ vr~ðkz þ kw ; vÞ

½33

Solving the force eqn [25] for a general longitudinal force Fz ðz; tÞ ¼ Re½F~ z ðkz ; vÞeiðkz z2vtÞ  results in a linear longitudinal current response relation: J~z ðkz ; vÞ ¼ 2ivxp ðkz ; vÞF~ z ðkz ; vÞ=ð2eÞ

½34

where xp ðkz ; vÞ is the longitudinal susceptibility of the electron beam ‘plasma’.The beam charge density in the FEL may be quite high, and consequently the space charge field E~ sc , arising from the Poison eqn [32], may not be negligible. One should take into consideration then that the total longitudinal force F~ z is composed of both the PM-force of eqn [27] and an ~ sc : arising longitudinal space-charge electric force – eE

½35

and solve it self-consistently with eqns [32] and [33] to obtain the ‘external-force’ response relation: J~z ðkz þ kw ; vÞ ¼

2ivxp ðkz þ kw ; vÞ ~ E ðk þ kw ; vÞ 1 þ xp ðkz þ kw ; vÞ=1 pm z

½36

where we defined the PM ‘field’: E~ pm ¼ F~ pm =ð2eÞ: In the framework of a single-mode interaction model, we keep in the summation of eqn [30] only one mode q (usually the fundamental mode, and in free space – a Gaussian mode). The transverse current density components in eqn [31] J~’ ¼ 12 r~ v~ w are found using eqns [22], [33], and [36]. Finally, ~ q eidkz (where dk ; kz 2 kqz substituting Cq ðzÞ ¼ C and kz ø kqz is the wavenumber of the radiation wave modified by the interaction with the electrons) results in the general FEL dispersion relation: ðkz 2 kzq Þb1 þ xp ðkz þ kw ; vÞ=1c ¼ kxp ðkz þ kw ; vÞ=10

½37

Equation [37] is a general expression, valid for a wide variety of FELs, including Cerenkov – Smith – Purcell and TWT. They differ only in the expression for k: For the conventional (magnetic wiggler) FEL:



1 Ae a2w v 2 A 4 Aem g2 b2z c JJ

½38

where Ae is the cross-section area of the electron  pffiffiffiffiffiffiffi beam, and Aem ; P q 12 10 =m0 l1q’ ð0; 0Þl2 is the effective area of the interacting radiation-mode q, and it is assumed that the electron beam, passing on axis ðxe ; ye Þ ¼ ð0; 0Þ, is narrow relative to the transverse mode variation Ae =Aem p 1: The ‘Besselfunctions coefficient’ AJJ is defined for a linear wiggler only, and is given by: " " # # a2w a2w AJJ ¼ J0 2 J1 ½39 2ða2w þ 2Þ 2ða2w þ 2Þ In a helical wiggler, AJJ ; 1: Usually aw p 1, and therefore AJJ ø 1: The Pierce Dispersion Equation

The longitudinal plasma response susceptibility function xp ðkz ; vÞ has been calculated, in any plasma formulation, including fluid model, kinetic model, or

LASERS / Free Electron Lasers 441

even quantum-mechanical theory. If the electron beam axial velocity spread is small enough (cold beam), then the fluid plasma equations can be used. The small signal longitudinal force equation derived from eqn [25], together with eqn [33] and the small signal current modulation expression: J~z ø r0 v~ z þ vz r~

½40

Alternatively stated, the exit amplitude of the electromagnetic mode can in general be expressed in terms of the initial conditions: Cq ðLÞ ¼ H E ðvÞCq ðv; 0Þ þ Hv ðvÞ~vq ðv; 0Þ þ H i ðvÞ~iðv; 0Þ where

result in:

v0p2 xp ðkz ; vÞ ¼ 2 1 ð v 2 kz v z Þ 2

H ð E;v;iÞ ðvÞ ¼

dkðdk 2 u 2 upr Þðdk 2 u þ upr Þ ¼ Q

½42

where dk ¼ kz 2 kzq , u is the detuning parameter (off the synchronism condition of eqn [24]):

u;

v 2 kzq 2 kw vz

½43

vp vz

½44

Q ¼ ku 2p

½45

up ¼

Here upr ¼ r pup, where rp , 1 is the plasma reduction factor. It results from the reduction of the ~ sp in a beam of longitudinal space-charge field E finite radius rb due to the fringe field effect (rp ! 1 when the beam is wide relative to the longitudinal modulation wavelength: rb q lpm ¼ 2p=ðkzq þ kw Þ). The cubic equation has, of course, three solutions dki (i ¼ 1, 2, 3), and the general solution for the radiation field amplitude and power is thus: 3 X

Aj eidkj z

½46

PðzÞ ¼ lCq ðzÞl2 P q

½47

Cq ðzÞ ¼

j¼1

The coefficients Ai can be determined from three initial conditions of the radiation and e-beam parameters Cq ð0Þ, v~ ð0Þ, ~ið0Þ, and can be given as a linear combination of them (here ~i ¼ Ae J~z is the longitudinal modulation current): Aj ¼ AEj ðvÞCvq ð0Þ þ Avj ðvÞ~vv ð0Þ þ Aij ðvÞ~iv ð0Þ

½48

3 X

Ajð E;v;iÞ ðvÞeidkj L

½50

j¼1

½41

where v0p ¼ ðe2 n0 =gg 2z 1mÞ1=2 is the longitudinal plasma frequency, n0 is the beam electrons density ðr0 ¼ 2en0 Þ, vz is the average axial velocity of the beam. In this ‘cold-beam’ limit, the FEL dispersion eqn [37] reduces into the well-known ‘cubic dispersion equation’ derived first by John Pierce in the late 1940s for the TWT:

½49

In the conventional FEL, electrons are injected in randomly, and there is no velocity prebunching ð~vðv; 0Þ ¼ 0Þ or current prebunching ð~iðv; 0Þ ¼ 0Þ (or ~ v; 0Þ ¼ 0). Consequently, Cq ðzÞ is equivalently nð proportional to Cq ð0Þ and one can define and calculate the FEL small-signal single-path gain parameter: GðvÞ ;

lCq ðv; LÞl2 PðLÞ ¼ ¼ lH E ðvÞl2 2 Pð0Þ lCq ðv; 0Þl

½51

The FEL Gain Regimes

At different physically meaningful operating regimes, some parameters in eqn [42] can be neglected relative to others, and simple analytic expressions can be found for dki , Ai , and consequently GðvÞ: It is convenient to normalize the FEL parameters to the  ¼ QL3 : An wiggler length: u ¼ uL, upr ¼ upr L, Q additional figure of merit parameter is the ‘thermal’ spread parameter: v v uth ¼ zth L vz vz

½52

where vzth is the axial velocity spread of the e-beam (in a Gaussian velocity pffiffi distribution model: f ðvz Þ ¼ exp½ðvz 2 vz0 Þ=vzth  p vzth ). The axial velocity spread can result out of beam energy spread or angular spread (finite ‘emittance’). It should be small enough, so that the general dispersion relation of eqn [37] reduces to eqn [42] (the practical ‘cold beam’ regime). Assuming now a conventional FEL ð~vz ð0Þ ¼ 0, ~ið0Þ ¼ 0Þ, the single path gain in eqn [51] can be calculated. We present next this gain expression in the different regimes. The maximum values of the gain expression in the different regimes are listed in Table 1. Low gain This is the regime where the differentialgain in a single path satisfies G 2 1 ¼ ½PðLÞ 2 Pð0Þ Pð0Þ p 1: It is not useful for FEL amplifiers but most FEL oscillators operate in this regime.

442 LASERS / Free Electron Lasers

Table 1

The gain regimes maximum gain expressions Gain regime

Parameters domain

Max. gain expression

I

Tenuous beam low-gain

 upr , uth , p Q,

P(L)  ¼ 1 þ 0:27Q P(0)

II

Collective low-gain

 Q upr . , uth , p 2

P(L)  upr ¼ 1 þ Q=2 P(0)

III

Collective high-gain

 . upr . Q  1=3 , uth , Q  .p Q=2

qffiffiffiffiffiffiffiffiffi P(L) 1  upr ) ¼ exp( 2Q= P(0) 4

IV

Strong coupling high-gain

 1=3 . upr , uth , Q  .p Q

pffiffi P(L) 1 ¼ exp( 3Q 1=3 ) P(0) 9

V

Warm beam

 1=3 ; p uth . upr , Q

P(L)  u 2th ) ¼ exp(3Q= P(0)

The three solutions of eqn [42] – namely the terms of eqn [46] – are reminiscent of the three eigenwaves of the uncoupled system ðk ¼ Q ¼ 0Þ : the radiation mode and the two plasma (space-charge) waves of the e-beam (the slow and fast waves, corresponding respectively to the forward and backward propagating plasma-waves in the beam rest reference-frame). In the low-gain regime, all three terms in eqn [46] are significant. Calculating them to first order in k, results in analytical gain expressions in the collective ðupr q pÞ and tenuous-beam ðupr p pÞ regimes (note that upr =2p ¼ f 0pr L=vz is the number of plasma oscillations within the wiggler transit time L=vz ). In most practical situations the beam current density is small enough, and its energy high enough, to limit operation to the tenuous-beam regime. The gain curve function is then:  uðvÞÞ ¼ Q  d sinc2 ðu=2Þ GðvÞ 2 1 ¼ QFð du

v 2 v0 uðvÞ ; uðvÞL ¼ 2p DvL

namely: ½53

½54

where sincðuÞ ; ðsin uÞ=u, and in free space (no waveguide) propagation ðkzq ¼ v=cÞ, the FWHM frequency bandwidth of the sinc2 ðu=2Þ function is: DvL 1 ¼ Nw v0

Figure 8 The low-gain cold-beam small-signal gain curve of FEL as a function of the detuning parameter u(v).

½55

The small signal gain curve is shown in Figure 8. There is no gain at synchronism – v ¼ v0 : Maximum  is attained at a frequency gain – G 2 1 ¼ 0:27Q, slightly smaller than v0 , corresponding to u ¼ 22:6: The small gain curve bandwidth is DvSG ø DvL =2,

DvSG 1 ¼ 2Nw v0

½56

High gain This is the regime where the FEL gain in a single path satisfies G ¼ PðLÞ=Pð0Þ q 1: It is useful, of course, when the FEL is used as an amplifier. Since the coefficients of the cubic eqn [42] are all real, the solutions dki ði ¼ 1, 2, 3Þ must be either all real, or composed of one real solution dk3 and two complex solutions, which are complex conjugate of each other: dk1 ¼ dkp2 : In the first case, all terms in eqn [46] are purely oscillatory, there is no exponential growth, and the FEL operates in the low gain regime. In the second case, assuming Imðdk1 Þ , 0, Imðdk2 Þ . 0, the first term grows exponentially, and if L is long enough it will dominate over the other decaying ð j ¼ 2Þ and oscillatory ð j ¼ 3Þ terms, and

LASERS / Free Electron Lasers 443

result in an exponential gain expression: !2 A1 GðvÞ ¼ e2ð Im dk1 ÞL A1 þ A2 þ A3

½57

If we focus on the tenuous-beam strong coupling (high – gain) regime upr p ldkl, then the cubic eqn [42] gets the simple form: dkbðdkÞ2 2 u2 c ¼ G3

G¼Q

a2 v=c Ib 2 ¼ p 3 w2 5 AJJ g g z b z Aem IA

!1=3 ½59

and IA ¼ 4p10 me c3 =e ø 17 kA is the Alfven current. The solution of eqn [58] near synchronism ðu ø 0Þ is: pffiffi pffiffi 1 2 3i 1 þ 3i G; dk2 ¼ G; dk1 ¼ 2 2 ½60 dk3 ¼ 2G resulting in: Cq ðzÞ Cq ð0Þ pffi pffi i 2 3þi 1 h 3þi Gz e 2 ¼ þ e 2 Gz þ e2 iGz 3

H E ðvÞ ¼

½61

and for GL q 1: Gø

1 pffi3GL e 9

½62

The FEL gain is then exponential and can be very high. The gain exponential coefficient is characterized then by its third-order root scaling with the current, aIb1=3 : The high-gain frequency detuning curve (found by solving eqn [58] to second order in u) is: 1 pffi3GL e expð2u2 /GL33=2 Þ 9 " # 1 pffi3GL ðv 2 v0 Þ2 exp 2 ; e 9 Dv2HG

½65

A ‘prebunched-beam FEL’ emits coherent radiation based on the process of Super-radiant Emission (in the sense of Dike). Because all electrons emit in phase radiation wavepackets into the radiation mode, the resultant field amplitude is proportional in this case to the beam current Ib and the radiation. By contrast, spontaneous emission from a random electron beam (no bunching) is the result of incoherent superposition of the wavepackets emitted by the electrons and its power is expected to be proportional to the current Ib . When the current to radiation field transfer function H i(v) is known, eqn [65] can be used to calculate the superradiant power, and in the high-gain regime also the amplified-superradiant power. The latter is the amplification of the superradiant radiation in the downstream sections of a long wiggler. Such unsaturated gain is possible only when the beam is partly bunched i˜(v)(Ib) (because the FEL gain process requires enhanced bunching). The expressions for the current to field transfer function, in the superradiant gain and the high-gain amplified superradiance limits respectively, are: lH i ðvÞl ¼

ðPpb =Pq Þ1=2 sincðuL=2Þ Ib

  ðP =P Þ1=2 ðpffi3=2ÞGL 2ðv2v0 Þ2 =2ðDvÞ2  i  HG H ðvÞ  ¼ pb q e e 3GLIb

½66

½67

where



½63

where DvHG is the 1=e half-width of the gain curve: DvHG 33=4 lw ¼ v0 2p ðL=GÞ1=2

ðPq ÞSR ¼ Pq lH i ðvÞl2 l~iðv; 0Þl2

½58

where 1=3

input radiation signal ðCq ðv; 0Þ ¼ 0Þ if the electron beam velocity or current (density) are prebunched. Namely, the injected e-beam has a frequency component v~ ðvÞ or ~iðvÞ in the frequency range where the radiation device emits. In the case of pure density bunching ð~vðvÞ ¼ 0Þ, the coherent power emitted is found from eqns [46, 47, 49]:

½64

Super-Radiance, Spontaneous-Emission and Self Amplified Spontaneous Emission (SASE)

Intense coherent radiation power can be generated in a wiggler or any other radiation scheme without any

Ppb ¼

Ib2 Zq  aw 2 L2 32 gbz Aem

½68

radiation mode impedance (in free-space Zq is pthe ffiffiffiffiffiffiffi Zq ¼ m0 =10 ). From these expressions one can calculate the power and spectral power of both coherent (superradiant) and partially coherent (spontaneous emission) radiation of FEL in the negligible gain and high gain regimes. The corresponding super-radiant power is in the negligible superradiance gain limit: PSR

   ~iðvÞ 2    sinc2 ðuL=2Þ ¼ Ppb  Ib 

½69

444 LASERS / Free Electron Lasers

(proportional, as expected, to the modulation current squared) and in the high-gain amplified superradiance limit (assuming initial partial bunching liðvÞ=Ibl p 1):    ~iðvÞ 2 1 pffi 2 2    PSR ¼ Ppb  e 3GL e2ðv2v0 Þ =ðDvÞHG ½70 2  Ib 9ðGLÞ The discussion is now extended to incoherent (or partially coherent) spontaneous emission. Due to its particulate nature, every electron beam has random frequency components in the entire spectrum (shot noise). Consequently, incoherent radiation power is always emitted from an electron-beam passing through a wiggler, and its spectral-power can be calculated through the relation: dPq 2 klið~ vÞl2 l ½71 ¼ P q lH i ðvÞl2 T dv p Here ~iðvÞ is the Fourier transform of P the current of T randomly injected electrons iðtÞ ¼ 2e N j¼1 dðt 2 toj Þ, where NT is the average number of electrons in a time period T, namely, the average (DC) current is Ib ¼ 2eNT =T: For a randomly distributed beam, the shot noise current is simply kliðvÞl2 l=T ¼ eIb , and therefore the spontaneous emission power of the FEL, which is nothing but the ‘synchrotron-undulator radiation’, is given by (see eqn [66]):   dPq 1 L2 aw 2 ¼ eIb Zq sinc2 ðuL=2Þ ½72 16p dv Aem gbz If the wiggler is long enough, the spontaneous emission emitted in the first part of the wiggler can be amplified by the rest of it (SASE). In the high-gain limit (see eqn [67]), the amplified spontaneous emission power within the gain bandwidth of eqn [64] is given by: pffi ð1 2 1 Pq ¼ P q eIb lH i ðvÞl2 dv ¼ Psh e 3GL 9 p 0

½77

dCi ¼ ui dz

½78

where Ci ¼

½73

Saturation Regime

The FEL interaction of an electron with an harmonic electromagnetic (EM) wave is essentially described by the longitudinal component of the force in eqn [25], driven by the pondermotive force of eqns [27] and [28]:

dzi =dt ¼ vzi

dui ¼ K2s sin Ci dz

ðz 0

ðv=vzi ðz 0 Þ 2 kz 2 kw Þ dz 0

ui ¼

where Psh is an ‘effective shot-noise input power’: ePpb 2 Psh ¼ pffiffi ðDvÞHG ½74 p I0 ðGLÞ2

d ðg mv Þ ¼ lF~ pm lcos½vt 2 ðkz þ kw Þzi  dt i zi

As long as the interaction is weak enough (small signal regime), the change in the electron velocity is negligible – vzi ø vz0 , and the phase of the forcewave, experienced by the electron, is linear in time Ci ðtÞ ¼ ½v 2 ðkz þ kw Þvz0 ðt 2 t0i Þ þ vt0i : Near synchronism condition u ø 0 (eqn [24]), eqn [75] results in bunching of the beam, because different acceleration/deceleration forces are applied on each electron, depending on their initial phase Ci ð0Þ ¼ vt0i ð2p , Ci ð0Þ , pÞ within each optical period 2p=v (see Figure 7). Taylor expansion of vzi around vz0 in eqns [75] and [76], and use of conservation of energy between the e-beam and the radiation field, would lead again to the small signal gain expression eqn [53] in the low gain regime. When the interaction is strong enough (the nonlinear or saturation regime), the electron velocities change enough to invalidate the assumption of linear time dependence of Ci and the nonlinear set of eqns [75] and [76] needs to be solved exactly. It is Ð convenient to invert the dependence on time zi ðtÞ ¼ tt0i vzi ðt 0 Þdt 0 , and turnÐthe coordinate z to the  independent variable ti ðzÞ ¼ z0 dz0 vzi ðz0 Þ þ t0i : This, and direct differentiation of gi ðvzi Þ, reduces eqns [75] and [76] into the well-known pendulum equation:

½75 ½76

v 2 kz 2 kw vzi

½79

½80

are respectively the pondermotive potential phase and the detuning value of electron i at position z: pffiffiffiffiffiffiffiffiffiffi k aw as AJJ Ks ¼ ½81 g0 gz0 b2z0 is the synchrotron oscillation wavenumber, where aw ~ v=mc, and g0 ¼ gð0Þ, is given in eqn [9], as ¼ elEl gz0 ¼ gz ð0Þ, and bz0 ¼ bz ð0Þ are the initial parameters of the assumed cold beam. The pendulum eqns [77] and [78] can be integrated once, resulting in: 1 2

u2i ðzÞ 2 K2s cos Ci ðzÞ ¼ Ci

½82

and the integration constant is determined for each electron by its detuning and phase relative to the

LASERS / Free Electron Lasers 445

pondermotive wave at the entrance point ðz ¼ 0Þ: Ci ¼ 12 u2i ð0Þ 2 K2s cos Ci ð0Þ: The uðzÞ, CðzÞ phase-space trajectories of eqn [82] are shown in Figure 9 for various values of Ci (corresponding to the initial conditions ui ð0Þ, Ci ð0Þ). The trajectories corresponding to lCi l . K2s are open; namely electrons on these trajectories, while oscillating, can slip-off out of the pondermotive –potential wave period to adjacent periods, ahead or backward, depending on the value of their detuning parameter u . The trajectories corresponding to lCi l , K2s are closed, namely the electrons occupying these trajectories are ‘trapped’, and their phase displacement is bound to a range lCi ðzÞ 2 npl , Cim ; arccosðlCi l=K2s Þ , p within one pondermotive-wave period. The trajectory Ci ¼ K2s defines the ‘separatrix’:

ui ðzÞ ¼ ^2Ks cosðCi =2Þ

½83

which is sometimes referred to as the ‘trap’ or ‘bucket’. Every electron within the separatrix stays trapped, and the ones out of it are free (untrapped). The height of the separatrix (maximum detuning swing) is Dutrap ¼ 4Ks : The oscillation frequency of the trapped electrons can be estimated for deeply trapped electrons ðCm p 2pÞ. In this case the physical pendulum eqns [77] and [78] reduce to the mathematical pendulum equation with an oscillation frequency Ks , in the z coordinate. This longitudinal oscillation, called ‘synchrotron oscillation’, takes place as a function of time at the ‘synchrotron frequency’ Vs ¼ Ks vz : Differentiation of ui ðvzi Þ and vzi ðgi Þ permits to describe the phase-space dynamics in terms of the more physical parameters dvzi ¼ vzi 2 vph and

dgi ¼ gi 2 gph , where: vph ¼

v kz þ kw

is the phase velocity of the pondermotive wave and gph ; ð1 2 b2ph Þ21=2 : 2 ui ¼

v k dvzi ¼ 3 2 dg i c2 b2z0 bz0 g z0 g0

½85

Figure 10 displays a typical dynamics of electron beam phase-space ðg; CÞ evolution for the case of a cold beam of energy gð0Þ entering the interaction region at z ¼ 0 with uniform phase distribution (random arrival times t0i ). The FEL is assumed to operate in the lowgain regime (typical situation in an FEL oscillator), and, therefore, the trap height (corresponding to Dutrap ¼ 4ks ):  Dgtrap ¼ 8b2z g 2z gKs k ½86 remains constant along the interaction length. Figure 10a displays the e-beam phase-space evolution in the small signal regime. The uniform phase distribution evolves along the wiggler into a bunched distribution (compare to Figure 7c), and its average kinetic energy goes down ðDEk Þ ¼ ½kgi ðLÞl2 gð0Þmc2 , 0, contributing this energy to the field of the interacting radiation mode, DPq ¼ ðDEk ÞI0 =e: In this case (corresponding in an FEL oscillator to the early stages of oscillation build-up), the electrons remain free (untrapped) along the entire length L: Figure 10b displays the e-beam phase-space evolution in the large signal (saturation) regime (in the case of an oscillator – at the steady-state saturation stage). Part of the electrons are found inside the trap, immediately upon entering the interaction region ðz ¼ 0Þ, and they lose energy of less than (but near) mc2 Dgtrap as they pass through the interaction region ðz ¼ LÞ: A portion of the electrons remain outside the traps upon entrance. They follow open trajectories and lose less energy or may even become accelerated due to their interaction with the wave. It can be appreciated from this discussion that a good design strategy in attempting to extract maximum power from the electron beam in the FEL interaction, is to set the parameters determining the synchrotron oscillation frequency Ks in eqn [81] so that only half a synchrotron oscillation period will be performed along the interaction length: Ks L ¼ p

Figure 9 The (u – C) phase-space trajectories of the pendulum equation.

½84

½87

This is controlled in an amplifier by keeping the input radiation power Pq ð0Þ (and consequently as ) small enough, so that Ks will not exceed the value set

446 LASERS / Free Electron Lasers

Figure 10 ‘Snapshots’ of the (g – C) phase-space distribution of an initially uniformly distributed cold beam relative to the PM-wave trap at three points along the wiggler (a) Moderate bunching in the small-signal low gain regime. (b) Dynamics of electron beam trapping and synchrotron oscillation at steady state saturation stage of a FEL oscillator (Ks L ¼ p).

by eqn [87]. In an oscillator, this is controlled by increasing the output mirror transmission sufficiently, so that the single path incremental small signal gain G-1 will not be much larger than the round trip loss, and the FEL will not get into deep saturation. When the FEL is over-saturated ðKs L . pÞ, the trapped electrons begin to gain energy as they continue to rotate in their phase-space trajectories beyond the lowest energy point of the trap, and the radiative energy extraction efficiency drops down. A practical estimate for the FEL saturation power emission and radiation extraction efficiency can be derived from the following consideration: the electron beam departs from most of its energy during the interaction with the wave, if a significant fraction of the electrons are within the trap and have positive velocity dvzi relative to the wave velocity vph at z ¼ 0, and if at the end of the interaction length ðz ¼ LÞ, they complete half a pendulum swing and reverse their velocity relative to the wave dvzi ðLÞ ø 2dvzi ð0Þ: Correspondingly, in the energy phase-space diagram (Figure 10b) the electrons perform half a synchrotron oscillation swing and dgi ðLÞ ¼ gi ðLÞ 2 gph ¼ 2dgi ð0Þ: In order to include in this discussion also the FEL amplifier (in the high gain regime), we note that in this case the phase velocity of the wave vph in eqn [84], and correspondingly gph , are modified by the interaction contribution to the radiation wavenumber – kz ¼ kz0 þ ReðdkÞ, and also the electron detuning parameter (relative to the pondermotive

wave) ui in eqn [80] differs from the beam detuning parameter u in eqn [43]: ui ¼ u 2 ReðdkÞ: Based on these considerations and eqn [85], the maximum energy extraction from the beam in the saturation process is: Dg ¼ 2dgi ð0Þ ¼ 2b3z0 g 2z0 g0

Re dk 2 u k

½88

where u is the initial detuning parameter in eqn [43]. In an FEL oscillator, operating in general in the low-gain regime, lRe dkl p lul, oscillation will start usually at the resonator mode frequency, corresponding to the detuning parameter uðvÞ ¼ 22:6=L, for which the small signal gain is maximal (see Figure 8). Then the maximum radiation extraction efficiency can be estimated directly from eqn [88]. It is, in the highly relativistic limit ðbz0 ø 1Þ:

hext ¼

Dg 1 ø g0 2Nw

½89

In an FEL amplifier, in the high-gain regime Re dk ¼ G=2 q lul, and consequently in the same limit:

hext ø

Gl w 4p

½90

It may be interpreted that the effective wiggler length for saturation is Leff ¼ 2p=G: Equation [90], derived here for a coherent wave, is considered valid also for estimating the saturation

LASERS / Free Electron Lasers 447

efficiency also in SASE-FEL. In this context, it is also called ‘the efficiency parameter’ 2r :

FEL Radiation Schemes and Technologies Contrary to conventional atomic and molecular lasers, the FEL operating frequency is not determined by natural discrete quantum energy levels of the lasing matter, but by the synchronism condition of eqn [24] that can be predetermined by the choice of wiggler period, lw ¼ 2p=kw , the resonator dispersion characteristics kzq ðvÞ, and the beam axial velocity vz : Because the FEL design parameters can be chosen at will, its operating frequency can fit any requirement, and furthermore, it can be tuned over a wide range (primarily by varying vz ). This feature of FEL led to FEL development efforts in regimes where it is hard to attain high-power tunable conventional lasers or vacuum-tube radiation sources – namely in the sub-mm (far infrared or THz) regimes, and in the VUV down to soft X-ray wavelengths. In practice, in an attempt to develop short wavelength FELs, the choice of wiggler period lw is limited by an inevitable transverse decay of the magnetic field away from the wiggler magnets surface (a decay range of < k21 w ) dictated by the Maxwell equations. To avoid interception of electron beam current on the walls or on the wiggler surfaces, typical wiggler periods are made longer than lw . 1 cm. FELs (or FEMs – free electron masers) operating in the long wavelengths regime (mm and sub-mm wavelengths) must be based on waveguide resonators to avoid excessive diffraction of the radiation beam along the interaction length (the wiggler). This determines the dispersion relation  kzq ðvÞ ¼ ðv2 2 v2coq Þ1=2 c where vcoq is the waveguide cutoff frequency of the radiation mode q: The use of this dispersion relation in eqn [24] results in an equation for the FEL synchronism frequency v0 : Usually the fundamental mode in an overmoded waveguide is used (the waveguide is overmoded because it has to be wide enough to avoid interception of electron beam current). In this case ðv0 q vco Þ and certainly in the case of an open resonator (common in FELs operating in the optical regime) kzq ¼ v=c, and the synchronism condition in eqn [24] simplified to the well-known FEL radiation wavelength expression in eqn [6]:

l ¼ ð1 þ bz Þbz g 2z lw ø 2g 2z lw

½91

where gz , aw are defined in eqns [7] –[9]. To attain strong interaction, it is desirable to keep the wiggler parameter aw large (eqn [38]), however, if aw . 1, this will cause reduction in the operating

wavelength according to eqns [7] and [91]. For this reason, and also in order to avoid harmonic frequencies emission (in case of a linear wiggler), aw , 1 in common FEL design. Consequently, considering the practical limitations on lw , the operating wavelength eqn [91] is determined primarily by the beam relativistic Lorentz factor g (eqn [8]). The conclusion is that for a short wavelength FEL, one should use an electron beam accelerated to high kinetic energy Ek : Also, tuning of the FEL operatingwavelength can be done by changing the beam energy. Small-range frequency tuning can be done also by changing the spacing between the magnet poles of a linear wiggler. This varies the magnetic field experienced by the e-beam, and effects the radiation wavelength through change of aw (see eqns [7] and [91]). Figure 11 displays the operating wavelengths of FEL projects all over the world versus their e-beam energy. FELs were operated or planned to operate over a wide range of frequencies, from the microwave to X-ray – eight orders of magnitude. The data points fall on the theoretical FEL radiation curve eqns [7], [8], and [91]. FEL Accelerator Technologies

The kind of accelerator used is the most important factor in determining the FEL characteristics. Evidently, the higher the acceleration energy, the shorter is the FEL radiation wavelength. However, not only the acceleration beam energy determines the shortest operating wavelength of the FEL, but also the e-beam quality. If the accelerated beam has large energy spread, energy instability, or large emittance (the product of the beamwidth with its angular spread), then it may have large axial velocity spread vzth : At high frequencies, this may push the detuning spread parameter uth (eqn [52]) to the warm beam regime (see Table 1), in which the FEL gain is diminished, and FELs are usually not operated. Other parameters of the accelerator determine different characteristics of the FEL. High current in the electron beam enables higher gain and higher power operation. The e-beam pulse shape (or CW) characteristics, affect, of course, the emitted radiation waveform, and may also affect the FEL gain and saturation characteristics. The following are the main accelerator technologies used for FEL construction. Their wavelength operating-regimes (eqn [91]) (determined primarily by their beam acceleration energies), are displayed in Figure 12. Modulators and pulse-line accelerators These are usually single pulse accelerators, based on high voltage power supplies and fast discharge stored

448 LASERS / Free Electron Lasers

Figure 11 Operating wavelengths of FELs around the world vs. their accelerator beam energy. The data points correspond in ascending order of accelerator energy to the following experimental facilities: NRL (USA), IAP (Russia), KAERI (Korea), IAP (Russia), JINR/IAP (Russia), INP/IAP (Russia), TAU (Israel), FOM (Netherlands), KEK/JAERI (Japan/Korea), CESTA (France), ENEA (Italy), KAERI-FEL (Korea), LEENA (Japan), ENEA (Italy), FIR FEL (USA), mm Fel (USA), UCSB (USA), ILE/ILT (Japan), MIRFEL (USA), UCLA-Kurchatov (USA/Russia), FIREFLY (GB), JAERI-FEL (Japan), FELIX (Netherlands), RAFEL (USA), ISIR (Japan), UCLAKurchatov-LANL (USA/RU), ELSA (France), CLIO (France), SCAFEL (GB), FEL (Germany), BFEL (China), KHI-FEL (Japan), FELI4 (Japan), iFEL1 (Japan), HGHG (USA), FELI (USA), MARKIII (USA), ATF (USA), iFEL2 (Japan), VISA (USA), LEBRA (Japan), OK-4 (USA), UVFEL (USA), iFEL3 (Japan), TTF1 (Germany), NIJI-IV (Japan), APSFEL (USA), FELICITAI (Germany), FERMI (Italy), UVSOR (Japan), Super-ACO (France), TTF2 (Germany), ELETTRA (Italy), Soft X-ray (Germany), SPARX (Italy), LCLS (USA), TESLA (Germany). X, long wavelengths; p , short wavelengths; W, planned short wavelengths SASE-FELs. Data based in part on H. P. Freund, V. L. Granatstein, Nucl. Inst. and Methods In Phys. Res. A249, 33 (1999), W. Colson, Proc. of the 24th Int. FEL conference, Argone, III. (ed. K. J. Kim, S. V. Milton, E. Gluskin). The data points fall close to the theoretical FEL radiation condition expression (91) drawn for two practical limits of wiggler parameters.

electric energy systems (e.g., Marx Generator), which produce short pulse (tens of nSec) Intense Relativistic Beam (IRB) of energy in the range of hundreds of keV to few MeV and high instantaneous current (order of kAmp), using explosive cathode (plasma field emission) electron guns. FELs (FEMs), based on such accelerators, operated mostly in the microwave and mm-wave regimes. Because of their poor beam quality and single pulse characteristic, these FELs were, in most cases, operated only as Self Amplified Spontaneous Emission (SASE) sources, producing intense radiation beams of low coherence at instantaneous power levels in the range of 1 –100 MW. Because of the high e-beam current and low energy,

these FEMs operated mostly in the collective highgain regime (see Table 1). Some of the early pioneering work on FEMs was done in the 1970s and 1980s in the US (NRL, Columbia Univ., MIT), Russia (IAP), and France (Echole Politechnique), based on this kind of accelerators. Induction linacs These too are single pulse (or low repetition rate) accelerators, based on induction of electromotive potential over an acceleration gap by means of an electric-transformer circuit. They can be cascaded to high energy, and produce short pulse (tens to hundreds

LASERS / Free Electron Lasers 449

Figure 12 Approximate wavelength ranges accessible with FELs based on current accelerator and wiggler technologies.

of nSec), high current (up to 10 kA) electron beams, with relatively high energy (MeV to tens of MeV). The interest in FELs, based on this kind of accelerator technology, stemmed in the 1980s primarily from the SDI program, for the propose of development of a DEW FEL. The main development of this technology took place on a 50 MeV accelerator – ATA (for operating at 10 mm wavelength) and a 3.5 MeV accelerator – ETA (for operating at 8 mm wavelength). The latter experiment, operating in the highgain regime, demonstrated record high power (1 GW) and energy extraction efficiency (35%). Electrostatic accelerators These accelerators are DC machines, in which an electron beam, generated by a thermionic electrongun (typically 1 – 10 Amp) is accelerated electrostatically. The charging of the high voltage terminal can be done by mechanical charge transport (Van de Graaff) or electrodynamically (Crockford –Walton accelerator, Dynamitron). The first kind can be built at energies up to 25 MeV, and the charging current is less than mAmp. The second kind have terminal voltage less than 5 MeV, and the charging current can be hundreds of mAmps. Because of their DC characteristics, FELs based on these kinds of accelerators can operate at arbitrary pulse shape structure and in principle – continuously (CW). However, because of the low charging current, the high electron beam current (1– 10 Amp), required for FEL lasing must be transported without any interception along the entire way from the electron gun, through the acceleration tubes and the FEL wiggler, and then decelerated down to the voltage depressed beam-collector (multistage collector), closing the electric circuit back to the e-gun (current recirculation). The collector is situated at the e-gun potential, biased by moderate voltage high current power supplies, which deliver the current and power

needed for circulating the e-beam and compensates for its kinetic energy loss in favor of the radiation field in the FEL cavity. This beam current recirculation is, therefore, also an ‘Energy retrieval’ scheme, and can make the overall energy transfer efficiency of the electrostatic-accelerator FEL very high. In practice, high-beam transport efficiency in excess of 99.9% is needed for CW lasing, and has not been demonstrated yet. To avoid HV-terminal voltage drop during lasing, electrostatic-accelerator FELs are usually operated in a single pulse mode. Few FELs of this kind have been constructed. The first and main facility is the UCSB FEL shown in Figure 13. It operates in the wavelength range of 30 mm to 2.5 mm (with three switchable wigglers) in the framework of a dedicated radiation user facility. This FEL operates in the negatively charged terminal mode, in which the e-gun and collector are placed in the negatively charged HV-terminal inside the pressurized insulating gas tank, and the wigglers are situated externally at ground potential. An alternative operating mode of positively charged terminal internal cavity electrostatic-accelerator FEM was demonstrated in the Israeli Tandem – Accelerator FEM and the Dutch F.O.M. Fusion-FEM projects. This configuration enables operating with long pulse, high coherence, and very high average power. Linewidth of Dv=v ø 1025 was demonstrated in the Israeli FEM and high power (730 kW over few microseconds) was demonstrated in the Dutch FEM, both at mm-wavelengths. The goal of the latter development project (which was not completed) was quasicontinuous operation at 1 MW average power for application in fusion plasma heating. Radio-frequency (RF) accelerators RF-accelerators are by far the most popular electronbeam sources for FELs. In RF accelerators, short electron beam bunches (bunch duration 1– 10 pSec) are accelerated by the axial field of intense RF radiation (frequency about 1 GHz), which is applied in the acceleration cavities on the injected short e-beam bunches, entering with the accelerating-phase of the RF field. In microtrons, the electron bunches perform circular motion, and get incremental acceleration energy every time they re-enter the acceleration cavity. In RF-LINACs (linear accelerator), the electron bunches are accelerated in a sequence of RF cavities or a slow-wave structure, which keep an accelerating-phase synchronization of the traversing electron bunches with the RF field along a long linear acceleration length. The bunching of the electrons, prior to the acceleration step, is traditionally performed by bunching RF-cavities and a dispersive magnet (chicane) pulse compression system.

LASERS / Free Electron Lasers 451

The FEL small signal gain, must be large enough to build-up the radiation field in the resonator from noise to saturation well within the macropulse duration. RF-Linacs are essential facilities in synchrotron radiation centers, used to inject electron beam current into the synchrotron storage ring accelerator from time to time. Because of this reason, many FELs based on RF-LINACs were developed in synchrotron centers, and provide additional coherent radiation sources to the synchrotron radiation center users. Figure 15 displays FELIX – a RF-LINAC FEL which is located in one of the most active FEL radiation user-centers in FOM – Holland. Storage rings Storage rings are circular accelerators in which a number of electron (or positron) beam bunches (typically of 50 – 500 pS pulse duration and hundreds of ampere peak current) are circulated continuously by means of a lattice of bending magnets and quadrupole lenses. Typical energies of storage ring accelerators are in the hundreds of MeV to GeVs range. As the electrons pass through the bending magnets, they lose a small amount of their energy due to emission of synchrotron radiation. This energy is replenished by a small RF acceleration cavity placed in one section of the ring. The electron beam bunch

dimensions, energy spread, and emittance parameters are set in steady state by a balance between the electrons oscillations within the ring lattice and radiation damping due to the random synchrotron emission process. This produces high-quality (small emittance and energy spread) continuous train of electron beam bunches, that can be used to drive a FEL oscillator placed as an insertion device in one of the straight sections of the ring between two bending magnets. Demonstrations of FEL oscillators, operating in a storage ring, were first reported by the French (LUREOrsay) in 1987 (at visible wavelengths) and the Russians (VEPP-Novosibirsk) in 1988 (in the ultraviolet). The short wavelength operation of storagering FELs is facilitated by the high energy, low emittance and low energy spread parameters of the beam. Since storage ring accelerators are at the heart of all synchrotron radiation centers, one could expect that FEL would be abundant in such facilities as insertion devices. There is, however, a problem of interference of the FEL operating as an insertion device in the normal operation of the ring itself. The energy spread increase, induced in the electron beam during the interaction in a saturated FEL oscillator, cannot be controlled by the synchrotron radiation damping process, if the FEL operating power is too high.

Figure 15 The FELIX RF-Linac FEL operating as a radiation users center in F.O.M. Netherlands. (Courtesy of L. van der Meer, F.O.M.)

452 LASERS / Free Electron Lasers

This limits the FEL power to be kept as a fraction of the synchrotron radiation power dissipation all around the ring (the ‘Renieri Limit’). The effect of the FEL on the e-beam quality, reduces the lifetime of the electrons in the storage ring, and so distrupts the normal operation of the ring in a synchrotron radiation user facility. To avoid the interference problems, it is most desirable to operate FELs in a dedicated storage ring. This also provides the option to leave long enough straight sections in which long enough wigglers provide sufficient gain for FEL oscillation. Figure 16 displays the Duke storage ring FEL, which is used as a unique radiation user facility, providing intense coherent short wavelength radiation for applications in medicine, biology, material studies, etc. Superconducting (SC) RF-LINACS When the RF cavities of the accelerator are superconducting, there are very low RF power losses on the cavity walls, and it is possible to maintain continuous acceleration field in the RF accelerator with a moderate-power continuous RF source, which delivers all of its power to the electron beam kinetic energy. Combining the SC-RF-LINAC technology with an FEL oscillator, pioneered primarily by Stanford University and Thomas Jefferson Lab (TJL) in the US and JAERI Lab in Japan, gave rise to an important scheme of operating such a system in a current recirculating energy retrieval mode.

This scheme revolutionized the development of FELs in the direction of high-power, high-efficiency operation, which is highly desirable, primarily for industrial applications (material processing, photochemical production, etc.). In the recirculating SC-RF-LINAC FEL scheme the wasted beam emerging out of the wiggler after losing a fraction of only few percents (see eqn [89]) out of its kinetic energy, is not dumped into a beam-dump, as in normal cavity RF accelerators, but is re-injected, after circulation, into the SC-RF accelerator. The timing of the wasted electron bunches re-injection is such that they experience a deceleration phase along the entire length of the accelerating cavities. Usually, they are re-injected at the same cell with a fresh new electron bunch injected at an acceleration phase, and thus the accelerated fresh bunch receives its acceleration kinetic energy directly from the wasted beam bunch, that is at the same time decelerated. The decelerated wasted beam bunches are then dumped in the electron beam dump at much lower energy than without recirculation, at energies that are limited primarily just by the energy spread induced in the beam in the FEL laser-saturation process. This scheme, not only increases many folds the over-all energy transformation efficiency from wall-plug to radiation, but would solve significant heat dissipation and radioactivity activation problems in a high-power FEL design. Figure 17 displays the TJL Infrared SC-RF-LINAC FEL oscillator, that demonstrated for the first time

Figure 16 The Duke – University Storage Ring FEL operating as a radiation-users center in N. Carolina, USA. (Mendening: Matthew Busch, courtesy of Glenn Edwards, Duke FEL Lab.)

LASERS / Free Electron Lasers 453

record high average power levels – nearly 10 kWatt at optical frequencies (1 –14 mm). The facility is in upgrade development stages towards eventual operation at 100 kWatt in the IR and 1 kWatt in the UV. It operates in the framework of a laser material processing consortium and demonstrates important material processing applications, such as high-rate micromachining of hard materials (ceramics) with picoSecond laser pulses. The e-beam current recirculation scheme of SCRF-LINAC FEL has a significant advantage over the e-beam recirculation in a storage ring. As in electrostatic accelerators, the electrons entering the wiggler are ‘fresh’ cold-beam electrons from the injector, and not a wasted beam corrupted by the laser saturation process in a previous circulation through the FEL.

This also makes it possible to sustain high average circulating current despite the disruptive effect of the FEL on the e-beam. This technological development has given rise to a new concept for a radiation-user facility light-source 4GLS (fourth-generation light source), which is presently in a pilot project development stage at Daresbury Lab in the UK (see Figure 18). In such a scheme, IR and UV FEL oscillators and XUV SASE-FEL can be operated together with synchrotron magnet dipole and wiggler insertion devices without disruptive interference. Such a scheme, if further developed, can give rise to new radiation-user, light-source facilities, that can provide a wider range of radiation parameters than synchrotron centers of previous generation.

Figure 17 The Thomas Jefferson Lab. recirculating beam-current superconducting Linac FEL operating as a material processing FEL-user center in Virginia USA (Courtesy of S. Benson, Thomas Jefferson Laboratory).

Figure 18 The Daresbury Fourth Generation Light-Source concept (4GLS). The circulating beam-current superconducting Linac includes SASE-FEL, bending magnets and wigglers as insertion devices. (Courtesy of M. Poole, Damesbury Laboratory)

454 LASERS / Free Electron Lasers

Magnetic Wiggler Schemes

The optical klystron The stimulated emission process in FEL (see Figure 7c) is based on velocity (energy) bunching of the e-beam in the first part of the wiggler, which turns into density bunching along the central part of the wiggler, and then the density-bunched electron beam performs ‘negative work’ on the radiation wave and emits radiative energy in the last part of the wiggler. In the OK, these steps are carried out in three separate parts of the wiggler: the energy bunching wiggler section, the dispersive magnet density buncher, and the radiating wiggler section (see Figure 7b). A schematic of the OK is shown in Figure 19. The chicane magnetic structure in the dispersive section brings all electrons emerging from the bunching wiggler back onto the axis of the Ðradiating wiggler, 21 but provides variable delay Dtdi ¼ LLbb þLd ðv21 zi 2 vph Þ dz ¼ ½dðDtd Þ=dgdgi relative to the pondermotive wave phase to different electrons, which acquired different energy modulation increments dgi ¼ gi 2 gph in the final section. The radiation condition is satisfied whenever the bunch-center phase satisfies Dwd ¼ vDtd ¼ p=2 2 2mp (see Figure 7b). However, because the energy dispersion coefficient dðDtd Þ=dg, is much larger in the chicane than in a wiggler of the same length, the density bunching amplitude, and consequently the OK gain, are much larger than in a uniform wiggler FEL of the same length. The OK was invented by Vinokurov and Skrinsky in 1977 and first demonstrated in 1987 at visible wavelengths in the ACO storage ring of LURE in Orsay, France, and subsequently in 1988 at UV wavelengths, in the VEPP storage ring in Novosibirsk, Russia. The OK is an optimal FEL configuration, if used as an insertion device in a storage ring, because it can provide sufficient gain to exceed the high lasing threshold at the short operating wavelengths of a high-energy storage-ring FEL, and still conform with the rather short straight sections

available for insertion devices in conventional synchrotron storage rings. It should be noted that the OK is equivalent to a long wiggler FEL of length Leff of equal gain and therefore its axial velocity spread acceptance is small (this is determined from the cold beam limit uth p p with Leff used in eqn [52]). This too is consistent with storage ring accelerators, which are characterized by small energy spread and emittance of the electron beam. Radiation emission at harmonic frequencies In a linear wiggler (eqn [2]), the axial velocity:

bz ¼ ½b2 2 ðaw =gÞ2 cos2 kw z1=2

½92

is not constant. It varies with spatial periodicity lw =2, and in addition to its average value bz ¼ ½b2 2 a2w =2g 2 1=2 , contains Fourier components of spatial frequencies 2mkw ðm ¼ 1; 2; …Þ. When aw q 1, the axial oscillation deforms the sinusoidal trajectory of the electrons in the wiggler (eqns [22] and [23]), and in a frame of reference moving at the average velocity bz , the electron trajectories in the wiggling ðx – zÞ plane forms an figure 8 shape, rather than a pure transverse linear motion. In the laboratory frame this leads to synchrotron undulator emission in the forward direction at all odd harmonic frequencies of v0 , corresponding to substitution of kw ! ð2m þ 1Þkw ðm ¼ 1; 2; 3; …Þ in eqn [6]:

v2mþ1 ¼ ð2m þ 1Þv0 ø 2g 2z cð2m þ 1Þkw

½93

All the stimulated emission gain expressions, presented earlier for the fundamental harmonic, are valid with appropriate substitution of

u2mþ1 ¼

v 2 kz 2 ð2m þ 1Þkw vz

½94

instead of eqn [43], and substitution of the harmonic-weight Bessel-function coefficient of

Figure 19 Schematics of the Optical– Klystron, including an energy bunching wiggler, a dispersive magnet bunching section and a radiating wiggler.

456 LASERS / Free Electron Lasers

synchronism with the beam. Slowing down the PM wave can be done by the gradual increase of the wiggler wavenumber kw ðzÞ (or decrease of its period lw ðzÞ), so that eqns [29] or [91] keep being satisfied for a given frequency, even if vz (or gz ) goes down. A more correct description of the nonlinear interaction dynamics of the electron beam in a saturated tapered-wiggler FEL is depicted in Figure 20: the electron trap synchronism energy gph ðzÞ tapers down (by design) along the wiggler, while the trapped electrons are forced to slow down with it, releasing their excess energy by enhanced radiation. An upper limit estimate for the extraction efficiency of such a tapered wiggler FEL would be:

hext ¼

gph ð0Þ 2 gph ðLÞ gph ð0Þ

½99

and the corresponding radiative power generation would be: DP ¼ hext Ib Ek =e: In practice, the phasespace area of the tapered wiggler separatrix is reduced due to the tapering, and only a fraction of the electron beam can be trapped, which reduces correspondingly the practical enhancement in radiative extraction efficiency and power. An alternative wiggler tapering scheme consists of tapering the wiggler field Bw ðzÞ (or wiggler parameter amplitude aw ðzÞ). If these are tapered down, the axial velocity and axial energy (eqn [7]) can still keep constant (and in synchronism with the PM wave) even if the beam energy g goes down. Thus, in this scheme, the excess radiative energy extracted from the beam comes out of its transverse (wiggling) energy. Efficiency and power enhancement of FEL by wiggler tapering have been demonstrated experimentally both in FEL amplifiers (first by Livermore, 1985) and oscillators (first by Los-Alamos, 1983). This elegant way to extract more power from the beam

still has some limitations. It can operate efficiently only at a specified high radiation power level for which the tapering was designed. In an oscillator, a long enough untapered section must be left to permit sufficient small signal gain in the early stages of the laser oscillation build-up process. FEL Oscillators

Most FEL devices are oscillators. As in any laser, in order to turn the FEL amplification process into an oscillation process, one provides a feedback mechanism by means of an optical resonator. In steady state saturation, GRrt ¼ 1, where Rrt is the round trip reflectivity factor of the resonator and G ¼ PðLÞ=Pð0Þ is the saturated single-path gain coefficient of the FEL. To attain oscillation, the small signal (unsaturated) gain, usually given by the small gain expression in eqn [53], must satisfy the lasing threshold condition G . 1=Rrt , as in any laser. When steady state oscillation is attained, the oscillator output power is: Pout ¼

T DPext 1 2 Rrt

½100

where DPext ¼ hext I0 ðg0 2 1Þmc2 =e and hext is the extraction efficiency, usually given by eqn [89] (low-gain limit). Usually, FEL oscillators operate in the low-gain regime, in which case 1 2 Rrt ¼ L þ T p 1 (where L is the resonator internal loss factor). Consequently, then Pout ø DPext T=ðL þ TÞ, which would give a maximum value, depending on the saturation level of the oscillator. In the general case, one must solve the nonlinear force equations together with the resonator feedback relations of the oscillating radiation mode, in order to maximize the output power (eqn [100]) or efficiency by choice of optimal T for given L.

Figure 20 ‘Snapshots’ of the trap at three locations along a tapered wiggler FEL.

LASERS / Free Electron Lasers 457

In an FEL oscillator operating with periodic electron bunches (as in RF-acclerator based FEL), the solution for the FEL gain and saturation dynamics requires extension of the single frequency solution of the electron and electromagnetic field equations to the time domain. In principle, the situation is similar to that of a mode-locked laser, and the steady state laser pulse train waveform constitutes a superposition of the resonator longitudinal modes that produces a self-similar pulse shape with the highest gain (best overlap with the e-beam bunch along the interaction length). Because the e-beam velocity vz0 is always smaller (in an open resonator) than the group velocity of the circulating radiation wavepacket, the radiation wavepacket slips ahead of the electron bunch one optical period l in each wiggling period (Slippage

effect). This reduces the overlap between the radiation pulse and the e-beam bunch along the wiggler (see Figure 14) and consequently decreases the gain. Fine adjustment of the resonator mirrors (as shown in Figure 14) is needed to attain maximal power and optimal radiation pulse shape. The pulse-slippage gain reduction effect is negligible only if the bunch length is much longer than the slippage length Nw l, which can be expressed as:

tp q 2p=DvL

½101

where DvL is the synchrotron undulator radiation frequency bandwidth (eqn [55]). This condition is usually not satisfied in RF-accelerator FELs operating in the IR or lower frequencies, and the

Figure 21 Anticipated peak brightness of SASE FELs (TTF-DESY, LCLS-SLAC) in comparison to the undulators in present third generation Synchrotron Radiation sources. Figure courtesy of DESY, Hamburg, Germany.

458 LASERS / Free Electron Lasers

slippage effect gain reduction must be then taken into account. An FEL operating in the cold-beam regime constitutes an ‘homogeneous broadening’ gain medium in the sense of conventional laser theory. Consequently, the longitudinal mode competition process that would develop in a CW FEL oscillator, leads to single-mode operation and high spectral purity (temporal coherence) of the laser radiation. The minimal (intrinsic) laser linewidth would be determined by an expression analogous to the Schawlow– Towns limit of atomic laser:

ðDf Þint ¼

ðDf1=2 Þ2 Ib =e

½102

where Df1=2 is the spectral width of the cold resonator mode. Expression [102] predicts extremely narrow linewidth. In practice, CW operation of FEL was not yet attained, but Fourier transform limited linewidths in the range of Df =f0 ø 1026 were measured in longpulse electrostatic accelerator FELs. In an FEL oscillator, based on a train of e-beam bunches (e.g., an R.F. accelerator beam), the linewidth is very wide and is equal to the entire gain bandwidth (eqn [56]) in the slippage dominated limit, and to the Fourier transform limit Dv ø 2p=tp in the opposite negligible-slippage limit (eqn [101]). Despite this slippage, it was observed in RF-LINAC FEL that the radiation pulses emitted by the FEL oscillator are phase corrected with each other, and therefore their total temporal coherence length may be as long as the

Figure 22 Phase 1 of the SASE FE L (TTF VUV-FEL1): (a) Accelerator layout scheme; (b) General view of the TESLA test facility. Figure courtesy of DESY, Hamburg, Germany.

460 LASERS / Metal Vapor Lasers

Metal Vapor Lasers D W Coutts, University of Oxford, Oxford, UK q 2005, Elsevier Ltd. All Rights Reserved.

Introduction Metal vapor lasers form a class of laser in which the active medium is a neutral or ionized metal vapor usually excited by an electric discharge. These lasers fall into two main subclasses, namely cyclic pulsed metal vapor lasers and continuous-wave metal ion lasers. Both types will be considered in this article, including basic design and construction, power supplies, operating characteristics (including principal wavelengths), and brief reference to their applications.

Self-Terminating ResonanceMetastable Pulsed Metal Vapor Lasers The active medium in a self-terminating pulsed metal vapor laser consists of metal atoms or ions in the vapor phase usually as a minority species in an inert buffer gas such as neon or helium. Laser action occurs between a resonance upper laser level and a metastable lower laser level (Figure 1). During a fast pulsed electric discharge (typically with a pulse duration of order 100 ns) the upper laser level is preferentially excited by electron impact excitation because it is strongly optically connected to the ground state (resonance transition) and hence has a large excitation cross-section. For a sufficiently large metal atom (or ion) density, the resonance radiation becomes optically trapped, thus greatly extending the lifetime of the upper laser level such that decay from the upper laser level is channelled through the emission of laser

Figure 1 Resonance-metastable energy levels for self-terminating metal vapor lasers.

radiation to the metastable lower laser level. Lasing terminates when the electron temperature falls to a point such that preferential pumping to the upper laser level is no longer sustained, and the build-up of population in the metastable lower laser level destroys the population inversion. Therefore after each excitation pulse, the resulting excited species in the plasma (in particular the metastable lower laser levels which are quenched by collisions with cold electrons) must be allowed sufficient time to relax and the plasma must be allowed to partially recombine before applying the next excitation pulse. The relaxation times for self-terminating metal vapor lasers correspond to operating pulse repetition frequencies from 2 kHz to 200 kHz. Many metal vapors can be made to lase in the resonance-metastable scheme and are listed together with their principal wavelengths and output powers in Table 1. The most important self-terminating pulsed metal vapor laser is the copper vapor laser and its variants which will be discussed in detail in the following sections. Of the other pulsed metal vapor lasers listed in Table 1, only the gold vapor laser (principal wavelengths 627.8 nm and 312.3 nm), and the barium vapor laser which operates in the infrared (principal wavelengths 1.5 mm and 2.55 mm) have had any commercial success. All the selfterminating pulsed metal vapor lasers have essentially the same basic design and operating characteristics as exemplified by the copper vapor laser.

Copper Vapor Lasers Copper vapor lasers (CVLs) are by far the most widespread of all the pulsed metal vapor lasers. Figure 2 shows the energy level scheme for copper. Lasing occurs simultaneously from the 2P3/2 level to the 2D5/2 level (510.55 nm) and from the 2P1/2 level to the 2D3/2 level (578.2 nm). Commercial devices are available with combined outputs of over 100 W at 510.55 nm and 578.2 nm (typically with a green-toyellow power ratio of 2:1). A typical copper vapor laser tube is shown in Figure 3. High-purity copper pieces are placed at intervals along an alumina ceramic tube which typically has dimensions of 1 –4 cm diameter and 1 – 2 m long. The alumina tube is surrounded by a solid fibrous alumina thermal insulator, and a glass or quartz vacuum envelope. Cylindrical electrodes made of copper or tantalum are located at each end of the plasma tube to provide a longitudinal discharge

LASERS / Metal Vapor Lasers 461

Table 1 Metal

Principal self-terminating resonance-metastable metal vapor lasers Principal wavelengths (nm)

Powers Typical (W)

Cu Au Ba

Pb Mn

510.55 578.2 627.8 312 1500 2550 1130 722.9 534.1 (.50%) 1290

2– 70 1– 50 1– 8 0.1–0.2 2– 10 1 0.5

Total efficiencies

Pulse repetition frequency (kHz)

Technological development

Maximum (W) 2500 total

1%

4 –40

20 1.2 12 1.5 1.0 4.4 12 total

0.13% 0.5%

2 –40 1– 8 5 –15

0.15% 0.32%

10–30 ,10

Figure 2 Partial energy level scheme for the copper vapor laser.

Figure 3 Copper vapor laser tube construction.

Highly developed and commercially available Commercially available Have been produced commercially, but largely experimental Experimental Experimental

462 LASERS / Metal Vapor Lasers

arrangement. The cylindrical electrodes and silica laser end windows are supported by water-cooled end pieces. The laser windows are usually tilted by a few degrees to prevent back reflections into the active medium. The laser head is contained within a water cooled metal tube to provide a coaxial current return for minimum laser head inductance. Typically a slow flow (,5 mbar l min21) of neon at a pressure of 20 – 80 mbar is used as the buffer gas with an approximately 1% H2 additive to improve the afterglow plasma relaxation. The buffer gas provides a medium to operate the discharge when the laser is cold and slows diffusion of copper vapor out of the ends of the hot plasma tube. Typical copper fill times are of order 200 – 2000 hours (for 20 – 200 g copper load). Sealed-off units with lifetimes of order 1000 hours have been in production in Russia for many years. During operation waste heat from the repetitively pulsed discharge heats the alumina tube up to approximately 1500 8C at which point the vapor pressure of copper is of approximately 0.5 mbar which corresponds to the approximate density required for maximum laser output power. Typical warm-up times are therefore relatively long at around one hour to full power. One way to circumvent the requirement for high temperatures required to produce sufficient copper density by evaporation of elemental copper (and hence also reduce warm-up times) is to use a copper salt with a low boiling point located in one or more side-arms of the laser tube (Figure 4). Usually copper halides are used as the salt, with the copper bromide laser being the most successful. For the CuBr laser, a temperature of just 600 8C is sufficient to produce the required Cu density by dissociation of CuBr vapor in the discharge. With the inclusion of 1– 2% H2 in the neon buffer gas, HBr is also formed in the CuBr laser, which has the additional benefit of improving recombination in the afterglow via dissociative attachment of free

Figure 4 Copper bromide laser tube construction.

electrons: HBr þ e2 ! H þ Br2 ; followed by ion neutralization: Br2 þ Cuþ ! Br þ Cup : As a result of the lower operating temperature and kinetic advantages of HBr, CuBr lasers are typically twice as efficient (2 –3%) as their elemental counterparts. Sealed-off CuBr systems with powers of order 10 – 20 W are commercially produced. An alternative technique for reducing the operating temperature of elemental CVLs is to flow a buffer gas mixture consisting of , 5% HBr in neon at approximately 50 mbar l min21 and allow this to react with solid copper metal placed within the plasma tube at about 600 8C to produce CuBr vapor in situ. The so-called Cu HyBrID (hydrogen bromide in discharge) laser has the same advantages as the CuBr laser (e.g., up to 3% efficiency) but at the cost of requiring flowing highly toxic HBr in the buffer gas, a requirement which has so far prevented commercialization of Cu HyBrID technology. The kinetic advantages of the hydrogen halide in the CuBr laser discharge can also be applied to a conventional elemental CVL through the addition of small partial pressure of HCl to the buffer gas in addition to the 1 –2% H2 additive. HCl is preferred to HBr as it is less likely to dissociate (the dissociation energy of HCl at 0.043 eV is less than HBr at 0.722 eV). Such kinetic enhancement leads to a doubling in average output power, a dramatic increase in beam quality through improved gain characteristics, and shifts the optimum pulse repetition frequency for kinetically enhanced CVLs (KE-CVLs) from 4 – 10 kHz up to 30 – 40 kHz. Preferential pumping of the upper laser levels requires an electron temperature in excess of the 2 eV range, hence high voltage (10– 30 kV), high current (hundreds of A), short (75 –150 ns) excitation pulses are required for efficient operation of copper vapor lasers. To generate such excitation pulses CVLs are typically operated with a power supply incorporating a high-voltage thyratron switch. In the most basic configuration, the charge-transfer circuit shown


in Figure 5a, a dc high-voltage power supply resonantly charges a storage capacitor (C_S, typically a few nF) through a charging inductor L_C, a high-voltage diode, and a bypass inductor L_B, up to twice the supply voltage V_S, in a time of order 100 μs. When the thyratron is triggered, the storage capacitor discharges through the thyratron and the laser head on a time-scale of 100 ns. Note that during the fast discharge phase, the bypass inductor in parallel with the laser head can be considered an open circuit. A peaking capacitor C_P (~0.5 C_S) is provided to increase the rate of rise of the voltage pulse across the laser head. Given the relatively high cost of thyratrons, more advanced circuits are now often used to extend the service lifetime of the thyratron to several thousand hours. In the more advanced circuit (Figure 5b), an LC inversion scheme is used in combination with magnetic pulse compression techniques, and operates as follows. Storage capacitors C_S are resonantly charged in parallel to twice the dc supply voltage V_S, as before. When the thyratron is switched, the charge on C_S1 inverts through the thyratron and the transfer inductor L_T. This LC inversion drags the voltage on the top of C_S2 down to −4V_S. When the voltage across the first saturable inductor L_S1 reaches a maximum (−4V_S), L_S1 saturates and allows current to flow from the storage capacitors (now charged in series) to the transfer capacitor

Figure 5 Copper vapor laser excitation circuits. (a) Charge transfer circuit; (b) LC inversion circuit with magnetic pulse compression.

(C_T = 0.5C_S2), thereby transferring the charge from C_S1 and C_S2 to C_T in a time much less than the initial LC inversion time. At the moment when the charge on the transfer capacitor C_T reaches a maximum (also −4V_S), L_S2 saturates and the transfer capacitor is discharged through the laser tube, again with a peaking capacitor to increase the voltage rise time. By using magnetic pulse compression, the voltage switched by the thyratron can be reduced by a factor of four and the peak current similarly reduced (at the expense of increased current pulse duration), thereby greatly extending the thyratron lifetime. Note that in both circuits a 'magnetic assist' saturable inductor L_A is provided in series with the thyratron to delay the current pulse through the thyratron until after the thyratron has reached high conductivity, thereby reducing power deposition in the thyratron. Copper vapor lasers produce high average powers (2–100 W available commercially, with laboratory devices producing average powers of over 750 W) and have wall-plug efficiencies of approximately 1%. Copper vapor lasers also make excellent amplifiers due to their high gains, and amplifiers can be chained together to produce average powers of several kW. Typical pulse repetition frequencies range from 4 to 20 kHz, with a maximum reported pulse repetition frequency of 250 kHz. An approximate scaling law states that for an elemental device with tube diameter D (mm) and length L (m), the average output power in watts will be of order D × L. For example, a typical 25 W copper vapor laser will have a 25 mm diameter by 1 m long laser tube, and operate at 10 kHz, corresponding to 2.5 mJ pulse energy and 50 kW peak power (50 ns pulse duration). Copper vapor lasers have very high single-pass gains (greater than 1000 for a 1 m long tube), large gain volumes, and short gain durations (20–80 ns, sufficient for the intracavity laser light to make only a few round trips within the optical resonator). Maximum output power is therefore usually obtained in a highly 'multimode' (spatially incoherent) beam by using either a fully stable or a plane–plane resonator with a low-reflectivity output coupler (usually Fresnel reflection from an uncoated optic is sufficient). To obtain higher beam quality a high-magnification unstable resonator is required (Figure 6). Fortunately, copper vapor lasers have sufficient gain to operate efficiently with unstable resonators with magnifications (M = R₁/R₂) up to 100 and beyond. Resonators with such high magnifications impose very tight geometric constraints on the propagation of radiation on repeated round-trips within the resonator, such that after two round-trips the divergence is typically diffraction-limited. Approximately half the stable-resonator


Figure 6 Unstable resonator configuration often used to obtain high beam quality from copper vapor lasers.

output power can therefore be obtained with near diffraction-limited beam quality by using an unstable resonator. Often, a small-scale oscillator is used in conjunction with a single power amplifier to produce high output power with diffraction-limited beam quality. Hyperfine splitting combined with Doppler broadening leads to an inhomogeneous linewidth for the laser transitions of order 8–10 GHz, corresponding to a coherence length of order 3 cm. The high beam quality and moderate peak power of CVLs allow efficient nonlinear frequency conversion to the UV by second-harmonic generation (510.55 nm → 255.3 nm, 578.2 nm → 289.1 nm) and sum-frequency generation (510.55 nm + 578.2 nm → 271.3 nm) using β-barium borate, β-BaB₂O₄ (BBO), as the nonlinear medium. Typically, average powers in excess of 1 W can be obtained at any of the three wavelengths from a nominally 20 W CVL. Powers up to 15 W have been obtained at 255 nm from high-power CVL master-oscillator power-amplifier systems using cesium lithium borate, CsLiB₆O₁₀ (CLBO), as the nonlinear crystal. Key applications of CVLs include pumping of dye lasers (principally for laser isotope separation) and pumping of Ti:sapphire lasers. Medically, the CVL yellow output is particularly useful for the treatment of skin lesions such as port wine stain birthmarks. CVLs are also excellent sources of short-pulse stroboscopic illumination for high-speed imaging of fast objects and fluid flows. The high beam quality, visible wavelength, and high pulse repetition rate make CVLs well suited to precision laser micromachining of metals, ceramics, and other hard materials. More recently, the second harmonic at 255 nm has proved to be an excellent source for writing Bragg gratings in optical fibers.
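These rules of thumb are easy to check numerically. The short Python sketch below is illustrative only (the helper functions are written for this article, not taken from any library); it reproduces the D × L power estimate, the pulse energy and peak power of the 25 W example, the ~3 cm coherence length implied by an 8–10 GHz linewidth, and the frequency-converted wavelengths quoted above.

```python
# Back-of-envelope checks of the copper vapor laser figures quoted above.
# Illustrative sketch only; the helper names are ours, not from a library.

C = 2.998e8  # speed of light (m/s)

def cvl_power_estimate(d_mm, l_m):
    """Empirical scaling law: average power (W) ~ D (mm) x L (m)."""
    return d_mm * l_m

def pulse_energy(p_avg_w, prf_hz):
    """Pulse energy (J) = average power / pulse repetition frequency."""
    return p_avg_w / prf_hz

def coherence_length(linewidth_hz):
    """Coherence length (m) ~ c / linewidth."""
    return C / linewidth_hz

def shg(wl_nm):
    """Second-harmonic wavelength (nm)."""
    return wl_nm / 2.0

def sfg(wl1_nm, wl2_nm):
    """Sum-frequency wavelength (nm): 1/wl3 = 1/wl1 + 1/wl2."""
    return 1.0 / (1.0 / wl1_nm + 1.0 / wl2_nm)

print(cvl_power_estimate(25, 1.0))   # ~25 W for a 25 mm x 1 m tube
e = pulse_energy(25, 10e3)           # 2.5 mJ per pulse at 10 kHz
print(e, e / 50e-9)                  # 2.5e-3 J -> 50 kW over a 50 ns pulse
print(coherence_length(10e9))        # ~0.03 m for a 10 GHz linewidth
print(shg(510.55), shg(578.2))       # 255.3 nm and 289.1 nm
print(sfg(510.55, 578.2))            # ~271 nm (271.3 nm quoted above)
```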

Afterglow Recombination Metal Vapor Lasers

Recombination of an ionized plasma in the afterglow of a discharge pulse provides a mechanism for achieving a population inversion and hence laser

output. The two main afterglow recombination metal vapor lasers are the strontium ion vapor laser (430.5 nm and 416.2 nm) and the calcium ion vapor laser (373.7 nm and 370.6 nm), whose output in the violet and UV spectral regions extends the spectral coverage of pulsed metal vapor lasers to shorter wavelengths. A population inversion is produced by recombination pumping, in which doubly ionized Sr (or Ca) recombines to form singly ionized Sr (or Ca) in an excited state: Sr²⁺ + e⁻ + e⁻ → Sr⁺* + e⁻. Note that recombination rates for a doubly ionized species are much faster than for a singly ionized species, hence recombination lasers are usually metal ion lasers. Recombination (pumping) rates are also greatest in a cool, dense plasma, hence helium is usually used as the buffer gas: being a light atom, it promotes rapid collisional cooling of the electrons. Helium also has a much higher ionization potential than the alkaline-earth metals (including Sr and Ca), which ensures preferential ionization of the metal species (up to 90% may be doubly ionized). Recombination lasers can operate where the energy level structure of an ion species can be considered to consist of two (or more) groups of closely spaced levels. In the afterglow of a pulsed discharge, electron collisional mixing within each group of levels will maintain each group in thermodynamic equilibrium, yielding a Boltzmann distribution of population within each group. If the difference in energy between the two groups of levels (~5 eV ≫ kT) is sufficiently large, then thermal equilibrium between the groups cannot be maintained via collisional processes. Given that recombination yields excited singly ionized species (usually with a flux from higher to lower ion levels), it is possible to achieve a population inversion between the lowest level of the upper group and the higher levels within the lower group. This is the mechanism for inversion in the Sr⁺ (and analogous Ca⁺) ion laser, as indicated in the partial energy level scheme for Sr⁺ shown in Figure 7. Strontium and calcium ion recombination lasers have similar construction to the copper vapor lasers described above. The lower operating temperatures (500–800 °C) mean that minimal or no thermal insulation is required for self-heated devices.
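Putting numbers into the Boltzmann factor shows why a ~5 eV gap decouples the two groups of levels while collisions keep each group internally equilibrated. In the minimal sketch below, the afterglow electron temperature of 0.2 eV is an assumed, illustrative value (not a figure from this article):

```python
import math

def boltzmann_ratio(delta_e_ev, kt_ev):
    """Equilibrium population ratio exp(-dE/kT) between two levels."""
    return math.exp(-delta_e_ev / kt_ev)

kT = 0.2  # assumed afterglow electron temperature (eV); illustrative only

# Within a group of closely spaced levels (dE ~ 0.1 eV), collisional
# mixing can plausibly hold the populations near equilibrium:
print(boltzmann_ratio(0.1, kT))   # ~0.61

# Across the ~5 eV gap between the groups, the equilibrium ratio is
# astronomically small, so collisions cannot enforce it and the
# recombination flux can invert the lowest level of the upper group:
print(boltzmann_ratio(5.0, kT))   # ~1.4e-11
```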


Figure 7 Partial energy level scheme for the strontium ion laser.

For good tube lifetime, BeO plasma tubes are required due to the reactivity of the metal vapors. Usually helium at a pressure of up to one atmosphere is used as the buffer gas. Typical output powers at 5 kHz pulse repetition frequency are of order 1 W (~0.1% wall-plug efficiency) at 430.5 nm from the He–Sr⁺ laser and 0.7 W at 373.7 nm from the He–Ca⁺ laser. For the high specific input power densities (10–15 W cm⁻³) required for efficient lasing, overheating of the laser gas limits aperture scaling beyond 10–15 mm diameter. Slab laser geometries have been used successfully to aperture-scale strontium ion lasers. Scaling of the laser tube length beyond 0.5 m is not practical, as achieving high enough excitation voltages for efficient lasing becomes problematic. Gain in strontium and calcium ion lasers is lower than in resonance-metastable metal vapor lasers such as the CVL, hence the optimum output coupler reflectivity is approximately 70%. The pulse duration is also longer, at around 200–500 ns, resulting in moderate beam quality with plane–plane resonators. For both strontium and calcium ion lasers the two principal transitions share upper laser levels and hence exhibit gain competition, such that without wavelength-selective cavities usually only the longer wavelength of the pair is produced. With wavelength-selective cavities, up to 60% (Sr⁺) and 30% (Ca⁺) of the normal power can be obtained at the shorter wavelength.
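As a rough consistency check (a sketch under stated assumptions, not data from the article), the input power implied by the quoted specific power density for a bore at the aperture-scaling limit can be compared with the ~1 W output:

```python
import math

def bore_volume_cm3(diameter_mm, length_m):
    """Active volume of a cylindrical discharge bore, in cm^3."""
    r_cm = diameter_mm / 20.0            # mm diameter -> cm radius
    return math.pi * r_cm ** 2 * length_m * 100.0

vol = bore_volume_cm3(10, 0.5)   # 10 mm aperture, 0.5 m length (from the text)
p_in = vol * 12.5                # middle of the quoted 10-15 W/cm^3 range
print(vol, p_in)                 # ~39 cm^3 -> ~490 W into the discharge
print(1.0 / p_in)                # ~0.2% for 1 W out: the same order as the
                                 # ~0.1% wall-plug efficiency quoted above,
                                 # which also includes power-supply losses
```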

Many potential applications exist for strontium and calcium ion lasers, given their ultraviolet (UV)/violet wavelengths. Of particular importance are fluorescence spectroscopy in biology and forensics, treatment of neonatal jaundice, stereolithography, micromachining, and the exposure of photoresists for integrated circuit manufacture. Despite these many potential applications, technical difficulties in power scaling mean that both strontium ion and calcium ion lasers have only limited commercial availability.

Continuous-Wave Metal Ion Lasers

In a discharge-excited noble gas, there can be a large concentration of noble gas atoms in excited metastable states and of noble gas ions in their ground states. These species can transfer their energy to a minority metal species (M) via two key processes: either charge transfer (Duffendack reactions) with the noble gas ions (N⁺), M + N⁺ → M⁺* + N + ΔE (where M⁺* is an excited metal ion state), or Penning ionization with a noble gas atom in an excited metastable state (N*), M + N* → M⁺* + N + e⁻.


In a continuous discharge in a mixture of a noble gas and a metal vapor, steady generation of excited metal ions via these energy transfer processes can lead to a steady-state population inversion on one or more pairs of levels in the metal ion, and hence produce cw lasing. Several hundred metal ion laser transitions have been observed to lase in a host of different metals. The most important such laser is the helium cadmium laser, which has by far the largest market volume by number of unit sales of all the metal vapor lasers. In a helium cadmium laser, Cd vapor is present at a concentration of about 1–2% in a helium buffer gas which is excited by a dc discharge. Excitation to the upper laser levels of Cd⁺ (Figure 8) is primarily via Penning ionization collisions with He 2³S₁ metastables produced in the discharge. Excitation via electron-impact excitation from the ion ground state may also play an important role in establishing a population inversion in Cd⁺ lasers. Population inversion can be sustained continuously because the ²P₃/₂ and ²P₁/₂ lower laser levels decay via strong resonance transitions to the ²S₁/₂ Cd⁺ ground state, unlike in the self-terminating metal vapor lasers. Two principal wavelengths can be produced, namely 441.6 nm (blue) and 325.0 nm (UV), with cw powers up to 200 mW and 50 mW, respectively, available from commercial devices.

Typical laser tube construction (Figure 9) consists of a 1–3 mm diameter discharge channel, typically 0.5 m long. A pin anode is used at one end of the laser tube, together with a large-area cold cylindrical cathode located in a side-arm at the other end of the tube. Cadmium is transported to the main discharge tube from a heated Cd reservoir in a side-arm at 250–300 °C. With this source of atoms at the anode end of the laser, a cataphoresis process (in which the positively charged Cd ions are propelled towards the cathode end of the tube by the longitudinal electric field) transports Cd into the discharge channel. Thermal insulation of the discharge tube ensures that it is kept hotter than the Cd reservoir, to prevent condensation of Cd from blocking the tube bore. A large-diameter Cd condensation region is provided at the cathode end of the discharge channel. A large-volume side-arm is also provided to act as a gas ballast for maintaining the correct He pressure. As He is lost through sputtering and diffusion through the Pyrex glass tube walls, it is replenished from a high-pressure He reservoir by heating a permeable glass wall separating the reservoir from the ballast chamber, which allows He to diffuse into the ballast chamber. Typical commercial sealed-off Cd lasers have operating lifetimes of several thousand hours. Overall laser construction is not that much more

Figure 8 Partial energy level diagram for helium and cadmium giving HeCd laser transitions.


Figure 9 Helium cadmium laser tube construction.

complex than that of a HeNe laser, hence unit costs are considerably lower than those of low-power argon ion lasers, which also provide output in the blue. Usually Brewster-angle windows are provided, together with a high-Q stable resonator (97–99% reflectivity output coupler), to provide polarized output. With their blue and UV wavelengths and relatively low cost (compared to low-power argon ion lasers), HeCd lasers have found wide application in science, medicine, and industry. Of particular relevance is their application in exposing photoresists, where the blue wavelength provides a good match to the peak photosensitivity of photoresist materials. A further key application is in stereolithography, where the UV wavelength is used to cure an epoxy resin. By scanning the UV beam in a raster pattern across the surface of a liquid epoxy, a solid three-dimensional object may be built up in successive layers.

List of Units and Nomenclature

Boltzmann constant, k [eV K⁻¹]
Excited metal ion, M⁺*
Excited noble gas atom, N*
Energy difference, ΔE [eV]
Electron, e⁻
Gas flow [mbar l min⁻¹]
Mirror curvatures, R₁, R₂ [m]
Metal atom, M
Noble gas atom, N
Noble gas ion, N⁺
Pulse duration [ns]
Pulse repetition frequency [kHz]
Quality factor, Q
Temperature, T [K]
Tube diameter, D [mm]
Tube length, L [m]
Wavelength [nm], [μm]
Unstable resonator magnification, M

See also

Nonlinear Sources: Harmonic Generation in Gases.

Further Reading

Little CE (1999) Metal Vapor Lasers. Chichester, UK: John Wiley.
Ivanov IG, Latush EL and Sem MF (1996) Metal Vapor Ion Lasers, Kinetic Processes and Gas Discharges. Chichester, UK: John Wiley.
Little CE and Sabotinov NV (eds) (1996) Pulsed Metal Vapor Lasers. Dordrecht, The Netherlands: Kluwer.
Lyabin NA, Chursin AD, Ugol'nikov SA, Koroleva ME and Kazaryan MA (2001) Development, production, and application of sealed-off copper and gold vapor lasers. Quantum Electronics 31: 191–202.
Petrash GG (ed.) (1989) Metal Vapor and Metal Halide Vapor Lasers. Commack, NY: Nova Science Publishers.
Silfvast WT (1996) Laser Fundamentals. New York: Cambridge University Press.

Noble Gas Ion Lasers

W B Bridges, California Institute of Technology, Pasadena, CA, USA

© 2005, Elsevier Ltd. All Rights Reserved.

History

The argon ion laser was discovered in early 1964, and is still commercially available in 2004, with about $70 million in annual sales 40 years after this discovery. The discovery was made independently and nearly simultaneously by four different groups; for three of the four, it was an accidental result of

studying the excitation mechanisms in the mercury ion laser (historically, the first ion laser), which had been announced only months before. For more on the early years of ion laser research and development, see the articles listed in the Further Reading section at the end of this article. The discovery was made with pulsed gas discharges, producing several wavelengths in the blue and green portions of the spectrum. Within months, continuous operation was demonstrated, as well as oscillation on many visible wavelengths in ionized krypton and xenon. Within a year, over 100 wavelengths were observed to oscillate in the ions


of neon, argon, krypton, and xenon, spanning the spectrum from ultraviolet to infrared; oscillation was also obtained in the ions of other gases, for example, oxygen, nitrogen, and chlorine. The most complete listing of all wavelengths observed in gaseous ion lasers is given in the Laser Handbook cited in the Further Reading section. Despite the variety of materials and wavelengths demonstrated, however, it is the argon and krypton ion lasers that have received the most development and utilization. Continuous ion lasers utilize high-current-density gas discharges, typically 50 A or more in a bore 2–5 mm in diameter. Gas pressures of 0.2 to 0.5 torr result in longitudinal electric fields of a few V/cm of discharge, so that the power dissipated in the discharge is typically 100 to 200 W/cm. Such high power dissipation required major technology advances before long-lived practical lasers became available. Efficiencies have never been high, ranging from 0.01% to 0.2%. A typical modern ion laser may produce 10 W output at 20 kW input power from 440 V three-phase power lines, and require 6–8 gallons per minute of cooling water. Smaller, air-cooled ion lasers, requiring 1 kW of input power from 110 V single-phase mains, can produce 10–50 mW output power, albeit at even lower efficiency.

Theory of Operation

The strong blue and green lines of the argon ion laser originate from transitions between the 4p upper levels and 4s lower levels in singly ionized argon, as shown

in Figure 1. The 4s levels decay radiatively to the ion ground state. The strongest of these laser lines are listed in Table 1. The notation used for the energy levels is that of the L–S coupling model. The ion ground state electron configuration is 3s²3p⁵ (²P°₃/₂). The inner ten electrons have the configuration 1s²2s²2p⁶, but this is usually omitted for brevity. The excited states shown in Figure 1 result from coupling a 4p or 4s electron to a 3s²3p⁴(³P) core; the resulting quantum numbers S (the net spin of the electrons), L (the net orbital angular momentum of the electrons), and J (the angular momentum resulting from coupling S to L) are represented by the term symbol ^{2S+1}L_J, where L = 0, 1, 2, … is denoted S, P, D, F, … The superscript 'o' denotes an odd level, while even levels omit the superscript. Note that some weaker transitions involving levels originating from the 3s²3p⁴(¹D) core configuration also oscillate. Note also that the quantum mechanical selection rules for the L–S coupling model are not rigorously obeyed, although the stronger laser lines satisfy most or all of these rules. The selection rule |ΔJ| = 1 or 0, but not J = 0 → J = 0, is always obeyed. All transitions shown in Figure 1 and Table 1 belong to the second spectrum of argon, denoted Ar II. Lines originating from transitions in the neutral atom make up the first spectrum, Ar I; lines originating from transitions in doubly ionized argon are denoted Ar III, and so forth for even more highly ionized states. Much work has been done to determine the mechanisms by which the inverted population is formed in the argon ion laser. Reviews of this extensive research

Figure 1 4p and 4s doublet levels in singly ionized argon, showing the strongest blue and green laser transitions.
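The term-symbol notation and the |ΔJ| selection rule described above are simple enough to encode directly. The fragment below is an illustrative sketch (not from any spectroscopy package); it rebuilds the ^{2S+1}L_J symbol of the 487.986 nm upper level from Table 1 and applies the ΔJ rule to that transition:

```python
from fractions import Fraction

L_LETTERS = "SPDFGHI"  # L = 0, 1, 2, ... -> S, P, D, F, ...

def term_symbol(s, l, j, odd=False):
    """Build the 2S+1 L J term symbol described in the text."""
    parity = "o" if odd else ""
    return f"{int(2 * s + 1)}{L_LETTERS[l]}{parity}{j}"

def dj_allowed(j_upper, j_lower):
    """|dJ| = 1 or 0 is allowed, except J = 0 -> J = 0."""
    dj = abs(j_upper - j_lower)
    return dj in (0, 1) and not (j_upper == 0 and j_lower == 0)

# The 487.986 nm upper level: S = 1/2, L = 2 (D), J = 5/2, odd parity.
print(term_symbol(Fraction(1, 2), 2, Fraction(5, 2), odd=True))  # 2Do5/2

# 4p 2Do5/2 -> 4s 2P3/2 (487.986 nm): |dJ| = 1, so the rule is satisfied.
print(dj_allowed(Fraction(5, 2), Fraction(3, 2)))  # True
```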


Table 1  Ar II laser blue–green wavelengths

Wavelength (nm)   Transition^a (upper level) → (lower level)   Relative strength^b (W)
454.505           4p ²P°₃/₂ → 4s ²P₃/₂                         0.8
457.935           4p ²S°₁/₂ → 4s ²P₁/₂                         1.5
460.956           (¹D)4p ²F°₇/₂ → (¹D)4s ²D₅/₂                 –
465.789           4p ²P°₁/₂ → 4s ²P₃/₂                         0.8
472.686           4p ²D°₃/₂ → 4s ²P₃/₂                         1.3
476.486           4p ²P°₃/₂ → 4s ²P₁/₂                         3.0
487.986           4p ²D°₅/₂ → 4s ²P₃/₂                         8.0
488.903           4p ²P°₁/₂ → 4s ²P₁/₂                         ^c
496.507           4p ²D°₃/₂ → 4s ²P₁/₂                         3.0
501.716           (¹D)4p ²D°₅/₂ → 3d ²D₃/₂                     1.8
514.179           (¹D)4p ²F°₇/₂ → 3d ²D₅/₂                     ^c
514.532           4p ⁴D°₅/₂ → 4s ²P₃/₂                         10
528.690           4p ⁴D°₃/₂ → 4s ²P₁/₂                         1.8

^a All levels are denoted in L–S coupling with the (³P) core unless otherwise indicated. Odd parity is denoted by the superscript 'o'.
^b Relative strengths are given as power output from a commercial Spectra-Physics model 2080-25S argon ion laser.
^c These lines may oscillate simultaneously with the nearby strong line, but are not resolved easily in the output beam.

Figure 2 Schematic representation of energy levels in neutral and singly ionized argon, indicating alternative pathways for excitation and de-excitation of the argon ion laser levels.

are found in the Further Reading section. While a completely quantitative picture of argon ion laser operation is lacking to this day, the essential processes are known. Briefly, the 4p upper levels are populated by three pathways, as illustrated in Figure 2:

(i) by electron collision with the 3p⁶ neutral ground state atoms. This 'sudden perturbation' process requires electrons of at least 37 eV, and also singles out the 4p ²P°₃/₂ upper level, which implies that only the 476 and 455 nm lines would oscillate. This is the behavior seen in pulsed discharges at very low pressure and very high axial electric field. This pathway probably contributes little to the 4p population under ordinary continuous wave (cw) operating conditions, however.

(ii) by electron collision from the lowest-lying s and d states in the ion, denoted M(s,d) in Figure 2. This requires only 3–4 eV electrons. These states have parity-allowed transitions to the ion ground state, but are made effectively metastable by radiation trapping (that is, there is a high probability that an emitted photon is re-absorbed by another ground state ion before it escapes the discharge region), or by requiring |ΔJ| to be 2 to make the transition (forbidden by the quantum selection rules). Thus, these levels are both created and destroyed primarily by electron collision, causing the population of the M(s,d) states to follow the population of the singly ionized ground state, which, in turn, is approximately proportional to the discharge current. Since a second electron collision is required to get from the M(s,d) states to the 4p states, a quadratic variation of laser output power with current would be expected, and that is what is observed over some reasonable range of currents between threshold and saturation. Note that radiative decay from higher-lying opposite-parity p and f states, denoted X(p,f), can also contribute to the population of M(s,d), but the linear variation of the M(s,d) population with discharge current is assured by electron collision creation and destruction.

(iii) by radiative decay from higher-lying opposite-parity s and d states, denoted C(s,d). These states are populated by electron collision with the 3p⁵ ion ground states, and thus have populations that also vary quadratically with discharge current. The contribution of this cascade process has been measured to be 20% to 50% of the 4p upper laser level population.

Note that it is not possible to distinguish between processes (ii) and (iii) by the variation of output power with discharge current; both give the observed quadratic dependence. The radiative lifetimes of the 4s ²P levels are sufficiently short to depopulate the lower laser levels by radiative decay. However, radiation trapping greatly lengthens this decay time, and a bottleneck


can occur. In pulsed ion lasers, this is exhibited by the laser pulse terminating before the excitation current pulse ends, which would seem to preclude continuous operation. However, the intense discharge used in continuous operation heats the ions to the order of 2300 K, thus greatly Doppler broadening the absorption linewidth and reducing the magnitude of the absorption. Additionally, the ions are attracted to the discharge tube walls, so the absorption spectrum is further broadened by the Doppler shift due to their wall-directed velocities. The plasma wall sheath gives about a 20 V drop in potential from the discharge axis to the tube wall, so most ions hit the wall with 20 eV of energy, corresponding to about ten times their thermal velocity. A typical cw argon ion laser operates at ten times the gas pressure that is optimum for a pulsed laser, and thus the radiation trapping of the 4s → 3p⁵ transitions is so severe that it may take several milliseconds after discharge initiation for laser oscillation to begin. As the discharge current is increased, the intensities of the blue and green lines of Ar II eventually saturate, and then decrease with further current. At these high currents, there is a buildup in the population of doubly ionized atoms, and some lines of Ar III can be made to oscillate with the appropriate ultraviolet mirrors. Table 2 lists the strongest of these lines, those that are available in the largest commercial lasers. Again, there is no quantitative model for the performance in terms of the discharge parameters, but the upper levels are assumed to be populated by processes analogous to those of the Ar II laser.
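The 'about ten times their thermal velocity' figure follows from elementary kinematics, since speed scales as the square root of energy. A minimal sketch (standard constants only; nothing here beyond the 2300 K and 20 eV figures is taken from the article):

```python
import math

K_B = 1.381e-23   # Boltzmann constant (J/K)
EV = 1.602e-19    # one electronvolt (J)

e_thermal_ev = K_B * 2300.0 / EV       # ion temperature from the text
print(e_thermal_ev)                    # ~0.2 eV characteristic thermal energy

# 20 eV gained falling through the wall sheath, versus ~0.2 eV thermal:
print(math.sqrt(20.0 / e_thermal_ev))  # ~10, i.e. about ten times the
                                       # thermal velocity, as stated above
```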

At still higher currents, lines in Ar IV can be made to oscillate as well. Much less research has been done on neon, krypton, and xenon ion lasers, but it is a good assumption that the population and depopulation processes are the same in these lasers. Table 3 lists both the Kr II and Kr III lines that are available from the largest commercial ion lasers. Oscillation on lines in still-higher ionization states in both krypton and xenon has been observed.

Operating Characteristics A typical variation of output power with discharge current for an argon ion laser is shown in Figure 3. This particular laser had a 4 mm diameter discharge in a water-cooled silica tube, 71 cm in length, with a 1 kG axial magnetic field. The parameter is the argon pressure in the tube before the discharge was struck. Note that no one curve is exactly quadratic, but that the envelope of the curves at different filling pressures is approximately quadratic. At such high discharge current densities (50 A in the 4 mm tube is approximately 400 A/cm2) there is substantial pumping of gas out of the small-bore discharge region. Indeed, a return path for this pumped gas must be provided from anode to cathode ends of the discharge to keep the discharge from self-extinguishing. The axial electric field in this discharge was 3– 5 V/cm, so the input power was of the order of 10 to 20 kW, yielding an efficiency of less than 0.1%, an unfortunate characteristic of all ion lasers.
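The current density, input power, and efficiency quoted for this discharge follow from straightforward circuit arithmetic, reproduced in the sketch below (illustrative helpers of our own; the 10 W output used in the last line is the typical figure quoted earlier):

```python
import math

def current_density(i_amp, bore_diameter_mm):
    """Axial current density (A/cm^2) in a cylindrical bore."""
    r_cm = bore_diameter_mm / 20.0
    return i_amp / (math.pi * r_cm ** 2)

def input_power(e_field_v_cm, length_cm, i_amp):
    """Discharge input power (W) = axial field x length x current."""
    return e_field_v_cm * length_cm * i_amp

print(current_density(50, 4))          # ~400 A/cm^2
print(input_power(3, 71, 50))          # ~10.7 kW at 3 V/cm
print(input_power(5, 71, 50))          # ~17.8 kW at 5 V/cm
print(10.0 / input_power(5, 71, 50))   # <0.1% efficiency for ~10 W output
```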

Table 2  Ultraviolet argon ion laser wavelengths

Wavelength (nm)   Spectrum   Transition^a (upper level) → (lower level)   Relative strength^b (W)
275.392           III        (²D°)4p ¹D₂ → (²D°)4s ¹D°₂                   0.3
275.6             ?          ?                                            0.02
300.264           III        (²P°)4p ¹P₁ → (²P°)3d ¹D°₂                   0.5
302.405           III        (²P°)4p ³D₃ → (²P°)4s ³P°₂                   0.5
305.484           III        (²P°)4p ³D₂ → (²P°)4s ³P°₁                   0.2
333.613           III        (²D°)4p ³F₄ → (²D°)4s ³D°₃                   0.4
334.472           III        (²D°)4p ³F₃ → (²D°)4s ³D°₂                   0.8
335.849           III        (²D°)4p ³F₂ → (²D°)4s ³D°₁                   0.8
350.358           III        (²D°)4p ³D₂ → (²D°)4s ³D°₂                   0.05
350.933           III        (⁴S°)4p ³P₀ → (⁴S°)4s ³S°₁                   0.05
351.112           III        (⁴S°)4p ³P₂ → (⁴S°)4s ³S°₁                   2.0
351.418           III        (⁴S°)4p ³P₁ → (⁴S°)4s ³S°₁                   0.7
363.789           III        (²D°)4p ¹F₃ → (²D°)4s ¹D°₂                   2.5
379.532           III        (²P°)4p ³D₃ → (²P°)3d ³P°₂                   0.4
385.829           III        (²P°)4p ³D₂ → (²P°)3d ³P°₁                   0.15
390.784           III        (²P°)4p ³D₁ → (²P°)3d ³P°₀                   0.02
408.904           IV?        ?                                            0.04
414.671           III        (²D°)4p ³P₂ → (²P°)4s ³P°₂                   0.02
418.298           III        (²D°)4p ¹P₁ → (²D°)4s ¹D°₂                   0.08

^a All levels are denoted in L–S coupling with the core shown in ( ). Odd parity is denoted by the superscript 'o'.
^b Relative strengths are given as power output from a commercial Spectra-Physics model 2085-25S argon ion laser.


Table 3  Krypton ion laser wavelengths

Wavelength (nm)   Spectrum   Transition^a (upper level) → (lower level)   Relative strength^b (W)
337.496           III        (²P°)5p ³D₃ → (²P°)5s ³P°₂                   –
350.742           III        (⁴S°)5p ³P₂ → (⁴S°)5s ³S°₁                   1.5
356.432           III        (⁴S°)5p ³P₁ → (⁴S°)5s ³S°₁                   0.5
406.737           III        (²D°)5p ¹F₃ → (²D°)5s ¹D°₂                   0.9
413.133           III        (⁴S°)5p ⁵P₂ → (⁴S°)5s ³S°₁                   1.8
415.444           III        (²D°)5p ³F₃ → (²D°)5s ¹D°₂                   0.3
422.658           III        (²D°)5p ³F₂ → (²D°)4d ³D°₁                   –
468.041           II         (³P)5p ²S°₁/₂ → (³P)5s ²P₁/₂                 0.5
476.243           II         (³P)5p ²D°₃/₂ → (³P)5s ²P₁/₂                 0.4
482.518           II         (³P)5p ⁴S°₃/₂ → (³P)5s ²P₁/₂                 0.4
520.832           II         (³P)5p ⁴P°₃/₂ → (³P)5s ⁴P₃/₂                 –
530.865           II         (³P)5p ⁴P°₅/₂ → (³P)5s ⁴P₃/₂                 1.5
568.188           II         (³P)5p ⁴D°₅/₂ → (³P)5s ²P₃/₂                 0.6
631.024           III        (²D°)5p ³P₂ → (²P°)4d ³D₁                    0.2
647.088           II         (³P)5p ⁴P°₅/₂ → (³P)5s ²P₃/₂                 3.0
676.442           II         (³P)5p ⁴P°₁/₂ → (³P)5s ²P₁/₂                 0.9
752.546           II         (³P)5p ⁴P°₃/₂ → (³P)5s ²P₁/₂                 1.2
793.141           II         (¹D)5p ⁴F°₇/₂ → (³P)4d ²F₅/₂                 0.3
799.322           II         (³P)5p ⁴P°₃/₂ → (³P)4d ⁴D₁/₂                 –

^a All levels are denoted in L–S coupling with the core shown in ( ). Odd parity is denoted by the superscript 'o'.
^b Relative strengths are given as power output from a commercial Spectra-Physics model 2080RS ion laser.

Technology

With such high input powers required in a small volume to produce several watts of output, ion laser performance has improved from 1964 to the present only as new discharge technologies were introduced. The earliest laboratory argon ion lasers used thin-walled (≈1 mm wall thickness) fused silica discharge tubes, cooled by flowing water over the outside wall of the tube. The maximum input power per unit length of discharge was limited by thermal stresses in the silica walls caused by the temperature differential from inside to outside. Typically, ring-shaped cracks would cause the tube to fail catastrophically. Attempts to make metal–ceramic structures with alumina (Al₂O₃) discharge tubes to contain the plasma were made early on (1965), but were not successful. Such tubes invariably failed by fracture from thermal shock as the discharge was turned on. Later, successful metal–ceramic tubes were made with beryllia (BeO), which has a much higher thermal conductivity than silica or alumina and is much more resistant to thermal shock. Today, all of the lower-power lasers (less than 100 mW output) are made with BeO discharge tubes. Some ion lasers in the 0.5 to 1 W range are also made with water-cooled BeO discharge tubes. A typical low-power air-cooled argon ion laser is shown in Figure 4. The large metal can (2) on the right end of the tube contains an impregnated-oxide hot cathode, heated directly by current through ceramic feed-through insulators (3). The small

(≈1 mm diameter) discharge bore runs down the center of the BeO ceramic rod (1), and several smaller-diameter gas return path holes run off-axis, parallel to the discharge bore, to provide the needed gas pressure equalization between the cathode can and the anode region. A copper honeycomb cooler is brazed to the cathode can, two more to the outer wall of the BeO cylinder, and one to the anode (4) at the left end of the tube. The laser mirrors (5) are glass-fritted to the ends of the tube, forming a good vacuum seal. Note that in this small laser, the active discharge bore length is less than half of the overall length. The very early argon ion lasers were made with simple smooth dielectric tubes, allowing the continuous variation in voltage along the length of the tube required by the discharge longitudinal electric field. However, this variation can be step-wise at the discharge walls and still be more or less smooth along the axis. Thus, the idea arose of using metal tube segments, insulated one from another, to form the discharge tube walls. The first version of this idea (1965) used short (≈1 cm) metal cylinders supported by metal disks and stacked inside a large-diameter silica envelope, with each metal cylinder electrically isolated from the others. The tubes and disks were made of molybdenum, and were allowed to heat to incandescence, thus radiating several kilowatts of heat through the silica vacuum envelope to a water-cooled collector outside. While this eliminated the problems of thermal shock and poor thermal conductivity inherent in dielectric-wall discharges, it made another


Figure 3 Laser output power (summed over all the blue and green laser lines) versus discharge current for a laser discharge 4 mm in diameter and 71 cm long, with a 1 kilogauss longitudinal magnetic field. The parameter shown is the argon fill pressure in mTorr prior to striking the discharge. The envelope of the curves exhibits the quadratic variation of output power with discharge current.

problem painfully evident. The intense ion bombardment of the metal tube walls sputtered the wall material, eventually eroding the shape of the metal cylinders and depositing metal films on the insulating wall material, thus shorting one segment to another. Many different combinations of materials and configurations were investigated in the 1960s and 1970s to find a structure that would offer good laser performance and long life. It was found that simple thin metal disks with a central hole would effectively confine the discharge to a small diameter (on the order of the hole size) if a longitudinal d-c magnetic

field of the order of 1 kilogauss were used. The spacing between disks can be as large as 2–4 discharge diameters. Of the metals, tungsten has the lowest sputtering yield for argon ions in the 20 eV range (which is approximately the energy they gain in falling to the wall across the discharge sheath potential difference). An even lower sputtering yield is exhibited by carbon, and graphite cylinders contained within a larger-diameter silica or alumina tube were popular for a while for ion laser discharges. Unfortunately, graphite has a tendency to flake or powder, so such laser discharge tubes became contaminated with 'dust' which could eventually find its way to the optical windows of the tube. Beryllia and silica also sputter under argon ion bombardment, but with still lower yields than metals or carbon; however, their smaller thermal conductivities limit them to lower-power applications. The material/configuration combination that has evolved for higher-power argon ion lasers today is a stack of thin tungsten disks with 2–3 mm diameter holes for the discharge. These disks, typically 1 cm in diameter, are brazed coaxially to a larger copper annulus (actually, a drawn cup with a 1 cm hole on its axis). A stack of these copper/tungsten structures is, in turn, brazed to the inside wall of a large-diameter alumina vacuum envelope. The tungsten disks are exposed to and confine the discharge, while the copper cups conduct heat radially outward to the alumina tube wall, which, in turn, is cooled by fluid flow over its exterior. Thus, the discharge is in contact only with a low-sputtering material (tungsten), while the heat is removed by a high-thermal-conductivity material (copper). Details differ among manufacturers, but this 'cool disk' technology seems to have won out in the end. A photo of a half-sectioned disk/cup stacked assembly from a Coherent Innova ion laser is shown in Figure 5. The coiled impregnated-tungsten cathode is also shown. Sputtering of the discharge tube walls is not uniform along the length of the gas discharge. The small-diameter region where the laser gain occurs is always joined to larger-diameter regions containing the cathode and anode electrodes (as shown in Figure 5, for example). A plasma double sheath (that is, a localized increase in potential) forms across the discharge in the transition region between the large- and small-diameter regions (the discharge 'throats', which may be abrupt or tapered). This sheath is required to satisfy the boundary conditions between the plasmas of different temperatures in the different regions. Such a double sheath imparts additional energy to the ions as they cross the sheath, perhaps


Figure 4 A typical commercial low-power, air-cooled argon ion laser. The cathode can (2) is at the right; the beryllia bore (1) and anode (4) are at the left. The cathode current is supplied through ceramic vacuum feed-throughs (3). Mirrors (5) are glass-fritted onto the ends of the vacuum envelope. (Photo courtesy of JDS Uniphase Corp.)

Figure 5 Discharge bore structure of a typical commercial high-power argon ion laser, the Coherent Innova. Copper cups are brazed to the inner wall of a ceramic envelope, which is cooled by liquid flow over its outer surface. A thin tungsten disk with a small hole defining the discharge path is brazed over a larger hole in the bottom of each copper cup. Additional small holes in the copper cup near the ceramic envelope provide a gas return path from cathode to anode. Also shown is the hot oxide-impregnated tungsten cathode. (Photo courtesy of Coherent, Inc.)

an additional 20 eV. When these ions eventually hit the discharge walls near the location of the sheath, they have 40 eV of energy rather than the 20 eV from the normal wall sheath elsewhere in the small-diameter discharge. The sputtering yield (the number of sputtered wall atoms per incident ion) is exponentially dependent on ion energy in this low-energy region, so the damage done to the wall in the vicinity of the 'throat' where the double sheath

forms may be more than ten times that done elsewhere in the small-diameter bore region. This was the downfall of high-power operation of dielectric discharge bores, even BeO: while sputtering was acceptable elsewhere in the discharge, the amount of material removed in a small region near the discharge throat would cause catastrophic bore failure at that point. This localized increase in sputtering in the discharge throat is common to all


ion lasers, including modern cooled tungsten disk tubes. Eventually, disks near the throat are eroded to larger diameters, and more material is deposited on the walls nearby. Attempts to minimize localized sputtering by tapering the throat walls or the confining magnetic field have proven unsuccessful; a localized double sheath always forms somewhere in the throat. Because of the asymmetry caused by ion flow, the double sheath is larger in amplitude in the cathode throat than in the anode throat, so the localized wall damage is larger at the cathode end of the discharge than at the anode end. Another undesirable feature of sputtering is that the sputtered material 'buries' some argon atoms when it is deposited on a wall. This is the basis of the well-known Vac-Ion vacuum pump. Thus, the operating pressure in a sealed laser discharge tube will drop during the course of operation. In low-power ion lasers, this problem is usually solved by making the gas reservoir volume large enough to satisfy the desired operating life (for example, the large cathode can in Figure 4). In high-power ion lasers, the gas loss would result in unacceptable operating life even with a large reservoir at the fill pressure. Thus, most high-power ion lasers have a gas pressure measurement system and a dual-valve arrangement connected to a small high-pressure reservoir, to 'burp' gas into the active discharge periodically and keep the pressure within the operating range. Unfortunately, if a well-used ion laser is left inoperative for months, some of the 'buried' argon tends to leak back into the tube, and the gas pressure becomes higher than optimum. The pressure will gradually decrease to its optimum value with further operation of the discharge (in perhaps tens of hours). In the extreme case, the gas pressure can rise far enough that the gas discharge will not strike, even at the maximum power supply voltage. Such a situation requires an external vacuum pump to remove the excess gas, usually a factory repair. In addition to simple dc discharges, various other techniques have been used to excite ion lasers, primarily in a search for higher power, improved efficiency, and longer operating life. Articles in the Further Reading section give references to these attempts. Radio-frequency excitation at 41 MHz was used in an inductively coupled discharge, with the laser bore and its gas return path forming a rectangular single turn of an air-core transformer. A commercial product using this technique was sold for a few years in the late 1960s. Since this was an 'electrode-less' discharge, ion laser lines in reactive gases such as chlorine could be made to oscillate, as well as the noble gases, without

'poisoning' the hot cathode used in conventional dc discharge lasers. A similar electrode-less discharge was demonstrated as a quasi-cw laser by using iron transformer cores and exciting the discharge with a 2.5 kHz square wave. Various microwave excitation configurations at 2.45 GHz and 9 GHz also resulted in ion laser oscillation. Techniques common to plasma fusion research were also studied. Argon ion laser oscillation was produced in Z-pinch and θ-pinch discharges, and also by high-energy (10–45 keV) electron beams. However, none of these latter techniques resulted in a commercial product, and all had efficiencies worse than the simple dc discharge lasers. It is interesting that the highest-power output demonstrations were made less than ten years after the discovery. Before 1970, 100 W output on the argon blue-green lines was demonstrated with dc discharges two meters in length. In 1970, a group in the Soviet Union reported 500 W blue-green output from a two-meter discharge with 250 kW of dc power input, an efficiency of 0.2%. Today, the highest-output-power argon ion laser offered for sale produces 50 W.

Manufacturers

More than 40 companies have manufactured ion lasers for sale over the past four decades. This field has now (2004) narrowed to the following:

Coherent, Inc.: http://www.coherentinc.com
INVERSion Ltd.: http://inversion.iae.nsk.su
JDS Uniphase: http://www.jdsu.com
Laser Physics, Inc.: http://www.laserphysics.com
Laser Technologies GmbH: http://www.lg-lasertechnologies.com
LASOS Lasertechnik GmbH: http://www.LASOS.com
Lexel Laser, Inc.: http://www.lexellaser.com
Melles Griot: http://lasers.mellesgriot.com
Spectra-Physics, Inc.: http://www.spectraphysics.com

Further Reading

Bridges WB (1979) Atomic and ionic gas lasers. In: Marton L and Tang CL (eds) Methods of Experimental Physics, Vol. 15: Quantum Electronics, Part A, pp. 31–166. New York: Academic Press.
Bridges WB (1982) Ionized gas lasers. In: Weber MJ (ed.) Handbook of Laser Science and Technology, Vol. II, Gas Lasers, pp. 171–269. Boca Raton, FL: CRC Press.
Bridges WB (2000) Ion lasers – the early years. IEEE Journal of Selected Topics in Quantum Electronics 6: 885–898.
Davis CC and King TA (1975) Gaseous ion lasers. In: Goodwin DW (ed.) Advances in Quantum Electronics, vol. 3, pp. 169–469. New York: Academic Press.
Dunn MH and Ross JN (1976) The argon ion laser. In: Sanders JH and Stenholm S (eds) Progress in Quantum Electronics, vol. 4, pp. 233–269. New York: Pergamon.
Weber MJ (2000) Handbook of Laser Wavelengths. Boca Raton, FL: CRC Press.

Optical Fiber Lasers

G E Town, Macquarie University, NSW, Australia
N N Akhmediev, Australian National University, ACT, Australia

© 2005, Elsevier Ltd. All Rights Reserved.

Optical fiber lasers were first demonstrated in the 1960s, and since then have developed to become versatile optical sources with many desirable properties. Aided by developments in associated technologies, such as fiber design and fabrication methods, semiconductor pump diode technology, and fiber-coupled and in-fiber components such as Bragg grating filters, optical fiber lasers now compete with other laser technologies in many applications, from telecommunications to materials processing. An optical fiber laser is fundamentally an optical oscillator, which converts input pump power to coherent optical output power at one or more well-defined wavelengths. Optical oscillators require two basic elements: optical gain, and optical feedback. For sustained oscillation to occur, the round-trip gain in the laser cavity must be unity, and the round-trip phase a multiple of 2π. In optical fiber lasers the optical gain is provided by an optical fiber amplifier using one or a combination of fundamental physical processes, e.g., stimulated emission, stimulated scattering, or nonlinear parametric processes. Early fiber amplifiers used stimulated Raman scattering to produce optical gain; however, rare-earth-doped fiber amplifiers, in which gain is provided by stimulated emission, are now more common. The reader is referred to Optical Amplifiers: Erbium Doped Fiber Amplifiers for Lightwave Systems, for more details on doped fiber amplifiers, and to Scattering: Stimulated Scattering, Nonlinear Optics, Applications: Raman Lasers, Optical Parametric Devices: Optical Parametric Oscillators (Pulsed), and Scattering: Raman Scattering, for more details on stimulated scattering and parametric gain processes. Optical feedback may be provided in two fundamental ways, e.g., by using a closed ring of fiber, or from reflections from nonuniformities and

discontinuities in the waveguide such as in Bragg grating filters or at fiber ends. The reader is referred to Fiber Gratings for more details on fiber Bragg gratings. The main advantages of optical fiber lasers are derived from the confinement of the pump and signal in a small optical waveguide. In contrast with bulk lasers, the pump intensity in the fiber waveguide is largely independent of the laser length, resulting in large amplifier gain and low laser threshold, even for gain media with small absorption and emission crosssections. The large gain enables lossy elements such as optical fiber-coupled devices and bulk optical elements to be incorporated into the laser cavity, providing additional control over the optical signal being generated. To obtain a large gain optical fiber amplifiers must usually be relatively long (i.e., from several centimeters to many meters), and so the linear and nonlinear properties of the fiber waveguide can have a significant influence on the optical signal being generated, resulting in some interesting and useful phenomena, e.g., as in soliton fiber lasers. The reader is referred to Fiber and Guided Wave Optics: Optical Fiber Cables, for fiber-based components, to Fiber and Guided Wave Optics: Dispersion, Light Propagation, and Nonlinear Effects (Basics), for a review of the linear and nonlinear properties of optical fibers, and to Solitons: Soliton Communication Systems, and Temporal Solitons, for the theory and applications of temporal solitons. Whilst optical fibers may be fabricated in a range of materials, including polymer and crystalline materials, most fibers and fiber lasers are currently made of silica containing rare-earth ion dopants – see Fiber and Guided Wave Optics: Fabrication of Optical Fiber, for details. Lasers in silica optical fiber have much in common with other glass-host lasers, including a wide range of potential pump and lasing wavelengths, and broadband absorption and gain due to homogeneous and inhomogeneous broadening of the lasing energy levels in the amorphous glass host. For example, erbium-doped amplifiers in alumino-silicate glass fibers at room temperature have lasing transitions with homogeneous and


Table 1  Commonly used pump and signal transitions in rare-earth doped fiber lasers

Rare earth dopant   Pump wavelengths [nm]     Signal wavelengths [nm]              Excited state lifetime [ms]   Energy levels
Praseodymium        480, 585                  885, 1080                            0.1                           4
Neodymium           590, 800                  920 (900–950); 1060 (1055–1140)      0.5                           3; 4
Samarium            488                       651                                  1.5                           4
Holmium             455, 650, 1150            2040                                 0.6                           3
Erbium              800, 975, 1480            1550 (1530–1610)                     10                            3
Thulium             790, 1210 (1060–1320)     1480 (1460–1520); 1850 (1650–2050)   0.3                           3
Ytterbium           920 (840–980)             975; 1040 (1010–1160)                0.8                           3; 4

Figure 1 Example of a simple all-fiber optical oscillator. The rare-earth doped fiber provides optical gain, and the Bragg gratings provide narrowband optical feedback.

inhomogeneous linewidths of several nanometers, and can be optimized to provide around 30 dB gain over an 80 nm bandwidth centered around 1570 nm. The large gain bandwidth is useful for achieving wide tunability of the lasing wavelength and/or ultrashort pulse generation. Table 1 summarizes the most commonly used pump and signal transitions for a range of rare-earth dopants in silicate optical fibers. Other desirable properties of silica fiber lasers include high efficiency, excellent thermal dissipation, substantial energy storage, high power capability, and compatibility with a wide range of optical fiber devices and systems. The characteristics of optical fiber lasers can be optimized for some very different applications, ranging from narrow-linewidth, low-noise lasers to broadband pulsed lasers with high energy and/or high peak power. In the following sections the fundamentals of laser theory as applied to fiber lasers will be reviewed, highlighting the reasons underlying such versatility.
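One immediate use of the pump and signal wavelengths in Table 1 is to estimate the quantum-limited conversion efficiency of each transition, since at most one signal photon can be produced per absorbed pump photon. A minimal illustrative sketch:

```python
def quantum_limit(pump_nm, signal_nm):
    """Maximum conversion efficiency = photon energy ratio
    = pump wavelength / signal wavelength."""
    return pump_nm / signal_nm

# Pump/signal pairs taken from Table 1:
print(quantum_limit(975, 1550))   # Er: ~0.63
print(quantum_limit(920, 1040))   # Yb: ~0.88
print(quantum_limit(800, 1060))   # Nd: ~0.75
```

This photon-energy ratio reappears below as the intrinsic quantum efficiency ν_S/ν_P that bounds the slope efficiency of a fiber laser.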

Fiber Laser Fundamentals

An example of a simple all-fiber laser is shown schematically in Figure 1. The laser contains the basic elements of optical gain (e.g., a rare-earth doped fiber amplifier) and optical feedback (e.g., a pair of Bragg grating filters). In the laser shown, the two Bragg gratings provide optical feedback only over a narrow range of wavelengths, which further restricts the

range of possible lasing wavelengths. One grating must have a reflectivity less than unity to allow a proportion of the lasing signal to be coupled out of the laser cavity. At threshold, the gain in the amplifier exactly compensates for the loss in the output coupler. The optical pump, typically from a semiconductor laser diode, is absorbed as it propagates along the fiber amplifier, e.g., by rare-earth ion dopants, which are raised to an excited state. The absorbed energy is stored and eventually emitted at a longer wavelength by either spontaneous or stimulated emission. Spontaneous emission produces unwanted noise and is associated with a finite excited state lifetime, whilst stimulated emission produces optical gain provided the gain medium is 'inverted', i.e., with more active ions in the upper lasing energy level than in the lower lasing energy level. In practice some energy is also lost to nonradiative (thermal) emission, e.g., in the transfer of energy between the pump and upper lasing energy levels. In an ideal gain medium the nonradiative transfer rate between the pump and upper lasing energy levels is much larger than the spontaneous or stimulated emission rates, so the inversion may usually be assumed to be independent of the nonradiative transfer rates. Energy level diagrams for three- and four-level atoms are shown in Figure 2. The local optical gain coefficient (i.e., power gain per unit length, in nepers) in a fiber amplifier is proportional to the local inversion, i.e., g(z) ∝ σ_S ΔN(z), where σ_S is the emission cross-section at the


Figure 2 Energy level diagrams for three- and four-level laser transitions. W represents the probability of an optical transition per unit time associated with stimulated absorption or emission, and 1/τ is the rate of spontaneous emission. Dotted lines represent nonradiative transitions (e.g., between the pump and upper lasing level).

signal wavelength, and ΔN(z) is the difference in population density between the upper and lower lasing energy levels (ΔN > 0 if the gain medium is inverted). The local inversion can be calculated by solving the rate equations including all stimulated and spontaneous transitions between the laser energy levels. The end-to-end amplifier power gain, G = P_S(L_A)/P_S(0), in a length L_A of amplifying fiber is then

G = exp[ ∫₀^{L_A} g(z) dz ]    [1]

which in decibels is

G(dB) = 4.34 ∫₀^{L_A} g(z) dz

A large gain coefficient is desirable for amplification with minimal added noise (i.e., a low noise figure), and a large gain per unit pump power is desirable for low-threshold lasing. For a given gain medium the only factor usually accessible to control the inversion, and hence the gain, is the pump rate, i.e., the rate at which pump energy is absorbed by the gain medium, which for an unbleached amplifier is proportional to the pump intensity. For example, in an ideal four-level unsaturated laser medium with uniform pump intensity, the small-signal gain is approximately

∫₀^{L_A} g(z) dz ≈ σ_S P_Pabs τ / (hν_P A)    [2]

in which τ is the lifetime of the upper lasing level, P_Pabs is the pump power absorbed in the fiber amplifier with cross-sectional area A, h is Planck's constant, and ν_P is the pump frequency. For a three-level laser under the same conditions, the small-signal gain is

∫₀^{L_A} g(z) dz ≈ −σ_S N L_A + 2σ_S P_Pabs τ / (hν_P A)    [3]

in which the first term corresponds to ground state absorption of the signal, where N is the dopant ion density, and the factor of 2 in the second term arises because in an ideal three-level system every absorbed pump photon increases the inversion, ΔN, by two. Equation [3] can be approximated by multiplying [2] by the factor (P_Pabs − P_Psat)/(P_Pabs + P_Psat), where P_Psat = hν_P A/(σ_P τ) is the pump power which must be absorbed to reduce the pump absorption coefficient to half its unpumped value (or the signal absorption coefficient to zero), and σ_P is the pump absorption cross-section. Unlike four-level lasers, in which gain is available as soon as pump power is applied, three-level lasers require half the active ions to be excited before the inversion becomes positive and gain is produced. The main advantages of optical fiber amplifiers are now clear: in an optical fiber waveguide the light is confined in a very small cross-sectional area, typically less than ten microns in diameter. Consequently, even a small coupled pump power can have a large intensity and produce a large gain coefficient, or a low laser threshold. Even if the pump absorbed per unit length in the fiber amplifier is small (e.g., due to low dopant concentration or a low absorption coefficient), this can often be compensated by using a long length of fiber with small intrinsic loss. Furthermore, rare-earth dopants have relatively long metastable state lifetimes, τ, which further assists in producing a large inversion throughout the fiber, and hence a large overall gain or low lasing threshold, even at low pump powers. For example, a typical four-level neodymium-doped fiber amplifier providing gain at λ_S = 1.06 μm, pumped at λ_P = 0.8 μm, with core diameter 7 μm, τ = 0.5 ms, σ_S = 1.4 × 10⁻²⁰ cm², and σ_P = 23 × 10⁻²¹ cm², could theoretically provide a small-signal gain of 0.3 dB for every milliwatt of absorbed pump power. Whilst the above analysis highlights the main advantages of fiber amplifiers, it is approximate in a number of respects. For example, eqns [2] and [3] assume both pump and signal have a uniform intensity distribution across the gain medium, and completely overlap. In single-mode optical fibers the intensity distribution across the fiber core is not uniform but approximately Gaussian, with a width and effective area that depend on the numerical aperture of the waveguide and the wavelength. Consequently the overlap of the pump and signal beams with the doped fiber core, and with each other, is less than 100%, and the pump and signal intensities are higher in the center of the fiber than at the core-cladding boundary. The nonuniform distribution of pump and signal beams slightly modifies both the gain efficiency and gain saturation behavior


(i.e., the dependence of gain on signal power) in fiber amplifiers. More accurate analysis would also take into account factors such as the rare-earth dopant distribution (which may not be uniform), the variation of pump and signal intensities along the fiber, spectral variation in the absorption and emission cross-sections, spatial and spectral hole-burning in the gain medium, degeneracies in the energy levels, excited state absorption and other loss mechanisms, temperature, and spontaneous emission noise.
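The 0.3 dB per milliwatt figure in the neodymium example above can be reproduced directly from eqn [2]. The sketch below (our own illustrative code, not a library routine) evaluates the small-signal gain for the stated fiber parameters:

```python
import math

H = 6.626e-34   # Planck constant (J s)
C = 2.998e8     # speed of light (m/s)

def four_level_gain_db(p_abs_w, tau_s, sigma_s_cm2, core_um, pump_nm):
    """Small-signal gain (dB) from eqn [2]:
    integral of g dz ~ sigma_S P_Pabs tau / (h nu_P A), times 4.34 for dB."""
    nu_p = C / (pump_nm * 1e-9)                       # pump frequency (Hz)
    area_cm2 = math.pi * (core_um * 1e-4 / 2.0) ** 2  # core area (cm^2)
    g_nepers = sigma_s_cm2 * p_abs_w * tau_s / (H * nu_p * area_cm2)
    return 4.34 * g_nepers

# Nd-doped fiber example from the text: 7 um core, tau = 0.5 ms,
# sigma_S = 1.4e-20 cm^2, pump at 0.8 um, 1 mW of absorbed pump:
print(four_level_gain_db(1e-3, 0.5e-3, 1.4e-20, 7.0, 800.0))  # ~0.32 dB
```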

Continuous Wave Fiber Lasers

The ideal continuous wave (cw) laser converts input pump power with low coherence into a highly coherent optical output signal which is constant in amplitude and wavelength, with the spectral purity of the output (i.e., its linewidth) limited only by cavity losses. Practical continuous wave lasers may be characterized by four parameters: the pump power required for the onset of oscillation (i.e., the threshold); the conversion efficiency of pump to signal power above threshold; the peak wavelength of the optical output; and the spectral width of the optical output. Continuous wave fiber lasers often have a low threshold, high efficiency, and narrow linewidth relative to other types of laser. Output powers approaching 100 watts have been achieved using specialized techniques.

The threshold power is determined by the total cavity loss and the gain efficiency (i.e., gain per unit of absorbed pump power) of the fiber amplifier. For example, for the laser shown in Figure 1 the internal losses are minimal, and the pump power required to reach threshold may be calculated by setting the product of the round-trip small-signal gain and the output mirror reflectivity, G²R, equal to unity. For this configuration, and using eqns [1] and [2] for a four-level laser, the pump power which must be absorbed in the amplifier for the laser to reach threshold is

\[ P_{\mathrm{th}} = \alpha_{\mathrm{cav}}\, L_C\, \frac{h\,\nu_P\, A}{\sigma_S\, \tau} \qquad [4] \]

in which L_C is the cavity length and α_cav = α_int + (1/(2L_C)) ln(1/R) is the total cavity loss per unit length, comprising both internal losses, α_int, and outcoupling losses through a mirror with reflectivity R. For example, if the internal losses were negligible, the transmittance T = 1 − R of the output reflector in the laser shown in Figure 1 was 50% (i.e., 3 dB), and the other fiber parameters were the same as above (i.e., G = 0.3 dB/mW), then the laser threshold would be 5 mW. In three-level fiber lasers, ground-state absorption of the signal is usually the dominant loss mechanism which must be overcome before a net gain is produced; hence the threshold of three-level lasers is given approximately by eqn [4] with α_int = σ_S N L_A/(2L_C), and increases with amplifier length. In either case the pump power required at the input of the laser to reach threshold, P_P(0), may be approximated using the relation P_Pabs = P_P(0) − P_P(L_A) ≈ P_P(0)[1 − exp(−σ_P N L_A)].

For either three- or four-level lasers pumped above threshold, the inversion of the gain medium remains at its threshold level, since the internal round-trip gain must remain unity; however, a coherent optical output signal at the peak emission wavelength grows from noise until its amplitude is limited by saturation of the amplifier gain. Provided the output coupling loss is not too large (e.g., T < 60%), so that the total signal intensity is approximately constant along the whole length of the fiber amplifier, it can be shown that changes in pump power above threshold cause a proportional change in the signal power coupled out of the Fabry–Perot cavity in one direction, P⁺_Sout, i.e.,

\[ P^{+}_{\mathrm{Sout}} = \frac{(1-R)}{\delta_0}\,\frac{\nu_S}{\nu_P}\,\bigl(P_{\mathrm{Pabs}} - P_{\mathrm{th}}\bigr) \qquad [5] \]

in which δ₀ = 2α_int L + ln(1/R) represents the total round-trip loss at threshold. The relationship between absorbed pump power and output signal power is shown schematically in Figure 3. The constant of proportionality is known as the slope or conversion efficiency, defined as η_S = P_Sout/(P_Pabs − P_th). In both three- and four-level fiber lasers with small intrinsic losses (α_int ≈ 0) and small output coupling (i.e., R ≈ 1), the slope efficiency can approach the intrinsic quantum efficiency, ν_S/ν_P.

Figure 3 Representative plot of laser output power versus absorbed pump power.
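To make these expressions concrete, the following minimal sketch (Python, illustrative only) reproduces the 5 mW threshold from the condition G²R = 1 and estimates the slope efficiency from eqn [5] under the stated idealization of negligible internal loss; the gain efficiency and wavelengths are those of the neodymium-doped example given earlier.

    import math

    G_dB_per_mW = 0.3           # small-signal gain efficiency from the example
    R = 0.5                     # output mirror reflectivity (T = 50%)

    # threshold: round-trip gain G^2 * R = 1, i.e. 2*G_dB = 10*log10(1/R)
    gain_needed_dB = 10 * math.log10(1 / R) / 2
    P_th = gain_needed_dB / G_dB_per_mW
    print(f"threshold ~ {P_th:.1f} mW of absorbed pump")      # ~5 mW

    # slope efficiency from eqn [5] with alpha_int ~ 0, delta_0 = ln(1/R)
    lam_P, lam_S = 0.8e-6, 1.06e-6
    delta_0 = math.log(1 / R)
    eta_S = (1 - R) / delta_0 * (lam_P / lam_S)  # nu_S/nu_P = lam_P/lam_S
    print(f"slope efficiency ~ {eta_S:.0%}")                  # ~54%

Note that as R approaches 1 the factor (1 − R)/ln(1/R) approaches 1, recovering the quantum-efficiency limit ν_S/ν_P quoted above.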


2nL_C = mλ₀, where n is the effective refractive index seen by light propagating in the cavity, L_C is the cavity length, m is an integer, and λ₀ is the free-space wavelength. In Figure 1 the cavity comprises the amplifying fiber and two Bragg grating filters, which may be regarded as having an effective length (less than the actual grating length) that depends on the grating characteristics. In fiber lasers the cavity is typically between several centimeters and many meters in length; hence there are usually very many potential lasing modes within the gain-bandwidth of the amplifying medium. For example, in the Fabry–Perot cavity of Figure 1, if the cavity length were 5 meters, the mode spacing would be ν_{m+1} − ν_m = c/(2nL_C) ≈ 20 MHz, i.e., orders of magnitude less than the bandwidth of the optical amplifier.

Ideally the first mode to reach threshold would determine the laser output wavelength, and the spectral width, or linewidth, of the laser output would be Δν_L = hν/(2πτ_C²P_Sout), in which τ_C is the lifetime of photons in the cavity, determined by the cavity losses (including outcoupling) and given by τ_C = t_R/ε, where t_R is the cavity round-trip time and ε the fractional energy loss in the cavity per round trip. In practice the linewidth is usually significantly larger than this theoretical limit, owing to transient lasing of different modes within the bandwidth of the optical amplifier and to environmental perturbations of the fiber cavity. Fiber amplifiers typically have a large homogeneously broadened gain-bandwidth and, being long and flexible, are susceptible to acoustic and thermal perturbations; hence care must be taken to physically stabilize the laser cavity to minimize both mode-hopping and the output linewidth.

Whilst the CW laser shown in Figure 1 has the advantage of simple construction, it would not be ideal for narrow-linewidth CW generation, because counterpropagating waves in the Fabry–Perot cavity form a standing wave, in which the local signal intensity varies due to interference between the forward and backward propagating waves. The standing wave can cause spatial hole-burning in the gain medium (i.e., a spatial variation in gain, linked to the spatial variation in intensity), which in turn reduces the average gain for the lasing mode and promotes mode-hopping, and hence spectral broadening of the output. A preferable arrangement for many fiber lasers is a traveling wave laser, which can be realized using a ring configuration, as shown in Figure 4.

Figure 4 Schematic of a traveling wave laser, useful for low noise continuous wave oscillation. The wavelength selective coupler (WSC) is required to couple pump power into the laser cavity without coupling signal out.

In this configuration the optical isolator ensures unidirectional lasing and so avoids spatial hole-burning. Additional components in the cavity can include a wavelength selective coupler (WSC) to couple the pump wavelength into the laser cavity whilst not coupling the signal wavelength out, and a narrowband filter (e.g., a small Fabry–Perot resonator) to further stabilize and/or narrow the lasing linewidth. Rare-earth doped fiber lasers with similar cavity configurations have been realized with linewidths of approximately 10 kHz, limited by environmental (e.g., acoustic) perturbations.
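Both the mode-spacing figure and the ideal linewidth limit can be checked with a few lines of arithmetic. In the minimal sketch below (Python, used purely for illustration), the effective index, signal wavelength, round-trip loss, and output power are assumed values chosen to match the 5 m example; none of them is specified in the text.

    import math

    c = 3.0e8
    n = 1.5                   # effective refractive index (assumed value)
    L_C = 5.0                 # cavity length (m), as in the example above

    # longitudinal mode spacing of the Fabry-Perot cavity
    dnu = c / (2 * n * L_C)
    print(f"mode spacing = {dnu / 1e6:.0f} MHz")              # 20 MHz

    # ideal linewidth: dnu_L = h*nu/(2*pi*tau_C^2*P_Sout), tau_C = t_R/eps
    h = 6.626e-34             # Planck's constant (J s)
    nu = c / 1.55e-6          # signal frequency; 1.55 um wavelength assumed
    t_R = 2 * n * L_C / c     # round-trip time of the linear cavity (s)
    eps = 0.5                 # fractional energy loss per round trip (assumed)
    P_Sout = 1e-3             # output signal power, 1 mW (assumed)
    tau_C = t_R / eps
    dnu_L = h * nu / (2 * math.pi * tau_C**2 * P_Sout)
    print(f"ideal linewidth ~ {dnu_L * 1e3:.0f} mHz")         # ~2 mHz

The gap of more than six orders of magnitude between this millihertz-scale ideal limit and the roughly 10 kHz linewidths achieved in practice underlines how strongly transient multimode lasing and environmental perturbations dominate real devices.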

Pulsed Fiber Lasers

There are two main types of fiber laser useful for generating short high-power pulses: Q-switched lasers and mode-locked lasers. Q-switched fiber lasers are useful for generating large-energy pulses (e.g., microjoules) with very high peak power (e.g., kilowatts) and relatively long pulse duration (e.g., tens of nanoseconds), whilst mode-locked fiber lasers are typically capable of generating ultrashort pulses (e.g., sub-picosecond) with moderate energy (e.g., nanojoules) and moderate-to-high peak power (e.g., tens of watts). Pulsed lasers typically produce less average output power than CW lasers (e.g., up to about 10 W). Both types of pulsed laser typically contain an element for loss modulation within the cavity; however, the pulse generation mechanisms are very different.

Q-switched lasers operate by rapid switching of the cavity loss. Whilst the intracavity loss is high and the laser is below threshold, energy is transferred from the pump to the lasing medium. The long excited-state lifetimes and small emission cross-sections of rare-earth dopants assist greatly in this process, so that a significant amount of energy can be stored in long fiber amplifiers before amplification of spontaneous emission noise begins to deplete the inversion. After the gain medium is fully inverted, the cavity loss is suddenly reduced and the laser is taken well above threshold, resulting in a rapid build-up of noise and the formation of a pulse which ideally extracts all the stored energy within a few round-trips in the laser cavity. The situation is shown schematically in Figure 5.


Figure 5 Pulse generation in a Q-switched laser. Initially the intracavity loss is high, the laser remains below threshold, and the inversion of the gain medium increases. When the intracavity loss is suddenly reduced, the laser goes well above threshold and spontaneous emission noise is rapidly amplified into a large pulse which extracts much of the energy previously stored in the gain medium.
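The behavior sketched in Figure 5 can be reproduced qualitatively with a toy rate-equation model. The sketch below (Python, dimensionless units) is an illustration rather than a model of any particular fiber laser: time is measured in units of the cavity photon lifetime after switching, the inversion n is normalized so that threshold is n = 1, and the initial inversion (ten times threshold) and seed photon number are arbitrary assumptions. Pumping and spontaneous decay are neglected during the pulse, consistent with the timescale arguments discussed later in this section.

    # toy Q-switched pulse build-up: dn/dt = -n*p, dp/dt = (n - 1)*p
    n, p = 10.0, 1e-8      # initial inversion and seed photon number (assumed)
    dt, t = 1e-4, 0.0      # Euler step and clock, in photon lifetimes
    peak_p, peak_t = 0.0, 0.0
    while t < 30.0:
        n, p, t = n - n * p * dt, p + (n - 1.0) * p * dt, t + dt
        if p > peak_p:
            peak_p, peak_t = p, t
    print(f"peak photon number {peak_p:.2f} at t = {peak_t:.1f}")
    print(f"residual inversion n = {n:.1e}")
    # analytic check: the peak occurs as n falls through threshold, with
    # p_peak = n0 - 1 - ln(n0) ~ 6.70 for n0 = 10; the tiny residual
    # inversion shows nearly complete energy extraction for strong pumping

The pulse forms and extracts the stored energy within a few photon lifetimes once the net gain is large, mirroring the description in Figure 5.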

An upper limit on the pulse energy obtainable from a Q-switched laser may be determined by assuming that all active ions in the laser cavity are excited before the pulse is generated (i.e., at the time the cavity Q is switched high), that no ions remain excited immediately after the pulse is generated, and that the pulse generated is much shorter than the time required to pump all the ions into their excited state. The energy stored in the gain medium, and hence the maximum conceivable pulse energy, would then be E = hν_S NV, where N is the density of active ions and V is the volume of the doped fiber core. For example, a typical erbium-doped fiber laser of length 10 m and dopant density N = 10¹⁹ ions/cm³ in a single-mode fiber with core diameter 7 μm could ideally store up to 500 μJ of energy.
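The half-millijoule figure follows directly from E = hν_S NV, as the minimal sketch below shows (Python, illustrative; the 1.55 μm signal wavelength is an assumed value typical of erbium and is not given in the text).

    import math

    h, c = 6.626e-34, 3.0e8
    nu_S = c / 1.55e-6               # signal frequency (1.55 um assumed)
    N = 1e19 * 1e6                   # density, 1e19 ions/cm^3 -> ions/m^3
    A = math.pi * (7e-6 / 2) ** 2    # core area for 7 um diameter (m^2)
    V = A * 10.0                     # doped volume of 10 m of fiber (m^3)

    E = h * nu_S * N * V
    print(f"maximum stored energy ~ {E * 1e6:.0f} uJ")   # ~490 uJ, ~0.5 mJ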

Two things limit the maximum inversion and energy storage achievable in practice: spontaneous emission noise (due to the finite excited-state lifetime), which is amplified and depletes the inversion, and unwanted feedback (e.g., due to the finite extinction ratio of the Q-switching element), which results in cw lasing and thereby limits the inversion. Following from eqns [1] and [2], the unidirectional gain per unit of stored energy in a fiber laser may be expressed as G = 4.34/(A_core F_sat) = 4.34/E_sat decibels per joule, where F_sat = hν_S/σ_S is the saturation fluence and E_sat = A_core F_sat is the saturation energy. For example, if the maximum gain achievable before the onset of cw lasing was 30 dB (i.e., even when the Q-switch introduces a large loss), then the maximum energy storable in the gain medium would be E = 6.9E_sat. For the same erbium-doped fiber parameters as used in the previous paragraph, the maximum storable energy would be E ≈ 70 μJ.

Only a proportion of the energy stored in the gain medium can usually be converted into a Q-switched output pulse. The actual Q-switched pulse energy, duration, and peak power can be calculated by solving the rate equations for the photon flux and the population difference between the lasing energy levels in the gain medium. The equations must usually be solved numerically; however, assuming the Q changes much more rapidly than the pulse build-up time, which is in turn much shorter than the time taken for a significant change in inversion due to either pumping or spontaneous emission, the pulse parameters may be approximated analytically and are found to be determined by only two parameters: the population inversion (or stored energy) just before Q-switching, and the lifetime of photons within the cavity, τ_C, defined previously. Under the conditions described above, standard analysis gives the Q-switched pulse duration as T_P