Fundamentals of Optics: An Introductory Course 1510657800, 9781510657809

This book presents a simple yet elegant introduction to classical optics focused primarily on establishing fundamental c


English Pages 329 [333] Year 2023


FUNDAMENTALS OF OPTICS
An Introductory Course

Yobani Mejía-Barbosa
Translated from the Spanish by Herminso Villarraga-Gómez

SPIE Terms of Use: This SPIE eBook is DRM-free for your convenience. You may install this eBook on any device you own, but not post it publicly or transmit it to others. SPIE eBooks are for personal use only. For details, see the SPIE Terms of Use. To order a print version, visit SPIE.

Library of Congress Control Number: 2022048509

English translation of Fundamentos de óptica: Curso introductorio Copyright © 2021 UNAL Translated by Herminso Villarraga-Gómez

Published by SPIE P.O. Box 10 Bellingham, Washington 98227-0010 USA Phone: +1 360.676.3290 Fax: +1 360.647.1445 Email: [email protected] Web: http://spie.org Copyright © 2023 Society of Photo-Optical Instrumentation Engineers (SPIE) All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means without written permission of the publisher. The content of this book reflects the work and thought of the author. Every effort has been made to publish reliable and accurate information herein, but the publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon. Background cover image from Shutterstock: Fouad A. Saad Printed in the United States of America. First printing 2023 For updates to this book, visit http://spie.org and type “PM359” in the search field.

This book is dedicated to my wife, Janneth, and my children, Ana Catalina, María Isabel, and Daniel.

Contents

Translator’s Preface
Author’s Preface
Introduction

1 Geometrical Optics
  1.1 Rays or Waves
    1.1.1 Camera obscura
    1.1.2 Newton’s corpuscular theory of light
    1.1.3 Huygens’ wave theory
    1.1.4 Graphical ray tracing
  1.2 Fermat’s Principle
    1.2.1 Modern formulation of Fermat’s principle
    1.2.2 Rays and wavefronts
    1.2.3 Image from a point source
  1.3 Refracting Surfaces
    1.3.1 Modeling the cornea of a human eye
    1.3.2 Refraction at spherical surfaces
    1.3.3 Focal lengths and focal points
    1.3.4 Focal planes
    1.3.5 Paraxial imaging of extended objects
    1.3.6 Optical power and vergence
  1.4 Reflecting Surfaces
    1.4.1 Ray tracing for spherical mirrors
    1.4.2 The parabolic mirror
  1.5 Lenses: Thin Lens Approximation
    1.5.1 Ray tracing for thin lenses
    1.5.2 Newton’s lens equation
    1.5.3 Real and virtual images domain
    1.5.4 Focal planes in thin lenses
    1.5.5 Ray tracing for oblique rays
  1.6 Lenses: Principal Planes
    1.6.1 A lens system
  1.7 Stops and Pupils
    1.7.1 Aperture stop
    1.7.2 Pupils
    1.7.3 Marginal and chief rays
    1.7.4 Field stop, field of view, and angle size
  1.8 Some Optical Instruments
    1.8.1 The human eye (schematic representation)
    1.8.2 Magnifiers
    1.8.3 The telescope
    1.8.4 The microscope
  1.9 Monochromatic Optical Aberrations
    1.9.1 Field curvature
    1.9.2 Spherical aberration
    1.9.3 Distortion
    1.9.4 Astigmatism and coma
  References

2 Polarization
  2.1 Plane Waves and Polarized Light
    2.1.1 Maxwell’s equations with plane waves
    2.1.2 Irradiance
    2.1.3 Natural light and polarized light
    2.1.4 Elliptical, circular, and linear polarization
    2.1.5 Polarization: general case
  2.2 Dichroism Polarization
    2.2.1 Linear polarizer
    2.2.2 Malus’ law
  2.3 Polarization by Reflection
    2.3.1 Laws of reflection and refraction
    2.3.2 Fresnel equations
    2.3.3 Reflectance and transmittance
  2.4 Polarization by Total Internal Reflection
    2.4.1 Total internal reflection
    2.4.2 Reflectance and transmittance
  2.5 Polarization with Birefringent Materials
    2.5.1 Phase retarder plates
    2.5.2 Birefringent crystals
    2.5.3 Refraction in crystals
    2.5.4 Polarizing prisms
  2.6 Vectors and Jones Matrices
  References

3 Interference
  3.1 Interference and Coherence
    3.1.1 Degree of coherence
    3.1.2 Interference and coherence
    3.1.3 Coherence length
  3.2 Interference of Two Plane Waves
    3.2.1 Interference with inclined plane waves
    3.2.2 Displacement of interference fringes
    3.2.3 Interferogram visibility
  3.3 Interference of Two Spherical Waves
    3.3.1 Circular fringes with the Michelson interferometer
    3.3.2 Parallel fringe approximation with the Michelson interferometer
  3.4 Practical Aspects in the Michelson Interferometer
    3.4.1 Laboratory interferometer
  3.5 Interference in a Plate of Parallel Faces
    3.5.1 Stokes relations
    3.5.2 Multiple-wave interference
    3.5.3 Two-wave interference
  3.6 Interference from N Point Sources
    3.6.1 Plane wave approximation
  3.7 Interference with Extended Light Sources
    3.7.1 Artificial extended sources
  3.8 Young Interferometer I
    3.8.1 Division of wavefront and division of amplitude
  3.9 Other Interferometers
    3.9.1 Fabry–Pérot interferometer
    3.9.2 Antireflective thin film
    3.9.3 Newton and Fizeau interferometers
  References

4 Diffraction
  4.1 Huygens–Fresnel Principle
    4.1.1 Fresnel zones
    4.1.2 Fresnel treatment results
  4.2 Diffraction Integral
    4.2.1 Kirchhoff integral theorem
    4.2.2 Fresnel–Kirchhoff diffraction
    4.2.3 Sommerfeld diffraction
  4.3 Fresnel and Fraunhofer Diffraction
    4.3.1 Fraunhofer diffraction
    4.3.2 Fresnel diffraction
    4.3.3 Some examples
  4.4 Young Interferometer II
    4.4.1 Effect of the size of the diffraction aperture
    4.4.2 Effect of light source size
  4.5 Image Formation with Diffraction
    4.5.1 Image of a point (source) object
    4.5.2 Resolution in the image (two points)
    4.5.3 Image of an extended object
  4.6 Diffraction Gratings
  References

Appendices
  A Ray Tracing
    References
  B Refractive Index
    References
  C Optical Glasses
    References
  D Chromatic Aberrations
    References
  E Prisms
    References
  F Polarization Ellipse

Index

Translator’s Preface This translation was made from the first edition of the book Fundamentos de Óptica, in Spanish, written by Professor Yobani Mejía-Barbosa and published in 2021 by the Universidad Nacional de Colombia (UNAL). While the English translation retains the originality of Mejía-Barbosa’s writing and the technical content remains unchanged, a literal translation was not always possible due to the inability of English to capture some of the intricate nuances of the Spanish language and the ways in which Spanish allows for the expression of some sentences in a simplified manner with the use of syntactic and contextual twists. In such cases, the English translation opts for alternative ways of expressing the same ideas without losing their original meaning. The translator has worked along with the author (Prof. Mejía-Barbosa), SPIE editors, and volunteer reviewers to ensure that the final translation of this book is fluent and adopts the standard technical terminology (in English) currently used by the optics community. This translation has also benefited from the fact that, coincidentally, the translator of this book (me) was introduced to the field of optics with the material presented through the pages of this book, taught directly by Prof. Mejía-Barbosa, when this book did not even exist in its Spanish version. That was in 2005, when I was a student in the Physics Department of UNAL at Bogotá and decided to take an elective course called Fundamentos de Óptica. I fell in love with optics through that course and have since decided to pursue a career in the field, specifically in applied optics. If I had to say today which lectures, from all the classes I have taken so far, have had the greatest influence on my professional life, they would be the ones from the Fundamentals of Optics course taught by Prof. Mejía-Barbosa at UNAL, which are included in this book. 
In addition to representing an important turn in my career toward applied sciences, taking me away from a strongly theoretical emphasis that dominates the UNAL physics program, Fundamentos de Óptica equipped me with a basic understanding of optics that I needed in order to pursue graduate studies at the College of Optics and Photonics of the University of Central Florida and then earn a Ph.D. in Optical Science and Engineering from the University of North Carolina at Charlotte (UNCC).


Optics is a field that is filled with a wide range of multidisciplinary career opportunities. By attending the meetings sponsored by SPIE, the international society for optics and photonics, one can realize how high the current demand in the job market is for optical engineers and how much people with knowledge of optical science are appreciated. For me, learning optics has fueled my career and led to my employment at world-renowned optical companies, at Nikon from 2015 to 2019, and most recently, since 2019, at ZEISS. The field of applied optics has enriched my life in a way that I could never have predicted when I was younger. The Fundamentals of Optics course that I took at UNAL laid the foundation. I am forever grateful. This is the main reason why I volunteered to translate Fundamentos de Óptica from Spanish to English. I would like to see Prof. Mejía-Barbosa’s book reach wider audiences and help many students from around the world, not just from Spanish-speaking countries, with the study of optics. This book presents a simple but elegant introduction to classical optics that focuses primarily on establishing fundamental concepts that students exposed to the field of optics for the first time need to learn. With examples demonstrating the use of optics in a wide range of practical applications, it reflects the pedagogical approach used by Prof. Mejía-Barbosa to teach his Fundamentals of Optics course at UNAL. This book will prove useful for college students, and even graduate students, of physics, optical science, optical engineering, and any other related science or engineering discipline that deals with optics at some level. I would like to express my gratitude to Prof. Mejía-Barbosa, UNAL, and SPIE for giving me the opportunity to translate this book. In addition, I would like to express my appreciation to Prof. Glenn D. 
Boreman of the University of North Carolina at Charlotte and to the SPIE editors (Alexandra MacWade and Patrick Franzen), who provided valuable feedback on previous versions of this translation, thus contributing to its improvement.

Herminso Villarraga-Gómez
Ann Arbor, Michigan, USA
October 3, 2022

Author’s Preface

This book presents lectures on classical optics, based on the Fundamentals of Optics course that I have been teaching for 12 years in the Physics Department of the Universidad Nacional de Colombia (UNAL) at Bogotá. At first, I occasionally taught these lectures as an elective course in the Physics undergraduate program. From the fall of 2009 through the fall of 2017, I taught the same lectures each semester as an optative regular course of four hours per week in a 16-week semester format. The content of the course was based on my own research experience in the field of applied optics, which was enriched at the Centro de Investigaciones en Óptica, México, where I pursued my Ph.D. in Optics (1998–2001).

This book comprises four chapters: Geometrical Optics, Polarization, Interference, and Diffraction, each consisting of several sections. Each section corresponds to a lecture of two hours. The book includes 30 sections and six appendices. I have written this book to provide students of physics, optics, and engineering with a basic understanding of the main topics related to geometrical and physical optics. For further reading on classical optics, students may consult other well-known textbooks such as Fundamentals of Optics, Fourth Edition, by Jenkins and White (McGraw-Hill Education, 2001), Optics, Fifth Edition, by Hecht (Pearson, 2016), and Principles of Optics, Sixth Edition, by Born and Wolf (Pergamon Press, 1980).

I would like to thank UNAL for granting me a sabbatical year (February 2018 to February 2019), during which I was able to write this book.

Yobani Mejía-Barbosa
Bogotá, D. C., Colombia
March 2021



Elective course: a course that is not specifically designated as part of a degree requirement but is offered by a professor for students who want to take it.
Optative course: a course that is included in a curriculum but is not compulsory. Students select such a course from a limited set of specialized subjects.

Introduction

In the current state of the art, optics can be defined as the science that studies the nature of light (the visible range of the electromagnetic spectrum), its propagation, and its interaction with matter. The origins of optics are closely related to studies of vision and the propagation of light. Alhazen (965–1040) compared image formation in the human eye with image formation in a camera obscura (a closed box with a small hole in one of its walls), which he was the first to build. The camera obscura also makes it possible to observe the rectilinear propagation of light (light rays). Centuries earlier, in his treatise on optics, Euclid (325–265 BCE) had presented a geometrical description of light propagation. Following one of Euclid’s postulates (a straight line segment can be drawn joining any two points), to go from one point to another in a homogeneous medium, light follows the straight line that connects these points; i.e., light follows the shortest geometrical path. This is a preliminary version of Fermat’s principle of the propagation of light, which can be used to derive the laws of reflection and refraction of light.

In the seventeenth century, Huygens proposed a new description of the propagation of light, a preliminary version of the wave theory of light. Huygens concluded that light slows down as it enters denser media and explained reflection and refraction with his wave theory. In the early nineteenth century, Fresnel added wave interference to Huygens’ theory, and with this, he explained diffraction phenomena. Currently, both the geometrical (ray) model and the wave model are accepted descriptions of the propagation of light. Whereas the ray model is better suited to the design of optical instruments, the wave model is more appropriate for the study of image quality.

 A look at the origins of optics up to the nineteenth century can be found in Olivier Darrigol’s book, A History of Optics from Greek Antiquity to the Nineteenth Century (Oxford University Press, 2012).


Chapter 1
Geometrical Optics

When certain optical phenomena can be explained by geometrical concepts, we are in the realm of what is called geometrical optics. Starting from the idea that light propagates as geometrical rays, and that this propagation is governed by Fermat’s principle, we can study the propagation of light in media with a constant or variable refractive index, image formation in instruments composed of optical elements (e.g., lenses, prisms, and mirrors), the optical aberrations that degrade the image quality of such instruments, the design of optical devices, and so on. So, although geometrical optics cannot explain some optical phenomena tied to the wave nature of light, it has a very wide range of applications (Fig. 1.1). Therefore, starting the study of optics from the geometrical point of view is fully justified.

Starting from Fermat’s principle, this chapter describes the geometrical properties of optical systems formed by refracting spherical surfaces, spherical mirrors, and lenses. While most of the chapter is based on the paraxial approach (small-angle approximation), an introduction to monochromatic optical aberrations is presented at the end. Chromatic aberration and prisms are briefly discussed in Appendices D and E.

Figure 1.1 Imaging using a positive lens. The image generated by the lens is real and becomes the object for the mobile device’s camera that took the photo. We can understand this imaging process from the perspective of geometrical optics. Courtesy of Felix Ernesto Charry Pastrana.

1.1 Rays or Waves

1.1.1 Camera obscura

A camera obscura is basically a closed box with a small circular hole in one of its faces. The function of the hole is to let through the light contained in a cone whose vertex is a bright point on the object (object point) and whose base is the hole. The size of the hole determines the angle of the cone. Suppose that we have a square box of side length $L$, with a circular hole of radius $r_a$ at the center of one side wall, and that there is an object at distance $d$ from the box (Fig. 1.2). The light that emerges from a point on the object is projected onto the wall opposite the hole as an approximately circular bright spot of radius $r_a(L + d)/d$. If we regard the object as a union of bright points, we will see on the rear wall inside the box the superposition of the bright spots corresponding to each bright point of the object. If the hole is relatively large, we will essentially see a very blurred (out-of-focus) image of the object on the box’s rear wall. But if the size of the hole is gradually reduced, the image will gradually change from out of focus to sharp, as the radius of the hole and the angle of the light cone approach zero. In this limiting case the light cone is transformed into what is called a ray of light. Of course, theoretically speaking, if the radius of the hole tends to zero, the light beam will be a straight line (carrying no energy). In this limiting (not real) case, we would say that the image is an identical copy of the object, except for a scale factor. Following the geometry of Fig. 1.2, it is also evident that the image will be inverted (i.e., a right-side-up object results in an upside-down image).

Figure 1.2 Imaging with a camera obscura.

Note on the spot shape: for an object point on the optical axis, the bright spot will be circular; for an object point off the optical axis, it will be an ellipse.

Figure 1.3 shows an example of four images of the same object [shown in (a)]. All the images were taken with a camera obscura [shown in (b)] using the same exposure time, of

Figure 1.5 Newton’s law of refraction.

Figure 1.6 Waves in a water bucket. (a) Circular waves and (b) secondary circular waves in an aperture (diffraction).

So, in the example of Fig. 1.6(a), the drop is equivalent to the source that originates the circular waves (spherical in the three-dimensional case). In a homogeneous medium, circular wavefronts propagate with the same speed. As the wavefront moves away from the source, its radius increases, and at large distances, the wavefront in the vicinity of a point will resemble a flat wavefront. Suppose that a wavefront is obstructed, allowing only a small portion to pass through, as shown in Fig. 1.6(b). On crossing the opening, circular wavefronts are generated again, and they propagate with the same speed as the original wavefront. This fact, which is due to wave diffraction, is the central idea of Huygens’ principle and can be stated as follows [3]:

Each point on a wavefront can be thought of as a source of secondary spherical waves that propagate with the same speed as the wavefront. After a while, the propagated wavefront will be the envelope of the secondary spherical waves.

This is illustrated in Fig. 1.7, which recreates the propagation of a wavefront from Huygens’ secondary waves.

Figure 1.7 Propagating wavefront reconstruction using Huygens’ principle.

Huygens’ principle is applicable to the propagation of light and allows us to correctly derive the laws of reflection and refraction. In refraction, as shown in Fig. 1.5, the speed of light in the medium with the higher refractive index is less than the speed of light in the medium with the lower refractive index (contrary to Newton’s assumption).

Finally, let us see how the law of refraction is derived from Huygens’ principle. A wavefront $SS'$, incident with an inclination angle $\theta$ at an interface that separates two homogeneous media of refractive indices $n$ (incident medium) and $n'$ (refracting medium), is shown in Fig. 1.8(a). When the wavefront reaches point $a$, secondary waves are generated that propagate with speeds $v$ (in the incident medium) and $v'$ (in the refracting medium). At a later time $t$ the wavefront reaches point $b$, and the two aforementioned secondary waves, emitted at $a$, will respectively have the radii $ae = vt$ and $ad = v't$. The speed of light in a medium of refractive index $n$ is defined as $v = c/n$, where $c$ is the speed of light in vacuum. Therefore, if $n < n'$, then $v > v'$, and the case just described will look as presented in Fig. 1.8(a). According to Huygens’ principle, in the time $t$, the envelope for the reflected wave will be the plane wavefront $be$ and the envelope for the refracted wave will be the plane wavefront $bd$. For refraction, we have that $\sin\theta' = ad/ab$ and $\sin\theta = cb/ab$. Since $cb = ae$, we get

$$\frac{\sin\theta'}{\sin\theta} = \frac{v't}{vt}. \tag{1.1}$$

Figure 1.8 Law of refraction according to Huygens’ principle. (a) Wavefronts. (b) Rays, with $n\sin\theta = n'\sin\theta'$.
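The construction behind Eq. (1.1) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `huygens_refraction_angle` and the default time `t` are illustrative assumptions, and the incidence angle is assumed to lie strictly between 0° and 90°. It builds the segments $ae$, $ad$, and $ab$ of Fig. 1.8(a) and reads the refracted inclination from $\sin\theta' = ad/ab$.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def huygens_refraction_angle(theta_deg, n_inc, n_ref, t=1e-9):
    """Refracted wavefront inclination (degrees) from the Huygens construction.

    Hypothetical helper for illustration; assumes 0 < theta_deg < 90.
    """
    if not 0.0 < theta_deg < 90.0:
        raise ValueError("theta_deg must lie strictly between 0 and 90")
    theta = math.radians(theta_deg)
    v_inc = C_VACUUM / n_inc   # v = c/n in the incident medium
    v_ref = C_VACUUM / n_ref   # v' = c/n' in the refracting medium
    ae = v_inc * t             # radius of the secondary wave in the incident medium
    ad = v_ref * t             # radius of the secondary wave in the refracting medium
    cb = ae                    # the incident wavefront advances cb = ae in the time t
    ab = cb / math.sin(theta)  # from sin(theta) = cb/ab
    sin_theta_ref = ad / ab    # sin(theta') = ad/ab
    if sin_theta_ref > 1.0:
        raise ValueError("no refracted wavefront exists for this angle")
    return math.degrees(math.asin(sin_theta_ref))


if __name__ == "__main__":
    # Air-to-glass example with assumed indices 1.0 and 1.5.
    print(huygens_refraction_angle(30.0, 1.0, 1.5))
```

The time `t` cancels in the ratio $ad/ab$, so any positive value gives the same angle; this is the numerical counterpart of the factor $t$ appearing in both the numerator and the denominator of Eq. (1.1).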


Finally, using the definition of the refractive index, we arrive at the law of refraction, also known as Snell’s law,

$$n\sin\theta = n'\sin\theta', \tag{1.2}$$

which indicates the way in which the inclination of the refracted wavefront changes (angle $\theta'$). For a given medium, the refractive index $n$ is a function of the wavelength of light in vacuum. Therefore, when providing the value of the refractive index, we should also specify the wavelength to which that value corresponds. However, if no wavelength is specified, standard practice in optics is to give the value of the refractive index at the Fraunhofer spectral line at 587.56 nm (i.e., yellow light close to the maximum sensitivity of the eye; see Appendix C).

If instead of observing the wavefront’s inclination we look at the wavefront’s propagation direction, then Snell’s law can also be established, as shown in Fig. 1.8(b). In fact, if we return to the interpretation of light rays, Snell’s law can be stated as follows: if a light ray in the medium with refractive index $n$ hits an interface at an angle $\theta$ (measured with respect to the normal to the interface at the point where the light ray hits), the refracted light ray in the medium with refractive index $n'$ will form an angle $\theta'$ with respect to the normal.

So we have two concepts: the light ray and the wavefront. Geometrically speaking, the two concepts are related; the light rays are orthogonal to the wavefronts. Thus, the two representations are equivalent, as long as we consider the propagation of light as a geometrical matter. In 1678, with knowledge of Snell’s work, Huygens showed (in Traité de la Lumière) that Snell’s law follows from his idea of secondary waves. As we will see later, this law can also be derived from other physical principles.

Snell’s law is named in honor of Willebrord Snellius (1580–1626), who was a professor of mathematics at the University of Leiden.
The refractive index in dielectric materials is discussed in Appendix B.

1.1.4 Graphical ray tracing

Snell’s law has a graphical interpretation that enables the tracing of light rays, or simply rays, on any refracting surface in a simple (and exact) way. In Fig. 1.9(a), we have the refraction of an incident ray. Suppose we construct a triangle, as shown in Fig. 1.9(b), as follows. One side, of length equal to the refractive index $n$, is obtained by extending the incident ray from the incidence point $A$; the point at the end of this side is $B$. A second side, of length equal to the refractive index $n'$, is drawn from the incidence point $A$ in the direction of the refracted ray; the point at the end of this side is $C$. The third side is the line that joins the points $B$ and $C$. This triangle has the property that the side $BC$ is parallel to the normal line $N$.

Figure 1.9 Graphical ray tracing according to Snell’s law.

Triangle $ABC$ in Fig. 1.9(b) fulfills Snell’s law. This results directly from the law of sines for the triangle $ABC$:

$$\frac{AB}{\sin\theta'} = \frac{AC}{\sin(\pi - \theta)} \tag{1.3}$$

or

$$\frac{n}{\sin\theta'} = \frac{n'}{\sin\theta}, \tag{1.4}$$

which is Snell’s law. This method enables the development of a technique for the graphical ray tracing of refraction, as illustrated by Fig. 1.9(c), which can be stated as follows:

Consider a ray that hits a surface that separates two media: an incidence medium with refractive index $n$ and a refracting medium with refractive index $n'$. The refracted ray can be obtained as follows. Determine the normal to the surface at the incidence point, and then draw two circles of radii $n$ and $n'$ with their center at the incidence point. Extend the incident ray to the circle of radius $n$ and use the intersection point between the extended ray and that circle as a starting point to draw a line segment parallel to the normal. This new line intersects the circle of radius $n'$ at another point that indicates the direction in which the refracted ray should be traced from the point of incidence.

This technique works on flat and curved surfaces, given that Snell’s law is applied locally. For example, Fig. 1.10 shows two examples with curved surfaces; in (a), $n < n'$, and in (b), $n > n'$.

Figure 1.10 Graphical ray tracing on curved surfaces. (a) $n < n'$. (b) $n > n'$.

Note the change in the extension of the incident ray in (b), which now goes to the circle with the larger radius. In the latter case, a relevant situation arises when the line parallel to the normal $N$ touches the smaller circle tangentially. The refracted ray then forms an angle of $\pi/2$ with the normal; i.e., the ray is tangent to the interface. This particular case is illustrated in Fig. 1.11. The angle of incidence for which this occurs is denoted by $\theta_c$ and is called the critical angle. For $\theta \geq \theta_c$, the phenomenon of total internal reflection occurs, which we will see in detail in Section 2.4.1. From the triangle $ABC$ in Fig. 1.11,

$$\sin\theta_c = \frac{n'}{n}. \tag{1.5}$$

Figure 1.11 When the line parallel to the normal $N$ touches the circle of radius $n'$ tangentially, there is total internal reflection.
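The two-circle construction of Section 1.1.4 translates directly into vector arithmetic. The sketch below is an illustration under stated assumptions rather than the book’s own procedure: the function names are hypothetical, the geometry is two-dimensional, the incidence point $A$ is taken as the origin, and the unit normal is assumed to point into the refracting medium. Point $B$ lies on the circle of radius $n$ along the extended incident ray, and point $C$ is where the line through $B$ parallel to the normal meets the circle of radius $n'$. If that line misses the circle, there is total internal reflection.

```python
import math


def refract_two_circles(direction, normal, n_inc, n_ref):
    """Refracted unit direction from the two-circle construction, or None.

    direction: unit vector (dx, dy) of the incident ray.
    normal:    unit vector (nx, ny) pointing into the refracting medium
               (a convention assumed for this sketch).
    Returns None when the construction has no intersection
    (total internal reflection).
    """
    dx, dy = direction
    nx, ny = normal
    # Point B: extend the incident ray from A (origin) to the circle of radius n.
    bx, by = n_inc * dx, n_inc * dy
    # Points on the line through B parallel to the normal: B + s * N.
    # Intersection with the circle of radius n' solves
    # s^2 + 2 s (B . N) + (n^2 - n'^2) = 0.
    b_dot_n = bx * nx + by * ny
    discriminant = b_dot_n ** 2 - (n_inc ** 2 - n_ref ** 2)
    if discriminant < 0.0:
        return None  # the line misses the circle of radius n'
    s = -b_dot_n + math.sqrt(discriminant)  # root on the transmitted side
    cx, cy = bx + s * nx, by + s * ny       # point C on the circle of radius n'
    return (cx / n_ref, cy / n_ref)         # direction AC, normalized


def critical_angle_deg(n_inc, n_ref):
    """Critical angle from Eq. (1.5); only meaningful when n_inc > n_ref."""
    return math.degrees(math.asin(n_ref / n_inc))


if __name__ == "__main__":
    theta = math.radians(30.0)
    ray = (math.sin(theta), -math.cos(theta))  # travelling downward
    print(refract_two_circles(ray, (0.0, -1.0), 1.0, 1.5))
    print(critical_angle_deg(1.5, 1.0))
```

The component of $B$ along the interface, $n\sin\theta$, is unchanged when moving parallel to the normal, so the point $C$ has tangential component $n\sin\theta = n'\sin\theta'$, which is Eq. (1.2). The discriminant vanishes exactly at the critical angle, where the line is tangent to the circle of radius $n'$.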

1.2 Fermat’s Principle

In his treatise on geometry, Euclid (325–265 BCE) postulates that “a straight line segment can be drawn joining any two points.” In Euclidean space, the shortest distance between two points is the length of the straight line segment that joins them. Following this postulate, Hero of Alexandria (c. 10–70 CE) established that for light to go from one point to another, it follows the shortest geometrical path. This is another way of saying that light propagates as geometrical rays. Given that in a homogeneous medium light propagates with constant speed, it can also be said that for light to go from one point to another, it follows the path that takes the least time. For a homogeneous medium, the two statements are equivalent. However, if the two points are in different media (both homogeneous), the result is that light no longer follows the shortest geometrical path. In Fig. 1.12(a), three possible light paths are illustrated. A planar interface separates the media with refractive indices $n$ and $n'$. The shortest geometrical path is the straight line $PT_2Q$, but it does not correspond to the path that light follows if $n' \neq n$. The path followed by light resembles the lines $PT_1Q$ when $n' < n$ and the lines $PT_3Q$ when $n' > n$. So how can the real path followed by light be obtained when $n' \neq n$? The answer is obtained by using the statement above referring to the path that takes the least time. This statement was formulated by Fermat (1601–1665) and is known as Fermat’s principle.

Figure 1.12 Sketch to derive Snell’s law from Fermat’s principle. (a) Three possible trajectories to go from $P$ to $Q$. (b) Geometry to calculate the real trajectory.

To apply Fermat’s principle to the problem illustrated in Fig. 1.12(a), consider the geometry of Fig. 1.12(b). The interface is located at $y = 0$ and the point $T$ at $x$ on the horizontal axis. The coordinates of point $P$ are $(0, h)$ and those of point $Q$ are $(a, -b)$. The time for light to go from $P$ to $Q$ through $T$ will be

$$t = \frac{n}{c}\,PT + \frac{n'}{c}\,TQ, \tag{1.6}$$

i.e.,

$$t = \frac{n}{c}\sqrt{h^2 + x^2} + \frac{n'}{c}\sqrt{b^2 + (a - x)^2}. \tag{1.7}$$

Fermat’s principle states that $t$ must be as small as possible, which is equivalent to finding the value of $x$ for which the right side of Eq. (1.7) is the smallest. In other words, we should differentiate with respect to $x$ and set the first derivative equal to zero, i.e.,

$$\frac{n}{c}\,\frac{x}{\sqrt{h^2 + x^2}} - \frac{n'}{c}\,\frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0, \tag{1.8}$$

from which the Snell’s law n0 sin u0 ¼ n sin u [Eq. (1.2)] is derived. For the reflection case, as in a flat mirror, the geometry of the problem is shown in Fig. 1.13(a). The angle between the incident ray and the normal line at T is u, and the angle between the reflected ray and the normal line at T is u0 . By applying Fermat’s principle to Fig. 1.13(a), we obtain equations equal to those we had in the case of refraction but now with n0 ¼ n, so Eq. (1.8) remains sin u0  sin u ¼ 0. Therefore, the reflected ray has an inclination angle equal to the inclination angle of the incident ray; i.e., u0 ¼ u. In Fig. 1.13(b), the virtual image P0 is included. The distance from P0 to the mirror is equal to the distance from P to the mirror. An observer at Q will see the light coming from P0 and not from P. Thus, in effect, the point T is at the intersection of the P

P

Q

h

Q

'

b

n

n

T

T

x a

P'

(a)

(b)

Figure 1.13 Sketch to derive the reflection law from Fermat’s principle. (a) Geometry to calculate the path from P to Q through T. (b) T is at the intersection of the line joining P0 with Q. The distance from P0 to the mirror is equal to the distance from P to the mirror.

Geometrical Optics

13

T

Observer

Figure 1.14 A mirage. An observer sees something that is not really at the point T. What the observer sees is light from the sky.

straight line P0 Q, so that for the virtual source, Fermat’s principle is also satisfied. An interesting example that is explained by Fermat’s principle is the mirage, as illustrated in Fig. 1.14. Suppose we are standing on a straight road on a plain with a clear sky. Due to the heat, the air that is closer to the road’s surface will have a lower density than the air that is higher. Consequently, the refractive index varies with height and is lower near the road’s surface. If we now direct our gaze toward a point T on the road, the light that arrives in that direction is not the one that comes out of point T; it is light from the sky. Thus, we see it as if the road in front of us is wet or as if there is a mirror. What is happening is that the light is avoiding the areas of higher refractive index, looking for the minimum time trajectory, and some of the rays coming from the sky are reaching our eyes when we look in the indicated direction. A situation in which there are infinite trajectories of the same time is found in an ellipsoidal mirror (of rotational ellipsoid surface), which has the property of forming the image of a point source located in one of the foci in the other focus. Consider a cross section of the ellipsoid containing the axis of revolution, as shown in Fig. 1.15. Suppose that inside the mirror we have air (n ¼ 1) and that a point source is located at the focus P from which rays T

Figure 1.15 Elliptical mirror. The light emitted by a point source at P will arrive, after being reflected in the mirror, at the point Q following any of the infinitely many trajectories that satisfy the equation of the ellipse.

Chapter 1

diverge in a radial direction. Any ray will follow a straight path until it reaches the mirror at a point T and will then be reflected along a straight path through point Q, which is at the other focus of the ellipse. If the position of T is changed, the reflected ray will again pass through Q. In both cases, the travel time is the same, and this holds for any other point T on the surface of the mirror. So none of the trajectories is a minimum. This indicates that the original formulation of Fermat’s principle does not cover all cases.

1.2.1 Modern formulation of Fermat’s principle

If we multiply Eq. (1.6) by the speed of light, $ct = n\,\overline{PT} + n'\,\overline{TQ}$, we have an equality of two quantities of different nature. On the left side, we have the distance that light would travel in vacuum during the time $t$; on the right side, we have a distance that differs from the geometrical distance, i.e., the geometrical distance multiplied by the refractive index. This last distance is called the optical path length (OPL). Taking this concept into account, Fermat’s principle in its modern form states that:

A ray of light goes from one point to another following a path whose OPL is stationary with respect to variations of that path.

In this statement, “stationary” means that the OPL can be a minimum value, a maximum value, or an inflection point with a horizontal slope. In this way, all possible light trajectories are included. For example, in the ellipsoidal mirror of Fig. 1.15, the ray trajectories correspond to the inflection-point case of Fermat’s principle.

So, in a general way, suppose that we have a path C of length $s$ that joins two points, $P_1$ and $P_2$, in an inhomogeneous medium, as shown in Fig. 1.16. The refractive index will be a function of the spatial coordinates, $n = n(x, y, z)$, and the OPL will be given by

$$\mathrm{OPL} = \int_C n(s)\,ds, \tag{1.9}$$

Figure 1.16 OPL in an inhomogeneous medium with refractive index $n(x, y, z)$: the shortest optical path and the shortest geometrical path between $P_1$ and $P_2$.

where $ds = \sqrt{dx^2 + dy^2 + dz^2}$. If the OPL satisfies

$$\frac{\partial}{\partial s}(\mathrm{OPL}) = 0, \tag{1.10}$$

then C is the path that the light will follow. In Fig. 1.16, although the shortest optical path is geometrically longer than the shortest geometrical path (segment $\overline{P_1P_2}$), the OPL along C is less than the OPL along $\overline{P_1P_2}$.

1.2.2 Rays and wavefronts

Taking into account the above, from the geometrical point of view, a ray of light is the curve that satisfies Fermat’s principle. If the medium is homogeneous, $n = n_0$ (constant), then

$$\mathrm{OPL} = n_0 \int_C ds = n_0\,\overline{P_1P_2},$$

since the shortest path between two points (in Euclidean space) is a straight line segment. Therefore, the rays in a homogeneous medium are straight lines, whereas in an inhomogeneous medium, they will be curved. For example, Fig. 1.17(a) shows a possible configuration of rays emitted by a point source immersed in a medium of variable refractive index. Now consider a curve orthogonal to the rays. If the OPL from the point source to every point where a ray intersects the orthogonal curve is the same, the curve is called a geometrical

Figure 1.17 Rays and wavefronts are orthogonal to each other. (a) Point source in an inhomogeneous medium; the rays are curved and the wavefront is a distorted surface. (b) Point source in a homogeneous medium; the rays are radial lines and the wavefront is a sphere. (c) At an infinite distance from the point source in a homogeneous medium, the rays are parallel lines (collimated) and the wavefront is a plane.


wavefront. Of course, for another OPL we will have another wavefront. So what we have is a family of rays and a family of wavefronts orthogonal to each other. In the three-dimensional case, the rays will not be limited to one plane and the wavefronts will be surfaces orthogonal to the rays. In the case in which the refractive index is constant, the rays will be radial lines and the wavefronts will be spheres [Fig. 1.17(b)]. If we observe the wavefronts at a distance from the source that tends to infinity, the wavefronts will be planes and the corresponding rays will be lines parallel to each other [Fig. 1.17(c)]. These rays are called collimated rays. From a mathematical point of view, if $F(x, y, z)$ represents the wavefront, then $\nabla F$ represents the rays and

$$\hat{t} = \frac{\nabla F}{\|\nabla F\|} \tag{1.11}$$

gives the direction of the ray. Therefore, the propagation of light can be described with either of the two representations: rays or wavefronts. Given one of the representations, we can move to the other with Eq. (1.11). In Section 1.1.3, we qualitatively defined the wavefront from the wave motion of the peaks and valleys on the surface of the water. Formally, in waves, the wavefront is defined as the surface where each point has the same phase value. On the other hand, in Fermat’s principle, the wavefront is defined as the surface where each point has the same OPL value. The two definitions are equivalent. As we will see later, the phase can be obtained from the OPL.

1.2.3 Image from a point source

Suppose that we have a point source S immersed in a homogeneous medium from which rays diverge, as illustrated in Fig. 1.17(b). Given that the refraction of light at an interface implies that light rays change their direction according to Snell’s law, it is possible to design a surface or a set of refracting surfaces (optical system) with which the rays can rejoin at a point $S'$ (also in a homogeneous medium). If the OPL is the same for all the rays that leave point S, refract in the optical system, and reach point $S'$, then we will say that $S'$ is the image of S. This implies that the wavefront converging on $S'$ must also be spherical, as shown in Fig. 1.18. Figure 1.18 describes the formation of the image of a point object S into a point image $S'$. But the description can also be made in the reverse sense; i.e., if $S'$ is the object, then S will be the image. S and $S'$ are then said to be conjugate points.

If the optical system is such that the rays leaving S do not reach $S'$ with the same OPL, then we will not have the image of a point but a spot of a certain extent. This is explained by taking into account that a point image implies a

Figure 1.18 Image formation of a point source.

convergent spherical wavefront, and if the rays arriving at or approaching the ideal position of $S'$ do not have the same optical path, the convergent wavefront will not be spherical, so it is not possible to have a point image. In this case, the optical system is said to introduce aberrations that distort the image. We will address this topic in a general way at the end of this chapter. The question that interests us now is to determine the shape that refracting surfaces should have to generate a point image of a point object.
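The equal-OPL condition for a point image can be checked numerically. The following minimal Python sketch evaluates the path P → T → Q for several points T on an elliptical mirror whose foci are P and Q, as in Fig. 1.15, and on a circle that shares a vertex with it. The semi-axes, the comparison circle, and all function names are illustrative assumptions, not values from the text. In air ($n = 1$) the OPL equals the geometrical length.

```python
import math

def path_length(tx, ty, c):
    """Length of the broken line (-c, 0) -> (tx, ty) -> (c, 0); equals the OPL for n = 1."""
    return math.hypot(tx + c, ty) + math.hypot(tx - c, ty)

# Illustrative ellipse x^2/a^2 + y^2/b^2 = 1 (assumed values); its foci sit at (+-c, 0).
a, b = 5.0, 3.0
c = math.sqrt(a * a - b * b)

angles = [k * math.pi / 6 for k in range(7)]   # sample points T on the upper half

# T on the ellipse: every path has the same length 2a, so P and Q are conjugate points.
ellipse_paths = [path_length(a * math.cos(t), b * math.sin(t), c) for t in angles]

# T on a circle of radius a (same vertex, different shape): the lengths differ,
# so the equal-OPL condition for these two points is not met.
circle_paths = [path_length(a * math.cos(t), a * math.sin(t), c) for t in angles]

ellipse_spread = max(ellipse_paths) - min(ellipse_paths)
circle_spread = max(circle_paths) - min(circle_paths)
print(f"ellipse spread: {ellipse_spread:.2e}, circle spread: {circle_spread:.3f}")
```

The circle paths are not actual reflected rays; their spread only measures how far that shape is from satisfying the equal-OPL condition for the two chosen points.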

1.3 Refracting Surfaces

In Fig. 1.18, the optical system receives at the input a bundle of divergent rays emerging from a point object (point source) and outputs a bundle of rays converging toward a point image. How should the optical system be set up to perform this task? To answer this question, we can start with the simplest case: image formation by a refracting surface. The shape of the surface can be determined from Fermat’s principle. In Fig. 1.19, the imaging of a point object P is shown using a refracting surface that separates two media of refractive indices $n$ and $n'$ ($n' > n$). The length of any incident ray is denoted by $l$, and the length of the corresponding refracted ray is denoted by $l'$. The OPL for any ray should be the same; therefore,

$$\mathrm{OPL} = nl + n'l' = \text{constant}. \tag{1.12}$$

Figure 1.19 The shape of the refracting surface that enables a point image $P'$ to be obtained from a point object P is called a Cartesian oval.
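Equation (1.12) can be solved numerically for the surface profile. The sketch below is a minimal Python example under assumed values (the indices, the axial positions of P and $P'$, the bisection bracket, and the function names are all illustrative). It places the vertex at the origin, takes the constant in Eq. (1.12) from the axial ray, and finds by bisection the surface point at each height.

```python
import math

n, n_prime = 1.0, 1.5        # assumed refractive indices (n' > n)
s, s_prime = -100.0, 150.0   # assumed axial positions of P and P' (vertex at x = 0)
K = n * (-s) + n_prime * s_prime   # OPL of the axial ray, the constant in Eq. (1.12)

def opl(x, y):
    """OPL of the ray P -> (x, y) -> P'."""
    l = math.hypot(x - s, y)
    l_prime = math.hypot(s_prime - x, y)
    return n * l + n_prime * l_prime

def oval_sag(y, x_max=75.0, iterations=100):
    """Bisection for the x at height y where opl(x, y) = K.

    For these values opl(x, y) - K decreases monotonically on [0, x_max]
    and changes sign there, so plain bisection is sufficient.
    """
    lo, hi = 0.0, x_max
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if opl(mid, y) > K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

heights = [0.0, 5.0, 10.0, 20.0, 30.0]
profile = [(y, oval_sag(y)) for y in heights]
for y, x in profile:
    print(f"y = {y:5.1f}  x = {x:8.4f}  OPL - K = {opl(x, y) - K:+.1e}")
```

Rotating the resulting profile about the line $PP'$ gives the surface of revolution; other object and image positions may need a different bracket for the bisection.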


The surface obtained by solving Eq. (1.12) is a surface of revolution called a Cartesian oval. The axis of revolution coincides with the straight line $PP'$. The intersection of that line with the surface defines the surface vertex. The center of curvature of the surface at the vertex lies on the axis of revolution. Therefore, with the Cartesian oval, we form the image of P at $P'$, as shown in Fig. 1.19. In other cases we may need more than one surface, for example, if we want the image to be in the same medium as the object. In any case, in the rest of this chapter, let us limit ourselves to optical systems whose refracting surfaces have symmetry of revolution. In addition, all the axes of revolution will be coincident; i.e., we will have a single axis of revolution, which is called the optical axis of the system. So, the optical axis contains all the vertices and centers of curvature (at the vertices) of the refracting surfaces.

A simple example of a Cartesian oval is obtained when we want to produce, at a distance $b$ from the refracting surface vertex V, a point image of a point object located at an infinite distance from V. The incident rays (coming from infinity) can be assumed to be collimated and parallel to the optical axis, as shown in Fig. 1.20. Consider a plane orthogonal to the optical axis that passes through the point A. The incident rays at that plane have the same OPL from the object point. Therefore, we can take this plane as a reference to measure the OPL of a ray passing through a point P, going through the point Q on the refracting surface S, and arriving at the image point $P'$. If the point $P'$ is the image of a point object that is at infinity, all the rays leaving the plane at A and converging to $P'$ should have the same OPL. In particular, for the ray coming out of P,

$$\mathrm{OPL} = n\,\overline{PQ} + n'r = na + n'b, \tag{1.13}$$

where $a$ is the distance from A to V and $r$ is the distance from Q to $P'$. Note that the right side of Eq. (1.13) is the OPL along the optical axis. Since $\overline{PQ} = a + b - r\cos\theta$, where $\theta$ is the angle between the segment $QP'$ and the optical axis,

$$n'r\left(1 - \frac{n}{n'}\cos\theta\right) = b(n' - n). \tag{1.14}$$

Defining a new constant $p = b(n' - n)/n'$, we get

$$r(1 - e\cos\theta) = p, \tag{1.15}$$

which describes a conic surface with eccentricity $e = n/n'$. Since $n' > n$, then $e < 1$, corresponding to an ellipse. Therefore, the refracting surface S, which focuses the collimated rays parallel to the optical axis at a point image, is an ellipsoid of revolution.

Figure 1.20 The surface shape that generates a point image of a point object that is at infinity is an ellipsoid of revolution when $n < n'$.

1.3.1 Modeling the cornea of a human eye

An example where the refracting surface resembles an ellipsoid of revolution is the anterior corneal surface of the human eye. The human eye is a biological–optical system that focuses light rays on the concave surface of the retina. Anatomically, the eyeball of a normal adult eye is approximately spherical, with a diameter of about 23 mm. Light enters the eye first through the cornea, a transparent tissue on the front of the eye (Fig. 1.21). The average refractive index of the cornea is 1.376. In an emmetropic eye, the cornea has an anterior surface with a mean radius of 7.8 mm in the central area (around the vertex); away from the center, the radius increases as the cornea merges with the sclera (the outer layer covering the eyeball). This makes the shape of the anterior corneal surface resemble an ellipsoid. Various experimental studies [4] have shown that most normal eye corneas can be characterized by an ellipsoidal model with an eccentricity of 0.5 and a vertex offset toward the temporal zone of approximately 0.4 mm (measured with respect to the optical axis). The posterior surface of the cornea, with a mean radius of 6.5 mm, is separated from the anterior surface by approximately 0.6 mm. Right behind

Figure 1.21 Schematic diagram of the human eye, showing the cornea, sclera, iris, lens, aqueous humor, vitreous humor, retina, fovea, optic disc, visual axis, and optical axis. In emmetropic eyes, the anterior surface of the cornea resembles an ellipsoid of revolution.


the cornea is the anterior chamber, which contains aqueous humor with a refractive index of 1.336. Then there is the iris, which controls the amount of light that enters the eye by varying its internal diameter from 2 mm (for brightly lit objects) to 8 mm (for low-light objects). Behind the iris is the crystalline lens, shaped like a biconvex lens approximately 9 mm in diameter and 4 mm thick. The refractive index of the lens varies from around 1.406 in the inner core to around 1.386 in the outer zones. The crystalline lens can vary its shape to achieve a fine focus, so that the light coming from any external object is focused onto the retinal surface. Behind the crystalline lens there is another chamber with a transparent substance, called vitreous humor, with a refractive index of 1.337. Ultimately, the light is focused onto the retina, a concave surface that contains two classes of photoreceptor cells, the rods and the cones. A detailed description of the anatomy and function of the eye can be found in the works of Helmholtz [5], Le Grand and El Hage [6], Davson [7], and Smith and Atchison [8].

Due to the complexity of the human eye (e.g., the visual axis does not coincide with the optical axis; Fig. 1.21), optometry and ophthalmology usually work with simplified models of the human eye that enable us to analyze image formation on the retina. There are several models, some of which include an ellipsoidal model of the cornea [9]. For example, the human eye could be thought of as an optical system with a single refracting surface (the cornea), with an equivalent refractive index $n_{eq}$, as shown in Fig. 1.22. Rays coming from a very distant object point are focused on the same point on the retina if the shape of the cornea is assumed to be ellipsoidal, according to Eq. (1.15).

More simplified models use a sphere, instead of an ellipsoid, to represent the corneal shape. Such a simplification is adequate if the bundle of rays that comes from infinity is limited to a region very close to the optical axis (in which we can model a sphere as a second-order surface). If the beam of rays is not limited to a small region, the equal-OPL condition required by Fermat’s principle would not hold for all rays, and the image would no longer be a point but a blur spot.

Figure 1.22 Simplified model of the human eye. The eye is modeled as an ellipsoidal refracting surface with an equivalent refractive index $n_{eq}$.
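The difference between the ellipsoidal and the spherical models can be illustrated with a short numerical check. The Python sketch below uses assumed values throughout (an index of 4/3 standing in for $n_{eq}$, the distances $a$ and $b$, and the function names). It evaluates the OPL of Eq. (1.13) for collimated rays on the surface of Eq. (1.15) and on a sphere with the same vertex radius. For a conic, the vertex radius of curvature equals the semi-latus rectum, which here is $p$.

```python
import math

n, n_prime = 1.0, 4.0 / 3.0        # assumed indices; n' stands in for n_eq
a, b = 10.0, 24.0                  # assumed distances A->V and V->P' (mm)
e = n / n_prime                    # eccentricity, Eq. (1.15)
p = b * (n_prime - n) / n_prime    # constant p, Eq. (1.15)
axial_opl = n * a + n_prime * b    # right side of Eq. (1.13)

def ellipsoid_point(theta):
    """Point Q on the surface r(1 - e cos(theta)) = p, with P' at (b, 0) and V at the origin."""
    r = p / (1.0 - e * math.cos(theta))
    return b - r * math.cos(theta), r * math.sin(theta)

def opl_from_plane(x, y):
    """OPL from the reference plane x = -a to (x, y), then on to P' at (b, 0)."""
    return n * (a + x) + n_prime * math.hypot(b - x, y)

def sphere_sag(y, radius):
    """x coordinate at height y on a sphere of the given radius with vertex at the origin."""
    return radius - math.sqrt(radius**2 - y**2)

R_vertex = p   # vertex radius of the ellipsoid (semi-latus rectum of the conic)

rows = []
for theta in (0.0, 0.04, 0.08, 0.12):
    x, y = ellipsoid_point(theta)
    dev_ellipsoid = opl_from_plane(x, y) - axial_opl
    dev_sphere = opl_from_plane(sphere_sag(y, R_vertex), y) - axial_opl
    rows.append((y, dev_ellipsoid, dev_sphere))
    print(f"y = {y:6.3f} mm  ellipsoid: {dev_ellipsoid:+.1e}  sphere: {dev_sphere:+.1e}")
```

On the ellipsoid the deviation from the axial OPL stays at rounding level for every height, whereas on the sphere it grows with height, which is the origin of the blur spot mentioned above.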


1.3.2 Refraction at spherical surfaces

Although Cartesian ovals provide optical systems that form ideal images (point images of point objects), they are impractical in most image-forming systems because the ideal image is obtained only for a fixed position of the object. If the position of the object (along the optical axis) is changed, an ideal image will no longer be formed. On the other hand, spherical surfaces are easy to manufacture, and high-quality images can be obtained with them, for different object positions, by combining several spherical refracting surfaces and/or by limiting the transverse extent of the ray bundle to the region close to the optical axis.

To see how light rays behave at a refracting spherical surface, let us calculate the refraction of the rays according to Fermat’s principle based on Fig. 1.23. In Fig. 1.23, some quantities appear with a negative sign. This is because the figure uses the Cartesian sign convention, which will be used hereafter. Thus,
• The distances to the left of the vertex (or to the left of the point Q) are negative.
• The distances to the right of the vertex (or to the right of the point Q) are positive.
• The angles with respect to the optical axis (or with respect to the normal N) are negative if they turn clockwise.
• The angles with respect to the optical axis (or with respect to the normal N) are positive if they turn counterclockwise.
• The radius of curvature of the refracting surface is negative if the center of curvature is to the left of the vertex.
• The radius of curvature of the refracting surface is positive if the center of curvature is to the right of the vertex.
• The height of an object point outside the optical axis (i.e., the distance from the optical axis) is negative if the point is below the optical axis.

Figure 1.23 Geometry to calculate the refraction of light rays at a spherical surface.


• The height of an object point outside the optical axis (i.e., the distance from the optical axis) is positive if the point is above the optical axis.

Additionally, it is assumed that light travels from left to right and that all rays are contained in the same plane (defined by the points P, Q, and $P'$).

The sign convention in optical systems varies by author. The convention used in this book is the one usually employed in optical design. In general, the problem with having different sign conventions becomes apparent when some signs in the equations connecting object distances and image distances change. According to the Cartesian sign convention, in Fig. 1.23 all distances and angles are drawn as positive. For example, P is to the left of V and therefore $s < 0$; in the drawing we put $-s$ to indicate the length of the separation between P and V. Then, the OPL for the ray $PQP'$ is

$$\mathrm{OPL} = nl + n'l'. \tag{1.16}$$

Using the triangles PQC and $P'QC$, with $l$ and $l'$ taken as positive lengths and $\alpha$ the angle at C between the optical axis and the radius CQ, we get

$$l = \sqrt{R^2 + (-s + R)^2 - 2R(-s + R)\cos\alpha} \tag{1.17}$$

and

$$l' = \sqrt{R^2 + (s' - R)^2 - 2R(s' - R)\cos(\pi - \alpha)}. \tag{1.18}$$

Since $\alpha$ is the only variable on the right side of Eqs. (1.17) and (1.18), Fermat’s principle for the ray $PQP'$ implies that $d(\mathrm{OPL})/d\alpha = 0$, leading to

$$0 = n\frac{(-s + R)}{l} - n'\frac{(s' - R)}{l'}, \tag{1.19}$$

from which

$$\frac{n'}{l'} + \frac{n}{l} = \frac{1}{R}\left(\frac{n's'}{l'} + \frac{ns}{l}\right). \tag{1.20}$$

This equation tells us how far $P'$ is from V for the ray $PQP'$. It contains two unknowns, $s'$ and $l'$. Therefore, to find the position of $P'$, we can resort to a numerical method. However, to see how a bundle of rays emerging from the point P is refracted, instead of using Eq. (1.20), let us use the graphical ray tracing method described in Section 1.1.4. For example, Fig. 1.24 shows a five-ray trace on a spherical surface S of radius $R = 41$ mm, with $n = 1.0$ and $n' = 1.8$. The rays emerge from a point P located at $s = -149$ mm and reach the surface S with equidistant height increments.
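As one possible numerical method, the stationarity condition can be solved for $s'$ by bisection. The Python sketch below is only an illustration: it writes the condition with $l$ and $l'$ as positive lengths, i.e., $n(R - s)/l = n'(s' - R)/l'$, parametrizes the point Q by the angle $\alpha$ at C, and uses the surface data quoted for Fig. 1.24. The function name, the bracket, and the sample angles are assumptions.

```python
import math

n, n_prime = 1.0, 1.8      # values quoted for Fig. 1.24
R, s = 41.0, -149.0        # radius and object position (mm)

def image_distance_for_ray(alpha, s_max=1.0e6, iterations=200):
    """Solve n(R - s)/l = n'(s' - R)/l' for s' by bisection.

    alpha is the angle at C between the optical axis and the radius CQ;
    l and l' are taken as positive lengths.
    """
    qx, qy = R * (1.0 - math.cos(alpha)), R * math.sin(alpha)   # point Q on the sphere
    l = math.hypot(qx - s, qy)
    target = n * (R - s) / l

    def residual(s_prime):
        l_prime = math.hypot(s_prime - qx, qy)
        return target - n_prime * (s_prime - R) / l_prime

    lo, hi = R, s_max          # residual(lo) > 0 and residual(hi) < 0 for these values
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample_angles = (1.0e-4, 0.1, 0.2, 0.3)
crossings = [image_distance_for_ray(alpha) for alpha in sample_angles]
for alpha, s_prime in zip(sample_angles, crossings):
    print(f"alpha = {alpha:6.4f} rad  s' = {s_prime:8.3f} mm")
```

The value of $s'$ depends on $\alpha$, so rays that meet the surface at different heights cross the axis at different points.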

Figure 1.24 Graphical ray tracing on a spherical surface. The refracted rays do not converge at the same point.
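The graphical construction can be mimicked numerically by applying Snell’s law at the surface. The following Python sketch uses the standard vector form of Snell’s law and the surface data quoted for Fig. 1.24; the five ray heights and the function name are assumptions made only for illustration.

```python
import math

n, n_prime = 1.0, 1.8      # values quoted for Fig. 1.24
R, s = 41.0, -149.0        # mm; vertex V at x = 0, center C at x = R

def axis_crossing(h):
    """x coordinate where the ray from P that meets the sphere at height h crosses the axis."""
    qx = R - math.sqrt(R * R - h * h)          # point Q on the sphere
    length = math.hypot(qx - s, h)
    dx, dy = (qx - s) / length, h / length     # unit vector of the incident ray
    nx, ny = (R - qx) / R, -h / R              # unit normal at Q, pointing toward C
    cos_i = dx * nx + dy * ny
    mu = n / n_prime
    cos_t = math.sqrt(1.0 - mu * mu * (1.0 - cos_i * cos_i))
    k = cos_t - mu * cos_i
    tx, ty = mu * dx + k * nx, mu * dy + k * ny   # refracted unit vector (Snell's law)
    return qx - h * tx / ty                        # intersection with y = 0

heights = [4.0, 8.0, 12.0, 16.0, 20.0]            # assumed equidistant heights (mm)
crossings = [axis_crossing(h) for h in heights]
paraxial = n_prime / ((n_prime - n) / R + n / s)  # limit for rays very close to the axis

for h, x in zip(heights, crossings):
    print(f"h = {h:4.1f} mm  crossing at x = {x:7.2f} mm  (paraxial limit {paraxial:.2f} mm)")
```

For these values the crossing point moves toward the surface as the ray height increases, and it approaches the paraxial limit as the height goes to zero.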

From Fig. 1.24, it can be seen that the refracted rays do not converge at a single point on the optical axis. Since this defect appears when using a spherical surface instead of a Cartesian oval, the image is said to be affected by spherical aberration. It is also observed that, upon arrival at the optical axis on the image side, the rays that travel closer to the optical axis are less separated from each other than the rays that travel farther away from the optical axis. This suggests that, if the ray bundle reaching the spherical surface is limited to rays hitting only a small circular area very close to the optical axis, acceptable images can be obtained on the image side. Thus, for an object point on the optical axis, the incident rays that we should consider are those that travel almost parallel to the optical axis, as shown in Fig. 1.25. This is called the paraxial approximation. Using the approximations $l \approx -s$ and $l' \approx s'$ in Eq. (1.20) (recall that $s < 0$, whereas $l$ is a positive length), we get

$$\frac{n'}{s'} - \frac{n}{s} = \frac{(n' - n)}{R}, \tag{1.21}$$

which describes image formation by a spherical surface in the paraxial approximation. Equation (1.21) is called the Gauss equation.
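As a quick numerical illustration, the Gauss equation can be solved for $s'$. The Python sketch below (the function name is an assumption) applies it to the surface of Fig. 1.24 and to an object at infinity, following the Cartesian sign convention.

```python
import math

def image_distance(n, n_prime, R, s):
    """Solve the Gauss equation n'/s' - n/s = (n' - n)/R for s' (Cartesian signs)."""
    vergence_out = (n_prime - n) / R + (0.0 if math.isinf(s) else n / s)
    return math.inf if vergence_out == 0.0 else n_prime / vergence_out

# Surface of Fig. 1.24: R = 41 mm, n = 1.0, n' = 1.8, object at s = -149 mm.
s_prime = image_distance(1.0, 1.8, 41.0, -149.0)
print(f"s' = {s_prime:.1f} mm")          # about 140.6 mm, to the right of the vertex

# Object at infinity: the image lands at n' R / (n' - n).
s_prime_inf = image_distance(1.0, 1.8, 41.0, -math.inf)
print(f"s' = {s_prime_inf:.2f} mm")      # 92.25 mm
```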

Figure 1.25 In the paraxial approximation, the incident rays emerging from P are very close to the optical axis. Refracted rays converge to the point $P'$.


1.3.3 Focal lengths and focal points

A refracting spherical surface is called convex if the radius of curvature is positive; it is called concave if the radius of curvature is negative. For both surfaces, we are going to see what happens to an incident or refracted parallel bundle of rays when $n' > n$. Suppose that paraxial rays parallel to the optical axis hit a convex spherical surface, i.e., the point object is located at $s = -\infty$. Then, in Eq. (1.21), the distance $s'$ is a constant that depends only on the geometrical and optical parameters of the refracting surface. Such a distance is called the secondary focal length $f'$ and is determined by the formula

$$\frac{1}{f'} = \frac{(n' - n)}{n'}\frac{1}{R}. \tag{1.22}$$

Thus, $s' = f'$ is the distance (to the right of the vertex) at which all the rays converge. The point where the rays converge is called the secondary focal point $F'$. This is illustrated in Fig. 1.26(a). Now, where should a point object be placed so that the refracted rays emerge parallel? This implies that $s' = \infty$ and therefore, in Eq. (1.21), the variable $s$ becomes a constant that depends only on the specifications of the refracting surface. Such a constant, with its sign reversed, is called the primary focal length $f$ and is given by

$$\frac{1}{f} = \frac{(n' - n)}{n}\frac{1}{R}. \tag{1.23}$$

Figure 1.26 Focal lengths and focal points in a convex spherical refracting surface for which $n' > n$. (a) Secondary focal point $F'$ and (b) primary focal point F.
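The two focal lengths can be cross-checked against the Gauss equation with a few lines of Python. In this sketch the indices, the radii, and the function names are assumed for illustration; it covers both a convex and a concave surface.

```python
import math

def focal_lengths(n, n_prime, R):
    """Return (f, f') from Eqs. (1.23) and (1.22)."""
    f = n * R / (n_prime - n)
    f_prime = n_prime * R / (n_prime - n)
    return f, f_prime

def image_distance(n, n_prime, R, s):
    """Gauss equation, Eq. (1.21), solved for s'."""
    vergence_out = (n_prime - n) / R + (0.0 if math.isinf(s) else n / s)
    return math.inf if vergence_out == 0.0 else n_prime / vergence_out

n, n_prime = 1.0, 1.5                      # assumed indices
for R in (20.0, -20.0):                    # convex and concave surfaces (assumed radii, mm)
    f, f_prime = focal_lengths(n, n_prime, R)
    # Object at infinity images at s' = f'; an object at s = -f images at infinity.
    at_infinity = image_distance(n, n_prime, R, -math.inf)
    from_focus = image_distance(n, n_prime, R, -f)
    print(f"R = {R:+.0f}: f = {f:+.1f}, f' = {f_prime:+.1f}, "
          f"s'(s=-inf) = {at_infinity:+.1f}, s'(s=-f) = {from_focus}")
```

Note that the ratio $f'/f$ equals $n'/n$ for any radius, as follows directly from Eqs. (1.22) and (1.23).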

Thus, $s = -f$ is the distance (to the left of the vertex) at which a point object should be placed for its image to be located at infinity. The position where the point object is placed is called the primary focal point F. This is illustrated in Fig. 1.26(b). In Figs. 1.26(a) and 1.26(b), an orthogonal (dashed) line is included that passes through the vertex of the refracting surface. Note that the incident rays are drawn as refracted at the dashed line and not at the actual refracting surface (as it should be). This is done on purpose to emphasize the paraxial approximation and reminds us that in practice the rays travel very close to the optical axis. However, rays are deliberately drawn away from the optical axis to clearly illustrate the ray tracing used for locating the image.

The case of a concave surface is shown in Fig. 1.27. The focal lengths are given by Eqs. (1.22) and (1.23) but with $R < 0$, so $f' < 0$ and $f < 0$. When $s = -\infty$, the refracted rays diverge as if they were coming from a point $F'$ located at $s' = -|f'|$, as shown in Fig. 1.27(a). In such a case, we would say

Figure 1.27 Focal lengths and focal points in a concave refracting spherical surface for which $n' > n$. (a) Secondary focal point $F'$ and (b) primary focal point F.


that there is a virtual point image at $F'$ (secondary focal point). When $s' = \infty$, the incident rays converge toward the point F located at $s = |f|$, as shown in Fig. 1.27(b). In such a case, we would say that there is a virtual point object at F (primary focal point).

1.3.4 Focal planes

In Fig. 1.26(a), we have rays parallel to the optical axis, limited to the paraxial region, that focus to a secondary focal point. These rays come from an object point (at infinity) that is on the optical axis. If we now consider another object point outside the optical axis (also at infinity), the bundle of parallel rays that hits the refracting surface is oblique. These rays converge at a point that is at the focal length $f'$ from the refracting surface, measured along the auxiliary optical axis drawn from the off-axis object point through the center of curvature C. This is shown in Fig. 1.28. With respect to this auxiliary axis, the situation is equivalent to that of an object on the original optical axis. As a result, considering the refraction of the oblique bundles of parallel rays, the rays will be focused onto a curved surface on the image side. This surface is called the focal surface.

The paraxial approximation implies that the inclinations of the rays are limited to small angles. Therefore, in this case, all the oblique bundles of parallel rays will focus onto a small region of the focal surface that can be approximated by a plane. This is called the focal plane (Fig. 1.28). Accordingly, in the paraxial approximation, the focal planes are defined as planes perpendicular to the optical axis containing the focal points. Hence, primary and secondary focal planes can be established for each refracting surface. In the paraxial approximation, the following generalizations can be made for $n < n'$:
• An oblique bundle of parallel rays that hits a convex spherical refracting surface converges to a point on the secondary focal plane;

Figure 1.28 Focal surface. In the paraxial approximation, the focal surface can be described as a plane where oblique bundles of parallel rays converge.


and, for a point object located anywhere on the primary focal plane, the rays are refracted as an oblique bundle of parallel rays.
• An oblique bundle of parallel rays that hits a concave spherical refracting surface diverges from a point on the secondary focal plane; and, for a virtual point object located anywhere on the primary focal plane, the rays are refracted as an oblique bundle of parallel rays.

In both cases, the inclination of the bundle of parallel rays is given by the undeviated ray passing through the center of curvature C (Fig. 1.28).

Example: Refraction in a sphere

As an application of an optical system that contains two spherical surfaces, convex and concave, consider the image of a point object located at infinity that is generated by a sphere of radius $R_0$ and refractive index $n' = 3/2$. Refraction occurs at two surfaces, labeled as 1 and 2 in Fig. 1.29. The first surface has a radius $R = R_0$. The second has a radius $R = -R_0$ and is separated from the first by a distance $2R_0$. The object distance measured from the first surface is $s_1 = -\infty$. Therefore, the image generated by the first surface is at $s'_1 = f'_1 = 3R_0$. This first image is then seen from the second surface as a virtual object located at $s_2 = 3R_0 - 2R_0 = R_0$. The image of this virtual object, generated by the second surface, would be located at $s'_2 = R_0/2$.

It is worth noting that for the first image to be located right at the vertex of the second surface, i.e., at $s'_1 = 2R_0$ when $s_1 = -\infty$, per Eq. (1.21) the refractive index of the sphere would have to be $n' = 2$. In a way, this explains why the human eye is not completely spherical. If it were, with an equivalent refractive index $n_{eq} = 4/3$, Eq. (1.21) would place the image of a distant object at $4R_0$ from the first surface, behind the retina, and all of us would be hypermetropic (farsighted). Instead, the anterior corneal surface of the human eye has a smaller radius than the eyeball so that the refracted light hits the retina (at the vertex

Figure 1.29 The refraction in a sphere of refractive index $n' = 3/2$ generates the image of an object at $s = -\infty$ at the distance $s'_2 = R_0/2$ from the sphere’s back face (dashed line labeled as 2).


of the second surface), as illustrated in the simplified model of the eye shown in Fig. 1.22.

1.3.5 Paraxial imaging of extended objects

So far we have only considered point objects. If the object is not a point, we would say it is an extended object and its size can be described by an arrow of a certain height, measured from the optical axis. All rays that leave the object will be contained in a plane that includes the optical axis. This plane is called the meridional plane. With the help of the parallel-ray tracing method described in Section 1.3.3, we can graphically find the position and size of the image. Consider Fig. 1.30, which shows two refraction cases: (a) a convex surface and (b) a concave surface. In both cases, let us place an object of height $h$ at a distance $-s$ from the vertex.

In the first case, let us trace a ray that leaves the object tip and travels parallel to the optical axis. This ray is refracted toward the secondary focal point. Then, let us trace another ray, also coming out of the object tip, that passes through the primary focal point and is refracted traveling parallel to the optical axis. The two refracted rays intersect at a point that defines the tip of the image. Such an intersection point reveals the size of the image and its location, i.e., $-h'$ and $s'$, respectively. A third ray that one can draw, leaving the object tip, is the ray that passes through the center of curvature of the surface. Since the angle of incidence at the refracting surface is zero, this ray is

Figure 1.30 Graphical ray tracing to determine location and size of the image generated by spherical refracting surfaces with $n' > n$. (a) Convex, $R > 0$; (b) concave, $R < 0$.


not deflected as it passes through. Of course, this ray will also reach the point where the first two rays intersect.

In the second case, let us again trace a ray that leaves the object tip and travels parallel to the optical axis. This ray is refracted diverging from the secondary focal point. Next, let us trace a second ray, also coming out of the object tip and passing through the center of curvature of the refracting surface. As in the previous case, this ray is not deflected when passing through the refracting surface because its angle of incidence is zero. The two refracted rays are divergent, so they do not intersect (there is no real image). However, the two refracted rays appear to emerge from a common point behind the refracting surface. Such a point is the tip of a virtual image; its projection on the optical axis locates its position. Again, we have determined the size and position of the image, i.e., $h'$ and $-s'$, respectively. A third ray can be drawn leaving the object tip and directed toward the primary focal point (which is to the right of the surface). When this ray reaches the refracting surface, it is refracted in a direction parallel to the optical axis. This ray also appears to emerge from the tip of the virtual image.

The image magnification is the ratio between the image size and the object size, i.e., $m_t = h'/h$. If $m_t$ is negative, the image is inverted, as in Fig. 1.30(a); if $m_t$ is positive, the image is upright, as in Fig. 1.30(b). On the other hand, if $|m_t| > 1$, the image will be larger, and if $|m_t| < 1$, the image will be smaller, when compared to the size of the object. From the geometry of Fig. 1.30(a), looking at similar triangles, we get

$$m_t = \frac{R - s'}{R - s}. \tag{1.24}$$
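Equation (1.24) can be evaluated together with the Gauss equation. The Python sketch below uses assumed numerical values and function names, and treats a convex and a concave surface. As a consistency check it also evaluates the form $n s'/(n' s)$, which is not stated in the text but follows algebraically from combining Eqs. (1.21) and (1.24).

```python
def image_distance(n, n_prime, R, s):
    """Gauss equation, Eq. (1.21), solved for s' (finite s assumed)."""
    return n_prime / ((n_prime - n) / R + n / s)

def magnification_from_center(R, s, s_prime):
    """Eq. (1.24): similar triangles through the center of curvature C."""
    return (R - s_prime) / (R - s)

def magnification_from_vertex(n, n_prime, s, s_prime):
    """Equivalent paraxial form n s' / (n' s), obtained by combining Eqs. (1.21) and (1.24)."""
    return n * s_prime / (n_prime * s)

n, n_prime, s = 1.0, 1.8, -149.0           # assumed values (mm)
results = {}
for R in (41.0, -41.0):                    # convex and concave surfaces
    s_prime = image_distance(n, n_prime, R, s)
    m_center = magnification_from_center(R, s, s_prime)
    m_vertex = magnification_from_vertex(n, n_prime, s, s_prime)
    results[R] = (s_prime, m_center, m_vertex)
    print(f"R = {R:+.0f} mm: s' = {s_prime:+.2f} mm, m_t = {m_center:+.4f} ({m_vertex:+.4f})")
```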

This relationship is also valid for the case of Fig. 1.30(b), taking into account that $R < 0$.

1.3.6 Optical power and vergence

Another useful way to interpret Eq. (1.21) is by analyzing the change in curvature of the refracted wavefront, with respect to the incident wavefront, by using the refractive power of the spherical surface, which is defined as

$$P = \frac{(n' - n)}{R}. \tag{1.25}$$

The unit of optical power is called the diopter (D) and is defined as 1 D = 1 m$^{-1}$. Since we are in the paraxial approximation, the wavefronts emerging from a point object and converging to a point image are spherical. In particular, the curvature of the incident wavefront, hitting the vertex of the surface, would be $1/s$; the curvature of the refracted wavefront leaving the vertex, propagating on the right side of the surface, would be $1/s'$. These

30

Chapter 1

curvatures multiplied by the corresponding refractive indices are called vergences, U and V, respectively. The unit of vergence is also the diopter. In terms of these quantities, Eq. (1.21) becomes V  U ¼ P:

(1.26)

This relationship between vergences and power is very useful in optometry and is often used as

$$U + P = V. \tag{1.27}$$

Thus, the curvature $U/n$ of the object wavefront is modified by the power $P$ of the refracting surface, resulting in the curvature $V/n'$ of the image wavefront. Taking into account the Cartesian sign convention, a diverging wavefront has a negative curvature and a converging wavefront has a positive curvature. Also, whereas a divergent spherical wavefront corresponds to a real object or a virtual image, a convergent spherical wavefront would correspond to a real image or a virtual object. In a system of several refracting surfaces, the power of the $j$th surface is

$$P_j = \frac{n'_j - n_j}{R_j}, \tag{1.28}$$

where $n'_j$ and $n_j$ are the refractive indices to the right and to the left of the $j$th surface, respectively, and $R_j$ is the $j$th radius of curvature.
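As a numerical illustration of Eqs. (1.25) to (1.27), the following minimal sketch computes the power of a single refracting surface, the object vergence, and the resulting image vergence and image distance. The function name and the sample values (an air–glass surface with $n' = 1.5$, $R = 50$ mm, and an object 200 mm to the left of the vertex) are assumptions chosen only for illustration. Distances are given in meters so that powers and vergences come out in diopters, and the sketch assumes $V \neq 0$ (the object is not at the primary focal point).

```python
def refract_vergence(n, n_prime, R, s):
    """Return (P, U, V, s_prime) for one spherical refracting surface.

    Distances follow the Cartesian sign convention and are in meters,
    so P, U, and V are in diopters.
    """
    P = (n_prime - n) / R      # Eq. (1.25): power of the surface
    U = n / s                  # object vergence (index times curvature 1/s)
    V = U + P                  # Eq. (1.27): image vergence
    s_prime = n_prime / V      # image distance, from V = n'/s'
    return P, U, V, s_prime


# Hypothetical example: air-glass surface, object 0.2 m to the left of the vertex
P, U, V, s_prime = refract_vergence(n=1.0, n_prime=1.5, R=0.050, s=-0.200)
print(f"P = {P:.2f} D, U = {U:.2f} D, V = {V:.2f} D, s' = {s_prime:.3f} m")
```

With these values the diverging object wavefront ($U = -5$ D) is turned by the surface ($P = 10$ D) into a converging image wavefront ($V = 5$ D), i.e., a real image 0.3 m to the right of the vertex.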

1.4 Reflecting Surfaces

In addition to refracting light, an interface that separates two media can also reflect light. If the surface is smooth (like a mirror), the reflection will be specular; but if it is rough, the reflection will be diffuse. In this section, we will deal with specular reflection of light, which can also be described by light rays and, of course, complies with Fermat's principle (as mentioned in Section 1.2). Using Fig. 1.13, we can describe the reflection phenomenon in a plane mirror. By applying Fermat's principle to Fig. 1.13(a), an equation similar to Eq. (1.8) is obtained but with $n' = n$. The reflection of light on plane mirrors, and even on spherical mirrors, can be described from the equations obtained in Sections 1.2 and 1.3 (for refraction) if we make $n' = -n$ and follow the sign convention established in Section 1.3.2. Moreover, if we limit this treatment to mirrors immersed in air, then $n = 1$ and, therefore, $n' = -1$. With this, Eq. (1.2) becomes

$$\theta' = -\theta. \tag{1.29}$$

Figure 1.31 illustrates two cases of light reflection: (a) on a flat surface and (b) on a curved surface. In both cases, the angle of incidence of a light ray is measured with respect to the normal line N at the point on the surface where the ray is incident. The angle of reflection is also measured with respect to the normal N. The incident ray, the reflected ray, and the normal at the point of incidence all lie in the same plane, known as the plane of incidence. Thus, the law of reflection can be stated as follows: The angle formed by the reflected ray and the normal equals the angle formed by the incident ray and the normal. The two rays and the normal are contained in the same plane.

Figure 1.31 Reflection on a smooth surface. (a) A flat mirror and (b) a curved surface. The angle of reflection is equal to the angle of incidence but with a negative sign.

On the other hand, Eq. (1.21) becomes

$$\frac{1}{s'} + \frac{1}{s} = \frac{2}{R} \tag{1.30}$$

for spherical mirrors. According to the sign convention described in Section 1.3.2, $R > 0$ if the center of curvature is to the right of the mirror vertex and $R < 0$ if the center of curvature is to the left of the mirror vertex. In the former case, we say that the mirror is convex; in the latter case, we say that it is concave. The distances $s$ and $s'$ are measured with respect to the mirror vertex.

1.4.1 Ray tracing for spherical mirrors

In spherical mirrors, two incident rays are usually traced: a ray parallel to the optical axis and a ray directed to the mirror vertex, as shown in Fig. 1.32. For the first ray $s = -\infty$, so $s' = R/2$. Therefore, if $R < 0$, incident paraxial rays parallel to the optical axis will be reflected converging at a point located to the left of vertex V at a distance $|R|/2$, i.e., at $s' = f = R/2 = -|R|/2$. This point is called the focal point F of the mirror and will be the image of a point object located at infinity. If $R > 0$, incident paraxial rays parallel to the optical axis will be reflected diverging from a point located to the right of V at the distance $s' = f = R/2$. Now the focal point F is virtual, and the image of a point object located at infinity will be a virtual point image at F. On spherical surfaces, the normal at the point of incidence is a radial line with origin at the center of curvature C. For the second ray, the situation is very simple. At the vertex the normal of the surface coincides with the optical axis, so for both (concave and convex) mirrors the ray is reflected to the left and downward. The trace of these two rays is shown in Fig. 1.32(a) for the concave mirror and in Fig. 1.32(b) for the convex mirror. With these two rays, we can locate the image generated by the mirrors.

Figure 1.32 Rays in spherical mirrors in the paraxial approximation. (a) A concave mirror ($R < 0$) and (b) a convex mirror ($R > 0$).

In Fig. 1.33, the ray tracing for four different scenarios in a concave mirror is shown. In Figs. 1.33(a), (b), and (c), the image is real and inverted, going from smaller to larger than the object. In particular, in (b) $s = R$ and $s' = R$, and the image is the same size as the object. In (d) the image is virtual, upright, and larger than the object. The image magnification $m_t = h'/h$ can be obtained by considering the ray that goes to the mirror vertex. According to the trace of that ray, we have two similar triangles that satisfy the relation $h/(-s) = (-h')/(-s')$. Therefore, the image magnification will be

$$m_t = -\frac{s'}{s}. \tag{1.31}$$
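The four scenarios of Fig. 1.33 can be checked numerically with Eqs. (1.30) and (1.31). The sketch below is only an illustration: the function name and the sample mirror ($R = -100$ mm, so $f = -50$ mm) are assumptions, and the object is assumed not to sit exactly at the focal point.

```python
def mirror_image(s, R):
    """Image distance s' and magnification m_t for a spherical mirror.

    Uses 1/s' + 1/s = 2/R (Eq. 1.30) and m_t = -s'/s (Eq. 1.31) with the
    Cartesian sign convention: s < 0 for a real object, R < 0 for a
    concave mirror, R > 0 for a convex mirror.
    """
    inv_s_prime = 2.0 / R - 1.0 / s
    if inv_s_prime == 0.0:
        raise ValueError("object at the focal point: the image is at infinity")
    s_prime = 1.0 / inv_s_prime
    return s_prime, -s_prime / s


R = -100.0  # hypothetical concave mirror (mm)
for s in (-300.0, -100.0, -75.0, -25.0):
    s_prime, m_t = mirror_image(s, R)
    print(f"s = {s:7.1f}  s' = {s_prime:8.1f}  m_t = {m_t:6.2f}")
```

For the first three object positions the image lies to the left of the vertex ($s' < 0$, a real image) with $m_t < 0$ (inverted) and $|m_t|$ growing from 0.2 through 1 to 2. For the object inside the focal length ($s = -25$ mm), $s' > 0$ and $m_t = +2$: a virtual, upright, enlarged image.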

Figure 1.33 Image formation in concave spherical mirrors. (a) Real image, inverted, and smaller than the object. (b) Real image, inverted, and of the same size as the object. (c) Real image, inverted, and larger than the object. (d) Virtual image, upright, and larger than the object.

Figure 1.34 Image formation in a convex mirror. The image is always virtual, upright, and smaller than the object, and its apparent position is between the vertex V and the focal point F.

Thus, in the case shown in Fig. 1.33(b) the magnification turns out to be $m_t = -1$. The minus sign ($-$) indicates that the image is inverted. In Fig. 1.34, the ray tracing in a convex mirror is shown. For any position of the object, the image is virtual, upright, and smaller than the object. The image seems to be behind the mirror, and the image magnification can still be computed from Eq. (1.31).

1.4.2 The parabolic mirror

The ray tracing presented in Section 1.4.1 for spherical mirrors is limited to the paraxial approximation. For the general, nonparaxial case, consider the imaging process in a concave spherical mirror for a point object located at infinity but on the optical axis. This implies that the light reaching the mirror can be considered a bundle of parallel rays. In the paraxial approximation, all the rays in the bundle converge to the focal point [Fig. 1.32(a)]. However, if we drop the paraxial approximation and apply the law of reflection to each of the rays in the bundle, the reflection of rays occurs as shown in Fig. 1.35. As with spherical refracting surfaces, this is called spherical aberration.

Figure 1.35 Parallel rays in a concave spherical mirror do not converge to the same point.

Figure 1.36 Nonparaxial rays in a concave spherical mirror, traveling parallel to the optical axis, reflect crossing the optical axis at a point other than the focal point. As the angle $\theta$ increases, the reflected ray crosses the optical axis at a point closer to the vertex.

This can be easily verified with the help of Fig. 1.36. A ray parallel to the optical axis hits the mirror at Q and is then reflected through point X on the optical axis. The triangle CXQ is isosceles and its side CQ is the radius of the mirror, so the following relationship can be established:

$$\overline{CX} + \overline{QX} > R, \tag{1.32}$$

as long as Q does not coincide with V. Also, $\overline{CX} = \overline{QX}$; therefore,

$$\overline{CX} > \frac{R}{2}, \tag{1.33}$$

and the focal point F is at a distance $R/2$ from C. This implies that the nonparaxial rays that hit parallel to the optical axis do not converge to F. As the incidence angle $\theta$ increases, the reflected ray crosses the optical axis at a position that is closer to the mirror vertex.

So, what shape should the concave mirror have to focus the rays parallel to the optical axis at a single point? The answer is, again, given by Fermat's principle. Based on the geometry shown in Fig. 1.37, applying Fermat's principle,

$$\mathrm{OPL} = a + r\cos\theta + r. \tag{1.34}$$

All rays from the line orthogonal to the optical axis passing through P must have the same OPL to reach F. Defining a new constant $p = \mathrm{OPL} - a$, then

$$r(1 + \cos\theta) = p. \tag{1.35}$$

This is the equation of a parabola. Thus, the shape of the mirror we are looking for is a paraboloid of revolution. Note that Eq. (1.35) can also be obtained if we set $n' = -n = -1$ in Eq. (1.15).

Figure 1.37 A parabolic mirror focuses the parallel rays coming from infinity at the focal point.

In Cartesian coordinates, the profile of the paraboloid can be written as

$$z = \frac{y^2}{2R_v}, \tag{1.36}$$

where $z$ is the coordinate along the optical axis, $y$ is the meridional coordinate, and $R_v$ is the radius of curvature of the parabola at the vertex. Then, the focal point of the mirror is located at the distance $R_v/2$ from the vertex V (Fig. 1.37).
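The difference between the spherical and the parabolic mirror can be illustrated with a short exact (nonparaxial) ray trace. The sketch below applies the law of reflection in vector form to a ray that travels parallel to the axis and hits the mirror at height $y$, and returns the point at which the reflected ray crosses the axis. It is a minimal illustration under stated assumptions: the function names are hypothetical, $z$ is measured from the vertex toward the concave side (so all distances are positive magnitudes rather than Cartesian signed values), $R_v = 100$ mm is an arbitrary sample value, and $y$ must be nonzero.

```python
import math


def axis_crossing(y, z, slope):
    """Distance from the vertex at which the reflected ray crosses the axis.

    The incident ray travels parallel to the axis (in the -z direction) and
    hits the mirror profile z(y) at height y, where dz/dy = slope.
    """
    norm = math.hypot(slope, 1.0)
    ny, nz = -slope / norm, 1.0 / norm      # unit normal at the point of incidence
    dy, dz = 0.0, -1.0                      # direction of the incident ray
    dot = dy * ny + dz * nz
    ry, rz = dy - 2 * dot * ny, dz - 2 * dot * nz   # law of reflection (vector form)
    t = -y / ry                             # ray parameter at which y = 0
    return z + t * rz


def sphere_crossing(y, Rv):
    """Spherical profile of radius Rv: z = Rv - sqrt(Rv^2 - y^2)."""
    root = math.sqrt(Rv**2 - y**2)
    return axis_crossing(y, Rv - root, y / root)


def parabola_crossing(y, Rv):
    """Parabolic profile of Eq. (1.36): z = y^2 / (2 Rv)."""
    return axis_crossing(y, y**2 / (2 * Rv), y / Rv)


Rv = 100.0  # hypothetical vertex radius of curvature (mm)
for y in (1.0, 20.0, 40.0, 60.0):
    print(f"y = {y:5.1f}  sphere: {sphere_crossing(y, Rv):7.3f}  "
          f"parabola: {parabola_crossing(y, Rv):7.3f}")
```

For the sphere the crossing point moves from about $R_v/2$ toward the vertex as $y$ grows (37.5 mm at $y = 60$ mm), in agreement with Eq. (1.33), whereas for the parabola it stays at $R_v/2 = 50$ mm for every ray height.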

1.5 Lenses: Thin Lens Approximation

If we want to obtain images in a medium other than that of the spherical refracting surface discussed in Section 1.3.2, another refracting surface should be included. In this section, we will deal with refracting elements limited by two spherical surfaces with a common optical axis. These types of elements are called lenses. To obtain the position of the image of a point object, we should use Eq. (1.21) at each surface. The Gaussian equation for surface 1 is

$$\frac{n'_1}{s'_1} - \frac{n_1}{s_1} = \frac{n'_1 - n_1}{R_1}; \tag{1.37}$$

for surface 2, it is

$$\frac{n'_2}{s'_2} - \frac{n_2}{s_2} = \frac{n'_2 - n_2}{R_2}. \tag{1.38}$$

Figure 1.38 Locating the image of an object generated by a positive lens.

Limiting ourselves to a lens of refractive index $n_l$ immersed in a single medium of refractive index $n < n_l$, as shown in Fig. 1.38, then $n_1 = n'_2 = n$ and $n'_1 = n_2 = n_l$. By adding Eqs. (1.37) and (1.38) and dividing by $n$, we get

$$\frac{1}{s'_2} - \frac{1}{s_1} = \frac{n_l - n}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{n_l}{n\,s_2} - \frac{n_l}{n\,s'_1}. \tag{1.39}$$

The lens thickness $t$ is the distance between the vertices of surfaces 1 and 2, related to the distances $s'_1$ and $s_2$ through the relationship $t = s'_1 - s_2$. If $t \ll s'_1$ and $t \ll s_2$, such that $s'_1 \approx s_2$, then the last two terms on the right side of Eq. (1.39) cancel each other. This approximation implies that the lens thickness is negligible compared with the radii of curvature of the two surfaces and is known as the thin lens approximation. Defining the object distance as $s_o = s_1$ and the image distance as $s_i = s'_2$, Eq. (1.39) can be rewritten as

$$\frac{1}{s_i} - \frac{1}{s_o} = \frac{n_l - n}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right), \tag{1.40}$$

which is known as the thin lens equation. The right side of Eq. (1.40) is a constant that only depends on the lens parameters. The inverse of this constant is called the lens focal length $f$ and is given by

$$\frac{1}{f} = \frac{n_l - n}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{1.41}$$

Using the definition of the focal length, we arrive at the Gaussian equation for lenses:

$$\frac{1}{s_i} - \frac{1}{s_o} = \frac{1}{f}. \tag{1.42}$$
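Equations (1.41) and (1.42) translate directly into a few lines of code. The following sketch is an illustration only: the function names and the sample lens (a symmetric biconvex lens with $n_l = 1.5$ in air and $|R_1| = |R_2| = 100$ mm) are assumptions, and the sketch does not handle the degenerate cases $1/f = 0$ or an object at the primary focal point.

```python
def thin_lens_focal_length(n_lens, n_medium, R1, R2):
    """Focal length of a thin lens from Eq. (1.41).

    A flat surface can be passed as float('inf'), since 1/inf evaluates to 0.0.
    """
    inv_f = (n_lens - n_medium) / n_medium * (1.0 / R1 - 1.0 / R2)
    return 1.0 / inv_f


def thin_lens_image(s_o, f):
    """Image distance from the Gaussian equation, Eq. (1.42): 1/s_i - 1/s_o = 1/f."""
    return 1.0 / (1.0 / f + 1.0 / s_o)


# Hypothetical biconvex lens in air; distances in mm
f = thin_lens_focal_length(n_lens=1.5, n_medium=1.0, R1=100.0, R2=-100.0)
s_i = thin_lens_image(s_o=-300.0, f=f)
print(f"f = {f:.1f} mm, s_i = {s_i:.1f} mm")
```

For this lens $f = 100$ mm, and an object 300 mm to the left of the lens is imaged 150 mm to its right.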

The focal length is a parameter that identifies the lens. The sign of the focal length depends on the values of the refractive indices $n$ and $n_l$ and the radii of curvature $R_1$ and $R_2$. Since in practice most lenses are used in …

… 1, the image will always be to the right of the secondary focal point (at $x_i = |f|/a$) and to the left of the lens; i.e., the positional range of the image is $-|f| \le s_i < 0$. The image magnification will be $m_t = 1/a$. In Fig. 1.44, the positional ranges for an object and the corresponding positional ranges for its image through a negative lens are displayed. The image is always smaller than the object.

Figure 1.44 A negative lens forms virtual images that are upright and smaller than the object, located between the object and the lens.

1.5.4 Focal planes in thin lenses

Similar to the definition of focal planes given in Section 1.3.4, thin lens focal planes are planes perpendicular to the optical axis containing the focal points. Within the paraxial approximation, oblique bundles of parallel rays hitting a lens can be refracted in two ways: (1) by converging to a point located in the secondary focal plane if the lens is positive, and (2) by diverging from a point located in the secondary focal plane if the lens is negative. On the other hand, refracted rays come out of the lens as oblique bundles of parallel rays when: (1) a point object is placed in the primary focal plane of a positive lens, and (2) a virtual object is located in the primary focal plane of a negative lens. These cases are illustrated in Fig. 1.45. In any case, a ray passing through the center of the lens is not deflected when refracted.

Figure 1.45 Focal planes in positive and negative lenses.

1.5.5 Ray tracing for oblique rays

Although they have been mentioned in Sections 1.3.4 and 1.5.4, oblique rays can be specifically defined as those rays that leave the tip of an object and do not travel in a direction parallel to the optical axis. In particular, we are going to deal with oblique rays that are kept in the meridional plane. We have already dealt with some of them in Section 1.5.4, e.g., the ray passing through the center of a lens and the ray directed toward a primary focal point. In this section, we are interested in oblique rays that are directed in any other direction, which can occur in ray tracings that involve a combination of two or more lenses.

For example, in Fig. 1.46, a ray refracted by the lens $L_1$ is directed to the secondary focal point of $L_1$. As it hits the lens $L_2$, how is the ray refracted? With the ray tracing technique illustrated in Fig. 1.40, we do not have a solution. However, the ray tracing shown in Fig. 1.45 gives us a hint as to how to graphically determine the refraction generated by $L_2$. If we assume that the oblique ray reaching $L_2$ is part of a bundle of parallel rays, as illustrated in Fig. 1.47(a), such a ray will be refracted (according to Fig. 1.45), diverging from a point located in the secondary focal plane of the lens $L_2$. Such a point is determined by the intersection of a ray that goes through the center of the lens (auxiliary ray) and the secondary focal plane of $L_2$. With this in mind, we can determine the refraction direction of the oblique ray reaching the lens $L_2$, as illustrated in Fig. 1.47(a). The backward extension of the refracted ray should meet the auxiliary ray, which passes through the center of the lens, at the focal plane. A similar procedure can be used for a positive lens, as illustrated in Fig. 1.47(b).

Figure 1.46 Ray tracing for a ray parallel to the optical axis coming out of the object tip. The ray refracted by lens 1 becomes an oblique ray for lens 2.

Figure 1.47 Oblique rays incident on thin lenses. (a) A negative lens and (b) a positive lens. To determine the refraction of an oblique ray hitting a lens, an auxiliary ray passing through the center of the lens and parallel to the oblique ray is drawn. The oblique ray is refracted traveling toward, or coming from, the point where the auxiliary ray intersects the secondary focal plane of the lens.

In summary, to determine the refraction direction of an oblique ray incident on a lens, an auxiliary ray can be drawn parallel to the oblique ray so that it passes through the center of the lens. The oblique ray is refracted so that the ray, or its backward extension, passes through the point where the auxiliary ray intersects the secondary focal plane of the lens.

Example: two positive lenses

Consider an optical system with two thin lenses of focal lengths $f_1 = 50$ mm and $f_2 = 35$ mm separated by 20 mm. Let us find the position and size of the image generated by this optical system when an object is located at $s_{o1} = -70$ mm and its size is $h_o = 11.80$ mm. Graphical ray tracing should be shown as well as analytical verification of the results.

The graphical solution is shown in Fig. 1.48, in which two rays have been drawn exiting the object tip: one ray parallel to the optical axis and a second ray directed to the center of the first lens. The two rays refracted by lens 1 reach lens 2 as oblique rays. Using the oblique ray tracing method, the final refraction of the two rays is obtained. At the point where these two rays intersect, we find the tip of the image.

Figure 1.48 Ray tracing in a system of two positive lenses.

Analytical verification can be performed with Newton's equation. For the first lens, $x_{o1} = -20$; hence, $x_{i1} = 50^2/20 = 125$. For the second lens, $s_{o2} = 125 + 50 - 20 = 155$, from where $x_{o2} = 155 + 35 = 190$ and thus $x_{i2} = -35^2/190 \approx -6.44$. In this way, we get $s_{i2} = 35 - 6.44 = 28.56$, as in Fig. 1.48. All distances are given in millimeters.

The image magnification of an optical system is the product of the magnifications generated by each optical element in the system. The first lens magnification is $m_{t1} = -125/50 = -2.5$, and the second lens magnification is $m_{t2} = 6.44/35 \approx 0.18$; thus, the magnification of the optical system in this example is $m_t = m_{t1} m_{t2} \approx -0.46$. Lastly, the image size is $h_i = m_t h_o \approx -5.43$ mm.
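The analytical verification of this example can be reproduced with a short script. The sketch below assumes Newton's equation in the form $x_o x_i = -f^2$, with $x_o = s_o + f$ measured from the primary focal point, $x_i = s_i - f$ measured from the secondary focal point, and $m_t = -x_i/f$; these forms are inferred from the numbers used in the example, and the function name is hypothetical.

```python
def newton_image(s_o, f):
    """Image of a thin lens via Newton's equation x_o * x_i = -f**2.

    x_o is measured from the primary focal point and x_i from the secondary
    focal point, so x_o = s_o + f and s_i = f + x_i.
    Returns (s_i, m_t) with m_t = -x_i / f.
    """
    x_o = s_o + f
    x_i = -f**2 / x_o
    return f + x_i, -x_i / f


f1, f2, d = 50.0, 35.0, 20.0      # focal lengths and lens separation (mm)
s_o1, h_o = -70.0, 11.80          # object distance and object height (mm)

s_i1, m1 = newton_image(s_o1, f1)
s_o2 = s_i1 - d                   # the image of lens 1 is the object of lens 2
s_i2, m2 = newton_image(s_o2, f2)
m_t = m1 * m2                     # total magnification is the product
print(f"s_i1 = {s_i1:.2f}, s_o2 = {s_o2:.2f}, s_i2 = {s_i2:.2f}")
print(f"m_t = {m_t:.3f}, h_i = {m_t * h_o:.2f} mm")
```

Without intermediate rounding, the script gives $s_{i2} \approx 28.55$ mm, $m_t \approx -0.46$, and $h_i \approx -5.43$ mm, which agrees with the values above to within the rounding of $x_{i2}$.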

1.6 Lenses: Principal Planes

In Section 1.5, we discussed lens imaging in the paraxial range using the thin lens approximation. A thin lens can be thought of as a simplification of a real lens in which two refracting surfaces with a common optical axis meet in a plane, orthogonal to the optical axis, from where the object distance and image distance are measured. The Gaussian equation for lenses, i.e., Eq. (1.42), connects these two distances by using a lens parameter, the focal length, which depends on the radii of curvature of the two refracting surfaces, the refractive index of the lens, and the refractive index of the medium surrounding the lens. Once the focal length is specified, the primary and secondary focal points can be identified. In relation to these focal points, Newton's lens equation is established, from which the position and size of the image can be determined.

With the thin lens approximation, we can design and analyze an optical system quite well; i.e., we can establish the most fundamental parameters of the optical system, such as the number of lenses to be used and their focal lengths, the spacing between lenses, the angular size of the image, the amount of energy the system can collect, etc. However, to analyze image quality, i.e., to determine how similar the image is to the object, it is necessary to consider the optical aberrations of the system and analyze light diffraction. Because this requires precise knowledge of the actual geometry of the lenses, the thickness of the lenses must be taken into account.

The first parameter with which we identify a lens is its focal length. The focal length can be determined by using a bundle of rays that travels toward the lens in a direction parallel to the optical axis. The point where the refracted rays converge, if the lens is positive, or the point from where the rays diverge, if the lens is negative, must be located. That point of convergence or divergence is the secondary focal point $F'$. In the thin lens approximation, the focal length corresponds to the distance between the plane representing the thin lens and the secondary focal point. In the case of a real lens (commonly referred to as a "thick lens"), we can easily define the secondary focal point as shown in Fig. 1.49. But what is the focal length? And what is the point or plane from where such length is measured?

Figure 1.49 Secondary focal point in a real (thick) lens.

In a thin lens, the forward projection of a ray incident parallel to the optical axis intersects the backward projection of its corresponding refracted ray in a single plane, i.e., a plane that serves as the (principal) plane representing the thin lens. Now, if in Fig. 1.49 we extend forward the ray incident parallel to the optical axis and extend backward the corresponding refracted ray after the lens, to the point where the two rays intersect, we can identify a plane orthogonal to the optical axis that contains the point of intersection. A thin lens can be placed in that plane, omitting the actual lens, with a focal length that is equal to the distance between that plane and the secondary focal point. With this construction, the refraction of an incident ray parallel to the optical axis in both the real lens and the thin lens would be equivalent; the ray would converge toward the secondary focal point. The plane in which the equivalent thin lens can be placed is called the secondary principal plane and is denoted by $H'$, as in Fig. 1.50, and the focal length of such a thin lens will be the focal length $f$ of the real lens.

Figure 1.50 Establishing focal points and focal lengths in a real lens. The focal lengths are measured from the principal planes $H$ and $H'$.

This focal length, derived in Appendix A for a lens immersed in air by means of paraxial ray tracing equations, is given by

$$\frac{1}{f} = (n_l - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{(n_l - 1)^2\, t}{n_l R_1 R_2}, \tag{1.50}$$

where $t$ is the thickness of the lens. The primary focal point $F$ can be obtained with a similar strategy by sending a ray (parallel to the optical axis) that travels from right to left toward the lens. The focal length obtained in such a case is equal to the one computed from Eq. (1.50) but measured from the primary principal plane, denoted by $H$, as shown in Fig. 1.50.

A real lens can be represented using the principal planes. From these, we can measure the object distance and the image distance, as illustrated by Fig. 1.51. By comparing the geometry of Figs. 1.41 and 1.51, we can establish that the Gaussian and Newtonian equations for lenses, Eqs. (1.42) and (1.47), are also valid in real lenses, although the object and image distances are to be measured from the principal planes and the lens focal length is given by Eq. (1.50).

In a thin lens, a ray directed to the center of the lens does not deviate. In the real lens, there is an equivalent ray but not with a straight path line. There is a ray that leaves the object tip and is directed to the point where the optical axis intersects the primary principal plane. The point of intersection is denoted by $P$ and is called the primary principal point. Such a ray is refracted, after the lens, traveling with the same inclination as the incident ray, as if it were leaving the point where the optical axis intersects the secondary principal plane. That point of intersection is denoted by $P'$ and is called the secondary principal point (see Fig. 1.51). The actual trajectory of the ray within the lens will, of course, be a straight line going from the point where the incident ray intersects the first surface to the point where the refracted ray intersects the second surface.

Figure 1.51 Object and image distances in a real lens with respect to its principal planes.

When the lens is immersed in two media of different refractive indices, i.e., the object is in one medium and its image is in another (both with a different refractive index than the lens), the oblique ray that does not change its inclination when refracted no longer passes through the principal points. However, another pair of points can be defined on the optical axis that allow us to know the direction of this oblique ray. These points are called nodal points, and the ray is called the nodal ray. An example of lenses immersed in different media is found in underwater cameras (water–glass–air). In cases like these, Eqs. (1.42) and (1.47) are no longer valid. We will not consider these types of cases in this book. We will deal with lenses immersed in the same medium, so that the nodal points and the principal points coincide.

The position of the principal planes is usually given by the signed distance from the vertex of the refracting surface to the principal plane (Fig. 1.51; see also Appendix A). For the plane $H$,

$$\overline{VP} = -f\,\frac{(n_l - 1)\,t}{R_2\, n_l}, \tag{1.51}$$

and for $H'$,

$$\overline{V'P'} = -f\,\frac{(n_l - 1)\,t}{R_1\, n_l}. \tag{1.52}$$
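As a numerical illustration of Eqs. (1.50) to (1.52), the sketch below evaluates the focal length and the principal-plane positions of a thick lens in air. It rests on stated assumptions: $\overline{VP}$ and $\overline{V'P'}$ are taken as signed distances from each vertex to its principal point (positive to the right, following the Cartesian convention), the function name is hypothetical, and the sample lens (symmetric biconvex, $n_l = 1.5$, $|R_1| = |R_2| = 100$ mm, $t = 10$ mm) is chosen only for illustration.

```python
def thick_lens(n_l, R1, R2, t):
    """Focal length and principal-plane positions of a thick lens in air.

    Returns (f, VP, VpPp): f from Eq. (1.50); VP and VpPp are the signed
    distances from each vertex to its principal point, Eqs. (1.51)-(1.52).
    """
    inv_f = (n_l - 1.0) * (1.0 / R1 - 1.0 / R2) \
        + (n_l - 1.0) ** 2 * t / (n_l * R1 * R2)
    f = 1.0 / inv_f
    VP = -f * (n_l - 1.0) * t / (R2 * n_l)      # first vertex V to P
    VpPp = -f * (n_l - 1.0) * t / (R1 * n_l)    # second vertex V' to P'
    return f, VP, VpPp


# Hypothetical symmetric biconvex lens; distances in mm
f, VP, VpPp = thick_lens(n_l=1.5, R1=100.0, R2=-100.0, t=10.0)
print(f"f = {f:.2f} mm, VP = {VP:.2f} mm, V'P' = {VpPp:.2f} mm")
```

For this lens $f \approx 101.7$ mm, slightly longer than the 100 mm obtained from the thin lens formula, and the two principal points lie about 3.4 mm inside the lens from each vertex, symmetrically placed as expected for a symmetrical lens.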

In Fig. 1.52, the location of the principal planes is shown for some positive and negative lens configurations. In symmetrical lenses, the principal planes are also symmetrical; in lenses with a flat surface, one of the principal planes is …

… $s'$ ($a_s > 0$) and negative if $S' < s'$ ($a_s < 0$).

Figure 1.88 Spherical aberration. Nonparaxial marginal rays do not converge at the same point on the optical axis.

Figure 1.89 A caustic of rays affected by spherical aberration. The best image for P is achieved in the caustic waist.

Figure 1.89 shows the refraction of a ray bundle, diverging from the point O, that is affected by spherical aberration. In the Gaussian image plane, the intersection of the rays determines the size of the point image, as shown in the figure. The curve tangent to the refracted rays (their envelope) is called a caustic and is a curve of maximum energy. The caustic is seen to have a minimal cross section called the waist. At this cross section, the refracted rays have the lowest radial scatter. There we have the best image of the point O, i.e., the least possible blur for the given AS size. Spherical aberration affects off-axis points of the object in the same way.

1.9.3 Distortion

Let us assume that the AS in Fig. 1.89 is closed enough so that the effect of spherical aberration can be ignored for rays exiting the axial point O; again, the image of O is a point at $O'$, as shown in Fig. 1.90. On the other hand, from the point P, which is off-axis, a ray bundle also comes out with the rays traveling very close to the chief ray and hitting point Q on the refracting surface. Therefore, for the rays leaving P, the spherical aberration effect with respect to the chief ray can also be ignored. However, the distance between the point Q and the auxiliary axis $PV'P''$ is greater than the size of the AS; thus, the chief ray leaving P is affected by spherical aberration, and the image $P'$ of P will be at a distance $\mathrm{SphL} \approx a_s\,\overline{QV'}^{\,2}$ from the Petzval surface. Because the image is observed in the Gaussian image plane, what we will see is the projection of rays that diverge from $P'$. Since the AS is small and in practice $\overline{VO'} \gg \mathrm{SphL}$, the intersection of these rays with the Gaussian image plane will look roughly like a point, denoted by $P'''$ in Fig. 1.90. Therefore, $P'''$ will be the image of P in the Gaussian image plane. The difference between the height $\bar{h}'$ of $P'''$ and the height $h'$ of the Gaussian image $P''$ is called distortion aberration or simply distortion (Dist).

Figure 1.90 Distortion aberration warps the image but maintains the focus of each point. This aberration depends on the position of the AS.

In Fig. 1.86, the AS is located at the center of curvature of the surface. Note that the projection of the rays diverging from $P'$ reaches $P''$, so there is no distortion. The distortion depends on the position of the AS. In Fig. 1.90, the AS is to the left of C and the height of the image $P'''$ is less than the height of the Gaussian image $P''$; in this case, the image is said to have barrel distortion. If the AS is placed to the right of C, we will have the opposite: the height of the image $P'''$ will be greater than the height of the Gaussian image $P''$. In this case, the image is said to have pincushion distortion. In a first approximation, the distortion also depends on the cube of $h'$,

$$\mathrm{Dist} = a_t h'^3 = \bar{h}' - h', \tag{1.79}$$

where $a_t$ is the distortion coefficient that depends on the system parameters and the position of the AS. The possible configurations for an image affected by distortion are shown in Fig. 1.91 for an optical system of magnification $m_t = -1$ formed by a thin biconvex lens. When the AS is right next to the lens, the distortion is roughly zero. Distortion is strictly zero when the AS is at the center (inside) of the lens. In a multiple lens system, the AS will be located between the lenses, seeking the greatest possible symmetry to have the least distortion.

Figure 1.91 Examples of distortion aberration. If the AS is to the left of the lens, the image has barrel distortion. If the AS is right next to the lens, the image is (practically) without distortion. If the AS is to the right of the lens, the image has pincushion distortion.

1.9.4 Astigmatism and coma

To see these two aberrations, we will first define the tangential plane and the sagittal plane. In Section 1.3.5, the meridional plane is defined as the plane containing the optical axis. In that section, the extended objects are represented as arrows of a certain height contained in the meridional plane, and the rays emerging from any point on the object are also limited to the meridional plane. In an optical system, the plane that contains the optical axis and the chief ray is called the tangential plane, and the plane that contains the chief ray and is orthogonal to the tangential plane is called the sagittal plane. The tangential plane is a meridional plane, but the sagittal plane is not. The tangential plane maintains its spatial orientation throughout the optical system, whereas the sagittal plane changes its inclination in the same way as the chief ray. In Fig. 1.92, these planes are illustrated in a system composed of an AS and a convex refracting surface. The height of the object is $h = \overline{OP}$, and the height of the image is $h' = \overline{O'P'}$.

Figure 1.92 Defining the tangential (blue) and sagittal (red) planes.

In Fig. 1.92, the rays that leave P and are in the tangential plane are called tangential rays (blue); the rays leaving P and contained in the sagittal plane are called sagittal rays (red). The projection of the AS on the refracting surface is shown in the figure and four points are marked: $T_1$ and $T_2$ correspond to the intersections of the marginal tangential rays with the refracting surface, and $S_1$ and $S_2$ correspond to the intersections of the marginal sagittal rays with the refracting surface.

The meridional plane containing the marginal tangential rays passing through $T_1$ and $T_2$ is shown in Fig. 1.93. Sagittal points $S_1$ and $S_2$ project into this meridional plane at a single point. Given the effect of spherical aberration on the marginal rays, it follows that, due to the symmetry of $S_1$ and $S_2$ with respect to vertex $V'$, the marginal sagittal rays (Fig. 1.93, rays in red) intersect on the auxiliary optical axis after refracting at points $S_1$ and $S_2$. On the other hand, because $T_1$ and $T_2$ are not at the same distance from the vertex $V'$ (top left part of Fig. 1.93), the marginal tangential rays (rays in blue) do not intersect on the auxiliary optical axis after refracting at points $T_1$ and $T_2$. This asymmetry gives rise to astigmatism and coma aberrations, which are defined from the following intersections: $T'_1$, intersection of the upper marginal tangential ray with the auxiliary axis; $T'_2$, intersection of the lower marginal tangential ray with the auxiliary axis; $T'$, intersection between the upper and lower marginal tangential rays; and $S'$, intersection between the marginal sagittal rays, which is located on the auxiliary axis. The longitudinal distance between the transversal projections of $S'$ and $T'$ on the chief ray defines the longitudinal astigmatism aberration (AstL). The transversal distance between $T'$ and the chief ray defines the tangential coma aberration (ComaT). The transversal distance between $S'$ and the chief ray defines the sagittal coma aberration (ComaS).

Figure 1.93 Defining astigmatism and coma aberrations for marginal tangential rays (blue) and sagittal rays (red). The distance $\overline{S'T'}$ along the chief ray is the (longitudinal) astigmatism. The distance between $T'$ and the chief ray is the tangential (transversal) coma, and the distance between $S'$ and the chief ray is the sagittal (transversal) coma. The chief ray cuts the surface at A and the auxiliary axis at $P'$.

Astigmatism

As the height of an object decreases, the spherical aberration experienced by the marginal rays (measured from the auxiliary axis) also decreases; thus, the intersections $T'$ and $S'$ approach the Petzval surface. On the optical axis, $T'$ and $S'$ coincide (astigmatism is equal to zero) at the point $O''$, which lies at the distance $\mathrm{SphL} \approx a_s\,\overline{T_1V}^{\,2} = a_s\,\overline{T_2V}^{\,2}$ from $O'$ (Fig. 1.93). In a first approximation, for a plane object, $T'$ and $S'$ describe two paraboloids of revolution with vertex at $O''$, called astigmatic surfaces. As the AS closes, $O''$ approaches $O'$ and the astigmatic surfaces also move toward $O'$. For a fixed height of the object, the AstL does not depend on the size of the AS.

Figure 1.94 Astigmatic curves T and S and the intersection of the marginal rays in different planes in the absence of coma aberration. The projection of the marginal rays in the Gaussian image plane is an ellipse. In M, the circle of least confusion is obtained.

Figure 1.94 shows the sections of the astigmatic surfaces in the tangential plane, denoted by T and S, assuming that the diameter of the AS tends to zero (O'' coincides with O').* At T, the tangential marginal rays are in focus while the sagittal marginal rays have not yet intersected. In a transversal plane passing through T, the marginal rays describe a (horizontal) line called the tangential focal line, shown in (b). At S, the sagittal marginal rays are in focus, while the tangential marginal rays diverge from T'. In a transversal plane passing through S, the marginal rays describe a (vertical) line called the sagittal focal line, shown in (d). The lengths of these focal lines are equal and depend on the size of the AS. In the middle of the astigmatic surfaces, there is a surface M where the marginal rays describe a circle whose diameter is half the length of the focal lines, as shown in (c). In the vertical direction, we have the intersections of the marginal tangential rays T1 and T2, and in the horizontal direction, we have the intersections of the marginal sagittal rays S1 and S2. In particular, the marginal tangential rays have reversed their positions. The circle in M is called the circle of least confusion. For a plane located before T, the marginal rays describe an ellipse with the major axis defined by the sagittal rays, as shown in (a). Finally, in the Gaussian image plane, the marginal rays also describe an ellipse, but now the major axis is defined by the tangential rays. The Petzval surface, in which the image would be formed in the absence of astigmatism, is also shown in Fig. 1.94.

* This condition implies that the coma also tends to zero.

According to the above, the image of an off-axis point object will change its geometry depending on the plane that is used to observe the image. The marginal rays describe the edge of the image corresponding to that off-axis point. The other rays, which pass inside the AS, fill the shape described by the marginal rays. Reducing the size of the AS decreases the length of the focal lines; i.e., in general, the ellipse size decreases, thus improving the image of the point. On surface M, we will have the best image for the point (a symmetrical spot). However, the astigmatic curves do not change shape with variation in the size of the AS. The distances from T and S to Ptz satisfy the relationship PtzT = 3 PtzS. Like the Petzval surface, astigmatic surfaces can be characterized by the radii of curvature at the vertex O' that depend on the parameters of the optical system and the position of the AS. Therefore,

$$ T = \frac{h'^2}{2 r_T} \tag{1.80} $$

and

$$ S = \frac{h'^2}{2 r_S}. \tag{1.81} $$
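The relationship PtzT = 3 PtzS can be combined with Eqs. (1.80) and (1.81). The following is only a sketch under assumptions that are not stated in the text: the Petzval surface is taken to have the same paraboloidal form near O', with a vertex radius written here as r_P (a symbol introduced for this illustration); the three sags are measured along the axis with the same sign convention; and PtzT and PtzS are read as differences of these sags.

```latex
\begin{align*}
\mathrm{Ptz} &= \frac{h'^2}{2 r_P}
  && \text{(assumed form; } r_P \text{ is a hypothetical symbol)} \\
\mathrm{PtzT} &= T - \mathrm{Ptz} = \frac{h'^2}{2}\left(\frac{1}{r_T} - \frac{1}{r_P}\right),
  \qquad
\mathrm{PtzS} = S - \mathrm{Ptz} = \frac{h'^2}{2}\left(\frac{1}{r_S} - \frac{1}{r_P}\right) \\
\mathrm{PtzT} = 3\,\mathrm{PtzS}
  &\;\Longrightarrow\;
  \frac{1}{r_T} - \frac{1}{r_P} = 3\left(\frac{1}{r_S} - \frac{1}{r_P}\right)
  \;\Longrightarrow\;
  \frac{3}{r_S} - \frac{1}{r_T} = \frac{2}{r_P}
\end{align*}
```

Under these assumptions the three vertex radii are not independent: once the Petzval radius and one astigmatic radius are known, the other follows, and the separation between the astigmatic surfaces at height h' is T − S = 2 PtzS.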

Coma aberration

Now suppose we have coma aberration and there is no astigmatism. Thus, the points T' and S' will be in the same plane orthogonal to the optical axis, and we have the relation ComaT = 3 ComaS. The ComaS depends linearly on the height h'; thus,

$$ \mathrm{ComaS} = a_c h', \tag{1.82} $$

where the coma coefficient a_c depends on the parameters of the optical system and the position and size of the AS. The shape described by the marginal rays is a circle (comatic circle) of radius equal to ComaS with its center at the distance 2 ComaS from the chief ray. Decreasing the AS radius decreases the ComaS value, thus decreasing the radius of the comatic circle and moving it closer to the chief ray, as shown in Fig. 1.95. The resulting shape looks like a comet, hence the name of this aberration. Going from T1 to S1 in Fig. 1.92, following the projection of the AS on the refracting surface, implies going from T' to S' in the corresponding circle of the comatic shape; i.e., we would go around the corresponding comatic circle twice as we go around the edge of the AS once.
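The geometry just described can be sketched numerically. The parametrization below is the usual third-order form and is an assumption, not a formula taken from this text: the radius of the comatic circle is taken to scale with the square of the normalized aperture radius `rho`, and the azimuth `theta` on the AS is measured from the tangential direction. The function name `comatic_point` and all variable names are hypothetical.

```python
import math

def comatic_point(coma_s, rho, theta):
    """Transverse ray position (x', y') measured from the chief ray.

    coma_s : sagittal coma for the full aperture (rho = 1)
    rho    : normalized aperture radius in [0, 1] (assumed quadratic scaling)
    theta  : azimuth on the AS, measured from the tangential direction
    """
    radius = coma_s * rho ** 2      # radius of the comatic circle
    center = 2.0 * radius           # distance of its center from the chief ray
    x = radius * math.sin(2.0 * theta)           # the angle is doubled:
    y = center + radius * math.cos(2.0 * theta)  # one turn of the AS, two turns here
    return x, y

coma_s = 1.0  # arbitrary units

# Marginal tangential rays (theta = 0, pi) meet at T';
# marginal sagittal rays (theta = pi/2, 3pi/2) meet at S'.
t_upper = comatic_point(coma_s, 1.0, 0.0)
t_lower = comatic_point(coma_s, 1.0, math.pi)
s_right = comatic_point(coma_s, 1.0, math.pi / 2)
s_left = comatic_point(coma_s, 1.0, 3 * math.pi / 2)

coma_t = t_upper[1]                 # distance of T' from the chief ray
print("ComaT / ComaS =", coma_t / s_right[1])

# A smaller stop gives a smaller circle that sits closer to the chief ray.
print("half-aperture T' distance =", comatic_point(coma_s, 0.5, 0.0)[1])
```

With this parametrization, the two marginal tangential rays land on the same point at three times the sagittal coma, which is the relation ComaT = 3 ComaS quoted above.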

Figure 1.95 Coma aberration geometry. Each circle corresponds to a diameter of the AS.

Example: the geometrical point spread function

So far, we have made a qualitative description of the primary optical aberrations separately in order to understand the main characteristics of each of them. However, in practice, an image has a mixture of aberrations. In the Gaussian image plane, the image of a point outside the optical axis will be, in general, an asymmetrical spot called a geometrical point spread function (PSF). To see this, consider the optical system shown in Fig. 1.96, whose data are given in Table 1.3. In particular, we are going to consider the formation of the image of two points in the plane of the object: the point on the axis with h = 0 and the point off-axis with h = −10 mm. The focal length of this lens is f = 49.55 mm, the height of the Gaussian image is h' = 9.3844 mm, and the height of the chief ray in the Gaussian image plane is h̄' = 9.2817 mm. Thus, the distortion of the chief ray is Dist(h = −10) = −0.1027 mm.

Figure 1.96 Optical system to view the point spread function in different image planes.

Table 1.3 A biconvex lens optical system with an AS. Radii and thicknesses are in millimeters.

| Surface | Radius | Thickness | Refractive index |
|---|---|---|---|
| 0 (Obj) | ∞ | 80 | 1.0000 |
| 1 (AS) | ∞ | 20 | 1.0000 |
| 2 | 50 | 7 | 1.5168 |
| 3 | −50 | 93.7 | 1.0000 |
| 4 (Imag) | ∞ | | |

Color convention (Figs. 1.97–1.101). The PSF will be drawn as a diagram of points (small circles) corresponding to the intersections of the rays emerging from a point object with the image plane. This type of diagram is called a spot diagram. We will present the spot diagram in the image plane in two ways: first, for tangential and sagittal rays uniformly distributed at the AS (both in black, along the x' and y' directions); second, for marginal rays uniformly distributed on the perimeter of three circles concentric with the AS, with diameters D1 = D_AS (black), D2 = 2D_AS/3 (green), and D3 = D_AS/3 (magenta), where D_AS is the diameter of the AS. Points T1 and T2 of the marginal tangential rays are blue, and points S1 and S2 of the marginal sagittal rays are red.

The calculations to obtain the spot diagrams are performed with the equations for the exact ray tracing presented by Kingslake.* In all cases, the origin of the coordinates corresponds to the intersection of the chief ray with the image plane.

* In the book Lens Design Fundamentals by Rudolf Kingslake (Academic Press, Elsevier, 1978), there is a version of the equations for exact ray tracing in optical systems with symmetry of revolution. The calculations shown in this example were performed with a program that I developed.

In Fig. 1.97, spot diagrams for the on-axis point object are shown when the AS has a diameter of 5 mm. In (a), the image is observed in the Gaussian image plane. The image turns out to be a circular spot approximately 55 µm in diameter. Note that although the marginal rays are distributed in the AS in three circles whose radii vary uniformly, the same is not the case with the radii of the circles in the image. Due to the quadratic dependence of longitudinal spherical aberration on the size of the AS, the size of the circles in the image varies with the cube of the AS diameter. Thus, while for an aperture of 5 mm in diameter a spot of approximately 55 µm in diameter is obtained, for an aperture of 1.67 mm in diameter a spot of approximately 2 µm in diameter is obtained, which represents a great improvement in the image. In (b), the image size is reduced when the image plane is displaced Δs' = −0.63 mm from the Gaussian image plane. This situation corresponds to the caustic waist (Fig. 1.89), in which the circular spot of smallest diameter is obtained. In this case, the image is now about 12 µm in diameter.

[Figure 1.97 spot diagrams: axes x' (µm) and y' (µm); h = 0, AS = 5 mm; (a) image plane s' = 0 (Gaussian image plane); (b) image plane s' = −0.63 mm (circle of least confusion).]

Figure 1.97 Spherical aberration with a 5 mm diameter AS. (a) In the Gaussian image plane, the image is a circular spot of approximately 60 µm. (b) Moving the image plane a distance of −0.63 mm from the Gaussian image plane, the smallest circular spot (circle of least confusion) can be seen with a size of approximately 12 µm. See the color convention described above.

Other aberrations appear in the image when the point object is displaced from the optical axis. To see astigmatism aberration, consider the off-axis point object (h = −10 mm), with an AS of 1 mm in diameter. The corresponding spot diagram, which resembles an ellipse, is shown in Fig. 1.98(a). The asymmetry of the ellipse is due to the presence of coma. In the direction of the tangential plane, the size is approximately 70 µm. Note that for the 1 mm aperture, the diameter of the circular spot, due to spherical aberration, will be about 60/125 µm, much smaller than the size of the major axis of the astigmatic ellipse. Therefore, the increase in the size of the PSF is mainly due to astigmatism. In Fig. 1.98(b), the image plane was shifted Δs' = −2.0 mm with respect to the Gaussian image plane to find the sagittal focal line. In fact, points S1 and S2 coincide. In Fig. 1.98(c), the image plane was shifted Δs' = −4.5 mm with respect to the Gaussian image plane to find the tangential focal line. Now the points T1 and T2 coincide and the points S1 and S2 separate, exchanging their positions with respect to the positions they have in Fig. 1.98(a). This tells us that AstL ≈ 2.5 mm.†

† In the ray tracing program, for astigmatism and coma, the image plane is shifted in 0.1 mm steps. Therefore, the value of the AstL is an approximate number.
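Some of the numbers quoted in this example can be checked without an exact ray tracer. The sketch below is not the author's program: it runs a paraxial (first-order) trace through the data of Table 1.3 and applies the cube-law scaling of the spherical-aberration spot described above. It assumes that each thickness and refractive index in the table refers to the space that follows the surface; the names `paraxial_trace` and `spot_diameter` are introduced here for illustration.

```python
import math

# Table 1.3 as (radius [mm], thickness to next surface [mm], index after surface).
# Assumption: thickness and index describe the space following each surface.
OBJECT_TO_STOP = 80.0
SURFACES = [
    (math.inf, 20.0, 1.0000),   # surface 1: aperture stop (no optical power)
    (50.0, 7.0, 1.5168),        # surface 2: front face of the lens
    (-50.0, 93.7, 1.0000),      # surface 3: rear face; 93.7 mm to the image plane
]

def paraxial_trace(y, u):
    """Trace a paraxial ray (height y, slope u) from the stop to just after the last surface."""
    n = 1.0
    for i, (radius, thickness, n_next) in enumerate(SURFACES):
        power = 0.0 if math.isinf(radius) else (n_next - n) / radius
        u = (n * u - y * power) / n_next      # paraxial refraction
        n = n_next
        if i < len(SURFACES) - 1:
            y += thickness * u                # transfer to the next surface
    return y, u

# Focal length: trace a ray parallel to the axis at unit height.
_, u_out = paraxial_trace(1.0, 0.0)
focal_length = -1.0 / u_out

# Gaussian image: trace an axial ray leaving the on-axis object point.
u0 = 0.01
y_last, u_last = paraxial_trace(OBJECT_TO_STOP * u0, u0)
image_distance = -y_last / u_last             # measured from the rear lens surface
magnification = u0 / u_last                   # object and image spaces both have n = 1
gaussian_height = magnification * (-10.0)     # object height h = -10 mm

def spot_diameter(d_stop_mm, d_ref_mm=5.0, spot_ref_um=55.0):
    """Spherical-aberration spot size, scaled with the cube of the stop diameter."""
    return spot_ref_um * (d_stop_mm / d_ref_mm) ** 3

print(f"f = {focal_length:.2f} mm")
print(f"image distance = {image_distance:.2f} mm")
print(f"Gaussian image height = {gaussian_height:.4f} mm")
print(f"spot for a 5/3 mm stop = {spot_diameter(5.0 / 3.0):.1f} um")
```

The paraxial values agree with the focal length, the 93.7 mm image distance, and the Gaussian image height quoted in the example to within rounding. The chief-ray height and the spot diagrams themselves require exact ray tracing and are not reproduced by this sketch.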

[Figure 1.98 spot diagrams: axes x' (µm) and y' (µm); h = −10 mm, AS = 1 mm; (a) image plane s' = 0 (Gaussian image plane); (b) image plane s' = −2.0 mm (sagittal focus); (c) image plane s' = −4.5 mm (tangential focus).]

Figure 1.98 Astigmatism aberration. (a) In the Gaussian image plane, the spot diagram resembles an ellipse, as shown in Fig. 1.94(e). Here the ellipse is slightly asymmetrical due to the presence of coma. In (b) and (c), the sagittal and tangential focal lines are shown, respectively. The effect of coma aberration is small, and therefore the sagittal and tangential focal lines show up clearly in the figure. See the color convention described above.

By changing the size of the AS to 5 mm in diameter, the effect of coma aberration becomes noticeable. In Fig. 1.99(a), the spot diagram is more asymmetrical and now looks like a comatic shape (Fig. 1.95). The image size also increased significantly, measuring 370 µm along the tangential plane, compared with the image size affected only by spherical aberration (60 µm), as shown in Fig. 1.97(a). When moving the image plane to the positions where the focal lines shown in Fig. 1.98 were found, the spot diagram now differs greatly from a pair of lines (vertical and horizontal). This is a result of the increase in coma as the diameter of the AS increases. Note that as the aperture diameter increases, the radial heights of the marginal sagittal rays S1 and S2 and of the marginal tangential rays T1 and T2 also increase; therefore, these rays will be focused at different positions than when the AS is 1 mm in diameter. This explains why in Figs. 1.99(b) and (c) the marginal sagittal rays and the marginal tangential rays do not coincide as they do in Figs. 1.98(b) and (c).

[Figure 1.99 spot diagrams: axes x' (µm) and y' (µm); h = −10 mm, AS = 5 mm; (a) image plane s' = 0 (Gaussian image plane); (b) image plane s' = −2 mm; (c) image plane s' = −4.5 mm.]

Figure 1.99 The coma aberration becomes more relevant as the diameter of the AS increases. (a) On the Gaussian image plane, the comatic form becomes apparent. (b and c) The image plane is shifted to find the astigmatic lines, which are heavily affected by the coma. See the color convention described above.

For the marginal sagittal and marginal tangential rays to coincide, the image plane must be shifted further. In Fig. 1.100(a), the image plane has shifted Δs' = −2.7 mm, where S1 and S2 coincide, i.e., where the sagittal focal line would be. In Fig. 1.100(b), the image plane has shifted Δs' = −5.3 mm, where T1 and T2 coincide, i.e., where the tangential focal line would be. Although the spot diagrams in (a) and (b) are far from being a vertical line [in (a)] and a horizontal line [in (b)], the overall size is smaller by an approximate factor of 2. Note that the difference between the two positions of the image planes is AstL ≈ 2.6 mm. This shows that the AstL does not in fact depend on the size of the AS.

[Figure 1.100 spot diagrams: axes x' (µm) and y' (µm); h = −10 mm, AS = 5 mm; (a) image plane s' = −2.7 mm (sagittal focus); (b) image plane s' = −5.3 mm (tangential focus).]

Figure 1.100 Spot diagram in which the sagittal marginal rays (a) and the tangential marginal rays (b) coincide. This corresponds to the focal lines when the AS has a diameter of 5 mm. See the color convention described above.

Finally, consider again Fig. 1.98. Here we have three positions for the image plane. Which one best reproduces the point object? Ideally, the image of a point object is a point, so we would expect the image to resemble a circular spot. In Fig. 1.94, in the middle of the astigmatic surfaces is the circle of least confusion. This would be the best image we can have in the presence of astigmatism. For our example, with the 1 mm diameter AS, the image plane should be shifted Δs' = −3.3 mm from the Gaussian image plane, i.e., AstL/2 ≈ 1.3 mm to the left of the surface S. The corresponding spot diagram is shown in Fig. 1.101. Although the diaphragm is small, the effect of coma on the asymmetry of the PSF is noticeable. This is the best image for the point object h = −10 mm. The center point of the image is at h̄' = 9.28 mm. If the height of the point object is changed, the circle of least confusion will be in another position for the image plane. Therefore, for an extended object, the image will be well in focus in some regions and out of focus in others.

[Figure 1.101 spot diagram: axes x' (µm) and y' (µm); h = −10 mm, AS = 1 mm; image plane s' = −3.3 mm (spot of least confusion).]

Figure 1.101 Spot diagram corresponding to the circle of least confusion for astigmatic surfaces when the diameter of the AS is 1 mm. The asymmetry is due to the presence of coma aberration. See the color convention described above.

Lenses are combined in an optical system to reduce optical aberrations, so that the overall focus of the image is optimal. When the illumination of the object is polychromatic, another type of aberration occurs: axial chromatic aberration and transverse chromatic aberration. These aberrations are defined and briefly described in Appendix D.
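The image-plane shifts quoted in this example can be tied together with a few lines of arithmetic. The sketch below only restates numbers from the text; it assumes, following Fig. 1.94, that the circle of least confusion lies midway between the two focal lines. The dictionary `FOCI` and the function names are hypothetical.

```python
# Image-plane shifts (mm, relative to the Gaussian image plane) quoted in the example.
FOCI = {
    1.0: {"sagittal": -2.0, "tangential": -4.5},   # AS diameter 1 mm (Fig. 1.98)
    5.0: {"sagittal": -2.7, "tangential": -5.3},   # AS diameter 5 mm (Fig. 1.100)
}
STEP = 0.1  # the ray-tracing program shifts the image plane in 0.1 mm steps

def longitudinal_astigmatism(foci):
    """Separation between the sagittal and tangential focal lines."""
    return foci["sagittal"] - foci["tangential"]

def least_confusion_shift(foci):
    """Midpoint between the two focal lines (surface M in Fig. 1.94)."""
    return 0.5 * (foci["sagittal"] + foci["tangential"])

for diameter, foci in FOCI.items():
    print(f"AS = {diameter} mm: AstL = {longitudinal_astigmatism(foci):.1f} mm, "
          f"midpoint at {least_confusion_shift(foci):.2f} mm")
```

For the 1 mm stop, the midpoint falls within one 0.1 mm step of the −3.3 mm shift used for Fig. 1.101, and the two values of AstL differ by one step, consistent with the statement that AstL does not depend on the size of the AS.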

References

[1] S. R. Wilk, “Ibn al-Haytham: 1,000 years after the Kitāb al-Manāzir,” Optics & Photonics News, 26(10), 42–48 (2015).
[2] I. Newton, Opticks: or, a Treatise of the Reflections, Refractions, Inflections and Colours of Light, Sam. Smith and Benj. Walford, London (1704).
[3] E. Hecht, Optics, Global Edition, 5th ed., Pearson, Harlow, England (2017).
[4] W. E. Humphrey, “Method and apparatus for analysis of corneal shape,” US Patent 4420228 (1983).
[5] H. von Helmholtz, Helmholtz’s Treatise on Physiological Optics, Vol. 1, Optical Society of America, Washington, D.C. (1924).
[6] Y. LeGrand and S. G. El Hage, Physiological Optics, Vol. 13, Springer, New York (2013).
[7] H. Davson, Physiology of the Eye, Macmillan International Higher Education, New York (1990).
[8] G. Smith and D. A. Atchison, The Eye and Visual Optical Instruments, Cambridge University Press, Cambridge, England (1997).
[9] D. A. Atchison and G. Smith, Optics of the Human Eye, Vol. 35, Butterworth-Heinemann, Oxford, England (2000).
[10] V. N. Mahajan, Optical Imaging and Aberrations: Ray Geometrical Optics, SPIE Press, Bellingham, Washington (1998).
[11] S. Dupré, “Galileo, the Telescope and the Science of Optics in the Sixteenth Century,” Ph.D. thesis, Ghent University, Ghent, Belgium (2002).
[12] D. Malacara and Z. Malacara, Handbook of Optical Design, CRC Press, Boca Raton, Florida (2016).

Chapter 2

Polarization

According to classical physics, light is an electromagnetic wave and its properties are obtained from Maxwell’s equations. One of these properties is that light is a transverse wave; i.e., the electric and magnetic vectors (optical field) vibrate orthogonally to the direction of wave propagation. If we assume that the light source is composed of oscillators that emit electromagnetic energy, then in general, the directions of the electric and magnetic vectors are random. However, it is possible to maintain the vibration of the resulting electric (magnetic) vector in a fixed plane or following an elliptical or circular curve. In such a case, the wave is said to be polarized. This chapter defines polarization and shows some of its applications (Fig. 2.1).

Figure 2.1 Polarization by reflection. The light that enters through a window in a room is not polarized. When reflecting off a glass plate (smooth surface), as in (a), part of the window is visible along with the text below the glass plate. If the reflection is viewed at an angle close to Brewster’s angle, the reflected light will be linearly polarized, which is verified by placing a linear polarizer between the glass plate and the photographic camera taking the image. This eliminates reflected light, and the text below the glass plate is seen clearly, as shown in (b).

Taking into account the linearity of Maxwell’s equations, one can limit the study of polarization to plane harmonic waves. Although the emitted or reflected optical field can have any form, Fourier analysis shows that the complex form of the optical field wavefront can be expressed by the sum of harmonic plane waves. Thus, the results for plane waves can be extended to more complex forms of the optical field.

This chapter begins by developing the algebra to describe linear, elliptical, and circular polarization. Among the polarization mechanisms, dichroism, polarization by total internal and external reflection, and birefringence are discussed in detail, with the latter limited to the case in which the principal directions of the refractive indices coincide with the axes of the crystal. The refractive media considered here are dielectrics without absorption. Finally, the Jones formalism to describe polarization states and polarizing elements is presented.

2.1 Plane Waves and Polarized Light

In a vacuum, for a position vector r = (x, y, z) and time t, the optical field is described by the electric vector E and the magnetic vector H, which are related to each other according to Maxwell’s equations, given by

$$ \nabla \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \tag{2.1} $$

$$ \nabla \times \mathbf{H} = \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}, \tag{2.2} $$

$$ \nabla \cdot \mathbf{E} = 0, \tag{2.3} $$

$$ \nabla \cdot \mathbf{H} = 0. \tag{2.4} $$

From Eqs. (2.1), (2.2), and (2.3), the wave equation* for the electric field is

$$ \nabla^2 \mathbf{E} = \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2}, \tag{2.5} $$

with c² = 1/(μ0 ϵ0). For the magnetic field, an equation analogous to Eq. (2.5) is obtained.

* The wave equation is obtained using the vector identity ∇ × (∇ × E) = ∇(∇ · E) − ∇²E.

Because E(x, y, z) = {E_x(x, y, z), E_y(x, y, z), E_z(x, y, z)} for a time t, Eq. (2.5) represents a set of three equations, one for each component of the electric field E. If any of these components is represented by V = V(x, y, z), we have a scalar equation of the form

$$ \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = \frac{1}{c^2} \frac{\partial^2 V}{\partial t^2}. \tag{2.6} $$

Let ŝ = (s_x, s_y, s_z) be a unit vector in a fixed direction in space. A solution of Eq. (2.6) of the form V(r, t) = g(r · ŝ, t) represents a homogeneous plane wave propagating in the ŝ direction, since at a given time g is constant for the planes r · ŝ = a (with a constant). In particular, a harmonic plane wave can be written as

$$ V(\mathbf{r}, t) = A_0 \cos(k\,\mathbf{r} \cdot \hat{\mathbf{s}} - \omega t + \delta), \tag{2.7} $$

where A0 is the amplitude of the wave. In addition, k = 2π/λ, where λ is the wavelength in vacuum, is called the wavenumber; ω = 2πν, where ν is the temporal frequency of the wave, is called the angular frequency; and δ is the initial phase shift. Then, the phase of the wave is composed of three terms: a spatial phase given by the surface φ(x, y, z) = k r · ŝ, a temporal phase ωt, and a constant phase δ that allows the value to be adjusted at the origin (spatial and/or temporal). Surfaces of constant spatial phase are called wavefronts and correspond to the geometrical wavefronts derived from Fermat’s principle in Chapter 1.

2.1.1 Maxwell’s equations with plane waves

If in Maxwell’s equations the fields E and H describe plane waves, these differential equations simplify to algebraic equations. To see this, let us write E and H in complex form, i.e.,

$$ \mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}, \tag{2.8} $$

$$ \mathbf{H}(\mathbf{r}, t) = \mathbf{H}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}, \tag{2.9} $$

where k is the wave vector whose modulus is the angular wavenumber and whose direction is that of the unit vector ŝ; i.e., k = kŝ. Although the term of the initial phase has been omitted to simplify the treatment, it will be included again when required. Then, the result of the operators ∇ and ∂/∂t on plane waves is

$$ \nabla \times \mathbf{E} = i(\mathbf{k} \times \mathbf{E}), \tag{2.10} $$

$$ \nabla \cdot \mathbf{E} = i(\mathbf{k} \cdot \mathbf{E}), \tag{2.11} $$

$$ \frac{\partial \mathbf{E}}{\partial t} = -i\omega \mathbf{E} \tag{2.12} $$

for the field E. For the field H, similar relationships are obtained. Therefore, Maxwell’s equations can be reduced to

$$ \mathbf{k} \times \mathbf{E} = \mu_0 \omega \mathbf{H}, \tag{2.13} $$

$$ \mathbf{k} \times \mathbf{H} = -\epsilon_0 \omega \mathbf{E}, \tag{2.14} $$

$$ \mathbf{k} \cdot \mathbf{E} = 0, \tag{2.15} $$

$$ \mathbf{k} \cdot \mathbf{H} = 0. \tag{2.16} $$

Figure 2.2 Orientation of the fields E and H and the wave vector k.

From these equations,

$$ \mathbf{E} = -\frac{1}{\epsilon_0 \omega}\, \mathbf{k} \times \mathbf{H}, \tag{2.17} $$

which with Eq. (2.16) implies that E, H, and k form an orthogonal system of vectors, as illustrated by Fig. 2.2. Let H = |H| (the modulus of H) and E = |E| (the modulus of E). Given the mutual orthogonality between E, H, and k, from Eq. (2.14),

$$ H = \epsilon_0 c E, \tag{2.18} $$

where c = ω/k. Thus the fields E and H vibrate in phase [Eqs. (2.8) and (2.9)] in a plane orthogonal to k and propagate as illustrated by Fig. 2.3.

Figure 2.3 Illustration of the propagation of harmonic fields E and H.

2.1.2 Irradiance

Experimentally, in the visible range, the field E is not measured due to the lack of detectors that can respond as fast as the E vibrations. Instead of measuring the amplitude of the field, a time average of the square of the field can be measured, i.e., the average energy per unit time per unit area. The time required to do the averaging is determined by the response of the detector. Detector response times are several orders of magnitude longer than the period of the wave. Formally, the mean value of the energy is calculated from the Poynting vector

$$ \mathbf{S} = \mathbf{E} \times \mathbf{H}. \tag{2.19} $$

From the mutual orthogonality between E, H, and k, taking into account Eq. (2.13), the Poynting vector can be written as

$$ \mathbf{S} = \frac{1}{\mu_0 \omega} (\mathbf{E} \cdot \mathbf{E})\, \mathbf{k}; \tag{2.20} $$

and because ω/k = c,

$$ \mathbf{S} = \frac{1}{\mu_0 c} (\mathbf{E} \cdot \mathbf{E})\, \hat{\mathbf{s}} = \epsilon_0 c\, (\mathbf{E} \cdot \mathbf{E})\, \hat{\mathbf{s}}. \tag{2.21} $$

This formula indicates that the direction in which the energy flows is normal to the wavefront, since ŝ is the unit vector that defines the normal of the wavefront. This result is also valid in dielectric (isotropic) media. The irradiance is then defined as

$$ I = \langle S \rangle_{\check{T}}, \tag{2.22} $$

where S = |S|. Here, ⟨ ⟩ denotes the mean value of the function* and Ť denotes the integration (detection) time. By inserting Eq. (2.21) into Eq. (2.22), the irradiance is

$$ I = \frac{\epsilon_0 c}{\check{T}} \int_{-\check{T}/2}^{\check{T}/2} (\mathbf{E} \cdot \mathbf{E})\, dt. \tag{2.23} $$

In particular, for a periodic function, the mean value is taken with respect to the period of the signal. So for a harmonic plane wave, the irradiance will be

$$ I = \frac{\epsilon_0 c}{T} \int_{-T/2}^{T/2} (\mathbf{E} \cdot \mathbf{E})\, dt, \tag{2.24} $$

where T = 1/ν is the period of the wave.

* The mean value of the function f over the time interval τ is defined as ⟨f⟩ = (1/τ) ∫ f dt, with the integral taken from −τ/2 to τ/2.

If the harmonic plane wave is given by

$$ \mathbf{E} = \mathbf{E}_0 \cos(k\,\mathbf{r} \cdot \hat{\mathbf{s}} - \omega t), \tag{2.25} $$

the irradiance would be

$$ I = \epsilon_0 c\, (\mathbf{E}_0 \cdot \mathbf{E}_0)\, \frac{1}{T} \int_{-T/2}^{T/2} \cos^2(k\,\mathbf{r} \cdot \hat{\mathbf{s}} - \omega t)\, dt. \tag{2.26} $$

The mean value of the cosine-squared function in a period is equal to 1/2; therefore,

$$ I = \frac{\epsilon_0 c}{2} (\mathbf{E}_0 \cdot \mathbf{E}_0) = \frac{\epsilon_0 c}{2} E_0^2. \tag{2.27} $$

Another common way of representing the harmonic plane wave, also used in this book, is through the complex exponential function. Suppose the wave is given by

$$ \mathbf{E} = \mathbf{E}_0 e^{i(k\,\mathbf{r} \cdot \hat{\mathbf{s}} - \omega t)}, \tag{2.28} $$

where the amplitude E0 is also complex and ‖E0‖ = E0. In this case, the irradiance should be defined as

$$ I = \frac{\epsilon_0 c}{2T} \int_{-T/2}^{T/2} (\mathbf{E} \cdot \mathbf{E}^*)\, dt, \tag{2.29} $$

where E* is the conjugate of E. In this way, it is guaranteed that the irradiance value is equal to the one obtained if the wave is represented as a cosine (or sine) function. By inserting Eq. (2.28) into Eq. (2.29), the irradiance is

$$ I = \frac{\epsilon_0 c}{2} (\mathbf{E}_0 \cdot \mathbf{E}_0^*). \tag{2.30} $$
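The algebraic relations of Sections 2.1.1 and 2.1.2 can be checked numerically for one particular wave. The sketch below is illustrative only: the wavelength, the amplitude vector `E0`, and the helper names `cross`, `dot`, and `norm` are assumptions introduced here. It builds H from Eq. (2.13), checks the orthogonality of E, H, and k and the relation H = ϵ0 c E of Eq. (2.18), and compares a numerical time average of Eq. (2.24) with Eq. (2.27).

```python
import math

EPS0 = 8.8541878128e-12       # vacuum permittivity (F/m)
MU0 = 1.25663706212e-6        # vacuum permeability (H/m)
C = 1.0 / math.sqrt(MU0 * EPS0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Illustrative wave (assumed values): 500 nm, propagating along z.
wavelength = 500e-9
k_mod = 2 * math.pi / wavelength
omega = C * k_mod
k_vec = (0.0, 0.0, k_mod)
E0 = (3.0, 4.0, 0.0)          # V/m, chosen orthogonal to k as Eq. (2.15) requires

# Eq. (2.13): H = (k x E) / (mu0 * omega)
H0 = tuple(h / (MU0 * omega) for h in cross(k_vec, E0))

# Eq. (2.19): the Poynting vector points along k.
S0 = cross(E0, H0)

# Eq. (2.24) for the real wave of Eq. (2.25) at r = 0: midpoint-rule average over one period.
T = 2 * math.pi / omega
N = 1000
mean_cos2 = sum(math.cos(-omega * (m + 0.5) * T / N) ** 2 for m in range(N)) / N
irradiance_numeric = EPS0 * C * dot(E0, E0) * mean_cos2
irradiance_formula = 0.5 * EPS0 * C * dot(E0, E0)     # Eq. (2.27)

print("E.H =", dot(E0, H0), " k.H =", dot(k_vec, H0))
print("H / (eps0 c E) =", norm(H0) / (EPS0 * C * norm(E0)))
print("I numeric / I formula =", irradiance_numeric / irradiance_formula)
```

The value of c is computed from μ0 and ϵ0 so that the identity 1/(μ0 c) = ϵ0 c used in Eq. (2.21) holds to rounding error.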

2.1.3 Natural light and polarized light

Because E and H are in phase and related according to Eq. (2.17), we will only consider the vector E to refer to the propagation of the optical field. Let us assume that the field E is the result of monochromatic plane waves emitted by the oscillators that make up a light source. Simplifying the model, let us also assume that the waves propagate in the z direction, so that at a given instant of time in an (x, y) plane at a distance z, the field E can be written as

$$ \mathbf{E}(x, y, z; t) = \left\{ |E_{ox}| e^{i\delta_x} e^{i(kz - \omega t)},\; |E_{oy}| e^{i\delta_y} e^{i(kz - \omega t)} \right\}. \tag{2.31} $$

The amplitude of each field component has a phase term (δx and δy), so their difference allows us to measure the delay of one component with respect to the other. Choosing the plane z = z0, from the phase difference Δδ = δy − δx, we can observe how the vector E evolves in time. This depends on the nature of the source. In general, the oscillations in the sources are such that Δδ is a random variable. Consequently, we cannot predict how the vector E evolves in time (what amplitude and direction it has at a given moment). It is said that these types of sources, very common in nature, emit natural or nonpolarized light. But if Δδ remains stable in time, i.e., Δδ = constant, then it is possible to determine how the vector E evolves. In this case, it is said that the light is polarized.

2.1.4 Elliptical, circular, and linear polarization

Polarized light implies that, given two components for the field E, the phase difference of the amplitude components is a constant. Depending on the value of this difference, the electric vector evolves confined to a plane or following an ellipse (circle). In the first case, there is linear polarization; in the second, there is elliptical (circular) polarization to the left or to the right. To see this, let us consider an example. Suppose

$$ \mathbf{E}(x, y, z; t) = \left\{ E_{ox}, E_{oy} \right\} e^{i(kz - \omega t)} \tag{2.32} $$

with

$$ E_{ox} = |E_{ox}| e^{i0} \quad \text{and} \quad E_{oy} = |E_{oy}| e^{i\pi/2}; \tag{2.33} $$

i.e., δx = 0 and δy = π/2. Therefore, Δδ = π/2. First, let us see what the endpoint of the electric vector projected onto a plane, say z = 0, looks like. The components of E take the form

$$ E_x = |E_{ox}| e^{-i 2\pi t/T}, \tag{2.34} $$

$$ E_y = |E_{oy}| e^{-i(2\pi t/T - \pi/2)}. \tag{2.35} $$

Considering each component as a phasor, i.e., a rotating vector of radius |Eox| (|Eoy|) and a phase dx (dy), graphically each component will look as illustrated in Fig. 2.4(a). The y component (top left), at t ¼ 0, starts with a lag of p/2, i.e., at point 1. On the other hand, the x component (bottom), at t ¼ 0, starts with a phase shift of 0, which is also indicated by the dot 1. As time increases, particularly for t ¼ T/4, T/2, 3T/4, and 2T, the positions for the phasors for x and y will be indicated by points 2, 3, 4, and 5, respectively. The

110

Chapter 2

(a)

(b)

Figure 2.4 Left elliptical polarization. (a) Endpoint of the electric vector projected onto a fixed plane (z ¼ 0). The end of the vector rotates counterclockwise. (b) Spatial evolution (t ¼ 0) of the electric vector. The end of the vector describes a helix that advances in the z direction in a clockwise rotation.

projections of points 1, 2, 3, 4, and 5 in the vertical direction for the x phasor and in the horizontal direction for the y phasor intersect in the shaded rectangular region with sides $2|E_{ox}|$ and $2|E_{oy}|$. The locus of the intersections describes the evolution of the electric vector in time, as shown at the top right of Fig. 2.4(a). In this example, the path is an ellipse, since $|E_{ox}| > |E_{oy}|$, and it is traced counterclockwise. In this case, the electric vector is said to have left elliptical polarization. The opposite direction of the trajectory (clockwise) is obtained if $\Delta\delta = -\pi/2$; then the electric vector is said to have right elliptical polarization. In particular, if $|E_{ox}| = |E_{oy}|$, the trajectory is a circle, corresponding to left or right circular polarization.

Now if we look at how the electric vector evolves in space, we see something different. Fixing the time at $t = 0$, the $\mathbf{E}$ components take the form

$$E_x = |E_{ox}|\, e^{i2\pi z/\lambda}, \tag{2.36}$$

$$E_y = |E_{oy}|\, e^{i(2\pi z/\lambda + \pi/2)}. \tag{2.37}$$

For the real part, at $z = 0$, $\mathbf{E} = \{|E_{ox}|, 0\}$; at $z = \lambda/4$, $\mathbf{E} = \{0, -|E_{oy}|\}$; at $z = \lambda/2$, $\mathbf{E} = \{-|E_{ox}|, 0\}$; at $z = 3\lambda/4$, $\mathbf{E} = \{0, |E_{oy}|\}$; and finally, at $z = \lambda$, $\mathbf{E} = \{|E_{ox}|, 0\}$. This is illustrated by arrows in Fig. 2.4(b). In this case, the path followed by the end of the electric vector is a helix moving clockwise. Of course, the two images are equivalent. Indeed, if in (b) we observe how the ends of the electric vector reach the plane $z = 0$, we will have the trajectory given in (a).
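The two views of Fig. 2.4 can be checked numerically. The following is a minimal sketch, not the author's procedure: it evaluates the real parts of Eqs. (2.34)–(2.35) at $z = 0$ and of Eqs. (2.36)–(2.37) at $t = 0$, and reads the sense of rotation from the sign of a two-dimensional cross product. The amplitudes 2.0 and 1.0 and the function names are assumptions chosen only so that $|E_{ox}| > |E_{oy}|$.

```python
import math

def field_at_plane(t_over_T, E_ox=2.0, E_oy=1.0):
    """Real part of Eqs. (2.34)-(2.35) at z = 0 for a given t/T."""
    phase = 2 * math.pi * t_over_T
    # Re[exp(-i*phase)] = cos(phase); Re[exp(-i*(phase - pi/2))] = cos(phase - pi/2)
    return (E_ox * math.cos(phase), E_oy * math.cos(phase - math.pi / 2))

def field_along_z(z_over_lambda, E_ox=2.0, E_oy=1.0):
    """Real part of Eqs. (2.36)-(2.37) at t = 0 for a given z/lambda."""
    phase = 2 * math.pi * z_over_lambda
    return (E_ox * math.cos(phase), E_oy * math.cos(phase + math.pi / 2))

def rotation_sense(p, q):
    """+1 if the turn from p to q is counterclockwise (x to the right, y up), else -1."""
    cross = p[0] * q[1] - p[1] * q[0]
    return 1 if cross > 0 else -1

steps = [0.0, 0.25, 0.5, 0.75, 1.0]          # points 1 to 5
in_time = [field_at_plane(s) for s in steps]
in_space = [field_along_z(s) for s in steps]

time_sense = rotation_sense(in_time[0], in_time[1])     # rotation seen in the plane z = 0
space_sense = rotation_sense(in_space[0], in_space[1])  # rotation of the helix along z
print(time_sense, space_sense)
```

With these assumed amplitudes, the sign is positive for the time evolution and negative for the spatial evolution, which is the counterclockwise ellipse and the clockwise helix described in the text.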


Figure 2.5 Linear polarization when the two components are in phase. (a) The trajectory of the end of the electric vector is a straight line bounded by the rectangular region of sides $2|E_{ox}|$ and $2|E_{oy}|$, tilted by an angle $\tan^{-1}(|E_{oy}|/|E_{ox}|)$. (b) The field $\mathbf{E}$ is confined to a plane inclined by an angle $\tan^{-1}(|E_{oy}|/|E_{ox}|)$ and moves harmonically.

When $\Delta\delta = 0$, the observed path of the endpoint of the electric vector in a fixed plane ($z = 0$) is a straight line inclined at an angle $\arctan(|E_{oy}|/|E_{ox}|)$, as illustrated in Fig. 2.5(a). The extension of the line is limited by the rectangle with sides $2|E_{ox}|$ and $2|E_{oy}|$. This is the case of linear polarization. If the spatial evolution of the electric vector is observed, the field $\mathbf{E}$ will vibrate harmonically in a plane inclined at the angle $\arctan(|E_{oy}|/|E_{ox}|)$, as illustrated in Fig. 2.5(b).

2.1.5 Polarization: general case

In the previous section, left elliptical and linear polarization are shown, which result from the phase differences $\Delta\delta = \pi/2$ and $\Delta\delta = 0$. However, the value of $\Delta\delta$ can be any constant. In general, the polarization state of an electromagnetic wave is given by a rotated ellipse,

$$\frac{E_x^2}{|E_{ox}|^2} + \frac{E_y^2}{|E_{oy}|^2} - 2\,\frac{E_x}{|E_{ox}|}\,\frac{E_y}{|E_{oy}|}\cos(\Delta\delta) = \sin^2(\Delta\delta), \tag{2.38}$$

as shown in Fig. 2.6. The ellipse remains confined to the rectangle of sides $2|E_{ox}|$ and $2|E_{oy}|$, and the angle of rotation $\psi$ of the ellipse is determined from

$$\tan(2\psi) = \tan(2\alpha)\cos(\Delta\delta), \tag{2.39}$$

where

$$\tan\alpha = \frac{|E_{oy}|}{|E_{ox}|}. \tag{2.40}$$

Figure 2.6 Polarization ellipse in the general case.

The sign of $\sin(\Delta\delta)$ indicates the direction of the polarization. If $\sin(\Delta\delta) > 0$, the polarization is to the left; if $\sin(\Delta\delta) < 0$, the polarization is to the right. Appendix F shows the complete derivation of Eqs. (2.38), (2.39), and (2.40).

Case 1. $\Delta\delta = m\pi$; $m = 0, \pm 1, \pm 2, \ldots$ Equation (2.38) becomes

$$\frac{E_x^2}{|E_{ox}|^2} + \frac{E_y^2}{|E_{oy}|^2} - 2(-1)^m\,\frac{E_x}{|E_{ox}|}\,\frac{E_y}{|E_{oy}|} = 0, \tag{2.41}$$

which is equal to

$$\left(\frac{E_x}{|E_{ox}|} - (-1)^m\,\frac{E_y}{|E_{oy}|}\right)^2 = 0, \tag{2.42}$$

which represents the straight line

$$E_y = (-1)^m (\tan\alpha)\, E_x, \tag{2.43}$$

i.e., linear polarization with the plane of vibration inclined by the angle $\pm\alpha$.


Case 2. $\Delta\delta = (2m - 1)\pi/2$; $m = 0, \pm 1, \pm 2, \ldots$ Equation (2.38) becomes

$$\frac{E_x^2}{|E_{ox}|^2} + \frac{E_y^2}{|E_{oy}|^2} = 1, \tag{2.44}$$

which represents an ellipse with its major and minor axes oriented with the axes x ($E_x$) and y ($E_y$), respectively. Thus, there is elliptical polarization to the left or to the right. In particular, if $|E_{ox}| = |E_{oy}|$, there is left or right circular polarization.

Case 3. $\Delta\delta \neq m\pi, (2m - 1)\pi/2$. For phase shifts different from those treated in cases 1 and 2, there is elliptical polarization, in which the ellipse is rotated according to Eqs. (2.39) and (2.40).

Example: ellipse from line to circle and vice versa

To illustrate these three cases, suppose we have an optical field with $|E_{ox}| = |E_{oy}|$ and $\Delta\delta = 0$, $\pi/4$, $\pi/2$, $3\pi/4$, $\pi$, $5\pi/4$, $3\pi/2$, $7\pi/4$, and $2\pi$. In this example, because $\tan 2\alpha = \infty$, the angle of rotation of the general ellipse will be given by $\tan 2\psi = \pm\infty$, where the sign is given by the sign of $\cos(\Delta\delta)$. In other words, the angle of rotation of the ellipse would be $\psi = 45^\circ$ if $(0 \le \Delta\delta < \pi/2) \cup (3\pi/2 < \Delta\delta \le 2\pi)$, and $\psi = -45^\circ$ if $\pi/2 < \Delta\delta < 3\pi/2$. In Fig. 2.7, the polarization states for the phase changes mentioned above are shown: $E_{+45}$ denotes linear polarization with the plane of vibration at 45°, $E_{-45}$ denotes linear polarization with the plane of vibration at −45°, $E_L$ indicates left circular or elliptical polarization, and $E_R$ indicates right circular or elliptical polarization.

Figure 2.7 Polarization states for various phase shifts with $|E_{ox}| = |E_{oy}|$.


On the other hand, $\alpha = 0$ defines a horizontal linear polarization state ($E_H$) and $\alpha = 90^\circ$ defines a vertical linear polarization state ($E_V$).
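The classification by phase difference can be tabulated with a short script. This is a sketch under stated assumptions: the function name `ellipse_state` and the numerical tolerance are hypothetical, and Eq. (2.39) is rewritten with $\tan 2\alpha = 2|E_{ox}||E_{oy}|/(|E_{ox}|^2 - |E_{oy}|^2)$ so that `atan2` can handle the case $|E_{ox}| = |E_{oy}|$, where $\tan 2\alpha$ is infinite. The branch of $\psi$ selected by `atan2` is also an assumption.

```python
import math

def ellipse_state(E_ox, E_oy, delta):
    """Return (psi in degrees or None, handedness) for a phase difference delta."""
    # Eq. (2.39) with tan(2*alpha) = 2*E_ox*E_oy / (E_ox**2 - E_oy**2):
    # tan(2*psi) = 2*E_ox*E_oy*cos(delta) / (E_ox**2 - E_oy**2)
    num = 2 * E_ox * E_oy * math.cos(delta)
    den = E_ox**2 - E_oy**2
    if abs(num) < 1e-12 and abs(den) < 1e-12:
        psi = None  # circle: the orientation is undefined
    else:
        psi = math.degrees(0.5 * math.atan2(num, den))
    s = math.sin(delta)
    if abs(s) < 1e-12:
        hand = "linear"          # Case 1, delta = m*pi
    else:
        hand = "left" if s > 0 else "right"
    return psi, hand

# Phase differences of the example, in steps of pi/4, with equal amplitudes
for k in range(9):
    print(f"{k}*pi/4", ellipse_state(1.0, 1.0, k * math.pi / 4))
```

For equal amplitudes, the sketch returns an orientation of +45° or −45° according to the sign of $\cos(\Delta\delta)$, and no orientation for the two circular states.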

2.2 Dichroism Polarization

One way to remove one of the E-field components is by absorbing that component. This can be achieved by designing a device that performs this task or by using a natural material that has this property [1]. In both cases, the selective absorption of one of the E-field components is called dichroism. The final effect on the field will be linearly polarized light.

2.2.1 Linear polarizer

To see how a dichroism-based linear polarizer works, suppose that a grid of parallel conducting wires is constructed, as shown in Fig. 2.8, and an unpolarized field $\mathbf{E}$ (natural light) is incident in a direction orthogonal to the grid plane. Because the electric charges have the possibility of greater displacement in the horizontal direction (along the wires) compared with the vertical direction (cross section of the wires), there will be a greater absorption of electric energy in the direction of the wires; thus, the net component $E_x$ experiences a greater attenuation than the net component $E_y$. If, ideally, the component $E_x$ is completely attenuated, we will have a linear polarizer and the field will have a vertical linear polarization state, $E_V$. The direction in which the field is not attenuated is called the transmission axis of the linear polarizer.

Using lithographic methods, polarizers based on a grid of conductive wires for the visible spectrum are manufactured, achieving arrangements with a separation of 100 nm between wires. Aluminum microwires are deposited on glass substrates. The most common dichroic linear polarizers are made of sheets of a special transparent plastic (polyvinyl alcohol). These sheets have been stretched in one direction to align their long molecules, which are then

Figure 2.8 Linear polarizer made with a grid of conducting wires.


coated with iodine. In this way, something similar to the arrangement of the wires shown in Fig. 2.8 is obtained, but at a microscopic level (type H polarizers).

Extinction coefficient and degree of polarization

In practice, it is not possible to completely attenuate the component orthogonal to the transmission axis of the polarizer, and the component parallel to the transmission axis of the polarizer is not completely transmitted. If we represent the linear polarizer as shown in Fig. 2.9, and we decompose the resulting incident field on the polarizer into a component parallel to the transmission axis, $E_\parallel$, and a component orthogonal to the transmission axis, $E_\perp$, then the incident field would be

$$\mathbf{E} = \{E_\parallel, E_\perp\}\, e^{i(kz - \omega t)}. \tag{2.45}$$

To characterize the linear polarizer taking into account the absorption of the components $E_\parallel$ and $E_\perp$, two quantities are defined: the extinction coefficient,

$$r_P = \frac{t_\parallel}{t_\perp}, \tag{2.46}$$

and the degree of polarization,

$$P_P = \frac{t_\parallel - t_\perp}{t_\parallel + t_\perp}, \tag{2.47}$$

where $t_\parallel = |E'_\parallel|/|E_\parallel|$ is the transmission fraction of the component parallel to the transmission axis and $t_\perp = |E'_\perp|/|E_\perp|$ is the transmission fraction of the component orthogonal to the transmission axis.
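As a quick numerical illustration of Eqs. (2.46) and (2.47), the sketch below evaluates both quantities for a pair of transmission fractions. The values 0.8 and 0.0008 and the function name are hypothetical, chosen only to show orders of magnitude; they are not taken from any data sheet.

```python
def polarizer_figures(t_par, t_perp):
    """Extinction coefficient, Eq. (2.46), and degree of polarization, Eq. (2.47)."""
    r_p = t_par / t_perp
    p_p = (t_par - t_perp) / (t_par + t_perp)
    return r_p, p_p

# Hypothetical transmission fractions for the parallel and orthogonal components
r_p, p_p = polarizer_figures(t_par=0.8, t_perp=0.0008)
print(r_p, p_p)
```

As $t_\perp$ tends to 0, the extinction coefficient grows without bound while the degree of polarization tends to 1.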

Figure 2.9 In a real polarizer, 100% of the component parallel to the transmission axis is not transmitted, and the component orthogonal to the transmission axis is not completely canceled.

Table 2.1 Technical specifications for a linear dichroic polarizer.

Bandwidth (nm)      100–1000
Extinction          10^2–10^6
Transmission (%)    >50

… $n_t$. From this angle the phenomenon of total internal reflection occurs. The Fresnel equations for $n_i > n_t$ and $\theta_i < \theta_c$ apply in the same way as in the external reflection case ($n_i < n_t$), and the only phase changes of the reflected components with respect to the incident components are 0 or $\pm\pi$, as shown in Fig. 2.17 for $n_i = 1.5$ and $n_t = 1.0$. Unlike external reflection (Fig. 2.13), the orthogonal component does not undergo a phase change. In contrast, the parallel component has a phase shift of $\pm\pi$ for $0 < \theta_i < \theta'_p$ and is in phase for $\theta'_p < \theta_i < \theta_c$. The angle $\theta'_p$ is the angle at which the polarization of the reflection occurs and is given by $\tan\theta'_p = n_t/n_i$. This angle together with the external polarization angle satisfies the relation

$$\theta_p + \theta'_p = \pi/2. \tag{2.115}$$
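The angles involved can be evaluated directly for $n_i = 1.5$ and $n_t = 1.0$. This is a minimal sketch with two assumptions: the critical angle is taken from $\sin\theta_c = n_t/n_i$, and the external polarization angle is taken for the reverse passage (from the medium of index 1.0 into the medium of index 1.5), so that $\tan\theta_p = n_i/n_t$ in the present notation.

```python
import math

n_i, n_t = 1.5, 1.0

theta_c = math.asin(n_t / n_i)        # critical angle, sin(theta_c) = n_t / n_i
theta_p_int = math.atan(n_t / n_i)    # internal polarization angle, tan = n_t / n_i
theta_p_ext = math.atan(n_i / n_t)    # external polarization angle (assumed reverse passage)

print(math.degrees(theta_c), math.degrees(theta_p_int), math.degrees(theta_p_ext))
print(math.degrees(theta_p_int + theta_p_ext))   # relation of Eq. (2.115)
```

The internal polarization angle falls below the critical angle, consistent with the two intervals $0 < \theta_i < \theta'_p$ and $\theta'_p < \theta_i < \theta_c$ described above.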


Figure 2.17 Reflection coefficients for the parallel and orthogonal components in internal reflection ($0 < \theta_i < \theta_c$) and total internal reflection ($\theta_c < \theta_i < \pi/2$).

When $n_i > n_t$ and $\theta_i > \theta_c$, the reflection coefficients are complex quantities. To see this, note that from Snell's law

$$\cos\theta_t = \sqrt{1 - (n_i/n_t)^2 \sin^2\theta_i} = \sqrt{1 - (\sin\theta_i/\sin\theta_c)^2},$$

and for $\theta_i > \theta_c$, the term inside the square root is negative. Then it is convenient to write

$$\cos\theta_t = \frac{i}{n}\sqrt{\sin^2\theta_i - n^2}, \tag{2.116}$$

with $i = \sqrt{-1}$ and $n = n_t/n_i < 1$. Thus, the reflection coefficients can be rewritten as

$$r_\parallel = \frac{n^2\cos\theta_i - i\sqrt{\sin^2\theta_i - n^2}}{n^2\cos\theta_i + i\sqrt{\sin^2\theta_i - n^2}} \tag{2.117}$$

and

$$r_\perp = \frac{\cos\theta_i - i\sqrt{\sin^2\theta_i - n^2}}{\cos\theta_i + i\sqrt{\sin^2\theta_i - n^2}}. \tag{2.118}$$

In the two coefficients, the numerator is the conjugate of the denominator; therefore,

$$|r_\parallel| = |r_\perp| = 1. \tag{2.119}$$
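Equations (2.117)–(2.119) are easy to verify with complex arithmetic. The sketch below is only an illustration; the function name `tir_coefficients` and the angle of incidence of 60° are assumptions, with $n = 1.0/1.5$ as in Fig. 2.17.

```python
import cmath
import math

def tir_coefficients(theta_i, n):
    """r_par and r_perp of Eqs. (2.117)-(2.118); valid when sin(theta_i) > n = n_t/n_i."""
    root = math.sqrt(math.sin(theta_i) ** 2 - n ** 2)
    c = math.cos(theta_i)
    r_par = (n ** 2 * c - 1j * root) / (n ** 2 * c + 1j * root)
    r_perp = (c - 1j * root) / (c + 1j * root)
    return r_par, r_perp

n = 1.0 / 1.5
r_par, r_perp = tir_coefficients(math.radians(60.0), n)  # 60 degrees exceeds the critical angle

print(abs(r_par), abs(r_perp))                    # both moduli equal 1 up to rounding
print(cmath.phase(r_par), cmath.phase(r_perp))    # phases of the two coefficients, in radians
```

Both phases come out negative at this angle, with the parallel component shifted more than the orthogonal one.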


This is shown in Fig. 2.17 for $\theta_c < \theta_i < \pi/2$. Then the reflection coefficients can be written as

$$r_\parallel = |r_\parallel|\, e^{i\delta_\parallel} = e^{i\delta_\parallel} = e^{-i2\phi_\parallel}, \tag{2.120}$$

$$r_\perp = |r_\perp|\, e^{i\delta_\perp} = e^{i\delta_\perp} = e^{-i2\phi_\perp}, \tag{2.121}$$

where

$$\tan\phi_\parallel = \frac{\sqrt{\sin^2\theta_i - n^2}}{n^2\cos\theta_i}, \tag{2.122}$$

$$\tan\phi_\perp = \frac{\sqrt{\sin^2\theta_i - n^2}}{\cos\theta_i}. \tag{2.123}$$

The behavior of the phase shifts ($\delta_\parallel = -2\phi_\parallel$ and $\delta_\perp = -2\phi_\perp$) of the parallel and orthogonal reflection coefficients in the internal reflection $0 < \theta_i < \theta_c$ and the total internal reflection $\theta_c < \theta_i < \pi/2$, when $n_i = 1.5$ and $n_t = 1.0$, is shown in Fig. 2.18. The polarization state of the reflected wave will be determined by $\Delta\delta = \delta_\parallel - \delta_\perp$. For the range $0 < \theta_i < \theta_c$, a linearly polarized incident wave is reflected linearly polarized (except for a change in the orientation of the plane of vibration). For the range $\theta_c < \theta_i < \pi/2$, the phase difference can be determined from

$$\tan\left(\frac{\delta_\parallel}{2} - \frac{\delta_\perp}{2}\right) = \tan(\phi_\perp - \phi_\parallel), \tag{2.124}$$

i.e.,

$$\tan\left(\frac{\delta_\parallel}{2} - \frac{\delta_\perp}{2}\right) = \frac{\sqrt{\sin^2\theta_i - n^2}/\cos\theta_i - \sqrt{\sin^2\theta_i - n^2}/(n^2\cos\theta_i)}{1 + (\sin^2\theta_i - n^2)/(n^2\cos^2\theta_i)}. \tag{2.125}$$

Simplifying, this can be rewritten as

$$\tan\left(\frac{\delta_\parallel}{2} - \frac{\delta_\perp}{2}\right) = -\frac{\cos\theta_i\sqrt{\sin^2\theta_i - n^2}}{\sin^2\theta_i}. \tag{2.126}$$

Therefore, the phase difference will be

$$\Delta\delta = \delta_\parallel - \delta_\perp = -2\arctan\left(\frac{\cos\theta_i\sqrt{\sin^2\theta_i - n^2}}{\sin^2\theta_i}\right). \tag{2.127}$$

Figure 2.18 Phase shifts of the parallel and orthogonal components in internal reflection ($0 < \theta_i < \theta_c$) and total internal reflection ($\theta_c < \theta_i < \pi/2$).

The difference of the phase shifts in internal reflection ($0 < \theta_i < \theta_c$) and total internal reflection ($\theta_c < \theta_i < \pi/2$), when $n_i = 1.5$ and $n_t = 1.0$, is shown in Fig. 2.19. In this figure it can be seen that for total internal reflection the phase difference varies between 0 and a value close to $-\pi/4$. Consequently, a wave reflected in the domain of total internal reflection will have an elliptical polarization state. To determine the minimum value of the phase difference in total internal reflection, $d(\Delta\delta)/d\theta_i = 0$ can be solved for $\theta_i$. The result obtained for the angle of incidence is

$$\sin^2\theta_i = \frac{2n^2}{1 + n^2}, \tag{2.128}$$

and the minimum value of the phase difference turns out to be

$$\Delta\delta_{\min} = 2\arctan\left(\frac{n^2 - 1}{2n}\right). \tag{2.129}$$

Figure 2.19 Difference of the phase shifts of the parallel and orthogonal reflected components, when $n_i = 1.5$ and $n_t = 1.0$.
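Equations (2.127)–(2.129) can be cross-checked numerically. The following sketch, with hypothetical function and variable names, evaluates the phase difference for $n = 1.0/1.5$ and confirms that the angle of Eq. (2.128) gives the extreme value of Eq. (2.129).

```python
import math

def phase_difference(theta_i, n):
    """Phase difference of Eq. (2.127), in radians, for sin(theta_i) > n."""
    s2 = math.sin(theta_i) ** 2
    return -2 * math.atan(math.cos(theta_i) * math.sqrt(s2 - n ** 2) / s2)

n = 1.0 / 1.5
theta_min = math.asin(math.sqrt(2 * n ** 2 / (1 + n ** 2)))   # Eq. (2.128)
delta_min = 2 * math.atan((n ** 2 - 1) / (2 * n))             # Eq. (2.129)

print(math.degrees(theta_min), math.degrees(delta_min))
# The closed form agrees with the direct evaluation of Eq. (2.127)
print(phase_difference(theta_min, n) - delta_min)
# Neighboring angles give a less negative phase difference
print(phase_difference(theta_min - 0.01, n), phase_difference(theta_min + 0.01, n))
```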

As shown in Fig. 2.19, the minimum of the phase difference is $\Delta\delta_{\min} = -45.24^\circ$, corresponding to the angle of incidence $\theta_i = 51.67^\circ$. This tells us that with a single reflection, in the domain of total internal reflection, when $n_i = 1.5$ and $n_t = 1.0$, it is not possible to obtain a wave with a state of circular polarization. One option to achieve $\Delta\delta_{\min} = -\pi/2$ is to have a material whose refractive index $n_i$ is such that $n^2 - 1 = -2n$. The positive solution of this equation is $n = -1 + \sqrt{2} = 0.4142$. Assuming that $n_t = 1$, then $n_i = 2.4142$. This is a very high index for an optical glass. Another option is to keep a common optical glass and achieve two successive reflections, as long as the sum of the two phase differences is $-\pi/2$. The most common option is the one in which each reflection has a phase difference of $-\pi/4$. The values of the angles of incidence for which phase differences of $-\pi/4$ are obtained are illustrated in Fig. 2.19 with segmented lines. These are $\theta_{i1} = 50.23^\circ$ and $\theta_{i2} = 53.26^\circ$.

An optical element that changes from linear to circular polarization, based on these ideas, is the Fresnel rhomb, such as the one shown in Fig. 2.20, with glass of refractive index 1.5 for the angle $\theta_{i2} = 53.26^\circ$. Suppose that an incident linearly polarized light beam (traveling from left to right) with the plane of vibration at 45° to the plane of incidence (45° to the positive direction of the orthogonal component) is perpendicular to the first side of the rhomb. Omitting the spatial and temporal phase terms, the electric field shown in Fig. 2.20 at position (1) can be written as

$$\mathbf{E}^{(1)} = \{e^{i0}, e^{i0}\}\, E_0.$$

Position (2) would be

$$\mathbf{E}^{(2)} = \{t_\perp^{(2)} e^{i0},\ t_\parallel^{(2)} e^{i0}\}\, E_0 = \{e^{i0}, e^{i0}\}\, 0.8E_0;$$

i.e., it is still linearly polarized, but the amplitude of the components is 0.8 times the initial one. Position (3) would be

$$\mathbf{E}^{(3)} = \{r_\perp^{(3)} t_\perp^{(2)} e^{i0},\ r_\parallel^{(3)} t_\parallel^{(2)} e^{i0}\}\, E_0 = \{e^{-i(1.2785)}, e^{-i(2.0639)}\}\, 0.8E_0;$$

i.e., $\{e^{i0}, e^{-i\pi/4}\}\, 0.8\, e^{-i(1.2785)} E_0$, which represents a state of right elliptical polarization.


Figure 2.20 Fresnel rhomb in air ($n_t = 1.0$) for glass of refractive index $n_i = 1.5$. The angle of incidence on the diagonal faces is $\theta_i = 53.26^\circ$. Incident light linearly polarized at 45° (with respect to the plane of incidence) emerges as right circularly polarized light.
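The passage of the field through the rhomb of Fig. 2.20, from position (1) to position (5), can be sketched numerically. This is an illustration under stated assumptions, not the author's calculation: the entrance and exit faces are treated with the standard normal-incidence amplitude transmission coefficient $2n_{\text{in}}/(n_{\text{in}} + n_{\text{out}})$, the two internal reflections use Eqs. (2.117) and (2.118), and all variable names are hypothetical. Fields are written as {orthogonal, parallel} in units of $E_0$.

```python
import cmath
import math

def tir_coefficients(theta_i, n):
    """r_par and r_perp of Eqs. (2.117)-(2.118); valid when sin(theta_i) > n = n_t/n_i."""
    root = math.sqrt(math.sin(theta_i) ** 2 - n ** 2)
    c = math.cos(theta_i)
    r_par = (n ** 2 * c - 1j * root) / (n ** 2 * c + 1j * root)
    r_perp = (c - 1j * root) / (c + 1j * root)
    return r_par, r_perp

n_glass, n_air = 1.5, 1.0
theta = math.radians(53.26)

t_in = 2 * n_air / (n_air + n_glass)      # air to glass at normal incidence
t_out = 2 * n_glass / (n_glass + n_air)   # glass to air at normal incidence
r_par, r_perp = tir_coefficients(theta, n_air / n_glass)

E1 = (1 + 0j, 1 + 0j)                      # position (1): linear at 45 degrees
E2 = (t_in * E1[0], t_in * E1[1])          # position (2): inside the glass
E3 = (r_perp * E2[0], r_par * E2[1])       # position (3): after the first reflection
E4 = (r_perp * E3[0], r_par * E3[1])       # position (4): after the second reflection
E5 = (t_out * E4[0], t_out * E4[1])        # position (5): back in air

phase_3 = cmath.phase(E3[1] / E3[0])       # parallel relative to orthogonal, first reflection
phase_5 = cmath.phase(E5[1] / E5[0])       # parallel relative to orthogonal, at the exit
amplitude_5 = abs(E5[0])

print(phase_3, phase_5, amplitude_5)
```

The relative phase accumulates in two equal steps, and the exit amplitude is the product of the two face transmissions, since each total internal reflection has unit modulus.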

Position (4) would be

$$\mathbf{E}^{(4)} = \{r_\perp^{(4)} r_\perp^{(3)} t_\perp^{(2)} e^{i0},\ r_\parallel^{(4)} r_\parallel^{(3)} t_\parallel^{(2)} e^{i0}\}\, E_0 = \{e^{-i(2.5570)}, e^{-i(4.1277)}\}\, 0.8E_0;$$

i.e., $\{e^{i0}, e^{-i\pi/2}\}\, 0.8\, e^{-i(2.5570)} E_0$, which represents a state of right circular polarization. The radius of the circle is $0.8E_0$. Position (5) will be

$$\mathbf{E}^{(5)} = \{t_\perp^{(5)} r_\perp^{(4)} r_\perp^{(3)} t_\perp^{(2)} e^{i0},\ t_\parallel^{(5)} r_\parallel^{(4)} r_\parallel^{(3)} t_\parallel^{(2)} e^{i0}\}\, E_0 = \{e^{-i(2.5570)}, e^{-i(4.1277)}\}\, 0.96E_0;$$

i.e., $\{e^{i0}, e^{-i\pi/2}\}\, 0.96\, e^{-i(2.5570)} E_0$, which represents a state of right circular polarization. The radius of the circle is $0.96E_0$.

If the incident beam is linearly polarized but with a plane of vibration other than +45°, the result will be a polarization state other than circular. Specifically, if the angle of the vibration plane is 0 or 90°, the state of linear polarization does not change; it remains in a plane perpendicular or parallel to the plane of incidence. Note that the plane of incidence is defined with the normal of the second surface, since with the first surface the plane of incidence is not defined (the incident wave vector is collinear with the normal of the interface).

2.4.2 Reflectance and transmittance

Reflectance and transmittance curves are shown in Fig. 2.21. The reflectance is the square of the curves shown in Fig. 2.17 [Eq. (2.108)]. For transmittance,


Figure 2.21 Reflectance and transmittance for internal reflection ($0 < \theta_i < \theta_c$) and total internal reflection ($\theta_c < \theta_i < \pi/2$).

we also have two intervals. In the first, $0 < \theta_i < \theta_c$, the transmission coefficients are calculated according to Eqs. (2.98) and (2.99). In the second interval, $\theta_c < \theta_i < \pi/2$, the transmission angle is $\theta_t = 90^\circ$; therefore, according to Eq. (2.110), the transmittance for both the parallel and the orthogonal components becomes equal to 0. Of course, it is again verified that $R + T = 1$ for the two components of the electric (and magnetic) fields.

Note: In total internal reflection, the incident energy is completely reflected. However, there is still an electromagnetic wave beyond the interface that is rapidly fading. This wave is known as an evanescent wave.

2.5 Polarization with Birefringent Materials

The electric polarization vector in dielectric materials is related to the electric field by the electric susceptibility according to Eq. (B.9): $\mathbf{P} = \epsilon_0 \chi \mathbf{E}$. When the material is isotropic, the susceptibility is described by a scalar and the wave equation [Eq. (B.14)] reduces to $\nabla^2\mathbf{E} = \mu_0\epsilon_0(1 + \chi)\,\partial^2\mathbf{E}/\partial t^2$, where $v = c/\sqrt{1 + \chi}$ is the speed of light in the material. When the material is anisotropic, the susceptibility is described by a tensor (3 × 3 matrix) and the wave equation is a bit more complex. In general, electric susceptibility can be described as

$$\chi = \begin{pmatrix} \chi_{11} & \chi_{12} & \chi_{13} \\ \chi_{21} & \chi_{22} & \chi_{23} \\ \chi_{31} & \chi_{32} & \chi_{33} \end{pmatrix}, \tag{2.130}$$

and the electric polarization vector can be described as

$$\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} = \epsilon_0 \begin{pmatrix} \chi_{11} & \chi_{12} & \chi_{13} \\ \chi_{21} & \chi_{22} & \chi_{23} \\ \chi_{31} & \chi_{32} & \chi_{33} \end{pmatrix} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}, \tag{2.131}$$

which is not necessarily parallel to the electric field vector. In the case of nonabsorbing anisotropic dielectric materials (of particular interest in this book), $\chi$ is a symmetric matrix and can be reduced on a system of principal axes to [2]

$$\chi = \begin{pmatrix} \chi_1 & 0 & 0 \\ 0 & \chi_2 & 0 \\ 0 & 0 & \chi_3 \end{pmatrix}. \tag{2.132}$$

The quantities $\chi_1$, $\chi_2$, and $\chi_3$ are called principal susceptibilities. In general, crystals are anisotropic materials whose elements (atoms, molecules, ions, etc.) are located in regular geometrical arrangements: cubic, trigonal, tetragonal, hexagonal, triclinic, monoclinic, and orthorhombic. For nonabsorbing dielectric crystals, the principal susceptibilities are related as follows: for cubic, $\chi_1 = \chi_2 = \chi_3$, i.e., it behaves like an isotropic material; for trigonal, tetragonal, and hexagonal, $\chi_1 = \chi_2 \neq \chi_3$; and for triclinic, monoclinic, and orthorhombic, $\chi_1 \neq \chi_2 \neq \chi_3$.

By writing the wave equation [Eq. (B.14)] as

$$\nabla \times (\nabla \times \mathbf{E}) + \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = -\frac{1}{c^2}\,\chi\,\frac{\partial^2\mathbf{E}}{\partial t^2} \tag{2.133}$$

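and proposing a plane wave solution of the form $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$, a set of algebraic equations is obtained from which the behavior of the components of the electric field inside the anisotropic material can be analyzed. According to Eqs. (2.10) and (2.12), for harmonic plane waves, the following changes can be made: $\nabla \to i\mathbf{k}$ and $\partial/\partial t \to -i\omega$. With this in mind, Eq. (2.133) becomes

$$\mathbf{k} \times (\mathbf{k} \times \mathbf{E}) + \frac{\omega^2}{c^2}\mathbf{E} = -\frac{\omega^2}{c^2}\,\chi\mathbf{E}. \tag{2.134}$$

In terms of components of $\mathbf{k}$ and $\mathbf{E}$, it leads to

$$\left(-k_y^2 - k_z^2 + \frac{\omega^2}{c^2}\right)E_x + k_x k_y E_y + k_x k_z E_z = -\frac{\omega^2}{c^2}\chi_1 E_x, \tag{2.135}$$

$$k_x k_y E_x + \left(-k_x^2 - k_z^2 + \frac{\omega^2}{c^2}\right)E_y + k_y k_z E_z = -\frac{\omega^2}{c^2}\chi_2 E_y, \tag{2.136}$$

$$k_x k_z E_x + k_y k_z E_y + \left(-k_x^2 - k_y^2 + \frac{\omega^2}{c^2}\right)E_z = -\frac{\omega^2}{c^2}\chi_3 E_z. \tag{2.137}$$

Note: To obtain each of the components, the vector identity $[\mathbf{x} \times (\mathbf{y} \times \mathbf{z})]_i = y_i x_j z_j - z_i x_j y_j$ can be used, where $x_j y_j$ ($x_j z_j$) denotes the scalar product $\mathbf{x}\cdot\mathbf{y}$ ($\mathbf{x}\cdot\mathbf{z}$). With this equation, we obtain the ith component of the double cross product.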
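Equations (2.135)–(2.137) form a homogeneous linear system for $E_x$, $E_y$, and $E_z$, and such a system admits a nonzero field only when the determinant of its coefficients vanishes. The sketch below explores this numerically for a wave vector along x. It is an illustration under assumptions: the wave vector is expressed in units of $\omega/c$, the principal indices 1.5, 1.6, and 1.7 are hypothetical, and the susceptibilities are obtained from $1 + \chi = n^2$.

```python
def system_matrix(k, chi):
    """Coefficients of Eqs. (2.135)-(2.137) with every term moved to the left side.

    k is the wave vector in units of omega/c; chi holds the principal susceptibilities.
    """
    kx, ky, kz = k
    return [
        [1 + chi[0] - ky ** 2 - kz ** 2, kx * ky, kx * kz],
        [kx * ky, 1 + chi[1] - kx ** 2 - kz ** 2, ky * kz],
        [kx * kz, ky * kz, 1 + chi[2] - kx ** 2 - ky ** 2],
    ]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

indices = (1.5, 1.6, 1.7)                   # hypothetical principal indices
chi = [n ** 2 - 1 for n in indices]         # from 1 + chi = n**2

# Determinant for propagation along x at three trial wavenumbers
dets = {kx: det3(system_matrix((kx, 0.0, 0.0), chi)) for kx in indices}
print(dets)
```

For propagation along x the determinant vanishes at the wavenumbers tied to the second and third principal indices, but not at the one tied to the first, which corresponds to the field component along the direction of propagation.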

2.5.1 Phase retarder plates

A first result of Eqs. (2.135)–(2.137) is obtained directly if we assume that a wave propagates within the material in one of the principal directions, e.g., $\mathbf{k} = (k_x, 0, 0)$, where the magnitude of $\mathbf{k}$ is $k = k_x$. In this case, the wave equations reduce to

$$\frac{\omega^2}{c^2}E_x = -\frac{\omega^2}{c^2}\chi_1 E_x, \tag{2.138}$$

$$\left(-k_x^2 + \frac{\omega^2}{c^2}\right)E_y = -\frac{\omega^2}{c^2}\chi_2 E_y, \tag{2.139}$$

$$\left(-k_x^2 + \frac{\omega^2}{c^2}\right)E_z = -\frac{\omega^2}{c^2}\chi_3 E_z. \tag{2.140}$$

Because $\epsilon \neq \epsilon_0$ in the dielectric material, $\chi_1$ (and $\chi_2$, $\chi_3$) $\neq 0$. Equation (2.138) is equivalent to $(1 + \chi_1)E_x = 0$; therefore, $E_x = 0$, i.e., $\mathbf{E}$ is transverse to the direction of propagation, or $\mathbf{E} = \{0, E_y, E_z\}$. In Eq. (2.139), if $E_y \neq 0$, then

$$k^2 = \frac{\omega^2}{c^2}(1 + \chi_2). \tag{2.141}$$

Because the index of refraction is defined as $n = \sqrt{1 + \chi}$ [Eq. (B.18)], the wavenumber of the component of the electric field that vibrates in the yx plane and propagates in the x direction is given by

$$k = \frac{2\pi}{\lambda}\, n_2. \tag{2.142}$$

In other words, the $E_y$ component propagates in a material of refractive index $n_2 = \sqrt{1 + \chi_2}$. Similarly, in Eq. (2.140), if $E_z \neq 0$, then

$$k^2 = \frac{\omega^2}{c^2}(1 + \chi_3); \tag{2.143}$$

therefore, the wavenumber of the component of the electric field that vibrates in the zx plane and propagates in the x direction is given by

$$k = \frac{2\pi}{\lambda}\, n_3. \tag{2.144}$$

Now, the $E_z$ component of the field propagates in a material of refractive index $n_3 = \sqrt{1 + \chi_3}$. So for the wave propagating in the x direction, the material has two refractive indices, one for each component of the electric field. These types of materials are called birefringent materials (double index). In short, the electromagnetic wave propagating in the x direction has the form

$$\mathbf{E} = \{0,\ E_{0y}\, e^{i(2\pi x n_2/\lambda - \omega t)},\ E_{0z}\, e^{i(2\pi x n_3/\lambda - \omega t)}\}. \tag{2.145}$$

Thus, the $E_y$ and $E_z$ components are continuously out of phase. For the distance x, the phase shift between the components would be

$$\Delta\delta = \frac{2\pi(n_3 - n_2)}{\lambda}\, x. \tag{2.146}$$

Similar results are obtained whether the propagation direction is y or z. Thus, the principal refractive indices $n_1$, $n_2$, and $n_3$ are associated with the x, y, and z directions, respectively. Based on these results, birefringent plates are fabricated to generate phase delays between components by controlling the thickness of the plate. Suppose we have a birefringent plate of thickness d, as shown in Fig. 2.22, whose edges coincide with the principal directions and $n_3 > n_2$. A wave that vibrates in the zx plane propagates with speed $v_3 = c/n_3$, and a wave that vibrates in the yx plane propagates with speed $v_2 = c/n_2$. Because $n_3 > n_2$, then $v_3 < v_2$. In other words, for Eq. (2.145), the $E_z$ component travels slower than the $E_y$ component. So it is said that there is a slow axis in z and a fast axis in y. Thus, it is clear that the $E_y$ component is ahead of $E_z$ when they emerge from the plate, and the phase difference between the two would be $2\pi(n_3 - n_2)d/\lambda$.

Figure 2.22 Phase retarder plate.

Quarter-wave plate

Suppose that we want to generate a phase difference equal to $\pi/2$, i.e., $2\pi(n_3 - n_2)d/\lambda = \pi/2$. The thickness of the plate should be

$$d = \frac{\lambda}{4}\left(\frac{1}{\Delta n}\right),$$

with $\Delta n = |n_3 - n_2|$. These types of plates are called $\lambda/4$ plates and are often used to generate circularly polarized light. For example, if a linearly polarized plane wave with the plane of vibration at an angle $\alpha = 45^\circ$ is incident on the front face of the plate with $\theta_i = 0$ (normal incidence), the light exiting the plate will have a left circular polarization state, as shown in Fig. 2.22. Note that if the plate is rotated 90° around the x axis (now the fast axis will be vertical and the slow axis will be horizontal) while maintaining the polarization of the incident beam, the result will be light with a right circular polarization state.

Half-wave plate

To generate a phase difference of $\pi$, the thickness of the plate must be doubled, i.e.,

$$d = \frac{\lambda}{2}\left(\frac{1}{\Delta n}\right).$$

These types of plates are called $\lambda/2$ plates and are often used to rotate the plane of vibration of a linearly polarized wave. Specifically, if the incident wave is as in the previous case, with $\alpha = 45^\circ$, the wave leaving the $\lambda/2$ plate will be linearly polarized with $\alpha = -45^\circ$. A notable advantage of using a $\lambda/2$ plate to rotate the plane of vibration of a linearly polarized wave is that the rotation is performed without attenuating the amplitude of the wave, which occurs when dichroic polarizers are used, according to Malus' law [Eq. (2.53)].

To estimate the thickness of phase-retarding plates, suppose we want to build a $\lambda/4$ plate of calcite (trigonal crystal structure). The principal refractive indices of calcite are $n_1 = n_2 = 1.658$ and $n_3 = 1.486$, i.e., $\Delta n = 0.172$. Thus, the thickness for the line "d" of helium ($\lambda = 587.56$ nm) would be $d = 0.8540$ μm. This is a very small thickness for a practical device.
Thus, at a commercial level, the plates are made with birefringent crystals whose thickness is an odd multiple of the $\lambda/4$ thickness (or an odd multiple of the $\lambda/2$ thickness).

These are the ordinary and extraordinary indices. These definitions appear in Section 2.5.2.
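The thickness estimate and the action of the plates can be reproduced with a few lines. This is a minimal sketch with hypothetical function names: the thickness follows from setting $2\pi\,\Delta n\, d/\lambda$ equal to the desired phase difference, and the output field is obtained by applying the propagation phases of Eq. (2.145) to each component, using the calcite indices quoted in the text.

```python
import cmath
import math

def plate_thickness(wavelength, delta_n, retardance):
    """Thickness d that satisfies 2*pi*delta_n*d/wavelength = retardance."""
    return retardance * wavelength / (2 * math.pi * delta_n)

def through_plate(E_y, E_z, n2, n3, d, wavelength):
    """Apply the propagation phases of Eq. (2.145) over a thickness d."""
    k0 = 2 * math.pi / wavelength
    return E_y * cmath.exp(1j * k0 * n2 * d), E_z * cmath.exp(1j * k0 * n3 * d)

wavelength = 587.56e-9            # helium "d" line, in meters
n2, n3 = 1.658, 1.486             # calcite indices quoted in the text
delta_n = abs(n3 - n2)

d_quarter = plate_thickness(wavelength, delta_n, math.pi / 2)
d_half = plate_thickness(wavelength, delta_n, math.pi)

# Linearly polarized input at 45 degrees: equal, in-phase components
E_y, E_z = through_plate(1.0, 1.0, n2, n3, d_quarter, wavelength)
ratio_quarter = E_z / E_y         # unit modulus, quarter-turn phase: circular state
E_y, E_z = through_plate(1.0, 1.0, n2, n3, d_half, wavelength)
ratio_half = E_z / E_y            # close to -1: linear state at -45 degrees

print(d_quarter, d_half)
print(cmath.phase(ratio_quarter), ratio_half)
```

Note that for calcite $n_3 < n_2$, so the sign of the relative phase is opposite to the case $n_3 > n_2$ assumed for Fig. 2.22; the magnitude of the retardation is what fixes the thickness.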


2.5.2 Birefringent crystals

The previous section shows how a birefringent crystal behaves assuming that the wave propagates along one of the principal directions. A more general case assumes any direction of propagation. In this case, Eqs. (2.135)–(2.137) must be considered. These equations constitute a system of homogeneous linear equations. The trivial solution is $E_x = E_y = E_z = 0$. A nontrivial solution requires that the determinant of the coefficients be equal to 0, i.e.,

$$\begin{vmatrix} (n_1\omega/c)^2 - k_y^2 - k_z^2 & k_x k_y & k_x k_z \\ k_x k_y & (n_2\omega/c)^2 - k_x^2 - k_z^2 & k_y k_z \\ k_x k_z & k_y k_z & (n_3\omega/c)^2 - k_x^2 - k_y^2 \end{vmatrix} = 0. \tag{2.147}$$

In a $k_x$, $k_y$, and $k_z$ coordinate system, this equation represents a double-layered surface. To see these surfaces in a simple way, let us examine the cuts of the surfaces with the planes $k_x k_y$ ($k_z = 0$), $k_x k_z$ ($k_y = 0$), and $k_y k_z$ ($k_x = 0$), assuming that $n_1 < n_2 < n_3$. Starting with the plane $k_z = 0$, Eq. (2.147) reduces to

$$\begin{vmatrix} (n_1\omega/c)^2 - k_y^2 & k_x k_y & 0 \\ k_x k_y & (n_2\omega/c)^2 - k_x^2 & 0 \\ 0 & 0 & (n_3\omega/c)^2 - k_x^2 - k_y^2 \end{vmatrix} = 0, \tag{2.148}$$

from which

$$\{(n_3\omega/c)^2 - k_x^2 - k_y^2\}\,\{[(n_1\omega/c)^2 - k_y^2][(n_2\omega/c)^2 - k_x^2] - k_x^2 k_y^2\} = 0. \tag{2.149}$$

This is the product of two factors (those in curly brackets), and at least one of them must be equal to 0. From the first factor,

$$k_x^2 + k_y^2 = (n_3\omega/c)^2, \tag{2.150}$$

i.e., a circle of radius $n_3\omega/c$. From the second factor,

$$\frac{k_x^2}{(n_2\omega/c)^2} + \frac{k_y^2}{(n_1\omega/c)^2} = 1, \tag{2.151}$$

i.e., an ellipse with semiaxes $n_2\omega/c$ and $n_1\omega/c$ along $k_x$ and $k_y$, respectively. The two curves resulting from the intersection of the surface of two shells with the plane $k_z = 0$ are shown in Fig. 2.23: the circle with radius $n_3\omega/c$ and the ellipse with semiaxes $n_1\omega/c$ and $n_2\omega/c$.

Performing an analogous analysis for the planes $k_x = 0$ and $k_y = 0$, the intersections of the two surfaces in each plane are a circle and an ellipse, as shown in Fig. 2.24. So, in general, for any direction $\mathbf{k}$, there will be two values


Figure 2.23 Wave vector curves in a birefringent crystal in the plane $k_z = 0$.

Figure 2.24 Wave vector surfaces in a birefringent crystal with n1 < n2 < n3. In the direction of the optic axis, the phase velocities associated with each of the polarization directions are equal.

for the wavenumber k, i.e., two refractive indices. In particular, for propagation in the x direction, there will be two phase velocities, $v_2 = c/n_2$ for the $E_y$ component and $v_3 = c/n_3$ for the $E_z$ component; for propagation in the y direction, we will also have two phase velocities, $v_1 = c/n_1$ for the $E_x$ component and $v_3 = c/n_3$ for the $E_z$ component; and similarly for propagation in the z direction, i.e., $v_1 = c/n_1$ for the $E_x$ component and $v_2 = c/n_2$ for the $E_y$ component. This means that for each direction of propagation there are two phase velocities corresponding to two mutually orthogonal components of the electric field. For any other direction of propagation the same thing happens: there are two phase velocities that always correspond to two polarization directions orthogonal to each other. This is illustrated in Fig. 2.24, with the wave vector $\mathbf{k}$ in the plane $k_z = 0$. The values of the refractive indices for each direction of polarization are the distances from the origin of the coordinate system to the two intersections of the vector $\mathbf{k}$, divided by $\omega/c$.

A particular situation for phase velocities occurs when the two surfaces of the wave vector intersect at a point, as shown in Fig. 2.24 for the $k_y = 0$ plane. There, the wavenumber k is the same for the two polarization directions; therefore, the refractive indices are the same and, consequently, the phase velocities will also be the same. The direction of $\mathbf{k}$ for which this occurs is called the optic axis of the crystal. Thus, a wave that propagates in an anisotropic crystal along the optic axis of the crystal does so in the same way as in an isotropic material. The two orthogonal components of the field are not out of phase with each other.

In birefringent crystals in which the three principal refractive indices are different from each other, there are two optic axes; these crystals are called biaxial crystals. In birefringent crystals in which two of the three principal refractive indices are equal to each other, there is only one optic axis; these crystals are called uniaxial crystals. The two surfaces of the wave vector in uniaxial crystals consist of a sphere and an ellipsoid of revolution. Depending on which surface contains the other, uniaxial crystals are classified as positive or negative. They are positive if the ellipsoid circumscribes the sphere; they are negative if the sphere circumscribes the ellipsoid. In the case of biaxial crystals, these surfaces are intersecting ellipsoids of revolution.

The crystal classification of the sphere and ellipsoid sections in the plane containing the optic axes is shown in Fig. 2.25. In uniaxial crystals, $\chi_1 = \chi_2$; the corresponding index of refraction is called the ordinary refractive index, $n_o$. The index corresponding to $\chi_3$ is called the extraordinary refractive index, $n_e$. Thus, in a positive uniaxial crystal, $n_o < n_e$,


Figure 2.25 Classification of birefringent crystals according to the optic axis of the crystal. (a) Positive uniaxial, (b) negative uniaxial, and (c) biaxial.

Table 2.2 Some crystals. In uniaxial crystals, the indices that are the same are called ordinary, and the index that is different is called extraordinary [2].

Structure                                        Susceptibility       Crystal             n1      n2      n3
Isotropic (cubic)                                χ = diag(a, a, a)    Sodium chloride     1.544   1.544   1.544
                                                                      Diamond             2.417   2.417   2.417
Uniaxial (trigonal, tetragonal, hexagonal)       χ = diag(a, a, b)    + Quartz            1.544   1.544   1.553
                                                                      + Ice               1.309   1.309   1.310
                                                                      – Calcite           1.658   1.658   1.486
                                                                      – Sodium nitrate    1.587   1.587   1.336
Biaxial (triclinic, monoclinic, orthorhombic)    χ = diag(a, b, c)    Topaz               1.619   1.620   1.627
                                                                      Mica                1.552   1.582   1.588

and in a negative uniaxial crystal, $n_o > n_e$. On the other hand, the component of the field that vibrates with the wavenumber corresponding to $n_o$ is called an ordinary wave (ordinary ray), and the component of the field that vibrates with the wavenumber corresponding to $n_e$ is called an extraordinary wave (extraordinary ray). Some examples of positive (+) and negative (–) uniaxial and biaxial crystals are given in Table 2.2 [2]. In particular, calcite has a relatively large $\Delta n$, which explains why it is a widely used material for the manufacture of optical elements (phase retarders and prisms).

2.5.3 Refraction in crystals

In general, a birefringent crystal has two refractive indices for one direction of light propagation. Therefore, in the refraction of light at an interface separating two media, one isotropic and the other anisotropic, an incident light beam with $\theta_i \neq 0$ (coming from the isotropic medium) will separate into two refracted light beams. In particular, for refraction we will consider interfaces that are parallel or orthogonal to the optic axis of the crystal. In Fig. 2.24, in the $k_y = 0$ plane, it can be seen that the ordinary wave vibrates orthogonally to the optic axis of the crystal. This is the case in all crystals and allows us to determine the direction that ordinary and extraordinary waves follow in refraction.

To see the refraction, let us consider a negative uniaxial crystal immersed in air. Assume that the interface is a flat face of the crystal that contains the optic axis and that the plane of incidence is orthogonal to the optic axis. From Fig. 2.25(b), we can see that the incidence and refraction of rays occur as shown in Fig. 2.26. The refractive index curves (wavenumber divided by $\omega/c$) are circles. Because the optic axis indicates one


Figure 2.26 Refraction in a negative uniaxial crystal ($n_o > n_e$). In refraction, there are two separate beams: the ordinary wave $\mathbf{E}_{\parallel t}$ and the extraordinary wave $\mathbf{E}_{\perp t}$.

direction, the optic axis is shown as points (lines orthogonal to the plane of the paper) in Fig. 2.26. If we decompose the incident beam into a vector parallel to the plane of incidence (paper plane), $\mathbf{E}_{\parallel i}$, and into a vector orthogonal to the plane of incidence, $\mathbf{E}_{\perp i}$, then the parallel component will be orthogonal to the optic axis; this will be the ordinary wave. To obtain the directions of propagation of ordinary and extraordinary waves in refraction, we can make use of the graphical ray tracing of Fig. 1.9, as shown in Fig. 2.26. Of course, both rays obey the vector form of Snell's law [Eq. (2.75)]. For the ordinary ray, $n_t = n_o$, and for the extraordinary ray, $n_t = n_e$. Therefore, for the ordinary ray,

\[ n_o \sin\theta_{to} = n_i \sin\theta_i, \tag{2.152} \]

and for the extraordinary ray,

\[ n_e \sin\theta_{te} = n_i \sin\theta_i. \tag{2.153} \]

The final result is the separation of the two components of the incident wave: $\mathbf{E}_{\parallel t}$ vibrates propagating in the direction of the ordinary ray, and $\mathbf{E}_{\perp t}$ vibrates propagating in the direction of the extraordinary ray. The separation of the beams by double refraction explains the double image that can be observed with calcite crystals, as shown in Fig. 2.27. Changing the orientation of the optic axis with respect to the interface also changes the refractive index curves and thus the way ordinary and extraordinary rays are refracted. Some basic configurations of the index curves for positive and negative uniaxial crystals are shown in Fig. 2.28. The optic axis is represented by parallel lines in gray or by dots (lines orthogonal to the plane of the paper). In addition to the separation of the ordinary (o) and extraordinary (e) rays, in negative uniaxial crystals the ordinary ray lags behind the extraordinary ray; in positive uniaxial crystals the opposite occurs.
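As a numerical illustration of Eqs. (2.152) and (2.153), the following minimal sketch computes the two refraction angles for calcite in air. It assumes the configuration of Fig. 2.26 (plane of incidence orthogonal to the optic axis, so both index curves are circles). The function name `refraction_angles` and the 30° incidence angle are illustrative choices that do not come from the text; the indices are those listed for calcite in Table 2.2.

```python
import math

def refraction_angles(theta_i_deg, n_i=1.0, n_o=1.658, n_e=1.486):
    """Return (theta_to, theta_te) in degrees from Snell's law for each ray.

    Assumes the plane of incidence is orthogonal to the optic axis, so the
    extraordinary ray sees the constant index n_e.
    """
    s = n_i * math.sin(math.radians(theta_i_deg))
    theta_to = math.degrees(math.asin(s / n_o))  # ordinary ray, Eq. (2.152)
    theta_te = math.degrees(math.asin(s / n_e))  # extraordinary ray, Eq. (2.153)
    return theta_to, theta_te

theta_to, theta_te = refraction_angles(30.0)
print(f"ordinary: {theta_to:.2f} deg, extraordinary: {theta_te:.2f} deg")
```

Because $n_o > n_e$ in calcite, the ordinary ray is refracted closer to the normal than the extraordinary ray, which is the separation sketched in Fig. 2.26.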


Figure 2.27 Double image generated with a calcite crystal.

Figure 2.28 Some configurations, panels (a) to (c), of the interface and the plane of incidence in uniaxial crystals whose optic axis is parallel or orthogonal to the interface.

2.5.4 Polarizing prisms

From the configurations shown in Fig. 2.28, optical elements can be built to obtain linearly polarized light from natural light. Because the two components of the refracted field have mutually orthogonal linear polarization states, one of them can be blocked to obtain linearly polarized light. However, the obtained beam will have a different direction of propagation than the incident beam. To obtain linearly polarized light in the same direction as the incident natural light beam, a pair of birefringent prisms can be used with their diagonal faces joined by an optical medium or by a film of air. In the first prism, the rays with mutually orthogonal polarizations maintain the direction


Figure 2.29 Total internal reflection in calcite for ordinary rays, $n < n_o$ and $n < n_e$.

of the incident ray (orthogonal to the first face). At the diagonal face, one of the rays is completely reflected while the other is partially reflected and partially transmitted to the second prism, so that it maintains the direction of the incident beam. What is remarkable here is the total internal reflection of one of the beams. To see the working principle, let us consider the internal reflection of a beam in calcite, as shown in Fig. 2.29. Because there are two refractive indices in calcite, $n_o = 1.658$ for the wave that vibrates parallel to the plane of incidence (ordinary ray) and $n_e = 1.486$ for the wave that vibrates orthogonal to the plane of incidence (extraordinary ray), there will also be two critical angles for total internal reflection. If $n = 1$ (air), for ordinary rays,

\[ \theta_o^{(c)} = \arcsin\left(\frac{n}{n_o}\right) = 37.09^\circ, \]

and for extraordinary rays,

\[ \theta_e^{(c)} = \arcsin\left(\frac{n}{n_e}\right) = 42.29^\circ. \]

Based on these two angles, if the incident ray arrives at the interface with an angle $\theta_o^{(c)} < \theta_i < \theta_e^{(c)}$, then the ordinary ray undergoes total internal reflection, while the extraordinary ray is partially reflected and partially transmitted, so that in the second medium (air) there is a linearly polarized beam that vibrates orthogonally to the plane of incidence. Figure 2.30 shows a possible system with a pair of calcite prisms in which the output beam is linearly polarized in the “s” state and in the same direction as the incident beam (natural light). The normal of the diagonal face (of the first prism) and the incident beam determine the plane of incidence. On the other hand, the inclination of the diagonal is such that the rays transmitted in the first prism arrive at the diagonal with an angle of incidence $\theta_i = 40^\circ$. At this diagonal, the ordinary ray undergoes total internal reflection and the extraordinary ray is partially reflected. In transmission there is only the


Figure 2.30 System of two calcite prisms with the optic axis orthogonal to the plane of incidence to generate linearly polarized light in the “s” state. The two diagonals are parallel, and the medium between them is air ($n = 1.000$).

extraordinary ray that deviates (according to Snell’s law), and when it reaches the second diagonal it returns to the direction it had in the first prism because the two diagonals are parallel to each other.

Commercially, there is a wide variety of polarizing prisms. The diagonal faces are usually bonded with an optical medium other than air, e.g., Canada balsam, which has a refractive index in the range of 1.54 to 1.55. This index lies between the two refractive indices of calcite, $n_o$ and $n_e$, so there will only be a critical angle for the ordinary ray, i.e., $\theta_o^{(c)} = 68.25^\circ$ (for $n = 1.54$). A Glan–Thompson prism made of calcite, with the optic axis orthogonal to the plane of incidence, is shown in Fig. 2.31. The geometry of the prisms is such that the angle of incidence on the first diagonal face is greater than $\theta_o^{(c)}$. Other types of prisms made of quartz are shown in Fig. 2.32. In these prisms, the optic axes change direction and the diagonals are inclined by 45°. In Wollaston prisms, $\mathbf{E}_{\perp i}$ in the first prism is the ordinary ray; it changes to an extraordinary ray when passing to the second prism, so that on the diagonal it

Figure 2.31 Glan–Thompson prism made with two calcite prisms bonded with Canada balsam.

 Canada balsam is a tree resin that, due to its transparency and its refractive index close to glass (after being subjected to an oil evaporation process), is used to glue lenses in optical instruments.


Figure 2.32 Some types of polarizing prisms that separate the components of the optical field into the “s” and “p” states. In all cases, the prisms are quartz [2].

is refracted approaching the normal. On the other hand, $\mathbf{E}_{\parallel i}$ is the extraordinary ray in the first prism; it changes to an ordinary ray when passing to the second prism, so that on the diagonal it refracts away from the normal. In Rochon prisms, $\mathbf{E}_{\perp i}$ and $\mathbf{E}_{\parallel i}$ enter the first prism along the optic axis, so both rays will see the ordinary index. In the second prism, $\mathbf{E}_{\perp i}$ changes to an extraordinary ray and refracts closer to the normal. In contrast, $\mathbf{E}_{\parallel i}$ continues as an ordinary ray, so its propagation direction does not change. Finally, in Sénarmont prisms, $\mathbf{E}_{\perp i}$ and $\mathbf{E}_{\parallel i}$ also enter the first prism along the optic axis, so both rays will see the ordinary index. In the second prism, $\mathbf{E}_{\perp i}$ continues as an ordinary ray, so its direction of propagation does not change, and $\mathbf{E}_{\parallel i}$ changes to an extraordinary ray and refracts closer to the normal.

There are also polarizing prisms made of isotropic optical glass. These are beamsplitter cubes with a dielectric film between the diagonals that joins the prisms, allowing the transmission of the p-polarization state and the reflection in the diagonal of the s-polarization state. These are mentioned briefly in Appendix E.
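The critical angles quoted in this section can be reproduced with a few lines of code. The sketch below is only a numerical check, under the assumption that each ray sees a single index ($n_o$ or $n_e$), as in Figs. 2.29 to 2.31. The helper name `critical_angle_deg` is hypothetical, and the value $n = 1.54$ for Canada balsam is the lower end of the range given above.

```python
import math

def critical_angle_deg(n_outside, n_inside):
    """Critical angle (degrees) for light going from n_inside to n_outside.

    Returns None when n_outside >= n_inside (no total internal reflection).
    """
    if n_outside >= n_inside:
        return None
    return math.degrees(math.asin(n_outside / n_inside))

N_O, N_E = 1.658, 1.486  # calcite indices (Table 2.2)

# Air gap (n = 1): both rays have a critical angle.
theta_o_air = critical_angle_deg(1.0, N_O)
theta_e_air = critical_angle_deg(1.0, N_E)

# Canada balsam (n = 1.54 assumed): only the ordinary ray has one.
theta_o_balsam = critical_angle_deg(1.54, N_O)
theta_e_balsam = critical_angle_deg(1.54, N_E)

theta_i = 40.0  # incidence angle on the diagonal used for Fig. 2.30
only_ordinary_blocked = theta_o_air < theta_i < theta_e_air

print(f"air:    theta_o = {theta_o_air:.2f}, theta_e = {theta_e_air:.2f}")
print(f"balsam: theta_o = {theta_o_balsam:.2f}, theta_e = {theta_e_balsam}")
print("40 deg lies between the two critical angles in air:", only_ordinary_blocked)
```

With an air gap, any incidence angle between the two critical angles removes the ordinary ray by total internal reflection; with Canada balsam, the extraordinary ray has no critical angle at all, so the prism only needs an incidence angle above that of the ordinary ray.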

2.6 Vectors and Jones Matrices

To describe the polarization state of a plane wave in Section 2.1, we use the vector [Eq. (2.32)]

\[ \mathbf{E}(x, y, z; t) = \{\hat{\mathbf{i}} E_{ox} + \hat{\mathbf{j}} E_{oy}\}\, e^{i(kz - \omega t)}, \]

where $E_{ox} = |E_{ox}| e^{i\delta_x}$ and $E_{oy} = |E_{oy}| e^{i\delta_y}$. Because the temporal and spatial phase terms are common to the complex amplitudes of the two wave components, it is convenient to represent the state of polarization as a column vector in which its elements determine the relationship between the components of the wave. This representation is known as a Jones vector:

\[ \begin{bmatrix} E_{ox} \\ E_{oy} \end{bmatrix} = \begin{bmatrix} |E_{ox}| e^{i\delta_x} \\ |E_{oy}| e^{i\delta_y} \end{bmatrix}. \tag{2.154} \]


For some of the more common polarization states, the explicit representations are as follows.

Linear polarization

The phase shift between the components must be $\Delta\delta = m\pi$ ($m = 0, \pm 1, \pm 2, \ldots$). Therefore,

\[ \begin{bmatrix} |E_{ox}| \\ \pm |E_{oy}| \end{bmatrix}. \tag{2.155} \]

In the most common cases of linear polarization, the Jones vector can be reduced depending on the orientation of the plane of vibration. For horizontal polarization ($E_H$), $|E_{oy}| = 0$,

\[ |E_{ox}| \begin{bmatrix} 1 \\ 0 \end{bmatrix}; \tag{2.156} \]

for vertical polarization ($E_V$), $|E_{ox}| = 0$,

\[ |E_{oy}| \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \tag{2.157} \]

and for diagonal polarization at ±45° ($E_{\pm 45}$), $|E_{ox}| = |E_{oy}|$,

\[ |E_{ox}| \begin{bmatrix} 1 \\ \pm 1 \end{bmatrix}. \tag{2.158} \]

Circular polarization

The phase shift between the components must be $\Delta\delta = (2m - 1)\pi/2$ ($m = 0, \pm 1, \pm 2, \ldots$) and $|E_{ox}| = |E_{oy}|$. Therefore, the Jones vector becomes

\[ |E_{ox}| \begin{bmatrix} 1 \\ \pm i \end{bmatrix}. \tag{2.159} \]

The (+) sign is for the left circular polarization ($E_L$), and the (–) sign is for the right circular polarization ($E_R$).

Elliptical polarization

When the phase shift between the components is $\Delta\delta = (2m - 1)\pi/2$ ($m = 0, \pm 1, \pm 2, \ldots$) and $|E_{ox}| \neq |E_{oy}|$, which corresponds to an unrotated ellipse, the Jones vector would be

\[ |E_{ox}| \begin{bmatrix} 1 \\ \pm i b \end{bmatrix}, \tag{2.160} \]

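Table 2.3 Some polarization states in Jones vector notation.

Jones vector                                  Polarization state
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}$        $E_H$ (horizontal linear)
$\begin{bmatrix} 0 \\ 1 \end{bmatrix}$        $E_V$ (vertical linear)
$\begin{bmatrix} 1 \\ \pm 1 \end{bmatrix}$    $E_{\pm 45}$ (linear diagonal, ±45°)
$\begin{bmatrix} 1 \\ i \end{bmatrix}$        $E_L$ (left circular)
$\begin{bmatrix} 1 \\ -i \end{bmatrix}$       $E_R$ (right circular)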

with $b = |E_{oy}|/|E_{ox}|$. The (+) sign is for the left elliptical polarization ($E_L$), and the (–) sign is for the right elliptical polarization ($E_R$). Any other state of polarization will be described by Eq. (2.154).

The polarization states most commonly used are shown in Table 2.3. Note that the factor that multiplies the column vector is omitted since it is common to both elements. Thus, to represent a state of polarization, only the simplest form of the Jones vector is used.

Operations between polarization states can be performed with Jones vectors. For example, the sum of two mutually orthogonal linear polarization states

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

results in a state of diagonal polarization; the sum of a linear and circular polarization state

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ i \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0.5i \end{bmatrix} \]

results in a state of elliptical polarization; and the sum of two circular polarization states, one to the left and one to the right,

\[ \begin{bmatrix} 1 \\ i \end{bmatrix} + \begin{bmatrix} 1 \\ -i \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

results in a state of linear polarization.

On the other hand, polarizing elements can also be conveniently represented as $2 \times 2$ matrices, such that the effect of a polarizing element on a polarization state is described by a linear transformation. Thus, if

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{2.161} \]

describes the polarizing element, the result on a state of polarization $\begin{bmatrix} A \\ B \end{bmatrix}$ is

\[ \begin{bmatrix} A' \\ B' \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix}. \tag{2.162} \]
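The operations above are easy to reproduce numerically. The following sketch represents Jones vectors as pairs of Python complex numbers, repeats the three sums, and applies Eq. (2.162) directly. The helper names (`add`, `apply`, `normalize`) are illustrative assumptions, and the matrix `SWAP` is an arbitrary example chosen only to show the arithmetic; it is not one of the elements discussed in the text.

```python
# Jones vectors as 2-tuples of (possibly complex) numbers, common factor omitted.
E_H = (1, 0)     # horizontal linear
E_V = (0, 1)     # vertical linear
E_L = (1, 1j)    # left circular
E_R = (1, -1j)   # right circular

def add(u, v):
    """Sum of two Jones vectors."""
    return (u[0] + v[0], u[1] + v[1])

def apply(m, v):
    """Eq. (2.162): m is ((a, b), (c, d)), v is (A, B); returns (A', B')."""
    (a, b), (c, d) = m
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def normalize(v):
    """Divide by the first component (assumed nonzero) to get the simplest form."""
    return (1, v[1] / v[0])

diagonal = add(E_H, E_V)               # linear at +45 degrees
elliptical = normalize(add(E_H, E_L))  # elliptical, second component 0.5i
linear = normalize(add(E_L, E_R))      # horizontal linear

SWAP = ((0, 1), (1, 0))                # arbitrary illustrative matrix
swapped = apply(SWAP, E_H)             # exchanges the two components

print(diagonal, elliptical, linear, swapped)
```

Dividing by the first component mirrors the convention of Table 2.3, where the common factor is dropped and only the simplest form of the vector is kept.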

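For example, consider a linear polarizer with its transmission axis horizontal. The result over $E_H$ is again $E_H$, and the result over $E_V$ is 0; i.e.,

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \tag{2.163} \]

and

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{2.164} \]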

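It is immediately clear that from Eq. (2.163) $a = 1$ and $c = 0$, and from Eq. (2.164) $b = 0$ and $d = 0$. Therefore, the matrix

\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \tag{2.165} \]

represents a linear polarizer with its transmission axis horizontal.

Table 2.4 shows some matrices that represent polarizing elements. Some of the matrices are accompanied by the factors $1/2$ and $1/\sqrt{2}$, which are necessary when energy balance is required but can be omitted for the analysis of polarization changes. Multiple representations can exist for the same element; e.g., a left circular polarizer element is represented in Table 2.4 by

\[ \begin{bmatrix} 1 & -i \\ i & 1 \end{bmatrix}. \]

Table 2.4 Some polarizers represented as Jones matrices.

Element              Configuration           Jones matrix
Linear polarizer     horizontal              $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$
                     vertical                $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$
                     diagonal                $\frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
$\lambda/4$ plate    fast axis horizontal    $\begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix}$
                     fast axis vertical      $\begin{bmatrix} 1 & 0 \\ 0 & -i \end{bmatrix}$
                     fast axis ±45°          $\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & \pm i \\ \pm i & 1 \end{bmatrix}$
$\lambda/2$ plate    fast axis horizontal    $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$
                     fast axis vertical      $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$
Circular polarizer   right                   $\frac{1}{2}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}$
                     left                    $\frac{1}{2}\begin{bmatrix} 1 & -i \\ i & 1 \end{bmatrix}$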

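But the same polarizer can also be represented by the matrix that results from combining a diagonal linear polarizer with a $\lambda/4$ plate, i.e.,

\[ \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ i & i \end{bmatrix}. \]

With either arrangement, a linear polarization state ($E_H$, $E_V$, $E_{\pm 45}$) can be transformed into a circular $E_L$ polarization state, which can be easily verified as follows:

\[ \begin{bmatrix} 1 & -i \\ i & 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = (A - iB) \begin{bmatrix} 1 \\ i \end{bmatrix} \]

and

\[ \begin{bmatrix} 1 & 1 \\ i & i \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = (A + B) \begin{bmatrix} 1 \\ i \end{bmatrix}. \]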
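The two factorizations above can be checked numerically. The sketch below multiplies the $\lambda/4$ plate by the diagonal polarizer and then confirms that both representations send a few linear states to a multiple of $(1, i)$. The helper names (`matmul`, `apply`, `is_left_circular`) are assumptions made for this example, and the normalization factors of Table 2.4 are omitted.

```python
def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(m, v):
    """Apply a 2x2 Jones matrix to a Jones vector (A, B)."""
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))

def is_left_circular(v, tol=1e-12):
    """True if v is a nonzero multiple of (1, i)."""
    return abs(v[0]) > tol and abs(v[1] - 1j * v[0]) < tol

LEFT_CIRCULAR = ((1, -1j), (1j, 1))   # left circular polarizer
QUARTER_WAVE = ((1, 0), (0, 1j))      # lambda/4 plate used in the product above
DIAGONAL_POL = ((1, 1), (1, 1))       # diagonal linear polarizer
COMBINED = matmul(QUARTER_WAVE, DIAGONAL_POL)

inputs = {"E_H": (1, 0), "E_V": (0, 1), "E_+45": (1, 1)}
for name, state in inputs.items():
    out_a = apply(LEFT_CIRCULAR, state)
    out_b = apply(COMBINED, state)
    print(name, is_left_circular(out_a), is_left_circular(out_b))
```

One caveat that follows directly from the factor $(A + B)$: the combined arrangement gives a null output for the state with $B = -A$, so that input is left out of the loop.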

As an example, let us consider the system shown in Fig. 2.33. There are three optical elements aligned with the z axis: a linear polarizer LP with its transmission axis rotated 45°, a $\lambda/4$ plate with its fast ($f$) axis vertical, and a glass plate with reflection coefficients $r_\parallel = 0.2$ and $r_\perp = -0.2$.

Figure 2.33 Optical system that cancels the reflected light after passing through the $\lambda/4$ plate.


If (from the left) natural light hits the linear polarizer, the reflected light will vanish when it hits the linear polarizer again. Qualitatively (observing from the positive side of the z axis), it follows that after the linear polarizer the light becomes linearly polarized at 45°; passing through the $\lambda/4$ plate it becomes right-circularly polarized; reflecting off the glass plate it changes to left-circularly polarized; and when it passes again through the $\lambda/4$ plate during its return, it becomes linearly polarized at 45° and no light is transmitted back through the polarizer LP. Mathematically, with vectors and Jones matrices, it will be:

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -i \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -i \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix}. \]

Note that the matrices are written in the reverse order of the optical elements (from left to right). The product of the matrices is indeed 0. The middle matrix represents the reflecting surface at normal incidence, which is written as

\[ \begin{bmatrix} r_\perp & 0 \\ 0 & r_\parallel \end{bmatrix} = 0.2 \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}. \]

On the other hand, when light is reflected, the positive direction of the x axis changes. Although this has no effect on the $\lambda/4$ plate, since its fast ($f$) and slow ($s$) axes are still oriented vertically and horizontally, it impacts the linear polarizer, since its transmission axis will now be at −45°. This explains the sign change in the matrix representing the linear polarizer (first matrix, after the = sign) on the return of the light.

In addition to Jones vectors and matrices, there are other possible mathematical representations, such as Stokes parameters, the Poincaré sphere, and Mueller matrices.
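The cancellation can be verified with a short numerical sketch that follows the polarization state element by element and then forms the full system matrix. The matrices below are assumptions consistent with the description of Fig. 2.33 and the sign conventions of Table 2.4: a $\lambda/4$ plate with vertical fast axis written as diag(1, −i), a reflection matrix diag($r_\perp$, $r_\parallel$) with $r_\perp = -0.2$ and $r_\parallel = 0.2$, and a return polarizer at −45°. The helper names are hypothetical.

```python
def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(m, v):
    """Apply a 2x2 Jones matrix to a Jones vector (A, B)."""
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))

POL_PLUS_45 = ((1, 1), (1, 1))          # LP at +45 deg (factor 1/2 omitted)
POL_MINUS_45 = ((1, -1), (-1, 1))       # LP as seen by the returning light
QWP_FAST_VERTICAL = ((1, 0), (0, -1j))  # lambda/4 plate, fast axis vertical
REFLECTION = ((-0.2, 0), (0, 0.2))      # diag(r_perp, r_par), normal incidence

# Elements in the order in which the light meets them.
path = [POL_PLUS_45, QWP_FAST_VERTICAL, REFLECTION, QWP_FAST_VERTICAL, POL_MINUS_45]

state = (1, 0)  # arbitrary input; the first polarizer fixes the state
for element in path:
    state = apply(element, state)
    print(state)

# System matrix: each new element multiplies from the left.
system = path[0]
for element in path[1:]:
    system = matmul(element, system)
print(system)
```

The intermediate states printed by the loop follow the qualitative description: proportional to (1, 1) after the polarizer, to (1, −i) after the plate, to (1, i) after reflection, and to (1, 1) again before the return polarizer, which then blocks the light.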

References

[1] E. H. Land, “Some aspects of the development of sheet polarizers,” JOSA 41(12), 957–963 (1951).
[2] G. R. Fowles, Introduction to Modern Optics, 2nd ed., Dover Publications, Mineola, New York (1989).

Chapter 3

Interference

Light wave interference is observed as a modulation of irradiance, usually bright fringes and dark fringes on an observation screen. The geometry of the fringes depends on the shape of the wavefronts and the difference in the optical path traveled by the waves. Path differences on the order of the wavelength of light cause changes in irradiance from a bright fringe to a dark fringe, making interference a highly accurate tool for measuring refractive indices, wavefronts, shapes of optical surfaces, thicknesses, etc. The physical parameter that determines the quality of the interference (the possibility of generating fringes) is the coherence between the waves. The coherence has its origin in the fluctuations of the optical field emitted by the sources. Natural sources, like the sun, emit spontaneously (randomly), but in artificial sources, like lasers, the emission has a high degree of correlation.

Two interference patterns generated with a He-Ne laser are shown in Fig. 3.1. The laser beam is focused with a positive lens into a small hole in an opaque screen, which is seen as a point of light in the figure (point source). The lens is behind the screen and cannot be seen in the image. The divergent (spherical) wavefront passes through several optical elements. First, it passes through a 1 mm thick microscope slide (flat piece of glass). There, light is reflected from each slide face and interference occurs between the two reflected signals, which is seen on the opaque screen (two-source interference pattern at bottom left). This interference pattern consists of roughly circular fringes, where the thickness of the bright fringes is similar to the thickness of the dark fringes. This is the typical result of the interference of two sources that emit spherical waves. The beam transmitted by the microscope slide is then allowed to enter a Fabry–Pérot interferometer, which consists of two thick plates of highly reflective glass parallel to each other. The separation between the plates is less than a millimeter, and the facing faces have a thin aluminum film that increases reflectance. This generates multiple reflections, with similar amplitude coefficients, so there is now interference from more than two waves. The effect on the reflected interference pattern is a thinning of

Figure 3.1 Two-wave and multi-wave interference. (Labels in the figure: point source, opaque screen with hole, glass plate, Fabry–Pérot interferometer, 2 sources, N sources.)

the dark fringes, as can be seen on the opaque screen (N sources interference pattern at top right). In this chapter the most common interferometers are presented, starting with the Michelson interferometer, since it is very illustrative to study the interference of plane waves and spherical waves. The interference of multiple beams in a plate with parallel faces is then discussed, which also explains the operation of the Fabry–Pérot interferometer. In all cases, it is assumed that the surfaces of the optical elements that compose the interferometers are ideal, i.e., they coincide with their mathematical description. In practice, the manufacturing process of these elements limits the optical quality of the surfaces. Finally, some practical aspects of the Michelson interferometer, which may also be present in other interferometers, are discussed in Section 3.4.

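3.1 Interference and Coherence

Consider the sum of two harmonic plane waves given by $\mathbf{E}_1(\mathbf{r}, t) = \mathbf{E}_{01} e^{i(\mathbf{k}_1 \cdot \mathbf{r} - \omega t + \phi_1)}$ and $\mathbf{E}_2(\mathbf{r}, t) = \mathbf{E}_{02} e^{i(\mathbf{k}_2 \cdot \mathbf{r} - \omega t + \phi_2)}$ at a point $\mathbf{r}$ of empty space (air). The phases $\phi_1$ and $\phi_2$ are functions of time that depend on the light emission process and account for the fluctuations of the fields in $\mathbf{r}$. The amplitudes $\mathbf{E}_{01}$ and $\mathbf{E}_{02}$ are assumed to be constant in time. The resulting wave is

\[ \mathbf{E}(\mathbf{r}, t) = \mathbf{E}_1(\mathbf{r}, t) + \mathbf{E}_2(\mathbf{r}, t). \tag{3.1} \]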


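The irradiance at $\mathbf{r}$, according to Eq. (2.29), is given by

\[ I(\mathbf{r}) = \frac{\epsilon_0 c}{2} \left\langle \mathbf{E}_1 \cdot \mathbf{E}_1^* + \mathbf{E}_2 \cdot \mathbf{E}_2^* + 2\,\mathrm{Re}\{\mathbf{E}_1 \cdot \mathbf{E}_2^*\} \right\rangle, \tag{3.2} \]

i.e.,

\[ I(\mathbf{r}) = \frac{\epsilon_0 c}{2} \langle E_1^2 \rangle + \frac{\epsilon_0 c}{2} \langle E_2^2 \rangle + 2\,\frac{\epsilon_0 c}{2}\,\mathrm{Re}\langle \mathbf{E}_1 \cdot \mathbf{E}_2^* \rangle. \tag{3.3} \]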

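Let us set the plane that contains the wave vectors $\mathbf{k}_1 = 2\pi\hat{\mathbf{s}}_1/\lambda$ and $\mathbf{k}_2 = 2\pi\hat{\mathbf{s}}_2/\lambda$ as a reference plane. With respect to this plane, the parallel and orthogonal components of $\mathbf{E}_1$ and $\mathbf{E}_2$ can be defined, so that $\mathbf{E}_{01} = \mathbf{E}_{01}^{\perp} + \mathbf{E}_{01}^{\parallel}$ and $\mathbf{E}_{02} = \mathbf{E}_{02}^{\perp} + \mathbf{E}_{02}^{\parallel}$. Then,

\[ \mathbf{E}_1 \cdot \mathbf{E}_2^* = \mathbf{E}_{01}^{\perp} \cdot (\mathbf{E}_{02}^{\perp})^* e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \Delta\phi)} + \mathbf{E}_{01}^{\parallel} \cdot (\mathbf{E}_{02}^{\parallel})^* e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \Delta\phi)}, \tag{3.4} \]

where $\Delta\mathbf{k} = \mathbf{k}_2 - \mathbf{k}_1$ and $\Delta\phi = \phi_2 - \phi_1$. In Eq. (3.4), the terms $\mathbf{E}_{01}^{\parallel} \cdot (\mathbf{E}_{02}^{\perp})^*$ and $\mathbf{E}_{01}^{\perp} \cdot (\mathbf{E}_{02}^{\parallel})^*$ are not included because the components are orthogonal to each other, resulting in 0. In other words, waves with polarizations orthogonal to each other do not interfere.

The vectors $\mathbf{E}_{01}^{\perp}$ and $\mathbf{E}_{02}^{\perp}$ are parallel, while the vectors $\mathbf{E}_{01}^{\parallel}$ and $\mathbf{E}_{02}^{\parallel}$ form an angle equal to the angle between $\hat{\mathbf{s}}_1$ and $\hat{\mathbf{s}}_2$; thus,

\[ I(\mathbf{r}) = \frac{\epsilon_0 c}{2} \langle E_1^2 \rangle + \frac{\epsilon_0 c}{2} \langle E_2^2 \rangle + 2\,\frac{\epsilon_0 c}{2}\,\mathrm{Re}\left\langle \left[ E_{02}^{\perp} E_{01}^{\perp} + E_{02}^{\parallel} E_{01}^{\parallel} \cos(2\alpha) \right] e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \Delta\phi)} \right\rangle, \tag{3.5} \]

where $2\alpha$ is the angle between $\hat{\mathbf{s}}_1$ and $\hat{\mathbf{s}}_2$. From Eq. (3.5), the interference due to waves whose polarization states are parallel to the reference plane depends on the angle $2\alpha$. Thus, if $2\alpha = \pi/2$, the waves do not interfere and the irradiance will simply be the sum of the irradiances due to the parallel components, $I_{\parallel} = (I_1 + I_2)_{\parallel}$. On the other hand, the interference due to waves whose polarization states are orthogonal to the reference plane does not depend on the angle formed between $\hat{\mathbf{s}}_1$ and $\hat{\mathbf{s}}_2$. In what follows, the interfering waves will be assumed to be in the polarization state that is orthogonal to the reference plane; thus, the expression for two-wave interference is

\[ I(\mathbf{r}) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\mathrm{Re}\left\langle e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \Delta\phi)} \right\rangle, \tag{3.6} \]

where $I_1 = (\epsilon_0 c/2)\langle E_1^2 \rangle = (\epsilon_0 c/2)(E_{01}^{\perp})^2$ and $I_2 = (\epsilon_0 c/2)\langle E_2^2 \rangle = (\epsilon_0 c/2)(E_{02}^{\perp})^2$ are the irradiances in $\mathbf{r}$ generated by each of the waves. In the term $\langle e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \Delta\phi)} \rangle$, the phase difference $\Delta\phi$ depends on time. So, to get the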


average value, one needs to explicitly know the variation of $\phi_1$ and $\phi_2$ over time.

3.1.1 Degree of coherence

To get to Eq. (3.6), it is assumed that there is no time delay between the waves. Such a delay can occur if the origin of one of the sources is displaced or if one of the waves travels through a medium that produces a change in the speed of propagation. If there is a time lag $\tau$, then the sum of the waves in general would be

\[ \mathbf{E}(\mathbf{r}, t) = \mathbf{E}_1(\mathbf{r}, t) + \mathbf{E}_2(\mathbf{r}, t - \tau). \tag{3.7} \]

Now, the irradiance is given by

\[ \begin{aligned} I(\mathbf{r}) &= I_1 + I_2 + 2\sqrt{I_1 I_2}\,\mathrm{Re}\left\langle e^{-i(\Delta\mathbf{k} \cdot \mathbf{r} + \omega\tau + \Delta\phi(t))} \right\rangle \\ &= I_1 + I_2 + 2\sqrt{I_1 I_2}\,\mathrm{Re}\left\{ e^{-i\Delta\mathbf{k} \cdot \mathbf{r}} \left\langle e^{-i\omega\tau} e^{-i\Delta\phi(t)} \right\rangle \right\}, \end{aligned} \tag{3.8} \]

with $\Delta\phi(t) = \phi_2(t - \tau) - \phi_1(t)$. The complex degree of coherence is defined as

\[ \gamma(\tau) = \left\langle e^{-i\omega\tau} e^{-i\Delta\phi(t)} \right\rangle = |\gamma(\tau)|\, e^{-i(\alpha(\tau) + \omega\tau)}, \tag{3.9} \]

where $\alpha(\tau) + \omega\tau$ is the phase of the degree of coherence. Then,

\[ I(\mathbf{r}) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,|\gamma(\tau)| \cos(\Delta\mathbf{k} \cdot \mathbf{r} + \omega\tau + \alpha(\tau)). \tag{3.10} \]

The modulus of the degree of coherence satisfies $0 \le |\gamma(\tau)| \le 1$. In its limits, it takes the value 0 if $\Delta\phi$ is a random function and the value 1 if $\Delta\phi$ is constant in time. The superposition of the waves is considered to be incoherent, partially coherent, or coherent if the following is satisfied for the magnitude of $\gamma(\tau)$:

\[ |\gamma(\tau)| \begin{cases} = 0, & \text{incoherent} \\ \in (0, 1), & \text{partially coherent} \\ = 1, & \text{coherent.} \end{cases} \tag{3.11} \]

Although the interference expression given by Eq. (3.10) has been obtained specifically for homogeneous plane waves (i.e., when the amplitude of each wave is constant over a given wavefront), the result is valid for the sum of two waves in general. In the latter case, the degree of coherence is the normalized version of the correlation of the fields at point r. Principles of Optics by Born and Wolf contains the general formalism for interference with partially coherent waves [1].
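To make Eq. (3.10) concrete, the following sketch evaluates the largest and smallest irradiance reached as the total phase $\Delta\mathbf{k} \cdot \mathbf{r} + \omega\tau + \alpha(\tau)$ sweeps a full cycle, for the three cases of Eq. (3.11). The function names and the sample values ($I_1 = I_2 = 1$, $|\gamma| = 0.5$ for the partially coherent case) are illustrative assumptions.

```python
import math

def irradiance(i1, i2, gamma_mod, phase):
    """Eq. (3.10): two-wave irradiance for a given total phase (radians)."""
    return i1 + i2 + 2.0 * math.sqrt(i1 * i2) * gamma_mod * math.cos(phase)

def extremes(i1, i2, gamma_mod, samples=3600):
    """Maximum and minimum irradiance over one full cycle of the phase."""
    values = [irradiance(i1, i2, gamma_mod, 2.0 * math.pi * k / samples)
              for k in range(samples)]
    return max(values), min(values)

cases = [("incoherent", 0.0), ("partially coherent", 0.5), ("coherent", 1.0)]
for label, gamma_mod in cases:
    i_max, i_min = extremes(1.0, 1.0, gamma_mod)
    print(f"{label:>18}: I_max = {i_max:.3f}, I_min = {i_min:.3f}")
```

The spread between the maximum and the minimum shrinks in proportion to $|\gamma(\tau)|$ and disappears in the incoherent case, where the irradiance is simply $I_1 + I_2$.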


3.1.2 Interference and coherence

To see the effect that $\gamma(\tau)$ has on interference, let us consider a simple but illustrative example. Following Fowles [2], let us suppose there is a light source consisting of a two-level electronic system, which spontaneously emits a pulse given by

\[ E(t) = E_0\,\mathrm{rect}(t/\tau_0)\, e^{-i(2\pi\nu_0 t - \phi(t))}, \tag{3.12} \]

where the function $\mathrm{rect}(t/\tau_0)$ describes a rectangular signal as follows:

\[ \mathrm{rect}\left(\frac{t}{\tau_0}\right) = \begin{cases} 1, & \text{if } |t| \le \tau_0/2 \\ 0, & \text{otherwise.} \end{cases} \tag{3.13} \]

Hence, the pulse has cosine form with frequency $\nu_0$ and duration $\tau_0$, such that $\tau_0 > T$, where the period $T = 1/\nu_0$. The initial phase $\phi(t)$ is random within the range $-\pi < \phi < \pi$. The spontaneous character of the emission is described by the initial random phase of the pulse. Imagining a continuous emission of pulses but with random phases, the graphical representation of the phase in the emission process can be described as shown in Fig. 3.2.

Now suppose that the light thus generated is used in an optical system with which we can produce interference of two waves; such a system is an interferometer. For example, in Fig. 3.3, the diagram of a Michelson interferometer is shown. The (plane) wave coming out of the source S is called the primary wave (0). The first task of the interferometer is to generate two (secondary) waves from the primary wave. For this, a flat semi-mirror can be used, which reflects 50% of the amplitude of the primary wave and lets the other 50% of the amplitude pass. The element that does this task is called a beamsplitter (BS). In this way, the secondary waves (1) are obtained. The second task of the interferometer is to

Figure 3.2 Initial phases of pulses emitted randomly one after another by a two-level electronic transition source.


Figure 3.3 Diagram of a Michelson interferometer. A light beam (0) coming out of the source S is divided by the beamsplitter (BS) into two separate beams (1). One of the beams passes through the BS and goes to the flat mirror M1, and the other beam is reflected by the BS and goes to the flat mirror M2. The beams (2) reflected by each mirror reach the BS, resulting in two new beams (3) that overlap to produce interference.

add the two secondary waves. This is achieved by means of the two mirrors, M1 and M2, with which the direction of propagation of the secondary waves (2) is changed, directing them again toward the beamsplitter so that the reflected and transmitted waves (3) in the beamsplitter overlap. In particular, if the mirrors are orthogonal to the wave vectors of the secondary waves (with the beamsplitter at 45°) in the interference region, the unit propagation vectors $\hat{\mathbf{s}}_1$ and $\hat{\mathbf{s}}_2$ will be parallel, i.e., $\mathbf{k}_2 - \mathbf{k}_1 = 0$. The final result is the sum of two plane waves (3) with $\Delta\mathbf{k} = 0$ and with a time delay $\tau$ resulting from the difference in the optical path traveled by the two secondary waves (1, 2, 3) from point O at the beamsplitter. If the distance between O and M1 is $d_1$ and the distance between O and M2 is $d_2$, the optical path difference (in air, $n = 1$) between the waves (3) in the interference region is $2(d_2 - d_1)$. Consequently, the time lag will be $\tau = 2(d_2 - d_1)/c$. For $d_2 > d_1$, $\tau > 0$. Then, displacing M2 (or M1) axially, the desired time delay is achieved.

Let us assume that the time delay between the two waves is less than the duration of the pulse generated by the light source, $\tau < \tau_0$. The degree of coherence between the two waves depends on the difference in the initial phases of the two waves. Because a copy of a wave is made with the interferometer, the phase difference given in Fig. 3.2 would look like the one shown in Fig. 3.4. At the top of Fig. 3.4, the phase $\phi(t)$ of one of the waves is shown along with the phase of the other wave, including the delay time. The result of the subtraction is displayed at the bottom of Fig. 3.4. In the intervals of duration $(\tau_0 - \tau)$, the


Figure 3.4 Difference of the phases $\phi(t - \tau)$ and $\phi(t)$ for the time delay $\tau$.

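phases coincide and, therefore, the result is 0. In the other intervals, the result is not null, but it still has a random distribution of phases. Thus, from Eq. (3.9),

\[ \gamma(\tau) = \frac{e^{-i\omega\tau}}{\check{T}} \int_0^{\check{T}} e^{-i(\phi(t - \tau) - \phi(t))}\,dt. \tag{3.14} \]

This integral can be solved by adding the $M$ integrals over each interval of duration $\tau_0$, assuming that $\check{T} = M\tau_0$. Thus,

\[ \gamma(\tau) = \frac{e^{-i\omega\tau}}{\check{T}} \left[ \int_0^{\tau_0} e^{-i(\phi(t - \tau) - \phi(t))}\,dt + \ldots + \int_{(M-1)\tau_0}^{M\tau_0} e^{-i(\phi(t - \tau) - \phi(t))}\,dt \right], \tag{3.15} \]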

and each of these integrals, in turn, can be decomposed into two intervals: the interval of duration $(\tau_0 - \tau)$, where the phase difference is 0, and the interval of


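duration $\tau$, where the phase difference takes any value, say $\Delta_j$ for the $j$th interval. Therefore, grouping the integrals for which the phase shift is 0 and putting in another group the integrals with phase shifts $\Delta_j$,

\[ \begin{aligned} \gamma(\tau) = {} & \frac{e^{-i\omega\tau}}{\check{T}} \left[ \int_0^{\tau_0 - \tau} dt + \int_{\tau_0}^{2\tau_0 - \tau} dt + \ldots + \int_{(M-1)\tau_0}^{M\tau_0 - \tau} dt \right] \\ & + \frac{e^{-i\omega\tau}}{\check{T}} \left[ \int_{\tau_0 - \tau}^{\tau_0} e^{-i\Delta_1}\,dt + \int_{2\tau_0 - \tau}^{2\tau_0} e^{-i\Delta_2}\,dt + \ldots + \int_{M\tau_0 - \tau}^{M\tau_0} e^{-i\Delta_M}\,dt \right], \end{aligned} \tag{3.16} \]

which is equal to

\[ \gamma(\tau) = \frac{e^{-i\omega\tau}}{\check{T}} \left[ M(\tau_0 - \tau) \right] + \frac{e^{-i\omega\tau}}{\check{T}} \left[ 0 \right]. \tag{3.17} \]

The second group of integrals gives 0 because the phases $\Delta_j$ are random, so their sum (for $\check{T} > \tau_0$) is 0. Thus, the final result is

\[ \gamma(\tau) = \begin{cases} e^{-i\omega\tau}\,\dfrac{\tau_0 - \tau}{\tau_0}, & \tau < \tau_0 \\ 0, & \tau \ge \tau_0. \end{cases} \tag{3.18} \]

The modulus of the degree of coherence is

\[ |\gamma(\tau)| = 1 - \frac{\tau}{\tau_0}, \tag{3.19} \]

and the phase $\alpha(\tau) = 0$. The graphical representation is shown in Fig. 3.5. When $\tau = 0$, the correlation of the waves is maximum and the degree of coherence is equal to 1. For values of $\tau > \tau_0$, the correlation between the waves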

Figure 3.5 Modulus of the degree of coherence of a source that emits random and consecutive pulses of duration $\tau_0$.


is null; therefore, there is no interference and the irradiance is reduced to the sum of the individual irradiances of the waves, i.e., $I = I_1 + I_2$.

To see in detail the effect of $\gamma(\tau)$ on the irradiance in the observation screen of the interferometer represented in Fig. 3.3, let us consider the irradiance at the point $\{x = 0, y = 0\}$. Because the beamsplitter divides the amplitude of the wave into two equal parts, $I_1 = I_2 = I_0$. Therefore, the irradiance at $\mathbf{r} = \{0, 0\}$ is

\[ I(0, 0) = 2I_0 \left\{ 1 + \left( 1 - \frac{\tau}{\tau_0} \right) \cos\left( \frac{2\pi\tau}{T} \right) \right\}, \quad \tau > 0. \tag{3.20} \]

The condition $\tau > 0$ holds assuming $d_2 > d_1$. However, mirror 2 can also be displaced axially so that $d_2 < d_1$, leading to $\tau < 0$. With this in mind,

\[ I(0, 0) = 2I_0 \left\{ 1 + \left( 1 \pm \frac{\tau}{\tau_0} \right) \cos\left( \frac{2\pi\tau}{T} \right) \right\}, \tag{3.21} \]

where the + sign is used for $-\tau_0 < \tau < 0$ and the – sign is used for $0 < \tau < \tau_0$. In Fig. 3.6, the irradiance is shown as a function of the time delay [Eq. (3.21)] in the range $-12T < \tau < 12T$, assuming that the source emits pulses of duration $\tau_0 = 10T$. When the mirrors are at the same distance from the beamsplitter (point O, Fig. 3.3), $d_2 = d_1$, then $\tau = 0$ and the maximum value of irradiance is obtained. By moving one of the mirrors axially, the irradiance oscillates, progressively decreasing (increasing) the maximum (minimum) of the irradiance. This irradiance variation occurs according to the modulus of the degree of coherence shown in Fig. 3.5. When the time delay reaches the value of $\tau_0$, the irradiance oscillations disappear, and for $|\tau| > \tau_0$ the irradiance remains constant, i.e., $I = 2I_0$. Note that the envelope of the irradiance plot corresponds to the modulus of the degree of coherence. Thus, we have an experimental way (with a Michelson interferometer) to measure the degree of coherence of a light source.
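The results of Eqs. (3.16) to (3.21) can be checked numerically. The sketch below is only an illustration, not the procedure used in the text: it draws a random phase for each of a large number of consecutive pulses, averages the phase factor interval by interval as in Eq. (3.16), and compares the estimate with $1 - \tau/\tau_0$. It then evaluates the irradiance of Eq. (3.21) with $I_0 = 1$ and $\tau_0 = 10T$, the values used for Fig. 3.6. The function names, the number of pulses, and the random seed are assumptions made for the example.

```python
import cmath
import math
import random

def gamma_modulus_estimate(tau, tau0, num_pulses=20000, seed=1):
    """Estimate |gamma(tau)| for consecutive pulses with random phases.

    In each interval of length tau0 the phase difference is 0 during
    (tau0 - |tau|) and a random value Delta_j during |tau|, as in Eq. (3.16).
    The common factor exp(-i*omega*tau) has unit modulus and is omitted.
    """
    if abs(tau) >= tau0:
        return 0.0
    rng = random.Random(seed)
    phases = [rng.uniform(-math.pi, math.pi) for _ in range(num_pulses + 1)]
    total = 0j
    for j in range(1, num_pulses + 1):
        delta_j = phases[j - 1] - phases[j]  # mismatch between adjacent pulses
        total += (tau0 - abs(tau)) + abs(tau) * cmath.exp(-1j * delta_j)
    return abs(total) / (num_pulses * tau0)

def irradiance(tau, tau0, period, i0=1.0):
    """Eq. (3.21) written with |tau|; equals 2*i0 for |tau| >= tau0."""
    envelope = max(0.0, 1.0 - abs(tau) / tau0)
    return 2.0 * i0 * (1.0 + envelope * math.cos(2.0 * math.pi * tau / period))

T = 1.0
TAU0 = 10.0 * T
for tau in (0.0, 3.0, 7.0, 12.0):
    estimate = gamma_modulus_estimate(tau, TAU0)
    model = max(0.0, 1.0 - tau / TAU0)
    print(f"tau = {tau:4.1f} T: |gamma| ~ {estimate:.3f} (model {model:.3f}), "
          f"I = {irradiance(tau, TAU0, T):.3f}")
```

The random contribution to the estimate decreases roughly as the inverse square root of the number of pulses, which is the numerical counterpart of the statement that the second group of integrals in Eq. (3.16) averages to 0.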

Figure 3.6 Irradiance as a function of delay time $\tau$.


A complete oscillation of the irradiance occurs every time the optical path difference between the waves changes by l. In other words, every time the argument of the cosine function changes by 2p, a maximum (minimum) is obtained. Setting the argument of the cosine function equal to 2p, i.e., 2pDt/T ¼ 2p(2Dd)/(cT) ¼ 2p, Dd ¼

l 2

(3.22)

is the spatial increase in the separation of the mirrors corresponding to two consecutive maxima or minima.

3.1.3 Coherence length

From the above, there would be interference (irradiance modulation) if the delay time were less than $\tau_0$. This is also equivalent to saying that there would be interference if the path difference between the waves in the interference region were less than $l_c = c\tau_0$. This length is called the coherence length and is often used to characterize the (temporal) coherence of a light source. With the Michelson interferometer, we have a way to measure the coherence length. It can also be done in another way, which depends on the spectral width (bandwidth) of the light source. To see this, let us return to the pulse example from Eq. (3.12). Although the cosine vibration within the rect function [Eq. (3.13)] has a specific frequency, $\nu_0$, the pulse spectrum is not a Dirac delta located at $\nu_0$, but rather a distribution of frequencies centered at $\nu_0$ due to the finite duration of the pulse. A signal is strictly monochromatic if it is described by a cosine function of infinite duration. To evaluate the spectral content of the pulse, Fourier analysis can be used. For a particular pulse, the function $\phi(t)$, which measures the initial phase, will have a fixed value. Omitting this term, the Fourier transform $\tilde{E}(\nu) = \mathcal{F}[E(t)]$ is

$$\tilde{E}(\nu) = E_0\int_{-\infty}^{\infty}\mathrm{rect}(t/\tau_0)\,e^{-i(2\pi\nu_0 t)}\,e^{i2\pi\nu t}\,dt = E_0\left[\int_{-\infty}^{\infty}\mathrm{rect}(t/\tau_0)\,e^{i2\pi\nu t}\,dt\right] * \left[\int_{-\infty}^{\infty}e^{-i(2\pi\nu_0 t)}\,e^{i2\pi\nu t}\,dt\right], \tag{3.23}$$

where the symbol $*$ denotes the convolution operation. Solving the two integrals, then

$$\tilde{E}(\nu) = E_0\,[\tau_0\,\mathrm{sinc}(\pi\nu\tau_0)] * [\delta(\nu - \nu_0)], \tag{3.24}$$

where the function $\mathrm{sinc}(x) = \sin(x)/x$. The delta function centers the function $\mathrm{sinc}(\pi\nu\tau_0)$ at $\nu = \nu_0$. The graph of Eq. (3.24) is shown in Fig. 3.7. The spectral width of the function $\mathrm{sinc}(\pi\nu\tau_0)$ can be defined as half the spectral separation between its leading zeros closest to $\nu_0$. Let $\Delta\nu$ be this spectral width; the leading zeros are then located at $\nu = \nu_0 \pm \Delta\nu$, and, measured from the center of the spectrum, their positions satisfy $\pi[(\nu_0 + \Delta\nu) - \nu_0]\tau_0 = \pi$ and $\pi[(\nu_0 - \Delta\nu) - \nu_0]\tau_0 = -\pi$. Subtracting them leads to

$$\Delta\nu = \frac{1}{\tau_0}. \tag{3.25}$$

Figure 3.7 Spectrum of the pulse given by Eq. (3.12). The spectral width $\Delta\nu$ is half the spectral separation of the leading zeros of the function $\mathrm{sinc}(\pi\nu\tau_0)$.

Thus, the inverse of the spectral width is a measure of the pulse duration time, which, in turn, determines the coherence length. Therefore, the coherence length of a light source having spectral width $\Delta\nu$ is given by

$$l_c = \frac{c}{\Delta\nu}. \tag{3.26}$$

Because $\nu = c/\lambda$, taking the differentials of $\nu$ and $\lambda$, $\Delta\nu = -\nu\Delta\lambda/\lambda$, leads to an alternative form for the coherence length given by

$$l_c = \frac{\lambda^2}{|\Delta\lambda|}, \tag{3.27}$$

where $\lambda$ is the central value of the wavelength in the spectral range of width $\Delta\lambda$. Considering the classical model for an illumination source, the spectral width of the line (for electronic oscillators) is a universal constant [3],

$$\Delta\lambda = 1.2\times 10^{-5}\ \text{nm}. \tag{3.28}$$

With this value, the coherence length for a line of the visible spectrum varies as 13 m < $l_c$ < 41 m (for 400 nm < $\lambda_0$ < 700 nm). In contrast, for white light, between 400 nm (violet) and 700 nm (red), the coherence length would be approximately $l_c = 1$ μm, with $\lambda = 550$ nm and $\Delta\lambda = 300$ nm. Other sources, such as mercury arc lamps, have coherence lengths of the order of 3 cm, and Kr discharge lamps have coherence lengths of the order of 30 cm. Lasers have long coherence lengths; e.g., a stabilized He-Ne laser can have a coherence length of 300 m.
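The numerical values quoted for the coherence length follow directly from Eq. (3.27). A minimal Python sketch, assuming wavelengths expressed in nanometers (the function and constant names are chosen only for this example):

```python
def coherence_length_nm(wavelength_nm, linewidth_nm):
    """Coherence length l_c = lambda^2 / |delta lambda| of Eq. (3.27), in nm."""
    return wavelength_nm ** 2 / abs(linewidth_nm)


# Classical natural linewidth of Eq. (3.28), in nm.
NATURAL_LINEWIDTH_NM = 1.2e-5

if __name__ == "__main__":
    for wavelength in (400.0, 700.0):
        lc_m = coherence_length_nm(wavelength, NATURAL_LINEWIDTH_NM) * 1e-9
        print(f"spectral line at {wavelength:.0f} nm: l_c = {lc_m:.1f} m")

    # White light: central wavelength 550 nm, width 300 nm.
    lc_um = coherence_length_nm(550.0, 300.0) * 1e-3
    print(f"white light: l_c = {lc_um:.2f} micrometers")
```

The same function can be used with the linewidth of any other source, provided both arguments are given in the same unit.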

3.2 Interference of Two Plane Waves

In the previous section, it is shown that the expression for the interference of two plane waves of angular frequency $\omega = 2\pi\nu_0$ is given by

$$I(\mathbf{r}) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,|\gamma(\tau)|\cos[\Delta\mathbf{k}\cdot\mathbf{r} + \omega\tau + \alpha(\tau)],$$

where $|\gamma(\tau)|$ is the modulus of the degree of coherence, $\Delta\mathbf{k} = \mathbf{k}_2 - \mathbf{k}_1$ is the difference of the wave vectors at the observation point $\mathbf{r}$, $\tau$ is the delay time of the waves with respect to a reference point or plane (point O in the beamsplitter), and $\alpha(\tau)$ is the phase (together with $\omega\tau$) of the degree of coherence. Taking the two waves from the same source, $\alpha(\tau) = 0$.

In practice, to observe the interference of two plane waves, it is usual to expand the beam (increase its cross section) that reaches the beamsplitter. This can be done by focusing light at a point S, which is made to coincide with the primary focal point of an aberration-corrected lens. This lens is called a collimating lens (CL), as shown in Fig. 3.8. After the lens, there will be planar wavefronts orthogonal to the optical axis of the lens, which will be taken as the optical axis of the interferometer. Any of these wavefronts can be taken as the reference plane; here, let us take the wavefront arriving at point O of the beamsplitter as the reference plane. Once the beams are split, the path differences from O can be measured.

Figure 3.8 Plane waves interference with a Michelson interferometer. A collimating lens (CL) converts the divergent spherical wave from the point light source S into a plane wave. When the mirrors are orthogonal to the wave vectors, there is a uniformly illuminated region on the screen that changes in irradiance level as one of the mirrors moves axially.

Let us suppose that the illumination source has a degree of coherence that remains constant for the displacements of the mirrors considered, $|\gamma(\tau)| = |\gamma|$. By setting the plane mirrors M1 and M2 orthogonal to the wave vectors of the beams transmitted and reflected by the beamsplitter, the interference in the observation plane would be given by

$$I(x, y) = I_1 + I_2 + 2|\gamma|\sqrt{I_1 I_2}\cos(\omega\tau). \tag{3.29}$$

For $\tau$ fixed, $I(x, y)$ does not depend on $x$ or $y$, so in the observation plane there would be a uniform irradiance distribution, as shown in Fig. 3.8. By axially displacing one of the mirrors, the irradiance level on the observation screen changes, obtaining maximum values at $(d_2 - d_1) = m\lambda/2$, where $m$ is an integer. Figure 3.9 shows three examples of the variation of the irradiance level (with $I_1 = I_2 = I_0$) in the observation plane as a function of the relative axial displacement between the mirrors M1 and M2, for three degrees of coherence: $|\gamma| = 1$, $1/2$, and $0$. By observing the variation of irradiance as a function of $(d_2 - d_1)$, $|\gamma|$ can be measured directly. In particular, the maximum and minimum irradiances can be measured, which in turn are given by

$$I_{\max} = I_1 + I_2 + 2|\gamma|\sqrt{I_1 I_2}, \tag{3.30}$$

$$I_{\min} = I_1 + I_2 - 2|\gamma|\sqrt{I_1 I_2}, \tag{3.31}$$

according to Eq. (3.29). By defining the visibility or contrast of irradiance modulation as

$$C = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}, \tag{3.32}$$

the modulus of the degree of coherence is obtained by substituting Eqs. (3.30) and (3.31) in Eq. (3.32); i.e.,

$$|\gamma| = \frac{(I_1 + I_2)}{2\sqrt{I_1 I_2}}\,C. \tag{3.33}$$

Figure 3.9 Modulation of the irradiance for different degrees of coherence in the interferometer shown in Fig. 3.8 as a function of the difference in position of the mirrors with respect to the point O of the beamsplitter.

Thus, from the irradiance measurements, the modulus of the degree of coherence is determined. The irradiance of each wave and the contrast are measured individually. In particular, if the beamsplitter reflects 50% and transmits 50%, then $I_1 = I_2$ and $|\gamma| = C$. In this case, the contrast is a direct measure of the modulus of the degree of coherence.

3.2.1 Interference with inclined plane waves

In what follows, let us assume that $|\gamma(\tau)| = 1$, e.g., by illuminating with a laser whose coherence length is much greater than the axial displacements of the mirrors. If the mirrors M1 and M2 rotate through an angle $\alpha/2$, as shown in Fig. 3.10, each beam in the interference region would be inclined by an angle $\alpha$ with respect to the optical (vertical) axis. In each mirror, the rotation is performed with respect to the point of intersection of the mirror with the optical axis, i.e., $O_1$ in mirror M1 and $O_2$ in mirror M2. Therefore, the angle between the two beams in the interference region is $2\alpha$. At the place of overlap there will be a pattern of straight fringes (interference pattern) whose separation will arise from the angle of inclination. To determine the geometry of the fringes, let us first assume that the mirror distances $d_1 = \overline{OO_1}$ and $d_2 = \overline{OO_2}$ are equal. This implies that there will not be a time delay between waves along the optical axis. The intersection of the optical axis with the viewing screen is taken as the coordinate origin for the interferogram ($x = 0$, $y = 0$).

Figure 3.10 Interference with inclined plane waves. On the observation screen there will be a pattern of fringes. The separation of the fringes depends on the angle of inclination of the mirrors.

Now the expression for the interference is

$$I(\mathbf{r}) = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos(\Delta\mathbf{k}\cdot\mathbf{r}). \tag{3.34}$$

Assuming that the observation screen is in the $z = 0$ plane, according to the geometry of Fig. 3.10, $\mathbf{k}_1 = (2\pi/\lambda)\{s_{1x}, 0, s_{1z}\}$ and $\mathbf{k}_2 = (2\pi/\lambda)\{s_{2x}, 0, s_{2z}\}$, with $s_{1z} = s_{2z}$ and $s_{1x} = -s_{2x}$. On the other hand, $\mathbf{r} = \{x, y, 0\}$. Thus, $\Delta\mathbf{k}\cdot\mathbf{r} = (2\pi/\lambda)(s_{2x} - s_{1x})x$ and because $s_{2x} = \sin\alpha$,

$$\Delta\mathbf{k}\cdot\mathbf{r} = \frac{2\pi}{\lambda}(2x\sin\alpha). \tag{3.35}$$

This indicates that on the observation screen there will be a modulation of the irradiance in the $x$ direction, according to

$$I(x) = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\left(\frac{4\pi}{\lambda}x\sin\alpha\right). \tag{3.36}$$

A simulated interference pattern for two plane waves is shown in Fig. 3.11 when $\alpha = 0.01°$, $\lambda = 632.8$ nm, and $I_1 = I_2$. The region where the interference pattern is observed corresponds to ($-5 < x < 5$) mm and ($-5 < y < 5$) mm. The separation $\Delta x$ between two consecutive maxima or minima of irradiance occurs every time $(4\pi/\lambda)x\sin\alpha$ changes by $2\pi$; i.e.,

$$\Delta x = \frac{\lambda}{2\sin\alpha}. \tag{3.37}$$

Figure 3.11 Simulation of an interference pattern generated by two inclined plane waves when $|\gamma(\tau)| = 1$ ($\alpha = 0.01°$ and $\lambda = 632.8$ nm). The scale of the axes is in millimeters.


Note that to have a separation between fringes equal to 1.81 mm, in the example of Fig. 3.11, the angle of rotation of the mirrors is very small: $\alpha/2 = 0.005°$.

3.2.2 Displacement of interference fringes

When one of the mirrors in the interferometer shown in Fig. 3.8 is moved axially, there is a change in the level of irradiance on the observation screen. There is no spatial modulation of irradiance (along the screen); the modulation is axial, i.e., temporal. If in the interferometer shown in Fig. 3.10 one of the mirrors moves axially ($d_2 - d_1 \neq 0$), at each point of the observation screen there is also a change in the level of irradiance, but the net effect observed is a transversal displacement of the fringes in the $x$ direction. The axial displacement of one of the mirrors introduces a time delay $\tau = 2(d_2 - d_1)/c$ between the waves (with respect to point O), and the irradiance is given by

$$I(x) = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\left[\frac{4\pi}{\lambda}x\sin\alpha + \frac{4\pi}{\lambda}(d_2 - d_1)\right]. \tag{3.38}$$

By bringing the mirror M1 closer to point O, so that $d_2 - d_1 > 0$, the value of the argument of the cosine function increases and, to maintain the initial value, $x$ must take a negative value; thus, the interference fringes shift to the left. In Fig. 3.12, three interferograms with $\alpha = 0.01°$, $\lambda = 632.8$ nm, and $I_1 = I_2$ are simulated when the axial displacement of the mirror M1 leads to the optical path difference $d_2 - d_1 = 0$, $\lambda/8$, and $\lambda/4$. Moving mirror M1 away from point O, so that $d_2 - d_1 < 0$, the interference fringes will shift to the right.
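The fringe spacing of Eq. (3.37) and the lateral shift implied by Eq. (3.38) can be checked numerically. In the sketch below, the shift is obtained by requiring the argument of the cosine in Eq. (3.38) to keep its value, which gives $x \to x - (d_2 - d_1)/\sin\alpha$; the function names are illustrative assumptions.

```python
import math


def fringe_spacing(wavelength, alpha_rad):
    """Separation between consecutive maxima, Eq. (3.37)."""
    return wavelength / (2.0 * math.sin(alpha_rad))


def fringe_shift(mirror_offset, alpha_rad):
    """Lateral shift of every fringe for d2 - d1 = mirror_offset.

    Obtained from Eq. (3.38) by keeping the cosine argument constant.
    A negative result means a shift toward negative x (to the left).
    """
    return -mirror_offset / math.sin(alpha_rad)


if __name__ == "__main__":
    LAM = 632.8e-6              # wavelength in mm
    ALPHA = math.radians(0.01)  # beam inclination used in Fig. 3.11

    dx = fringe_spacing(LAM, ALPHA)
    print(f"fringe spacing: {dx:.3f} mm")
    for label, offset in (("0", 0.0), ("lambda/8", LAM / 8), ("lambda/4", LAM / 4)):
        shift = fringe_shift(offset, ALPHA)
        print(f"d2 - d1 = {label:9s} shift = {shift:+.3f} mm "
              f"({shift / dx:+.2f} fringe spacings)")
```

In this sketch a displacement of $\lambda/4$ moves the pattern by half a fringe spacing, so bright and dark fringes exchange places.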

Figure 3.12 Transversal displacement of interference fringes as a function of the axial displacement of the mirror M1 (approaching the beamsplitter). The optical path difference in the images is $d_2 - d_1 = 0$, $\lambda/8$, and $\lambda/4$. The scale of the axes is in millimeters.


3.2.3 Interferogram visibility

The modulation contrast of irradiance in the axial direction was defined by Eq. (3.32). This quantity, based on the irradiances, allows the degree of coherence to be measured. The visibility of the interference fringe pattern can also be calculated from Eq. (3.32), where $I_{\max}$ and $I_{\min}$ are the maximum and minimum irradiance values of the fringe pattern. The change in visibility depends on the degree of coherence and the relationship between the intensities of the two waves; e.g., if $|\gamma| = 1$, then the visibility

$$C = \frac{2\sqrt{I_1 I_2}}{(I_1 + I_2)} \tag{3.39}$$

only depends on the ratio between $I_1$ and $I_2$. In Fig. 3.13, three interferograms are shown along with their profiles in the $x$ direction when the wave amplitudes are $E_1 = E_0$ and $E_2 = E_0$, $E_1 = E_0$ and $E_2 = 0.4E_0$, and $E_1 = E_0$ and $E_2 = 0.1E_0$. In the first case, $I_{\max} = 4I_0$, $I_{\min} = 0$, and $C = 1$; in the second case, $I_{\max} = 1.96I_0$, $I_{\min} = 0.36I_0$, and $C = 0.69$; and in the third case, $I_{\max} = 1.21I_0$, $I_{\min} = 0.81I_0$, and $C = 0.20$. Note that the irradiance oscillates spatially around the mean value $I_1 + I_2$, which in the first case is $2I_0$, in the second case is $1.16I_0$, and in the third case is $1.01I_0$.
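The three cases above can be reproduced with a few lines of Python. The sketch below assumes irradiances proportional to the square of the amplitudes (in units of $I_0$) and uses Eqs. (3.30) to (3.33); the function names are chosen only for this illustration.

```python
import math


def extreme_irradiances(i1, i2, gamma=1.0):
    """Return (I_max, I_min) from Eqs. (3.30) and (3.31)."""
    cross = 2.0 * gamma * math.sqrt(i1 * i2)
    return i1 + i2 + cross, i1 + i2 - cross


def visibility(i_max, i_min):
    """Contrast C of Eq. (3.32)."""
    return (i_max - i_min) / (i_max + i_min)


def coherence_modulus(contrast, i1, i2):
    """Modulus of the degree of coherence from Eq. (3.33)."""
    return contrast * (i1 + i2) / (2.0 * math.sqrt(i1 * i2))


if __name__ == "__main__":
    for e2 in (1.0, 0.4, 0.1):      # amplitude E2 in units of E0, with E1 = E0
        i1, i2 = 1.0, e2 ** 2       # irradiances in units of I0
        i_max, i_min = extreme_irradiances(i1, i2)
        c = visibility(i_max, i_min)
        print(f"E2 = {e2:.1f} E0: I_max = {i_max:.2f}, I_min = {i_min:.2f}, "
              f"C = {c:.2f}, |gamma| = {coherence_modulus(c, i1, i2):.2f}")
```

Applying `coherence_modulus` to the measured contrast returns $|\gamma| = 1$ in all three cases, which illustrates that the loss of visibility here is due only to the unequal irradiances.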

Figure 3.13 Interferograms when the visibility of the fringes depends on the ratio of the wave amplitudes: $C = 1$ ($E_1 = E_0$ and $E_2 = E_0$), $C = 0.69$ ($E_1 = E_0$ and $E_2 = 0.4E_0$), and $C = 0.20$ ($E_1 = E_0$ and $E_2 = 0.1E_0$). The modulus of the degree of coherence is set to $|\gamma| = 1$. The scale of the axes is in millimeters.


Thus, when $I_1 = I_2 = I_0$ and $|\gamma| = 1$, Eq. (3.36) reduces to

$$I = 4I_0\cos^2\left(\frac{2\pi}{\lambda}x\sin\alpha\right), \tag{3.40}$$

from where $I_{\max} = 4I_0$ and $I_{\min} = 0$, and the visibility is $C = 1$. If $I_1 \neq I_2$, the visibility decreases, and it becomes zero if $I_1 = 0$ or if $I_2 = 0$. If in addition $|\gamma| < 1$, then there is a greater decrease in the amplitude of the spatial oscillation of the fringes (maintaining the mean value). In other words, the width of the profiles shown in Fig. 3.13 decreases. In practice, one of the interferometer mirrors is fixed and aligned, while the other can be moved and tilted using precision screws. The analysis carried out for the formation of the interference fringes is still valid, obtaining the same results.

3.3 Interference of Two Spherical Waves

A light source is considered to be point-like if its apparent size is insignificant relative to the distance at which the signal is detected. The detected wavefront will be seen as a spherical wavefront whose radius is equal to the distance between the source and the point of observation. In the previous section, with the Michelson interferometer shown in Fig. 3.8, two plane waves are generated. This is because the point source is located at the primary focal point of the CL. If the point source is displaced axially from the primary focal point of the CL, the wavefront refracted by the lens would be spherical and the radius of curvature will be equal to the distance between the conjugate of the point source and the point at which the wavefront is measured. Once the spherical wave is divided in the beamsplitter, the mirrors M1 and M2 generate two virtual images (secondary point sources) of the conjugate of the point source, and in the interference region we would have the superposition of two spherical waves. As in Section 3.2.1, here we also study the interference phenomena when the mirrors are orthogonal to the optical axis of the interferometer and when the mirrors are tilted at a small angle.

Before dealing with the two interference cases of greatest interest, let us describe the geometry of the problem according to Fig. 3.14. Suppose there are two point sources in an isotropic and homogeneous medium (air, $n = 1$) separated by a distance $a$, located at $z = -a/2$ and $z = a/2$, and the irradiance at a point P located at $\mathbf{r} = \{x, y, z\}$ is wanted. If the two sources are actually images of a primary source, which emits spherical waves of angular frequency $\omega = 2\pi\nu_0$, the initial phases will be equal and the fields of the spherical waves emitted by S1 and S2 at point P can be written as

$$E_1(s_1, t) = \frac{E_{01}^{\dagger}}{s_1}\,e^{i(ks_1 - \omega t)} \tag{3.41}$$

and

$$E_2(s_2, t) = \frac{E_{02}^{\dagger}}{s_2}\,e^{i(ks_2 - \omega t)}, \tag{3.42}$$

where $s_1 = [\rho^2 + (a/2 - z)^2]^{1/2}$ and $s_2 = [\rho^2 + (a/2 + z)^2]^{1/2}$, where in turn, $\rho = [x^2 + y^2]^{1/2}$ is the radial coordinate of the projection of point P on the $xy$ plane. (Here $E_0^{\dagger}$ is the amplitude of the field multiplied by the unit length. Thus, the amplitude of the optical field in a spherical wavefront of radius $r$ would be $E_0 = E_0^{\dagger}/r$.) The scalar form of these equations implies that at P, the fields have the same polarization state. If $|\gamma(\tau)| = 1$, the irradiance at P due to the superposition of the two waves would be

$$I(\mathbf{r}) = \frac{\epsilon_0 c}{2}(E_1 + E_2)(E_1^* + E_2^*). \tag{3.43}$$

Figure 3.14 Geometry to describe the interference of two spherical waves emitted by point sources S1 and S2.

The individual irradiances at P will be $I_1 = \epsilon_0 c\,(E_{01}^{\dagger}/s_1)^2/2$ and $I_2 = \epsilon_0 c\,(E_{02}^{\dagger}/s_2)^2/2$. With this in mind, Eq. (3.43) becomes

$$I(\mathbf{r}) = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\left[\frac{2\pi}{\lambda}(s_2 - s_1)\right]. \tag{3.44}$$

In this situation, $(s_2 - s_1)/c$ measures the delay time of the waves arriving at P. When the argument of the cosine function takes a constant value $q$, then $2\pi(s_2 - s_1)/\lambda = q$ describes surfaces where the irradiance is constant. Specifically, if $2\pi(s_2 - s_1)/\lambda = 2\pi m$ ($m = 0, \pm 1, \pm 2, \ldots$), the condition of surfaces of maximum irradiance is fulfilled. These surfaces are hyperboloids of revolution defined by

$$(s_2 - s_1) = m\lambda, \tag{3.45}$$

where the point sources are the focal points of the hyperbolas.


Figure 3.15 Some maximum irradiance curves produced by point sources located at $z = -0.25$ mm and $z = 0.25$ mm for $m$ = 0, 50, 250, 450, 650, 750, 780, 787, and 790, with $\lambda = 632.8$ nm.

In Fig. 3.15, some of the hyperbolas resulting from the intersection of the hyperboloids with a meridional plane (a plane containing the $z$ axis), when the two point sources are separated by $a = 0.5$ mm and $\lambda = 632.8$ nm, are shown for $m$ = 0, 50, 250, 450, 650, 750, 780, 787, and 790. Along these curves, the irradiance has a maximum value. The number of curves along which the maximum irradiance is obtained is determined by the nearest integer to the quotient $a/\lambda$, i.e., $m_{\max} \equiv a/\lambda$, which in our example is 790. The surface for $m = 0$ is the plane $z = 0$ (which is in the middle of the two sources). In this case, the optical path difference is zero for any point of coordinates $\{x, y, 0\}$. Let us see in detail the interference in planes parallel to the $z = 0$ plane and in planes parallel to the $y = 0$ plane. In the first case, due to the symmetry of revolution around the $z$ axis, the maximum irradiance curves are circles. To determine the radii of these circles, let us explicitly write Eq. (3.45) and solve for $s_2$:

$$\sqrt{\rho^2 + (a/2 + z)^2} = m\lambda + \sqrt{\rho^2 + (a/2 - z)^2}. \tag{3.46}$$

Squaring and simplifying, the radius of the circle for a given $m$ in the $z = z_0$ plane is given by

$$\rho_m = \sqrt{\left(\frac{2az_0 - m^2\lambda^2}{2m\lambda}\right)^2 - \left(\frac{a}{2} - z_0\right)^2}. \tag{3.47}$$

In Fig. 3.16, the circular interference fringes and the fringe radius variation are shown for the example of two point sources separated by $a = 0.5$ mm (located at $z = -a/2$ and $z = a/2$, with $\lambda = 632.8$ nm) in the $z = 100$ mm plane. Moving away from the center, the fringes get closer to each other.
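As a numerical check of Eq. (3.47), the following sketch computes the radii of the innermost bright rings for the example above ($a = 0.5$ mm, $z_0 = 100$ mm, $\lambda = 632.8$ nm). The function name is an assumption made for this example, and the formula is used only for $m \geq 1$, since it is singular for $m = 0$.

```python
import math


def ring_radius_exact(m, a, z0, wavelength):
    """Radius of the bright ring of order m in the plane z = z0, Eq. (3.47).

    All lengths share the same unit; m must be a positive integer.
    Returns None when no ring of that order reaches the plane.
    """
    s1 = (2.0 * a * z0 - (m * wavelength) ** 2) / (2.0 * m * wavelength)
    rho_squared = s1 ** 2 - (a / 2.0 - z0) ** 2
    return math.sqrt(rho_squared) if rho_squared >= 0.0 else None


if __name__ == "__main__":
    LAM = 632.8e-6      # wavelength in mm
    A, Z0 = 0.5, 100.0  # source separation and screen position in mm
    m_max = math.floor(A / LAM)
    print(f"m_max = {m_max}")

    previous = None
    for m in range(m_max, m_max - 6, -1):
        rho = ring_radius_exact(m, A, Z0, LAM)
        gap = "" if previous is None else f"   gap to previous ring = {rho - previous:.2f} mm"
        print(f"m = {m}: rho = {rho:.2f} mm{gap}")
        previous = rho
```

The printed gaps shrink as the order decreases, which is the numerical counterpart of the statement that the fringes get closer to each other away from the center.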


Figure 3.16 Circular interference fringes in the $z = 100$ mm plane with the center at $\{x, y\} = \{0, 0\}$, generated by two point sources separated by $a = 0.5$ mm and located at $z = -a/2$ and $z = a/2$, when the wavelength is $\lambda = 632.8$ nm.

In the second case, i.e., for a plane parallel to the plane $y = 0$, the intersections of the hyperboloids are open curves (hyperbolas). The position of the curves of maximum irradiance along the $z$ direction as a function of $m$ is obtained by making $x = 0$ and $y = y_0$ in Eq. (3.46), i.e.,

$$z_m = \frac{m\lambda}{2}\sqrt{\frac{4y_0^2}{a^2 - m^2\lambda^2} + 1}. \tag{3.48}$$

In Fig. 3.17 some interferograms obtained in different planes parallel to the plane $y = 0$ are shown for different separations between the sources, in an observation region of 80 mm × 80 mm centered at $z = 0$ and $x = 0$. The first interferogram, in the plane $y = 10$ mm, has $m_{\max} = 1$, so $z_1 \to \infty$. In other words, there is only one interference fringe, the central fringe ($m = 0$); the rest is an irradiance distribution that extends to infinity on each side. The second interferogram, also in the $y = 10$ mm plane, has $m_{\max} = 8$ and all the interference fringes that can be observed in this situation are shown: 7 interference fringes on each side of the central fringe ($m = \pm 1, \pm 2, \pm 3, \pm 4, \pm 5, \pm 6$, and $\pm 7$, as they move away from the central band). The shape of the fringes is hyperbolic. In the third interferogram, the viewing plane moves away to $y = 50$ mm, so there are fewer interference fringes. In the fourth interferogram, the viewing plane moves even farther, to $y = 100$ mm, again decreasing the number of fringes; now the fringes appear straight and equally spaced. The last interferogram, collected at $y = 200$ mm, has $m_{\max} = 32$. The interference fringes in the observation region are straight and equally spaced. This last statement can be supported by the fact that when the maximum number of fringes $m$ in the observation region satisfies the relation $m^2 \ll m_{\max}^2$, then $a^2 \gg m^2\lambda^2$ and Eq. (3.48) can be approximated by $z_m = m\lambda y_0/a$. Therefore, the fringes in that region are evenly spaced with a separation equal to

$$\Delta z = \frac{\lambda y_0}{a}. \tag{3.49}$$

Figure 3.17 Interference fringes in planes parallel to the $y = 0$ plane located at $y_0$ = 10, 50, 100, and 200 mm, generated by two point sources separated by $a = \lambda$, $8\lambda$, and $32\lambda$, with $\lambda = 632.8$ nm. The scale of the axes is in millimeters.
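To see how good the evenly spaced approximation is, the sketch below compares Eq. (3.48) with $z_m = m\lambda y_0/a$ for the last interferogram ($a = 32\lambda$, $y_0 = 200$ mm). The function names are illustrative assumptions; orders with $m\lambda \geq a$ are reported as infinite, matching the limit $z_1 \to \infty$ mentioned for $a = \lambda$.

```python
import math


def fringe_position_exact(m, a, y0, wavelength):
    """Position z_m of the bright fringe of order m in the plane y = y0, Eq. (3.48)."""
    denominator = a ** 2 - (m * wavelength) ** 2
    if denominator <= 0.0:
        return math.inf  # the hyperbola never reaches the plane
    return 0.5 * m * wavelength * math.sqrt(4.0 * y0 ** 2 / denominator + 1.0)


def fringe_position_approx(m, a, y0, wavelength):
    """Evenly spaced approximation z_m = m * lambda * y0 / a."""
    return m * wavelength * y0 / a


if __name__ == "__main__":
    LAM = 632.8e-6  # wavelength in mm
    A = 32 * LAM    # separation of the sources
    Y0 = 200.0      # observation plane in mm
    for m in range(1, 6):
        exact = fringe_position_exact(m, A, Y0, LAM)
        approx = fringe_position_approx(m, A, Y0, LAM)
        print(f"m = {m}: exact = {exact:.2f} mm, approx = {approx:.2f} mm, "
              f"relative difference = {100 * (exact - approx) / exact:.2f} %")
```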

Going back to the last interferogram, the maximum number $m$ of fringes in the observation screen is 5 and, in fact, $25 \ll 1024$ holds.

The two situations corresponding to Figs. 3.16 and 3.17 are obtained in the Michelson interferometer when illuminated by a spherical wave: the first with the configuration shown in Fig. 3.8 and the second with the configuration shown in Fig. 3.10. These two cases are discussed below.

3.3.1 Circular fringes with the Michelson interferometer

If in the interferometer shown in Fig. 3.8 the CL is moved axially or the lens is simply removed, a spherical wavefront arrives at the beamsplitter, as shown in Fig. 3.18. Mirrors M1 and M2 will form virtual images of the point source from which the wavefront diverges. These virtual images constitute two secondary point sources S1 and S2. Looking from the observation screen toward the beamsplitter, two secondary point sources can be seen, one after the other, along the optical axis of the interferometer, separated by a distance $a = 2(d_2 - d_1)$, where $d_2 = \overline{OO_2}$ and $d_1 = \overline{OO_1}$. The radii of curvature of the wavefronts emitted by S2 and S1, on the observation screen, are $R_2 = 2d_2 + \overline{OS} + \overline{OO_P}$ and $R_1 = 2d_1 + \overline{OS} + \overline{OO_P}$. $O_P$ is the intersection of the optical axis of the interferometer with the observation screen ($x = 0$ and $y = 0$). Keeping the origin of coordinates in the middle of the two secondary sources (as in Fig. 3.14), $z_0 = (R_1 + R_2)/2$. Consequently, $R_1 = z_0 - a/2$ and $R_2 = z_0 + a/2$, and $\Delta R = R_2 - R_1 = a$. In what follows, let us assume that $R_2 > R_1$. The interference of the two spherical waves on the observation screen would look as shown in Fig. 3.16.

Note that S2 is the virtual image of S generated by mirror M2, but S1 is the reflection in the beamsplitter of the virtual image generated by mirror M1 of S. This virtual image is to the right of mirror M1, but to an observer on the observation screen, the two secondary sources are aligned.

Figure 3.18 A Michelson interferometer illuminated by a spherical wave. On the observation screen, along the optical axis of the interferometer, there is a pattern of circular fringes when the separation between the secondary sources S1 and S2 is $a \neq 0$.

Let us now consider what happens to the interference fringes if one of the mirrors is displaced axially, say mirror M2. To see this, suppose we shift mirror M2 so that the separation between the secondary sources is $a = 500\lambda$, $400\lambda$, $300\lambda$, $200\lambda$, $100\lambda$, and $0$ (with $\lambda = 632.8$ nm). Let us keep the observation screen at $z_0 = 100$ mm and the observation region centered on the $z$ axis with dimensions 33 mm × 33 mm. The radii of the circles of maximum irradiance are shown in Fig. 3.19. The maximum value of $m$, $m_{\max} = a/\lambda$, is taken as a reference point; in each of the interferograms considered, it corresponds to a maximum of irradiance at $\{0, 0, z_0\}$. The first ring of irradiance (circular fringe) is given by $m_{\max} - 1$, and each curve gives the radius of the corresponding circle of maximum irradiance (which from now on is taken as the radius of the ring). The second irradiance ring occurs for $m_{\max} - 2$, and so on. The interference patterns for $a = 500\lambda$, $400\lambda$, $300\lambda$, $200\lambda$, and $100\lambda$ are shown in Fig. 3.20. For $a = 0$, the region will be completely illuminated, i.e., there are no interference rings, which is illustrated by Fig. 3.19 with the vertical line labeled with 0.

Figure 3.19 Radii of the circles of maximum irradiance as a function of the number $m$ with respect to $m_{\max}$ for different values of the separation ($a = 500\lambda$, $400\lambda$, $300\lambda$, $200\lambda$, $100\lambda$, and $0$) of the secondary point sources in the interferometer shown in Fig. 3.18.

Figure 3.20 Interference patterns for different separations ($a = 500\lambda$, $400\lambda$, $300\lambda$, $200\lambda$, and $100\lambda$; $\lambda = 632.8$ nm) in the interferometer shown in Fig. 3.18 when the observation screen is at $z_0 = 100$ mm. The scale of the axes is in millimeters.

Let us consider the position of the first ring in each interferogram. If initially the position of mirror M2 with respect to mirror M1 gives a separation $a = 300\lambda$, the radius of the first ring is $\rho_{m_{\max}-1} = 8.18$ mm. By displacing the mirror M2, approaching the beamsplitter, so that $a = 200\lambda$, the radius of the first ring increases to $\rho_{m_{\max}-1} = 10.04$ mm. And when the mirror M2 moves away from the beamsplitter such that $a = 400\lambda$, the radius of the first ring decreases to $\rho_{m_{\max}-1} = 7.08$ mm. Thus, if $R_2 > R_1$, displacing the mirror M2 so that the separation $a$ decreases leads to an increase in the radius of the rings and therefore fewer rings in the region of observation. On the contrary, moving the mirror M2 to increase the separation $a$ leads to a decrease in the radius of the rings and therefore more rings in the observation region. When the displacement is followed slowly, in fractions of a wavelength, the effect when $a$ increases is that the rings emerge from the center and the effect when $a$ decreases is that the rings converge toward the center, as shown in Fig. 3.21 for $a = 300\lambda$, $a = (300 + 1/4)\lambda$, $a = (300 + 1/2)\lambda$, $a = (300 + 3/4)\lambda$, and $a = 301\lambda$. An analogous situation holds if $R_2 < R_1$.

Figure 3.21 Interference patterns for different separations ($a = 300\lambda$, $300.25\lambda$, $300.50\lambda$, $300.75\lambda$, and $301\lambda$; $\lambda = 632.8$ nm) in the interferometer shown in Fig. 3.18 when the observation screen is at $z_0 = 100$ mm. The scale of the axes is in millimeters.

Approximation to calculate the radius of the rings

The radii of the fringes that have the maximum irradiance are given by Eq. (3.47). In many practical situations in interferometry, the separation of the secondary sources is found to be much smaller than the radii of curvature $R_1$ and $R_2$, i.e., $\Delta R \ll \{R_1, R_2\}$, and the size of the region in which the interference pattern is observed is also much smaller than the radii of curvature $R_1$ and $R_2$. In this case, the distance of the virtual sources S1 and S2 to a point on the observation screen can be approximated as

$$s_1 = R_1 + \Delta_1 \tag{3.50}$$

and

$$s_2 = R_2 + \Delta_2, \tag{3.51}$$

with

$$\Delta_1 = \frac{\rho^2}{2R_1} \tag{3.52}$$

and

$$\Delta_2 = \frac{\rho^2}{2R_2}, \tag{3.53}$$


where $\rho$ is the radial distance of the observation screen point with respect to $x = 0$ and $y = 0$. $\Delta_1$ and $\Delta_2$ are the distances of the wavefronts of radii $R_1$ and $R_2$ to the observation screen point in the second order of approximation. Thus, the difference $(s_2 - s_1)$ in Eq. (3.44) for a point on the observation screen is

$$s_2 - s_1 = R_2 - R_1 + \frac{\rho^2}{2}\left(\frac{1}{R_2} - \frac{1}{R_1}\right). \tag{3.54}$$

Therefore, the circles of maximum irradiance on the observation screen are obtained when

$$\Delta R\left(1 - \frac{\rho^2}{2R_1R_2}\right) = m\lambda. \tag{3.55}$$

And the radius of the circles as a function of $m$ is

$$\rho_m = \sqrt{2(\Delta R - m\lambda)\frac{R_1R_2}{\Delta R}}. \tag{3.56}$$

Again, the maximum value of $m$ (circular fringe closest to the center) is given by $m = \Delta R/\lambda = a/\lambda$. Whereas the exact value of the radii of the circles of maximum irradiance can be obtained from Eq. (3.47), Eq. (3.56) serves to obtain the values of the radii of the circles in the second-order approximation. To compare the results given by the two equations, recall that $R_1 = z_0 - a/2$ and $R_2 = z_0 + a/2$. The results obtained with Eqs. (3.47) and (3.56) are compared for the first circular fringes when $a = 0.5$ mm and $z_0 = 100$ mm, with $\lambda = 632.8$ nm, in Fig. 3.22.
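The comparison between the exact radii of Eq. (3.47) and the second-order radii of Eq. (3.56) can be sketched numerically as follows, using $R_1 = z_0 - a/2$, $R_2 = z_0 + a/2$, and $\Delta R = a$. The function names are assumptions for this illustration, and the 11 orders listed are simply the largest integers $m$ not exceeding $a/\lambda$.

```python
import math


def ring_radius_exact(m, a, z0, wavelength):
    """Exact ring radius in the plane z = z0, Eq. (3.47); requires m >= 1."""
    s1 = (2.0 * a * z0 - (m * wavelength) ** 2) / (2.0 * m * wavelength)
    return math.sqrt(s1 ** 2 - (a / 2.0 - z0) ** 2)


def ring_radius_second_order(m, a, z0, wavelength):
    """Second-order ring radius, Eq. (3.56), with Delta R = a."""
    r1, r2 = z0 - a / 2.0, z0 + a / 2.0
    return math.sqrt(2.0 * (a - m * wavelength) * r1 * r2 / a)


if __name__ == "__main__":
    LAM = 632.8e-6      # wavelength in mm
    A, Z0 = 0.5, 100.0  # source separation and screen position in mm
    m_max = math.floor(A / LAM)
    for m in range(m_max, m_max - 11, -1):
        exact = ring_radius_exact(m, A, Z0, LAM)
        approx = ring_radius_second_order(m, A, Z0, LAM)
        print(f"m = {m}: exact = {exact:6.3f} mm, second order = {approx:6.3f} mm, "
              f"difference = {1000 * (exact - approx):6.1f} micrometers")
```

In this sketch the difference between the two expressions grows with the ring radius, as expected for an approximation that assumes an observation region much smaller than $R_1$ and $R_2$.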

Figure 3.22 Comparison of the radii of circles of constant irradiance calculated with the exact form [Eq. (3.47)] and the approximate form [Eq. (3.56)] for the first 11 circular fringes, when $a = 0.5$ mm and $z_0 = 100$ mm, with $\lambda = 632.8$ nm.


The separation between two consecutive fringes, taken as the difference of the radii of the circles of maximum irradiance, can be evaluated from computing $\rho_m - \rho_{m+1}$. Let us first perform the subtraction of the squares:

$$\rho_m^2 - \rho_{m+1}^2 = 2\lambda\frac{R_1R_2}{\Delta R}. \tag{3.57}$$

Rewriting the left side as $(\rho_m - \rho_{m+1})(\rho_m + \rho_{m+1})$ and defining the separation between two consecutive fringes as $\Delta\rho = (\rho_m - \rho_{m+1})$ and the mean value of the radii as $\bar{\rho} = (\rho_m + \rho_{m+1})/2$,

$$\Delta\rho = \frac{\lambda}{\bar{\rho}}\frac{R_1R_2}{\Delta R}. \tag{3.58}$$

Note that we have omitted the subscript $m$ in $\Delta\rho$ and $\bar{\rho}$, because the right-hand side of Eq. (3.57) does not depend on $m$. Equation (3.58) shows how the separation between fringes decreases inversely proportional to the position (radius) of the fringes as we move away from the center of the interferogram. Finally, if $\Delta R \ll \{R_1, R_2\}$, then $R_1R_2 \approx z_0^2$, and Eq. (3.58) can also be written as

$$\Delta\rho = \frac{\lambda z_0^2}{\bar{\rho}\,a}. \tag{3.59}$$

In practice, $\bar{\rho}$ can be taken as the radius of the dark ring between the two bright fringes for which the separation is to be measured, where $a$ is twice the difference in the separation of the mirrors (with respect to the beamsplitter) and $z_0$ is the geometrical mean of the radii of curvature of the interfering wavefronts.

3.3.2 Parallel fringe approximation with the Michelson interferometer

If the interferometer shown in Fig. 3.10 is illuminated by spherical waves, keeping the mirror separation the same, i.e., $(d_2 - d_1) = 0$, the secondary point sources S1 and S2 would be seen as illustrated in Fig. 3.23. The sources S1 and S2 are at the same axial distance from the observation screen but have a transverse separation $a'$, which depends on the angle of tilt $\alpha/2$ of the mirrors. The interference pattern on the observation screen consists of fringes whose maximum irradiance follows curves resulting from the intersection of the hyperboloids of revolution with the plane of the observation screen, similar to those shown in the example in Fig. 3.17 (except for the names of the axes). In Fig. 3.23, if $z_0$ is the axial distance between the sources $\{S_1, S_2\}$ and the observation screen, and $a'$ is the separation between S1 and S2, the position of the fringes along the $x$ direction on the observation screen will be given by Eq. (3.48), but the name of the parameters will be changed to $y_0 \to z_0$ and $a \to a'$, and the name of the variable will be changed to $z_m \to x_m$. Therefore, for the geometry of the interferometer shown in Fig. 3.23, the position of the interference fringes in the $x$ direction is

$$x_m = \frac{m\lambda}{2}\sqrt{\frac{4z_0^2}{a'^2 - m^2\lambda^2} + 1}, \tag{3.60}$$

where $a'$ depends on the angle of tilt according to

$$a' = 2(\overline{O_2O} + \overline{OS})\sin\alpha. \tag{3.61}$$

Figure 3.23 A Michelson interferometer with mirrors M1 and M2 tilted at a small angle. The two mirrors are at the same distance from the beamsplitter.

The distance between point $O_2$ and the midpoint of sources S1 and S2 is $(a'/2)/\tan\alpha$; hence, $z_0$ is

$$z_0 = \frac{a'}{2\tan\alpha} + (\overline{O_2O} + \overline{OO_P}). \tag{3.62}$$

In the case where $a' \ll (\overline{O_2O} + \overline{OS})$ and in turn $a' \ll \{R_1, R_2\}$, $\tan\alpha$ can be approximated as $\sin\alpha$. In the observation region where the square of the maximum number of lateral fringes is much smaller than the square of $m_{\max} \equiv a'/\lambda$, Eq. (3.60) can be approximated by $x_m = m\lambda z_0/a'$, which corresponds to a pattern of evenly spaced parallel straight fringes. According to Eq. (3.62), the separation between fringes would be

$$\Delta x = \frac{\lambda}{2\sin\alpha} + \frac{\lambda}{a'}(\overline{O_2O} + \overline{OO_P}). \tag{3.63}$$
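A short numerical sketch can make the dependence of the fringe spacing on the screen position explicit. It evaluates Eqs. (3.61) and (3.62), takes the spacing as $\lambda z_0/a'$, and compares it with the two addends of Eq. (3.63). The distances $\overline{O_2O}$, $\overline{OS}$, and $\overline{OO_P}$ used below are hypothetical values chosen only for illustration; they do not come from the text.

```python
import math


def tilted_mirror_spacing(wavelength, alpha_rad, o2_o, o_s, o_op):
    """Return (a_prime, z0, spacing) for the geometry of Fig. 3.23.

    a_prime follows Eq. (3.61), z0 follows Eq. (3.62), and the spacing is
    lambda * z0 / a_prime (evenly spaced fringe approximation).
    """
    a_prime = 2.0 * (o2_o + o_s) * math.sin(alpha_rad)
    z0 = a_prime / (2.0 * math.tan(alpha_rad)) + (o2_o + o_op)
    return a_prime, z0, wavelength * z0 / a_prime


if __name__ == "__main__":
    LAM = 632.8e-6              # wavelength in mm
    ALPHA = math.radians(0.01)  # same inclination as in Fig. 3.11
    O2_O, O_S = 100.0, 150.0    # mm, hypothetical distances (assumption)

    plane_wave_term = LAM / (2.0 * math.sin(ALPHA))  # first addend of Eq. (3.63)
    for o_op in (100.0, 200.0, 400.0):               # mm, hypothetical screen distances
        a_prime, z0, spacing = tilted_mirror_spacing(LAM, ALPHA, O2_O, O_S, o_op)
        second_term = LAM * (O2_O + o_op) / a_prime  # second addend of Eq. (3.63)
        print(f"OO_P = {o_op:5.0f} mm: spacing = {spacing:.3f} mm "
              f"(Eq. 3.63: {plane_wave_term:.3f} + {second_term:.3f} mm)")
```

With these assumed distances, the first addend stays fixed while the second grows as the screen moves away, so the total spacing changes with the screen position.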

This equation is similar to Eq. (3.37), which was obtained for plane wave interference. The difference is that if in the interferometer shown in Fig. 3.10 the observation screen moves axially, the separation of the fringes does not change. But if in the interferometer shown in Fig. 3.23 the viewing screen is moved axially (changing OOP ), the fringe spacing changes. The addend lðO2 O þ OOP Þ∕a0 acts as a scale factor in the separation of the fringes. The first addend of Eq. (3.63) is identical to Eq. (3.37). In conclusion, the wavefronts that reach the observation screen in the interferometer shown in Fig. 3.23 resemble plane wavefronts when

Figure 3.24 An interference pattern generated by spherical waves in a Michelson interferometer when one of the mirrors is tilted. The axial separation of the sources $S_1$ and $S_2$ is $a = 400\lambda$, and the lateral displacement of the source $S_2$ is $x_0\,(= a'/2) = 0.03$ mm. The scale of the axes is in millimeters.


$a' \ll \{R_1, R_2\}$ and the observation region has dimensions such that $m^2 \ll m_{\max}^2$, where $m$ is the number of lateral fringes. Finally, if only one of the mirrors in the interferometer shown in Fig. 3.18 is rotated, a lateral displacement of the center of the rings is observed, as shown in Fig. 3.24, where the axial separation of the sources $S_1$ and $S_2$ is $a = 400\lambda$ and the mirror M2 rotates by an angle that laterally displaces the source $S_2$ by $x_0\,(= a'/2) = 0.03$ mm. If the angle of tilt is increased, the center of the rings moves out of the observation region and the fringe pattern approaches a pattern of straight parallel fringes.

3.4 Practical Aspects in the Michelson Interferometer

In the previous sections of this chapter, the Michelson interferometer was used as an optical tool to study, in some detail, the interference patterns generated by the superposition of plane waves and spherical waves. The Michelson interferometer is one of many interferometers that can be used for this purpose. In any case, the goal is to generate two secondary waves from one primary wave, so that interference is guaranteed (provided that the optical path difference between the beams is not greater than the coherence length). The use of an interferometer goes beyond the explanation of the interference patterns of plane or spherical waves. Its power lies in the evaluation of surfaces or wavefronts, either to characterize the surfaces themselves or to measure some other parameter such as the refractive index or optical aberrations. For this type of application, one of the beams is taken as the reference beam, while the other beam is used to analyze the optical element to be measured. Most commonly, the reference beam is flat or spherical. In practice, some drawbacks arise with the generation of the reference beam due to the optical quality of the interferometer's components and its configuration. In this section, some practical elements of the Michelson interferometer are considered: the point source, the collimating lens, the beamsplitter, the mirrors, the peak–valley error (as a measure of the quality of the optical surfaces), and an example of a Michelson interferometer.

Point source

Let us suppose that we want to illuminate the interferometer with a plane wave, as shown in Fig. 3.8. To do this, we should have a point source S. This already calls for a first caveat: physically we can build small sources, but not point sources (mathematically speaking), and sources with a very narrow spectral band (lasers), but not with a Dirac delta spectrum. Now let us not dwell on that and assume we have a quasi-monochromatic source small enough to be considered a point source.

A source is said to be nearly monochromatic if its bandwidth $\Delta\lambda$ is much smaller than the wavelength at the center of the band, i.e., $\Delta\lambda/\lambda \ll 1$.


Collimating lens

The next step is to select a collimating lens to generate a plane wave. A first option could be a simple lens (positive or negative) with spherical faces, aligned and with its primary focus coinciding with the point source. Because the point source is on the optical axis, the only aberration present will be spherical. Although the aberration can be reduced by optimally shaping the lens (correctly choosing the radii of curvature of the lens faces), it is not possible to eliminate it. A better option is to use an achromatic doublet. In addition to correcting chromatic aberration for two colors (spectral lines F and C, Appendix C), this type of lens greatly reduces spherical aberration. This may already be a good solution. But if full correction of spherical aberration is desired, an aspherical lens for finite (primary focal point) and infinite conjugates can be designed. This solution is not always feasible due to its high cost. Finally, there is an aspect that has not been mentioned, which has to do with the finite extension of the wavefront. In the previous sections, it was assumed that the wavefront has a circular edge (as is usually the case, set either by the edge of the collimating lens or by a diaphragm placed before or after the lens to determine the size of the cross section of the beams). This physical limitation on the wavefront extension causes edge diffraction, so the wavefront is not strictly a plane in its entirety.

Beamsplitter

This element has been represented with a diagonal line, which of course is another idealization. In practice, this element is usually a plate with parallel faces made of glass or another material. When an oblique ray hits the plate, the ray refracted to the other side of the plate exits at an angle equal to that of the incident ray, but with a lateral shift, as shown in Fig. 3.25. This is not a problem when the interferometer is illuminated with collimated light (Fig. 3.8), because any ray associated with the wavefront is shifted by the same amount, so the output remains a plane wavefront. However, when illuminated with a spherical wave (Fig. 3.18), the rays are refracted according to the angle of incidence, giving rise to an aberrated wavefront.

Figure 3.25 Deviation of the ray in a plate of parallel faces.


To calculate the lateral shift of the refracted ray, suppose that the parallel faces of the plate are separated by a distance $d$ and that the plate is made of a material with refractive index $n_t$. Let us place a point source $S$ at a certain distance from the plate. The line orthogonal to the faces, which passes through $S$, defines the $z$ axis. Now consider a ray diverging from $S$ with a tilt angle $\theta_i$, as shown in Fig. 3.25. The back projection of the refracted ray leaving the second face passes through the point $S'$. Therefore, an observer behind the plate will see that the refracted ray comes from point $S'$ and not from $S$. The separation between $S$ and $S'$ is the measure of the $z$-axis deviation of the refracted ray, $\delta z = \overline{S'S}$. At the point of incidence on the first face of the plate, the vectors $n_i\hat{s}_i$ and $n_t\hat{s}_t$ associated with the incident and refracted rays satisfy the vector form of Snell's law [Eq. (2.75)], so $\Gamma = n_t\cos\theta_t - n_i\cos\theta_i$. Considering similar triangles,

$$\frac{\Gamma}{n_t} = \frac{\delta z}{d/\cos\theta_t}, \tag{3.64}$$

then

$$\delta z = d\left(1 - \frac{n_i\cos\theta_i}{n_t\cos\theta_t}\right). \tag{3.65}$$
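The axial shift of Eq. (3.65) is simple to evaluate numerically. The Python sketch below is a minimal illustration, assuming a plate in air and using the BK7 index and 1 mm thickness quoted for Fig. 3.26; the function name `axial_shift` is hypothetical.

```python
import math

def axial_shift(theta_i_deg, thickness, n_t, n_i=1.0):
    """Axial shift of Eq. (3.65): d * (1 - n_i cos(theta_i) / (n_t cos(theta_t)))."""
    theta_i = math.radians(theta_i_deg)
    sin_t = n_i * math.sin(theta_i) / n_t      # Snell's law
    cos_t = math.sqrt(1.0 - sin_t**2)
    return thickness * (1.0 - n_i * math.cos(theta_i) / (n_t * cos_t))

if __name__ == "__main__":
    d_mm, n_bk7 = 1.0, 1.5168
    on_axis = axial_shift(0.0, d_mm, n_bk7)    # equals d * (1 - n_i / n_t)
    for angle in (0.0, 10.0, 20.0, 30.0, 45.0):
        shift = axial_shift(angle, d_mm, n_bk7)
        print(f"theta_i = {angle:4.1f} deg  shift = {shift:.4f} mm  "
              f"deviation from on-axis value = {shift - on_axis:.4f} mm")
```

The printed deviation grows with the angle of incidence, which is the angle dependence that turns a spherical wavefront into an aberrated one after the plate.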

The deviation from $\delta z|_{\theta_i=0} = d(1 - n_i/n_t)$ is shown as a function of the angle of incidence $\theta_i$ for $d = 1$ mm, $n_i = 1$, and $n_t = 1.5168$ (BK7 glass), with $\lambda = 587.56$ nm, in Fig. 3.26. The curve shows that as the angle of incidence increases, point $S'$ moves away from $S$ and toward the plate. In interferometers, the beamsplitters can be nitrocellulose membranes (pellicles), thin plates (a few millimeters thick), or cubes formed by two right prisms (Fig. 3.27). In all three options, one face has a thin-film coating to increase reflection and create beamsplitters of, e.g., 50% reflection

Figure 3.26 Deviation of a refracted ray in a plate of parallel faces with a thickness equal to 1 mm.


Figure 3.27 Beamsplitters from Thorlabs, Inc (www.thorlabs.com): (a) pellicle, (b) plate, and (c) cube.

and 50% transmission. The pellicles are very thin (2 µm), so the deflection of the refracted rays can be considered negligible. This element is very close to the idealization that we have made of the beamsplitter. The drawback is that pellicles must be handled carefully so as not to break the membrane and must be protected from dust. They are a good solution in closed systems free of mechanical vibrations. Thin plates are very common in interferometers. One of the beams will pass through the plate once, while the other must pass through it three times, which must be taken into account when using low-coherence sources. Cube beamsplitters are made up of two right prisms joined at their diagonal faces (Appendix E). One of these faces is coated with a thin film to increase reflection. Unlike thin plates, the two rays pass through the cube the same number of times. In short, when collimated light is used to illuminate the interferometer, any of these beamsplitters can be used. But if the light diverges from a point source, the best option would be the pellicle. However, the usual option is a thin plate. The cube beamsplitter is not a good choice if the radius of curvature of the wavefront reaching the cube is comparable to the side of the cube. Because the deflection of rays in a plate is a function of the angle of incidence, this effect is similar to the spherical aberration that affects rays converging to form an image on a spherical refracting surface, as shown in Fig. 1.88. So, in a thick plate, such as the beamsplitter cube, the end result is that an incident spherical wavefront takes on a nonspherical shape upon refraction, i.e., it becomes an aberrated (aspherical) wavefront. For the point source on the optical axis of the interferometer shown in Fig. 3.18, the aspherical wavefront will have symmetry of revolution, so the interference patterns generated on the observation screen will be interference rings, but their spatial distribution will be different from that of spherical wave interference.

Mirrors

After the beamsplitter, mirrors M1 and M2 follow. They are flat mirrors, i.e., sheets of glass with the flat face covered with a metallic film, usually aluminum, although silver and gold are also used. To protect the metallized face


from oxidation, dust, or dirt (fingerprints), a thin (half-wavelength) film of dielectric material, such as silicon oxide, is usually deposited. This layer allows the surface to be cleaned.

Peak–valley error

Now let us consider an aspect that affects all the optical elements previously mentioned. In practice, what is meant by a spherical or flat optical surface? In the manufacturing process, polished surfaces (like a mirror) very close to the mathematical design surface can be obtained. The deviation between the real surface and the mathematical surface gives a measure of the optical quality of the surface. For example, an optical flat of precision $\lambda/4$ refers to a polished surface whose deviations from a reference plane do not exceed one-fourth of the nominal wavelength. Although $\lambda/4$ may seem like a small quantity, it implies a considerable change in the irradiance distribution. A change of $\lambda/2$ implies going from a bright area to a dark area. Let us suppose that the interferometer shown in Fig. 3.8 is built with optical elements of quality $\lambda/4$. By adjusting the distance of the mirrors from the beamsplitter to have an optical path difference equal to $2(d_2 - d_1) = m\lambda$, a fully illuminated region is expected on the observation screen. However, this is not what is observed; instead, the illuminated region contains some less luminous and even dark areas. For better results, optical elements of higher quality should be used, e.g., with a precision of $\lambda/10$, $\lambda/20$, or $\lambda/100$.

3.4.1 Laboratory interferometer

A Michelson interferometer built in a laboratory with optical elements of precision $\lambda/4$ is shown in Fig. 3.28. The point source is a laser beam (He-Ne, $\lambda = 632.8$ nm) focused on a small hole (pinhole) of about 15 µm. The purpose of focusing it on the hole is to eliminate the high spatial frequencies present in the beam, thus obtaining an approximately homogeneous illumination. This is discussed in Chapter 4. A circular stop is then placed to set the extent of the interferogram to approximately 25 mm in diameter when the light is collimated. Then there is the collimating lens, which in this case is an achromatic doublet with a focal length of 300 mm. This lens is mounted on an axial slider to facilitate beam collimation, which is achieved when the primary focus of the lens coincides with the point source. The axial slider also makes it easy to change the beam collimation to obtain spherical wavefronts. At 225 ± 1 mm from the lens is the center of a beamsplitter cube (point O in Fig. 3.8), with a 50 mm side made of BK7 glass. Next, the flat aluminum mirrors are placed, as shown in the figure, 125 ± 1 mm from the center of the cube. Mirror 2 is mounted on another axial slider with which the optical path difference between the rays in the interference region can be changed. Each of the mirrors is supported by a mount that has two fine-thread screws with which the mirror can be tilted. Finally, there is the observation screen,


Figure 3.28 A Michelson interferometer. Mirror 2 is mounted on an axial slider.

125 ± 1 mm from the center of the cube. The observation screen is ground glass (translucent). Interferograms on the observation screen are recorded with a photographic camera (not shown in the figure). With this interferometer, the interference patterns shown in Fig. 3.29 are generated. In (a), the two mirrors are approximately the same distance from the center of the beamsplitter cube and, furthermore, the light beams are collimated and aligned (the path difference between the collimated beams is


Figure 3.29 Experimental interference patterns. In (a) and (b) the beams are collimated, and in (c) the beams are divergent spherical. (a) The optical path difference is approximately zero, and the mirrors are aligned; (b) the optical path difference is approximately zero, and one of the mirrors is tilted at a small angle; (c) keeping the mirrors aligned, the collimating lens moves away from the beamsplitter cube (40 mm) and the mirror M1 also moves away about 60 mm. The scale of the axes is in millimeters.


approximately zero). However, the observation region does not have a homogeneous or symmetric illumination. This is an example of how the precision of the surfaces of the optical elements used to generate the interferogram ($\lambda/4$) affects its quality. To obtain this interferogram, the mirrors were adjusted until an illuminated region without interference fringes was obtained. In (b), a pattern of parallel straight fringes is obtained by tilting mirror 2 at a small angle. In reality, the fringes are somewhat distorted, which is more noticeable on the lower left side. This is also a result of the quality of the optical surfaces of the interferometer and shows that the two interfering wavefronts are not strictly flat. In (c), mirror 2 moves axially about 60 mm away from the beamsplitter cube and the collimating lens moves about 40 mm away from the cube. By moving the collimating lens, the wavefront exiting the lens is divergent spherical and its center of curvature will be 1950 mm from the lens. Now the lens is about 265 mm from the center of the cube. Taking into account these distances, the distance of the observation screen, and the additional optical path due to the beamsplitter cube (with refractive index 1.51), $R_1 = 2641$ mm and $R_2 = 2761$ mm. Therefore, $\Delta R = 120$ mm, and $m_{\max} = \Delta R/\lambda = 189633$ is the value of $m$ corresponding to the central zone shown in Fig. 3.29(c). In this figure, three circular fringes are counted and the radius of the third one is estimated to be about 15 mm. This can be verified from Eq. (3.56), by substituting $\Delta R = m_{\max}\lambda$ and $m = m_{\max} - 3$ for the third fringe:

$$r_{m3} = \sqrt{\frac{6\lambda R_1R_2}{\Delta R}} = 15.19\ \text{mm},$$

which agrees very well with the experimental estimate. In conclusion, a real planar reference beam will be a wavefront distorted by an amount similar to the deformations of the optical surfaces of the beamsplitter and the planar mirror that sends the reference beam to the observation region. For a spherical reference beam, the effect of the thickness of the beamsplitter must also be taken into account. The other beam of the interferometer, also initially affected by the beamsplitter, can be used to assess the optical quality of an element, e.g., a flat or spherical mirror. The optical quality of the reference beam will determine the accuracy with which the surfaces under test can be measured. In the example shown in Fig. 3.29(c), the cube effect is negligible because the angle of the marginal ray turns out to be 0.37°, which gives an axial deviation of the ray of 0.34 µm.
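The numbers quoted for Fig. 3.29(c) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the achromatic doublet is treated as a thin lens, the 40 mm displacement is taken as moving the lens toward the point source, and the ring formula $r_k = \sqrt{2k\lambda R_1R_2/\Delta R}$ is the generalization implied by the third-fringe expression above (with $k = 3$). The function names are hypothetical.

```python
import math

WAVELENGTH = 632.8e-6   # He-Ne wavelength in mm
FOCAL_LENGTH = 300.0    # focal length of the collimating lens in mm
LENS_SHIFT = 40.0       # lens displacement toward the point source in mm
R1, R2 = 2641.0, 2761.0 # radii of curvature quoted in the text, in mm

def image_distance(object_distance, focal_length):
    """Thin-lens equation 1/s + 1/s' = 1/f; a negative s' is a virtual image."""
    return 1.0 / (1.0 / focal_length - 1.0 / object_distance)

def ring_radius(k, wavelength, r1, r2):
    """Radius of the k-th ring counted from the center (assumed generalization)."""
    return math.sqrt(2.0 * k * wavelength * r1 * r2 / (r2 - r1))

center_of_curvature = image_distance(FOCAL_LENGTH - LENS_SHIFT, FOCAL_LENGTH)
m_max = (R2 - R1) / WAVELENGTH
third_ring = ring_radius(3, WAVELENGTH, R1, R2)

if __name__ == "__main__":
    print(f"center of curvature: {abs(center_of_curvature):.0f} mm from the lens (virtual)")
    print(f"m_max = {int(m_max)}")
    print(f"radius of the third ring = {third_ring:.2f} mm")
```

Under these assumptions, the sketch returns 1950 mm for the center of curvature, 189633 for $m_{\max}$, and 15.19 mm for the third ring, in agreement with the values stated in the text.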

3.5 Interference in a Plate of Parallel Faces

In the interferometer shown in Fig. 3.8, what an observer sees from the observation screen is a pair of parallel reflecting flat surfaces separated by a distance $2(d_2 - d_1)$. The light reflected by the two surfaces generates an


Figure 3.30 Reflection and transmission in a plate of parallel faces.

interference pattern in the plane of observation. A system that emulates the previous one is a plate with parallel faces. Now the light is reflected and transmitted at the faces of the plate, and it is possible to observe interference on both sides of the plate: on the same side as the light source, it is said to be reflection interference; on the side opposite the light source, it is said to be transmission interference. To see this, let us consider Fig. 3.30, which shows a ray incident at an angle $\theta_i$ on a plate of refractive index $n_l$ and thickness $d$, immersed in a medium of index $n_i$. At each of the interfaces, there will be multiple reflected and transmitted rays. The amplitudes of the waves associated with the light rays are given by the Fresnel equations, Eqs. (2.96–2.99). In Fig. 3.30, $r$ and $t$ denote the reflection ($r_\perp$ or $r_\parallel$) and transmission ($t_\perp$ or $t_\parallel$) coefficients when light passes from the medium of refractive index $n_i$ to the medium of refractive index $n_l$, and $r'$ and $t'$ denote the reflection and transmission coefficients when the light passes from the medium of refractive index $n_l$ to the medium of refractive index $n_i$.

3.5.1 Stokes relations

The relations between $r$ and $r'$ and between $t$ and $t'$ are obtained from the Stokes relations, which are deduced from the illustrations in Fig. 3.31. In (a), the reflection and transmission of a ray of amplitude $E_0$ incident at an angle $\theta_i$ at an interface separating two media of refractive indices $n_i$ and $n_t$ are shown. The amplitude of the reflected wave would be $rE_0$, and the amplitude of the transmitted wave would be $tE_0$. Taking into account the principle of reversibility (or reciprocity), if two rays are sent in opposite directions to the reflected and transmitted rays in (a), with amplitudes $rE_0$ and $tE_0$, respectively, the incident beam with amplitude $E_0$ must again be obtained but going in the opposite direction. In (b), the incident ray is opposite the reflected ray in (a), but has the same amplitude. This ray will have a reflection and a


Figure 3.31 Stokes coefficients. Reflection and transmission of (a) a wave of amplitude E0, (b) a wave of amplitude rE0, and (c) a wave of amplitude tE0.

transmission with amplitudes $r^2E_0$ and $trE_0$, respectively. On the other hand, in (c), the incident ray is opposite the transmitted ray in (a), but has the same amplitude. This ray will also have a reflection and a transmission, but with amplitudes $r'tE_0$ and $tt'E_0$, respectively. If the principle of reversibility holds, then

$$r^2 + tt' = 1 \tag{3.66}$$

and

$$tr + r't = 0. \tag{3.67}$$

Therefore,

$$tt' = 1 - r^2 \tag{3.68}$$

and

$$r' = -r. \tag{3.69}$$

These last two equations are called Stokes relations and are especially useful for adding the amplitudes of the multiple reflected and transmitted waves shown in Fig. 3.30. The reflected/transmitted irradiance will be the square modulus of the sum of the reflected/transmitted waves, including the optical path difference between consecutive waves.

3.5.2 Multiple-wave interference

This section will consider the superposition of the multiple reflected and transmitted waves (rays) shown in Fig. 3.30.
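Before carrying out the sums, the Stokes relations (3.68) and (3.69) can be checked numerically against the Fresnel coefficients. The Python sketch below assumes the common form of the Fresnel amplitude coefficients for the perpendicular component; the sign convention of Eqs. (2.96–2.99) may differ, but the two relations hold in either convention. The function names are hypothetical.

```python
import math

def fresnel_perpendicular(n_in, n_out, theta_in):
    """Return (r, t, theta_out) for the perpendicular component; angles in radians.

    Assumed convention: r = (a - b) / (a + b), t = 2a / (a + b),
    with a = n_in cos(theta_in) and b = n_out cos(theta_out).
    """
    theta_out = math.asin(n_in * math.sin(theta_in) / n_out)   # Snell's law
    a = n_in * math.cos(theta_in)
    b = n_out * math.cos(theta_out)
    return (a - b) / (a + b), 2.0 * a / (a + b), theta_out

def stokes_residuals(n_i, n_t, theta_i_deg):
    """Residuals of tt' = 1 - r^2 and r' = -r; both should vanish."""
    theta_i = math.radians(theta_i_deg)
    r, t, theta_t = fresnel_perpendicular(n_i, n_t, theta_i)
    # Reversed path: light travels from the second medium back into the first.
    r_rev, t_rev, _ = fresnel_perpendicular(n_t, n_i, theta_t)
    return t * t_rev - (1.0 - r**2), r_rev + r

if __name__ == "__main__":
    for angle in (0.0, 20.0, 45.0, 60.0):
        res_tt, res_r = stokes_residuals(1.0, 1.5, angle)
        print(f"theta_i = {angle:4.1f} deg  tt' - (1 - r^2) = {res_tt:+.1e}  r' + r = {res_r:+.1e}")
```

Both residuals vanish to rounding error for an air-glass interface at every angle tried, as the reversibility argument requires.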


Interference by reflection

For the reflected irradiance,

$$I_r = \frac{\epsilon_0 c}{2}(E_rE_r^*), \tag{3.70}$$

with

$$E_r = rE_0 + tr't'E_0e^{i\delta} + tr'^3t'E_0e^{i2\delta} + tr'^5t'E_0e^{i3\delta} + \cdots, \tag{3.71}$$

and $E_r^*$ the conjugate of $E_r$. The phase $\delta$ depends on the optical path difference $L$ between two consecutive reflected rays, i.e., $\delta = 2\pi L/\lambda$. Now, Eq. (3.71) can be rewritten as

$$E_r = rE_0 + tt'r'E_0e^{i\delta}\left[1 + r^2e^{i\delta} + (r^2)^2e^{i2\delta} + (r^2)^3e^{i3\delta} + \cdots\right]. \tag{3.72}$$

And in a more compact form,

$$E_r = rE_0 + tt'r'E_0e^{i\delta}\sum_{q=0}^{Q}(r^2e^{i\delta})^q, \tag{3.73}$$

where $q = 0, 1, 2, \ldots$ defines the reflection $q + 2$, and $Q + 2$ is the total number of reflections. The sum in Eq. (3.73) represents a geometric series whose ratio is $r^2e^{i\delta}$, with a modulus $\le 1$. The sum of such a series is given by

$$\frac{1 - (r^2e^{i\delta})^{Q+1}}{1 - r^2e^{i\delta}}. \tag{3.74}$$

If the number of reflections is much greater than 1 (there will be infinite reflections if $\theta_i = 0$ or if the plate has infinite extension), the sum reduces to $1/(1 - r^2e^{i\delta})$. With this in mind and taking the Stokes relations into account, the reflected field will be

$$E_r = rE_0\left(\frac{1 - e^{i\delta}}{1 - r^2e^{i\delta}}\right). \tag{3.75}$$

Finally, the irradiance of the multiple reflected waves [Eq. (3.70)] is given by

$$I_r = I_0\left(\frac{4R\sin^2(\delta/2)}{(1 - R)^2 + 4R\sin^2(\delta/2)}\right), \tag{3.76}$$

where $I_0 = \epsilon_0c(E_0)^2/2$ and the reflectance $R = r^2$. To calculate the phase $\delta$, let us consider Fig. 3.32. The optical path difference between the two reflected beams would be

$$L = n_l(\overline{AB} + \overline{BC}) - n_i\overline{AD}. \tag{3.77}$$

Figure 3.32 Geometry for calculating the optical path difference between two consecutive reflections.

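From similar triangles, $\overline{AB} = \overline{BC} = d/\cos\theta_t$ and $\overline{AD} = (2d\tan\theta_t)\sin\theta_i$. Then the phase $\delta = 2\pi L/\lambda$ is given by

$$\delta = \frac{2\pi}{\lambda}\left[\frac{2n_ld}{\cos\theta_t} - (2d\tan\theta_t)\,n_i\sin\theta_i\right]. \tag{3.78}$$

Using Snell's law and simplifying,

$$\delta = \frac{2\pi}{\lambda}(2n_ld\cos\theta_t). \tag{3.79}$$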
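As a numerical cross-check of the derivation, the series of Eq. (3.73) can be summed term by term and compared with the closed form of Eq. (3.76). The Python sketch below works in units of $I_0$ (i.e., $E_0 = 1$) and uses the Stokes relations $tt' = 1 - r^2$ and $r' = -r$. The function names and the number of terms kept in the sum are assumptions made for illustration.

```python
import cmath
import math

def plate_phase(n_l, d, theta_i_deg, wavelength, n_i=1.0):
    """Phase of Eq. (3.79): delta = (2 pi / lambda) * 2 n_l d cos(theta_t)."""
    sin_t = n_i * math.sin(math.radians(theta_i_deg)) / n_l   # Snell's law
    cos_t = math.sqrt(1.0 - sin_t**2)
    return 2.0 * math.pi * 2.0 * n_l * d * cos_t / wavelength

def reflected_irradiance_series(r, delta, n_terms=2000):
    """|E_r|^2 / |E_0|^2 from a truncated version of the sum in Eq. (3.73)."""
    R = r * r
    ratio = R * cmath.exp(1j * delta)
    partial_sum = sum(ratio**q for q in range(n_terms))
    # Stokes relations: t t' = 1 - r^2 and r' = -r.
    field = r + (1.0 - R) * (-r) * cmath.exp(1j * delta) * partial_sum
    return abs(field)**2

def reflected_irradiance_closed(r, delta):
    """I_r / I_0 from Eq. (3.76)."""
    R = r * r
    s = 4.0 * R * math.sin(delta / 2.0)**2
    return s / ((1.0 - R)**2 + s)

if __name__ == "__main__":
    for r in (0.2, 0.56, 0.96):
        for delta in (0.3, 1.0, 3.0):
            series = reflected_irradiance_series(r, delta)
            closed = reflected_irradiance_closed(r, delta)
            print(f"r = {r:4.2f}  delta = {delta:3.1f}  series = {series:.6f}  Eq. (3.76) = {closed:.6f}")
```

The two columns coincide to within rounding error, and the closed form vanishes whenever $\delta$ is a multiple of $2\pi$.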

Interference by transmission

For the case of transmitted waves (Fig. 3.30), the resulting sum is

$$E_t = tt'E_0 + tr'^2t'E_0e^{i\delta} + tr'^4t'E_0e^{i2\delta} + \cdots, \tag{3.80}$$

i.e.,

$$E_t = tt'E_0\left[1 + r^2e^{i\delta} + (r^2)^2e^{i2\delta} + \cdots\right]. \tag{3.81}$$

Analogous to the derivation of the reflected multi-wave irradiance, the transmitted multi-wave irradiance is given by

$$I_t = I_0\left(\frac{(1 - R)^2}{(1 - R)^2 + 4R\sin^2(\delta/2)}\right). \tag{3.82}$$

Equations (3.76) and (3.82) are complementary when there is no absorption.

Case 1. Plane wave interference.

A first result that can be observed is when the plate is illuminated by (coherent) monochromatic plane waves in a direction orthogonal to the plate,


Figure 3.33 Axial interference of multiple plane waves on a plate with parallel faces: (a) transmitted and (b) reflected.

i.e., when $\theta_i = 0$ and therefore $\theta_t = 0$. In Fig. 3.33(a), a situation is shown in which a plane wave bounded by a diaphragm D is incident orthogonally on a plate of thickness $d$. If the interference behind the plate is observed in a plane parallel to the plate, the situation is similar to that of the Michelson interferometer shown in Fig. 3.8 and a homogeneously illuminated region will be seen. But if a lens L is added and the observation screen is moved to the secondary focal plane of the lens, a bright spot will be observed. In both cases, the irradiance is given by Eq. (3.82) and depends on the thickness of the plate. In Fig. 3.33(b), the interferometer is designed to observe the interference of the multiple waves reflected by the faces of the plate. With the help of the beamsplitter, the reflected waves are deflected toward the lens L, which focuses the light on the observation screen. The irradiance is given by Eq. (3.76). For $\gamma(\tau) = 1$, Fig. 3.9 shows how the transmitted irradiance in the observation plane changes as the separation between the mirrors changes. In this case, the irradiance contrast is 1. By changing the thickness of the plate in Fig. 3.33(a), a modulation of the irradiance in the observation plane (focal plane) is also obtained. Figure 3.34 shows the value of the irradiance in the focal plane as a function of the thickness of the plate (on the order of the wavelength) for three values of the reflection coefficient $r$: 0.2, 0.56, and 0.96. In particular, the reflection coefficient $r = 0.2$ ($r_\parallel = r_\perp$) is obtained for a glass plate ($n_l = 1.5$, $\lambda = 632.8$ nm) immersed in air, when $\theta_i = 0$. The modulation of the irradiance behaves similarly to that of the irradiance shown in Fig. 3.9, for $\gamma(\tau) = 1$, in the Michelson interferometer. The difference is in the contrast of the modulation, which in the case of the plate is low, equal to 0.08. This resemblance is no accident. When $r = 0.2$, then


Figure 3.34 Modulation of the irradiance of multiple transmitted waves in the focal plane [Fig. 3.33(a)] when the reflection coefficient r of the faces of the plate is 0.2, 0.56, and 0.96.

$r^2 = 0.04$ and $r^4 = 0.0016$, so $(r^2)^2e^{i2\delta}$ and the other higher-order addends do not contribute significantly to the sum of Eq. (3.81). Thus, in this low-reflectance case ($R = 0.04$), the interference is determined only by the first two transmitted beams, i.e.,

$$E_t = tt'E_0\left[1 + r^2e^{i\delta}\right], \tag{3.83}$$

and the irradiance will be (omitting the term $r^4$)

$$I_t = I_0(1 - R)^2\left[1 + 2R\cos\left(\frac{4\pi}{\lambda}n_ld\right)\right]. \tag{3.84}$$

This expression is analogous to Eq. (3.29) for the interference of two plane waves when $|\gamma| = 1$ and $\tau = 2(d_2 - d_1)/c$. Thus, when the reflectance of the faces of the plate is low (as in a glass plate), the effective interference of the multiple transmissions resembles the interference of two waves. If the reflectance increases, more addends contribute and the result departs from that corresponding to two waves. To increase reflectance, one option is to increase the refractive index of the plate. In practice, however, not much can be gained, as high refractive indices are around 2.5 (e.g., in diamond), giving reflectances of around 0.18. To achieve reflectances such as those shown in Fig. 3.34, of 0.31 ($r = 0.56$) or 0.92 ($r = 0.96$), thin metallic or dielectric films are deposited on the faces of the plate. In the case where the reflection coefficient is $r = 0.96$, the interference result changes remarkably compared with the case where $r = 0.2$. The maximum irradiance ($I_0$) is obtained when the thickness of the plate is a multiple of half a


wavelength divided by the refractive index of the plate (same as for two waves), but the irradiance decays rapidly with even a small change in the thickness of the plate. In other words, the most prominent irradiance values are found only in very narrow bands. In the case of multiple reflected waves [Fig. 3.33(b)], an analogous situation is obtained, as shown in Fig. 3.35, for the same values of the reflection coefficient. Irradiance minima (zero) are obtained when the thickness of the plate is a multiple of half a wavelength divided by the refractive index of the plate (same as for two waves). Now when $r = 0.2$, the irradiance contrast is equal to 1, as in Fig. 3.9, for $\gamma(\tau) = 1$. And when $r = 0.96$, there are narrow bands where the irradiance is around zero, but the irradiance increases rapidly when the thickness of the plate changes slightly. This is the working principle of antireflection thin films. Note that the transmission (Fig. 3.34) and reflection (Fig. 3.35) graphs are complementary, resulting in conservation of energy if there is no absorption in the plate.

Case 2. Interference of spherical waves I

Let us now consider a spherical wave incident on the plate with parallel faces. As we have already seen, there is a pattern of circular fringes when the Michelson interferometer is illuminated with a spherical wave. Something similar happens in the case of the plate, but the structure of the fringes depends on the reflectance of the faces of the plate. Consider the configuration shown in Fig. 3.36 in which the transmission interference fringes are observed. In (a), the interference at a point on the screen is caused by the superposition of rays arriving at different angles. In (b), a lens is inserted into the setup and the observation screen is placed in the secondary focal

Figure 3.35 Modulation of the irradiance of multiple reflected waves in the focal plane [Fig. 3.33(b)] when the reflection coefficient r of the faces of the plate is 0.2, 0.56, and 0.96.


Figure 3.36 Interference of multiple transmitted waves on a plate with parallel faces when illuminated by a spherical wave. (a) Nonlocalized fringe formation. (b) Localized fringe formation.

plane of that lens. The interference now occurs at one point on the observation screen due to overlapping rays hitting the lens at the same angle. Whereas in (a) the circular interference fringes are observed for any position of the observation screen along the optical axis (nonlocalized fringes), in (b) the circular interference fringes are observed focused in the focal plane of the lens (localized fringes). First, let us see the formation of the interference fringes in configuration (b), since the required algebra has already been developed in Eq. (3.82). To obtain this equation, the incidence of a ray with a certain angle was considered in Fig. 3.30. Then, from Fig. 3.33(a), the specific case of $\theta_i = 0$ was examined: the irradiance given by Eq. (3.82) is observed at the secondary focal point. If now in the configuration shown in Fig. 3.33(a) we assume that $\theta_i \neq 0$, the interference of the multiple plane waves inclined at $\theta_i$ will be seen at an off-axis bright point at the radial position $f\tan\theta_i$, where $f$ is the focal length of the lens. This is what happens with the multiple transmitted rays corresponding to a divergent ray from S in Fig. 3.36(b), with $\theta_i \neq 0$: the transmitted rays will be focused at a point located at $f\tan\theta_i$. Taking into account the symmetry of revolution around the optical axis, what we would have is a pattern of rings. The radii of the circles of maximum irradiance will be given by

$$r_m = f\tan\theta_m, \tag{3.85}$$

where

$$\sin\theta_m = \frac{\sqrt{n_l^2 - (m\lambda/2d)^2}}{n_i}, \tag{3.86}$$

according to Eq. (3.79), when $\delta = 2m\pi$. The maximum value of $m$, $m_{\max}$, will be the largest integer not exceeding $2n_ld/\lambda$, so that the square root in Eq. (3.86) is real. Moving away from the center, the value of $m$ decreases. Thus, with $m = m_{\max} - 1, m_{\max} - 2, \ldots$, the interference rings are identified starting from the center.
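Equations (3.85) and (3.86) translate directly into a short calculation of the ring positions. The Python sketch below uses the parameters of the example discussed in this section ($n_l = 1.5$, optical thickness $2n_ld = 0.5$ mm, $\lambda = 632.8$ nm, $f = 65.65$ mm, plate in air). The function name is hypothetical, and the highest order is taken as the largest integer not exceeding $2n_ld/\lambda$ so that the square root in Eq. (3.86) stays real.

```python
import math

def transmission_rings(n_l, d, wavelength, focal_length, n_rings, n_i=1.0):
    """Return a list of (m, theta_m in degrees, r_m) for the first n_rings maxima.

    theta_m follows Eq. (3.86) and r_m follows Eq. (3.85); lengths share one unit.
    """
    m_max = math.floor(2.0 * n_l * d / wavelength)
    rings = []
    for m in range(m_max, m_max - n_rings, -1):
        radicand = max(0.0, n_l**2 - (m * wavelength / (2.0 * d))**2)
        theta_m = math.asin(math.sqrt(radicand) / n_i)
        rings.append((m, math.degrees(theta_m), focal_length * math.tan(theta_m)))
    return rings

if __name__ == "__main__":
    n_l = 1.5
    d = 0.25 / n_l             # plate thickness in mm, so that 2 n_l d = 0.5 mm
    wavelength = 632.8e-6      # mm
    for m, theta_deg, radius in transmission_rings(n_l, d, wavelength, 65.65, 5):
        print(f"m = {m}  theta_m = {theta_deg:.2f} deg  r_m = {radius:.3f} mm")
```

For these parameters the highest order is $m = 790$, with $\theta_{790} \approx 1.61°$, and the radii grow as $m$ decreases.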


Figure 3.37 Transmission interferograms generated by a plate with an optical thickness of 0.5 mm for $r = 0.2$, $r = 0.5$, and $r = 0.9$, using a lens with a focal length of 65.65 mm. The scale of the axes is in millimeters.

To compare the interferograms generated by spherical waves with the Michelson interferometer and with the parallel plate interferometer, let us assume that in the Michelson interferometer shown in Fig. 3.18 the separation of the mirrors is $a/2 = 0.25$ mm and the glass plate has a thickness $d = (a/2)/n_l$, with $n_l = 1.5$ (for $\lambda = 632.8$ nm). The interferogram obtained with the Michelson interferometer, when the distance between the midpoint of the virtual sources and the observation screen is $z_0 = 100$ mm, is shown in Fig. 3.16. And the interferograms that are obtained when using a lens of focal length $f = 65.65$ mm for $r = 0.2$ (glass plate without reflective coatings), $r = 0.5$, and $r = 0.9$ are shown in Fig. 3.37. The focal length was set to this value taking into account the angle subtended by the first interference ring in the pattern shown in Fig. 3.16. The radius of this ring is $r_{790} = 1.88$ mm. On the other hand, for the plate, the angle corresponding to the first ring is $\theta_{790} = 1.61°$ [Eq. (3.86)], which gives a focal length equal to 65.65 mm [Eq. (3.85)]. In the interferograms shown in Fig. 3.37, the position of the rings (the radius of the circles of maximum irradiance) is the same. This is because they do not depend on the reflection coefficient [Eqs. (3.85) and (3.86)]. However, the reflection coefficient determines the width of the circular fringes. The greater the effective number of interfering beams (which occurs as the reflection coefficient increases), the smaller the width of the fringes. When $r = 0.2$, the pattern looks like the two-spherical wave pattern shown in Fig. 3.16, but the contrast is low, $C = 0.08$. In the plate, as the reflectance of the faces increases, the contrast of the fringes also increases. Thus, with $r = 0.5$, $C = 0.47$; and with $r = 0.9$, $C = 0.98$. For reflection interference, the configuration shown in Fig. 3.33(b) can be used, but illuminating with a spherical wave. In this case, the irradiance maxima occur when $\delta = 2m\pi \pm \pi$.
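The contrast values quoted above follow from Eq. (3.82). The Python sketch below assumes the usual definition of contrast, $C = (I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, and evaluates the transmitted irradiance at $\delta = 0$ (maximum) and $\delta = \pi$ (minimum); the same is done for the reflected irradiance of Eq. (3.76), whose extrema are swapped. The function names are hypothetical.

```python
import math

def transmitted_irradiance(r, delta):
    """I_t / I_0 from Eq. (3.82), with R = r^2."""
    R = r * r
    return (1.0 - R)**2 / ((1.0 - R)**2 + 4.0 * R * math.sin(delta / 2.0)**2)

def reflected_irradiance(r, delta):
    """I_r / I_0 from Eq. (3.76), with R = r^2."""
    R = r * r
    s = 4.0 * R * math.sin(delta / 2.0)**2
    return s / ((1.0 - R)**2 + s)

def contrast(i_max, i_min):
    """Fringe contrast (assumed definition): (I_max - I_min) / (I_max + I_min)."""
    return (i_max - i_min) / (i_max + i_min)

def transmission_contrast(r):
    return contrast(transmitted_irradiance(r, 0.0), transmitted_irradiance(r, math.pi))

def reflection_contrast(r):
    return contrast(reflected_irradiance(r, math.pi), reflected_irradiance(r, 0.0))

if __name__ == "__main__":
    for r in (0.2, 0.5, 0.9):
        print(f"r = {r:3.1f}  C (transmission) = {transmission_contrast(r):.2f}  "
              f"C (reflection) = {reflection_contrast(r):.2f}")
```

With this definition the transmission contrast reduces to $2R/(1 + R^2)$, which gives 0.08, 0.47, and 0.98 for $r = 0.2$, 0.5, and 0.9, while the reflection contrast is 1 for any $r$ because the reflected minimum is zero.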
Reflection interference patterns generated with the same parameters as those used in the transmission interference example are shown in Fig. 3.38. As expected, these interferograms are the

200

Chapter 3

Figure 3.38 Reflection interferograms generated by a plate with an optical thickness of 0.5 mm for $r = 0.2$, $r = 0.5$, and $r = 0.9$. The scale of the axes is in millimeters.

complement of the transmission interferograms. Unlike the transmission interferograms, the contrast of the interference fringes in reflection is $C = 1$ because the dark areas have zero irradiance. The irradiance of the interferogram shown in Fig. 3.37(a) was calculated using $r = 0.2$ regardless of the angle. Strictly speaking, this should not be the case because the reflection and transmission coefficients depend on the angle, according to the Fresnel equations [Eqs. (2.96–2.99)]. However, in the example under consideration, the angle of incidence with which the interference pattern is generated ranges from 0 to 10°, and over that range the value of $r$ changes very little, around 2% (Fig. 2.13). Even for larger angles, the variation remains small, so for the first interference rings it suffices to take the reflection coefficient equal to that of $\theta_i = 0$. This also applies to plates with reflective coatings.

At first glance, the interferograms shown in Figs. 3.16 and 3.37(a) are similar. This can be verified by comparing the radial positions of the interference rings [Eqs. (3.47) and (3.85)]. A comparison of the radial position of the first 12 interference rings in the Michelson interferometer, when $a = 0.1$ and 0.5 mm, and the first 12 interference rings on the plate of parallel faces, when $2dn_l = 0.1$ and 0.5 mm, is shown in Fig. 3.39. When $a = 0.5$ mm, the radial position of the rings of the two interferograms is approximately the same. However, when $a$ decreases to, e.g., 0.1 mm, the interference rings generated with the Michelson interferometer have a smaller radius than the rings generated with the plate.

Case 3. Interference of spherical waves II

Let us now consider the configuration shown in Fig. 3.36(a). Suppose that three rays (labeled 1, 2, and 3) diverge from the point source S, as shown in Fig. 3.40. In transmission, there will be multiple rays parallel to the incident rays. In Fig. 3.40, only three transmitted rays are drawn for each incident ray. In each case, the first of the transmitted rays suffers a lateral deviation produced by the plate of thickness $d$. The backward extensions of the first

Interference

201

Figure 3.39 Radius ($r$) of the first 12 interference rings of spherical waves with the Michelson interferometer (o) when $a = 0.5$ and 0.1 mm and with a plate with parallel faces (⋆) when $2dn_l = 0.1$ and 0.5 mm.

Figure 3.40 Refraction of three rays diverging from the source S together with some transmitted rays.

transmitted rays (solid gray lines) converge approximately at the point $S'_1$. Also in each case, the second of the transmitted rays undergoes a lateral deviation produced by an equivalent plate of thickness $3d$. The backward extensions of the second transmitted rays (straight gray lines with equal segments) converge approximately at the point $S'_2$. The third of the transmitted rays experiences, in each case, a lateral deviation produced by an equivalent plate of thickness $5d$. The backward projections of the third transmitted rays (long and short gray straight segments) converge approximately at the point $S'_3$. And so on for the other transmitted rays. The extensions of the $j$th ($j = 1, 2, 3, \dots$) transmitted rays do not converge


exactly at a point ($S'_j$) because the axial deviation experienced by the refracted rays depends on the angle of incidence, according to Eq. (3.65). For the analysis that follows, let us assume that the backward extensions of the $j$th transmitted rays converge at the point $S'_j$. This implies that the angle of incidence of the rays diverging from S and reaching the plate is small. With this in mind, the rays arriving at a point P on the screen appear to come from multiple virtual sources located at $S'_1, S'_2, S'_3, \dots$, as shown in Fig. 3.41(a). Therefore, the interference at P is produced by the superposition of rays that emerge from the source S at different angles. So how is it possible that the refracted rays meet at point P? This is possible if the different rays experience different numbers of internal reflections in the plate, as shown in Fig. 3.41(b). For example, three rays arrive at point P2 [Fig. 3.41(a)]: the ray that appears to come out of $S'_1$ arrives

Figure 3.41 (a) Virtual sources. Rays arriving at point P appear to come from multiple virtual sources located at $S'_1$, $S'_2$, and $S'_3$. (b) Real rays. The interference at a point on the observation screen results from the superposition of rays with different angles and with different numbers of internal reflections in the plate.


after having been transmitted by both faces of the plate without reflecting internally (black ray); the ray that appears to leave $S'_2$ arrives after being transmitted at the first face, internally reflected at the second face, internally reflected at the first face, and transmitted at the second face (segmented gray ray); and the ray that appears to leave $S'_3$ arrives after being transmitted at the first face, internally reflected twice at the second face and twice at the first face, and transmitted at the second face (solid gray ray). This happens progressively for the rays that emerge from the other virtual sources.

The separation between two consecutive virtual sources can be calculated with the help of Fig. 3.40. Let us consider the extensions of the transmitted rays that correspond to one of the rays diverging from S, as shown in Fig. 3.42. From the geometry of the figure, the separation between any pair of consecutive sources is constant. Defining this separation as $a$, we have $a = \overline{S'_1 S'_2} = \overline{S'_2 S'_3} = \dots = \overline{S'_j S'_{j+1}}$. It is also true that $\overline{B_1 B_2} = \overline{A_1 A_2} = \overline{A_2 A_3} = \dots = \overline{A_j A_{j+1}}$. Therefore, for any pair of consecutive sources ($j$ and $j + 1$), $\tan\theta_i = \overline{B_1 B_2}/a$. Because $\overline{B_1 B_2} = 2d\tan\theta_t$,

$$a = \frac{2d\tan\theta_t}{\tan\theta_i}. \tag{3.87}$$
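How strongly the separation of Eq. (3.87) depends on the angle can be seen numerically. The following sketch evaluates Eq. (3.87), using Snell's law to obtain $\theta_t$, and compares the result with the normal-incidence limit $2dn_i/n_l$. The thickness $d = 0.375$ mm and the indices $n_i = 1.0$ and $n_l = 1.5$ are illustrative values, and the function name is hypothetical.

```python
import math


def source_separation(d, theta_i_deg, n_i=1.0, n_l=1.5):
    """Separation a between consecutive virtual sources, Eq. (3.87)."""
    theta_i = math.radians(theta_i_deg)
    theta_t = math.asin(n_i * math.sin(theta_i) / n_l)  # Snell's law
    return 2.0 * d * math.tan(theta_t) / math.tan(theta_i)


d = 0.375  # plate thickness in mm (illustrative value)
limit = 2.0 * d * 1.0 / 1.5  # normal-incidence limit 2 d n_i / n_l
for angle in (1.0, 5.0, 10.0, 20.0):
    a = source_separation(d, angle)
    print(f"theta_i = {angle:4.1f} deg: a = {a:.4f} mm (limit {limit:.4f} mm)")
```

With these values the separation at 10° differs from the normal-incidence limit by less than 1%, which is why a constant separation is a reasonable model for small angles.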

Equation (3.87) says that the separation between the virtual sources varies with the angle of incidence. However, in a practical situation, limiting the angles of incidence to small values (around 10°) so that $\cos\theta_i \approx 1$ and $\cos\theta_t \approx 1$ allows $\tan\theta_t/\tan\theta_i$ to be replaced by $\sin\theta_t/\sin\theta_i$. Using Snell's law,

$$a = 2d\,\frac{n_i}{n_l}, \tag{3.88}$$

which is independent of the angle. Thus, in this situation the interference of spherical waves in the configuration shown in Fig. 3.36(a) can be seen as the superposition of spherical waves that diverge from multiple virtual sources

Figure 3.42 Geometry to calculate the distance between two consecutive virtual sources.


uniformly separated by a distance of $2dn_i/n_l$ along the optical axis. The distance from the virtual source $S'_j$ to the plate (front face) turns out to be

$$z_j = z_S - \delta z_0 + (j - 1)a, \tag{3.89}$$

where $z_S$ is the distance between the point source S and the plate, and $\delta z_0$ is the axial deviation given by Eq. (3.65), with $n_t = n_l$ for $\theta_t = 0$. Based on this, the optical field for a point P on the observation screen at radial distance $r = \sqrt{x^2 + y^2}$ (omitting the time phase term $\omega t$ and the initial phase term) will be

$$E_t = tt'\frac{E_0^{\dagger}}{s_1}e^{iks_1} + tr^2t'\frac{E_0^{\dagger}}{s_2}e^{iks_2} + tr^4t'\frac{E_0^{\dagger}}{s_3}e^{iks_3} + \cdots, \tag{3.90}$$

or, more compactly,

$$E_t = tt'E_0^{\dagger}\sum_{j=1}^{J} r^{2(j-1)}\frac{e^{iks_j}}{s_j}, \tag{3.91}$$

where $s_j$ is the distance between the virtual source $S'_j$ and the point P, and $J$ is the total number of reflections ($J \to \infty$). Assuming that the distance between the plate and the observation screen is $d + z_P$, then

$$s_j = \sqrt{R_j^2 + r^2}, \tag{3.92}$$

with

$$R_j = z_j + d + z_P. \tag{3.93}$$

As in cases 1 and 2, the reflection coefficient determines the weight with which each addend enters the sum. For example, for the glass plate without reflective coating, taking just the first two addends, the transmitted irradiance would be

$$I_t = (1 - R)^2\left[I_1 + I_2R^2 + 2R\sqrt{I_1I_2}\cos\left(\frac{2\pi}{\lambda}(s_2 - s_1)\right)\right], \tag{3.94}$$

where $I_1 = \epsilon_0c(E_0^{\dagger}/s_1)^2/2$ and $I_2 = \epsilon_0c(E_0^{\dagger}/s_2)^2/2$. This irradiance is analogous to Eq. (3.44), except for the contrast of the fringes, which is low for the plate case. By increasing the reflectance of the plate, there will be substantial contributions from more addends, and the irradiance resulting from the superposition of $J$ transmitted beams ($J \to \infty$) can no longer be written in a simple way, as is done in Eq. (3.82) in the case of the superposition of parallel beams.
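Because the sum of Eq. (3.91) has no simple closed form, it is natural to evaluate it numerically. The sketch below is one possible way to do so, not a definitive implementation: it truncates the sum once the weight $r^{2(j-1)}$ of an addend, relative to that of the second addend, falls below a tolerance (the slow $1/s_j$ variation is ignored in this criterion), and it omits the constant factor $tt'E_0^{\dagger}$. The argument `z1` stands for the distance $z_S - \delta z_0$ of the first virtual source from the plate, `rho` is the radial coordinate on the screen, and all names and numerical values are illustrative assumptions.

```python
import cmath
import math


def terms_needed(r, tol=1e-6):
    """Index of the first addend whose weight relative to the second is <= tol."""
    j, ratio = 2, 1.0  # ratio = r**(2 * (j - 2))
    while ratio > tol:
        ratio *= r * r
        j += 1
    return j


def transmitted_irradiance(rho, r, wavelength, a, z1, d, z_p, n_terms):
    """|sum_j r^(2(j-1)) exp(i k s_j) / s_j|^2, Eq. (3.91) without t t' E0."""
    k = 2.0 * math.pi / wavelength
    field = 0j
    for j in range(1, n_terms + 1):
        big_r = z1 + (j - 1) * a + d + z_p  # Eqs. (3.89) and (3.93)
        s_j = math.hypot(big_r, rho)        # Eq. (3.92)
        field += r ** (2 * (j - 1)) * cmath.exp(1j * k * s_j) / s_j
    return abs(field) ** 2


# Illustrative values in mm: He-Ne wavelength, a = 0.5, d = 0.375.
n_terms = terms_needed(0.5)
for rho in (0.0, 1.0, 2.0):
    value = transmitted_irradiance(rho, 0.5, 632.8e-6, 0.5, 50.0, 0.375, 49.625, n_terms)
    print(f"rho = {rho:.1f} mm: relative irradiance = {value:.3e}")
```

With a tolerance of $10^{-6}$ (0.0001%), this criterion keeps 12 addends for $r = 0.5$ and 68 addends for $r = 0.9$.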


Figure 3.43 Profiles of the interferograms obtained with the interferometers shown in Figs. 3.36(a) (in black) and (b) (in gray dashed).

To see the difference between the interferogram obtained with the configuration shown in Fig. 3.36(a) and the interferogram obtained with the configuration shown in Fig. 3.36(b), let us compare the profile of the interferogram shown in Fig. 3.37(b), which is obtained when $r = 0.5$, and the profile of the interferogram expressed in Eq. (3.91), with the same value of $r$. For the calculation of the second profile, the following parameters were set: $d = 0.25n_l$ (in mm), $n_l = 1.5$, and $n_i = 1.0$. With this, the virtual sources are separated by $a = 0.5$ mm. The distances $z_S = 50 - \delta z_0$ and $z_P = 50 - d$, both in mm, were also fixed. In Fig. 3.43, the two profiles for the first six rings are shown as a function of the radial coordinate: in black, the one obtained with the interferometer shown in Fig. 3.36(a), and in gray dashed, the one obtained with the interferometer shown in Fig. 3.36(b). Two things stand out: first, moving away from the center, the irradiance maxima of the two profiles separate; second, the rings in the black profile fade away from the center. The separation between the maxima corresponds to that shown in Fig. 3.39 for $a = 0.5$ mm. Although the calculation of the black profile must be done with $J \to \infty$, using the first 12 addends of the sum of Eq. (3.91) is more than sufficient because the 12th addend turns out to be 0.0001% of the second addend. Equation (3.94) is completely analogous to Eq. (3.44), which implies that the radii of the circles of maximum irradiance for the interferometer shown in Fig. 3.36(a), with $r = 0.2$, coincide with the radii of the interference rings generated by two waves in the Michelson interferometer. On the other hand, if the effect of increasing the reflectance of the plate translates into a thinning of the interference rings while maintaining the radius of the circles of maximum irradiance, then it can be anticipated that the circles of maximum irradiance in

Another example: if the reflection coefficient is $r = 0.9$, the first 68 addends are required so that the last addend is approximately 0.0001% of the second addend.


the interferometer of Fig. 3.36(a) are also calculated with Eq. (3.47). In terms of the interference by reflection, a result similar to that shown in Fig. 3.38 will also be obtained because, of course, it will be the complement of the interferograms generated by transmission.

3.5.3 Two-wave interference

It has already been mentioned that in a plate with parallel flat faces, when the reflectance of the faces is low, in practice only the first two reflected or transmitted waves interfere. In both cases, the analysis of the formation of interference fringes can be carried out by considering the two virtual sources that are generated by reflection or by transmission. This section deals with a situation of special interest: the interference of the first two spherical waves reflected by a plate of low reflectance ($r \approx 0.2$). This case will allow us to analyze the interference produced by extended polychromatic light sources. According to the previous sections, there are two possible configurations in which to observe the interference, which are illustrated in Figs. 3.44(a) and (b). In both, there are two virtual sources (on the opposite side to the source S) generated by the reflection on each of the faces of the plate. In (a), two rays arrive at P at different angles from the virtual sources. For any point P on the source side the same will happen, so there will be interference fringes anywhere on the source side (nonlocalized fringes). In (b), two parallel rays are focused by the lens at P; therefore, there will only be interference fringes in the focal plane (localized fringes) that depend on the angle of the rays (fringes of equal inclination). In this section, we consider fringe formation for the configuration shown in Fig. 3.44(a). With the help of Fig. 3.45, it is possible to see that, due to the symmetry around the optical axis (the line through S and the virtual sources), the interference fringes in a plane orthogonal to the optical axis containing P are circular.
The separation between the virtual sources $a = \overline{S'_1 S'_2}$ is given by

Figure 3.44 Interference by reflection: (a) nonlocalized fringes for any point P and (b) localized fringes in the focal plane of the positive lens.


Figure 3.45 Fringe formation by reflection. (a) Circular fringe in an observation plane at a certain distance from the plate. (b) Circular fringe in an observation plane close to the first face of the plate.

$a = 2d\tan\theta_t/\tan\theta_i$ [Eq. (3.87)], and the radius of the interference rings is given by Eq. (3.47), with $z_0 \geq (\overline{SO} + a/2)$. The formation of an interference ring in an observation plane at a certain distance $z_0 > (\overline{SO} + a/2)$ is shown in Fig. 3.45(a). Changing the distance $z_0$ changes the scale of the rings (Fig. 3.15). In the limit, when the observation plane coincides with the first face of the plate [$z_0 \approx (\overline{SO} + a/2)$], the fringes will have the smallest possible size [Fig. 3.45(b)]. If the plate has irregularities in its optical thickness (refractive index variations and geometrical thickness variations), the interference fringes are distorted. In particular, if the observation plane coincides with the first face of the plate, then the interference fringes in that plane allow the irregularities of the plate to be measured with respect to the pattern of regular rings that would be obtained for an ideal plate with parallel flat faces. Furthermore, if the distance from the source S to the plate is such that the wavefronts are practically flat, then the interference pattern will be a topographic map of optical thickness (similar to contour lines on geographic maps). This allows direct measurement of the optical quality of the plate.

An experiment to observe reflection interference from a glass plate ($n = 1.51$) is shown in Fig. 3.46. In (a), the complete setup is shown: a laser beam (He-Ne, 632.8 nm), a focusing lens (focal length, 8 mm), a microscope slide (1 mm thick), an imaging lens (150 mm focal length and 60 mm in diameter), and an observation screen (frosted glass). The laser beam is focused with the positive lens to generate the point source. The diverging light (spherical wave) reaches the plate, where the two relevant reflections for the interference take place. In (b), a detail of the plate is shown, to which a label (with the word “óptica”) has been attached that will serve to form the


Figure 3.46 (a) Experimental setup to view reflection interference in a plate. (b) Detail of the plate with a label (the word “óptica”). (c) Interferogram on the observation screen about 60 cm from the plate, without the imaging lens. (d) Image (formed with the imaging lens) of the interferogram on the first side of the plate.

image of the plate on the observation screen. In (c), the interference pattern on the display screen is shown when the imaging lens has been removed; i.e., when light travels freely from the plate to the display screen. In (d), the image of the interference pattern is observed just on the first side of the plate. For this, the imaging lens is used. Sure enough, the label image next to the interferogram can be seen. Because the fringes are not localized, in (d) we have a scaled version of (c). This interferogram shows the defects of the plate that can be caused by variations in the refractive index or by irregularities in the flatness of the faces. It is worth noting that the aperture of the imaging lens must be large enough to collect all (or nearly all) of the reflected beams on the slide if the full


interferogram is to be observed. If instead of this lens one of our eyes is placed to look at the plate, the interferogram will not be seen because the beam of rays that enters is very small, limited by our pupil.

3.6 Interference from N Point Sources

In the previous sections, the phenomenon of interference by two point sources through the Michelson interferometer and by $J$ point sources with $J \to \infty$ through a plate with parallel faces has been considered. Increasing the number of sources, with a number that tends to infinity, and making the contribution of each wave to the interference relevant lead to a thinning of the interference rings. If the reflectance approaches 1, the fringe profile tends to a distribution of very narrow bands (like Dirac deltas). This section studies the intermediate case: the interference of mutually coherent waves generated by $N$ point sources ($2 < N < \infty$). In particular, there are two situations to deal with: a set of sources evenly spaced in the axial direction, as shown in Fig. 3.47(a), and an array of evenly spaced sources in the transverse direction, as shown in Fig. 3.47(b). In both cases, let us assume that the point sources have the same amplitude and initial phase, so the field at a point P on the observation screen becomes (omitting the time term $\omega t$)

$$E = E_0^{\dagger}\sum_{j=1}^{N}\frac{e^{iks_j}}{s_j}. \tag{3.95}$$

In a way, interference with the first source array has already been discussed in the previous section. The distance sj is determined by Eq. (3.92),

Figure 3.47 An array of N point sources: (a) axial and (b) transversal.


$$s_j = \sqrt{R_j^2 + r^2},$$

where $R_j$ is the radius of curvature of the wavefront with the center at $S_j$ and the vertex at point O on the observation screen. The radial distance $r = (x^2 + y^2)^{1/2}$ measures the separation between O and P. If $a$ is the separation between consecutive sources, then

$$R_j = R_1 + (j - 1)a. \tag{3.96}$$

Normalized interference patterns (and their profiles) are shown in Fig. 3.48 when $a = 1000\lambda$, $\lambda = 632.8$ nm, $R_1 = 100$ mm, and $N = 2$, 3, and 5. As expected, increasing the number of sources decreases the width of the fringes. But now there is a remarkable fact: between the maxima of the brightest fringes, fringes of much lower intensity appear. The brightest fringes are called principal maxima, and the fainter ones are called secondary maxima. In fact, there are ($N - 2$) secondary maxima between two consecutive principal maxima. As the number of sources increases, the intensity of the secondary maxima decreases, and if $N \to \infty$, the secondary maxima disappear and there will be very narrow (diffraction-limited) fringes, as shown in Fig. 3.37(c).
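A profile of this kind can be sketched by summing the $N$ spherical waves of Eq. (3.95) directly, with $s_j$ from Eq. (3.92) and $R_j$ from Eq. (3.96). In the minimal example below, the constant $E_0^{\dagger}$ is set to 1, `rho` denotes the radial distance on the screen, and the function name is an illustrative choice.

```python
import cmath
import math


def axial_array_irradiance(rho, n_sources, a, r1, wavelength):
    """Relative irradiance |sum_j exp(i k s_j) / s_j|^2 for an axial array."""
    k = 2.0 * math.pi / wavelength
    field = 0j
    for j in range(1, n_sources + 1):
        r_j = r1 + (j - 1) * a      # Eq. (3.96)
        s_j = math.hypot(r_j, rho)  # Eq. (3.92)
        field += cmath.exp(1j * k * s_j) / s_j
    return abs(field) ** 2


wavelength = 632.8e-6  # mm
a = 1000 * wavelength  # separation between sources, mm
for n in (2, 3, 5):
    center = axial_array_irradiance(0.0, n, a, 100.0, wavelength)
    profile = [axial_array_irradiance(0.05 * i, n, a, 100.0, wavelength) / center
               for i in range(5)]
    print(n, [round(v, 3) for v in profile])
```

Because $a$ is an integer number of wavelengths in this example, all the waves arrive in phase at the center of the screen, so the profile is normalized to its central value.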

Figure 3.48 Interference of spherical waves generated by 2-, 3-, and 5-point sources located axially and uniformly separated by a distance $a = 1000\lambda$. The distance from the first source to the observation screen is $R_1 = 100$ mm.


For the second array of point sources [Fig. 3.47(b)], the radii of curvature of the wavefronts with the vertices in the observation screen are equal to $z_0$. Assuming that the linear array of sources is set along the x direction, the distance from the source $S_j$ to the point P will be

$$s_j = \sqrt{z_0^2 + \left(x - (j - 1)a\right)^2 + y^2}. \tag{3.97}$$
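The same direct summation applies to the transverse array: only the distance $s_j$ changes, now given by Eq. (3.97). The sketch below again sets $E_0^{\dagger} = 1$; the function name and the sample values are illustrative assumptions.

```python
import cmath
import math


def transverse_array_irradiance(x, y, n_sources, a, z0, wavelength):
    """Relative irradiance of Eq. (3.95) with s_j taken from Eq. (3.97)."""
    k = 2.0 * math.pi / wavelength
    field = 0j
    for j in range(1, n_sources + 1):
        s_j = math.sqrt(z0 ** 2 + (x - (j - 1) * a) ** 2 + y ** 2)
        field += cmath.exp(1j * k * s_j) / s_j
    return abs(field) ** 2


wavelength = 632.8e-6  # mm
a = 8 * wavelength     # separation between sources, mm
for x in (0.0, 2.0, 4.0, 6.0):
    value = transverse_array_irradiance(x, 0.0, 2, a, 50.0, wavelength)
    print(f"x = {x:.1f} mm: relative irradiance = {value:.3e}")
```

Since the first source sits at $x = 0$ in Eq. (3.97), the pattern computed this way is symmetric about the midpoint of the array rather than about $x = 0$.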

Normalized interference patterns (and their profiles) for 2- and 4-point source arrays are shown in Fig. 3.49. In (a) and (b), the sources are $a = 8\lambda$ apart and at a distance $z_0 = 50$ mm from the observation screen. Again, going from two to four sources, the principal fringes become thinner and two secondary fringes emerge, but the position of the principal fringes is maintained. In (c), the quantities $a$ and $z_0$ for the set of four sources have been multiplied by 10, but the observation region is the same as in (a) and (b). With this change, the shape of the fringes changes from hyperbolic to straight lines. Now the principal fringes are the same distance apart, and of course, the secondary maxima are still visible.

3.6.1 Plane wave approximation

In the array shown in Fig. 3.47(b), when the observation region is much smaller than $z_0$, the waves from the N sources when they reach the observation

Figure 3.49 Interference of spherical waves generated by 2- and 4-point source arrays located laterally. (a and b) The distance of the sources from the observation screen is $z_0 = 50$ mm, and the sources are uniformly separated by a distance $a = 8\lambda$. (c) The distance of the sources from the observation screen is $z_0 = 500$ mm and the sources are uniformly separated by a distance $a = 80\lambda$, but the same observation region of (a) and (b) is maintained.


screen can be approximated by plane waves. This explains why in Fig. 3.49(c) the fringes resemble a pattern of equally spaced straight fringes. If, in addition to the condition mentioned above, the extension of the line array of sources satisfies $(N - 1)a \ll z_0$, then the amplitudes of the N waves at point P will be approximately equal and $s_j$ in the denominator of Eq. (3.95) can be changed to $z_0$. By limiting the analysis to the x direction,

$$s_j = z_0\sqrt{1 + \frac{x^2 - 2(j - 1)xa + (j - 1)^2a^2}{z_0^2}}, \tag{3.98}$$

which in first approximation is

$$s_j = z_0 + \frac{x^2}{2z_0} - \frac{(j - 1)xa}{z_0}. \tag{3.99}$$

With this, Eq. (3.95) can be written as

$$E = \frac{E_0^{\dagger}}{z_0}e^{ikz_0}e^{ikx^2/2z_0}\sum_{j=1}^{N}e^{-ikx(j-1)a/z_0}. \tag{3.100}$$

Defining $I_0 = (\epsilon_0c/2)E_0^{\dagger 2}/z_0^2$, the irradiance $I = (\epsilon_0c/2)(E^*E)$ is

$$I = I_0\left|\sum_{j=1}^{N}e^{-ikx(j-1)a/z_0}\right|^2. \tag{3.101}$$

The sum turns out to be

$$\sum_{j=1}^{N}e^{-ikx(j-1)a/z_0} = \frac{1 - e^{-ikxNa/z_0}}{1 - e^{-ikxa/z_0}}, \tag{3.102}$$

which can be rewritten as

$$e^{-ikx(N-1)a/2z_0}\,\frac{i2\sin(kxNa/2z_0)}{i2\sin(kxa/2z_0)}. \tag{3.103}$$

In the end, the irradiance is

$$I = I_0\left[\frac{\sin(kxNa/2z_0)}{\sin(kxa/2z_0)}\right]^2. \tag{3.104}$$
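As a check on the algebra leading to Eq. (3.104), the closed form can be compared numerically with the direct sum of $N$ unit phasors in Eq. (3.101). The sketch below takes $I_0 = 1$ by default. At the points where numerator and denominator vanish together it returns the limiting value $N^2I_0$, which is an implementation choice of this example. Function names and sample values are illustrative.

```python
import cmath
import math


def n_source_irradiance(x, n_sources, a, z0, wavelength, i0=1.0):
    """Closed-form irradiance of Eq. (3.104)."""
    half_phase = math.pi * x * a / (wavelength * z0)  # k x a / (2 z0)
    den = math.sin(half_phase)
    if abs(den) < 1e-12:
        return i0 * n_sources ** 2  # limit where the ratio becomes 0/0
    return i0 * (math.sin(n_sources * half_phase) / den) ** 2


def direct_sum_irradiance(x, n_sources, a, z0, wavelength, i0=1.0):
    """Irradiance from the explicit phasor sum of Eq. (3.101)."""
    k = 2.0 * math.pi / wavelength
    total = sum(cmath.exp(-1j * k * x * (j - 1) * a / z0)
                for j in range(1, n_sources + 1))
    return i0 * abs(total) ** 2


wavelength = 632.8e-6  # mm
a, z0 = 80 * wavelength, 1000.0
for x in (0.0, 1.0, 3.3, 6.25):
    closed = n_source_irradiance(x, 5, a, z0, wavelength)
    direct = direct_sum_irradiance(x, 5, a, z0, wavelength)
    print(f"x = {x:5.2f} mm: closed form = {closed:.6f}, direct sum = {direct:.6f}")
```

The two columns agree to rounding error, which also confirms that the sign chosen for the exponent does not affect the irradiance.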

Principal maxima

The zeros in the denominator of Eq. (3.104) determine the position of the principal maxima. These are obtained by


$$\frac{ka}{2z_0}x = m\pi, \tag{3.105}$$

where $m$ is an integer. Then, the irradiance at these points would be

$$I = I_0\left[\frac{\sin(Nm\pi)}{\sin(m\pi)}\right]^2. \tag{3.106}$$

Because both the denominator and the numerator become zero for integer $m$, from L'Hôpital's rule,

$$\lim_{m \to \text{integer}}\frac{\sin(Nm\pi)}{\sin(m\pi)} = \frac{N\cos(Nm\pi)}{\cos(m\pi)} = \pm N. \tag{3.107}$$

Therefore, because the ratio is squared,

$$I = N^2I_0 \tag{3.108}$$

is the value of the principal maxima found at

$$x_m = m\frac{\lambda z_0}{a}. \tag{3.109}$$

That is, the separation between consecutive maxima turns out to be

$$\Delta x = \frac{\lambda z_0}{a}. \tag{3.110}$$

There are other zeros in the numerator, between the zeros in the denominator, which are given by

$$N\frac{ka}{2z_0}x = m'\pi, \tag{3.111}$$

with $m'$ (integer) $< N$, located at

$$x_{m'} = \frac{m'}{N}\frac{\lambda z_0}{a}. \tag{3.112}$$

Thus, there will be ($N - 1$) minima ($I = 0$) between two consecutive principal maxima. Consequently, there should be ($N - 2$) secondary maxima interspersed with the minima. The profile of the interference patterns for $N = 2$-, 3-, and 5-point sources is shown in Fig. 3.50 when $a = 80\lambda$, $z_0 = 1000$ mm, and $\lambda = 632.8$ nm. In all cases, because it is assumed that the amplitude of the sources is the same, the irradiance produced by a single source at a point on the observation screen is $I_0$.
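For the parameters just quoted, Eq. (3.110) gives a spacing $\Delta x = \lambda z_0/a = 1000/80 = 12.5$ mm between principal maxima. The count of ($N - 2$) secondary maxima can also be verified numerically by sampling Eq. (3.104) between two consecutive principal maxima and counting interior local maxima. The sketch below does this with $I_0 = 1$; the helper names and the number of samples are illustrative choices.

```python
import math


def profile(x, n_sources, spacing):
    """Eq. (3.104) with I0 = 1, written in terms of the spacing of Eq. (3.110)."""
    half_phase = math.pi * x / spacing  # equals k x a / (2 z0)
    den = math.sin(half_phase)
    if abs(den) < 1e-12:
        return float(n_sources ** 2)  # principal maximum
    return (math.sin(n_sources * half_phase) / den) ** 2


def count_secondary_maxima(n_sources, spacing, samples=2001):
    """Count interior local maxima between two consecutive principal maxima."""
    values = [profile(spacing * i / (samples - 1), n_sources, spacing)
              for i in range(samples)]
    return sum(1 for i in range(1, samples - 1)
               if values[i - 1] < values[i] > values[i + 1])


wavelength = 632.8e-6  # mm
z0 = 1000.0            # mm
a = 80 * wavelength    # mm
spacing = wavelength * z0 / a  # Eq. (3.110)
print(f"spacing between principal maxima: {spacing:.2f} mm")
for n in (2, 3, 5):
    minima = [m * spacing / n for m in range(1, n)]  # Eq. (3.112)
    print(n, "minima at (mm):", [round(v, 3) for v in minima],
          "secondary maxima:", count_secondary_maxima(n, spacing))
```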


Figure 3.50 Interference pattern profile of $N = 2$, 3, and 5 equally spaced point sources of equal amplitude.

The results shown in Fig. 3.50 are the basis for diffraction gratings, a topic that is covered in Chapter 4.

3.7 Interference with Extended Light Sources

Until now, the discussion has been limited to the interference of waves emitted by a monochromatic point source. This situation is very common in the laboratory, where we have laser sources and optical elements to form a point source. Outside the laboratory, white light (sunlight) interference can be seen in soap bubbles or oil films on water. In both cases, there should be a thin film with a thickness of about half the coherence length of the light. Direct or diffuse sunlight illuminating the thin film arrives from different directions and can be modeled as an extended incoherent source, i.e., a set of spatially distributed incoherent point sources. The color interference pattern recorded by a camera focusing on an oil film on wet asphalt on a rainy day is shown in Fig. 3.51. To explain the formation of the fringes shown in Fig. 3.51, consider Fig. 3.52, which shows a plate with parallel plane faces, of index $n_l$, resting on a block of index $n_m$. An extended source, represented by the curve where S and S′ are, illuminates the plate. Let us analyze this figure in parts. First, suppose that there is only the point source S, which generates a pattern of nonlocalized fringes. Pay attention to the interference pattern near the front face of the plate, as in Fig. 3.45(b). Replacing S with another point source S′, there would be, analogously, another pattern of nonlocalized fringes. If the two sources are left active, assuming that they are incoherent with each other, there will be a superposition in P of the


Figure 3.51 Interference generated by an oil stain on wet asphalt (water) with white light (sun).

interference patterns generated individually by each source. By adding other mutually incoherent sources, until completing the extended source, the effect of the superposition of the different interference patterns will be an illuminated region without modulation of the irradiance by interference; i.e., with the extended source, the nonlocalized interference fringes that might be observed with a point source disappear. Returning to the situation with only the point source S, the nonlocalized interference fringes near the front face of the plate can be observed by imaging the plate through a lens L, as shown in Fig. 3.52. If the aperture diaphragm is small (as in the eye or in a photographic camera), the region illuminated by S that can be observed is limited to point P and its neighborhood. This neighborhood is defined by the rays leaving S, striking the plate, and reflecting through the aperture diaphragm. The angular size of the neighborhood is defined by the angle subtended by the virtual image of the aperture diaphragm (the image generated by the first face of the plate) with respect to the source S. Let us assume that the neighborhood of point P is very small. The interference at P can be explained as the superposition of the chief ray (axislike line) reflected from the first face of the plate and the neighboring ray reflected from the second face, with both rays leaving S. Rays leaving S at other angles do not fall within the neighborhood of P. To see the interference in other regions of the plate, rays from other sources are needed; e.g., with the source S′, interference in P′ can be observed. By considering all the point sources that make up the extended source, the full interferogram would be


Figure 3.52 Interference fringes with an extended source.

observed, similar to what happens with a single point source (and a large aperture lens). Now the size of the observation screen (field stop) defines the region in which the interferogram would be seen. Finally, if the point sources are polychromatic, each color satisfies the interference conditions, and because the position of the fringes depends on the wavelength, the final result is color fringes. As the wavelength increases, the size of the interference fringes decreases [Eq. (3.56)]. Reflection interference fringes generated by the film of air between two microscope slides are shown in Fig. 3.53. In (a), fringes on an observation screen separated from the slides by about 25 cm when illuminated with a point source ($\lambda_0 = 632.8$ nm) are shown. In (b), fringes on the slides recorded with a photographic camera when illuminated with white light are shown. To generate the air wedge, a very small drop of water was placed near the edge of

Figure 3.53 Reflection interference patterns generated by the film of air between two microscope slides put in contact. (a) Fringes on an observation screen (25 cm from the slides) when illuminating with a point source (wavelength, 632.8 nm). (b) Fringes recorded by a camera when the microscope slides are illuminated with white (solar) light.

Figure 3.54 Interference fringes with white light (sun) when viewed from different angles with respect to the normal of the slides.

one of the slides, and then the other slide was placed on top of the first and mechanical pressure was exerted on the slides, thus spreading the drop of water. Beyond the boundary of the water stain there is air, and it is there that the interference fringes are observed. The two patterns in (a) and (b) are observed in a direction close to the normal of the slides. When the color fringes are viewed from another angle, the position of the fringes changes, as shown in Fig. 3.54. In (a), the fringes are observed at an approximate angle of 10° with respect to the normal of the slides, which can be considered close to the normal. In this case, the fringes are called fringes of equal thickness and can be used to make a topographic map of the air film. In (b) and (c), the photographic camera was tilted approximately 45° and 70°, respectively. This change in fringe size depends not only on the change in the optical path in the angled film, but also on the thickness of the layer above it. In the case of the oil film shown in Fig. 3.51, there is no other medium (other than air), so changing the viewing angle changes the size of the fringes less noticeably. Thus, with light from an extended polychromatic source, localized color fringes can be seen on a film (of air, soap, oil, etc.). Viewed in the direction normal to the film, we essentially have fringes of equal thickness.

3.7.1 Artificial extended sources

Direct and diffuse white light (sun) is an example of an extended source, in which the point sources are practically at infinity. One way to obtain an artificial extended source is through the use of a ground glass (diffusing plate), i.e., a glass plate in which one or both smooth surfaces have been modified through a sanding process (using some abrasive material, such as silicon oxide). The plate is therefore seen as a translucent surface (like the viewing screen used in Fig. 3.46 to see interferograms).
By illuminating the ground glass, the microroughnesses of the ground face act as secondary light sources that

The phase difference changes in a similar way to that expressed by Eq. (3.79) for parallel rays. As the angle of incidence increases, the phase difference decreases, so the fringes increase in size.


emit roughly spherical waves. Illumination can be done with an extended source (white light, incandescent lamp, discharge lamp, etc.) or with a spherical wave generated by a laser. The first case is usually used to homogenize the lighting. The second case is used to get an extended monochromatic source, but now the different secondary point sources are coherent with each other. The superposition of the fields generated by the multitude of these secondary coherent sources generates an interference pattern with a grainy (random) structure known as a speckle pattern. The average size of the speckles depends inversely on the size of the illuminated region on the ground glass. In general, to obtain an extended monochromatic source, the illuminated region has to be large enough that the average size of the speckles is very small, so that what is observed is a homogeneous illumination (of very small grains).

Michelson interferometer with an extended source. An example of artificial extended source interference common in teaching laboratories employs a Michelson interferometer and a diffuser plate illuminated with a discharge lamp (usually a sodium or mercury vapor lamp). In these lamps, most of the emitted radiation occurs in a few spectral lines; e.g., for the sodium lamp, most of the radiation is around the 589.0 and 589.6 nm lines (hence its intense yellow color). A schematic of the Michelson interferometer illuminated by the extended source S (lamp-illuminated diffusing screen) is shown in Fig. 3.55(a). The source S is a collection of mutually incoherent point sources with random initial phases. Mirrors M1 and M2 generate the virtual images $S'_1$ and $S'_2$ of


Figure 3.55 (a) A Michelson interferometer illuminated with an extended source S. (b) Illustration of the virtual images $S'_1$ and $S'_2$ of the extended source S generated by the mirrors M1 and M2, seen from the lens L.


Figure 3.56 An interference pattern between two microscope slides when pressed with a pencil and using white light (sun).

the source S, which are seen one behind the other from the observation screen, as shown in Fig. 3.55(b). Assuming that the coherence length of the lamp is greater than the separation between $S'_1$ and $S'_2$, the waves emitted by a pair of virtual point sources in $S'_1$ and $S'_2$ (the two images of the same point source of S) interfere. This occurs for every such pair of virtual point sources. However, there is no interference between waves emitted by different point sources in S. Consequently, without the lens L, no interference fringes will be seen on the observation screen. With the help of lens L, by placing the observation plane in the focal plane of the lens, interference fringes of equal inclination can be observed. These fringes are the result of the superposition in intensity of the patterns generated by each point source in S. Equations (3.85) and (3.86), with $n_l = n_i = 1$, describe these fringes.

The interference between two microscope slides, when pressed with a pencil tip, is shown in Fig. 3.56. Based on what is covered in this section, the reader is invited to make a qualitative description of the formation process of the color fringes and the thickness of the air film between the slides.

3.8 Young Interferometer

In the eighteenth century, the corpuscular theory of light developed by Newton prevailed. Although a wavelike behavior of light was already observed, the difficulty of observing diffraction (as it was already known in sound or in surface waves of water) motivated Newton to develop a corpuscular theory of light, which was accepted thanks to Newton’s scientific reputation. Around the same time, other scientists such as Christiaan Huygens (1629–1695) and Robert Hooke (1635–1703) advocated wave theories of light. A little earlier, Francesco Grimaldi (1618–1663) discovered the diffraction of light through small openings, which suggested a wavelike behavior of light. Despite this background, there was no strong evidence that Newton’s


Figure 3.57 Young’s experiment. (a) Geometry of the experiment. On screen 1, there is a small hole ($S_0$) that lets through some of the light emitted by the extended source S of size s. On screen 2, there are two other holes, $S_1$ and $S_2$, also small, which in turn let through part of the light that diverges from $S_0$. Light diverging from holes $S_1$ and $S_2$ is superimposed on the observation screen. (b) Simplified scheme with point sources at $S_0$, $S_1$, and $S_2$. (c) Geometry for the calculation of the optical path difference.

corpuscular theory was wrong. In 1803, Thomas Young (1773–1829) performed an experiment that seriously challenged the corpuscular theory [4]. The modern version of Young’s interferometer is illustrated in Fig. 3.57. The general scheme is shown in Fig. 3.57(a). The illumination source S is an


extended source of size s. At distance $z_s$ is screen 1, which has a small hole ($S_0$). Then, at a distance $z_p$ from screen 1 there is screen 2, which contains two other identical holes ($S_1$ and $S_2$) separated from each other by the distance a. Finally, at distance $z_0$ from screen 2 is the observation screen. The size of $S_0$ determines the spatial correlation of the fields in $S_1$ and $S_2$. If $S_0$ is a point source, there will be the maximum spatial correlation between $S_1$ and $S_2$. On the other hand, the sizes of $S_1$ and $S_2$ determine the diffraction modulation of the interference pattern on the observation screen. The analysis of the interference pattern, taking into account the geometry of the holes and the size of the light source, is covered in Chapter 4.

In what follows, let us consider the ideal situation where the holes are so small that it can be assumed that there is one primary point source at $S_0$ and two secondary point sources at $S_1$ and $S_2$, as shown in Fig. 3.57(b). In Young’s experiment, it is typical to have symmetry about the optical axis, so that $S_1$ and $S_2$ are the same distance from the optical axis and therefore $\overline{S_0S_1} = \overline{S_0S_2}$. Thus, on the observation screen, there will be a superposition of the spherical wavefronts that diverge from $S_1$ and $S_2$. This situation is discussed in Section 3.3. Some interference patterns are shown in Fig. 3.17 by changing the distance between the sources $S_1$ and $S_2$, and the distance between the sources and the observation screen. Assuming that the source is monochromatic with wavelength $\lambda$ and has a coherence length $l_c$ greater than the optical path difference at P, i.e., $l_c > |(\overline{S_0S_2} + \overline{S_2P}) - (\overline{S_0S_1} + \overline{S_1P})| = |\overline{S_2P} - \overline{S_1P}|$ [Fig. 3.57(c)], the irradiance maxima on the screen along the $x'$ direction are given by

$$x'_m = \frac{m\lambda}{2}\sqrt{\frac{4z_0^2}{a^2 - m^2\lambda^2} + 1}, \tag{3.113}$$

according to Eq. (3.48), changing $y_0 \to z_0$ and $z_m \to x'_m$. The parameter m labels the interference fringes.
Thus, $m = 0$ corresponds to the central fringe, $m = 1, 2, 3, \ldots$ corresponds to the fringes on the right (left) side of the central fringe according to their order, and $m = -1, -2, -3, \ldots$ corresponds to the fringes on the left (right) side of the central fringe according to their order. In Young’s experiment, $z_0 \gg a \gg \lambda$ holds. If the region in which the interferogram is observed has an extent much smaller than $z_0$, the observed fringes are evenly spaced parallel fringes (as in the right image in Fig. 3.17 when the separation between the sources and the observation screen is 200 mm and $a = 32\lambda$). In this case, the position of the mth fringe can be obtained by making the corresponding approximations in Eq. (3.113), i.e.,

$$x'_m = m\frac{z_0\lambda}{a}, \tag{3.114}$$

and the separation between two consecutive fringes would be

$$\Delta x' = \frac{z_0\lambda}{a}. \tag{3.115}$$
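As a numerical illustration of how close the paraxial result of Eq. (3.114) stays to the exact hyperbolic positions of Eq. (3.113), the following minimal sketch evaluates both. The wavelength (a He-Ne line), the hole separation, and the screen distance are assumed values chosen only for illustration; they do not come from the text, and the function names are hypothetical.

```python
import math

def fringe_position_exact(m, wavelength, a, z0):
    """Position of the m-th irradiance maximum from Eq. (3.113)."""
    return 0.5 * m * wavelength * math.sqrt(
        4.0 * z0**2 / (a**2 - (m * wavelength) ** 2) + 1.0
    )

def fringe_position_paraxial(m, wavelength, a, z0):
    """Position of the m-th maximum from the approximation of Eq. (3.114)."""
    return m * z0 * wavelength / a

# Assumed illustration values (not from the text):
wavelength = 632.8e-9  # m, He-Ne laser line
a = 0.5e-3             # m, separation between S1 and S2
z0 = 1.0               # m, distance from screen 2 to the observation screen

# Fringe spacing from Eq. (3.115)
spacing = fringe_position_paraxial(1, wavelength, a, z0)
print(f"fringe spacing: {spacing * 1e3:.4f} mm")

for m in range(1, 4):
    exact = fringe_position_exact(m, wavelength, a, z0)
    approx = fringe_position_paraxial(m, wavelength, a, z0)
    rel = abs(exact - approx) / exact
    print(f"m={m}: exact={exact:.9e} m, paraxial={approx:.9e} m, rel. diff={rel:.1e}")
```

With $z_0 \gg a \gg \lambda$, as in these assumed values, the two expressions differ only by a very small relative amount for the first few orders, which is why evenly spaced fringes are observed near the axis.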

These results [Eqs. (3.114) and (3.115)] were also obtained in Section 3.6 [Eqs. (3.109) and (3.110)], in which spherical wavefronts were approximated by planar wavefronts, an approximation that also applies in Young’s experiment. With this in mind, the irradiance on the observation screen of Young’s experiment is given by Eq. (3.104) with $N = 2$, i.e.,

$$I = I_0\left[\frac{\sin(kx'a/z_0)}{\sin(kx'a/2z_0)}\right]^2, \tag{3.116}$$

with $I_0 = (\epsilon_0 c/2)E_0^{\dagger 2}/z_0^2$, where $E_0^{\dagger}$ is the field amplitude (multiplied by the unit of length) of sources $S_1$ and $S_2$. Using the trigonometric identity $\sin(2\alpha) = 2\sin\alpha\cos\alpha$ in the numerator of Eq. (3.116) leads to

$$I = 4I_0\cos^2\left(\frac{\pi a}{z_0\lambda}x'\right). \tag{3.117}$$

Another approximate way to determine the irradiance is obtained directly from Fig. 3.57(c). Taking into account the condition $z_0 \gg a$, the field at P due to $S_1$ is $E_1(P) = (E_0^{\dagger}/z_0)e^{iks_1}$ and the field at P due to $S_2$ is $E_2(P) = (E_0^{\dagger}/z_0)e^{iks_2}$. Therefore, the irradiance at P, given by $I = (\epsilon_0 c/2)|E_1(P) + E_2(P)|^2$, would be

$$I = 2I_0\left[1 + \cos\big(k(s_2 - s_1)\big)\right]. \tag{3.118}$$

Approximating $s_2 - s_1 \approx a\alpha \approx ax'/z_0$ leads to

$$I = 2I_0\left[1 + \cos\left(\frac{2\pi a}{\lambda z_0}x'\right)\right], \tag{3.119}$$

which is equivalent to Eq. (3.117).

3.8.1 Division of wavefront and division of amplitude

In Young’s interferometer, the way the two secondary sources $S_1$ and $S_2$ are generated differs remarkably from that of the Michelson interferometer. In the Michelson interferometer (as in the plate with parallel faces), the secondary sources are virtual images of a primary source obtained by the beamsplitter and the mirrors M1 and M2 (Fig. 3.18). The beamsplitter divides the amplitude of the incident wave (into a reflected wave and a transmitted wave). Interferometers based on this principle are also called amplitude-splitting interferometers. In contrast, in Young’s interferometer, the secondary sources $S_1$ and $S_2$ are obtained by isolating regions of the wavefront emitted


by the primary source $S_0$. Secondary sources are then said to be obtained by dividing the wavefront. Interferometers based on this principle are also called wavefront-splitting interferometers. In amplitude-splitting interferometers, the secondary sources are copies of the primary source, so the interference between the secondary waves depends on the temporal correlation of the superimposed fields. If the delay time between the waves is greater than the coherence time (Fig. 3.6), no interference will be observed. In the case of wavefront-splitting interferometers, the interference also depends on the spatial similarity of the optical fields in $S_1$ and $S_2$. The spatial correlation of the fields in $S_1$ and $S_2$ measures the spatial coherence of the waves. In particular, in Young’s interferometer, the size of the observation region of the interferogram is usually such that for any point P within the region, the coherence length is greater than the optical path difference, so the coherence of the interfering waves is basically determined by the spatial correlation of the fields in $S_1$ and $S_2$. Thus, coherence has two aspects, one temporal and one spatial, which can be measured with the help of Michelson’s interferometer and Young’s interferometer, respectively.

3.9 Other Interferometers

An interferometer can be described as an optical system that splits a primary wave, in amplitude or in wavefront, into two or more secondary waves and then, after the secondary waves have traveled different optical paths, superimposes them to produce an interference pattern. The geometry of the fringes reveals some parameters of the interferometric system, e.g., quality of the optical surfaces, homogeneity of the optical glass components (variation of the refractive index), curvatures of the surfaces, etc. In this section, some types of interferometers other than the Michelson and Young interferometers are described.

3.9.1 Fabry–Pérot interferometer

Figure 3.58 shows a system consisting of two flat mirrors, a spacer ring about 0.5 mm thick, and a cylindrical support. The mirrors are thick (wedge-shaped) plates of glass with a metallic (aluminum) coating on one side to produce high reflectance ($0.9 < r < 1$). The spacer ring is placed between the two mirrors (in contact with their metallized faces), guaranteeing a fixed distance between the mirrors. The mirror and ring assembly is placed inside the cylindrical support and adjusted. This device is known as a Fabry–Pérot interferometer, in its simplest form. The operation of the interferometer is explained with the concepts developed for the plate with parallel faces, but with $n_l = 1$ (air between the mirrors). A schematic of an experimental setup is shown in Fig. 3.59(a) for simultaneously observing interference fringes (nonlocalized) by reflection and


Figure 3.58 A basic Fabry–Pérot interferometer. It consists of two flat mirrors of semisilvered glass on one of the faces, a spacer ring (0.5 mm thick), and a cylindrical support that adjusts the mirror system with the spacer in the middle.


Figure 3.59 (a) Optical system for simultaneously observing reflected and transmitted interference. (b) Reflection interferogram on screen 1. (c) Transmission interferogram on screen 2.


transmission with the Fabry–Pérot interferometer shown in Fig. 3.58. A laser beam is focused at the center of a hole about 3 mm in diameter located in an opaque screen (screen 1). The point source S will be there. The divergent spherical wavefront is transmitted and reflected several times at the interferometer plates. The interferograms seen by reflection on screen 1 and by transmission on screen 2 are shown in Figs. 3.59(b) and (c), respectively. In (b), the dark dot corresponds to the hole in which the point source is located. In effect, the two interferograms complement each other, as in Figs. 3.37(c) and 3.38(c).

Fabry–Pérot interferometer as a resonant cavity

A very important application of the Fabry–Pérot interferometer is in lasers, where the interferometer is the resonant cavity. To see this, let us consider Fig. 3.34, which shows that for $r = 0.96$, a very narrow band of transmitted irradiance is obtained. That figure is plotted as a function of the thickness of the plate. However, if the thickness of the plate is fixed and the irradiance is plotted as a function of the frequency of the wave, a similar plot is obtained, since the phase given in Eq. (3.79) also depends linearly on the frequency $\nu = c/\lambda$. That is, for very high reflectances of the mirrors, the transmitted irradiance only occurs in very narrow frequency bands, which explains the high degree of monochromaticity of lasers. This book does not deal with the details of the Fabry–Pérot interferometer as a resonant cavity, which can be found in specialized works on lasers [5]. An introduction to the subject can also be obtained by consulting elementary optics books [6].

3.9.2 Antireflective thin film

One application of parallel face plate interference is in antireflective coatings. The simplest system consists of a plate of parallel faces whose thickness is of the order of a wavelength, called a thin film, placed on top of another plate of parallel faces, called a substrate, as illustrated in Fig. 3.60.

Figure 3.60 A thin film on a substrate. When the film thickness is $d = \lambda/4n_d$, $E_{r1}$ and $E_{r2}$ interfere destructively.


The reflected field is $E_{r1} = r_1E_0$ on the first face and $E_{r2} = t_1r_2t'_1E_0$ on the second face. Thus, the total reflected field (omitting the reflection at the base of the substrate) would be

$$E_r = r_1E_0 + t_1r_2t'_1E_0e^{i\delta},$$

where

$$\delta = \frac{4\pi n_d}{\lambda}d.$$

The reflected irradiance would be $I_r = I_0R$, with $I_0 = (\epsilon_0 c/2)E_0^2$ and the reflectance

$$R = r_1^2 + (t_1t'_1r_2)^2 + 2r_1t_1t'_1r_2\cos\delta.$$

At normal incidence, the reflection and transmission coefficients (for the parallel polarization state) are

$$r_1 = \frac{n_d - n}{n_d + n}; \quad r_2 = \frac{n' - n_d}{n' + n_d}; \quad t_1 = \frac{2n}{n_d + n}; \quad t'_1 = \frac{2n_d}{n_d + n}.$$

In practice, it is common to use MgF2 (magnesium fluoride), whose index is $n_d = 1.38$, for the thin film. If this film is deposited on a glass substrate ($n' = 1.5$) and $n = 1$ (air), the relation $n < n_d < n'$ holds, so the reflectance takes a minimum value when $\delta = \pi$; i.e., the thickness of the film must be

$$d = \frac{\lambda}{4n_d}.$$
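The quarter-wave condition can be checked numerically with the two-beam reflectance written above. The following is a minimal sketch, not a full multilayer calculation: it evaluates $R$ as a function of film thickness for the MgF2-on-glass values quoted in the text. The design wavelength of 550 nm and the function name are assumptions made for illustration, and the bare-glass value uses the standard normal-incidence reflectance $[(n' - n)/(n' + n)]^2$ as a point of comparison.

```python
import math

def two_beam_reflectance(d, wavelength, n, n_d, n_sub):
    """Reflectance R = r1^2 + (t1 t1' r2)^2 + 2 r1 t1 t1' r2 cos(delta)
    for a film of index n_d and thickness d on a substrate of index n_sub."""
    r1 = (n_d - n) / (n_d + n)
    r2 = (n_sub - n_d) / (n_sub + n_d)
    t1 = 2.0 * n / (n_d + n)
    t1_prime = 2.0 * n_d / (n_d + n)
    delta = 4.0 * math.pi * n_d * d / wavelength
    b = t1 * t1_prime * r2
    return r1**2 + b**2 + 2.0 * r1 * b * math.cos(delta)

wavelength = 550e-9                # m, assumed design wavelength
n, n_d, n_sub = 1.0, 1.38, 1.5     # air, MgF2, glass (values from the text)

d_quarter = wavelength / (4.0 * n_d)

# Scan thicknesses from 0 to a half-wave optical thickness in 200 steps
thicknesses = [i * wavelength / (2.0 * n_d) / 200 for i in range(201)]
reflectances = [two_beam_reflectance(d, wavelength, n, n_d, n_sub) for d in thicknesses]
d_min = thicknesses[reflectances.index(min(reflectances))]

r_min = two_beam_reflectance(d_quarter, wavelength, n, n_d, n_sub)
r_bare = ((n_sub - n) / (n_sub + n)) ** 2  # uncoated glass, for comparison

print(f"quarter-wave thickness: {d_quarter * 1e9:.1f} nm")
print(f"thickness of minimum R in the scan: {d_min * 1e9:.1f} nm")
print(f"R at quarter-wave: {r_min:.4f}; bare glass: {r_bare:.4f}")
```

In this approximation the coated surface reflects roughly 1.4% of the incident irradiance at the design wavelength, compared with 4% for the uncoated glass, so the minimum is low but not zero for MgF2.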

With a little more work, we find that the minimum takes the value of zero if

$$n_d^2 = n'n.$$

For example, for the glass substrate, the thin film should have a refractive index of 1.22, somewhat less than the refractive index of MgF2.

3.9.3 Newton and Fizeau interferometers

In general, any arrangement of two contacting optical surfaces illuminated with monochromatic light is called a Newton interferometer [7]. The name given to this type of interferometer comes from the first reported experiments by Newton [8] of bringing a pair of telescope lenses into contact. In this experiment, Newton observed colored circular fringes centered on the contact region. Something similar to this is shown in Fig. 3.56, where the center of the


Figure 3.61 A mount to observe Newton’s rings generated by the curved face of the lens in contact with an optical reference plane.

colored fringes is at the tip of the pencil. These circular fringes are also called Newton’s rings. The basic setup for observing Newton’s rings is shown in Fig. 3.61. A lens with spherical faces is placed on an optical reference plane (peak–valley error, $\lambda/100$). The surfaces in contact, spherical surface 1 and flat surface 2, form a film of air. An extended monochromatic source (mercury discharge lamp) is used to illuminate the surfaces in contact. The interference pattern formed by the superposition of externally reflected waves in the optical plane (surface 2) and internally reflected waves in the lens (surface 1) is observed with the help of a lens L. With this configuration, the quality of the curved surface of the lens that is in contact with an optical reference plane is examined. If the face of the lens is spherical, there will be an interference pattern made up of circular rings. The radius of the mth bright ring is given by

$$r_m = \sqrt{(m + 1/2)\lambda R}, \tag{3.120}$$

and the radius of the mth dark ring will be

$$r_m = \sqrt{m\lambda R}, \tag{3.121}$$

where R is the radius of curvature of the spherical surface 1 and $m = 0, 1, 2, \ldots$ It is left as a task for the reader to derive Eqs. (3.120) and (3.121). Note that now the central fringe is labeled $m = 0$ (where the contact region is), whereas for fringes generated by two virtual sources, as in Fig. 3.16, the value of m for


Figure 3.62 (a) Polishing of the diagonal surfaces of two right prisms. (b) Interference fringes generated by the diagonal surface of one of the prisms in (a) when it comes into contact with the optical reference plane of the Newton interferometer.

the central fringe depends on the relationship between the separation of the virtual sources and the wavelength. The configuration in Fig. 3.61 can also be used to assess the quality of the flat surfaces of other optical elements (lenses, planes, prisms); e.g., in the process of polishing the flat surfaces of a prism [Fig. 3.62(a)], the quality of the surfaces can be checked by putting the surface to be evaluated in contact with the optical plane of the Newton interferometer shown in Fig. 3.61. In Fig. 3.62(b), the fringe pattern obtained for the diagonal surface of one of the prisms in Fig. 3.62(a) is shown. Because these are fringes of equal thickness, a map of the topography of the surface under analysis can be obtained. In general, the reference surface should have a similar shape to the surface to be evaluated by the interference fringes. If the shape of the two surfaces

Figure 3.63 Scheme of the Fizeau interferometer to evaluate a test surface that is separated by a distance t from the optical reference plane.


differs by a large number of wavelengths, there will be many fringes and the imaging system will not be able to resolve them. If in practice it is not possible to place the two surfaces in contact, a variation of the optical system shown in Fig. 3.61 can be made, as shown in Fig. 3.63 (Fizeau interferometer). The light source must have a coherence length greater than twice the separation t between the optical reference plane and the test surface. Therefore, it is common to use a laser beam focused at S. The interference fringes will be of equal thickness.
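As a numerical companion to Eqs. (3.120) and (3.121), the sketch below tabulates the first few Newton's rings and shows how the radius of curvature could be recovered from two measured dark rings. The wavelength (the green mercury line at 546.1 nm) and the radius of curvature of 1 m are assumed values for illustration; the function names are hypothetical.

```python
import math

def bright_ring_radius(m, wavelength, R):
    """Radius of the m-th bright ring, Eq. (3.120)."""
    return math.sqrt((m + 0.5) * wavelength * R)

def dark_ring_radius(m, wavelength, R):
    """Radius of the m-th dark ring, Eq. (3.121)."""
    return math.sqrt(m * wavelength * R)

def curvature_from_dark_rings(r_m, r_n, m, n, wavelength):
    """Invert Eq. (3.121) for two dark rings: r_n^2 - r_m^2 = (n - m) * wavelength * R."""
    return (r_n**2 - r_m**2) / ((n - m) * wavelength)

# Assumed illustration values (not from the text):
wavelength = 546.1e-9  # m, green line of a mercury lamp
R = 1.0                # m, radius of curvature of surface 1

for m in range(4):
    dark = dark_ring_radius(m, wavelength, R)
    bright = bright_ring_radius(m, wavelength, R)
    print(f"m={m}: dark ring {dark * 1e3:.3f} mm, bright ring {bright * 1e3:.3f} mm")

# Recover R from the "measured" radii of the 2nd and 6th dark rings
R_est = curvature_from_dark_rings(
    dark_ring_radius(2, wavelength, R), dark_ring_radius(6, wavelength, R), 2, 6, wavelength
)
print(f"recovered radius of curvature: {R_est:.6f} m")
```

Using the difference of two squared radii, rather than a single ring, is a common design choice because it does not require the contact point to be exactly the order $m = 0$.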

References

[1] M. Born and E. Wolf, Principles of Optics, 6th ed., Pergamon Press, Oxford, England (1993).
[2] G. R. Fowles, Introduction to Modern Optics, 2nd ed., Dover Publications, Mineola, New York (1989).
[3] J. D. Jackson, Classical Electrodynamics, 3rd ed., Wiley, New York (1999).
[4] T. Young, “The Interference of light,” Chap. 7 in Great Experiments in Physics: Firsthand Accounts from Galileo to Einstein, 2nd ed., M. H. Shamos, Ed., Dover Publications, Mineola, New York, 93–107 (1987).
[5] M. Csele, Fundamentals of Light Sources and Lasers, Wiley, New York (2004).
[6] E. Hecht, Optics, Global Edition, 5th ed., Pearson, Harlow, England (2017).
[7] M. V. Mantravadi and D. Malacara, “Newton, Fizeau, and Haidinger interferometers,” Chap. 1 in Optical Shop Testing, 3rd ed., D. Malacara, Ed., Wiley, New York, 1–45 (2007).
[8] I. Newton, Opticks: or, a Treatise of the Reflections, Refractions, Inflections and Colours of Light, Sam. Smith and Benj. Walford, London (1704).

Chapter 4

Diffraction

Diffraction, like interference, is a wave phenomenon. From a mathematical point of view, the difference between interference and diffraction lies in the number of sources that generate the interfering waves. In interference there is a discrete number of sources, whereas in diffraction there is a continuous distribution of sources. In terms of the behavior of the optical field, diffraction is considered the deviation of light from its rectilinear path that is not due to reflection or refraction. In this chapter, diffraction is limited to the paraxial range, i.e., Fresnel diffraction and Fraunhofer diffraction. Detailed examples of diffraction by a circular aperture and by a rectangular aperture are given. With diffraction through a circular aperture, the formation of the image is analyzed taking into account the wave nature of light; with diffraction through a rectangular aperture, the basic mathematics for one-dimensional diffraction gratings are developed. The image of a point object (monochromatic) generated by an optical system that models a human eye with myopia, astigmatism, and spherical aberration is shown in Fig. 4.1. The effect of diffraction and aberrations reduces visual acuity in the human eye and generally reduces resolution in imaging systems.

Note on calculated diffraction patterns

Except for Section 4.5.2, which deals with image resolution, calculated diffraction patterns are shown in this chapter as grayscale images that represent the square root of the irradiance distribution. This allows regions of lower intensity to be highlighted. Plots of the irradiance profiles are shown at scale.

4.1 Huygens–Fresnel Principle

Huygens’ principle, discussed in Section 1.1.3, states that every point on a wavefront can be considered as a source of secondary spherical waves that propagate with the same speed as the wavefront. After a while, the propagated


Figure 4.1 Experimental image of a monochromatic point source generated by an optical system with astigmatism, spherical aberration, and defocus.

wavefront will be the envelope of the secondary spherical waves [1]. With this principle, the (unobstructed) propagation of a wavefront can be derived, where $\Sigma'$ is obtained from $\Sigma$, as shown in Fig. 4.2, and the laws of reflection and refraction can be derived (Fig. 1.8). On the other hand, Fresnel established that the optical field at a point P is obtained from the interference of the secondary waves [2]. In this way, Fresnel gave a satisfactory explanation of the phenomenon of diffraction. The combination of the Huygens principle and Fresnel's interference of secondary waves is called the Huygens–Fresnel principle.

Figure 4.2 The Huygens–Fresnel principle states that the field at P is the superposition (interference) of the secondary spherical waves emitted by the virtual sources located in the wavefront $\Sigma$.


The mathematical formalization of the Huygens–Fresnel principle would be carried out a century later by Kirchhoff and later refined by Rayleigh and Sommerfeld [3]. Although the study of diffraction can be started from Kirchhoff’s mathematical formalism, it is worth following Fresnel’s ideas to gain a further conceptual understanding of diffraction. The treatment in this book follows that described by Born and Wolf [4]. Let us start with the simplest situation: the propagation in vacuum of a spherical wavefront emitted by a point source. Consider the point source S depicted in Fig. 4.3. According to the optical field expression for a spherical wave, omitting the time phase term $e^{-i\omega t}$, the electric field at point P would be

$$E(P) = E_0^{\dagger}\frac{e^{ikd}}{d}, \tag{4.1}$$

where $d = \overline{SP}$ and $E_0^{\dagger}$ is the amplitude of the field multiplied by the unit of length. Using the Huygens–Fresnel principle to calculate the electric field at P, the result should be the same as in Eq. (4.1). Let $\Sigma$ be the spherical wavefront of radius $r_0$ emitted by the point source S in a given time. From this wavefront, the sources that emit the secondary Fresnel waves will be located at the points that form $\Sigma$. In particular, at point Q of the wavefront $\Sigma$ there will be a secondary emitter whose contribution to the field at P will be of the form $E(Q)e^{iks}/s$, where $E(Q) = E_0^{\dagger}e^{ikr_0}/r_0$. To obtain E(P) from the sum of all the (infinite) secondary waves, another Fresnel hypothesis is included: the amplitude of the secondary waves varies with the direction defined by the angle $\chi$, which is the angle between the normal of the wavefront $\Sigma$ at Q and the line joining Q and P (Fig. 4.3). Therefore, the amplitude of the secondary waves will be of the form $E(Q)K(\chi)$, where $K(\chi)$ is the function that determines how the amplitude variation occurs. The angle $\chi$ is called the inclination angle,

Figure 4.3 Propagation of a spherical wave. The contribution of the secondary source Q to the field at P depends on the angle $\chi$.


and the function K is called the inclination factor. With the function K, the Fresnel hypothesis is described as follows: given that the impulse communicated in any part of the primary wavefront $\Sigma$ follows the normal of the wavefront, the effect on the medium must be more intense in the direction of the normal, so the rays from Q to P will be less intense as they deviate from the normal [2]. Fresnel mentions that determining the explicit form of the function K is a “very difficult matter”; it should not be an issue in many practical situations given that the rays from Q to P deviate little from the normal, so a constant value can be kept for the function K. Following Fresnel, K is maximum when $\chi = 0$ and vanishes when the line from Q to P is tangent to the wavefront $\Sigma$, i.e., $\chi = \pi/2$. This implies that not all of the spherical wavefront $\Sigma$ contributes to the sum at P. The validity of these conditions is considered later with the analytical treatment developed by Kirchhoff. According to this, the field at P would be given by

$$E(P) = \frac{E_0^{\dagger}e^{ikr_0}}{r_0}\iint_{\Sigma}\frac{e^{iks}}{s}K(\chi)\,d\sigma, \tag{4.2}$$

where $d\sigma$ denotes the differential element of area at Q. This integral is the mathematical version of the Huygens–Fresnel principle.

4.1.1 Fresnel zones

The Huygens–Fresnel integral can be evaluated by dividing the domain into regions where the inclination factor is approximately constant. This procedure, proposed by Fresnel, gives surprising results that are observed in practice, as shown in some later sections of this chapter. The regions into which the domain is divided are called Fresnel zones, and for a spherical wavefront they are constructed as shown in Fig. 4.4. Spheres of radius $b + j\lambda/2$, with $j = 0, 1, \ldots, N$ and $b = \overline{VP}$, are drawn centered at the point P (thus, $d = r_0 + b$). The jth zone ($Z_j$) is the annular region of $\Sigma$ contained between the spheres of radius $b + (j - 1)\lambda/2$ and $b + j\lambda/2$. If $b \gg \lambda$ and $r_0 \gg \lambda$, then the following approximations are made in the inclination factor:

• $K(\chi) \approx$ constant for a given Fresnel zone, and it changes very little between consecutive zones.
• $K_j(\chi) \approx [K_{j+1}(\chi) + K_{j-1}(\chi)]/2$.

Also,

• $K_N(\chi) = 0$ if $\chi = \pi/2$.

In Fresnel’s time, a hypothetical substance called Ether, or luminiferous Ether, was believed to occupy all of space and was supposed to act as a medium for the propagation of electromagnetic waves.

Figure 4.4 Fresnel zones in a spherical wavefront.

From the geometry of Fig. 4.3,

$$s^2 = r_0^2 + (r_0 + b)^2 - 2r_0(r_0 + b)\cos\theta,$$

where $\theta$ is the polar angle. Differentiating, we obtain

$$2s\,ds = 2r_0(r_0 + b)\sin\theta\,d\theta.$$

Because the differential element of area is $d\sigma = r_0^2\sin\theta\,d\theta\,d\phi$, where $\phi$ is the azimuth angle, substituting $\sin\theta\,d\theta$ leads to

$$d\sigma = \frac{r_0}{(r_0 + b)}s\,ds\,d\phi. \tag{4.3}$$
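Equation (4.3) makes it easy to see why the zones contribute almost equally: integrating it over the azimuth and over $s$ from $b + (j-1)\lambda/2$ to $b + j\lambda/2$ gives the area of the jth zone in closed form. The sketch below evaluates that area and the polar angle of each zone edge. The wavelength and the distances $r_0$ and $b$ are assumed values for illustration, and the function names are hypothetical.

```python
import math

def zone_area(j, wavelength, r0, b):
    """Area of the j-th Fresnel zone from Eq. (4.3):
    2*pi*r0/(r0+b) * integral of s ds between the zone edges,
    written in closed form to avoid subtracting nearly equal numbers."""
    return math.pi * r0 * wavelength / (r0 + b) * (b + (2 * j - 1) * wavelength / 4.0)

def zone_edge_angle(j, wavelength, r0, b):
    """Polar angle of the outer edge of zone j, from
    s^2 = r0^2 + (r0+b)^2 - 2 r0 (r0+b) cos(theta) with s = b + j*wavelength/2."""
    s = b + j * wavelength / 2.0
    cos_theta = (r0**2 + (r0 + b) ** 2 - s**2) / (2.0 * r0 * (r0 + b))
    return math.acos(cos_theta)

# Assumed illustration values (not from the text):
wavelength = 500e-9  # m
r0 = 1.0             # m, radius of the wavefront
b = 1.0              # m, distance from V to P

for j in (1, 2, 3, 10):
    area = zone_area(j, wavelength, r0, b)
    theta = zone_edge_angle(j, wavelength, r0, b)
    print(f"zone {j}: area = {area:.6e} m^2, outer edge at theta = {theta * 1e3:.4f} mrad")

# The first zone is a spherical cap, whose area is 2*pi*r0^2*(1 - cos(theta_1))
cap_area = 2.0 * math.pi * r0**2 * (1.0 - math.cos(zone_edge_angle(1, wavelength, r0, b)))
print(f"cap area check for zone 1: {cap_area:.6e} m^2")
```

For these assumed values the areas of consecutive zones differ only in the sixth significant figure, which supports treating the zone contributions as nearly equal in magnitude.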

Taking into account the previous results, the diffraction integral [Eq. (4.2)] can be approximated as

$$E(P) = \sum_{j=1}^{N}E_j(P), \tag{4.4}$$

where

$$E_j(P) = \frac{E_0^{\dagger}e^{ikr_0}}{r_0}2\pi K_j\int_{b+(j-1)\lambda/2}^{b+j\lambda/2}\frac{e^{iks}}{s}\frac{r_0}{(r_0 + b)}s\,ds, \tag{4.5}$$


where the approximation $K_j(\chi) = K_j$ (constant for the jth zone) has been used. Evaluating the integral of Eq. (4.5), the optical field at P will be

$$E(P) = \sum_{j=1}^{N}E_j(P) = i2\lambda\frac{E_0^{\dagger}e^{ikd}}{d}\sum_{j=1}^{N}(-1)^{j+1}K_j. \tag{4.6}$$
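The step from Eq. (4.5) to Eq. (4.6) can be made explicit as follows. This is a sketch of the intermediate algebra, which the text does not spell out; it uses only $k = 2\pi/\lambda$ (so that $k\lambda/2 = \pi$) and $d = r_0 + b$.

```latex
\begin{aligned}
\int_{b+(j-1)\lambda/2}^{b+j\lambda/2} e^{iks}\,ds
  &= \frac{e^{ikb}}{ik}\left[e^{i\pi j} - e^{i\pi (j-1)}\right]
   = \frac{2(-1)^{j}}{ik}\,e^{ikb}
   = \frac{i\lambda}{\pi}(-1)^{j+1}e^{ikb}, \\[4pt]
E_j(P)
  &= \frac{E_0^{\dagger}e^{ikr_0}}{r_0}\,2\pi K_j\,\frac{r_0}{r_0+b}\,
     \frac{i\lambda}{\pi}(-1)^{j+1}e^{ikb}
   = i2\lambda\,(-1)^{j+1}K_j\,\frac{E_0^{\dagger}e^{ik(r_0+b)}}{r_0+b}
   = i2\lambda\,(-1)^{j+1}K_j\,\frac{E_0^{\dagger}e^{ikd}}{d}.
\end{aligned}
```

The alternating sign comes directly from the half-wavelength step between the spheres that bound consecutive zones.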

Thus, the value of the integral depends on the sum of the inclination factors in each Fresnel zone. Taking into account the second approximation, i.e., that the inclination factor of a given zone is the average of the inclination factors of the adjacent zones, $K_j(\chi) = [K_{j+1}(\chi) + K_{j-1}(\chi)]/2$, the sum

$$\sum_{j=1}^{N}(-1)^{j+1}K_j = K_1 - K_2 + K_3 - K_4 + \ldots + (-1)^{N+1}K_N \tag{4.7}$$

can be written as

$$\sum_{j=1}^{N}(-1)^{j+1}K_j = \frac{K_1}{2} + \left(\frac{K_1}{2} - K_2 + \frac{K_3}{2}\right) + \left(\frac{K_3}{2} - K_4 + \frac{K_5}{2}\right) + \ldots + \begin{cases} K_N/2; & N \to \text{odd} \\ K_{N-1}/2 - K_N; & N \to \text{even}. \end{cases} \tag{4.8}$$

Because the average of the zones adjacent to zone $Z_j$ is approximately equal to the value of zone $Z_j$, the sum reduces to

$$\sum_{j=1}^{N}(-1)^{j+1}K_j = \frac{K_1}{2} \pm \frac{K_N}{2} \to \begin{cases} +; & N \to \text{odd} \\ -; & N \to \text{even}. \end{cases} \tag{4.9}$$
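The reduction in Eq. (4.9) can be checked numerically for any slowly varying set of inclination factors. The sketch below compares the full alternating sum of Eq. (4.7) with the estimate $K_1/2 \pm K_N/2$. The cosine-shaped model for $K_j$ is purely hypothetical: it is chosen only because it decreases smoothly from 1 to 0, and it is not the inclination factor derived in the text.

```python
import math

def alternating_sum(K):
    """Left-hand side of Eq. (4.7): K1 - K2 + K3 - ... (K[0] is zone 1)."""
    return sum(((-1) ** index) * value for index, value in enumerate(K))

def fresnel_estimate(K):
    """Right-hand side of Eq. (4.9): K1/2 + KN/2 for N odd, K1/2 - KN/2 for N even."""
    sign = 1.0 if len(K) % 2 == 1 else -1.0
    return 0.5 * K[0] + sign * 0.5 * K[-1]

def model_inclination_factors(N):
    """Hypothetical smooth model: K_j falls from 1 (zone 1) to 0 (zone N)."""
    return [math.cos(0.5 * math.pi * (j - 1) / (N - 1)) for j in range(1, N + 1)]

# Unobstructed wavefront: all N zones, with K_N = 0
for N in (1000, 1001):
    K = model_inclination_factors(N)
    print(f"N={N}: sum={alternating_sum(K):.5f}, estimate={fresnel_estimate(K):.5f}")

# Wavefront limited to the first M zones of a 1001-zone model
K_all = model_inclination_factors(1001)
for M in (4, 5):
    K = K_all[:M]
    print(f"M={M}: sum={alternating_sum(K):.5f}, estimate={fresnel_estimate(K):.5f}")
```

For the full wavefront the sum stays close to $K_1/2$, and for a few open zones it is close to $K_1$ (odd number) or to zero (even number), in line with the estimate.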

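Therefore,

$$E(P) = i2\lambda\frac{E_0^{\dagger}e^{ikd}}{d}\left(\frac{K_1}{2} \pm \frac{K_N}{2}\right) \tag{4.10}$$

or, since $E_N(P)$ as defined in Eq. (4.6) already carries the sign $(-1)^{N+1}$,

$$E(P) = \frac{1}{2}\left[E_1(P) + E_N(P)\right]. \tag{4.11}$$

When the wavefront $\Sigma$ propagates unobstructed, the total number of zones N is reached when $\chi = \pi/2$, and then $E_N(P) = 0$. The field at P will be

$$E(P) = \frac{1}{2}E_1(P). \tag{4.12}$$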


Taking into account the result of Eq. (4.1) for E(P) and the field expression for the first zone $E_1(P) = i2\lambda K_1E_0^{\dagger}e^{ikd}/d$, Eq. (4.12) is satisfied if

$$K_1 = -\frac{i}{\lambda} = \frac{e^{-i\pi/2}}{\lambda}. \tag{4.13}$$
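Spelled out, the substitution behind Eq. (4.13) is a single line of algebra; the block below is only a sketch of that step, using Eqs. (4.1) and (4.12) as given.

```latex
E_0^{\dagger}\frac{e^{ikd}}{d}
  = \frac{1}{2}E_1(P)
  = i\lambda K_1\,E_0^{\dagger}\frac{e^{ikd}}{d}
\quad\Longrightarrow\quad
K_1 = \frac{1}{i\lambda} = -\frac{i}{\lambda} = \frac{e^{-i\pi/2}}{\lambda}.
```

The factor $e^{-i\pi/2}$ indicates that the secondary waves must be taken a quarter of a period out of phase with the primary wave, with an amplitude scaled by $1/\lambda$, for the zone construction to reproduce the freely propagating spherical wave.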

In this way, it is possible to find the explicit value of the inclination factor for the first zone.

4.1.2 Fresnel treatment results

Equation (4.12) can be interpreted as shown in Fig. 4.5. Figure 4.5(a) represents the free propagation of the wavefront $\Sigma$; in Fig. 4.5(b), the wavefront is obstructed by a circular aperture that lets through only the field corresponding to half of the first Fresnel zone. In both cases, the field at P will be given by $E_0^{\dagger}e^{ikd}/d$ and the irradiance will be given by $I(P) = I_0 = (\epsilon_0 c/2)(E_0^{\dagger}/d)^2$. From the point of view of geometrical optics, the result does not depend on the radius of the aperture since the energy that reaches P propagates along the ray that joins S with P. Therefore, the irradiance at P corresponds to the expected result. So, what is the gain of the Fresnel wave treatment? In the Fresnel treatment, the field at P depends on the size of the aperture. Let us consider the following cases:

• Aperture for $N = 1$. If the radius of the aperture is such that it allows the passage of the field delimited by the first Fresnel zone, from Eq. (4.6),

$$E(P) = i2\lambda\frac{E_0^{\dagger}e^{ikd}}{d}K_1.$$

Then the irradiance at P will be

Figure 4.5 The field at P (a) due to the free propagation of a spherical wavefront is equal to (b) the field bounded by a circular aperture that allows only the field corresponding to half of the first Fresnel zone to pass.


$$I(P) = 4I_0.$$

This result is no longer predictable by geometrical optics. The increase in irradiance with increasing aperture radius seems reasonable. However, the Fresnel treatment also tells us that this is not always the case, since a further increase in the radius of the aperture decreases the irradiance, even to zero.

• Aperture for $N = 2$. If the radius of the aperture is increased, such that the aperture coincides with the outer edge of the second Fresnel zone, from Eq. (4.6),

$$E(P) = i2\lambda\frac{E_0^{\dagger}e^{ikd}}{d}(K_1 - K_2).$$

Taking into account the Fresnel hypothesis, i.e., that the inclination factor changes very little between consecutive zones, then $K_1 \approx K_2$. Thus, the irradiance at P will be $I(P) \approx 0$. This result is even more surprising, but it is explained by considering the interference. Generally speaking, we can say that the field of the second zone is out of phase by $\pi$ with respect to the field of the first zone. This is because of the way Fresnel zones have been constructed: with spheres whose radii increase by $\lambda/2$. Based on the previous results, it can be anticipated that when the aperture allows the passage of M Fresnel zones, where M is odd, the consecutive zones grouped in pairs cancel each other and the irradiance at P will be given only by the remaining zone ($I \approx 4I_0$). When the aperture allows M Fresnel zones to pass, where M is even, the consecutive zones grouped in pairs cancel each other and the irradiance at P will be zero ($I \approx 0$).

• Opaque disk for $N = 1$. Another interesting situation arises if, instead of an aperture in an opaque screen like the one shown in Fig. 4.5(b), an opaque disk whose radius is equal to the edge of the first zone is placed, as in Fig. 4.6. Then, the passage of the field limited by the first Fresnel zone is blocked, and the passage of the field from the second Fresnel zone (up to the last zone where $\chi = \pi/2$) is allowed. The result for the field at P is

$$E(P) = \frac{1}{2}E_1(P) - E_1(P) = -\frac{1}{2}E_1(P).$$

Therefore, the irradiance at P would be given by

Diffraction

239

Circular disk P

S

Figure 4.6 Poisson spot. Despite the circular disk that hinders light propagation within the first Fresnel zone, there is a bright spot at P behind the obstacle.

I ðPÞ ¼ I 0 : In other words, at P there will be a bright spot even though the ray from S to P is blocked. This point is called the Poisson spot. • Fresnel zone plate. Finally, let us consider the situation where only odd or even zones are blocked. As two consecutive zones have a phase shift of p and every two zones will be in phase (phase shift of 2p), an obstacle with apertures equivalent to the even or odd annular zones considerably increases the irradiance value at point P. This situation is illustrated in Fig. 4.7(a), with a zonal plate blocking even zones, and Fig. 4.7(b), with a zonal plate blocking odd zones. In both cases, if M zones are allowed to pass, the field at P is approximated by E(P)  ME1(P) and the irradiance would be I ðPÞ ¼ 4M 2 I 0 : A further improvement to the zonal plate is achieved if instead of blocking the odd or even zones, an offset of p is introduced in the odd or even zones. This can be done by depositing on a glass substrate a thin film of transparent material whose optical thickness is equal to l/2. The thin film is deposited only in the annular regions that correspond to the odd or even Fresnel zones. To do this, a mask is used that obstructs the deposit of the material (as in a lithographic process). A Fresnel phase zone plate for the even zones is shown in Fig. 4.8. In this way, the irradiance at P increases even more. The increase in irradiance at P occurs at the expense of the decrease in irradiance at points neighboring P, which guarantees energy conservation. 

The Fresnel treatment predicts that there may be light behind an obstacle. This fact, pointed out by Poisson as erroneous, actually occurs.
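The zone-counting argument above can be checked with a small phasor sum. The following is only a minimal sketch, assuming the idealization used in the text: every zone contributes a field of the same magnitude $|E_1|$, consecutive zones differ in phase by $\pi$, and the unobstructed field is $E_1/2$. The function names are illustrative and do not come from the book.

```python
# Phasor model of Fresnel zones (assumption: equal zone magnitudes,
# alternating signs, unobstructed field equal to E1 / 2).
E_FREE = 1.0          # unobstructed field amplitude at P, so I0 = E_FREE**2
E1 = 2.0 * E_FREE     # field of the first zone alone

def zone_field(m):
    """Field of the m-th zone (m = 1, 2, ...): same magnitude, alternating sign."""
    return E1 * (-1) ** (m + 1)

def aperture_irradiance(n_zones):
    """Irradiance at P, in units of I0, for an aperture passing the first n_zones zones."""
    field = sum(zone_field(m) for m in range(1, n_zones + 1))
    return field ** 2

def disk_irradiance(n_blocked):
    """Opaque disk covering the first n_blocked zones: free field minus blocked zones."""
    field = E_FREE - sum(zone_field(m) for m in range(1, n_blocked + 1))
    return field ** 2

def zone_plate_irradiance(m_open):
    """Zone plate passing m_open zones that all arrive in phase at P."""
    return (m_open * E1) ** 2

for n in (1, 2, 3, 4):
    print(n, aperture_irradiance(n))      # alternates between 4 I0 and 0
print(disk_irradiance(1))                 # Poisson spot: I0
print(zone_plate_irradiance(5))           # 4 * M**2 * I0 with M = 5
```

In this idealized model the disk result does not depend on how many zones are covered, which is the content of the Poisson spot; a real calculation would include the slow decrease of the inclination factor.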


Figure 4.7 Fresnel zonal plates to block (a) even and (b) odd zones.

Figure 4.8 The phase zone plate takes advantage of the entire optical field from S, making the fields contained in the odd and even Fresnel zones interfere constructively.

4.2 Diffraction Integral

Diffraction involves finding the optical field at any point in space generated by a source with boundary conditions. The typical geometry in diffraction is illustrated in Fig. 4.9. The source S (point-like or extended) is located in region I; region II is the volume limited by the closed surface $\Sigma = \Sigma_1 + \Sigma_2$, in which the optical field is measured. Region II is called the diffraction region.


Figure 4.9 General geometry in the diffraction problem.

The surface $\Sigma_1$ that separates regions I and II is an opaque surface with apertures that allow the passage of part of the optical field emitted by the source S. A first approach to the problem of finding the optical field in the diffraction region is to solve the homogeneous wave equation for the scalar optical field (ignoring polarization). Using a monochromatic wave,

$$E'(x, y, z, t) = E(x, y, z)\,e^{-i\omega t}, \tag{4.14}$$

the wave equation in vacuum [Eq. (2.5)] gives

$$(\nabla^2 + k^2)E = 0, \tag{4.15}$$

with $k = \omega/c = 2\pi/\lambda$. To determine E at any point in the diffraction region, Green's theorem can be used, which in turn follows from Gauss' theorem. Gauss' theorem states that if $\mathbf{F}$ is a vector function of the position, then

$$\iint_{\Sigma} \mathbf{F}\cdot\hat{\mathbf{n}}\,ds = \iiint_{V} \nabla\cdot\mathbf{F}\,dv, \tag{4.16}$$

where $\hat{\mathbf{n}}$ is the unit vector normal to the closed surface $\Sigma$ (pointing outward), V is the volume enclosed by the surface, and ds and dv denote the differential elements of area and volume, respectively. If the function $\mathbf{F}$ can be written as

$$\mathbf{F} = E\nabla U, \tag{4.17}$$

where E and U are scalar functions defined on $\Sigma$ and V, then

$$\iint_{\Sigma} (E\nabla U\cdot\hat{\mathbf{n}})\,ds = \iiint_{V} (E\nabla^2 U + \nabla E\cdot\nabla U)\,dv. \tag{4.18}$$


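If E and U are exchanged, a similar relationship is obtained:

$$\iint_{\Sigma} (U\nabla E\cdot\hat{\mathbf{n}})\,ds = \iiint_{V} (U\nabla^2 E + \nabla U\cdot\nabla E)\,dv. \tag{4.19}$$

Subtracting Eq. (4.19) from Eq. (4.18) leads to

$$\iint_{\Sigma} (E\nabla U\cdot\hat{\mathbf{n}} - U\nabla E\cdot\hat{\mathbf{n}})\,ds = \iiint_{V} (E\nabla^2 U - U\nabla^2 E)\,dv; \tag{4.20}$$

taking into account the directional derivatives $\partial E/\partial n = \nabla E\cdot\hat{\mathbf{n}}$ and $\partial U/\partial n = \nabla U\cdot\hat{\mathbf{n}}$, Green's theorem is obtained:

$$\iint_{\Sigma} \left(E\frac{\partial U}{\partial n} - U\frac{\partial E}{\partial n}\right)ds = \iiint_{V} (E\nabla^2 U - U\nabla^2 E)\,dv. \tag{4.21}$$

If the function U satisfies the time-independent wave equation [Eq. (4.15)], $(\nabla^2 + k^2)U = 0$, as E already does, then the right-hand side of Eq. (4.21) vanishes; therefore,

$$\iint_{\Sigma} \left(E\frac{\partial U}{\partial n} - U\frac{\partial E}{\partial n}\right)ds = 0. \tag{4.22}$$

With this integral, given the field E (and its derivative $\partial E/\partial n$) on the surface $\Sigma$, it is possible to calculate the field E at a point $P(x', y', z')$ inside the surface $\Sigma$ with the help of the function U (and its derivative $\partial U/\partial n$).

4.2.1 Kirchhoff integral theorem

Kirchhoff uses Green's theorem with the function

$$U(x, y, z) = \frac{e^{iks}}{s}, \tag{4.23}$$

where s is the distance between the point $P(x', y', z')$ and the point (x, y, z) on the surface. This function has a singularity in the diffraction region (at $s = 0$) that must be removed, since U must be defined everywhere in V. The singularity can be eliminated by constructing a sphere $\Sigma_\varepsilon$ of radius $\varepsilon \to 0$ centered at point P, as illustrated in Fig. 4.10. With $\varepsilon \to 0$, the volume V enclosed by $\Sigma$ is maintained, but now the integration surface will be $\Sigma + \Sigma_\varepsilon$. Therefore, the diffraction integral over $\Sigma$ becomes

$$\iint_{\Sigma} \left(E\frac{\partial U}{\partial n} - U\frac{\partial E}{\partial n}\right)ds = -\iint_{\Sigma_\varepsilon} \left(E\frac{\partial U}{\partial n} - U\frac{\partial E}{\partial n}\right)ds. \tag{4.24}$$

Figure 4.10 Geometry to calculate the diffraction at point P with the Green's function $U(x, y, z) = e^{iks}/s$.

The directional derivative $\partial U/\partial n$ is equal to

$$\nabla U\cdot\hat{\mathbf{n}} = \frac{ik e^{iks}\,s\nabla s - e^{iks}\nabla s}{s^2}\cdot\hat{\mathbf{n}} = \frac{e^{iks}}{s}\left(ik - \frac{1}{s}\right)(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}), \tag{4.25}$$

where $\nabla s = \hat{\mathbf{s}}$ is the unit vector in the radial direction from point P. For the sphere $\Sigma_\varepsilon$, the unit normal vector points toward point P, so $(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) = -1$. In the integral on the right-hand side of Eq. (4.24),

$$\left.\frac{\partial U}{\partial n}\right|_{s=\varepsilon} = \left(\frac{1}{\varepsilon} - ik\right)\frac{e^{ik\varepsilon}}{\varepsilon}; \tag{4.26}$$

thus, the integral over $\Sigma_\varepsilon$ becomes

$$\iint_{\Sigma_\varepsilon} \left[E\left(\frac{1}{\varepsilon} - ik\right)\frac{e^{ik\varepsilon}}{\varepsilon} - \frac{e^{ik\varepsilon}}{\varepsilon}\frac{\partial E}{\partial n}\right]\varepsilon^2 \sin\theta\,d\theta\,d\phi, \tag{4.27}$$

where the differential area has been written in spherical coordinates (for the sphere $\Sigma_\varepsilon$), with $\theta$ as the polar angle and $\phi$ as the azimuthal angle. In the limit $\varepsilon \to 0$, only the term in $1/\varepsilon^2$ survives, and this last integral reduces to

$$\iint_{\Sigma_\varepsilon} E\sin\theta\,d\theta\,d\phi = 4\pi E(x', y', z'). \tag{4.28}$$

Consequently, taking into account the minus sign in Eq. (4.24), the field at point P can be calculated as

$$E(x', y', z') = -\frac{1}{4\pi}\iint_{\Sigma} \frac{e^{iks}}{s}\left[E\left(ik - \frac{1}{s}\right)(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) - \frac{\partial E}{\partial n}\right]ds, \tag{4.29}$$

where the unit vector $\hat{\mathbf{n}}$ now corresponds to the surface $\Sigma$. This integral is called the Kirchhoff integral theorem.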
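Green's theorem reduces to Eq. (4.22) only because the auxiliary function of Eq. (4.23) satisfies the Helmholtz equation away from $s = 0$. The following is a minimal numerical sketch of that property, not part of the original derivation: it applies a central finite-difference Laplacian to $U = e^{iks}/s$ with P at the origin. The wavelength, step size, and test point are arbitrary assumptions chosen for illustration.

```python
import cmath
import math

WAVELENGTH = 1.0                      # arbitrary length unit (assumption)
K = 2 * math.pi / WAVELENGTH          # wavenumber k = 2 pi / lambda

def green(x, y, z):
    """Kirchhoff's auxiliary function U = exp(iks)/s with P placed at the origin."""
    s = math.sqrt(x * x + y * y + z * z)
    return cmath.exp(1j * K * s) / s

def helmholtz_residual(x, y, z, h=1e-3):
    """Finite-difference estimate of (laplacian + k^2) U at the point (x, y, z)."""
    center = green(x, y, z)
    neighbors = (green(x + h, y, z) + green(x - h, y, z)
                 + green(x, y + h, z) + green(x, y - h, z)
                 + green(x, y, z + h) + green(x, y, z - h))
    laplacian = (neighbors - 6 * center) / h ** 2
    return laplacian + K ** 2 * center

point = (1.3, -0.7, 2.1)              # any point away from the singularity at s = 0
residual = abs(helmholtz_residual(*point))
scale = K ** 2 * abs(green(*point))   # size of each term that must cancel
print(residual / scale)               # small compared with 1
```

The ratio printed is limited only by the discretization error of the finite differences, which shrinks as the step h is reduced (until round-off dominates).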


4.2.2 Fresnel–Kirchhoff diffraction

Suppose we have an opaque flat screen with an aperture and want to determine the diffracted optical field when the light source S is point-like, as in Fig. 4.11(a). To solve the integral equation [Eq. (4.29)] for point P, the first thing to do is select the integration surface $\Sigma$. The surface that is usually proposed in this problem is shown in Fig. 4.11(b). The closed surface consists of three open surfaces: the flat surface A that fills the aperture, the flat surface $\Sigma_1$ behind the opaque screen, and the surface $\Sigma_2$, which is a spherical cap of radius R centered at point P. Therefore, the Kirchhoff integral must be solved for the three surfaces, which together complete the closed surface $\Sigma = A + \Sigma_1 + \Sigma_2$. Because the surface is arbitrarily (but conveniently) chosen, if $R \to \infty$, then the surface $\Sigma_1$ will have infinite extent. In this case, the one-aperture diffraction problem assumes that the opaque screen has infinite extent. Thus, the contribution of the field emitted by the point source, $E = E_0^{\dagger}e^{ikr}/r$, where r is the distance between the source S and a point (x, y, z), on each surface will be:

• in A, assuming that the field in the aperture is equal to the field in the absence of the opaque screen with the aperture, then

$$E_A = E_0^{\dagger}\frac{e^{ikr}}{r}, \tag{4.30}$$

where r is the distance between S and a point in the aperture;

• in $\Sigma_1$, assuming that the opaque screen does not transmit light, then

$$E_{\Sigma_1} = 0; \tag{4.31}$$

Figure 4.11 (a) Geometry of the diffraction of a spherical wave in an aperture. (b) Selection of the integration surface to solve the Kirchhoff integral.


• in $\Sigma_2$, with $R \to \infty$, $E_{\Sigma_2}$ (and also U) decreases as 1/R, so the field E is practically null. However, the area of integration grows as $R^2$ (in $\Sigma_2$, $ds = R^2\sin\theta\,d\theta\,d\phi$), so it is not obvious that the diffraction integral vanishes in $\Sigma_2$.

The first two assumptions also have some drawbacks. In the first one, the presence of the screen changes the field at the edge of the aperture; in the second one, the field extends behind the opaque screen in the vicinity of the aperture [5]. However, for practical problems where the size of the aperture is much larger than the wavelength, these two assumptions work very well.

For the surface $\Sigma_2$, if $R \gg \lambda$, the integral of Eq. (4.29), with $s = R$, is approximated by

$$-\frac{1}{4\pi}\iint_{\Sigma_2} UR\left(ikE - \frac{\partial E}{\partial n}\right)R\sin\theta\,d\theta\,d\phi, \tag{4.32}$$

taking into account that now $(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) = 1$. Because $UR = e^{ikR}$ remains finite as $R \to \infty$, the integral in $\Sigma_2$, Eq. (4.32), vanishes if

$$\lim_{R\to\infty} R\left(ikE - \frac{\partial E}{\partial n}\right) = 0. \tag{4.33}$$

This is called the Sommerfeld radiation condition, and it is satisfied if $E \to 0$ as fast as 1/R. This occurs for a point source, so the contribution of the integral on the surface $\Sigma_2$ is indeed null. Therefore, the diffraction generated by an aperture when illuminated by a point source will be given by

$$E(x', y', z') = -\frac{1}{4\pi}\iint_{A} E_0^{\dagger}\frac{e^{ikr}}{r}\frac{e^{iks}}{s}\left[\left(ik - \frac{1}{s}\right)(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) - \left(ik - \frac{1}{r}\right)(\hat{\mathbf{r}}\cdot\hat{\mathbf{n}})\right]ds, \tag{4.34}$$

where $\hat{\mathbf{r}}$ is the unit vector from the source S to a point Q [of coordinates (x, y, z)] in the aperture A, $\hat{\mathbf{s}}$ is the unit vector from the observation point P [of coordinates (x′, y′, z′)] to point Q, and $\hat{\mathbf{n}}$ is the unit normal vector to surface A at point Q, as illustrated in Fig. 4.12(a). If the distances $r = SQ$ and $s = QP$ are much greater than the wavelength, the approximations $(ik - 1/s) \approx ik$ and $(ik - 1/r) \approx ik$ can be used in Eq. (4.34); therefore,

$$E(x', y', z') = -\frac{i}{\lambda}\iint_{A} E_0^{\dagger}\frac{e^{ikr}}{r}\frac{e^{iks}}{s}\left[\frac{(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) - (\hat{\mathbf{r}}\cdot\hat{\mathbf{n}})}{2}\right]ds. \tag{4.35}$$


Figure 4.12 Unit vectors at the Q point of the diffraction aperture.

This integral is called the Fresnel–Kirchhoff diffraction integral. In fact, the term $-i[(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) - (\hat{\mathbf{r}}\cdot\hat{\mathbf{n}})]/2\lambda$ formally defines the inclination factor $K(\chi)$ from Eq. (4.2) in the Fresnel treatment. By defining the angles between the unit vectors as shown in Fig. 4.12(b), the inclination factor can be written as

$$K(\chi) = -\frac{i}{\lambda}\frac{[\cos\alpha - \cos\beta]}{2} = \frac{i}{\lambda}\frac{[\cos\beta + \cos(\chi + \beta)]}{2}. \tag{4.36}$$

This definition leads to an equation analogous to Eq. (4.2); thus,

$$E(P) = \iint_{A} E_0^{\dagger}\frac{e^{ikr}}{r}\frac{e^{iks}}{s}K(\chi)\,ds. \tag{4.37}$$

The inclination factor defined in Eq. (4.36) does not depend only on the angle $\chi$, as the Fresnel formulation suggests. This is because the integration surface in Eq. (4.37) is flat, whereas the integration surface in Eq. (4.2) is the spherical wavefront of radius $r_0$, on which the amplitude of the secondary sources is constant and equal to $E_0^{\dagger}e^{ikr_0}/r_0$. Instead, in Eq. (4.37), not only is the amplitude of the secondary sources variable over the aperture A, but the secondary sources $E(Q) = E_0^{\dagger}e^{ikr}/r$ are not always on the same wavefront. In this sense, Eq. (4.37) is a generalized version of the Huygens–Fresnel principle. When the source is at infinity, Eqs. (4.2) and (4.37) coincide, since the angle $\beta = \pi$ (and $\alpha = \chi$). In this case, the inclination factor turns out to be

$$K(\chi) = -\frac{i}{\lambda}\left(\frac{1 + \cos\chi}{2}\right). \tag{4.38}$$

This situation is illustrated in Fig. 4.13. A further simplification occurs in the paraxial approximation, where $\chi \approx 0$. This last situation is very common in practice and is analyzed in the next section.


Figure 4.13 Unit vectors when the incident wavefront at the aperture is flat.

4.2.3 Sommerfeld diffraction

To obtain the Fresnel–Kirchhoff integral, the spherical wave of unit amplitude $e^{iks}/s$ was used as a Green's function. The selection of the function is arbitrary, but it should facilitate the calculation of the diffraction. Kirchhoff's choice has some issues with the field E inside the aperture, at the edges of the aperture, and just behind the screen near the edges. Sommerfeld proposes another Green's function that resolves these aperture boundary problems while maintaining the assumption that the optical field within the aperture is equal to the optical field in the same region when there is no opaque screen defining the aperture. The new function U is constructed with two unit spherical waves, one originating from the observation point P [as in Eq. (4.23)] and the other originating from the point P′, which is the mirror image of P with respect to the plane of the aperture A, as illustrated in Fig. 4.14.

Figure 4.14 P and P′: origin of the two auxiliary unit spherical waves for the diffraction calculation.


With this configuration, the Green's function becomes

$$U = \frac{e^{iks}}{s} - \frac{e^{iks'}}{s'}. \tag{4.39}$$

Using this function in the diffraction problem described in Fig. 4.11(b) when $R \to \infty$, for any point in $\Sigma_1$ and in A, $s' = s$ and, therefore, $U = 0$ (although $\partial U/\partial n \neq 0$). Thus, the term containing $\partial E/\partial n$ drops out of the integral, and it is not necessary to make any assumptions about the boundary conditions of $\partial E/\partial n$ at $\Sigma_1$ and at the edge of the aperture, which eliminates the inconsistencies of the Green's function chosen by Kirchhoff. Using Eq. (4.39) leads to [6]

$$E(x', y', z') = -\frac{i}{\lambda}\iint_{A} E_0^{\dagger}\frac{e^{ikr}}{r}\frac{e^{iks}}{s}(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}})\,ds. \tag{4.40}$$

This integral is called the first Rayleigh–Sommerfeld solution. The function $U = e^{iks}/s + e^{iks'}/s'$ can also be chosen, which gives rise to the second Rayleigh–Sommerfeld solution [6]:

$$E(x', y', z') = \frac{i}{\lambda}\iint_{A} E_0^{\dagger}\frac{e^{ikr}}{r}\frac{e^{iks}}{s}(\hat{\mathbf{s}}'\cdot\hat{\mathbf{n}})\,ds. \tag{4.41}$$

The inclination factor will be different in each case:

• $K(\chi) = -i[(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}}) - (\hat{\mathbf{r}}\cdot\hat{\mathbf{n}})]/2\lambda$ in Fresnel–Kirchhoff;
• $K(\chi) = -i(\hat{\mathbf{s}}\cdot\hat{\mathbf{n}})/\lambda$ in the first Rayleigh–Sommerfeld solution; and
• $K(\chi) = i(\hat{\mathbf{s}}'\cdot\hat{\mathbf{n}})/\lambda$ in the second Rayleigh–Sommerfeld solution.

When the aperture is illuminated by a plane wave (source S at infinity) and in the paraxial approximation, the inclination factors coincide at $K(\chi) = -i/\lambda$.

4.3 Fresnel and Fraunhofer Diffraction

The problem of diffraction by an aperture in a flat opaque screen, as illustrated in Fig. 4.15, is considered in this section.

Figure 4.15 Geometry for the calculation of diffraction by a plane aperture.


According to Eq. (4.37), the Huygens–Fresnel diffraction integral can be written in general as

$$E(P) = \iint_{A} E(Q)K(\chi)\frac{e^{iks}}{s}\,ds. \tag{4.42}$$

E(Q) is the complex amplitude of the field at the aperture. The aperture is in the xy plane, and the observation screen for the diffraction pattern (the spatial distribution of irradiance) is in the x′y′ plane. These planes are separated by the distance $d = |\mathbf{d}|$. In particular, we will limit ourselves to the paraxial approximation, which is defined under the following conditions:

1. The separation d between the plane of the aperture and the plane of the diffraction pattern satisfies $d \gg \lambda$.
2. The dimensions of the aperture are much smaller than the distance d.
3. The dimensions of the region of observation of the diffraction pattern are much smaller than the distance d.
4. The inclination factor for any point in the aperture is approximated by the inclination factor of the first Fresnel zone, i.e., $K(\chi) \approx -i/\lambda$.

Footnote: This inclination factor value is obtained from $K(\chi) = -i(1 + \cos\chi)/2\lambda$ when in Eq. (4.36) the angle $\beta = \pi$, i.e., when the aperture is illuminated by a plane wave. In practice, the wavefront at the aperture may have small deviations, such that $\beta \approx \pi$. In this case, it could still be assumed that $K(\chi) = -i/\lambda$.

The vectors indicated in Fig. 4.15 are:

• $\mathbf{d}$, separation vector between the planes of the aperture and the observation screen;
• $\mathbf{r} = \{x, y\}$, position vector of the point Q in the aperture;
• $\mathbf{r}' = \{x', y'\}$, position vector of point P on the observation screen;
• $\mathbf{R} = \mathbf{d} + \mathbf{r}'$, relative position vector of point P with respect to the origin of coordinates of the aperture; and
• $\mathbf{s}$, relative position vector of point P with respect to point Q.

Under the paraxial condition, $|\mathbf{r}| \ll |\mathbf{R}|$, $|\mathbf{r}'| \ll |\mathbf{R}|$, and $|\mathbf{R}| \approx |\mathbf{d}|$. From the law of cosines,

$$s^2 = R^2 + r^2 - 2\mathbf{R}\cdot\mathbf{r} \tag{4.43}$$

or

$$s = R\sqrt{1 + \frac{r^2}{R^2} - \frac{2\mathbf{R}\cdot\mathbf{r}}{R^2}}. \tag{4.44}$$


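Because $\mathbf{R}\cdot\mathbf{r} = (\mathbf{d} + \mathbf{r}')\cdot\mathbf{r} = \mathbf{r}'\cdot\mathbf{r}$ (the vector $\mathbf{d}$ is perpendicular to the aperture plane), then

$$s = R\sqrt{1 + \frac{r^2}{R^2} - \frac{2\mathbf{r}'\cdot\mathbf{r}}{R^2}}. \tag{4.45}$$

The paraxial approximation implies that the square root can be approximated to second order; thus,

$$s = R\left[1 + \frac{1}{2}\left(\frac{r^2 - 2\mathbf{r}'\cdot\mathbf{r}}{R^2}\right)\right]. \tag{4.46}$$

Completing the square binomial for the term that is in parentheses in Eq. (4.46),

$$s = R\left[1 - \frac{r'^2}{2R^2} + \frac{1}{2}\left(\frac{r'^2 - 2\mathbf{r}'\cdot\mathbf{r} + r^2}{R^2}\right)\right]. \tag{4.47}$$

With $r' \ll R$, we have $1 - r'^2/2R^2 \approx 1$, and replacing R with d, Eq. (4.47) simplifies to

$$s = d + \frac{|\mathbf{r}' - \mathbf{r}|^2}{2d}. \tag{4.48}$$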
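To get a feeling for how good Eq. (4.48) is, the paraxial distance can be compared with the exact distance between Q and P. The sketch below is only illustrative: the wavelength and screen distance are borrowed from the circular-aperture example of Section 4.3.3, while the particular points Q and P are arbitrary assumptions.

```python
import math

WAVELENGTH_MM = 632.8e-6   # 632.8 nm expressed in mm

def exact_distance(d, x, y, xp, yp):
    """Exact distance s between Q = (x, y, 0) and P = (x', y', d)."""
    return math.sqrt(d ** 2 + (xp - x) ** 2 + (yp - y) ** 2)

def paraxial_distance(d, x, y, xp, yp):
    """Eq. (4.48): s = d + |r' - r|^2 / (2 d)."""
    return d + ((xp - x) ** 2 + (yp - y) ** 2) / (2 * d)

d = 395.1                           # mm, screen distance (one Fresnel zone for w = 0.5 mm)
q = (0.5, 0.0)                      # point at the edge of the aperture (mm)
p = (1.5, 0.0)                      # point on the observation screen (mm)

error_mm = paraxial_distance(d, *q, *p) - exact_distance(d, *q, *p)
phase_error = 2 * math.pi * error_mm / WAVELENGTH_MM

print(error_mm)      # path error, a few times 1e-9 mm
print(phase_error)   # corresponding phase error in radians, far below 1
```

The leading neglected term is $|\mathbf{r}' - \mathbf{r}|^4/8d^3$, so the phase error stays negligible as long as that quantity is much smaller than the wavelength.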

Therefore, the paraxial form of Eq. (4.42) is obtained by replacing s with the expression given by Eq. (4.48). The distance s appears in the argument of the exponential function and in the denominator of Eq. (4.42). The quantity $|\mathbf{r}' - \mathbf{r}|^2/2d$ describes small variations with respect to the distance d. In the exponential function, these small variations are comparable with the wavelength; thus, they produce important phase changes in the interference process. In the denominator, on the other hand, these small variations are only compared with the distance d; therefore, there is no appreciable change in the denominator, and there s can be replaced by d. Finally, the paraxial version of Eq. (4.42) becomes

$$E(\mathbf{r}') = -i\frac{e^{ikd}}{\lambda d}\iint_{A} E(\mathbf{r})\,e^{ik|\mathbf{r}' - \mathbf{r}|^2/2d}\,d^2r. \tag{4.49}$$

From this integral, one can see what the secondary waves of the Huygens–Fresnel principle look like in the paraxial approximation. Consider Fig. 4.16, in which $\mathbf{r} = y\hat{\mathbf{j}}$ and $\mathbf{r}' = y'\hat{\mathbf{j}}$. At Q(y), there is a source of amplitude E(y) that emits a parabolic wave. The field at P(y′), due to Q, is equal to the attenuated field E(y)/d, and the phase of the wave at P is given by $k[d + (y' - y)^2/2d]$. Note that the term $(y' - y)^2/2d$ is the sag at distance $y' - y$ from the vertex V of the parabolic wavefront centered on Q. Thus, in the paraxial approximation, the secondary waves of the Huygens–Fresnel principle are paraboloids centered on the sources located in the primary wavefront $\Sigma$.

Figure 4.16 Approximation of the distance s as $d + (y' - y)^2/2d$.

Writing the vectors of Eq. (4.49) in Cartesian components,

$$E(x', y') = -i\frac{e^{ikd}}{\lambda d}\,e^{ik(x'^2 + y'^2)/2d}\iint_{A} E(x, y)\,e^{ik(x^2 + y^2)/2d}\,e^{-ik(x'x + y'y)/d}\,dx\,dy. \tag{4.50}$$

This version of the diffraction integral allows us to observe the influence of the Fresnel zones in the field that fills the aperture A. Focusing on the plane of the aperture, the field E(x, y) is modulated by the term $e^{ik(x^2 + y^2)/2d}$. Now, the term $(x^2 + y^2)/2d$ is the sag at the distance $(x^2 + y^2)^{1/2}$ from the origin of coordinates O of a paraboloid with center at O′ (Fig. 4.16). If the distance $(x^2 + y^2)/2d$ is divided by $\lambda/2$, we count the number of Fresnel zones contained in a circular opening of radius $(x^2 + y^2)^{1/2}$. Therefore,

$$e^{ik(x^2 + y^2)/2d} = e^{i\pi N}, \tag{4.51}$$

where $N = (x^2 + y^2)/\lambda d$ is the number of Fresnel zones.

4.3.1 Fraunhofer diffraction

Let $r_{\max}$ be the radius of the circle circumscribing the diffraction aperture and $N_{\max} = r_{\max}^2/\lambda d$ be the number of Fresnel zones subtended by the circle with respect to the diffraction pattern observation plane. If $N_{\max} \ll 1$ ($N_{\max} \approx 0$), then the term $e^{ik(x^2 + y^2)/2d}$ inside the integral can be neglected and the diffraction pattern will be given by

$$I(x', y') = \frac{(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\iint_{A} E(x, y)\,e^{-i2\pi(x'x + y'y)/\lambda d}\,dx\,dy\right|^2. \tag{4.52}$$

This integral is called the Fraunhofer diffraction integral.
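The link between Eq. (4.50) and the Fresnel-zone results of Section 4.1 can be illustrated on the optical axis. The following sketch assumes a circular aperture uniformly illuminated by a plane wave of amplitude $E_0$ and evaluates Eq. (4.50) at $x' = y' = 0$. With the substitution $n = r^2/\lambda d$, the aperture integral reduces to $-i\pi\int_0^N e^{i\pi n}\,dn$ (the constant phase $e^{ikd}$ is dropped), which is integrated here with a simple midpoint rule. The function name and the number of steps are illustrative choices.

```python
import cmath
import math

def on_axis_irradiance(n_zones, steps=20000):
    """|E(P)|^2 / |E0|^2 on the axis, from Eq. (4.50) with x' = y' = 0.

    n_zones is N = w^2 / (lambda d), the number of Fresnel zones in the aperture.
    """
    dn = n_zones / steps
    # Midpoint rule for the integral of exp(i pi n) over 0 <= n <= N.
    total = sum(cmath.exp(1j * math.pi * (j + 0.5) * dn) for j in range(steps)) * dn
    field = -1j * math.pi * total
    return abs(field) ** 2

for n in (0.1, 1, 2, 3, 4):
    # Closed form of the same integral: 4 sin^2(pi N / 2).
    print(n, on_axis_irradiance(n), 4 * math.sin(math.pi * n / 2) ** 2)
```

For $N = 1$ the sketch gives four times the unobstructed irradiance, and for $N = 2$ it gives zero, in agreement with the zone-by-zone argument; for $N \ll 1$ the quadratic phase barely changes over the aperture, which is the Fraunhofer regime.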


4.3.2 Fresnel diffraction

Let $r_{\max}$ be the radius of the circle circumscribing the diffraction aperture and $N_{\max} = r_{\max}^2/\lambda d$ be the number of Fresnel zones subtended by the circle with respect to the diffraction pattern observation plane. If $N_{\max}$ is of the order of a Fresnel zone, then the term $e^{ik(x^2 + y^2)/2d}$ cannot be neglected within the integral and, in this case, the diffraction pattern will be given by

$$I(x', y') = \frac{(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\iint_{A} \left[E(x, y)\,e^{ik(x^2 + y^2)/2d}\right]e^{-i2\pi(x'x + y'y)/\lambda d}\,dx\,dy\right|^2. \tag{4.53}$$

This integral is called the Fresnel diffraction integral. The diffraction integrals given by Eqs. (4.52) and (4.53) have the form of a two-dimensional Fourier transform whose spatial frequencies are $x'/\lambda d$ and $y'/\lambda d$. Therefore, it is easy to compute these integrals numerically.

4.3.3 Some examples

A circular aperture

A very common aperture in diffraction is the circular aperture. It has very important practical applications because the aperture diaphragms or lens edges of an optical system are usually circular. Suppose we have a circular aperture of radius w illuminated by a flat homogeneous wavefront (of amplitude $E_0$ and wavelength $\lambda$) orthogonal to the plane of the aperture. The field at the aperture can be described as

$$E(x, y) = E_0\,\mathrm{circ}\!\left(\frac{\sqrt{x^2 + y^2}}{w}\right), \tag{4.54}$$

where circ( ) is called the circular function and is defined as 1 for $\sqrt{x^2 + y^2} \le w$ and 0 for other values of (x, y). In this case, the aperture coincides with the circle that circumscribes the diffracting aperture. Then the number of Fresnel zones is given by $N = w^2/\lambda d$. The Fresnel diffraction patterns for a circular aperture of radius w = 0.5 mm are shown in Fig. 4.17, obtained when the observation screen distance corresponds to (a) the first Fresnel zone (d = 395.1 mm), (b) the first two Fresnel zones (d = 197.5 mm), (c) the first three Fresnel zones (d = 131.7 mm), and (d) the first four Fresnel zones (d = 98.8 mm), using $\lambda$ = 632.8 nm. Of course, the distance at which the observation screen must be placed in each case is calculated from $d = w^2/N\lambda$.

Footnote: To make the rings or regions of lower intensity visible, instead of plotting the irradiance, the square root of the irradiance is drawn. This is done for the simulated patterns in this section, as well as in Sections 4.4 and 4.5. But in Section 4.5.2, the irradiance is drawn because it better illustrates resolution for a two-point image and corresponds to what is observed in practice.

Figure 4.17 Fresnel diffraction patterns of a circular aperture of radius 0.5 mm, for a wavelength of 632.8 nm. The size of the image box in all cases is 3 mm on each side.

In the case of Fraunhofer diffraction, the calculation is even easier because it does not include the term $e^{ik(x^2 + y^2)/2d}$ within the integral. Strictly, this implies that $d \to \infty$. In practice, the observation screen must be placed at a finite distance such that $N \ll 1$, i.e., $d \gg w^2/\lambda$; e.g., the Fraunhofer pattern when d = 3951 mm is shown in Fig. 4.18(a). This particular value of the distance d corresponds to the distance at which the circular aperture subtends 0.1 Fresnel zones. The profile of this pattern, Fig. 4.18(b), corresponds to the square of a Bessel function divided by its argument, as shown below.

The Fraunhofer integral for a circular aperture has a well-known analytical solution. Because the aperture is circular, it is convenient to solve the integral in polar coordinates. Let $r = (x^2 + y^2)^{1/2}$, $\tan\phi = y/x$; $r' = (x'^2 + y'^2)^{1/2}$, and $\tan\varphi = y'/x'$. Hence, $x = r\cos\phi$, $y = r\sin\phi$, $x' = r'\cos\varphi$, and $y' = r'\sin\varphi$. By changing the variables in the diffraction integral,

Figure 4.18 (a) Fraunhofer diffraction pattern on an observation screen at a distance of 3951 mm from a circular aperture of radius 0.5 mm. The size of the image box is 20 mm on a side. (b) Profile of the diffraction pattern: irradiance I (in units of $I_0$) versus x′ (mm).


$$I(r') = \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\int_0^{w}\!\!\int_0^{2\pi} e^{-i2\pi r' r\cos(\phi - \varphi)/\lambda d}\,r\,dr\,d\phi\right|^2. \tag{4.55}$$

Because the problem is rotationally symmetric in $\varphi$, it can be solved for any $\varphi$. In particular, for $\varphi = 0$, the integral becomes

$$\int_0^{w}\!\!\int_0^{2\pi} e^{-i2\pi r' r\cos\phi/\lambda d}\,r\,dr\,d\phi, \tag{4.56}$$

which is equal to

$$2\pi\int_0^{w} J_0(2\pi r' r/\lambda d)\,r\,dr. \tag{4.57}$$

Using the recurrence relation $\int_0^{u} J_0(u')u'\,du' = uJ_1(u)$, then

$$I(r') = I_0\left[\frac{2J_1(2\pi w r'/\lambda d)}{2\pi w r'/\lambda d}\right]^2, \tag{4.58}$$

with $I_0 = (\epsilon_0 c/2)(\pi w^2 E_0/\lambda d)^2$. The first dark ring of the pattern in Fig. 4.18 is obtained for the first zero of the Bessel function $J_1$, i.e., when $2\pi w r'/\lambda d = 3.8317$. Because the energy contained in the region enclosed by the first dark ring is around 84% of the total, the first ring plays a very important role in the image of a point source. The distribution of irradiance contained within the first dark ring is called the Airy disk. The radius of the Airy disk is given by

$$r'_A = 1.22\frac{\lambda d}{(2w)}. \tag{4.59}$$

In the example of Fig. 4.18(b), the Airy radius is about 3.05 mm. The second dark ring of the pattern is obtained for the second zero of the Bessel function, i.e., $2\pi w r'/\lambda d = 7.0156$, which gives $r' = 2.23\lambda d/(2w)$. In the example in Fig. 4.18(b), this radius is about 5.58 mm. The energy in the region enclosed by the second dark ring is 91%. The function $2J_1(u)/u$ is called the Jinc(u) function.

A rectangular aperture

Another aperture, also widely used in practical diffraction problems, is the rectangular aperture. Suppose we have a rectangular aperture with sides $2w_x$ and $2w_y$ illuminated by a homogeneous plane wavefront (of amplitude $E_0$ and wavelength $\lambda$) orthogonal to the plane of the aperture. The field at the aperture can be described as

$$E(x, y) = E_0\,\mathrm{rect}\!\left(\frac{x}{2w_x}\right)\mathrm{rect}\!\left(\frac{y}{2w_y}\right), \tag{4.60}$$

where $\mathrm{rect}(x/2w_x)$ is called the rectangle function and is defined as 1 for $|x| \le w_x$ and 0 for other values of x. Similarly, the function $\mathrm{rect}(y/2w_y)$ is defined for the coordinate y. Fresnel diffraction patterns for a rectangular aperture with sides $2w_x$ = 1 mm and $2w_y$ = 4 mm are shown in Fig. 4.19, obtained when the distance to the observation screen corresponds to (a) the first Fresnel zone (d = 6321 mm), (b) the first two Fresnel zones (d = 3161 mm), (c) the first three Fresnel zones (d = 2107 mm), and (d) the first four Fresnel zones (d = 1580 mm), using $\lambda$ = 632.8 nm. Because $w_y > w_x$, the circle that circumscribes the rectangular aperture will have a radius close to $w_y$. Therefore, the number of Fresnel zones in this case is calculated as $N = w_y^2/\lambda d$ and, consequently, the distance at which the observation screen should be placed is $d = w_y^2/N\lambda$.

Figure 4.19 Fresnel diffraction patterns of a rectangular aperture with sides of 1 mm and 4 mm, for a wavelength of 632.8 nm. The size of the image in all cases is 24 mm on each side.

The Fraunhofer diffraction when d = 63211 mm is shown in Fig. 4.20(a). In this example, this value of d also corresponds to the distance at which the circle circumscribing the rectangular aperture subtends 0.1 Fresnel zones. The Fraunhofer integral for the rectangular aperture also has a well-known analytical solution. In this case,

$$\begin{aligned} I(x', y') &= \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\int_{-w_x}^{w_x}\!\!\int_{-w_y}^{w_y} e^{-i2\pi(x'x + y'y)/\lambda d}\,dx\,dy\right|^2 \\ &= \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\int_{-w_x}^{w_x} e^{-i2\pi(x'x)/\lambda d}\,dx\right|^2\left|\int_{-w_y}^{w_y} e^{-i2\pi(y'y)/\lambda d}\,dy\right|^2, \end{aligned} \tag{4.61}$$

Figure 4.20 (a) Fraunhofer diffraction pattern on the observation screen at a distance of 63211 mm from the rectangular aperture with sides 1 mm and 4 mm. Image size is 250 mm on each side. (b) Profile of the diffraction pattern: irradiance I (in units of $I_0$) versus x′ (mm).

i.e.,

$$I(x', y') = I_0\left[\frac{\sin(2\pi w_x x'/\lambda d)}{(2\pi w_x x'/\lambda d)}\right]^2\left[\frac{\sin(2\pi w_y y'/\lambda d)}{(2\pi w_y y'/\lambda d)}\right]^2, \tag{4.62}$$

with $I_0 = (\epsilon_0 c/2)(4w_x w_y E_0/\lambda d)^2$. The function $\sin(u)/u$, i.e., the function sinc(u), was introduced in Section 3.1.3. The profile of this pattern in the horizontal direction (x′) is shown in Fig. 4.20(b). The zeros of the function $\mathrm{sinc}(2\pi w_x x'/\lambda d)$ are obtained at

$$x'_m = m\frac{\lambda d}{(2w_x)}, \tag{4.63}$$

with $m = \pm 1, \pm 2, \pm 3, \ldots$. In the vertical direction (y′), there is a similar behavior, with the zeros at $y'_m = m\lambda d/(2w_y)$. Most of the energy is in the region bounded by the first zeros: $x'_1 = \lambda d/(2w_x)$ and $y'_1 = \lambda d/(2w_y)$. Therefore, it is convenient to define the width of the sinc( ) as $\Delta x' = 2x'_1$ in the direction x′ and $\Delta y' = 2y'_1$ in the direction y′. Note that the diffraction pattern spreads more in the direction in which the rectangular aperture is shorter [Fig. 4.20(a)]. This topic is treated in Section 4.6 on diffraction gratings. One-dimensional gratings are such that each diffraction element satisfies $w_x \ll w_y$, which makes the diffraction pattern look like a one-dimensional irradiance distribution. On the other hand, when $w_x = w_y = w$, the aperture geometry is a square of side 2w. In such a case, the Fraunhofer diffraction pattern is symmetric, as shown in Fig. 4.21.
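The numbers quoted for the rectangular aperture can be reproduced with a few lines of arithmetic. The sketch below assumes the values given in the text ($2w_x$ = 1 mm, $2w_y$ = 4 mm, $\lambda$ = 632.8 nm) and uses hypothetical helper names; it evaluates the screen distance $d = w_y^2/N\lambda$, the normalized pattern of Eq. (4.62), and the zeros of Eq. (4.63).

```python
import math

WAVELENGTH_MM = 632.8e-6   # 632.8 nm expressed in mm
HALF_WIDTH_X_MM = 0.5      # w_x
HALF_WIDTH_Y_MM = 2.0      # w_y

def screen_distance(n_zones):
    """Distance d = w_y^2 / (N lambda) at which the aperture subtends n_zones Fresnel zones."""
    return HALF_WIDTH_Y_MM ** 2 / (n_zones * WAVELENGTH_MM)

def sinc(u):
    """sin(u)/u with the removable singularity at u = 0 handled explicitly."""
    return 1.0 if u == 0 else math.sin(u) / u

def fraunhofer_irradiance(xp, yp, d):
    """Eq. (4.62) in units of I0."""
    ux = 2 * math.pi * HALF_WIDTH_X_MM * xp / (WAVELENGTH_MM * d)
    uy = 2 * math.pi * HALF_WIDTH_Y_MM * yp / (WAVELENGTH_MM * d)
    return sinc(ux) ** 2 * sinc(uy) ** 2

def zero_position(m, half_width, d):
    """Eq. (4.63): m-th zero at m lambda d / (2 w)."""
    return m * WAVELENGTH_MM * d / (2 * half_width)

for n in (1, 2, 3, 4):
    print(n, round(screen_distance(n)))           # Fresnel distances in mm

d_far = screen_distance(0.1)                      # Fraunhofer example, N = 0.1
x1 = zero_position(1, HALF_WIDTH_X_MM, d_far)     # first zero along x'
y1 = zero_position(1, HALF_WIDTH_Y_MM, d_far)     # first zero along y'
print(d_far, x1, y1)
print(fraunhofer_irradiance(x1, 0.0, d_far))      # essentially zero at the first dark fringe
```

The first zero along x′ lies four times farther from the center than the first zero along y′, which is the wider spread in the direction of the shorter side noted in the text.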


Figure 4.21 Fraunhofer diffraction pattern generated by a square aperture of side 2w = 1 mm, at a distance d = 3951 mm (N = 0.1), for a wavelength of 632.8 nm. The image size is 20 mm on each side.
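Returning to the circular aperture of Eq. (4.58), the position of the first dark ring can be checked without special-function libraries. The sketch below is only an illustration under stated assumptions: it evaluates $J_1$ from the standard integral representation $J_1(u) = \frac{1}{\pi}\int_0^{\pi}\cos(\tau - u\sin\tau)\,d\tau$ with a midpoint rule and locates its first zero by bisection. The bracketing interval and tolerances are arbitrary choices, and the aperture data are those of Fig. 4.18.

```python
import math

def bessel_j1(u, steps=2000):
    """J1(u) from (1/pi) * integral_0^pi cos(tau - u sin(tau)) dtau, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * h
        total += math.cos(tau - u * math.sin(tau))
    return total * h / math.pi

def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive zero of J1, bracketed between lo and hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

WAVELENGTH_MM = 632.8e-6   # 632.8 nm expressed in mm
RADIUS_MM = 0.5            # aperture radius w
DISTANCE_MM = 3951.0       # screen distance d of Fig. 4.18

u0 = first_zero_of_j1()
airy_radius = u0 * WAVELENGTH_MM * DISTANCE_MM / (2 * math.pi * RADIUS_MM)

print(u0)            # close to 3.8317
print(u0 / math.pi)  # close to the factor 1.22 in Eq. (4.59)
print(airy_radius)   # Airy radius in mm, close to the value quoted in the text
```

The ratio $u_0/\pi$ is the origin of the factor 1.22 in Eq. (4.59).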

4.4 Young Interferometer II

In Section 3.8, the Young interferometer is analyzed as a system of two mutually coherent point sources, $S_1$ and $S_2$. This section deals with two practical aspects of the Young interferometer. The first has to do with the finite size of the two sources (apertures) $S_1$ and $S_2$, while the second deals with the size and coherence of the light source. With this, an enhanced description of Young's original experiment from the 19th century [7] is presented in this book. A diagram of Young's experiment that will be discussed in this section is shown in Fig. 4.22. The primary source S with which the apertures $S_1$ and $S_2$ are illuminated is an incoherent, monochromatic extended source of wavelength $\lambda$ and lateral size $\sigma$. The two apertures are circles of radius w and are separated from each other (from their centers) by a distance a along the x direction. The distance between the source S and the apertures is $z_p$, and the distance between the apertures and the observation screen is d.

Figure 4.22 Young's experiment, or diffraction through two apertures in an opaque screen.


Taking into account the conditions under which Young’s experiment is performed, d ≫ a ≫ l, the diffraction on the screen corresponds to the Fraunhofer diffraction. 4.4.1 Effect of the size of the diffraction aperture First, let us assume that the primary source is a point source (on the optical axis) and that the field amplitude at each aperture is uniform and of constant phase. Therefore, the irradiance will be  !  Z` Z` pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi  h i ϵ0 c  x2 þ y2 0 0 I ðx , y Þ ¼ 2 2  E 0 circ ∗ dðx  a∕2Þ þ dðx þ a∕2Þ w 2l d  ` `

2   i2pðx0 xþy0 xÞ∕ld dxdy : e 

ð4:64Þ

The function circ( ) describes the geometry of each aperture of radius w, and the Dirac delta functions locate the aperture at x ¼ a/2 and x ¼ a/2. The symbol  denotes the convolution operation. Taking into account that the Fourier transform of the convolution of two functions is equal to the product of the Fourier transforms of each function,    2J 1 ð2pwr0 ∕ldÞ 2 pa 0 2 I ðx0 , y0 Þ ¼ 4I 0 x cos , (4.65) 2pwr0 ∕ld ld pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi with I 0 ¼ ðϵ0 c∕2Þðpw2 E 0 ∕ldÞ2 and r0 ¼ x02 þ y02 . Note that cos2 ðpax0 ∕ldÞ is the Fourier transform of [dðx  a∕2Þ þ dðx þ a∕2Þ]. In fact, the result given in Eq. (4.65) is the modulated version of the result given in Eq. (3.117). Thus, the modulation of the pattern is determined by the size of the aperture, whereas the spacing between the fringes is determined by the spacing between the apertures. The simulation of the diffraction pattern by two identical circular apertures of radius w ¼ 0.5 mm separated by a ¼ 5 mm, with l ¼ 632.8 nm, is shown in Fig. 4.23(a) when the observation screen is at a distance d ¼ 3951 mm (as in Fig. 4.18). The horizontal profile of the pattern is shown in Fig. 4.23(b); the gray segmented curve describes the modulation of the interference pattern due to the diffraction pattern. 4.4.2 Effect of light source size Now let us see how the extent s of the source S affects the diffraction pattern. Let us assume that the source is monochromatic and spatially incoherent, i.e., the oscillations of the fields emitted by two (independent) point sources of S are uncorrelated and therefore these fields do not interfere with each other. This implies that if we consider two point sources of S, each will generate its

[Figure 4.23, panel (b) axes: irradiance I (in units of I0) versus x' (mm), from -6 to 6 mm; the fringe spacing λd/a and the envelope radius 1.22λd/2w are marked.]

Figure 4.23 (a) Interference pattern generated by two circular apertures of 0.5 mm radius and 5 mm apart. The interference pattern is modulated by the diffraction pattern (in gray) of one of the apertures. (b) Interference pattern profile.

own Young interference pattern. The final result is the sum of the intensities produced by each of the point sources. To qualitatively see the effect of the source size, let us consider a light source formed by two incoherent point sources separated from each other by the distance $s'$, as shown in Fig. 4.24. The angular size of the source will be $\alpha = s'/z_p$. Each source generates its own interference pattern with an offset for the maximum of $\pm d\alpha/2$. The interference fringes in each pattern will be separated from each other by the distance $\lambda d/a$. Because the two patterns are identical, when the offset between the patterns is equal to $\lambda d/2a$, the minima of one pattern coincide with the maxima of the other pattern; therefore, the sum of the patterns eliminates the interference fringes. Let us denote by $\alpha_0 = \lambda/2a$ the angle that subtends the displacement $\lambda d/2a$ with respect to the midpoint between the apertures. The evolution of the interference pattern, as the angular size of the source increases, is shown in Fig. 4.25. The parameters used are the same as those used

[Figure 4.24 labels: two incoherent point sources separated by s', at a distance zp from the apertures S1 and S2; observation screen at a distance d.]

Figure 4.24 Sum of intensities of two incoherent sources.

[Figure 4.25: five profiles for α = 0, α0/4, α0/2, 3α0/4, and α0, with α0 = λ/(2a); axes: irradiance I (in units of 4I0) versus x' (mm), from -4 to 4 mm.]

Figure 4.25 Decrease in the visibility of the Young-type interferogram generated by two circular apertures of radius $w = 0.5$ mm separated by $a = 5$ mm, when illuminated with light emitted by two point sources that are incoherent with each other, depending on the angular separation $\alpha$ of the sources. The wavelength of light is $\lambda = 632.8$ nm.

in Fig. 4.18. When $\alpha = 0$, the two point sources coincide on the optical axis, and the intensity at each point is twice the intensity of the pattern generated by a single source. The visibility of the interferogram [Eq. (3.32)] is the maximum ($C = 1$). When $\alpha = \alpha_0/4$, the point sources are separated by $s' = \lambda z_p/8a$. The intensity change is small compared with the first case, and the interferogram visibility is still high ($C = 0.93$). With $\alpha = \alpha_0/2$, the point sources are separated by $s' = \lambda z_p/4a$ and now the intensity change is appreciable; the minima inside the envelope move away from zero. The latter notably reduces the visibility of the interferogram ($C = 0.71$). For $\alpha = 3\alpha_0/4$, the point sources are separated by $s' = 3\lambda z_p/8a$ and the visibility decreases considerably ($C = 0.38$). Finally, when $\alpha = \alpha_0$, the point sources are separated by $s' = \lambda z_p/2a$ and the visibility of the interferogram becomes zero; i.e., there are no interference fringes. This example qualitatively illustrates what happens to a Young interference pattern when illuminated by an extended source (composed of an infinite number of incoherent point sources).

According to the visibility of the interferogram generated by the optical fields emerging from the apertures S1 and S2, the degree of spatial similarity of these fields can be established. The visibility of the interferogram measures the degree of spatial coherence of the fields in the apertures. If the fields in the apertures are identical, which happens if the source is a point source, as in the case of $\alpha = 0$ in Fig. 4.25, the visibility is maximum and the fields are mutually or fully coherent. As the size of the extended source increases, the field oscillations at the apertures become less correlated, thus decreasing the visibility of the interferogram and the degree of spatial coherence. Results similar to those shown in Fig. 4.25 can be obtained if, instead of changing the size of the light source, the separation of the two apertures is changed.
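The loss of visibility described above can be checked numerically. The following is only a minimal sketch under stated assumptions: it sums two ideal cos² fringe patterns shifted by ±dα/2, ignores the diffraction envelope of the apertures, and takes the visibility as (Imax − Imin)/(Imax + Imin). The function names are illustrative and do not come from the text. Because the envelope is ignored, the values may differ slightly from those quoted for Fig. 4.25.

```python
import math

def two_source_pattern(x_mm, alpha, a_mm=5.0, lam_mm=632.8e-6, d_mm=3951.0):
    """Sum of two cos^2 fringe patterns offset by +/- d*alpha/2 (envelope ignored)."""
    shift = d_mm * alpha / 2.0

    def fringe(x):
        return math.cos(math.pi * a_mm * x / (lam_mm * d_mm)) ** 2

    return fringe(x_mm - shift) + fringe(x_mm + shift)

def visibility(alpha, a_mm=5.0, lam_mm=632.8e-6, d_mm=3951.0, samples=2000):
    """Visibility (Imax - Imin)/(Imax + Imin), sampled over one fringe period."""
    period = lam_mm * d_mm / a_mm
    values = [two_source_pattern(period * k / samples, alpha, a_mm, lam_mm, d_mm)
              for k in range(samples + 1)]
    i_max, i_min = max(values), min(values)
    return (i_max - i_min) / (i_max + i_min)

# alpha0 = lambda / (2a) is the angular size at which the fringes vanish
alpha0 = 632.8e-6 / (2 * 5.0)
for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"alpha = {frac:.2f} alpha0 -> C = {visibility(frac * alpha0):.2f}")
```

In this idealized model the visibility reduces to |cos(πα/2α0)|, which falls from 1 at α = 0 to 0 at α = α0.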
Suppose the size of the light source is s. Based on what was seen above, the separation of the apertures has a limit value at which the visibility of the interferogram becomes zero. For smaller separations, interference will be observed. In the case of a source made up of two incoherent point sources,

[Figure 4.26: six panels, (a) to (f).]

Figure 4.26 Interference patterns in Young's experiment, with a partially coherent monochromatic light of wavelength 632.8 nm, as a function of the separation of two circular apertures of 0.5 mm radius each.

the separation limit is $a = \lambda z_p/2s$. However, for a continuous source, this limit value is calculated from the van Cittert–Zernike theorem: the position of the first zero of the Fourier transform of the irradiance distribution of the incoherent source is taken as the maximum separation of the two apertures. If the amplitude of the optical fields at the apertures is approximately equal, the result of the van Cittert–Zernike theorem measures the visibility of the interferogram, which is equivalent to measuring the spatial coherence of the fields in the aperture; e.g., if the light source is a square of side $s$, with a constant irradiance distribution, the visibility will be given by $C \propto |\sin(\pi s a/\lambda z_p)/(\pi s a/\lambda z_p)|$. If the light source is circular with a radius of $s/2$, with a constant irradiance distribution, the visibility will be given by $C \propto |J_1(\pi s a/\lambda z_p)/(\pi s a/\lambda z_p)|$. Then, the limiting values for the aperture spacing will be: $a = \lambda z_p/s$ with the square light source, and $a = 1.22\lambda z_p/s$ with the circular light source. The experimental interference patterns generated by two circular openings of radius $w = 0.5$ mm, when the illumination source is incoherent with an irradiance distribution that follows a Gaussian profile [8], are shown in Fig. 4.26. In this experiment, $C \propto e^{-(a/5.95)^2}$, where 5.95 is the width of the

The van Cittert–Zernike theorem states that the region in the plane of the diffraction apertures within which the spatial coherence is not zero is determined by the Fourier transform of the irradiance distribution of the incoherent light source. A good explanation of this theorem is found in Born and Wolf [4].


Gaussian profile in millimeters and is taken as the radius of the coherence region. In the experiment, the opening gap varies from 2 to 12 mm in steps of 2 mm. By examining the profile of the interferograms in the horizontal direction ($x'$), curves similar to those shown in Fig. 4.25 are obtained. When $a = 12$ mm, there is no more interference. Note that in these images no interference rings are observed [as shown in the simulated pattern in Fig. 4.23(a)]. This occurs because in practice, the maximum irradiance of the first ring is very small with respect to the maximum of the central region, as can be seen in the profile of Fig. 4.23(b). From the lessons learned in this section, one can imagine how careful Thomas Young was to look at the interference fringes, which must be colored if the primary source is the sun.
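The visibility expressions quoted from the van Cittert–Zernike theorem can be evaluated with a short script. This is a sketch under assumptions: the visibilities are normalized so that C = 1 at zero separation (the circular case therefore carries a factor of 2), J1 is computed from Bessel's integral by a midpoint rule, and the source size and distance in the example (s = 0.1 mm, zp = 1000 mm) are hypothetical values chosen only for illustration.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def visibility_square(a, s, lam, zp):
    """Normalized visibility for a uniform square source of side s."""
    u = math.pi * s * a / (lam * zp)
    return 1.0 if u == 0 else abs(math.sin(u) / u)

def visibility_circular(a, s, lam, zp):
    """Normalized visibility for a uniform circular source of diameter s."""
    u = math.pi * s * a / (lam * zp)
    return 1.0 if u == 0 else abs(2.0 * bessel_j1(u) / u)

def visibility_gaussian(a_mm, width_mm=5.95):
    """Gaussian visibility exp(-(a/width)^2), with the width quoted in the text."""
    return math.exp(-(a_mm / width_mm) ** 2)

# Hypothetical source: s = 0.1 mm at zp = 1000 mm, lambda = 632.8 nm (all lengths in mm)
lam, s, zp = 632.8e-6, 0.1, 1000.0
print("square source, first zero at a =", lam * zp / s, "mm")
print("circular source, first zero near a =", 1.22 * lam * zp / s, "mm")
for a in range(2, 14, 2):
    print(f"Gaussian source, a = {a:2d} mm -> C = {visibility_gaussian(a):.3f}")
```

For the Gaussian case the visibility at a = 12 mm falls below 2%, consistent with the statement that no interference is observed at that separation.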

4.5 Image Formation with Diffraction

According to geometrical optics, the image of a point formed by an optical system free of optical aberrations is also a point. Suppose the point object is located on the optical axis. The spherical wavefront that diverges from the object when passing through the optical system will be truncated by the aperture diaphragm; i.e., the diaphragm plays the role of the aperture that diffracts the light. This implies that the image cannot be a point. On the other hand, the image of a large object will depend on the spatial coherence of the optical field in the object. This section briefly deals with the topic of imaging by taking diffraction into account in the paraxial approximation (Fresnel/Fraunhofer diffraction).

4.5.1 Image of a point (source) object

Let us consider the system shown in Fig. 4.27. The thin lens represents the imaging optics, and the edge of the lens is the aperture diaphragm. The lens introduces a phase delay in the wavefront as it passes through the diaphragm. With this in mind, the lens can be modeled as a complex variable transmittance that changes the phase of the incident wavefront at the diaphragm. Thus, the process of image formation of a point object can be described as follows: a spherical wavefront that diverges from the point object is truncated by the aperture diaphragm and undergoes a phase shift due to the transmittance of the lens, then converges as a diffraction pattern in the Gaussian image plane. In Fig. 4.27, $s_o$ and $s_i$ are the object and image distances from the thin lens in the plane of the aperture (diaphragm). The phase of the optical field (diverging from the point object) just before the aperture would be

For a detailed discussion of the problem of imaging, by taking diffraction into account, see Goodman [6].

[Figure 4.27 labels: point object O; lens with aperture stop AS in the plane (x, y); Gaussian image plane (x', y'); object distance so and image distance si.]

Figure 4.27 Schematic of an optical imaging system. The aperture diaphragm limits the wavefront and generates diffraction in the image.

$e^{-iks_o}e^{-ik(x^2+y^2)/2s_o}$. The transmittance of the lens at the aperture is given by $t(x,y) = e^{-ik(x^2+y^2)/2f}$. This result is easily deduced and can be found in Introduction to Fourier Optics [6]. Therefore, if $E_0$ is the amplitude of the field at the aperture, the optical field just after the aperture will be

$$E(x,y) = E_0\,e^{-iks_o}\,\mathrm{circ}\!\left(\frac{\sqrt{x^2+y^2}}{w}\right)e^{-ik(x^2+y^2)/2s_o}\,e^{-ik(x^2+y^2)/2f}; \tag{4.66}$$

the function circ( ) describes the edge of the lens, and $w$ is the radius of the lens. The diffraction between the diaphragm (lens) and the Gaussian image plane is

$$I(x',y') = \frac{(\epsilon_0 c/2)}{\lambda^2 s_i^2}\left|\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left[E(x,y)\,e^{ik(x^2+y^2)/2s_i}\right]e^{-i2\pi(x'x+y'y)/\lambda s_i}\,dx\,dy\right|^2. \tag{4.67}$$

Inside the integral are the following phase terms:

$$e^{-ik(x^2+y^2)/2s_o}\,e^{ik(x^2+y^2)/2s_i}\,e^{-ik(x^2+y^2)/2f} = e^{ik[1/s_i - 1/s_o - 1/f](x^2+y^2)/2} = 1. \tag{4.68}$$

This follows from the thin lens equation $1/s_i - 1/s_o = 1/f$ [Eq. (1.42)]. Thus, the image of a point object would be given by

$$I(x',y') = \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 s_i^2}\left|\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\mathrm{circ}\!\left(\frac{\sqrt{x^2+y^2}}{w}\right)e^{-i2\pi(x'x+y'y)/\lambda s_i}\,dx\,dy\right|^2. \tag{4.69}$$

This integral was solved in the circular aperture example (Section 4.3.3). Thus, the image of a point object formed by a lens of diameter $2w$ depends on the diameter of the lens and the distance $s_i$ of the Gaussian image, and is of the form

$$I(r') = I_0\left[\frac{2J_1(2\pi w r'/\lambda s_i)}{2\pi w r'/\lambda s_i}\right]^2, \tag{4.70}$$

where $I_0 = (\epsilon_0 c/2)(\pi w^2 E_0/\lambda s_i)^2$. In geometrical optics, the geometrical PSF (Section 1.9) was defined to describe the shape of the image of a point object. If the optical system is free of aberrations, the geometrical PSF would be a point. Taking diffraction into account, the image of a point object is no longer a point but a diffraction pattern. Analogously, in physical optics (when the wave nature of light is taken into account), the diffractive PSF is defined to describe the shape of the image of a point object. If the optical system is free of aberrations, the diffractive PSF will be given by Eq. (4.70). When an optical system is free of optical aberrations, the system is said to be diffraction-limited.

Pupil function and optical aberrations

Equation (4.69) indicates that in the image formation process, which is in the Fresnel diffraction domain, the lens compensates for the quadratic Fresnel phase term, which results in a Fraunhofer diffraction integral. This equation can be generalized to multiple lens optical systems with an aperture diaphragm separated from the lenses. The diffraction aperture would be the edge of the exit pupil, and the distance at which the Fresnel diffraction occurs would be the distance between the exit pupil and the Gaussian image plane, say $s_{ps}$. The pupil is described by a function $P(x,y)$ that includes the geometry of the pupil and a possible variation of the transmittance in the pupil. If, in addition, the optical system presents optical aberrations, these affect the phase of the wavefront in the pupil, which can be included by multiplying the function $P(x,y)$ by a term $e^{-ikW(x,y)}$, where $W(x,y)$ is the variation of the real wavefront with respect to the ideal spherical wavefront of radius $s_{ps}$. Thus, the PSF of an optical system in general will be given by

$$I(x',y') = \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 s_{ps}^2}\left|\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}P(x,y)\,e^{-ikW(x,y)}\,e^{-i2\pi(x'x+y'y)/\lambda s_{ps}}\,dx\,dy\right|^2. \tag{4.71}$$

The function $W(x,y)$ is a polynomial whose terms describe the optical aberrations present in the optical system [9]; e.g., primary aberrations such as astigmatism, coma, and spherical aberration in the wavefront are given by $a_a(x^2 - y^2)$, $a_c\,y(x^2 + y^2)$, and $a_s(x^2 + y^2)^2$, respectively. Defocus can also be included as an aberration, given by $a_d(x^2 + y^2)$. The coefficients $a_d$, $a_a$, $a_c$, and $a_s$ depend on the parameters of the optical system. The PSF of a diffraction-limited optical system is shown in Fig. 4.28, in which the distance between the exit pupil and the Gaussian image plane is $s_{ps} = 100$ mm and the exit pupil


Figure 4.28 Diffraction pattern [function Jinc( )] without aberrations. The radius of the Airy disk is 7.7 μm.

diameter is $2w = 10$ mm, with $\lambda = 632.8$ nm. The radius of the Airy disk is $r_A = 7.7$ μm. The size of the box in the image is 316 μm on a side. Figure 4.29 shows the diffraction patterns in the same optical system as Fig. 4.28. The wavefront at the exit pupil is affected by aberrations such as: defocus, with $a_d = 5 \times 10^{-5}$ mm$^{-1}$; astigmatism, with $a_a = 3.5 \times 10^{-5}$ mm$^{-1}$; coma, with $a_c = 1.25 \times 10^{-5}$ mm$^{-2}$; and spherical aberration, with $a_s = 2 \times 10^{-6}$ mm$^{-3}$. The size of the image box in all cases is 316 μm on a side. The first thing to note is that the presence of any of these aberrations increases the size of the PSF compared with the PSF without aberrations shown in Fig. 4.28. Note that the defocus aberration corresponds to a Fresnel diffraction pattern because in this case, the term $e^{-ikW(x,y)}$ in the integral of Eq. (4.71) is equal to $e^{-ika_d(x^2+y^2)}$, which has the form of the quadratic Fresnel factor. The number of Fresnel zones introduced by the defocus in the pupil will be $N = 2a_d w^2/\lambda$. In the example given here, this is $N = 3.95$. The defocus coefficient is given by $a_d = \Delta s/(2s_{ps}^2)$, where $\Delta s$ is the defocus. In our example, $\Delta s = -1$ mm, and the negative sign means that defocus occurs when the image plane moves 1 mm closer to the exit pupil.
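The numbers quoted in this example follow from simple arithmetic. The sketch below assumes the Airy radius $r_A = 1.22\lambda s_{ps}/(2w)$ and uses the magnitude of the defocus coefficient; the function names are illustrative and not part of the text.

```python
def airy_radius_mm(lam_mm, s_ps_mm, w_mm):
    """Radius of the Airy disk, r_A = 1.22 * lambda * s_ps / (2 w), in mm."""
    return 1.22 * lam_mm * s_ps_mm / (2.0 * w_mm)

def fresnel_zones_from_defocus(a_d_per_mm, w_mm, lam_mm):
    """Number of Fresnel zones introduced by defocus, N = 2 a_d w^2 / lambda."""
    return 2.0 * a_d_per_mm * w_mm ** 2 / lam_mm

def defocus_from_coefficient(a_d_per_mm, s_ps_mm):
    """Defocus distance from a_d = delta_s / (2 s_ps^2), i.e., delta_s = 2 a_d s_ps^2."""
    return 2.0 * a_d_per_mm * s_ps_mm ** 2

lam_mm = 632.8e-6   # wavelength, 632.8 nm expressed in mm
s_ps_mm = 100.0     # exit pupil to Gaussian image plane
w_mm = 5.0          # exit pupil radius (diameter 10 mm)
a_d = 5e-5          # magnitude of the defocus coefficient, mm^-1

print(f"Airy radius: {airy_radius_mm(lam_mm, s_ps_mm, w_mm) * 1e3:.2f} um")
print(f"Fresnel zones: {fresnel_zones_from_defocus(a_d, w_mm, lam_mm):.2f}")
print(f"|defocus|: {defocus_from_coefficient(a_d, s_ps_mm):.2f} mm")
```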

[Figure 4.29: four panels, (a) to (d).]

Figure 4.29 Diffraction patterns corresponding to the primary aberrations when the diameter of the exit pupil is 10 mm and the distance between the exit pupil and the Gaussian image plane is 100 mm, with $\lambda = 632.8$ nm. The size of the image box in all cases is 0.316 mm on a side.


In practice, the PSF would be the result of a combination of different aberrations; e.g., the experimental PSF in Fig. 4.1 corresponds to an optical system (human eye model) affected by defocus (myopia), astigmatism, and spherical aberration.

4.5.2 Resolution in the image (two points)

From the point of view of geometrical optics, if an object is formed by two points separated by a certain distance, the image should have two points separated by a distance that depends on the magnification of the system. Regardless of the distance that separates the two points of the object, two points should be observed in the image. But as mentioned before, the image of a point object is an irradiance distribution called PSF. Therefore, the image of two spatially incoherent points would be two diffraction patterns that can overlap (in intensity) depending on the distance that separates them. This means that there may be a situation where the two diffraction patterns overlap, resulting in an irradiance distribution where the individual patterns cannot be distinguished. What is the minimum distance between these two diffraction patterns at which they can still be identified? This distance is called the limit of spatial resolution, and it depends on the diameter of the exit pupil of the optical system and the distance of the image.

If the imaging system is diffraction-limited, the size of the image of a point object is taken to be equal to that of the Airy disk. The incoherent superposition of the images of two identical point objects is shown in Fig. 4.30 when they are separated by $2r_A$, $1.5r_A$, $r_A$, and $0.5r_A$, where $r_A$ is the radius of the Airy disk. The distance separating the images is measured from the center of each of the diffraction patterns, which corresponds to the distance of the point images according to geometrical optics. The calculations were made taking into account the same parameters as those shown in Fig. 4.28; i.e., when the distance between the exit pupil and the Gaussian image plane is $s_{ps} = 100$ mm and the exit pupil diameter is $2w = 10$ mm, with $\lambda = 632.8$ nm. The profiles of each of the diffraction patterns are shown in Fig. 4.30(a), the result of the incoherent superposition of the two diffraction patterns is shown in Fig. 4.30(b), and the diffractive images of the superposition are shown in Fig. 4.30(c). When the separation is $2r_A$ or $1.5r_A$, the images of the two points can be clearly identified. When the separation is $r_A$, part of the patterns overlap and the two diffractive images can still be resolved, but when they get closer, at the distance $0.5r_A$, it is no longer possible to identify the two point objects. Although there may be distances between $r_A$ and $0.5r_A$ for which the images can still be resolved, the separation $r_A$ is usually established as a resolution criterion. This is the Rayleigh criterion [1]: Images of two incoherent point sources are resolved when the center of the Airy pattern of one of the images falls on the first minimum of the Airy pattern of the other image.
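The behavior described for the four separations can be reproduced with a short numerical sketch. It assumes the diffraction-limited profile of Eq. (4.70) with $s_i$ replaced by $s_{ps}$, adds two such profiles in irradiance, and compares the value midway between the images with the value at one of the geometrical image centers. J1 is computed from Bessel's integral, and the function names are illustrative, not taken from the text.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy(r_um, w_mm=5.0, s_ps_mm=100.0, lam_nm=632.8):
    """Normalized Airy pattern [2 J1(v)/v]^2 with v = 2 pi w r / (lambda s_ps)."""
    v = 2.0 * math.pi * w_mm * (r_um * 1e-3) / (lam_nm * 1e-6 * s_ps_mm)
    return 1.0 if v == 0 else (2.0 * bessel_j1(v) / v) ** 2

def two_point_profile(x_um, separation_um):
    """Incoherent sum of two Airy patterns centered at +/- separation/2."""
    return airy(abs(x_um - separation_um / 2)) + airy(abs(x_um + separation_um / 2))

def midpoint_to_center_ratio(separation_um):
    """Irradiance midway between the images divided by the irradiance at one image center."""
    return two_point_profile(0.0, separation_um) / two_point_profile(separation_um / 2, separation_um)

r_airy_um = 1.22 * 632.8e-6 * 100.0 / 10.0 * 1e3  # Airy radius in micrometers
for factor in (2.0, 1.5, 1.0, 0.5):
    ratio = midpoint_to_center_ratio(factor * r_airy_um)
    print(f"separation = {factor} r_A -> midpoint/center = {ratio:.2f}")
```

A ratio below 1 indicates a dip between the two images; at the Rayleigh separation the dip is roughly a quarter of the peak, while at $0.5r_A$ the ratio exceeds 1 and the dip disappears.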

[Figure 4.30: columns for separations 2rA, 1.5rA, rA, and 0.5rA; rows (a) and (b) plot irradiance I (in units of I0) versus x' (μm), from -20 to 20 μm; row (c) shows the diffraction images.]

Figure 4.30 Incoherent superposition of the images of two point sources when the images are separated by $2r_A$, $1.5r_A$, $r_A$, and $0.5r_A$. (a) Profiles of the irradiance of each image, (b) profiles of the image resulting from the superposition of the two images, and (c) diffraction patterns of the image of two point sources.

Given that $r_A = 1.22\lambda d/(2w)$, the size of the aperture diaphragm of the optical system is of great relevance for a fixed distance; the larger the diameter, the better the resolution. That is why it is desirable in astronomy to have large primary mirrors in telescopes. On the other hand, in other cases, such as lithography, it is possible to modify (decrease) the wavelength of the light coming from the object to increase the resolution. The same thing happens in electron microscopy, where the wavelength is of the order of 1 Å. Of course, the quality of the diffraction pattern depends on the optical aberrations, which increase with the diameter of the aperture stop. Therefore, increasing the diameter of the diaphragm does not necessarily improve resolution. If the two point sources are coherent with each other, the result of the superposition of the images depends on the initial phase of the light in each of the sources; e.g., if the phase difference between the two sources is $\pi$, then the distance between the images can be reduced below $r_A$. But if the phase difference is 0, the separation of the images must be increased above $r_A$ in order to resolve them [6].


Visual acuity

In visual optics (optometry and ophthalmology), instead of using the concept of resolution as explained above, the term visual acuity is used. Although in practice these two concepts are equivalent, a standard for the human eye has been established that determines the conditions in which a person is said to have good visual acuity. If at a distance of 20 feet (6 m) a person can resolve two lines separated by 1′ of arc, that person is said to have 20/20 visual acuity (emmetropic eye). Following the Rayleigh criterion, the angular resolution limit is given by

$$(\Delta\theta)_{\min} = \frac{r_A}{s_{ps}} = 1.22\frac{\lambda}{2w}. \tag{4.72}$$

For the emmetropic eye, the resolution limit will be $(\Delta\theta)_{\min,\mathrm{eye}} = 1'$. Taking 540 nm as the value of the wavelength in the center of the visible spectrum, the diameter of the pupil of the eye for which the value of the resolution limit is obtained will be $2w = 2.3$ mm. If we consider the Gullstrand–Emsley schematic eye (Fig. 1.68), where $f' = 22.05$ mm ($= s_{ps}$), the size of a point source in the retina would be a circular spot of 12.8 μm in diameter.

4.5.3 Image of an extended object

An object can be considered as an infinite set of points. From the discussion in the previous section, we already know how the image of a point object is formed; this is the PSF given by Eq. (4.71). The generalization to a set of points is not immediate but rather depends on the degree of coherence of the illumination of the object. For example, for the incoherent case, the image is given by

$$I_i(x') = I_o(x') * \mathrm{PSF}, \tag{4.73}$$

where $I_o(x')$ represents the Gaussian image, PSF is the diffractive point spread function [Eq. (4.71)], and the symbol $*$ is the operation of convolution. This means that every point in the Gaussian image is affected by the PSF of the system in the same way. For coherent lighting, the situation is more complex because the convolution of the Gaussian image must be performed with a function that depends on the optical system (pupil geometry and optical aberrations) and the characteristics of the object (spatial frequency content). In other words, if the object is changed (keeping the same optical system), the function changes [10,11]. A similar situation occurs with partially coherent light. A generalization of the partially coherent light imaging process can be seen in Mejía and Suárez [12].
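For the incoherent case, Eq. (4.73) can be illustrated with a one-dimensional discrete convolution. The sketch below is only an illustration: the sampled PSF is a hypothetical five-sample kernel normalized to unit sum (not the Airy pattern), and the Gaussian image consists of two point images on a coarse grid.

```python
def convolve_full(signal, kernel):
    """Discrete 'full' convolution of two sequences (length len(signal) + len(kernel) - 1)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

# Hypothetical sampled PSF (sums to 1) and a Gaussian image made of two point images
psf = [0.05, 0.25, 0.40, 0.25, 0.05]
gaussian_image = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

# Each point of the Gaussian image is replaced by a copy of the PSF; copies add in irradiance
image = convolve_full(gaussian_image, psf)
print([round(v, 2) for v in image])
```

Because the PSF is normalized, the total irradiance of the image equals that of the Gaussian image; where the two copies of the PSF overlap, their irradiances simply add.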

This is valid only in the paraxial region. It should be noted that for larger fields of view the PSF varies with angle; e.g., the coma aberration increases as the chief ray increases its inclination.


4.6 Diffraction Gratings

A diffraction grating consists of a large number of identical elements that diffract light. These elements can be apertures in an opaque screen, steps or grooves in a substrate, or even an interference pattern of straight parallel fringes etched in amplitude or phase into a photosensitive material. The position of the irradiance maxima produced by diffraction gratings is a function of wavelength; thus, diffraction gratings are widely used in the spectral measurement of the wavelength of light.

In this section, a basic configuration of diffraction gratings is analyzed, consisting of an array of $N$ rectangular openings of width $b$ and length $c$ separated from each other by a distance $a$ ($> b$), as shown in Fig. 4.31. Assuming that $b \ll c$, it is sufficient to analyze the diffraction pattern in one dimension along the aperture distribution (axis $x$ in Fig. 4.31). The distance at which the diffraction pattern produced by diffraction gratings is usually observed is such that $d \gg Na$; thus, the observed diffraction pattern corresponds to Fraunhofer diffraction. In the one-dimensional case, the profile of each of the apertures can be described by the function rect( ), which is introduced in Section 4.3.3. On the other hand, the interference of $N$ point sources [Fig. 3.47(b)] is described in Section 3.6. The interference of $N$ sources with rectangular geometry is equivalent to the diffraction of the array of rectangular apertures. By illuminating the set of apertures with an (assumed) plane wave of amplitude $E_0$ in the orthogonal direction, the optical field at the set of apertures can be written simply as

$$E(x) = E_0\,\mathrm{rect}\!\left(\frac{x}{b}\right) * \sum_{j=1}^{N}\delta(x - ja), \tag{4.74}$$

where $\delta(x - ja)$ determines the position of the $j$th aperture; with the convolution operation, the rectangular shape of the aperture is copied at each point located at $x = ja$.

Figure 4.31 Array of N identical rectangular apertures.

The Fraunhofer diffraction would be

$$I(x') = \frac{E_0^2(\epsilon_0 c/2)}{\lambda^2 d^2}\left|\int_{-\infty}^{\infty}\left[\mathrm{rect}\!\left(\frac{x}{b}\right) * \sum_{j=1}^{N}\delta(x - ja)\right]e^{-i2\pi x'x/\lambda d}\,dx\right|^2, \tag{4.75}$$

which turns out to be

$$I(x') = I_0\left[\frac{\sin(\pi x'b/\lambda d)}{\pi x'b/\lambda d}\right]^2\left|\sum_{j=1}^{N}e^{-i2\pi x'(ja)/\lambda d}\right|^2. \tag{4.76}$$

$I_0$ absorbs $(\epsilon_0 c/2)(E_0/\lambda d)^2$ and the other constants that come out of the integral. The sum in the second factor on the right-hand side is solved in the same way as in Eq. (3.101). The end result is

$$I(x') = I_0\left[\frac{\sin(\pi x'b/\lambda d)}{\pi x'b/\lambda d}\right]^2\left[\frac{\sin(N\pi x'a/\lambda d)}{\sin(\pi x'a/\lambda d)}\right]^2. \tag{4.77}$$

This result differs from the result given in Eq. (3.104) in the term corresponding to the diffraction of a single aperture. Therefore, the diffraction pattern produced by an array of $N$ identical apertures is equal to the interference pattern of $N$ point sources (located in the center of the apertures) modulated by the diffraction pattern of one of the openings. The diffraction pattern profile for an array of eight identical rectangular apertures separated by $a = 4b$ is shown in Fig. 4.32. The array is located

[Figure 4.32 axes: irradiance I (in units of I0), from 0 to 64, versus x' (in units of λd/a), from -8 to 8; the envelope zero at λd/b is marked.]

Figure 4.32 Diffraction pattern produced by an array of eight identical rectangular apertures spaced $a = 4b$, with $b$ being the width of each aperture. The dashed curve corresponds to the diffraction pattern of a single aperture.


symmetrically with respect to the optical axis (axis $z$ in Fig. 4.31). Figure 4.32 includes the diffraction profile (dashed curve) of an aperture as well as the modulation this produces in the interference pattern generated by eight point sources located in the center of the apertures. The unit of the horizontal scale is given as $\lambda d/a$, i.e., the separation between the principal maxima of the interference pattern. On the other hand, at distances $\pm m'\lambda d/b$ ($m' = 1, 2, 3, \ldots$) from the center at 0 (optical axis) are the zeros of the diffraction pattern. Because $a = 4b$, the principal maximum of the interference pattern located at $4(\lambda d/a)$ falls right on the first zero of the diffraction pattern located at $\lambda d/b$, and therefore the maximum of the interference is not observed there. In the central lobe there will be a total of $2(a/b) - 1$ principal maxima, and in the side lobes there will be a total of $(a/b) - 1$ principal maxima. Principal maxima are at $\pm m\lambda d/a$ ($m = 0, 1, 2, 3, \ldots$), with $m$ used to label the maxima or the diffraction order. Thus, the central maximum will be the zero order of diffraction, the two maxima next to the central one will be diffraction orders $+1$ and $-1$, and so on. As $N$ increases, the energy in the secondary maxima decreases (approaching zero) and the principal maxima are in the form of very sharp peaks. The peak of order $m$ subtends the angle $\theta_m$ with respect to the grating center (on the optic axis) and is given by

$$\tan\theta_m = m\frac{(\lambda d/a)}{d}. \tag{4.78}$$

In the Fraunhofer approximation, the function tan( ) can be changed to the function sin( ); thus, the equation of the diffraction grating that angularly locates a diffraction order is

$$m\lambda = a\sin\theta_m. \tag{4.79}$$
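The features of Fig. 4.32 discussed above (principal maxima at integer multiples of $\lambda d/a$, peak value $N^2 I_0$ on axis, and the missing fourth order when $a = 4b$) can be checked by evaluating Eq. (4.77) directly. This is a minimal sketch: the position is expressed in units of $\lambda d/a$, the function name is illustrative, and the limits at the removable singularities are handled explicitly.

```python
import math

def grating_irradiance(x_units, n_slits=8, a_over_b=4.0):
    """I/I0 from Eq. (4.77), with x' given in units of lambda*d/a."""
    gamma = math.pi * x_units            # pi x' a / (lambda d)
    beta = gamma / a_over_b              # pi x' b / (lambda d)
    # Single-aperture diffraction envelope, [sin(beta)/beta]^2
    envelope = 1.0 if abs(beta) < 1e-12 else (math.sin(beta) / beta) ** 2
    # N-source interference factor, [sin(N gamma)/sin(gamma)]^2, with limit N^2 at principal maxima
    s = math.sin(gamma)
    if abs(s) < 1e-9:
        interference = float(n_slits) ** 2
    else:
        interference = (math.sin(n_slits * gamma) / s) ** 2
    return envelope * interference

for m in range(0, 6):
    print(f"order m = {m}: I/I0 = {grating_irradiance(float(m)):.2f}")
```

With eight apertures the zero order reaches 64 in units of $I_0$, and the order $m = 4$ vanishes because it coincides with the first zero of the single-aperture envelope.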

When the lighting source is polychromatic, at zeroth order there will be a maximum of the same color as the source, but for $m \neq 0$, there will be spatially separated maxima of different colors; e.g., if the source is white light, a continuous spectrum will be seen for a given $m$ ($\neq 0$). If the spectrum of the source consists of two wavelengths, what is the smallest difference in wavelength that can be resolved in the diffraction pattern? This problem can be treated similarly to the way spatial resolution of two-point images is handled in the previous section. Equation (3.112) allows the separation of the minima between the principal maxima of the interference pattern of $N$ sources to be determined. The separation between two consecutive minima will be given by

$$\Delta x' = \frac{\lambda d}{Na}, \tag{4.80}$$

which in turn is half the width of the principal maxima. By applying the Rayleigh criterion, $\Delta x'$ would be the minimum separation in $x'$ between the


diffraction maxima for two wavelengths $\lambda$ and $\lambda + \Delta\lambda$ for the diffraction order $m$. On the other hand, the angular separation between the two maxima can be calculated as follows: let $x'$ be the position of the $m$th order, and then $x'/d = \tan\theta_m \approx \sin\theta_m$; thus,

$$\frac{\Delta x'}{d} = \Delta\theta_m\cos\theta_m. \tag{4.81}$$

By inserting Eq. (4.80) in Eq. (4.81), the angular separation is equal to

$$\Delta\theta_m = \frac{\lambda}{Na\cos\theta_m}. \tag{4.82}$$

On the other hand, the angular separation can also be evaluated from Eq. (4.79). Differentiating,

$$m\,\Delta\lambda = a\,\Delta\theta_m\cos\theta_m, \tag{4.83}$$

i.e.,

$$\Delta\theta_m = \frac{m\,\Delta\lambda}{a\cos\theta_m}. \tag{4.84}$$

The resolving power of the diffraction grating is defined as

$$R = \frac{\lambda}{(\Delta\lambda)_{\min}}, \tag{4.85}$$

where $(\Delta\lambda)_{\min}$ is the difference in wavelength that can be resolved around the wavelength $\lambda$. By equating Eqs. (4.82) and (4.84), the resolving power of the diffraction grating can also be written as

$$R = Nm; \tag{4.86}$$

i.e., the resolving power increases with $N$ and the diffraction order $m$. In a diffraction grating, the density of diffraction elements (apertures) is usually in the hundreds or thousands per millimeter. This density is the parameter that is usually used to characterize the grating, and its value is given in lines per millimeter. A line is equivalent to a diffracting element. For example, in an optical store catalog, a 12.5 mm wide diffraction grating has 500 lines/mm. From this information, it can be concluded that $N = 6250$. If the grating is completely illuminated with a plane wave, the resolving power of the order $m = 1$ would be $R = 6250$. Thus, for a light signal around $\lambda = 540$ nm, two wavelengths can be separated whose difference is $(\Delta\lambda)_{\min} = 0.086$ nm.

The resolving power on a diffraction grating is a measure of the ability to spatially separate two wavelengths.


If the analysis is done for the second order of diffraction, the wavelength difference would be $(\Delta\lambda)_{\min} = 0.043$ nm. In practice, gratings are designed so that the amount of light in the diffraction orders can be controlled. In the case of the orthogonally illuminated planar grating, most of the light is in the zero order, where the spectrum cannot be resolved. Thus, it would be convenient to have most of the light in the first or second orders. This can be done, e.g., by etching steep steps into the surface of a mirror. If the mirror is also concave, it is possible to focus the different orders of diffraction, which is common in spectrometers. For an extension to this topic, Diffraction Gratings and Applications [13] and Diffraction Grating Handbook [14] can be consulted.
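The catalog example can be reproduced with Eqs. (4.85) and (4.86). The sketch below assumes the grating is fully illuminated, so that $N$ is the product of the grating width and the line density; the function names are illustrative.

```python
def resolving_power(width_mm, lines_per_mm, order):
    """R = N m, with N = (illuminated width) x (line density)."""
    n_lines = width_mm * lines_per_mm
    return n_lines * order

def min_wavelength_difference_nm(lam_nm, width_mm, lines_per_mm, order):
    """Smallest resolvable wavelength difference, (delta lambda)_min = lambda / R."""
    return lam_nm / resolving_power(width_mm, lines_per_mm, order)

# Grating from the example: 12.5 mm wide, 500 lines/mm, light around 540 nm
for m in (1, 2):
    r = resolving_power(12.5, 500, m)
    dl = min_wavelength_difference_nm(540.0, 12.5, 500, m)
    print(f"m = {m}: R = {r:.0f}, (delta lambda)_min = {dl:.3f} nm")
```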

References

[1] E. Hecht, Optics, Global Edition, 5th ed., Pearson, Harlow, England (2017).
[2] A. Fresnel, “The diffraction of light,” Chap. 8 in Great Experiments in Physics: Firsthand Accounts from Galileo to Einstein, 2nd ed., M. H. Shamos, Ed., Dover Publications, Mineola, New York, 108–120 (1987).
[3] J. D. Jackson, Classical Electrodynamics, 3rd ed., Wiley, New York (1999).
[4] M. Born and E. Wolf, Principles of Optics, 6th ed., Pergamon Press, Oxford, England (1993).
[5] A. Sommerfeld, Lectures on Theoretical Physics, Vol. 4: Optics, Academic Press, New York (1964).
[6] J. W. Goodman, Introduction to Fourier Optics, Roberts and Company Publishers, Englewood, Colorado (2005).
[7] T. Young, “The Interference of Light,” Chap. 7 in Great Experiments in Physics: Firsthand Accounts from Galileo to Einstein, 2nd ed., M. H. Shamos, Ed., Dover Publications, Mineola, New York, 93–107 (1987).
[8] Y. Mejía and A. I. González, “Measuring spatial coherence by using a mask with multiple apertures,” Opt. Commun. 273(2), 428–434 (2007).
[9] Y. Mejía, “El frente de onda y su representación con polinomios de Zernike,” Cien. Tecnol. Salud Vis. Ocul. 9(2), 145–166 (2011).
[10] H. H. Hopkins, “On the diffraction theory of optical images,” Proc. R. Soc. Lond. A Math. Phys. Sci. 217(1130), 408–432 (1953).
[11] B. J. Thompson, “Image formation with partially coherent light,” Chap. 4 in Progress in Optics, Vol. 7, E. Wolf, Ed., Elsevier, Amsterdam, 169–230 (1969).
[12] Y. Mejía and D. Suárez, “Optical transfer function with partially coherent monochromatic illumination,” Optik 193(163021), 1–4 (2019).
[13] E. G. Loewen and E. Popov, Diffraction Gratings and Applications, CRC Press, Boca Raton, Florida (2018).
[14] C. A. Palmer and E. G. Loewen, Diffraction Grating Handbook, 6th ed., Newport Corporation, Rochester, New York (2005).

Appendix A

Ray Tracing

Considering the paraxial approximation, a method can be established to propagate an optical ray in a system of refracting or reflecting surfaces. This makes it possible to determine the paraxial properties of the optical system, namely, focal points, principal points, and nodal points. In Fig. A.1, the angles $u$ and $u'$ of inclination of the incident and refracted rays on a spherical surface of radius $R$ are indicated, as well as the height $y$ of the incident ray on the spherical surface. In the paraxial approximation, the height of the ray at the surface is measured along the dashed line passing through the vertex V of the surface, and the tangents of the angles are taken as the angles (in radians). Therefore,

$$ u = -\frac{y}{s} \tag{A.1} $$

and

$$ u' = -\frac{y}{s'}. \tag{A.2} $$

Multiplying Gauss’ equation for the spherical surface of refraction [Eq. (1.21)] by the height $y$,

$$ \frac{n' y}{s'} - \frac{n y}{s} = \frac{n' - n}{R}\, y. \tag{A.3} $$

Figure A.1 Height and angles of inclination of a ray on a spherical refracting surface.


Figure A.2 Ray propagation on two surfaces.

And from Eqs. (A.1) and (A.2),

$$ n' u' = n u - (n' - n)\, c\, y, \tag{A.4} $$

where $c = 1/R$ is the curvature of the surface. Equation (A.4) allows the angle of inclination of the ray to be propagated throughout the optical system. To generalize propagation on various surfaces, let us consider two spherical surfaces of an optical system, as shown in Fig. A.2. The equation to propagate the angle from left to right of the $j$th surface becomes

$$ n_j u_j = n_{j-1} u_{j-1} - P_j y_j, \tag{A.5} $$

where $P_j = (n_j - n_{j-1}) c_j$ is the refractive power, with $n'_{j-1} = n_j$ and $c_j = 1/R_j$. On the other hand, the object for the $(j+1)$th surface will be the image of the $j$th surface, and the object and image distances are related by the separation or thickness $t_j$ between the surfaces as

$$ s_{j+1} = s'_j - t_j, \tag{A.6} $$

which is equal to

$$ -\frac{y_{j+1}}{u_j} = -\frac{y_j}{u_j} - t_j, \tag{A.7} $$

i.e.,

$$ y_{j+1} = y_j + \tau_j (n_j u_j), \tag{A.8} $$

where $\tau_j = t_j / n_j$ is the reduced thickness. Thus, whereas the angle is propagated with Eq. (A.5), the height is propagated with Eq. (A.8). With these two equations, it can be seen how a ray evolves in an optical system with any number of refracting and/or reflecting surfaces. Equations (A.5) and (A.8) are the basis of the paraxial ray tracing y-nu method [1, 2].
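Equations (A.5) and (A.8) map directly onto a short loop. The following Python sketch is only an illustration: the function name `trace_ynu` and the `(R, n_after, t_after)` tuple layout are assumptions made here, not notation from the text. As a check, it traces a ray parallel to the axis through the plano-convex BK7 lens quoted in Appendix D ($R_1 = 25.84$ mm, $R_2 = \infty$, $t = 4.5$ mm, $n_d = 1.5168$) and recovers its focal length from $f = -y_1/u_M$.

```python
import math

def trace_ynu(surfaces, y0, nu0, n0=1.0):
    """Paraxial y-nu trace (hypothetical helper, not from the text).

    surfaces: list of (R, n_after, t_after); R is the radius of curvature
    (math.inf for a flat surface), n_after the index after the surface and
    t_after the axial distance to the next surface (ignored for the last).
    Returns a list of (y_j, n_j * u_j) pairs, one per surface.
    """
    y, nu, n_prev = y0, nu0, n0
    history = []
    for k, (R, n_after, t_after) in enumerate(surfaces):
        c = 0.0 if math.isinf(R) else 1.0 / R
        power = (n_after - n_prev) * c            # P_j = (n_j - n_{j-1}) c_j
        nu = nu - power * y                       # Eq. (A.5)
        history.append((y, nu))
        if k < len(surfaces) - 1:
            y = y + (t_after / n_after) * nu      # Eq. (A.8), reduced thickness
        n_prev = n_after
    return history

# Plano-convex BK7 lens in air, ray parallel to the axis (u_0 = 0, y_1 = 1).
lens = [(25.84, 1.5168, 4.5), (math.inf, 1.0, 0.0)]
history = trace_ynu(lens, y0=1.0, nu0=0.0)
u_last = history[-1][1] / 1.0                     # last medium is air
focal_length = -1.0 / u_last                      # f = -y_1 / u_M with y_1 = 1
print(f"f = {focal_length:.2f} mm")
```

Carrying the product $n_j u_j$ instead of $u_j$ keeps the refraction step free of divisions, which is the usual motivation for the y-nu form.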


Focal length of a set of lenses

From Eq. (A.5), the focal length of a set of refracting surfaces (lenses) can be determined. Suppose that there are $M$ refracting surfaces. Then, for the $M$ surfaces,

$$ \begin{aligned} n_M u_M &= n_{M-1} u_{M-1} - P_M y_M \\ n_{M-1} u_{M-1} &= n_{M-2} u_{M-2} - P_{M-1} y_{M-1} \\ &\;\;\vdots \\ n_1 u_1 &= n_0 u_0 - P_1 y_1, \end{aligned} \tag{A.9} $$

and by adding these equations,

$$ n_M u_M = n_0 u_0 - \sum_{j=1}^{M} P_j y_j. \tag{A.10} $$

The focal length of the system is determined when $s_0 = -\infty$, i.e., $u_0 = 0$ (and $y_1 = y_0$). Therefore,

$$ f = -\frac{y_1}{u_M} \tag{A.11} $$

and

$$ \frac{1}{f} = \frac{1}{n_M} \sum_{j=1}^{M} \frac{y_j}{y_1} P_j. \tag{A.12} $$

Focal length of a simple lens

As a result of Eq. (A.12), the focal length of a lens of thickness $t$ immersed in air can be calculated. The geometry for calculating the focal length is illustrated in Fig. A.3. In this case, $M = 2$ and $n_0 = n_2 = 1$. Therefore, Eq. (A.12) is reduced to

$$ \frac{1}{f} = P_1 + \frac{y_2}{y_1} P_2. \tag{A.13} $$

Using Eq. (A.8) for $y_2$,

$$ \frac{1}{f} = P_1 + P_2 + P_2 \frac{\tau_1 (n_1 u_1)}{y_1} \tag{A.14} $$

and, with Eq. (A.5) for $n_1 u_1$,

$$ \frac{1}{f} = P_1 + P_2 - \frac{t}{n_l} P_1 P_2, \tag{A.15} $$

where $n_l = n_1$ and $t = t_1$. This last equation is equivalent to

$$ \frac{1}{f} = \frac{(n_l - 1)}{R_1} + \frac{(1 - n_l)}{R_2} - \frac{(n_l - 1)}{R_1} \frac{(1 - n_l)}{R_2} \frac{t}{n_l}. \tag{A.16} $$

Thus, the result for the focal length of a lens of thickness $t$ given in Eq. (1.50) is obtained, i.e.,

$$ \frac{1}{f} = (n_l - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{(n_l - 1)^2}{n_l} \frac{t}{R_1 R_2}. \tag{A.17} $$

Figure A.3 Calculation of the focal length of a lens.
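As a numerical illustration of Eq. (A.15), the sketch below evaluates the focal length of a thick lens in air. The function name and argument order are assumptions made for this example, and the biconvex lens in the usage lines (radii of ±50 mm, 10 mm thick, index 1.5) is an arbitrary test case, not one taken from the text.

```python
import math

def thick_lens_focal_length(R1, R2, t, n_lens):
    """Focal length of a thick lens in air from Eq. (A.15).

    R1, R2: radii of curvature (math.inf for a flat face), t: axial
    thickness, n_lens: refractive index. Lengths share one unit.
    """
    c1 = 0.0 if math.isinf(R1) else 1.0 / R1
    c2 = 0.0 if math.isinf(R2) else 1.0 / R2
    P1 = (n_lens - 1.0) * c1                      # power of the first surface
    P2 = (1.0 - n_lens) * c2                      # power of the second surface
    inv_f = P1 + P2 - (t / n_lens) * P1 * P2      # Eq. (A.15)
    return 1.0 / inv_f

# Arbitrary biconvex example and the plano-convex BK7 lens of Appendix D.
f_biconvex = thick_lens_focal_length(50.0, -50.0, 10.0, 1.5)
f_plano = thick_lens_focal_length(25.84, math.inf, 4.5, 1.5168)
print(f"biconvex: {f_biconvex:.3f} mm, plano-convex: {f_plano:.3f} mm")
```

For the plano-convex case the second power vanishes, so the thickness drops out of Eq. (A.15) and the focal length is simply $1/P_1$.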

Principal planes

Principal planes can be located with respect to the vertices of the lens. Thus, the secondary principal plane is at a distance $V'P'$ from vertex $V'$, and the primary principal plane is at a distance $VP$ from vertex $V$. The distance between $V'$ and $F'$ is called the back focal length, $f_b = V'F'$, and the distance between $V$ and $F$ is called the front focal length, $f_f = VF$. Thus, $V'P' = f_b - f$, and because

$$ \frac{y_1}{f} = \frac{y_2}{f_b}, \tag{A.18} $$

from Eq. (A.8),

$$ f_b = f\, \frac{y_1 + \tau_1 (n_1 u_1)}{y_1} \tag{A.19} $$

and, with Eq. (A.5),

$$ f_b = f\, \frac{y_1 + \tau_1 (-P_1 y_1)}{y_1} \tag{A.20} $$

or

$$ f_b = f - f\, \frac{t P_1}{n_l}. \tag{A.21} $$

Therefore,

$$ V'P' = -f\, \frac{t (n_l - 1)}{n_l R_1}. \tag{A.22} $$

Note that $V'P'$ depends on the power of the first side of the lens. If it is zero ($R_1 = \infty$), then the secondary principal plane is at $V'$. To determine $VP$, consider that the lens is rotated and the same procedure of the previous case is carried out; therefore,

$$ VP = f\, \frac{t (n_l - 1)}{n_l R_2}. \tag{A.23} $$
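Equations (A.21) and (A.22) can be checked numerically with a few lines of Python. This is a sketch under stated assumptions: the function name is invented here, and the input values are those quoted in Appendix D for the negative BK7 lens, taken with the signs $f_d = -50$ mm and $R_1 = -25.84$ mm ($t = 3.5$ mm, $n_d = 1.5168$).

```python
import math

def secondary_principal_plane(f, R1, t, n_lens):
    """Return (f_b, V'P') for a thick lens in air.

    Uses f_b = f - f t P1 / n_l, Eq. (A.21), and V'P' = f_b - f.
    Pass R1 = math.inf for a flat first face.
    """
    P1 = (n_lens - 1.0) / R1          # power of the first surface (0 if flat)
    f_b = f - f * t * P1 / n_lens     # back focal length, Eq. (A.21)
    return f_b, f_b - f               # V'P' as in Eq. (A.22)

f_b, vp_prime = secondary_principal_plane(-50.0, -25.84, 3.5, 1.5168)
print(f"f_b = {f_b:.2f} mm, V'P' = {vp_prime:.2f} mm")
```

The magnitude obtained for $V'P'$ agrees with the value listed for the d line of the negative lens in Appendix D.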

Focal length of a set of thin lenses

Approximating a set of $N$ lenses by thin lenses (immersed in air), Eq. (A.12) is simplified as follows:

$$ \frac{1}{f} = \sum_{k=1}^{N} \frac{y_k}{y_1} \frac{1}{f_k}, $$

where $y_k$ is the height of the ray in the $k$th lens and $f_k$ is the focal length of the $k$th lens. For example, the focal length of two thin lenses at a distance $d$ will be given by

$$ \frac{1}{f} = \frac{1}{f_1} + \frac{y_2}{y_1} \frac{1}{f_2}. $$

Using Eq. (A.8) for $y_2$ and then Eq. (A.5) for $(n_1 u_1)$ leads to the known result

$$ \frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{1}{f_1} \frac{1}{f_2}\, d. $$
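The sum over thin lenses can be evaluated by tracing a single ray that enters parallel to the axis. The sketch below is an illustration only; the function name and the list-based interface are assumptions. It applies the thin-lens forms of Eqs. (A.5) and (A.8) in air ($n = 1$) and, for two lenses, reproduces the closed-form result above.

```python
def thin_lens_system_focal_length(focal_lengths, separations):
    """Focal length of N thin lenses in air.

    focal_lengths: [f_1, ..., f_N]; separations: [d_1, ..., d_{N-1}],
    the distance from lens k to lens k+1. Same length unit throughout.
    """
    y1 = 1.0
    y, u = y1, 0.0                    # ray parallel to the axis
    inv_f = 0.0
    for k, f_k in enumerate(focal_lengths):
        inv_f += (y / y1) / f_k       # term (y_k / y_1)(1 / f_k)
        u = u - y / f_k               # Eq. (A.5) with P_k = 1 / f_k, n = 1
        if k < len(separations):
            y = y + separations[k] * u    # Eq. (A.8) in air
    return 1.0 / inv_f

f1, f2, d = 100.0, 100.0, 50.0        # arbitrary example values
f_traced = thin_lens_system_focal_length([f1, f2], [d])
f_closed = 1.0 / (1.0 / f1 + 1.0 / f2 - d / (f1 * f2))
print(f"traced: {f_traced:.4f}, closed form: {f_closed:.4f}")
```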


References
[1] D. Malacara and Z. Malacara, Handbook of Optical Design, CRC Press, Boca Raton, Florida (2016).
[2] R. Ditteon, Modern Geometrical Optics, John Wiley & Sons, New York (1998).

Appendix B

Refractive Index

The refractive index in a medium measures the change in the speed of light in the medium with respect to the speed of light in a vacuum. Because the speed of light in a medium depends on the frequency $\nu$, the refractive index also depends on the frequency. Denoting by $n$ the refractive index, $v$ the speed of light in the medium, and $c$ the speed of light in a vacuum, the refractive index is given by the equation

$$ n(\nu) = \frac{c}{v(\nu)}, \tag{B.1} $$

which is also known as the dispersion relation. From the microscopic point of view, in a classical approximation, the speed change occurs due to the phase change of the electromagnetic wave reemitted by the induced electric dipoles that make up the medium with respect to the incident electromagnetic wave (of speed $c$).

Refractive index in dielectric materials

To obtain a first model of the refractive index in a dielectric medium, it is initially assumed that the medium in the presence of an external electric field is composed of $N$ induced dipoles of the type $\mathbf{p} = -e\mathbf{r}$ (electron attached to an atom) per unit volume. Because the dipole is surrounded by other dipoles, in the presence of a harmonic electromagnetic wave of amplitude $\mathbf{E}_0$ and frequency $\nu$, the dipole will oscillate similarly to a driven damped harmonic oscillator. Thus, if

$$ \mathbf{E} = \mathbf{E}_0 e^{-i\omega t} \tag{B.2} $$

represents the harmonic wave, with $\omega = 2\pi\nu$, the electric force on the charge $e$ will be

$$ \mathbf{F} = -e\mathbf{E}. \tag{B.3} $$

The displacement vector $\mathbf{r}$ of the charge $e$ satisfies

$$ m_e \frac{d^2 \mathbf{r}}{dt^2} + m_e \gamma \frac{d\mathbf{r}}{dt} + k\mathbf{r} = -e\mathbf{E}_0 e^{-i\omega t}, \tag{B.4} $$

where $m_e$ is the mass of the charge $e$, $\gamma$ is the damping constant (due to the presence of the other dipoles), and $k$ is the constant of the restoring force that binds the charge to the atom. In steady state, the response of the dipole will be

$$ \mathbf{r} = \mathbf{r}_0 e^{-i(\omega t + \delta)}; \tag{B.5} $$

i.e., it will oscillate with the same frequency as the incident wave, but with a phase offset $\delta$. By inserting Eq. (B.5) in Eq. (B.4),

$$ \mathbf{r}_0 = -\frac{e\mathbf{E}_0 / m_e}{(\omega_0^2 - \omega^2) - i\gamma\omega}\, e^{i\delta}, \tag{B.6} $$

with $\omega_0^2 = k/m_e$. With this result,

$$ \mathbf{r} = -\frac{(e/m_e)}{(\omega_0^2 - \omega^2) - i\gamma\omega}\, \mathbf{E}, \tag{B.7} $$

and the induced electric polarization vector $\mathbf{P} = -Ne\mathbf{r}$ is

$$ \mathbf{P} = \frac{Ne^2/m_e}{(\omega_0^2 - \omega^2) - i\gamma\omega}\, \mathbf{E}. \tag{B.8} $$

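Thus, the macroscopic effect of the electromagnetic wave incident on the medium is a polarization vector, which is proportional to $\mathbf{E}$. In general, the equation above can be written as

$$ \mathbf{P} = \epsilon_0 \chi \mathbf{E}, \tag{B.9} $$

where $\epsilon_0$ is the permittivity of the vacuum (a constant) and $\chi$ is the electrical susceptibility of the medium, which measures the degree of proportionality with $\mathbf{E}$ and is defined as $\chi = \epsilon/\epsilon_0 - 1$, where $\epsilon$ is the permittivity of the medium. On the other hand, in dielectric (nonmagnetic) media, the magnetization vector is $\mathbf{M} = 0$, the electric current density vector is $\mathbf{J} = 0$, and the free charge density is $\rho = 0$. Thus, Maxwell’s equations for the medium have the form [1]

$$ \nabla \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \tag{B.10} $$

$$ \nabla \times \mathbf{H} = \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} + \frac{\partial \mathbf{P}}{\partial t}, \tag{B.11} $$

$$ \nabla \cdot \mathbf{E} = -\frac{\nabla \cdot \mathbf{P}}{\epsilon_0}, \tag{B.12} $$

$$ \nabla \cdot \mathbf{H} = 0, \tag{B.13} $$

where $\mathbf{H}$ is the magnetic vector and $\mu_0$ is the vacuum permeability. The wave equation that results from taking $\nabla \times (\nabla \times \mathbf{E}) = -\mu_0\, \partial(\nabla \times \mathbf{H})/\partial t$, and also using Eqs. (B.11) and (B.9), is

$$ \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\mu_0 \epsilon_0 (1 + \chi) \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{B.14} $$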

Isotropic media

A medium is isotropic if its physical properties do not depend on direction. In particular, in optics, a medium is said to be isotropic if the refractive index does not depend on the direction of propagation of light. This implies that in Eq. (B.9) the electrical susceptibility is described by a scalar. Otherwise, in an anisotropic medium (in which the refractive index depends on the direction), the electrical susceptibility is described by a tensor ($3 \times 3$ matrix). Assuming that the dielectric medium is isotropic, then in Eq. (B.12),

$$ \nabla \cdot (\epsilon_0 \mathbf{E} + \chi \epsilon_0 \mathbf{E}) = \epsilon_0 (1 + \chi) \nabla \cdot \mathbf{E} = 0, \tag{B.15} $$

which implies that $\nabla \cdot \mathbf{E} = 0$, since $\chi \ge 0$. Therefore, the wave equation for the isotropic dielectric medium is given by

$$ \nabla^2 \mathbf{E} = \mu_0 \epsilon_0 (1 + \chi) \frac{\partial^2 \mathbf{E}}{\partial t^2}; \tag{B.16} $$

thus, the speed of light in the medium is determined from

$$ \frac{1}{v^2} = \mu_0 \epsilon_0 (1 + \chi). \tag{B.17} $$

Because $c^2 = 1/\mu_0 \epsilon_0$, the refractive index turns out to be

$$ n^2 = 1 + \chi. \tag{B.18} $$

And with the result obtained for the medium of the example given by Eq. (B.8), then

$$ n^2 = 1 + \frac{Ne^2}{m_e \epsilon_0} \frac{1}{(\omega_0^2 - \omega^2) - i\gamma\omega}. \tag{B.19} $$

Starting from the previous result, a generalization of the medium is achieved by assuming that instead of a single type of electric dipole there are $M$ types of dipoles [2]. If $N_j$ is the volume density of the $j$th type of dipole, then

$$ g_j = \frac{N_j}{N} \tag{B.20} $$

is the fraction of the $j$th type of dipole and $\sum_{j=1}^{M} g_j = 1$. With this in mind, the refractive index can then be written in a general form as

$$ n^2 = 1 + \frac{Ne^2}{m_e \epsilon_0} \sum_{j=1}^{M} \frac{g_j}{(\omega_{0j}^2 - \omega^2) - i\gamma_j \omega}, \tag{B.21} $$

where there are now several eigenfrequencies $\omega_{0j}$ and damping constants $\gamma_j$ corresponding to each type of dipole. According to Eq. (B.21), the refractive index is a complex quantity, i.e., $n = n_R + i n_I$. Taking the real and imaginary parts of the square of the refractive index,

$$ n_R^2 - n_I^2 = 1 + \frac{Ne^2}{m_e \epsilon_0} \sum_{j=1}^{M} \frac{(\omega_{0j}^2 - \omega^2)}{(\omega_{0j}^2 - \omega^2)^2 + \gamma_j^2 \omega^2}\, g_j \tag{B.22} $$

and

$$ 2 n_R n_I = \frac{Ne^2}{m_e \epsilon_0} \sum_{j=1}^{M} \frac{\gamma_j \omega}{(\omega_{0j}^2 - \omega^2)^2 + \gamma_j^2 \omega^2}\, g_j. \tag{B.23} $$

Although this model requires adjustments to be applied in real cases, it allows us to see the dependence of the refractive index on frequency. On the other hand, it shows that the refractive index has two components: a real one $n_R$, with which refraction can be explained, and an imaginary one $n_I$, which represents the absorption in the medium. The general behavior of $n_R$ and $n_I$ through Eqs. (B.22) and (B.23) is shown in Fig. B.1 for two resonance frequencies: $\omega_{01}$ and $\omega_{02}$. The damping constants $\gamma_j$ are usually small compared with the respective resonance frequencies $\omega_{0j}$. When $\omega \approx \omega_{0j}$, the imaginary part of the refractive index takes significant values that give rise to absorption bands. When $\omega$ moves away from the resonant frequencies, the imaginary part of the index of refraction is practically zero, and the real part of the index of refraction increases with frequency. This behavior is called normal dispersion. When the real part of the refractive index decreases with frequency, it is called anomalous dispersion (region where absorption occurs).
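The behavior described above can be explored numerically. The following Python sketch evaluates Eq. (B.21) and takes the complex square root; it is an illustration under stated assumptions, not a model of any real material. The prefactor $Ne^2/(m_e\epsilon_0)$ is lumped into a single hypothetical parameter `omega_p_sq`, all frequencies are in arbitrary (dimensionless) units, and the resonance values in the usage lines are invented for the example.

```python
import cmath

def complex_refractive_index(omega, omega_p_sq, resonances):
    """Complex n = n_R + i n_I from Eq. (B.21).

    omega_p_sq stands for N e^2 / (m_e eps_0) (an assumed shorthand);
    resonances is a list of (g_j, omega_0j, gamma_j) tuples.
    """
    n_sq = 1.0 + omega_p_sq * sum(
        g / ((w0 ** 2 - omega ** 2) - 1j * gamma * omega)
        for g, w0, gamma in resonances
    )
    return cmath.sqrt(n_sq)           # principal root, so n_R >= 0

# One resonance at omega_0 = 1 with damping 0.1 (invented values).
single = [(1.0, 1.0, 0.1)]
for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
    n = complex_refractive_index(omega, 1.0, single)
    print(f"omega = {omega:3.1f}  n_R = {n.real:.4f}  n_I = {n.imag:.4f}")
```

Scanning `omega` through the resonance shows $n_I$ peaking near $\omega_0$ while $n_R$ rises on either side of the absorption band and falls across it.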


Figure B.1 General behavior of the real and imaginary parts of the refractive index according to Eqs. (B.22) and (B.23).

The speed of light in the medium can also be given in terms of the permittivity $\epsilon$ and the permeability $\mu$ of the medium (which also depend on the frequency of the electromagnetic wave incident on the medium); i.e.,

$$ \frac{1}{v^2} = \mu\epsilon. \tag{B.24} $$

Therefore, the refractive index is

$$ n^2 = \frac{\mu\epsilon}{\mu_0 \epsilon_0}. \tag{B.25} $$

For nonmagnetic materials, the approximation $\mu \approx \mu_0$ can be made, and in this case the refractive index can be calculated as

$$ n = \sqrt{\frac{\epsilon}{\epsilon_0}}. \tag{B.26} $$


References
[1] G. R. Fowles, Introduction to Modern Optics, 2nd ed., Dover Publications, Mineola, New York (1989).
[2] E. Hecht, Optics, Global Edition, 5th ed., Pearson, Harlow, England (2017).

Appendix C

Optical Glasses

The refractive index varies with the speed of the electromagnetic wave in the medium, as shown in Appendix B. In turn, the speed depends on the frequency of the wave (or the wavelength in a vacuum, for $\lambda = c/\nu$, where $\nu$ is the wave frequency). As a function of wavelength, away from resonance frequencies, the refractive index decreases with increasing wavelength; near resonance frequencies, where absorption occurs, the refractive index increases. This can be understood by the dispersion relation obtained in Eq. (B.21). In practice, the dispersion relation for each medium is obtained empirically as a function of the wavelength. For example, the glass used for microscope slides, soda lime silica, can be characterized in the visible range by the relationship obtained by Rubin [1]:

$$ n = 1.5130 - 0.003169\lambda^2 + 0.003962\lambda^{-2}. \tag{C.1} $$

Another way to characterize optical glasses is by the refractive index at three wavelengths corresponding to two spectral lines of hydrogen and one spectral line of helium, as shown in Table C.1. With these spectral lines, the visible range is covered. These are used for analysis of chromatic aberrations. The indices of the corresponding colors are indicated by $n_F$, $n_d$, and $n_C$. On the other hand, a measure of chromatic dispersion is given by

$$ V = \frac{n_d - 1}{n_F - n_C}, \tag{C.2} $$

which is called the Abbe number. Usually, when the refractive index of a medium is said to be $n$, by default (unless otherwise stated) this value corresponds to the index $n_d$. Optical glasses are made from SiO2 and a combination of light metals (crown glass) or heavy metals (flint glass), with which it is possible to set the refractive indices and the desired Abbe number.
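Equations (C.1) and (C.2) can be combined to estimate the Abbe number of the soda lime silica glass directly from its dispersion formula. The sketch below is only illustrative: the function names are invented here, and it assumes that the wavelength in Eq. (C.1) is expressed in micrometers, which the text does not state explicitly. The F, d, and C wavelengths are those of Table C.1.

```python
def soda_lime_index(wavelength_um):
    """Refractive index from Eq. (C.1); wavelength assumed to be in micrometers."""
    lam_sq = wavelength_um ** 2
    return 1.5130 - 0.003169 * lam_sq + 0.003962 / lam_sq

def abbe_number(n_d, n_F, n_C):
    """Abbe number, Eq. (C.2)."""
    return (n_d - 1.0) / (n_F - n_C)

# Spectral lines of Table C.1, converted from nanometers to micrometers.
n_F = soda_lime_index(0.48613)
n_d = soda_lime_index(0.58756)
n_C = soda_lime_index(0.65627)
V = abbe_number(n_d, n_F, n_C)
print(f"n_F = {n_F:.4f}, n_d = {n_d:.4f}, n_C = {n_C:.4f}, V = {V:.1f}")
```

The ordering $n_F > n_d > n_C$ that results is the normal dispersion discussed in Appendix B.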

Table C.1 Spectral lines to characterize optical glasses.

Line   Wavelength (nm)   Element   Color
F      486.13            H         blue
d      587.56            He        yellow
C      656.27            H         red

An Abbe diagram plots the refractive index $n_d$ versus the Abbe number for a range of different optical glasses, as shown in Fig. C.1. Each glass is identified by a labeled point on the graph $n_d$ versus $V$. Thus, the glasses used in the manufacture of optical elements (lenses, prisms, mirrors, etc.) are identified with the name of the glass (and not with the refractive index). For example, consider an achromatic doublet manufactured by a commercial company. The catalog says that the first lens is made of BK7 glass and the second is made of SF5 glass. The difference in glasses is intended to correct axial chromatic aberration (for blue and red colors). It can be seen in Fig. C.1 that the index $n_d$ for BK7 glass is 1.52 ± 0.05 (1.5168) and its Abbe number is 64.0 ± 0.5 (64.17), and the index $n_d$ for SF5 glass is 1.68 ± 0.05 (1.6727) and its Abbe number is 32.0 ± 0.5 (32.21).

Figure C.1 Abbe diagram of optical glasses. Reprinted from [2] with permission granted by the GNU Free Documentation License.


References
[1] M. Rubin, “Optical properties of soda lime silica glasses,” Sol. Energy Mater. 12(4), 275–288 (1985).
[2] B. Mellish, “Abbe-diagram,” Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Abbe-diagram.svg, accessed November 7, 2022.

Appendix D

Chromatic Aberrations

Chapter 1 deals with imaging systems for a single wavelength. When the light coming from the object is polychromatic, the parameters that depend on the refractive index change according to the color of the light; e.g., the focal length of a lens, see Eq. (1.50), and the position of the principal planes, see Eqs. (1.51) and (1.52). Consequently, for each color there will be an image of a different size and in a different axial position. Let us consider the three wavelengths shown in Table D.1 with which optical glasses are characterized. Denoting the focal lengths for these colors as $f_F$, $f_d$, and $f_C$, the equations for locating the image in each case are

$$ \frac{1}{s'_F} - \frac{1}{s_F} = \frac{1}{f_F}, \tag{D.1} $$

$$ \frac{1}{s'_d} - \frac{1}{s_d} = \frac{1}{f_d}, \tag{D.2} $$

$$ \frac{1}{s'_C} - \frac{1}{s_C} = \frac{1}{f_C}. \tag{D.3} $$
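Equations (D.1)–(D.3) are the same Gauss equation evaluated with a color-dependent focal length, so they can be solved with one small function. The sketch below is an illustration: the function names are invented, the magnification is taken as $m_t = s'/s$ (the usual paraxial relation, assumed here), and the focal lengths are the BK7 values listed for the positive lens in Table D.2, with the object at $s = -100$ mm measured from the primary principal plane.

```python
def image_distance(s, f):
    """Solve 1/s' - 1/s = 1/f for s', as in Eqs. (D.1)-(D.3)."""
    return 1.0 / (1.0 / f + 1.0 / s)

def transverse_magnification(s, f):
    """m_t = s'/s (assumed paraxial relation, distances from principal planes)."""
    return image_distance(s, f) / s

focal_lengths_mm = {"F": 49.46, "d": 50.00, "C": 50.24}   # from Table D.2
s_object = -100.0                                          # mm, real object
for line, f in focal_lengths_mm.items():
    s_image = image_distance(s_object, f)
    m_t = transverse_magnification(s_object, f)
    print(f"{line}: s' = {s_image:7.2f} mm from P', m_t = {m_t:.4f}")
```

The image distances printed here are measured from the secondary principal plane; adding $V'P'$ refers them to the last vertex of the lens.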

In the case of a lens, the distance between the object and the first surface of the lens is the same in all three cases, i.e., $s_F - (VP)_F = s_d - (VP)_d = s_C - (VP)_C$. This also implies that, in general, the object distance $s$ is different depending on the color. For the image, the situation is a bit more complex, since in general $s'_F + (V'P')_F \neq s'_d + (V'P')_d \neq s'_C + (V'P')_C$.

Table D.1 Spectral lines to characterize optical glasses.

Line   Wavelength (nm)   Element   Color
F      486.13            H         blue
d      587.56            He        yellow
C      656.27            H         red

Figure D.1 Axial chromatic aberration in a positive lens.

The axial difference between the positions of the images corresponding to the spectral lines F and C,

$$ A_{chrL} = \left(s'_F + (V'P')_F\right) - \left(s'_C + (V'P')_C\right) = (s'_F - s'_C) + \left((V'P')_F - (V'P')_C\right), \tag{D.4} $$

is called the axial chromatic aberration. For example, Fig. D.1 shows the position of the images for the spectral lines F, d, and C for a lens of focal length $f_d = 50$ mm ($R_1 = 25.84$ mm, $R_2 = \infty$, thickness $t = 4.5$ mm; glass BK7) when the object is at $s_d = -100$ mm, which gives a magnification $m_t = -1$ for the spectral line d. In this example, the primary principal plane is at the vertex of the first surface; thus, $s_d = s_F = s_C$. The parameters for this example are summarized in Table D.2. All distances (except wavelength) are given in millimeters. From these data it follows that the axial chromatic aberration is $A_{chrL} = -3.05$ mm.

Table D.2 Parameters of a positive lens and the image according to the spectral lines to characterize optical lenses when the object is at 100 mm from the primary principal plane.

Line   λ (nm)    f       V'P'    s'      m_t
F      486.13    49.46   -3.21   94.67   -0.9788
d      587.56    50.00   -3.23   96.77   -1
C      656.27    50.24   -3.23   97.73   -1.0097

Suppose that the object is a point and the system is free of primary monochromatic aberrations. Then, for the spectral lines F, d, and C, there are three image points located at $s'_F = 94.67$ mm, $s'_d = 96.77$ mm, and $s'_C = 97.73$ mm, respectively. Where should the image plane be located? If it is set at $s'_F$, there will be a blue dot in focus, but the yellow and red images will be out of focus. Analogous situations occur if the image plane is located at $s'_d$ and $s'_C$. However, in the middle of the blue and red images there will be a circle of least confusion, and this will be the place where the best image can be seen. By focusing on $s'_F$, there will be a red dot larger than the yellow dot. On the other hand, focusing on $s'_C$ will result in a blue dot that is larger than the yellow dot. This means that the edge of the image of a polychromatic object obtained with a simple lens (like the example shown in Fig. D.1) looks red from a distance $s'_F$, whereas it looks blue from a distance $s'_C$. Magnification also depends on the wavelength; in an image plane there will be a superposition of images of different colors and sizes (one in focus and others out of focus). If $h'_F = m_{tF} h$ and $h'_C = m_{tC} h$ are the heights of the images, the difference

$$ A_{chrT} = h'_F - h'_C \tag{D.5} $$

is defined as the transverse chromatic aberration. With a negative lens, the order in which the images are placed for each color is opposite to that of the positive lens. This can be seen with an example similar to the positive lens. For a lens of focal length $f_d = -50$ mm ($R_1 = -25.84$ mm, $R_2 = \infty$, thickness $t = 3.5$ mm; BK7 glass) when the object is at $s_d = -100$ mm, the virtual image for blue is closer to the lens than the image for red (and the image for yellow in between; Fig. D.2). The values of the distances are shown in Table D.3. The chromatic aberration in this case is $A_{chrL} = 0.36$ mm.

Achromatic doublet

Joining two lenses results in a single lens called an achromatic doublet. This requires that the radius of curvature of the posterior face of the first lens be equal to the radius of curvature of the anterior face of the second lens, i.e., $R_{12} = R_{21}$ (Fig. D.3).

Figure D.2 Axial chromatic aberration in a negative lens.

Table D.3 Parameters of a negative lens and the virtual image according to the spectral lines to characterize optical lenses when the object is 100 mm from the primary principal plane.

Line   λ (nm)    f        V'P'    s'       m_t
F      486.13    -49.46   -2.30   -35.39   0.3309
d      587.56    -50.00   -2.31   -35.64   0.3333
C      656.27    -50.24   -2.31   -35.75   0.3344

Figure D.3 Achromatic doublet.

Suppose we want to obtain an achromatic doublet of focal length $f_d$. From the thin lens approximation, the focal length of the combination of two thin lenses [Eq. (1.53)] will be

$$ \frac{1}{f_d} = \frac{1}{f_{1d}} + \frac{1}{f_{2d}}, \tag{D.6} $$

since the separation between the lenses is zero. For each lens,

$$ \frac{1}{f_{1d}} = P_{1d} = (n_{1d} - 1)\left(\frac{1}{R_{11}} - \frac{1}{R_{12}}\right) = (n_{1d} - 1) k_1 \tag{D.7} $$

and

$$ \frac{1}{f_{2d}} = P_{2d} = (n_{2d} - 1)\left(\frac{1}{R_{21}} - \frac{1}{R_{22}}\right) = (n_{2d} - 1) k_2. \tag{D.8} $$

In Eqs. (D.7) and (D.8), the factors $k_1$ and $k_2$ have been introduced for the difference in curvatures. Thus, the power of the achromatic doublet in the approximation of thin lenses is

$$ P = (n_{1d} - 1) k_1 + (n_{2d} - 1) k_2. \tag{D.9} $$

The achromatic doublet implies that the power is constant as the wavelength changes in the neighborhood of $\lambda = 587.56$ nm, i.e.,

$$ \left(\frac{\partial P}{\partial \lambda}\right)_d = 0. \tag{D.10} $$

In other words,

$$ k_1 \frac{\partial n_1}{\partial \lambda} + k_2 \frac{\partial n_2}{\partial \lambda} = 0. \tag{D.11} $$

By approximating the variation of the refractive index with $n_F - n_C$ and the variation of the wavelength with $\lambda_F - \lambda_C$,

$$ k_1 \frac{n_{1F} - n_{1C}}{\lambda_F - \lambda_C} + k_2 \frac{n_{2F} - n_{2C}}{\lambda_F - \lambda_C} = 0. \tag{D.12} $$

By multiplying and dividing each addend by $(n_{1d} - 1)$ and $(n_{2d} - 1)$, respectively, to introduce the Abbe number,

$$ 0 = k_1 \frac{(n_{1F} - n_{1C})}{(n_{1d} - 1)} \frac{(n_{1d} - 1)}{(\lambda_F - \lambda_C)} + k_2 \frac{(n_{2F} - n_{2C})}{(n_{2d} - 1)} \frac{(n_{2d} - 1)}{(\lambda_F - \lambda_C)} = k_1 \frac{(n_{1d} - 1)}{(\lambda_F - \lambda_C) V_1} + k_2 \frac{(n_{2d} - 1)}{(\lambda_F - \lambda_C) V_2}. \tag{D.13} $$

By eliminating the common factor from the denominator and using the definition of power for each lens, i.e., Eqs. (D.7) and (D.8),

$$ \frac{P_{2d}}{V_2} = -\frac{P_{1d}}{V_1}. \tag{D.14} $$

This equation and Eq. (D.6) in the form of powers, $P_d = P_{1d} + P_{2d}$, allow us to write the power of each lens as a function of the Abbe numbers of each glass and the power of the achromatic doublet, as follows,

$$ P_{1d} = -P_d \frac{V_1}{V_2 - V_1} \tag{D.15} $$

and

$$ P_{2d} = P_d \frac{V_2}{V_2 - V_1}. \tag{D.16} $$

After this, the curvature factors are obtained:

$$ k_1 = \frac{P_{1d}}{n_{1d} - 1} \tag{D.17} $$

and

$$ k_2 = \frac{P_{2d}}{n_{2d} - 1}. \tag{D.18} $$

To finalize the design of the doublet, the radii of curvature of the lenses must be determined. A very common design proposes that the first lens be an equiconvex lens of crown glass [1]. Thus, the radii satisfy the following relations: $R_{11} = -R_{12}$, $R_{12} = R_{21}$, and

$$ R_{22} = \frac{R_{12}}{1 - k_2 R_{12}}. \tag{D.19} $$
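The design steps in Eqs. (D.15)–(D.19) can be chained into one small routine. The Python sketch below is an illustration under stated assumptions: the function name and the returned dictionary are invented here, the first lens is taken to be equiconvex so that $k_1 = 2/R_{11}$, and the power is given in diopters so the radii come out in meters before conversion to millimeters. The usage lines take the Abbe numbers and indices of the LAKN22 and SFL6 glasses used in the example of this appendix.

```python
def design_thin_achromat(P, V1, V2, n1d, n2d):
    """Thin-lens achromat with an equiconvex first element.

    P in diopters; returns lens powers, curvature factors (1/m) and radii (mm).
    """
    P1 = -P * V1 / (V2 - V1)              # Eq. (D.15)
    P2 = P * V2 / (V2 - V1)               # Eq. (D.16)
    k1 = P1 / (n1d - 1.0)                 # Eq. (D.17)
    k2 = P2 / (n2d - 1.0)                 # Eq. (D.18)
    R11 = 2.0 / k1                        # k1 = 1/R11 - 1/R12 with R12 = -R11
    R12 = -R11
    R22 = R12 / (1.0 - k2 * R12)          # Eq. (D.19), with R21 = R12
    to_mm = 1000.0
    return {"P1": P1, "P2": P2, "k1": k1, "k2": k2,
            "R11_mm": R11 * to_mm, "R12_mm": R12 * to_mm,
            "R21_mm": R12 * to_mm, "R22_mm": R22 * to_mm}

design = design_thin_achromat(P=20.0, V1=55.89, V2=25.39, n1d=1.6511, n2d=1.8051)
for key, value in design.items():
    print(f"{key:7s} = {value:9.3f}")
```

Because $V_1 > V_2$, the first (crown) element comes out positive and the second (flint) element negative, and their powers add up to the requested $P$.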

Example: achromatic doublet

Suppose an achromatic doublet of focal length $f_d = 50$ mm is desired. The power would be $P = 20$ D. Using the glasses LAKN22 for the first lens and SFL6 for the second lens (Appendix C) and Eqs. (D.15)–(D.18), together with the definition of the curvature factors, leads to the results for the radii of curvature shown in Table D.4. To have real lenses, we need to assign a thickness to each lens, e.g., 8 mm for the first lens and 4 mm for the second. The (real) achromatic doublet is summarized in Table D.5. With these parameters, the focal length for the spectral line d becomes $f_d = 50.85$ mm. To see the improvement in imaging, compare the image with this doublet and the image with the single lens of Fig. D.1 by placing an object at $s_d = -101.7$ mm from the doublet (such that the magnification is again $m_t = -1$ for the spectral line d). The parameters for the three spectral lines are shown in Table D.6. The axial chromatic aberration is $A_{chrL} = -0.19$ mm, i.e., only 6% of the aberration obtained with the simple lens. On the other hand, the magnification of each color varies very little, which makes transverse chromatic aberration negligible. Thus, with the achromatic doublet, a high-quality image is achieved.

Table D.4 Design parameters of an achromatic doublet in the thin lens approximation.

P (D)      20
V1         55.89
V2         25.39
n1d        1.6511
n2d        1.8051
k1 (D)     56.29
k2 (D)     -20.68
R11 (mm)   35.52
R22 (mm)   -133.97

Table D.5 Achromatic doublet. The units of radius and thickness are given in millimeters.

Surface   Radius     Thickness   Index
1         35.52      8           LAKN22
2         -35.52     4           SFL6
3         -133.97

Table D.6 Parameters of an achromatic doublet and the image along the spectral lines to characterize optical lenses when the object is at 101.7 mm from the primary principal plane for the spectral line d.

Line   λ (nm)    f       VP      V'P'    s'      m_t
F      486.13    50.83   -1.05   -6.13   95.54   -1.0002
d      587.56    50.85   -1.10   -6.13   95.57   -1
C      656.27    50.90   -1.12   -6.14   95.73   -1.0014

Although the original design calls for a focal length of 50 mm on the d line, the proposed actual doublet has a slightly higher value. With a small adjustment in the radii of curvature of the first lens, the desired value can be obtained (e.g., with $R_{11} = -R_{12} = 34.7$ mm, $f_d = 50.01$ mm is obtained). Achromatic doublets are very common in imaging systems because, in addition to producing good-quality images, they also reduce spherical aberration. Another possible solution to reduce chromatic aberration is to design two lenses of the same glass separated by the distance $(f_{1d} + f_{2d})/2$. This can be consulted in [2].
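The focal length quoted for the real doublet can be reproduced with the y-nu equations of Appendix A, Eqs. (A.5), (A.8), and (A.11). The sketch below is an illustration only; the function name and the surface tuple layout are assumptions, and the radii are taken with the signs $R_{12} = -35.52$ mm and $R_{22} = -133.97$ mm implied by $R_{11} = -R_{12}$ and Eq. (D.19). Only the d-line indices from Table D.4 are used, so the F and C rows of Table D.6 are not reproduced here.

```python
def focal_length_and_vp(surfaces, n0=1.0):
    """Trace a ray parallel to the axis; return (f, V'P').

    surfaces: list of (R, n_after, t_after) in millimeters; t_after of the
    last surface is ignored.
    """
    y1 = 1.0
    y, nu, n_prev = y1, 0.0, n0
    for k, (R, n_after, t_after) in enumerate(surfaces):
        nu -= (n_after - n_prev) / R * y          # Eq. (A.5)
        if k < len(surfaces) - 1:
            y += (t_after / n_after) * nu         # Eq. (A.8)
        n_prev = n_after
    u_last = nu / n_prev
    f = -y1 / u_last                              # Eq. (A.11)
    f_b = -y / u_last                             # back focal length
    return f, f_b - f

# Doublet of Table D.5 at the d line (LAKN22: 1.6511, SFL6: 1.8051).
doublet = [(35.52, 1.6511, 8.0), (-35.52, 1.8051, 4.0), (-133.97, 1.0, 0.0)]
f_d, vp_prime = focal_length_and_vp(doublet)

# Adjusted first lens, R11 = -R12 = 34.7 mm, as suggested in the text.
adjusted = [(34.7, 1.6511, 8.0), (-34.7, 1.8051, 4.0), (-133.97, 1.0, 0.0)]
f_adjusted, _ = focal_length_and_vp(adjusted)
print(f"f_d = {f_d:.2f} mm, V'P' = {vp_prime:.2f} mm, adjusted f_d = {f_adjusted:.2f} mm")
```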

References
[1] R. Ditteon, Modern Geometrical Optics, John Wiley & Sons, New York (1998).
[2] D. Malacara and Z. Malacara, Handbook of Optical Design, CRC Press, Boca Raton, Florida (2016).

Appendix E

Prisms

A prism is an optical element with flat faces of which at least two are mirror-polished and inclined toward each other, such that light can be reflected or transmitted by them. Prisms used to reflect light do so by total internal reflection on one or more of their faces and serve to change the orientation of images in an optical system or the direction of light propagation [1]. On the other hand, prisms that make use of refraction are used as elements to disperse light and to measure the change of the refractive index in a medium (prism) with the wavelength or spectral components of a light source.

Reflecting prisms

Figure E.1 shows images of reflecting prisms: (a) right-angle prism, (b) Porro prism, (c) pentaprism, and (d) Amici prism. In these prisms, a beam of light falls on one of the faces and the transmitted beam must undergo at least one total internal reflection. Right-angle and Porro prisms are geometrically the same; the difference is their orientations with respect to the direction of light.

Figure E.1 Reflecting prisms: (a) right-angle prism, (b) Porro prism, (c) pentaprism, and (d) Amici prism.

Assuming that light falls from left to right, the path followed by light in these prisms is illustrated in Fig. E.2. To see how the orientation changes in the image, a couple of symbols are included in the incident beam: an arrow pointing up and a circle to the left (out of the plane of the paper). Light is also assumed to propagate toward the observer.

Figure E.2 Direction that light follows when reflected internally in the (a) right-angle prism, (b) Porro prism, (c) pentaprism, and (d) Amici prism.

After the reflection on the diagonal face, the right-angle prism deflects the light 90°, inverts the arrow, and keeps the orientation of the circle, such that when looking at the arrow pointing up again (with a rotation of 180°), the circle faces right (reverses from left to right). The Porro prism reflects light parallel to the incident beam, rotates the arrow twice, and maintains the orientation of the circle, so that when looking at the arrow, the circle stays to the left (does not change orientation). The pentaprism deflects light 90° and does not change orientation. The Amici prism deflects the light 90°, reverses the arrow, and reflects the circle twice off the roof of the prism, changing its orientation in such a way that when looking again at the arrow pointing up, the circle is on the left (it does not change orientation).

A configuration that allows the direction of light propagation to be maintained with an inversion of the image (central symmetry) is achieved by combining two Porro prisms, as shown in Fig. E.3. This system of prisms is usually used in binoculars (terrestrial telescope) in which the image is seen from the front. The Amici prism is also often used in eyepieces to correct image inversion in spotting scopes. The pentaprism is often used in range finders and alignment instruments in topographic surveying. A variant of the pentaprism results from changing one of the flat reflecting faces to a roof-shaped face (similar to the Amici prism); this is usually used in reflex-type cameras. Another very useful prism is the Dove prism, the configuration of which is shown in Fig. E.4.
An incident ray parallel to the base is refracted at the first face, then reflected internally at the base (midpoint), and finally emerges parallel to the incident ray by reversing the arrow. This prism must be used with collimated light and has the following property: if the prism is rotated about the optical axis (incident beam) by a certain angle, the image is rotated by twice that angle. For example, in some slit lamps (ophthalmic instruments), the Dove prism is placed between the two lenses of a confocal system that projects a line of light onto the surface of the cornea. To observe corneal astigmatism, it is common to rotate the line by an angle of 90°, which is done by rotating the cylindrical support that contains the Dove prism by an angle of 45°. This is a convenient way for an instrument operator to rotate the line of light.

Note: in the descriptions of the reflecting prisms, orientation refers to the relative orientation between the arrow and the circle.

Figure E.3 A combination of two Porro prisms to invert the image (central symmetry with respect to a point on the optical axis) and maintain the direction of propagation.

Figure E.4 Dove prism.

Beamsplitter cubes

Another widely used prism configuration is the beamsplitter cube formed by two right-angle prisms joined by their diagonal faces to divide an incident beam into transmitted and reflected beams [Fig. E.5(a)]. One of the diagonal faces is covered with a reflective film (semi-mirror), and the diagonals are joined using an optical adhesive. The thickness of the reflective film determines the percentages of light reflected and transmitted on the diagonal. The refractive index of the optical adhesive must be close to that of the prisms to avoid unwanted reflections. One of the most common cubes transmits 50% and reflects 50% of the light. The cubes are also designed to separate an unpolarized beam into a transmitted beam with polarization p and a reflected beam with polarization s, as shown in Fig. E.5(b). The diagonal of one of the prisms is covered with a dielectric film or metal–dielectric mixture. When the incident beam reaches the film, the beam with polarization parallel to the plane of incidence (p) is transmitted, while the beam with polarization orthogonal to the plane of incidence (s) is reflected.

Figure E.5 (a) Nonpolarizing and (b) polarizing beamsplitter cubes.

Refracting prisms

Prisms can also be used to analyze the spectral components (in wavelength or frequency) of a light source. Because the refractive index is a function of wavelength, a polychromatic light beam refracts into multiple beams depending on their wavelengths. This physical separation of the refracted rays allows the light spectrum to be measured. To determine the deviation of the refracted ray as a function of the refractive index, let us consider the refraction of a ray at two faces of a prism that form an angle $\alpha$, as shown in Fig. E.6. Suppose that the prism is immersed in a medium of index $n_1$ and that the index of refraction of the prism for the beam considered is $n_2$. Snell’s law on both sides would be

$$ n_1 \sin\theta_{i1} = n_2 \sin\theta_{t1} \tag{E.1} $$

and

$$ n_2 \sin\theta_{i2} = n_1 \sin\theta_{t2}. \tag{E.2} $$

Figure E.6 Refraction of a beam at two faces of a prism.

From the geometry of Fig. E.6, $\alpha = \theta_{t1} + \theta_{i2}$. Changing $\theta_{i2}$ to $\alpha - \theta_{t1}$ in Eq. (E.2),

\[ \sin\theta_{t2} = \frac{n_2}{n_1}\left(\sin\alpha\cos\theta_{t1} - \cos\alpha\sin\theta_{t1}\right), \tag{E.3} \]

which, by Eq. (E.1), is equal to

\[ \sin\theta_{t2} = \sin\alpha\sqrt{\left(\frac{n_2}{n_1}\right)^2 - \sin^2\theta_{i1}} - \cos\alpha\sin\theta_{i1}. \tag{E.4} \]

If $n_1 = 1$ and $n_2 = n$, the index of refraction of the prism for a ray of a certain wavelength is given by

\[ n^2 = \sin^2\theta_{i1} + \frac{\left(\sin\theta_{t2} + \sin\theta_{i1}\cos\alpha\right)^2}{\sin^2\alpha}. \tag{E.5} \]
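As a numerical check of Eqs. (E.1)–(E.5), the following sketch traces a ray through the two faces with Snell's law and compares the exit angle with the closed form of Eq. (E.4). It then recovers the prism index from Eq. (E.5). The function names, the 60° apex angle, the 50° incidence angle, and the index 1.5 are illustrative assumptions and are not taken from the text.

```python
import math

def exit_angle(theta_i1, alpha, n1=1.0, n2=1.5):
    """Exit angle from two applications of Snell's law, Eqs. (E.1)-(E.2)."""
    theta_t1 = math.asin(n1 * math.sin(theta_i1) / n2)  # Eq. (E.1)
    theta_i2 = alpha - theta_t1                          # geometry of Fig. E.6
    return math.asin(n2 * math.sin(theta_i2) / n1)       # Eq. (E.2)

def exit_angle_closed_form(theta_i1, alpha, n1=1.0, n2=1.5):
    """Exit angle from the closed-form expression, Eq. (E.4)."""
    root = math.sqrt((n2 / n1) ** 2 - math.sin(theta_i1) ** 2)
    return math.asin(math.sin(alpha) * root - math.cos(alpha) * math.sin(theta_i1))

def index_from_angles(theta_i1, theta_t2, alpha):
    """Prism index for a prism in air (n1 = 1), Eq. (E.5)."""
    s = math.sin(theta_i1)
    n_squared = s ** 2 + ((math.sin(theta_t2) + s * math.cos(alpha)) / math.sin(alpha)) ** 2
    return math.sqrt(n_squared)

# Assumed example values (not from the text)
alpha = math.radians(60.0)     # angle between the two refracting faces
theta_i1 = math.radians(50.0)  # incidence angle on the first face
n_prism = 1.5                  # prism index, with n1 = 1 outside

theta_t2 = exit_angle(theta_i1, alpha, n2=n_prism)
theta_t2_closed = exit_angle_closed_form(theta_i1, alpha, n2=n_prism)
n_recovered = index_from_angles(theta_i1, theta_t2, alpha)

print(f"exit angle, step by step : {math.degrees(theta_t2):.4f} deg")
print(f"exit angle, Eq. (E.4)    : {math.degrees(theta_t2_closed):.4f} deg")
print(f"index recovered, Eq. (E.5): {n_recovered:.6f}")
```

The two exit angles should agree to rounding error, and the recovered index should reproduce the value used in the ray trace. The sketch assumes that the ray is not totally internally reflected at the second face.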

Using a prism whose refractive index is known as a function of wavelength, we can measure the light spectrum of a source by observing the position of the refracted rays on a length scale. One configuration used in prism spectrometers is based on the minimum value of the angle $\delta = \theta_{i1} + \theta_{t2} - \alpha$, which measures the deviation of the refracted ray leaving the prism with respect to the ray that is incident on the prism. The minimum of $\delta$ occurs when $\theta_{t2} = \theta_{i1} = \theta_{\min}$ [2]. In this case,

\[ n = \frac{\sin\theta_{\min}\sqrt{2(1 + \cos\alpha)}}{\sin\alpha}, \tag{E.6} \]

and because $\theta_{\min} = (\delta_{\min} + \alpha)/2$, then

\[ n = \frac{\sin[(\delta_{\min} + \alpha)/2]}{\sin(\alpha/2)}. \tag{E.7} \]
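The minimum deviation condition can be illustrated numerically. The sketch below scans the incidence angle for a prism in air, locates the minimum of the deviation, and then recovers the index with Eq. (E.7). The 60° apex angle, the index 1.5, the scan range, and the function names are assumptions chosen for illustration.

```python
import math

def deviation(theta_i1, alpha, n):
    """Deviation delta = theta_i1 + theta_t2 - alpha for a prism in air."""
    theta_t1 = math.asin(math.sin(theta_i1) / n)          # Eq. (E.1) with n1 = 1
    theta_t2 = math.asin(n * math.sin(alpha - theta_t1))  # Eq. (E.2) with n1 = 1
    return theta_i1 + theta_t2 - alpha

def index_from_min_deviation(delta_min, alpha):
    """Prism index from the minimum deviation angle, Eq. (E.7)."""
    return math.sin((delta_min + alpha) / 2) / math.sin(alpha / 2)

# Assumed example values (not from the text)
alpha = math.radians(60.0)
n_true = 1.5

# Scan incidence angles from 35 deg to 75 deg in steps of 0.01 deg.
# This range avoids total internal reflection at the second face for these values.
angles = [math.radians(35.0 + 0.01 * k) for k in range(4001)]
theta_best = min(angles, key=lambda t: deviation(t, alpha, n_true))
delta_min = deviation(theta_best, alpha, n_true)

# At the minimum, the path is symmetric: sin(theta_min) = n sin(alpha / 2)
theta_symmetric = math.asin(n_true * math.sin(alpha / 2))
n_estimated = index_from_min_deviation(delta_min, alpha)

print(f"incidence angle at minimum (scan)     : {math.degrees(theta_best):.2f} deg")
print(f"incidence angle at minimum (symmetric): {math.degrees(theta_symmetric):.2f} deg")
print(f"minimum deviation                     : {math.degrees(delta_min):.3f} deg")
print(f"index from Eq. (E.7)                  : {n_estimated:.6f}")
```

Because the deviation is stationary at its minimum, a small error in locating the minimum has little effect on the index obtained from Eq. (E.7), which is one reason this configuration is convenient for measurements.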

In particular, if the prism is constructed as an isosceles triangle, where $\alpha$ is the angle between the two equal sides, the ray refracted inside the prism at minimum deviation will be parallel to the other side of the prism (base of the prism), as shown in Fig. E.7.

Another prism used in spectroscopy is the Pellin–Broca prism. The operating principle of this prism can be understood as a variant of an equilateral prism ($\beta = \alpha = 60^\circ$ in Fig. E.7). Suppose that in the equilateral prism of Fig. E.8(a) a ray of a given wavelength is incident at the condition of least deviation (the ray refracted in the prism is parallel to the base). If the equilateral prism is separated into two prisms of 30° – 60° – 90°, as in Fig. E.8(b), the refracted ray still satisfies the

Figure E.7 An isosceles prism in which the minimum deviation of the ray occurs when the ray refracted within the prism propagates parallel to the base of the prism.

Figure E.8 Pellin–Broca prism principle. (a) Equilateral prism and a ray in the condition of minimum deviation. (b) Separation of the equilateral prism into two prisms of 30° – 60° – 90° maintaining the condition of minimum deviation. (c) Change of the path of the ray by a mirror at 45° maintaining the condition of minimum deviation in the prisms. The end result is that the refracted ray makes an angle of 90° with the incident ray.

minimum deviation condition. If additionally a mirror is placed at 45° and the second prism is moved, as shown in Fig. E.8(c), the minimum deviation condition in the second prism still holds. The end result is that, whenever the minimum deviation condition is met, the beam emerges at an angle of 90° with respect to the incident ray, regardless of wavelength. If a 45° – 45° – 90° prism is used instead of the mirror, a prism like the one shown in Fig. E.9 can be created. This is the Pellin–Broca prism [3]. Thus, for a polychromatic incident ray, the ray refracted at 90° corresponds to a given wavelength. Rays of other wavelengths do not propagate under the minimum deviation condition and emerge from the prism at an angle other than 90°. However, with a small tilt of the prism around an axis orthogonal to the plane of the paper, the minimum deviation condition can be adjusted for another wavelength. In particular, if the axis of rotation passes through the point where the bisector of the angle ∠BAD intersects the side BC, i.e., point O in Fig. E.9, the ray refracted at 90° with respect to the incident ray maintains its lateral position. This situation is illustrated in Fig. E.10 for two rays of different wavelengths in a prism whose indices of refraction for the two wavelengths are

Figure E.9 Pellin–Broca prism.

Figure E.10 Illustration of the advantage of rotating the prism at point O. Regardless of the wavelength, the ray refracted at 90° maintains its lateral position.

1.5628 and 1.6361. In Fig. E.10, there is an overlay of the prism with the two orientations (with the axis of rotation at O) that satisfy the minimum deviation condition for each wavelength. The orientation corresponding to the refractive index 1.5628 is shown in gray, and the orientation corresponding to


the refractive index 1.6361 is shown in black. The tilt angle between the two orientations is 3.5°. The two refracted rays keep the same lateral position because the prism rotates about point O. If the prism rotates about any other point, the two rays emerge with a lateral offset from each other. Thus, the pivot point at O offers an advantage when designing a spectrometer.
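The quoted tilt angle can be checked with a short calculation. The sketch below assumes that the entrance face of the Pellin–Broca prism behaves like a face of the equivalent equilateral prism of Fig. E.8(a), so that at minimum deviation the internal refraction angle is 30° and the incidence angle satisfies sin θ = n sin 30°. It also assumes that the direction of the incident ray is fixed, so that the tilt of the prism equals the change in incidence angle. The function name is hypothetical.

```python
import math

def incidence_at_min_deviation(n, half_apex_deg=30.0):
    """Incidence angle (degrees) at minimum deviation: sin(theta) = n sin(alpha / 2)."""
    return math.degrees(math.asin(n * math.sin(math.radians(half_apex_deg))))

# Refractive indices quoted in the text for the two wavelengths
n_gray, n_black = 1.5628, 1.6361

theta_gray = incidence_at_min_deviation(n_gray)
theta_black = incidence_at_min_deviation(n_black)
tilt = theta_black - theta_gray  # rotation of the prism between the two orientations

print(f"incidence angle for n = {n_gray}: {theta_gray:.2f} deg")
print(f"incidence angle for n = {n_black}: {theta_black:.2f} deg")
print(f"tilt between orientations: {tilt:.2f} deg")
```

Under these assumptions the tilt comes out close to the 3.5° quoted for Fig. E.10.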

References

[1] Naval Education and Training Program Development Center, Basic Optics and Optical Instruments, Dover Publications, Mineola, New York (1997).
[2] E. Hecht, Optics, Global Edition, 5th ed., Pearson, Harlow, England (2017).
[3] P. Pellin and A. Broca, “A spectroscope of fixed deviation,” Astrophys. J. 10, 337 (1899).

Appendix F

Polarization Ellipse

To determine the general polarization state, let us consider the real part of the amplitudes of the electric vector given in Eq. (2.31), i.e.,

\[ E_x = |E_{ox}|\cos\delta_x, \tag{F.1} \]

\[ E_y = |E_{oy}|\cos\delta_y. \tag{F.2} \]

Dividing each equation by its amplitude and multiplying Eq. (F.1) by $\sin\delta_y$ and Eq. (F.2) by $\sin\delta_x$ gives

\[ \frac{E_x}{|E_{ox}|}\sin\delta_y = \cos\delta_x\sin\delta_y, \tag{F.3} \]

\[ \frac{E_y}{|E_{oy}|}\sin\delta_x = \cos\delta_y\sin\delta_x, \tag{F.4} \]

and then subtracting Eq. (F.4) from Eq. (F.3) leads to

\[ \frac{E_x}{|E_{ox}|}\sin\delta_y - \frac{E_y}{|E_{oy}|}\sin\delta_x = \sin(\delta_y - \delta_x). \tag{F.5} \]

By an analogous procedure, multiplying Eq. (F.1) by $\cos\delta_y$ and Eq. (F.2) by $\cos\delta_x$, and then subtracting, leads to

\[ \frac{E_x}{|E_{ox}|}\cos\delta_y - \frac{E_y}{|E_{oy}|}\cos\delta_x = 0. \tag{F.6} \]

Finally, squaring Eqs. (F.5) and (F.6), and then adding them, leads to

\[ \frac{E_x^2}{|E_{ox}|^2} + \frac{E_y^2}{|E_{oy}|^2} - 2\frac{E_x E_y}{|E_{ox}||E_{oy}|}\cos(\Delta\delta) = \sin^2(\Delta\delta), \tag{F.7} \]


with $\Delta\delta = \delta_y - \delta_x$. This equation represents a rotated conic. By rotating the axes, the cross term in $E_x E_y$ is eliminated. The angle of rotation $\psi$ is given by

\[ \tan 2\psi = -\frac{2\cos(\Delta\delta)/(|E_{ox}||E_{oy}|)}{1/|E_{ox}|^2 - 1/|E_{oy}|^2}. \tag{F.8} \]

By defining

\[ \tan\alpha = \frac{|E_{oy}|}{|E_{ox}|}, \tag{F.9} \]

then

\[ \tan 2\psi = \tan 2\alpha\cos\Delta\delta. \tag{F.10} \]

The sign of the discriminant of Eq. (F.7),

\[ \frac{\cos^2(\Delta\delta)}{(|E_{ox}||E_{oy}|)^2} - \frac{1}{(|E_{ox}||E_{oy}|)^2} = -\frac{\sin^2(\Delta\delta)}{(|E_{ox}||E_{oy}|)^2} < 0, \tag{F.11} \]

determines the type of conic. Because the result is less than zero, Eq. (F.7) is an ellipse rotated by the angle $\psi$ in the Cartesian system $xy$.
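A short numerical check can make the result concrete. The sketch below samples the field components of Eqs. (F.1) and (F.2) for a fixed phase difference, verifies that every sample satisfies Eq. (F.7), and confirms that the angle from Eq. (F.10) removes the cross term of the quadratic form. The amplitudes 2 and 1, the phase difference of 60°, and the variable names are assumptions chosen for illustration.

```python
import math

# Assumed example values (not from the text)
E_ox, E_oy = 2.0, 1.0             # amplitudes |E_ox| and |E_oy|
delta_phase = math.radians(60.0)  # phase difference delta_y - delta_x

def ellipse_lhs(E_x, E_y):
    """Left-hand side of Eq. (F.7)."""
    return ((E_x / E_ox) ** 2 + (E_y / E_oy) ** 2
            - 2 * E_x * E_y * math.cos(delta_phase) / (E_ox * E_oy))

# Sample the field over one period, keeping the phase difference fixed
residuals = []
for k in range(12):
    d_x = 2 * math.pi * k / 12
    d_y = d_x + delta_phase
    E_x = E_ox * math.cos(d_x)   # Eq. (F.1)
    E_y = E_oy * math.cos(d_y)   # Eq. (F.2)
    residuals.append(ellipse_lhs(E_x, E_y) - math.sin(delta_phase) ** 2)
max_residual = max(abs(r) for r in residuals)

# Rotation angle from Eqs. (F.9) and (F.10)
alpha = math.atan2(E_oy, E_ox)
psi = 0.5 * math.atan(math.tan(2 * alpha) * math.cos(delta_phase))

# Coefficients of Eq. (F.7) written as A x^2 + 2 B x y + C y^2
A = 1 / E_ox ** 2
C = 1 / E_oy ** 2
B = -math.cos(delta_phase) / (E_ox * E_oy)

# Cross-term coefficient after rotating the axes by psi; it should vanish
cross_term = B * math.cos(2 * psi) - 0.5 * (A - C) * math.sin(2 * psi)

print(f"largest deviation from Eq. (F.7): {max_residual:.2e}")
print(f"rotation angle psi: {math.degrees(psi):.3f} deg")
print(f"cross term after rotation: {cross_term:.2e}")
```

The example avoids equal amplitudes, for which tan 2α diverges and the arctangent in Eq. (F.10) must be handled as a limiting case.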



The quadratic form $Ax^2 + 2Bxy + Cy^2 = D$ represents a rotated conic in the Cartesian system $xy$. If $B^2 - AC < 0$, it is an ellipse; if $B^2 - AC = 0$, it is a parabola; if $B^2 - AC > 0$, it is a hyperbola. The unrotated version of the conic in a new Cartesian system $x'y'$ is achieved by a coordinate transformation corresponding to an eigenvalue problem of a symmetric $2 \times 2$ matrix whose elements are the coefficients $A$, $B$, and $C$. In the rotated coordinate system, the new coefficients are given by $A' = \frac{1}{2}\left[(A + C) + \sqrt{(A - C)^2 + 4B^2}\right]$ and $C' = \frac{1}{2}\left[(A + C) - \sqrt{(A - C)^2 + 4B^2}\right]$; the angle of rotation can be determined from $\psi = \frac{1}{2}\arctan\left(\frac{2B}{A - C}\right)$.
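The classification rule and the rotated coefficients of this note can be sketched in a few lines. The function names and the tolerance are hypothetical, and the example coefficients correspond to assumed amplitudes of 2 and 1 with a phase difference of 60° in Eq. (F.7). The check relies on the fact that the trace and the determinant of the symmetric matrix with elements A, B, and C do not change under rotation.

```python
import math

def classify_conic(A, B, C, tol=1e-12):
    """Classify A x^2 + 2 B x y + C y^2 = D by the sign of B^2 - A C."""
    discriminant = B * B - A * C
    if discriminant < -tol:
        return "ellipse"
    if discriminant > tol:
        return "hyperbola"
    return "parabola"

def rotated_coefficients(A, B, C):
    """Coefficients A' and C' in the rotated system (eigenvalues of [[A, B], [B, C]])."""
    root = math.sqrt((A - C) ** 2 + 4 * B ** 2)
    return 0.5 * ((A + C) + root), 0.5 * ((A + C) - root)

def rotation_angle(A, B, C):
    """Rotation angle psi = (1/2) arctan(2B / (A - C)), in radians."""
    if A == C:
        return math.pi / 4  # the arctangent argument diverges; 45 deg removes the cross term
    return 0.5 * math.atan(2 * B / (A - C))

# Assumed example: Eq. (F.7) with |E_ox| = 2, |E_oy| = 1, phase difference 60 deg
A, B, C = 0.25, -0.25, 1.0

kind = classify_conic(A, B, C)
A_rot, C_rot = rotated_coefficients(A, B, C)
psi = rotation_angle(A, B, C)

print(f"conic type: {kind}")
print(f"A' = {A_rot:.5f}, C' = {C_rot:.5f}")
print(f"trace check: {A_rot + C_rot:.5f} vs {A + C:.5f}")
print(f"determinant check: {A_rot * C_rot:.5f} vs {A * C - B * B:.5f}")
print(f"rotation angle: {math.degrees(psi):.3f} deg")
```

Which of the two values multiplies the square of the first rotated coordinate depends on the branch chosen for the arctangent, so the sketch reports the pair without assigning them to specific axes.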


Yobani Mejía-Barbosa is a Professor in the Department of Physics at the Universidad Nacional de Colombia, where he has taught courses in optics for more than 12 years. He received his B.S. and M.S. degrees in Physics from the Universidad Nacional de Colombia in 1991 and 1995, respectively, and his Ph.D. in Optics from the Centro de Investigaciones en Óptica, Mexico, in 2001. His current research interests include optical design, interferometry, visual optics, and classical coherence. He is a senior member of Optica. Herminso Villarraga-Gómez obtained his B.S. in Physics in 2007 from the Universidad Nacional de Colombia, under the guidance of Prof. Mejía-Barbosa. He also holds an M.S. in Physics (University of Puerto Rico, 2010), an M.S. in Optics (University of Central Florida, 2012), and a Ph.D. in Optical Science and Engineering (University of North Carolina, 2018). Herminso has worked for world-renowned optical companies, including Nikon (2015–2019) and most recently ZEISS (since 2019). He has been a member of SPIE and Optica since 2012.


This book presents a simple yet elegant introduction to classical optics focused primarily on establishing fundamental concepts for students new to the field. With examples demonstrating the use of optics in a wide range of practical applications, it reflects the pedagogical approach used by Prof. Mejía-Barbosa to teach his Fundamentals of Optics course at the Universidad Nacional de Colombia. This book will prove useful for undergraduate and graduate students of physics, optical science and engineering, and any other related science or engineering discipline that deals with optics at some level. Readers are invited to study the fundamental principles of optics and find pleasure in learning about this fascinating and vibrant field.

P.O. Box 10 Bellingham, WA 98227-0010 ISBN: 9781510657809 SPIE Vol. No.: PM359