Field Guide to Photographic Science


SPIE Terms of Use: This SPIE eBook is DRM-free for your convenience. You may install this eBook on any device you own, but not post it publicly or transmit it to others. SPIE eBooks are for personal use only. For details, see the SPIE Terms of Use. To order a print version, visit SPIE.

Library of Congress Cataloging-in-Publication Data Names: Rowlands, D. Andrew, author. Title: Field guide to photographic science / D. Andrew Rowlands. Description: Bellingham, Washington, USA : SPIE Press, [2020] | Includes bibliographical references and index. Identifiers: LCCN 2019049944 | ISBN 9781510631151 (spiral bound) | ISBN 9781510631168 (pdf) Subjects: LCSH: Photography–Handbooks, manuals, etc. | Image processing–Digital techniques–Handbooks, manuals, etc. Classification: LCC TR146.R88 2020 | DDC 770–dc23 LC record available at https://lccn.loc.gov/2019049944

Published by SPIE P.O. Box 10 Bellingham, Washington 98227-0010 USA Phone: 360.676.3290 Fax: 360.647.1445 Email: [email protected] Web: www.spie.org Copyright © 2020 Society of Photo-Optical Instrumentation Engineers (SPIE) All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means without written permission of the publisher. The content of this book reflects the thought of the author. Every effort has been made to publish reliable and accurate information herein, but the publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon. Printed in the United States of America. Last updated 27 March 2020. For updates to this book, visit http://spie.org and type “FG46” in the search field.

Introduction to the Series In 2004, SPIE launched a new book series under the editorship of Prof. John Greivenkamp, the SPIE Field Guides, focused on SPIE’s core areas of Optics and Photonics. The idea of these Field Guides is to give concise presentations of the key subtopics of a subject area or discipline, typically covering each subtopic on a single page, using the figures, equations, and brief explanations that summarize the key concepts. The aim is to give readers a handy desk or portable reference that provides basic, essential information about principles, techniques, or phenomena, including definitions and descriptions, key equations, illustrations, application examples, design considerations, and additional resources. The series has grown to an extensive collection that covers a range of topics from broad fundamental ones to more specialized areas. Community response to the SPIE Field Guides has been exceptional. The concise and easy-to-use format has made these small-format, spiral-bound books essential references for students and researchers. Many readers tell us that they take their favorite one with them wherever they go. The popularity of the series led to its expansion into areas of general physics in 2019, with the launch of Field Guides to General Physics. The core series continues as the SPIE Field Guides to Optical Sciences and Technologies. The concept of these books is a format-intensive presentation based on figures and equations supplemented by concise explanations. In most cases, this modular approach places a single topic on a page, and provides full coverage of that topic on that page. Highlights, insights, and rules of thumb are displayed in sidebars to the main text. The appendices at the end of each Field Guide provide additional information such as related material outside the main scope of the volume, key mathematical relationships, and alternative methods. While complete in their coverage, the concise presentation may be best as a supplement to traditional texts for those new to the field.


Introduction to the Series The SPIE Field Guides are intended to be living documents. The modular page-based presentation format allows them to be updated and expanded. In the future, we will look to expand the use of interactive electronic resources to supplement the printed material. We are interested in your suggestions for new topics as well as what material should be added to an individual volume to make these books more useful to you. Please contact us at [email protected]. J. Scott Tyo, Series Editor The University of New South Wales Canberra, Australia

Related Titles from SPIE Press

Keep information at your fingertips with these other SPIE Field Guides:
• Colorimetry and Fundamental Color Modeling, Jennifer D. Kruschwitz (Vol. FG42)
• Geometrical Optics, John E. Greivenkamp (Vol. FG01)
• Image Processing, Khan M. Iftekharuddin and Abdul Awwal (Vol. FG25)
• Lens Design, Julie Bentley and Craig L. Olson (Vol. FG27)
• Linear Systems in Optics, J. Scott Tyo and Andrey Alenin (Vol. FG35)

Other related titles:
• Camera Lenses: From Box Camera to Digital, Gregory H. Smith (Vol. PM158)
• Formation of a Digital Image: The Imaging Chain Simplified, Robert D. Fiete (Vol. PM218)
• Modeling the Imaging Chain of Digital Cameras, Robert D. Fiete (Vol. TT92)
• Optical Design for Visual Systems, Bruce H. Walker (Vol. TT45)
• Optics for Technicians, Max J. Riedl (Vol. PM258)


Table of Contents

Preface  xiii
Glossary of Symbols and Notation  xv

Fundamental Optics  1
  Refraction  1
  Paraxial Imaging  2
  Gaussian Optics  3
  Compound Lenses  4
  Gaussian Equation  5
  Focal Length  6
  Thick and Thin Lenses  7
  Magnification  8
  Entrance and Exit Pupils  9
  Pupil Magnification  10
  Aberrations  11

Focusing  12
  Unit Focusing  12
  Internal Focusing  13
  Single Lens Reflex Cameras  14
  Autofocus  15
  Focusing: Practice  16

Framing and Perspective  17
  Angular Field of View  17
  Focus Breathing  18
  Equivalent Focal Length  19
  Perspective  20
  Camera Movements  21

Depth of Field  22
  Circle of Confusion  22
  Depth of Field: Formulae  23
  Depth of Field: Practice  24
  Depth of Field: Examples  25
  Lens Bokeh: Theory  26
  Lens Bokeh: Example  27

Exposure  28
  Photometry  28
  Relative Aperture  29
  f-Number  30
  Working f-Number  31
  Natural Vignetting  32
  Photometric Exposure  33
  Exposure Control  34
  Mechanical Shutters  35
  Electronic Shutters  36

Raw Data  37
  Imaging Sensors  37
  Color Filter Arrays  38
  Radiometry  39
  Camera Response Functions  40
  Charge Readout  41
  Programmable ISO Gain  42
  Analog-to-Digital Conversion  43
  Linearity  44

Noise and Dynamic Range  45
  Dynamic Range  45
  Temporal Noise  46
  Fixed Pattern Noise  47
  Noise Units  48
  Read Noise: Measurement  49
  Shadow Improvement  50
  Raw Dynamic Range  51

Color  52
  Color Theory  52
  Eye Cone Response Functions  53
  CIE RGB Color Space  54
  CIE XYZ Color Space  55
  Chromaticity Diagrams  56
  Color Temperature  57
  White Point and Reference White  58
  Camera Characterization  59
  Output-Referred Color Spaces  60
  Raw to sRGB  61
  White Balance  62
  White Balance: Matrix Algebra  63
  White Balance: Practice  64

Digital Images  65
  Raw Conversion  65
  Digital Output Levels  66
  Gamma  67
  Tone Curves  68
  Histograms  69
  Image Dynamic Range  70
  Image Display Resolution  71
  Color Management: Raw Conversion  72
  Color Management: Optimal Workflow  73

Standard Exposure Strategy  74
  Average Photometry  74
  Reflected-Light Meter Calibration  75
  Camera Sensitivity  76
  Standard Output Sensitivity  77
  Exposure Value  78

Practical Exposure Strategy  79
  Aperture Priority Mode  79
  Shutter Priority and Manual Mode  80
  Exposure Compensation  81
  Advanced Metering  82
  Clipping  83
  Expose to the Right  84
  High Dynamic Range: Theory  85
  High Dynamic Range: Tone Mapping  86
  High Dynamic Range: Example  87
  Neutral Density Filters  88
  Graduated Neutral Density Filters  89
  Polarizing Filters: Theory  90
  Polarizing Filters: Practice  91
  Polarizing Filters: Example  92

Lighting  93
  Front Lighting  93
  Front Lighting: Example  94
  Side Lighting  95
  Side Lighting: Example  96
  Back Lighting  97
  Back Lighting: Example  98
  Diffuse Lighting  99
  Diffuse Lighting: Example  100
  Sunrise and Sunset  101
  Sunrise and Sunset: Example  102
  Flash Lighting  103
  Flash Guide Number  104
  Manual Flash  105
  Digital TTL Flash Metering  106
  Sync Speed  107

Image Quality: Resolution  108
  Linear Systems Theory  108
  Point Spread Function  109
  Modulation Transfer Function  110
  Diffraction PSF  111
  Diffraction MTF  112
  Lens PSF  113
  Lens MTF  114
  Detector Aperture  115
  Sampling  116
  Aliasing  117
  Optical Low-Pass Filter  118
  Camera System MTF  119
  Resolving Power  120
  Perceived Sharpness  121

Image Quality: Noise and Dynamic Range  122
  Signal-to-Noise Ratio  122
  Pixel Size  123
  Dynamic Range Metrics  124

Cross-Format Comparisons  125
  Equivalence Theory: Practice  125
  Equivalence Theory: Image Quality  126
  Units: Resolution  127
  Units: Noise and Dynamic Range  128

Equation Summary  129
Bibliography  139
Index  146


Preface

This Field Guide provides a concise summary of the photographic imaging chain and explains the connections between the science and photographic practice. The book begins by summarizing the fundamental optics required for understanding photographic formulae. This is followed by optics-based sections (pp. 12–36) covering topics such as focusing, framing, depth of field, and photometric exposure. The middle part of the book (pp. 37–73) describes the relationship between the photometric exposure distribution at the sensor plane and the resulting digital raw data produced by the camera, and the basics of the subsequent steps required to convert the raw data into an output color image designed for viewing on a display. Later sections concentrate on photographic practice, covering topics such as lighting and strategies for obtaining a suitable exposure (pp. 74–107). The final sections of the book cover the more technical topics relating to camera image quality.

I would like to thank Prof. R. Barry Johnson for many useful suggestions. I would also like to thank Scott McNeill and Tim Lamkins at SPIE.

D. A. Rowlands
www.andyrowlands.com
January 2020


Glossary of Symbols and Notation

Image-space quantities are denoted using a prime symbol.

A_det  Flux detection area
A_p  Photosite area
ADC  Analog-to-digital converter
ADU  Analog-to-digital unit
AF  Autofocus
AFoV  Angular field of view
AS  Aperture stop
ATF  Aberration transfer function
AW  Adopted white
b  Bellows factor
B  Camera raw space tristimulus value (blue channel)
b̄(λ)  Color matching function for CIE RGB color space
B_L  Linear sRGB relative tristimulus value (blue)
B_WP  Raw white balance multiplier (blue channel)
Bv  Brightness value
c  Speed of light; Circle of confusion diameter
C  Center of curvature; Sense node capacitance
C  Color transformation matrix
CA  Chromatic adaptation
CAT  Chromatic adaptation transform
CCD  Charge-coupled device
CCT  Correlated color temperature
cd  Candela
CFA  Color filter array
CIPA  Camera and Imaging Products Association
CMF  Color matching function
CMOS  Complementary metal-oxide semiconductor
CoC  Circle of confusion
CSF  Contrast sensitivity function
d  Length of sensor diagonal
D  Diameter of entrance pupil
D_AW  Diagonal white balance matrix for adopted white
D_D65  Diagonal WB matrix for D65 illumination
Dv  Least distance of distinct vision
d_x, d_y  Sides of rectangular photosite detection area
D_XP  Diameter of exit pupil
DCNU  Dark current non-uniformity
DN  Data number
DoF  Depth of field
DOL  Digital output level
DR  Dynamic range
DSLR  Digital single lens reflex
DSNU  Dark signal non-uniformity
e  Lens extension; Elementary charge
E  Illuminance
E_e  Irradiance
E_e,λ  Spectral irradiance
Ẽ_e,λ  Spectral irradiance averaged over detection area
E_v  Illuminance
EC  Exposure compensation
EP  Entrance pupil
ETTR  Expose to the right
Ev  Exposure value
f  Front effective focal length
F  Front focal point
f′  Rear effective focal length
F′  Rear focal point
f_E  Effective focal length
f(x, y)  Ideal spectral irradiance (real space)
F(μ_x, μ_y)  Ideal spectral irradiance (Fourier domain)
FEC  Flash exposure compensation
FF  Fill factor
FFP  Front focal plane
FP  Focal plane
FPN  Fixed pattern noise
FS  Field stop
FT  Fourier transform
FWC  Full-well capacity
g  Conversion factor between electrons and DN
G  Gain
G  Camera raw space tristimulus value (green)
G_base  Base gain
G_CG  Conversion gain
G_ISO  Programmable ISO gain
G_ISO-less  ISO-less gain setting
G_L  Linear sRGB relative tristimulus value (green)
G_p  Source follower amplifier gain
G_WP  Raw white balance multiplier (green channel)
g(x, y)  Real spectral irradiance (real space)
ḡ(λ)  Color matching function for CIE RGB color space
G(μ_x, μ_y)  Real spectral irradiance (Fourier domain)
GN  Guide number
h  Planck's constant; Hyperfocal distance approximation h = f²/(cN)
H  Hyperfocal distance
h, h′  Object and image heights
H, H′  First and second principal planes
h(x, y)  System point spread function
H(μ_x, μ_y)  System optical transfer function
h_diff  Diffraction point spread function
|H_det|  Detector modulation transfer function
⟨H⟩  Average photometric exposure
HDR  High dynamic range
HVS  Human visual system
i, i′  Paraxial angles of incidence and refraction
I, I′  Real angles of incidence and refraction
ICC  International Color Consortium
ih  Image height
IP  Image plane
IQ  Image quality
ISO  International Standards Organization
JPEG  Joint Photographic Experts Group
K  Reflected light meter calibration constant
K_m  Luminous efficacy constant for photopic vision
L  Viewing distance; Luminance; Tristimulus value in LMS color space
L_e  Radiance
L_e,λ  Spectral radiance
L_max  Maximum scene luminance
L_v  Luminance
L*  Lightness
l̄(λ)  Eye cone response function (long)
⟨L⟩  Average scene luminance
LCD  Liquid crystal display
LENR  Long exposure noise reduction
lm  Lumen
lp  Line pair
LSI  Linear shift invariant
m  Magnification
M  Bit depth of ADC; Number of frames; Number of stops; Tristimulus value in LMS color space
m_p  Pupil magnification
M_R  Color rotation matrix
M_sRGB  sRGB to XYZ transformation matrix (D65 WP)
m̄(λ)  Eye cone response function (medium)
MOS  Metal-oxide semiconductor
MTF  Modulation transfer function
N  f-number
n, n′  Object and image space refractive indices
n_DN  Raw level expressed using data numbers
n_e  Electron count per photosite
n_lens  Lens refractive index
N_w  Working f-number
ND  Neutral density
nm  Nanometer
OA  Optical axis
OLPF  Optical low-pass filter
OP  Object plane
OTF  Optical transfer function
p  Pixel pitch for square photosites
P  Photographic constant
p_x, p_y  Pixel pitch in x and y directions
P, P′  First and second principal points
P(λ)  Spectral power distribution function
PDAF  Phase-detect autofocus
PDR  Photographic dynamic range
PGA  Programmable gain amplifier
ph  Picture height
PSF  Point spread function
q  Exposure equation constant (value = 0.65)
Q  Charge signal
Q_FWC  Charge signal at full well capacity
QE  Quantum efficiency
QE(λ)  External quantum efficiency
R  Radius of curvature; Tristimulus value in CIE RGB color space; Focal length multiplier or equivalence ratio
R  Camera raw space tristimulus value (red channel)
r̄(λ)  Color matching function for CIE RGB color space
R_i(λ)  Camera response functions
R_L  Linear sRGB relative tristimulus value (red)
R_WP  Raw white balance multiplier (red channel)
RA  Relative aperture
REI  Recommended exposure index
RFP  Rear focal plane
RP  Resolving power
S  ISO setting; Tristimulus value in LMS color space
s, s′  OP and IP distances measured from H and H′
S_base  Base ISO setting
s_EP  Distance from H to entrance pupil
s_f  Distance from H to far DoF boundary
s′_f  H′ to far DoF boundary distance in image space
s_n  Distance from H to near DoF boundary
s′_n  H′ to near DoF boundary distance in image space
s′_XP  Distance from H′ to exit pupil
s̄(λ)  Eye cone response function (short)
SA  Spherical aberration
SLR  Single lens reflex
SNR  Signal-to-noise ratio
SOS  Standard output sensitivity
SP  Sensor plane
SPD  Spectral power distribution
t  Exposure duration or shutter speed; Integration time
T  Lens transmittance factor; Color temperature
T_CFA,i  CFA transmission function
TIFF  Tagged image file format
TTL  Through the lens
U  Unity gain
u, u′  Ray tangent slopes
U, U′  Real ray angles
V_p  Photosite voltage
V_p,FWC  Photosite voltage at full well capacity
WB  White balance
WP  White point
x  Real space positional coordinate; Chromaticity coordinate
X  Enlargement factor; Tristimulus value in XYZ color space
x̄(λ)  Color matching function for XYZ color space
XP  Exit pupil
y  Paraxial ray height in extended paraxial region; Real space positional coordinate; Chromaticity coordinate for XYZ color space
Y  Tristimulus value in XYZ color space; Relative luminance (absolute luminance is L_v)
ȳ(λ)  Color matching function for XYZ color space (same as standard luminosity function)
Z  Tristimulus value in XYZ color space
z̄(λ)  Color matching function for XYZ color space
α, α′  Angles defining the AFoV
η(λ)  Charge collection efficiency or internal QE
γ_D  Decoding gamma
γ_E  Encoding gamma
λ  Wavelength
μ_c  System cutoff frequency
μ_c,det  Detector cutoff frequency
μ_c,diff  Diffraction cutoff frequency
μ_Nyq  Sensor Nyquist frequency for square photosites
μ_r  Radial spatial frequency on SP
μ_x, μ_y  Spatial frequencies on SP
μm  Micrometer
Φ  Total refractive power
Φ_e  Radiant flux
Φ_e,λ  Spectral flux
Φ_s  Surface power
Φ_v  Luminous flux
σ_DN,read  Read noise measured in data numbers
σ_DN,psn  Photon shot noise measured in data numbers
σ_e,psn  Photon shot noise measured in electrons
σ_e,read  Read noise measured in electrons


Fundamental Optics


Refraction

Geometrical optics uses rays to represent light propagation through an optical system. According to Fermat's principle, rays will choose a path that is stationary with respect to variations in path length, typically a time minimum. The quickest path may result in a change of direction at the interface between two refractive media separated by a refracting surface.

The change of direction is described by Snell's law,

n′ sin I′ = n sin I

n and n′ are the refractive indices of the first and second refractive media. The refractive index describes the speed of light in vacuum relative to the medium. I is the angle of incidence, and I′ is the angle of refraction. Both are measured relative to the surface normal. For rotationally symmetric systems, the optical axis (OA) is the axis of symmetry taken along the z direction. Image formation requires a spherical refracting surface so that an incident ray travelling parallel to the OA can be bent towards the OA. However, spherical surfaces give rise to aberrations and associated image defects. A conceptually simple example is spherical aberration, where rays originating from a unique object point do not converge to form a unique image point at the OA.



Paraxial Imaging

Aberrations arise from the higher-order terms in the expansion of the sine function (p. 11) but can be corrected by counterbalancing using a compound lens design and real raytracing. A perfect design is free of residual aberrations. Gaussian optics is a simplified framework that neglects all aberrations. However, the Gaussian imaging properties are exact (not approximate) if the lens has been perfectly designed. The first step is to consider points and angles only in the paraxial region infinitesimally close to the OA where sin u = u holds exactly and thus aberrations are absent. Snell's law becomes n′i′ = ni, where i, i′ are infinitesimal.

Consider the point object p on the OA with image at p′. C is the spherical surface center of curvature, and R is the radius of curvature (positive when C is to the right of the surface, and negative when to the left). The object and image distances s and s′ satisfy the Gaussian conjugate equation for a spherical surface,

n/s + n′/s′ = Φ_s

s is positive when measured from right to left, and s′ is positive when measured from left to right. The surface power Φ_s for a single refracting surface is

Φ_s = (n′ − n)/R

Φ_s is positive for a converging surface that bends rays towards the OA, and negative for a diverging surface. Note that in order to image objects of arbitrary height, Gaussian optics extends the paraxial region (p. 3).
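The single-surface relations above are easy to check numerically. The following is a minimal sketch (not from the book; the function names and the air-to-glass example values are illustrative assumptions) that evaluates Φ_s = (n′ − n)/R and solves the conjugate equation for the image distance:

```python
def surface_power(n, n_prime, R):
    """Refractive power of a single spherical surface, Phi_s = (n' - n)/R."""
    return (n_prime - n) / R

def image_distance(n, n_prime, phi_s, s):
    """Solve n/s + n'/s' = Phi_s for s' (book sign convention: s and s' are
    positive for a real object to the left and a real image to the right)."""
    return n_prime / (phi_s - n / s)

# Illustrative example: air-to-glass surface, R = +50 mm, object 200 mm away.
phi = surface_power(1.0, 1.5, 50.0)          # 0.01 mm^-1
print(image_distance(1.0, 1.5, phi, 200.0))  # s' = 1.5/(0.01 - 0.005) = 300 mm
```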


Gaussian Optics

Gaussian optics treats each surface as free of aberrations by performing a linear extension of the paraxial region to arbitrary heights above the OA.


Since tan u = u holds exactly in the paraxial region, each refracting surface is projected onto a flat tangent plane normal to the OA, and the surface power Φ_s is retained. The angles i and i′ are now equivalent to u and u′, where

u = tan⁻¹(y/s) = y/s;  u′ = tan⁻¹(y/s′) = y/s′

u and u′ are interpreted as ray slopes instead of angles. Substituting u and u′ into the Gaussian conjugate equation for a spherical surface yields a new form of Snell's law,

n′u′ = nu − yΦ_s

Paraxial imaging is now valid at arbitrary heights h, h′. OP is the object plane, and IP is the image plane.


Compound Lenses

Consider Gaussian imaging by a compound lens that comprises many refracting surfaces of various curvatures and spacings that bound refractive media. The image of a given OP appears in sharp focus at the IP located by tracing rays from the OP and applying n′u′ = nu − yΦ_s at each surface (p. 3). This is known as a ynu raytrace. Only ray slopes and the intersection height at each surface are required.

Now consider a ynu raytrace for an OP distance that approaches infinity so that rays arriving at the lens from the axial position on the OP will enter parallel to the OA.

The second principal plane H′ is located at the intersection of a parallel ray entering at height y_1 and the backwards extension of the same ray emerging from the final surface with slope u_k. All such rays converge to the OA at the rear focal point F′. The first principal plane H can be found analogously from a backwards ynu raytrace using a ray that enters the final surface parallel to the OA. H and H′ intersect the first principal point P and second principal point P′, respectively, at the OA. The principal planes are equivalent refracting planes from the imagined origin of the overall compound lens refractive power.
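A ynu raytrace is straightforward to mechanize. The sketch below is an illustration rather than the book's procedure: it applies n′u′ = nu − yΦ_s and a simple transfer step at each surface of a hypothetical two-surface lens, then recovers the rear effective focal length and the back focal distance from the emerging ray.

```python
def ynu_trace(surfaces, y0=1.0, u0=0.0, n0=1.0):
    """Trace a paraxial ray through a list of surfaces.
    Each surface is (power Phi_s, n_after, gap_to_next_surface).
    Refraction: n'u' = nu - y*Phi_s ; transfer: y_next = y + gap*u'.
    Returns the final ray height y and slope u (slope taken as dy/dz)."""
    y, u, n = y0, u0, n0
    for phi, n_after, gap in surfaces:
        u = (n * u - y * phi) / n_after   # refraction at the surface
        n = n_after
        y = y + gap * u                   # transfer to the next surface
    return y, u

# Hypothetical doublet in air: two surfaces separated by 5 mm.
surfaces = [(0.02, 1.5, 5.0),   # first surface, power 0.02 mm^-1, glass follows
            (0.01, 1.0, 0.0)]   # second surface, power 0.01 mm^-1, air follows
y, u = ynu_trace(surfaces)      # parallel input ray at height y0 = 1
f_rear = -1.0 / u               # rear effective focal length f' = -y0/u'
bfd = -y / u                    # distance from the last surface to F'
print(f_rear, bfd)              # ~34.1 mm and ~31.8 mm for these values
```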


Gaussian Equation

The Gaussian conjugate equation for a spherical surface (p. 2) can be generalized and applied to a compound lens,

n/s + n′/s′ = Φ

Φ is now the total refractive power, which depends on all surface powers, refractive media, and spacings between surfaces. The OP and IP distances s and s′ are measured along the OA from the principal points P and P′ (p. 4).

In photography, s is positive when measured to the left of H, and s′ is positive when measured to the right of H′. The region to the left of H is defined as object space, and the region to the right of H′ is defined as image space. H and H′ are planes of unit magnification. They can be in any order and do not have to be located inside the lens. Therefore, object space and image space can overlap. Object space adopts the refractive index n of the medium to the left of the first surface. Image space adopts the refractive index n′ of the medium to the right of the final surface. n = n′ = 1 if the lens is immersed in air. The rear focal point F′ defines the closest possible IP location behind the lens. The front focal point F defines the closest possible OP location in front of the lens. When the positions of P′ and F′ are known (p. 4), the total refractive power Φ can be determined from the rear effective focal length (p. 6).


Focal Length

The ability of a lens to bend rays can be described by either the total refractive power or the focal length. When the OP distance approaches infinity, rays entering the lens from the axial position on the OP will arrive parallel to the OA (p. 4). The exiting rays will converge to the rear focal point F′ at the axial position on the rear focal plane.

Letting s → ∞ is known as setting focus at infinity. The rear effective focal length f′ is defined as the distance s′ measured from P′ to F′. According to the Gaussian conjugate equation, n′/f′ = Φ. The front focal point F positioned on the front focal plane is the axial position from which all rays entering the lens will exit parallel to the OA.

The front effective focal length f is defined as the distance measured from P to F. It follows from the Gaussian equation that n/f = Φ. The effective focal length f_E is defined simply as the reciprocal of Φ. It is not in principle an observable length. The Gaussian conjugate equation can be expressed as

n/s + n′/s′ = n/f = n′/f′ = 1/f_E,  or in air,  1/s + 1/s′ = 1/f_E

Since f = f′ = f_E for a lens in air (n = n′ = 1), photographers refer to both f and f′ as the "focal length."


Thick and Thin Lenses

Simple expressions for the focal length and total refractive power Φ can be found for special cases of a compound lens. A thin lens has negligible thickness compared to its diameter, so t_lens → 0.

In this case, Φ is simply the sum of the surface powers:

Φ = Φ_1 + Φ_2,  where Φ_1 = (n_lens − n)/R_1,  Φ_2 = (n′ − n_lens)/R_2

Substituting into the Gaussian conjugate equation with n = n′ = 1 yields the lensmakers' formula:

1/s + 1/s′ = (n_lens − 1)(1/R_1 − 1/R_2)

For a single thick lens, an extra term is required:

Φ = Φ_1 + Φ_2 − Φ_12,  where Φ_12 = (t_lens/n_lens) Φ_1 Φ_2

Here, t_lens is the lens thickness at the OA. Substituting Φ into the Gaussian equation yields Descartes' formula.
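As a quick check of these special cases, the sketch below compares the thin-lens and thick-lens powers for a hypothetical biconvex element; the function names and numerical values are illustrative, not from the book.

```python
def thin_lens_power(n_lens, R1, R2, n=1.0, n_prime=1.0):
    """Phi = Phi1 + Phi2 for a thin lens (surface separation neglected)."""
    phi1 = (n_lens - n) / R1
    phi2 = (n_prime - n_lens) / R2
    return phi1 + phi2

def thick_lens_power(n_lens, R1, R2, t_lens, n=1.0, n_prime=1.0):
    """Phi = Phi1 + Phi2 - (t_lens/n_lens)*Phi1*Phi2 for a single thick lens."""
    phi1 = (n_lens - n) / R1
    phi2 = (n_prime - n_lens) / R2
    return phi1 + phi2 - (t_lens / n_lens) * phi1 * phi2

# Illustrative biconvex element: R1 = +60 mm, R2 = -60 mm, n_lens = 1.5.
print(1.0 / thin_lens_power(1.5, 60.0, -60.0))         # f_E = 60 mm
print(1.0 / thick_lens_power(1.5, 60.0, -60.0, 6.0))   # ~61 mm, slightly longer
```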


Magnification

Lateral or transverse magnification m is the size of the image relative to the size of the object. It is defined as

m = h′/h

h, h′ are the object and image heights measured from the OA. The image from a photographic lens is real and inverted, and so h′ and m are negative.


The Lagrange theorem can be used to express m in terms of the initial and final ray slopes u and u′:

m = nu/(n′u′),  or when immersed in air,  m = u/u′

Magnification can also be expressed in terms of object and image distances measured from the principal points:

m = −ns′/(n′s),  or when immersed in air,  m = −s′/s

Substituting into the Gaussian conjugate equation yields

m = f/(f − s)

Magnification is typically taken to be a positive value in photographic formulae. Such formulae should use |m| so that m → +|m|:

|m| = f/(s − f)

This formula shows that the magnification reduces to zero when focus is set at infinity, i.e., |m| → 0 when s → ∞. A macro lens can achieve life-size (1:1) reproduction. This occurs when the OP is at s = 2f so that |m| = 1. Higher m may be possible for f < s < 2f by using a lens extension tube, close-up filter, or physical bellows.
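A short numerical example of the magnification formula; the helper names and the 100-mm example values are assumptions for illustration only.

```python
def magnification(f, s):
    """|m| = f/(s - f); s is the OP distance measured from H after focusing."""
    return f / (s - f)

def subject_distance(f, m_abs):
    """Invert |m| = f/(s - f) to find the OP distance for a target |m|."""
    return f * (1.0 + 1.0 / m_abs)

print(magnification(100.0, 2000.0))   # ~0.053 at 2 m with a 100-mm lens
print(subject_distance(100.0, 1.0))   # 200 mm = 2f for 1:1 reproduction
```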


Entrance and Exit Pupils

An iris diaphragm acts as an adjustable aperture stop (AS) that controls the amount of light passing through the lens. Within Gaussian optics, the entrance pupil (EP) is a flat surface in object space that coincides with the image of the AS due to lens elements in front of the AS. The exit pupil (XP) is a flat surface in image space that coincides with the image of the AS due to lens elements behind the AS. The images of the AS are often virtual and so cannot be displayed on a screen. The EP is important because
1. The apex of the angular field of view (AFoV) denoted by α is located at the EP (p. 17).
2. The total amount of light that enters the lens depends on the EP diameter. This is fundamental for exposure.

A meridional plane is any plane that contains the OA. The chief rays are incoming meridional rays (lying on a meridional plane) that pass by the edges of any field stop (FS) and through the centers of the AS, EP, and XP. The chief rays define the AFoV (shown above in black). The marginal rays (shown above in blue) are incoming meridional rays from the axial position on the OP that pass by the edges of the AS, EP, and XP. The lens above has reversed virtual pupils, and the virtual marginal and chief rays are shown using dotted lines. Adjusting the AS alters the size of the ray bundle between the marginal rays. This will affect the amount of light that reaches the IP but will not affect the AFoV.


Pupil Magnification

The pupil magnification m_p is numerically important at macro to portrait OP distances:

m_p = D_XP/D

Here, D_XP and D are the XP and EP diameters, respectively.

For a symmetric lens with m_p = 1, the EP will coincide with H, and the XP will coincide with H′. In this case, s_EP = s′_XP = 0. When m_p ≠ 1, the pupils will be displaced from H and H′. They can be in any order and may be located away from the lens. Typically, m_p < 1 for telephoto lenses, m_p > 1 for wide-angle lenses, and m_p may vary for zoom lenses. Focus has been set at infinity in the above figure. This reveals the distance s′_XP measured from H′:

s′_XP = (1 − m_p) f′

By applying the Gaussian conjugate equation to the pupils, the distance s_EP measured from H is found to be

s_EP = (1 − 1/m_p) f


Aberrations

When rays are traced using exact trigonometric equations based on Snell's law (p. 1), aberrations arise from the higher-order terms in the expansion of the sine function:

sin u = u − u³/3! + u⁵/5! − u⁷/7! + ···

Primary (Seidel) aberrations arise from the third-order term:
• Spherical aberration (SA) is caused by variations of focus with ray height in the aperture (p. 26). A converging element has under-corrected SA, so rays come to a focus closer to the lens as the ray height increases.
• Coma is caused by variation of magnification with ray height in the aperture so that rays from a given object point will be brought to a focus at different heights on the IP. The image of a point is spread into a non-symmetric shape that resembles a comet. Coma is absent at the OA but increases with radial distance outwards.
• Astigmatism arises because rays from an off-axis object point are presented with a tilted rather than symmetric aperture. A point is imaged as two small perpendicular lines. Astigmatism improves as the aperture is reduced.
• Petzval field curvature describes the fact that the IP corresponding to a flat OP is itself not naturally flat. Positive and negative elements introduce inward and outward curvature, respectively. Field curvature increases with radial distance outwards.
• Distortion describes the variation of magnification with radial height on the IP. Positive or pincushion distortion will pull out the corners of a rectangular image crop to a greater extent than the sides, whereas barrel distortion will push them inwards. Distortion does not introduce blur and is unaffected by aperture.
• Chromatic aberration and lateral color appear in polychromatic light due to variation of focus with wavelength λ and off-axis variation of m with λ, respectively.

Aberrations affect image quality and can be measured as deviations from the ideal Gaussian imaging properties. Modern compound lenses are well corrected for primary aberrations.


Focusing

Unit Focusing

The imaging sensor is located at the sensor plane (SP). The position of the SP is fixed to coincide with the rear focal plane when focus is set at infinity.

When the OP moves towards F, the IP falls behind the SP.

The IP must be brought forward to render the OP in sharp focus at the SP. For lenses that utilize traditional or unit focusing, the whole lens barrel is moved along the OA, either manually or via an autofocus motor.

Since the original OP distance l measured from P (or H) is reduced by the barrel movement, the required extension e for a lens in air satisfies the following Gaussian equation:

1/(l − e) + 1/(f_E + e) = 1/f_E  ⇒  e = (1/2)[(l − f_E) − √((l − f_E)² − 4f_E²)]

If the OP distance is measured from the SP instead, then

l − f_E = d − 2f_E − D

Here, d is the distance from the SP to OP, and D is the separation between the compound lens principal planes.
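The closed-form extension above can be evaluated directly. This sketch assumes a unit-focusing lens in air; the function name and the 50-mm example are illustrative only.

```python
from math import sqrt

def unit_focus_extension(l, f_E):
    """Barrel extension e satisfying 1/(l - e) + 1/(f_E + e) = 1/f_E for a
    unit-focusing lens in air; l is the OP distance measured from P (or H)."""
    return 0.5 * ((l - f_E) - sqrt((l - f_E) ** 2 - 4.0 * f_E ** 2))

# Illustrative: 50-mm lens focused on an OP 2000 mm from H.
print(unit_focus_extension(2000.0, 50.0))   # ~1.28 mm of extension
```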


Internal Focusing

Lenses that utilize internal focusing achieve focus by movement of a floating element or group along the OA. Internally focusing lenses necessarily reduce their focal length as the OP distance is reduced from infinity.


The example above shows a telephoto lens comprising a front group with positive (+ve) overall refractive power, a floating group with negative (−ve) overall refractive power, and a rear group with positive overall refractive power. When the OP distance s is reduced, movement of the floating group towards the rear group increases the total refractive power and reduces f so that the Gaussian conjugate equation is satisfied. Advantages include
1) The lens does not physically extend.
2) The lens can be fully weather sealed.
3) Graduated neutral density and polarizing filters can be used since the barrel does not rotate upon focusing.
4) Only a small movement of the floating element is needed. This is advantageous for fast autofocus.
5) Variation of the AFoV with OP distance that occurs with unit-focusing lenses (focus breathing, p. 18) can, in principle, be eliminated. This is particularly advantageous for cinematography lenses.

All Gaussian focusing equations utilize the plane of best focus for an ideal aberration-free lens. In practice, the SP may be intentionally shifted slightly from the Gaussian location to balance aberrations. For example, defocus can be used to counterbalance spherical aberration. The required movement of floating elements will be measured and calibrated by the manufacturer.


Single Lens Reflex Cameras

The lens is an integral part of the single lens reflex (SLR) viewfinder, and so the photographer can look through the lens to frame the scene and set focus. The SLR viewfinder uses a 45° reflex mirror to reflect the cone of light emerging from the XP onto a ground-glass focusing screen located at a plane equivalent to the SP but rotated 90° vertically. A roof pentaprism with ocular lens (eyepiece) images the focusing screen, correcting the reversed horizontal orientation while simultaneously raising the image by 90° for viewing at eye level. The image seen by the eye is virtual.

When focus is set at infinity, the camera lens focal length divided by the eyepiece focal length gives the overall scene magnification. For example, a 50-mm lens on the 35-mm full frame format used with a 62.5-mm eyepiece gives an overall visual magnification of 0.80. A typical operational cycle upon activating the shutter:
1. The iris diaphragm closes to the required aperture.
2. The mirror rises 45° into the horizontal position.
3. The shutter is released to expose the imaging sensor.
4. The shutter closes, the mirror returns to its starting position, and the iris diaphragm fully opens.


Autofocus

Phase-detect autofocus (PDAF) systems used in digital SLR cameras have evolved from the 1985 Minolta Maxxum design. The reflex mirror has a zone of partial transmission, and the transmitted light is directed by a secondary mirror down to an autofocus (AF) module located at an optically equivalent SP (p. 14). The module contains microlenses that direct the light onto a CCD strip. Consider light passing through the lens from a small region on the OP indicated by an AF point on the focusing screen. When the OP is in focus at the SP, the light distribution arriving from equivalent portions of two halves of the XP will produce an identical optical image and signal along each half of the CCD strip.

When the OP is not in sharp focus (i.e., out of focus), these images will shift either toward or away from each other, and this shift indicates the direction and amount by which the lens needs to be adjusted by its AF motor.

A horizontal CCD strip is suited for analyzing signals with scene detail present in the horizontal direction, such as the change of contrast provided by a vertical edge. A cross-type AF point utilizes both a horizontal and a vertical CCD strip so that scene detail in both directions can be utilized to achieve focus.


Focusing: Practice

• In single AF mode, the camera locks the distance to the OP as soon as sharp focus has been obtained.
• In continuous AF mode, the camera continuously attempts to achieve sharp focus on a subject while the focusing button is pressed. This mode is suitable for tracking moving subjects, particularly when the desired plane of focus is moving toward/away from the camera.
• Rather than half-press the shutter button to focus and meter before fully pressing to activate the shutter, assigning a dedicated AF button allows multiple shots to be taken without needing to refocus. Similarly, metering can be separated from AF and shutter activation.
• Multiple-area AF modes will automatically choose the position at which focus is set by using information from multiple AF points to analyze the scene.
• Single-area AF mode allows an AF point to be manually selected. This is the most accurate method for static subjects provided the point is close to the subject.

The focus-and-recompose technique used in single-area AF mode involves setting focus and then recomposing the scene before activating the shutter. Although adequate for focusing on static subjects far from the camera, a subject close to the camera may become out of focus if it falls outside the near depth of field (p. 23) after recomposing by pivoting the camera. The maximum allowed pivot angle about the lens EP can be calculated within Gaussian optics in terms of the circle of confusion diameter c (p. 22):

φ = cos⁻¹[(s_n − s_EP)/(s − s_EP)] = cos⁻¹[|m|D/(|m|D + c)]
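The pivot-angle formula can be evaluated as follows; the function name and the 85-mm portrait example are illustrative assumptions, not values from the book.

```python
from math import acos, degrees

def max_recompose_angle(f, N, s, c):
    """Largest pivot angle (degrees) about the EP before the subject falls
    outside the near DoF: cos(phi) = |m|D/(|m|D + c), |m| = f/(s - f), D = f/N."""
    m_abs = f / (s - f)
    D = f / N
    return degrees(acos(m_abs * D / (m_abs * D + c)))

# Illustrative: 85-mm lens at f/1.8 focused 2 m away, full-frame CoC c = 0.030 mm.
print(max_recompose_angle(85.0, 1.8, 2000.0, 0.030))   # roughly 10 degrees
```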

Framing and Perspective


Angular Field of View

The angular field of view (AFoV) α is defined by

tan(α/2) = h/(s − s_EP),  where h = h′/|m| = d/(2|m|)

d is the length of the sensor diagonal. The apex of the AFoV is located at the lens EP (p. 9). The distance s − s_EP can be found by combining the magnification (p. 8) and pupil distance (p. 10) formulae:

s − s_EP = (1/|m| + 1/m_p) f

The AFoV formula becomes

α = 2 tan⁻¹[d/(2bf)],  where b = 1 + |m|/m_p,  |m| = f/(s − f)

b is the bellows factor; b = 1 when focus is set at ∞. s is the H-to-OP distance after focus has been set. Shown is the AFoV (b = 1) for focal lengths of 14, 18, 24, 35, 50, and 100 mm on 35-mm full frame.
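A minimal sketch of the AFoV formula, assuming a full-frame diagonal of 43.27 mm; the function name and example focal lengths are illustrative.

```python
from math import atan, degrees

def afov(f, d, s=None, m_p=1.0):
    """AFoV in degrees: alpha = 2*atan(d/(2*b*f)), b = 1 + |m|/m_p,
    |m| = f/(s - f); b = 1 when focus is set at infinity (s is None)."""
    b = 1.0 if s is None else 1.0 + (f / (s - f)) / m_p
    return degrees(2.0 * atan(d / (2.0 * b * f)))

d_ff = 43.27                        # 35-mm full-frame diagonal (mm)
print(afov(50.0, d_ff))             # ~46.8 degrees at infinity focus
print(afov(50.0, d_ff, s=500.0))    # narrower when focused at 0.5 m
```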


Focus Breathing

The AFoV formula is valid for all focusing methods,

α = 2 tan⁻¹[d/(2bf)],  where b = 1 + |m|/m_p,  |m| = f/(s − f)

However, the value of the AFoV is dependent upon the OP distance s when focus is set closer than infinity. This phenomenon is known as focus breathing. In summary,
• For unit-focusing lenses, the AFoV decreases as s is reduced.
• For front-cell or internally focusing lenses, the variation of AFoV with s depends upon the lens design details.
• s is always the H to OP distance after setting focus.

Unit focusing
• f remains constant.
• |m| and b gradually increase from zero as the OP distance s is reduced from ∞.
• Reducing s causes the AFoV to decrease relative to its ∞ value. Objects therefore appear larger than expected.
• The AFoV reduction can be significant at macro object distances where |m| is large.

Internal focusing
• f decreases as the OP distance s is reduced from ∞.
• The |m| and b values at a given s depend upon the "new" f value after setting focus at the OP.
• The "new" f and b that occur after setting focus must be used in the AFoV formula. The specific values depend upon the details of the lens design. For zoom lenses, the values may change over the zoom range.
• Internal focusing can, in principle, be used to eliminate focus breathing by designing the lens so that the product bf always remains equal to the value of f at infinity focus when the OP distance s changes.
• Internal focusing in photographic lenses often overcompensates for the reduction in AFoV that would occur when using a unit-focusing lens. The AFoV of many 70–200-mm lenses on 35-mm full frame increases as s is reduced, and so objects appear smaller than expected. This behavior is opposite to that of unit-focusing lenses.


Equivalent Focal Length

The AFoV depends upon the sensor format dimensions and the front effective focal length f. The following formula can be used to obtain the same AFoV on two different formats when focus is set at infinity:

f_1 = R f_2,  where R = d_1/d_2

R is the equivalence ratio between the formats; d_1 and d_2 are the diagonal lengths of formats 1 and 2; and f_1 and f_2 are the focal lengths used on formats 1 and 2. When format 1 is 35-mm full frame (d_1 = 43.27 mm), R and f_1 are known as the focal length multiplier and 35-mm equivalent focal length. This is useful for photographers familiar with the way that focal lengths on 35-mm full frame film relate to the expected AFoV. When focus is set closer than infinity, R is formally replaced by the working equivalence ratio R_w. For example, if the larger format focal length f_1 is known,

R_w = (1 − m_c,1/p_c,1) R

where m_c,1 = ((R − 1)/R)(f_1/s_1),  p_c,1 = m_p + (1 − m_p)(f_1/s_1)

Here, f_1 and s_1 are the values after focus has been set (see p. 18 on focus breathing). For a given OP distance s_1, precisely the same AFoV can be obtained only if the lenses have the same pupil magnification m_p. In practice, R ≈ R_w at ordinary magnifications. The following are focal length multipliers for a selection of sensor formats.
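The focal length multiplier is a one-line calculation. The sketch below assumes an APS-C diagonal of about 28.3 mm (computed from the 23.6 × 15.7 mm dimensions tabulated on p. 22); the function names are illustrative.

```python
def focal_length_multiplier(d_full=43.27, d_crop=28.3):
    """Equivalence ratio R = d1/d2 between two formats (diagonals in mm)."""
    return d_full / d_crop

def equivalent_focal_length(f_crop, R):
    """35-mm equivalent focal length f1 = R*f2 (same AFoV at infinity focus)."""
    return R * f_crop

R = focal_length_multiplier()                 # ~1.53 for the assumed APS-C diagonal
print(R, equivalent_focal_length(35.0, R))    # a 35-mm lens frames like ~53 mm
```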


Perspective

The center of perspective is situated at the EP. When viewing a scene, true perspective depends only upon the OP distance measured from the EP. This is defined by s − s_EP, where s is the OP distance measured from H. Object B (below) appears almost as tall as object A when viewed from position 1. A different perspective is obtained at position 2, where object B appears much shorter than object A and the space between them appears compressed.

Focal length affects m and the AFoV but not perspective, and so a lens designed for a certain task should provide an appropriate AFoV at a working OP distance defined by a suitable perspective. The classic portrait lens has an 85-mm focal length on 35-mm full frame so that it can be used at a working distance that provides flattering compression of facial features.

The position from which a print or screen image is viewed defines the apparent perspective. In order to provide a correct visualization of the scene without perspective distortion, the photograph should be viewed from the center of perspective. In image space and in air, the center of perspective is located at the XP at a distance s′ − s′_XP ≅ f_E from the SP. When viewing the photograph, the correct equivalent position is located at a distance f_E multiplied by the enlargement factor X from the sensor dimensions (p. 22).


Camera Movements

Panning is used to increase the AFoV in panoramic photography. The camera is placed on a nodal slide fixed to a tripod and then panned (pivoted) horizontally about the fixed lens EP. Since the center of perspective is located at the EP, parallax error will be avoided so that background and foreground points will line up correctly when the overlapping images are digitally merged. Vertical tilting of the SP relative to the OP projects a tilted IP onto the SP. The magnification will vary over the SP due to the varying ratio between s and s′. This is known as keystone distortion as it causes vertical lines to lean toward each other. From the figure, the tilt angles are related by the Scheimpflug condition,

tan θ′ = |m| tan θ

Here, |m| is the magnification at the OA. Keystone distortion is objectionable for small θ since in this case the HVS subconsciously interprets leaning vertical lines as parallel when viewing the scene.

Tilting can be avoided by using an ultra-wide-angle lens or, preferably, by using the shift function of a tilt-shift lens to shift the lens relative to the SP in a vertical plane. Tilt-shift lenses can also tilt the lens relative to the SP in order to tilt the plane of focus without affecting the AFoV. An example application is depth-of-field extension.


Depth of Field

Circle of Confusion

In principle, only the OP will appear in sharp focus on a photograph. In practice, a certain amount of blur will remain undetectable due to the limited resolving power (RP) of the HVS. This permits a region of depth away from the OP in object space known as the depth of field (DoF) that appears acceptably sharp on the photograph since the blur will not be noticed. The permitted blur depends upon the viewing distance L and the RP of the HVS at the specified L. When the photograph is projected down to the sensor dimensions by the enlargement factor X, simple DoF calculations assume that blur can arise from lens defocus only so that a permitted SP blur spot can be described by a uniform circle of confusion (CoC) of diameter c.

Sensor format        Dimensions (mm)   c (mm)
35-mm full frame     36 × 24           0.030
APS-C                23.6 × 15.7       0.020
Micro four thirds    17.3 × 13.0       0.015
1 inch               13.2 × 8.80       0.011
2/3 inch             8.6 × 6.6         0.008
1/1.7 inch           7.6 × 5.7         0.006
1/2.5 inch           5.76 × 4.29       0.005

The standard CoC diameters tabulated here correspond to the following set of standard viewing conditions:
• L = Dv, where Dv = 250 mm is the least distance of distinct vision.
• RP(Dv) = 5 lp/mm (line pairs per mm) for a photograph of an alternating black/white striped pattern. (Multiply by X to obtain the RP at the sensor dimensions.)
• X = 8 for the 35-mm full frame format.

In general, the CoC diameter is given by

c = (1.22 × L)/(X × Dv × RP(Dv))
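The general CoC formula reproduces the tabulated values. A minimal sketch, assuming the standard viewing conditions listed above; the function name is illustrative.

```python
def coc_diameter(X, L=250.0, Dv=250.0, RP_Dv=5.0):
    """CoC diameter in mm: c = (1.22*L)/(X*Dv*RP(Dv)); L and Dv in mm,
    RP(Dv) in line pairs per mm at the least distance of distinct vision."""
    return (1.22 * L) / (X * Dv * RP_Dv)

print(coc_diameter(X=8))                  # ~0.030 mm for 35-mm full frame
print(coc_diameter(X=8 * 43.27 / 28.3))   # ~0.020 mm for APS-C (larger enlargement)
```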


Depth of Field: Formulae

From similar triangles, it can be seen that

[(s − s_n)/(s_n − s_EP)] D = c/|m| = [(s_f − s)/(s_f − s_EP)] D

Here, c is the CoC diameter. Rearranging yields:

• The near DoF defined by the distance s − s_n:

  near DoF = c(s − s_EP)/(|m|D + c) = (s − f)(s − s_EP)/(h + (s − f))

• The far DoF defined by the distance s_f − s:

  far DoF = c(s − s_EP)/(|m|D − c) = (s − f)(s − s_EP)/(h − (s − f))

• The total DoF defined by the distance s_f − s_n:

  total DoF = 2|m|Dc(s − s_EP)/(m²D² − c²) = 2h(s − f)(s − s_EP)/(h² − (s − f)²)

• h = f²/(cN) approximates the hyperfocal distance (p. 24).
• N = f/D is the lens f-number (p. 30).
• The distance s_EP = (1 − 1/m_p) f measured from H to EP is important only for macro photography (p. 10).

The above formulae are approximate since only uniform defocus blur has been included when defining the CoC. (See p. 108 for other sources of blur.)
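These formulae translate directly into a small calculator. The sketch below is illustrative rather than the book's code; it assumes distances in millimetres and m_p = 1 unless stated otherwise.

```python
def dof(f, N, s, c, m_p=1.0):
    """Near, far, and total DoF (mm) from the formulae above.
    f, s, c in mm; N is the f-number; s is measured from H; the far DoF
    becomes infinite once s - s_EP reaches the hyperfocal distance."""
    m_abs = f / (s - f)                 # |m| = f/(s - f)
    D = f / N                           # EP diameter
    s_EP = (1.0 - 1.0 / m_p) * f        # H-to-EP distance
    near = c * (s - s_EP) / (m_abs * D + c)
    denom = m_abs * D - c
    far = float('inf') if denom <= 0 else c * (s - s_EP) / denom
    return near, far, near + far

def hyperfocal(f, N, c, m_p=1.0):
    """H = h - s_EP + f, with h = f^2/(c*N)."""
    return f * f / (c * N) - (1.0 - 1.0 / m_p) * f + f

# Illustrative: 50-mm lens at f/8 focused at 3 m, full-frame CoC 0.030 mm.
print(dof(50.0, 8.0, 3000.0, 0.030))    # ~0.66 m near, ~1.19 m far
print(hyperfocal(50.0, 8.0, 0.030))     # ~10.5 m
```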


Depth of Field: Practice

A large or small total DoF is described as deep or shallow, respectively. A shallower DoF can be created by
• Increasing the magnification. According to the formula |m| = f/(s − f), this can be achieved by either using a longer lens focal length f or by setting a closer OP.
• Using a larger EP diameter. According to N = f/D, this is achieved by using a smaller f-number for a given f.

Cameras based on different sensor formats will give the same DoF provided "equivalent" f and N are used (p. 124). A larger format has the capability to produce a shallower DoF than a smaller format since the EP diameter can be made physically larger.

The near-to-far DoF ratio decreases as the OP distance increases. At the OP distance s − s_EP = H measured from the EP, the ratio reduces to zero because the far DoF extends to infinity. The distance H is the hyperfocal distance, and the DoF formulae are valid only up to H.

(Figure: near-to-far DoF ratio versus OP distance, decreasing from 1.0 near f toward zero at H.)

The far DoF extends to infinity when the denominator of the far DoF formula approaches zero so that s − f = h. Since s − s_EP = H at this distance, H is given by

H = h − s_EP + f

When s − s_EP = H, the corresponding near DoF = H/2. This means that focusing at H yields a near DoF that extends to half the value of H itself. Focusing at H maximises the DoF according to Gaussian optics. For landscapes, it is preferable to focus just beyond H to ensure that distant features are fully in focus.


Depth of Field: Examples

A shallow DoF due to a close OP and low f-number. The focus was set on the eye using single-area AF mode.

A deep DoF due to a high f-number has kept both the front text in the foreground and the distant features in focus.


Lens Bokeh: Theory

Lens bokeh describes the aesthetic character of the out-of-focus blur regions observed on a photograph. Certain lenses are highly regarded for the special bokeh that they produce. The DoF formulae are based upon a uniform and circular CoC for simple calculations. In practice,
• The shape of an out-of-focus blur spot depends primarily upon the number of blades used by the iris diaphragm.
• The distribution of blur over the blur spot depends primarily upon the nature of the lens aberrations.

(Blur spots reproduced from Nasse, 2010.)

Consider a lens with under-corrected SA (p. 11). The IP corresponding to rays from a point object positioned just beyond the far DoF boundary will be some distance in front of the SP, labeled above as SP(1). At SP(1), the point will appear as a blur spot larger than that produced in the absence of SA, and with a blur distribution that decreases toward the edges where the ray density is lower. This yields smooth background bokeh. For the same lens, the IP corresponding to rays from a point positioned just in front of the near DoF boundary will be situated some distance behind the same SP now shown as SP(2). At SP(2), the point will appear as a smaller blur spot with blur distribution that increases at the edges. Overlapping blur spots of this type can yield harsh foreground bokeh. The situation reverses for over-corrected SA. Background bokeh is often considered more important, and so slightly under-corrected SA is preferred. The red/green rims are due to chromatic aberration.


Lens Bokeh: Example

An interesting example is the swirling bokeh produced by certain lenses at a wide aperture, e.g., f/1.2 designs and Petzval-type lenses. The effect is considered desirable by some portrait photographers but objectionable by others. It is produced by a cat's-eye-shaped blur spot that arises from field curvature and vignetting.

Swirling bokeh produced by the Panasonic-Leica Nocticron 42.5-mm f/1.2 lens.

Bokeh cannot be easily deduced from lens MTF curves (p. 104) since lens MTF is based upon the lens point spread function (PSF, p. 112) on the SP, whereas bokeh depends upon the defocused PSF away from the SP.


Exposure

Photometry

Photometry can be used to quantify the amount of light perceived by the HVS. Photometry is the same as radiometry (p. 39) except that electromagnetic energy is weighted as luminous energy according to the spectral sensitivity of the HVS.

Quantity             Symbol   Unit
Luminous flux        Φ_v      lumen (lm)
Illuminance          E_v      lux (lx or lm/m²)
Luminous intensity   I_v      candela (cd or lm/sr)
Luminance            L_v      cd/m²

• Luminous flux is the rate of flow of luminous energy emitted or received by a specified surface area.
• Illuminance is the luminous flux per unit area received from all directions by a specified surface.
• Luminous intensity is the luminous flux emitted per unit solid angle from a point source into a cone.

Each point on an extended surface can be associated with an infinitesimal area. This defines luminance as the luminous flux per unit solid angle per unit projected source area in the direction of a cone. Luminance does not depend on the source distance.

The scene luminance distribution is an array of infinitesimal luminance patches representing the scene, each of which forms a cone subtended by the lens EP. The lens transforms the total flux entering the EP into the sensor-plane illuminance distribution, which is an array of infinitesimal illuminance patches on the SP. Photometric quantities use a “v” subscript for “visual.” The “v” is often dropped from photographic formulae.


Relative Aperture

Relative aperture is a quantity that arises when deriving a mathematical expression for the illuminance at the SP.

Consider an infinitesimal area element dA associated with the axial position on the OP. Within Gaussian optics, the luminous flux at the lens EP arriving from dA is given by

Φ_v = πLu² dA

L is the scene luminance associated with dA, and u is the marginal ray tangent slope (p. 9). Using the Lagrange theorem (p. 8) and m² = dA′/dA yields a formula for the same flux deposited at the corresponding area element dA′ at the axial position on the SP:

Φ_v = πLT (n′u′/n)² dA′

T ≤ 1 is the lens transmittance factor. The illuminance E = Φ_v/dA′ at the axial position on the SP can be written in terms of the relative aperture (RA):

E = (π/4) L T (RA)²,  where RA = (n′/n) 2u′

RA is directly associated with the cone of light subtended by the XP from the axial position on the SP. In photography it is convenient to express the ray slope u′ in terms of focal length and EP diameter. Two cases can be considered: 1) Focus set at infinity; in this case the RA can be expressed in terms of the f-number. 2) Focus set closer than infinity. The RA can be expressed in terms of the working f-number.


f-Number

When focus is set at infinity, s′ = f′. The ray slope is then

u′ = D/(2f′)

u′ can be substituted into the formula for the RA (p. 29). The refractive indices can be removed using n′/n = f′/f. The illuminance E at the axial position on the SP becomes

E = (π/4) L T (1/N²),  where N = f/D

The f-number N depends on the front effective focal length f and the EP diameter D. The f-number is the reciprocal of the RA when focus is set at infinity. It is usually marked on lens barrels using the symbols f/N or 1:N, where N is the f-number. Beyond Gaussian optics, the f-number can be written

N = n/(2NA′_∞),  where NA′_∞ = n′ sin U′

NA′_∞ is the image-space numerical aperture when s → ∞, and U′ is the real image-space marginal ray angle. Provided the lens is aplanatic (free from SA and coma), then sin U′ = u′ when s → ∞, according to Abbe's sine condition, so the Gaussian expression N = f/D is exact for an aplanatic lens. However, the sine function restricts the lowest achievable value in air (n = n′ = 1) to N = 0.5.


Working f-Number

When the OP is brought forward from infinity, s′ > f′, and the image-space ray slope u′ becomes

u′ = m_p D / [2(s′ − s′_XP)]

u′ can be substituted into the formula for the RA (p. 29), and the distance s′_XP = (1 − m_p) f′ (see p. 10). The illuminance E at the axial position on the SP becomes

E = (π/4) L T (1/N_w²)

The working f-number N_w is defined by

N_w = bN,  where b = 1 + |m|/m_p

• b is the bellows factor. For front-cell or internally focusing lenses, |m| must be replaced according to the "new" focal length f after setting focus on the OP.
• For a unit-focusing lens, |m| increases as the OP distance is reduced, and so E at the axial position on the SP decreases, e.g., b = 2 at 1:1 reproduction with m_p = 1, in which case E is reduced to 1/4 of the value at ∞.
• Knowledge of N_w is useful when using a hand-held exposure meter. In-camera through-the-lens (TTL) metering automatically accounts for the change in E.

When focus is set at infinity, |m| → 0, and b → 1. The working f-number then reduces to the f-number.


Natural Vignetting

Only the axial positions on the OP and SP are needed to define N and N_w. The formula for the illuminance E at the SP (p. 31) is valid only at the axial position. Even for a uniform scene luminance distribution, geometrical considerations dictate that E at non-axial positions on the SP will be reduced compared to the axial value. This is known as natural fall-off, roll-off, or natural vignetting.

(Figure: a non-axial area element on the OP subtending an angle with the OA at the EP.)

The luminous flux at the lens EP arriving from a non-axial position on the OP subtending an angle θ with the OA is given by the cosine fourth law,

$$ F_v(\theta) \approx \pi L u^2 \cos^4\theta \, dA $$

The illuminance at any chosen position on the SP is then

$$ E = \frac{\pi}{4} L T \frac{1}{N_w^2} \cos^4\theta $$

At the OA, θ = 0°, and so cos⁴θ = 1. L and E are functions of position on the OP and SP, respectively. These coordinates are related by m but are not typically shown in photographic formulae (see p. 107).

(Figure: natural vignetting for a 24-mm lens on 35-mm full frame.)

Natural vignetting can be compensated for when designing a lens. The vignetting profile can be described by the relative illumination curve. If known, this function can replace the cos⁴θ term.
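The cosine fourth law can be expressed directly in stops of fall-off; the sketch below does so for a few illustrative field angles and ignores any lens-specific relative illumination data.

```python
import math

def falloff_stops(theta_deg):
    """Illuminance loss in stops due to the cosine fourth law at field angle theta."""
    rel = math.cos(math.radians(theta_deg)) ** 4
    return -math.log2(rel)

for theta in (0, 10, 20, 30, 40):
    print(f"{theta:2d} deg: {falloff_stops(theta):.2f} stops of fall-off")
```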


Photometric Exposure

A photometric description of exposure is needed so that the photographer and camera metering system can make exposure decisions based on the visual properties of light. Photometric exposure H_v or H is defined as

$$ H = \int_0^t E(t') \, dt' $$

H = Et for a time-independent luminance distribution. t is the exposure duration or shutter speed. For a mechanical shutter, it is the time from when the shutter blades are half open to when they are half closed. H and illuminance E are functions of the position on the SP. Substituting for E yields the camera equation

$$ H = \frac{\pi}{4} L T \frac{t}{N_w^2} \cos^4\theta $$

Luminance L is a function of the corresponding position on the OP. These positional coordinates are related via the magnification. Photometric exposure H can be increased by
1) Increasing t, i.e., using a slower shutter speed.
2) Increasing the magnitude of the scene luminance distribution by adding natural or artificial light.
3) Lowering the working f-number (or f-number).

A higher H is beneficial for image quality because, except at low luminance levels, the largest contribution to signal noise is generally photon shot noise (p. 46), which is proportional to √H. Therefore, the signal-to-noise ratio (SNR) generally increases as √H. The maximum H that can be tolerated by the imaging sensor and analog-to-digital converter (ADC) is defined as the saturation exposure. For a given scene, an exposure strategy is needed to produce a useful response from the imaging sensor (p. 74). H is independent of the camera ISO setting (pp. 42, 76). However, raising the ISO setting decreases the exposure headroom by reducing the saturation exposure, which lowers the overall SNR (p. 51).
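A short sketch of the camera equation and the √H behaviour of the shot-noise-limited SNR, using placeholder scene and lens values:

```python
import math

def photometric_exposure(L, T, t, Nw, theta_deg=0.0):
    """Camera equation: H = (pi/4) * L * T * t / Nw^2 * cos^4(theta) (lux seconds)."""
    return math.pi / 4 * L * T * t / Nw**2 * math.cos(math.radians(theta_deg))**4

H1 = photometric_exposure(L=1000, T=0.9, t=1/125, Nw=4)    # placeholder values
H2 = photometric_exposure(L=1000, T=0.9, t=1/125, Nw=2.8)  # one stop wider aperture
print(H2 / H1)             # ~2: roughly one stop more exposure
print(math.sqrt(H2 / H1))  # shot-noise-limited SNR improves by ~sqrt(2)
```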


Exposure Control

When focus is set at infinity, the photometric exposure (p. 33) at the axial position on the SP can be written as

$$ H = \frac{\pi}{4} L T \frac{t}{N^2} $$

N must decrease or increase by a factor √2 in order to respectively double or halve H. This defines the following base-2 logarithmic series of possible N in air:

0.5  0.7  1  1.4  2  2.8  4  5.6  8  11  16  22  32  ...

Changing N by a single increment in this series is defined as an increase or decrease by one f-stop. Modern iris diaphragms allow the use of fractional f-stop increments such as 1/3 and 1/2. By combining N and the lens transmittance factor, the above equation can alternatively be written as

$$ H = \frac{\pi}{4} L \frac{t}{(T\#)^2}, \quad \text{where} \quad T\# = \frac{N}{\sqrt{T}} $$

The T-number (T#) provides a more accurate indication of the expected H and for that reason is commonly specified on cinematography lenses. However, the aperture and DoF are determined by N and not the T-number. Note that T ≤ 1, and so T# ≥ N. Along with a shutter for controlling the exposure duration, cameras use an internal metering sensor to determine appropriate exposure settings.
• In DSLR cameras, a portion of the light passing through the lens is directed onto a metering sensor that is typically located in front of the roof pentaprism.
• In mirrorless interchangeable lens and compact cameras, exposure metering is carried out directly by the imaging sensor.
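The f-number series and the T-number relation can be generated as follows; the transmittance value is an assumption for illustration, and marked f-numbers such as 5.6 and 11 are conventional roundings.

```python
import math

# Standard f-number series: each step is a factor sqrt(2), i.e., one stop in exposure.
series = [0.5 * math.sqrt(2)**k for k in range(9)]
print([round(N, 1) for N in series])   # [0.5, 0.7, 1.0, 1.4, 2.0, 2.8, 4.0, 5.7, 8.0]

# T-number for an assumed transmittance T = 0.85 at N = 2.8:
N, T = 2.8, 0.85
print(round(N / math.sqrt(T), 2))      # T# = N / sqrt(T) >= N
```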


Mechanical Shutters

Focal plane shutter: The modern FP shutter is a type of rolling shutter capable of fast shutter speeds of order 1/8000 s. It typically comprises two curtains situated in front of the SP, each composed of three horizontal blades. When the shutter is activated, the first curtain opens vertically to expose the sensor. The second curtain then closes in the same vertical direction to end the exposure.

(Figure: curtain position on the SP versus time for long and short exposure durations, showing the first curtain opening and the second curtain closing.)

The shutter traversal time is the time needed for a curtain to traverse the sensor, typically around 1/250 s. It is the same for both the first curtain opening and the second curtain closing.


A shutter speed faster than the shutter traversal time is obtained when the second curtain starts closing before the first curtain fully opens. This causes the sensor to be exposed by a moving slit that prevents the use of conventional flash (p. 106). FP shutters can cause rolling shutter distortion when there is fast scene motion since different vertical positions are not exposed at the exact same time instant.

Leaf shutter: Commonly used in compact cameras next to the aperture, the blades open from the center. The shutter traversal time is very quick, which is useful for flash, but the fastest achievable t is only about 1/2000 s since the same blades are used to open and close the shutter.


Electronic Shutters


An electronic shutter allows much faster shutter speeds than possible with a mechanical shutter. At present, electronic shutter speeds of up to 1/32000 s are possible in consumer cameras. The value depends on the row readout speed, which is the time taken to read a row of the sensor. Advantages of electronic shutters include:
1) Silent operation.
2) Absence of mechanical wear.
3) Absence of vibrations described as shutter shock.

CCD sensors can use an electronic global shutter because all rows can be read simultaneously. This has two major advantages:
1) Rolling shutter distortion is absent.
2) The shutter traversal speed is equivalent to the row readout speed (the fastest possible t). This enables flash to be used at very fast shutter speeds.

An electronic rolling shutter, used in consumer cameras with a CMOS sensor, requires each row to be read in sequence. Since each row needs to be exposed for the same time duration, the exposure at a given row cannot be started until the previous row has been read out. The shutter traversal time is therefore limited by the frame readout time, which is typically longer than the shutter traversal time of a mechanical shutter. Therefore,
• Rolling shutter distortion is more severe than that caused by mechanical FP shutters.
• The use of conventional flash is limited to shutter speeds slower than the frame readout speed.

The use of flash with an electronic rolling shutter is often disabled in cameras that also offer a mechanical shutter. An electronic first curtain shutter is a compromise solution that uses the electronic rolling shutter only to start the exposure at each row and is synchronized with a mechanical shutter that ends the exposure.

Raw Data


Imaging Sensors

Imaging sensors comprise a 2D array of sensor pixels or photosites. Each photosite contains one or more photoelements, such as a photogate (MOS capacitor) or photodiode, that convert light into stored charge. In a photogate, a polysilicon electrode is held at a positive bias above doped p-type silicon. Mobile positive holes flow toward the ground electrode and leave immobile negative acceptor impurities, creating a depletion region. When a photon is absorbed in this region, an electron–hole pair is created. The holes flow toward the ground electrode, leaving the electrons as stored charge. In a photodiode, the depletion region arises from a reverse bias applied to the n-type and p-type silicon junction.

(Figure: cross-section of a photogate photosite, showing the microlens, color filter, polysilicon gate, SiO2 layer, depletion region, p-type silicon, and ground electrode.)

A photon with wavelength λ has energy hc/λ, where h is Planck's constant. Silicon has a band-gap energy of 1.1 eV, so only photons with λ < 1100 nm can be absorbed. The internal quantum efficiency or charge collection efficiency η(λ) is the charge successfully stored expressed as a fraction of the charge generated. η(λ) = 1 for photons absorbed in the depletion region; η(λ) < 1 for photons absorbed in the silicon bulk since only a fraction will avoid recombination by successfully diffusing to the depletion region. This tends to occur at longer λ where the photons can penetrate more deeply. Although charge-coupled device (CCD) and complementary metal–oxide semiconductor (CMOS) architectures traditionally use photogates and photodiodes, respectively, photoelement varieties are increasingly being used with either. The main difference remains the charge readout strategy (p. 41).
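A quick sketch of the silicon cut-off wavelength implied by E = hc/λ and the 1.1-eV band gap (rounded physical constants):

```python
# Cut-off wavelength for absorption in silicon, from E_photon = hc / lambda.
h = 6.626e-34   # Planck's constant (J s)
c = 2.998e8     # speed of light (m/s)
e = 1.602e-19   # J per eV

E_gap_eV = 1.1                       # silicon band-gap energy
lambda_max = h * c / (E_gap_eV * e)  # longest absorbable wavelength
print(f"{lambda_max * 1e9:.0f} nm")  # ~1100 nm
```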


Color Filter Arrays

The HVS is sensitive to wavelengths between 380–780 nm. The eye contains three types of cone cells with photon absorption properties described by a set of eye cone response functions l̄(λ), m̄(λ), and s̄(λ). These different responses lead to the visual sensation of color (p. 53).

(Figure: the Bayer and Fuji® X-Trans® CFA block patterns.)

A larger number of green filters are used since the HVS is more sensitive to λ in the green region of the visible spectrum.

A camera requires an analogous set of response functions to detect color. In consumer cameras, a color filter array (CFA) is fitted above the imaging sensor. The Bayer CFA uses a 2 × 2 block pattern of red, green, and blue filters that form three types of mosaic. The filters have different spectral transmission properties described by a set of spectral transmission functions T_CFA,i(λ), where i is the mosaic label. The overall camera response is determined largely by the product of T_CFA,i(λ) and the charge collection efficiency η(λ) (pp. 37, 40). The Fuji® X-Trans® CFA uses a 6 × 6 block pattern. It requires greater processing power and is more expensive but can give improved image quality.
• An infrared-blocking filter is combined with the CFA to limit the response outside of the visible spectrum.
• The spectral passband of the camera describes the range of wavelengths over which it responds.
• In order to record color correctly, a linear transformation should in principle exist between the camera response and eye cone response functions.
• After raw data capture, only the digital value of one color component is known at each photosite. Color demosaicing estimates the missing values so that all color components are known at every photosite.


Radiometry

A photometric measure of light (p. 28) can be helpful when making exposure decisions based upon the visual properties of the scene (p. 74). However, a spectral radiometric measure of light is needed to model the charge signal generated by the imaging sensor. This is a measure of electromagnetic, rather than luminous, energy.

Quantity              Symbol    Unit
Radiant flux          Φe        W (or J/s)
Irradiance            Ee        W/m²
Radiance              Le        W/(m² sr)
Radiant exposure      He        J/m²
Spectral flux         Φe,λ      W/nm
Spectral irradiance   Ee,λ      W/(m² nm)
Spectral radiance     Le,λ      W/(m² sr nm)
Spectral exposure     He,λ      J/(m² nm)

Photometric quantities all have spectral radiometric counterparts denoted using an "e,λ" subscript. They are obtained from spectral radiometric quantities by including the standard luminosity function ȳ(λ) for human photopic vision (p. 55) and the luminous efficacy constant K_m = 683 lm W⁻¹, and integrating over the visible spectrum; e.g., luminance is obtained from spectral radiance as follows:

$$ L_v = K_m \int_{380\,\mathrm{nm}}^{780\,\mathrm{nm}} L_{e,\lambda} \, \bar{y}(\lambda) \, d\lambda $$

Full radiometric quantities are denoted using "e" (for "energetic") and are obtained from spectral radiometric quantities by directly integrating over all λ; e.g., radiance is obtained from spectral radiance as follows:

$$ L_e = \int_{-\infty}^{+\infty} L_{e,\lambda} \, d\lambda $$

In a spectral radiometric description, the lens transforms the scene spectral radiance distribution into the sensor-plane spectral irradiance distribution.
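A sketch of the luminance integral above using sampled data; the spectral radiance values and the Gaussian stand-in for ȳ(λ) are placeholders, and a real calculation would use the tabulated CIE ȳ(λ) values.

```python
import numpy as np

K_m = 683.0                            # luminous efficacy constant (lm/W)

wavelengths = np.arange(380, 781, 5)   # sample grid (nm)
# Placeholder data: a flat spectral radiance and a crude Gaussian stand-in for ybar.
L_e_lambda = np.full_like(wavelengths, 0.01, dtype=float)  # W / (m^2 sr nm)
y_bar = np.exp(-0.5 * ((wavelengths - 555) / 45.0) ** 2)

L_v = K_m * np.trapz(L_e_lambda * y_bar, wavelengths)      # luminance (cd/m^2)
print(L_v)
```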


Camera Response Functions

A photon has energy hc/λ, and so the number of photons n_ph(λ) with wavelength λ incident at a given photosite during the exposure duration t is defined by

$$ n_{ph}(\lambda) = \frac{\lambda}{hc} \Phi_{e,\lambda} \, t, \quad \text{where} \quad \Phi_{e,\lambda} = A_{det} \tilde{E}_{e,\lambda} $$

Φ_e,λ is the spectral flux incident at the photosite, Ẽ_e,λ is the average spectral irradiance at the photosite, and A_det is the effective photosite flux detection area. The average number of stored electrons n_e,i(λ) generated from the incident photons can be expressed as follows:

$$ n_{e,i}(\lambda) = n_{ph}(\lambda) \, QE_i(\lambda) \quad \text{and} \quad QE_i(\lambda) = FF \, \eta(\lambda) \, T(\lambda) \, T_{CFA,i}(\lambda) $$

QE_i(λ) is the external quantum efficiency for mosaic i [this is also known as the quantum efficiency (QE)], FF = A_det/A_p is the fill factor, A_p is the photosite area, η(λ) is the charge collection efficiency (p. 37), T(λ) is a Si-SiO2 surface layer transmission function, and T_CFA,i(λ) is the CFA transmission function for mosaic i. The set of spectral responsivity or camera response functions in units of amperes per watt are given by

$$ R_i(\lambda) = QE_i(\lambda) \frac{e \lambda}{hc}, \quad \text{where } e \text{ is the elementary charge} $$

(Figure: normalized camera response functions for the Nikon D5100 over the 380–780 nm wavelength range.)

Combining the above and integrating over the spectral passband yields the total expected electron count

$$ n_{e,i} = \frac{t A_p}{e} \int_{\lambda_1}^{\lambda_2} R_i(\lambda) \, \tilde{E}_{e,\lambda} \, d\lambda $$
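The electron-count integral can be evaluated numerically as sketched below; the QE, irradiance, photosite size, and exposure duration are all assumed placeholder values.

```python
import numpy as np

e = 1.602e-19        # elementary charge (C)
h = 6.626e-34        # Planck's constant (J s)
c = 2.998e8          # speed of light (m/s)

lam_nm = np.arange(400, 701, 10)                     # sample grid (nm)
QE = np.full_like(lam_nm, 0.4, dtype=float)          # placeholder external QE for one mosaic
R = QE * e * (lam_nm * 1e-9) / (h * c)               # responsivity R_i(lambda) in A/W

E_tilde = np.full_like(lam_nm, 1e-3, dtype=float)    # placeholder irradiance, W/(m^2 nm)
t = 1 / 125                                          # exposure duration (s)
A_p = (4e-6) ** 2                                    # 4-um photosite area (m^2)

n_e = (t * A_p / e) * np.trapz(R * E_tilde, lam_nm)  # expected electron count
print(n_e)
```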


Charge Readout

A CCD photosite typically contains up to four photogates with overlapping depletion regions. By adjusting the gate voltages through systematic clocking, charge can be transferred very quickly from one photogate to another. A parallel-to-serial horizontal shift register situated at the end of the columns transports the charge from each column to an output charge detection circuit in a serial manner, one row at a time. The figure below illustrates the charge readout mechanism as a function of time for a 2 × 2 photosite block at the corner of a CCD sensor.

(Figure: photosites, photogates, stored electrons, charge motion along the columns and shift register, and the charge detection circuit.)

In active-pixel CMOS sensors, charge detection occurs inside each photosite. Each photosite also contains connection circuitry for address and readout. Charge detection is similar in both CCD and CMOS. A charge amplifier converts the charge signal defined by Q = n_e,i e at each photosite into a voltage V_p:

$$ V_p = (G_p / C) \, Q $$

C is the sense node capacitance, and G_p is the source follower amplifier gain of order unity. The maximum voltage V_p,FWC occurs at full-well capacity (FWC),

$$ V_{p,FWC} = (G_p / C) \, Q_{FWC} $$

Q_FWC is the maximum charge that a photosite can hold. The conversion gain G_CG describes the voltage change per electron in mV/e⁻ units:

$$ G_{CG} = (G_p / C) \, e $$
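A small sketch of the charge-to-voltage relations, with an assumed sense node capacitance, source-follower gain, and FWC:

```python
# Charge-to-voltage conversion at the sense node (values are illustrative).
e = 1.602e-19          # elementary charge (C)
C_sense = 2e-15        # assumed sense node capacitance (F)
G_p = 0.9              # source-follower gain (order unity)

G_CG = (G_p / C_sense) * e          # conversion gain (V per electron)
print(G_CG * 1e6, "uV/e-")          # ~72 uV/e-

n_e_FWC = 30000                     # assumed full-well capacity (electrons)
V_p_FWC = (G_p / C_sense) * n_e_FWC * e
print(V_p_FWC, "V at FWC")          # ~2.2 V
```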


Programmable ISO Gain

Each voltage V_p produced from the charge amplifier is amplified by a programmable gain amplifier (PGA) to a useful level V that lies within the input range of the analog-to-digital converter (ADC):

$$ V = G \, V_p $$

G is a programmable gain controlled by the camera ISO setting S. The numerical value of S is determined from the JPEG output (p. 77). The ADC converts each V into a raw value that can be one of many discrete raw levels (p. 43). The raw values over the SP define the raw data. The raw data for a specific mosaic is known as a raw channel. The maximum voltage V_max occurs at FWC:

$$ V_{max} = G_{base} \, V_{p,FWC} $$

The base ISO setting S_base corresponds to the analog gain G_base that uses the least analog amplification to output the highest available raw level. When multiple S use the same G_base, the highest S defines S_base. G can be written as the product of G_base and an ISO gain:

$$ V = G_{base} \, G_{ISO} \, V_p $$

G_ISO = 1 at the base ISO setting. Each time S is doubled, G_ISO and V are doubled, and so only half the exposure H is required to produce the same V and associated raw level. This is useful when a short exposure duration is required at low scene luminance levels. So-called extended ISO settings below S_base and above a camera-dependent S value do not adjust G_ISO further but instead use over-/under-exposure and JPEG tone curve adjustment to achieve the required JPEG output. "RAW" files are proprietary camera-dependent files that can be processed using a raw converter. They contain raw data along with camera metadata and a JPEG preview.


Analog-to-Digital Conversion

(Figure: signal chain from photoelement to PGA to ADC.)

A linear ADC can be modeled as a quantization by integer part of the fraction V/V_max ≤ 1 to a raw level:

$$ n_{DN,i} = \mathrm{INT}\!\left[ \frac{V}{V_{max}} \, k \, (2^M - 1) \right] $$

n_DN,i is a raw level expressed as either a data number (DN) or identically as an analog-to-digital unit (ADU). M is the bit depth of the ADC; a 12-bit ADC provides 2¹² = 4096 raw levels ranging from DN = 0 to DN = 4095. A larger M defines a smaller quantization step represented by 1 DN or 1 ADU. k ≤ 1 accounts for situations where the maximum available raw level is less than that provided by the ADC. If the ADC can accommodate V_p,FWC, then

$$ \frac{V}{V_{max}} = G_{ISO} \frac{V_p}{V_{p,FWC}} $$

V_p is the voltage from the charge amplifier. V_p,FWC is the value of V_p at FWC. G_ISO ≥ 1 is the ISO gain, and V_p < V_p,FWC when G_ISO > 1. By substituting the voltages with electron counts n_e,i (p. 40), raw levels can be expressed in the following way:

$$ n_{DN,i} = \mathrm{INT}\!\left[ \frac{n_{e,i}}{g} \right], \quad g = \frac{U}{G_{ISO}}, \quad U = \frac{n_{e,i,FWC}}{k \, (2^M - 1)} $$

g is the conversion factor between electron counts and raw levels expressed using e⁻/DN units. For example, g = 10 implies that 10 electron counts gain 1 raw level. U is the unity gain, the value of G_ISO when g = 1 e⁻/DN. The above equations need to be modified if the raw data contains a bias offset (p. 49). For a Bayer CFA, the raw channels can be denoted R, G, B. Each channel contains n_DN,i for a specific mosaic i.
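A minimal sketch of the linear ADC model and its electron-count form; the bit depth, FWC, and gains are illustrative assumptions.

```python
def adc(V, V_max, M=12, k=1.0):
    """Linear ADC model: n_DN = INT[(V / V_max) * k * (2^M - 1)]."""
    V = min(V, V_max)                       # clip at the maximum input
    return int((V / V_max) * k * (2 ** M - 1))

def raw_level(n_e, n_e_FWC, G_ISO=1.0, M=12, k=1.0):
    """Equivalent electron-count form: n_DN = INT(n_e / g) with g = U / G_ISO."""
    U = n_e_FWC / (k * (2 ** M - 1))        # unity gain (e-/DN at G_ISO = U)
    g = U / G_ISO                           # conversion factor (e-/DN)
    return int(n_e / g), g

print(raw_level(n_e=10000, n_e_FWC=30000))            # base ISO
print(raw_level(n_e=10000, n_e_FWC=30000, G_ISO=2))   # one stop higher ISO gain
```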


Linearity

For the raw data to correctly represent the photographic scene perceived by the HVS, a camera will ideally produce raw data that is linearly related to scene luminance. Neglecting natural vignetting, this at least requires that the raw data be linearly related to irradiance at the SP.
1) The camera spectral passband must match that of the HVS. This requires an infrared-blocking filter.
2) The spectral responsivity must be a linear function of spectral exposure, R(λ) ∝ H_e,λ = E_e,λ t.
3) The voltage V_p from the charge detection circuit must be a linear function of electron count, V_p ∝ n_e (p. 41).
4) The PGA must be linear so that V ∝ V_p, where V is the voltage fed to the ADC (p. 42).
5) The ADC must be linear so that the raw values are linear functions of V up to the quantization step (p. 43).

An ideal sensor response curve expressed as a function of relative scene luminance is shown below.

(Figure: electron count versus luminance, rising linearly from the noise floor up to FWC.)

A useful response is obtained below FWC and above the noise floor (p. 44). Electron count n_e can be converted to raw level n_DN by using the conversion factor g (p. 43). For greyscale cameras, the slope changes according to the spectral characteristics of the lighting unless R(λ) ∝ ȳ(λ) (p. 39). For cameras with a CFA, the Luther–Ives condition requires that a linear transformation exists between the set of R_i(λ) and the eye cone response or color-matching functions if full color (both luminance and chromaticity) is to be correctly represented (p. 54). Consumer cameras are radiometrically linear but only approximately photometrically linear. Radiometric nonlinearities close to FWC can be characterized and utilized to increase the raw dynamic range (pp. 45, 51).

Noise and Dynamic Range


Dynamic Range

Scene dynamic range (DR) is the ratio between the highest and lowest scene luminance values. It can also be expressed in terms of the photographic stop, which is a unit on the base-2 logarithmic luminance scale. For example, a scene luminance ratio 128:1 is equivalent to log₂ 128 = 7 stops. In this case, the lowest luminance value needs to be doubled seven times to equal the highest. The f-stop (p. 34) is a specific example of a stop.

ADC dynamic range describes the maximum scene DR that can be quantized by the ADC. For an imaging sensor that gives a perfectly linear output response to scene luminance, a linear ADC can quantize M stops of scene DR, where M is the bit depth of the ADC, and 2^M is the number of raw levels.

Raw dynamic range (p. 51) is the maximum scene DR that can be represented by the raw data. It is restricted from above by either the ADC bit depth M or FWC, and from below by the read noise (p. 46). If the sensor response and ADC are both perfectly linear, the raw DR cannot be greater than M provided the noise stays above the quantization step on average. Clipping will occur if the scene DR is greater than the raw DR.

Image dynamic range is the maximum scene DR that can be represented by the encoded output image file, e.g., JPEG or TIFF. Since the image file digital output levels (p. 65) are not involved in capturing raw data, they can in principle encode an infinite DR. However, the image DR cannot be greater than the raw DR when the image file is produced from the raw file. The entire raw DR can be transferred to the image file if an appropriate tone curve is applied, even if the image file bit depth is lower than M. For accurate luminance reproduction, any nonlinearity must be compensated for by the display.

Display dynamic range is the contrast ratio between the highest and lowest luminance values that can be produced by the display medium. If the display DR is less than that of the image, the image DR can be compressed into the display DR through tone mapping, for example, by reducing display contrast.


Temporal Noise

The charges and voltages appearing on p. 42 are mean values since random fluctuations occur over short time scales due to temporal noise measured by the standard deviation σ of the associated statistical distribution. Noise power is defined by the variance σ², and so independent temporal noise sources add in quadrature:

$$ \sigma_{total} = \sqrt{ \sum_{m=1}^{M} \sigma_m^2 } $$

Three major contributions to the total temporal noise are
• Photon shot noise: Even for a time-independent scene radiance distribution, the number of photons arriving at the SP fluctuates randomly over short time scales and manifests as photon shot noise σ_e,psn, forming a Poisson distribution about the mean signal. The variance of the Poisson distribution equals the mean itself, σ²_e,psn = n_e, and so the SNR improves as the signal increases.
• Dark-current shot noise: Even in the absence of irradiance at the SP, a dark current I_dark = n_e,dark e/t and associated shot noise distribution with standard deviation σ_e,dcsn will be generated due to thermal agitation of electrons inside each photoelement. I_dark reduces the effective FWC, and subtracting I_dark will not eliminate σ_e,dcsn. In the absence of irradiance at the SP, t is referred to as the integration time.
• Read noise: Each time charge readout occurs, voltage fluctuations in the readout circuitry, which includes the charge detection circuitry and PGA, will produce read noise σ_DN,read. This defines the noise floor. To prevent visible banding artifacts in the raw data, σ_DN,read should remain above the quantization step (1 DN).

Frame averaging: Averaging M raw files taken using identical camera settings (frames) reduces temporal noise by a factor √M provided the scene luminance distribution is time-independent:

$$ \langle \sigma \rangle_M = \frac{1}{M} \sqrt{M \sigma^2} = \frac{\sigma}{\sqrt{M}} \quad \text{(units can be electrons or DN)} $$
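A sketch of quadrature addition and frame averaging; the dark-current shot noise and read noise values are assumptions chosen only to illustrate the arithmetic.

```python
import math

def total_noise(*sigmas):
    """Independent temporal noise sources add in quadrature."""
    return math.sqrt(sum(s ** 2 for s in sigmas))

n_e = 1600                              # mean signal (electrons)
sigma_psn = math.sqrt(n_e)              # photon shot noise: variance equals the mean
sigma_dcsn = 3.0                        # assumed dark-current shot noise (e-)
sigma_read = 5.0                        # assumed read noise (e-)

sigma = total_noise(sigma_psn, sigma_dcsn, sigma_read)
print(sigma, n_e / sigma)               # total noise and SNR

M = 16                                  # averaging M frames reduces noise by sqrt(M)
print(sigma / math.sqrt(M))
```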


Fixed Pattern Noise

Fixed pattern noise (FPN) arises from small variations in photosite response that do not change from frame to frame. FPN can be modeled as an offset in the raw data equations (p. 43), but it can in principle be removed.

Dark-signal non-uniformity (DSNU) refers to FPN in the absence of irradiance at the SP. The major contribution to DSNU is dark-current non-uniformity (DCNU), which arises from variations in photosite response to thermally induced dark current (p. 46). DCNU increases with integration time t. Since DSNU remains when the sensor is exposed, long-exposure noise reduction (LENR) can be activated to take a dark frame (a raw frame in the absence of irradiance at the SP) of equal t immediately after the standard frame. Subtracting the dark frame (containing DSNU and temporal noise) will remove the DSNU. Photographers can create their own DSNU templates for various t and ISO settings. If a template is made by averaging M dark frames, read noise will increase by a factor √(1 + 1/M) when the template is subtracted.

Pixel response non-uniformity (PRNU) refers to FPN in the presence of irradiance at the SP. PRNU arises due to small variations in photosite response over the SP but increases in proportion with exposure or electron count so that n_e,PRNU = k n_e. Since the total noise will be larger than the PRNU, the constant k imposes a limit on the maximum achievable SNR:

$$ \mathrm{SNR} \leq \frac{n_e}{n_{e,PRNU}} = \frac{1}{k} $$

PRNU is compensated for in astrophotography via a flat-field template constructed by averaging over M flat-field frames to reduce temporal noise, each flat-field frame being a raw frame of a uniform surface taken under uniform illumination. The template needs to be divided from the normal frame, which is then multiplied by the average raw value in the template.


Noise Units

Noise measured using the raw data will be in units of raw levels (DN or ADU). These are output-referred units because the signal and noise have been amplified by the system gain (pp. 42, 43). Output-referred units can be converted to input-referred units (electron counts) by using the conversion factor g for the appropriate ISO setting (p. 43). Noise sources such as read noise that were not part of the original detected charge can also be expressed in terms of input-referred units. By using software such as the freeware raw converter dcraw to extract the raw channels without performing the color demosaic, g can be measured experimentally.
1) Take two successive raw frames of a uniform grey card under uniform illumination using the same integration time t. Subtracting one frame from the other removes the FPN and adds the temporal noise.
2) For a given raw channel, measure the standard deviation in a fixed area of the new frame. Dividing by √2 yields σ_DN,total for either of the original frames.
3) Repeat the procedure for a range of ISO settings at various t. Plot σ²_DN,total as a function of mean n_DN at a specified ISO setting and fit the data to a straight line. Subtracting the intercept with the y axis from the plotted data yields σ²_DN,psn as a function of n_DN.
4) Since σ²_DN,psn = n_DN/g, the conversion factor g is given by the inverse of the gradient of the straight line, as sketched after the figure below.

(Figure: temporal noise squared (DN²) versus mean raw level (DN): conversion factor measurement for the green channel of the Olympus E-M1 at S = 1600 (G_ISO = 8).)
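A sketch of steps 3) and 4), using synthetic variance-versus-mean data in place of real raw frames; the conversion factor and read noise used to generate the points are assumptions.

```python
import numpy as np

# Simulated measurement points: variance of the temporal noise against the mean
# raw level for one channel at one ISO setting, generated with an assumed
# g of 8 e-/DN and a read noise of 2 DN.
rng = np.random.default_rng(0)
g_true, read_dn = 8.0, 2.0

mean_dn = np.linspace(20, 250, 12)
var_dn = mean_dn / g_true + read_dn**2 + rng.normal(0.0, 0.5, mean_dn.size)

slope, intercept = np.polyfit(mean_dn, var_dn, 1)   # straight-line fit
print(intercept)       # approximates the read-noise variance (DN^2)
print(1 / slope)       # recovered conversion factor g, close to the assumed 8 e-/DN
```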


Read Noise: Measurement


A dark frame (p. 46) will contain temporal noise and FPN, but photon shot noise will be absent. A bias frame is a frame taken with zero integration time t. Since all thermally induced noise will also be absent, a bias frame will contain only read noise. A bias frame can be approximated using a dark frame and the shortest integration time available in manual mode.

(Figure: bias-frame pixel-count histograms versus raw level for the Olympus E-M1 at ISO 6400, 12800, and 25600, centered on a bias offset of 255 DN.)

Some manufacturers leave a bias offset in the raw data. The mean of the bias frame noise distribution will then be centered at the bias offset rather than zero DN. This enables σ_DN to be measured accurately.

(Figure: read noise σ_DN and σ_e, and conversion factor g, plotted against the ISO setting for the Olympus E-M1.)


Shadow Improvement

Read noise expressed using output-referred units increases as the ISO gain G_ISO is raised. Read noise is seen to decrease as G_ISO is raised when expressed using input-referred units (p. 49). This can be explained by decomposing the read noise into separate contributions arising from circuitry upstream and downstream from the PGA. For a single-stage PGA:

Upstream read noise is amplified when G_ISO is raised. Downstream read noise is unaffected by G_ISO. Input-referred units normalize for the gain and are therefore appropriate for understanding the relative magnitude of a noise source. For a fixed exposure level H (or electron count n_e), raising G_ISO improves the SNR because only part of the total read noise is amplified. This is known as shadow improvement. However, once G_ISO reaches the camera-dependent ISO-less gain setting G_ISOless, the upstream read noise dominates, and no further noise benefit can be achieved through additional analog amplification.
1) For a fixed exposure level H, the SNR can be maximized by using the highest ISO gain (up to G_ISOless) that just utilizes the highest available raw level.
2) In situations where H is insufficient, for example, when a short t is required at low scene luminance levels, the signal-to-photon-shot-noise ratio will be lower. Compensating by increasing the ISO gain will improve read noise in the shadows, but the overall SNR will generally be lower and result in noisier raw data.

Maximizing the SNR in either of the above situations is the strategy behind the expose to the right method (p. 84). Higher ISO gains (up to G_ISOless) are less noisy. Noisier raw data and associated output images result from reduced exposure levels rather than higher gain.
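The shadow-improvement behaviour can be illustrated by splitting the read noise into assumed upstream and downstream contributions and referring the total back to the input; the two noise values below are placeholders.

```python
import math

def read_noise_input_referred(G_ISO, sigma_up=3.0, sigma_down=12.0):
    """Input-referred read noise (e-): the upstream contribution is unchanged,
    while the downstream contribution is divided by the gain when referred
    back to the input."""
    return math.sqrt(sigma_up ** 2 + (sigma_down / G_ISO) ** 2)

for G in (1, 2, 4, 8, 16, 32):
    print(G, round(read_noise_input_referred(G), 2))
# The input-referred read noise falls as G_ISO rises, then flattens once the
# upstream contribution dominates (the "ISO-less" regime).
```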


Raw Dynamic Range

Output-referred units: Raw DR per photosite is a function of ISO setting. If the raw values are perfectly linearly related to scene luminance, raw DR per photosite can be expressed using output-referred units as follows:

$$ \text{Raw DR (stops)} = \log_2 \left( \frac{n_{DN,clip}}{\sigma_{DN,read}} \right) $$

n_DN,clip is the maximum raw value minus any bias offset. σ_DN,read is the read noise at the selected ISO setting. Since σ_DN,read in DN is larger at higher ISO gains, raw DR per photosite is reduced by increasing the ISO gain G_ISO. Input-referred units can provide more insight into the relationship between raw DR and the ISO setting.

Input-referred units: At the base ISO setting, the minimum analog ISO gain G_ISO is used to output the highest available raw level (p. 42). An appropriate ADC and conversion factor g will ensure that FWC is utilized and quantized to the highest available raw level. Therefore, raw DR is maximized at the base ISO setting. By using g to convert raw levels to electron counts, the maximum achievable raw DR can be expressed as

$$ \text{Maximum raw DR (stops)} = \log_2 \left( \frac{n_{e,FWC}}{\sigma_{e,read,base}} \right) $$

n_e,FWC is the electron count at FWC. σ_e,read,base is the read noise at the base ISO setting. When G_ISO is doubled from the base value, only up to half FWC will be utilized when the ADC outputs the highest available raw level. Each further doubling of G_ISO will again halve the well capacity that can be utilized. Since log₂ 0.5 = −1, up to one stop of raw DR will be lost each time G_ISO is doubled from the base value. Due to shadow improvement, less than a stop of raw DR will be lost upon each doubling of G_ISO for ISO gains below the ISO-less gain setting G_ISOless (p. 50). Above G_ISOless, a full stop will be lost upon each doubling of G_ISO but without the benefit of shadow improvement.
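Both raw-DR expressions are simple base-2 logarithms; the sketch below evaluates them for assumed clip level, FWC, and read-noise values.

```python
import math

def raw_dr_stops(n_dn_clip, sigma_dn_read):
    """Raw DR per photosite (output-referred): log2(n_DN,clip / sigma_DN,read)."""
    return math.log2(n_dn_clip / sigma_dn_read)

def max_raw_dr_stops(n_e_fwc, sigma_e_read_base):
    """Maximum raw DR (input-referred, base ISO): log2(n_e,FWC / sigma_e,read,base)."""
    return math.log2(n_e_fwc / sigma_e_read_base)

print(raw_dr_stops(n_dn_clip=4095, sigma_dn_read=3))          # ~10.4 stops
print(max_raw_dr_stops(n_e_fwc=30000, sigma_e_read_base=4))   # ~12.9 stops
```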


Color

Color Theory

Color vision is a perceived physiological sensation in response to electromagnetic waves with wavelengths in the visible region, which ranges from approximately 380–780 nm.

(Figure: the visible spectrum from 400 nm to 750 nm.)

Color can be described by its luminance and chromaticity. Chromaticity can be subdivided into hue and saturation components. Pure spectrum colors (rainbow colors) are fully saturated. Their hues can be divided into six main regions. Each region contains many hues, and the transitions between regions are smooth.

Colors that are not pure spectrum colors have been diluted with white light and are not fully saturated. Pink is obtained by mixing a red hue with white. Greyscale or achromatic colors are fully desaturated. Light is generally polychromatic, i.e., a mixture of various wavelengths described by its spectral power distribution (SPD) P(λ). This can be any function of power at each λ, for example, spectral radiance (p. 39). Metamers are different SPDs that appear to be the same color under the same viewing conditions. The principle of metamerism is fundamental to color theory.


Eye Cone Response Functions

Most humans are trichromats, meaning the eye has three types of cone cells (long, medium, short) that act as light receptors. Their photon absorption properties are described by the eye cone response functions l̄(λ), m̄(λ), s̄(λ). According to the Young–Helmholtz theory, these different responses lead to the visual sensation of color.

Trichromacy is important because it permits metameric color matches to be made using Grassmann's laws in a way that resembles linear algebra:
• Any color can be matched by a linear combination of any three linearly independent primary colors or primaries.
• l̄(λ), m̄(λ), s̄(λ) are the amounts of the eye cone primaries l_L, l_M, l_S the eye uses to sense color at a given λ. Since there is considerable overlap between l̄(λ), m̄(λ), s̄(λ), the eye cone primaries are invisible as they cannot be uniquely stimulated and observed individually. Invisible or imaginary primaries are more saturated than pure spectrum colors.

Utilizing Grassmann's laws, integration of the product of an SPD and the eye cone response functions over the visible spectrum yields a set of tristimulus values L, M, S:

$$ L = \int P(\lambda) \bar{l}(\lambda) \, d\lambda, \quad M = \int P(\lambda) \bar{m}(\lambda) \, d\lambda, \quad S = \int P(\lambda) \bar{s}(\lambda) \, d\lambda $$

Each L, M, S triple defines a color in the LMS color space, a reference color space containing all colors. Identical L, M, S will result from metameric SPDs. Reference color spaces such as CIE XYZ (p. 55) have mathematical properties that are more useful for digital imaging. CIE XYZ is based on CIE RGB (p. 54).


CIE RGB Color Space

In 1931, before the eye cone response functions were known, the CIE (Commission internationale de l'éclairage) analyzed results from color-matching experiments that used a set of three real monochromatic primaries:

$$ \lambda_R = 700 \text{ nm}, \quad \lambda_G = 546.1 \text{ nm}, \quad \lambda_B = 435.8 \text{ nm} $$

Using Grassmann's laws, human observers were asked to visually match the color of a monochromatic light source at each wavelength λ by mixing amounts of the primaries. From the results, the CIE defined a set of color-matching functions (CMFs) denoted r̄(λ), ḡ(λ), b̄(λ).

(Figure: the CIE RGB color-matching functions r̄(λ), ḡ(λ), b̄(λ) over the 380–780 nm wavelength range, showing negative lobes.)

Since the primaries are necessarily real, negative values occur. This physically corresponds to matches that required a primary to be mixed with the source light instead of the other primaries. The CMFs are normalized such that the area under each curve is the same; adding a unit of each primary matches the color of a unit of a reference white that was chosen to be CIE illuminant E. This hypothetical illuminant has constant power at all wavelengths. Adding equal photometric or radiometric amounts of the primaries does not yield the reference white; the actual luminance and radiance ratios between a unit of each primary are 1 : 4.5907 : 0.0601 and 72.0966 : 1.3791 : 1, respectively. CIE RGB tristimulus values can be defined, and each R, G, B triple defines a color in the CIE RGB color space. This is again a reference color space. The CMFs and l̄(λ), m̄(λ), s̄(λ) are linearly related (p. 53). A different set of primaries would give a different set of CMFs obtainable via a linear transformation.


CIE XYZ Color Space

The CIE XYZ color space is an alternative reference color space that is linearly related to CIE RGB. By writing the CMFs and tristimulus values as "vectors," the linear relationship can be written in matrix form as follows:

$$ \begin{bmatrix} \bar{x}(\lambda) \\ \bar{y}(\lambda) \\ \bar{z}(\lambda) \end{bmatrix} = T \begin{bmatrix} \bar{r}(\lambda) \\ \bar{g}(\lambda) \\ \bar{b}(\lambda) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = T \begin{bmatrix} R \\ G \\ B \end{bmatrix}, \quad \text{where} \quad T = \frac{1}{0.17697} \begin{bmatrix} 0.49 & 0.31 & 0.20 \\ 0.17697 & 0.81240 & 0.01063 \\ 0.00 & 0.01 & 0.99 \end{bmatrix} $$

x̄(λ), ȳ(λ), z̄(λ) do not have any negative values, and so the primaries of CIE XYZ are imaginary.

(Figure: the CIE XYZ color-matching functions x̄(λ), ȳ(λ), z̄(λ) over the 380–780 nm wavelength range.)

The CIE XYZ tristimulus values are defined as follows:

$$ X = k \int P(\lambda) \bar{x}(\lambda) \, d\lambda, \quad Y = k \int P(\lambda) \bar{y}(\lambda) \, d\lambda, \quad Z = k \int P(\lambda) \bar{z}(\lambda) \, d\lambda $$

The Y tristimulus value directly corresponds to the luminance of the color represented by X, Y, Z since the coefficients that define ȳ(λ) correspond to the luminance ratio between the λ_R, λ_G, λ_B primary units (p. 53).
• Absolute tristimulus values are obtained by setting k = K_m = 683 lm W⁻¹. If P(λ) is spectral radiance (p. 39), then Y becomes absolute luminance L_v in cd m⁻².
• Relative tristimulus values with relative luminance Y in the range [0,1] are obtained as follows:

$$ k = \frac{1}{\int P(\lambda) \, \bar{y}(\lambda) \, d\lambda} $$


Chromaticity Diagrams

Reference color spaces contain all possible colors and are difficult to visualize. Each primary cannot be stimulated independently due to the overlap between the CMFs, and so reference color spaces do not form a cubic shape when viewed using a 3D Cartesian coordinate system. However, color spaces can be conveniently projected onto a 2D chromaticity diagram by removing the luminance information. The chromaticity coordinates represent relative proportions of the tristimulus values. The chromaticity coordinates (x, y) for CIE XYZ are defined as

$$ x = \frac{X}{X + Y + Z}, \quad y = \frac{Y}{X + Y + Z} $$

(Figure: the CIE xy chromaticity diagram, showing the spectral locus labeled with wavelengths from 470 nm to 620 nm, the illuminant E white point, the CIE RGB primaries, and the triangle of chromaticities reachable by additive mixing of those primaries.)

The XYZ primaries are at (x, y) = (0,0), (0,1), (1,0). The grey horseshoe shape defines all visible chromaticities. Pure spectrum colors are situated along the curved part of the boundary of the horseshoe. Saturation decreases with distance inwards from the boundary. The CIE RGB primaries are defined by the red, green, and blue circles. The smaller black triangle contains all chromaticities accessible through additive mixing of these primaries; negative amounts (p. 54) are required to describe the entire horseshoe area.


Color Temperature

A blackbody is an ideal object that absorbs all incident electromagnetic radiation. When in thermal equilibrium with its surroundings at constant temperature T, it emits an SPD that is a function of T only (Planck's law):

$$ L_{e,\lambda}(T) = \frac{2 h c^2}{\lambda^5} \left[ \exp\left( \frac{hc}{\lambda k_B T} \right) - 1 \right]^{-1} $$

h is Planck's constant, and k_B is Boltzmann's constant. T (in kelvin) defines the color temperature of a blackbody since it will appear red at low T and blue at high T.

(Figure: the Planckian locus on the xy chromaticity diagram, running from 1667 K through 2222 K and 4000 K to 25000 K, with illuminant E marked.)

The color of an SPD emitted by an incandescent source (illumination derived from a heated object) can be closely matched to that of a blackbody radiator at a T value defined as its correlated color temperature (CCT). Many different chromaticity coordinates have the same CCT. These can be distinguished using color tint. The xy chromaticity diagram can be transformed into the uv chromaticity diagram of the CIE 1960 UCS:

$$ u = \frac{4x}{12y - 2x + 3}, \quad v = \frac{6y}{12y - 2x + 3} $$

Isotherms defining chromaticities with the same CCT are normal to the Planckian locus in uv space. Color tint is the distance along an isotherm between (u, v) and the Planckian locus. Color tint will be magenta or red below the Planckian locus and green or amber above. The concept of CCT is valid only for distances up to a value ±0.05 from the Planckian locus in uv space.
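A small sketch of Planck's law and the xy-to-uv transform defined above; the constants are rounded and the wavelengths and temperatures are illustrative.

```python
import math

h, c, k_B = 6.626e-34, 2.998e8, 1.381e-23

def planck_spectral_radiance(lam_nm, T):
    """Blackbody spectral radiance L_e,lambda(T) from Planck's law (per metre of wavelength)."""
    lam = lam_nm * 1e-9
    return 2 * h * c**2 / lam**5 / (math.exp(h * c / (lam * k_B * T)) - 1)

def xy_to_uv(x, y):
    """CIE 1960 UCS coordinates used for CCT and tint."""
    d = 12 * y - 2 * x + 3
    return 4 * x / d, 6 * y / d

print(planck_spectral_radiance(550, 3000) / planck_spectral_radiance(550, 6500))
print(xy_to_uv(1 / 3, 1 / 3))   # illuminant E: (u, v) ~ (0.2105, 0.3158)
```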


White Point and Reference White

The CIE defined a set of standard illuminants that each have a well-defined SPD. Some examples are shown here.

(Figure: relative SPDs of CIE illuminants E, A, D65, and D50 over the 300–700 nm wavelength range.)

The white point (WP) of an SPD is defined by the chromaticity coordinates of a 100% diffuse reflector illuminated by that SPD.

Illuminant   WP (x, y)                 WP (Y = 1)                 CCT      Description
A            x = 0.4476, y = 0.4074    X = 1.0985, Z = 0.3558     2856 K   Incandescent bulb
D50          x = 0.3457, y = 0.3586    X = 0.9642, Z = 0.8252     5003 K   Horizon
D65          x = 0.3127, y = 0.3291    X = 0.9504, Z = 1.0888     6504 K   Noon
E            x = 1/3, y = 1/3          X = 1, Z = 1               5454 K   Equi-energy

Using relative tristimulus values (p. 55) in the range [0,1], the reference white of a color space is defined by the unit vector [1 1 1] in that color space. This color is obtained by adding a (normalized) unit of each primary.

Example 1: The reference white of CIE XYZ defined by [X Y Z] = [1 1 1] is illuminant E. The corresponding xy chromaticity coordinates are (1/3, 1/3).

Example 2: The reference white of sRGB (p. 60) defined by [R G B] = [1 1 1] is D65. When linearly transformed to XYZ, the WP of illuminant D65 is given by [X Y Z] = [0.9504 1 1.0888]. The corresponding xy chromaticity coordinates are (0.3127, 0.3291).


Camera Characterization

The camera raw space is defined by the camera response functions (p. 40). To reproduce color correctly, the Luther–Ives condition requires that a linear transformation exists from the camera raw space to a reference color space:

$$ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \approx C \begin{bmatrix} R \\ G \\ B \end{bmatrix} $$

Here, R, G, B are demosaiced raw values associated with a given pixel in the camera raw space. C is a color characterization matrix. The camera raw space is not a true color space in practice as the above linear transformation is approximate. This causes metameric error because metameric SPDs may stimulate different color responses from the camera. The linear transformation can be determined by photographing a color chart with known XYZ values under a CIE standard illuminant and using the error minimization procedure detailed in ISO 17321-1. White balance and color-space conversion should be disabled. The freeware raw converter dcraw can provide this type of output.

Example matrix scaling: The green raw channel is typically the first to clip. If the raw values and Y are scaled to the range [0,1] and the characterization is performed using illuminant D65, then C can be scaled so that G = 1 for the white patch maps to the D65 illumination white point,

$$ \begin{bmatrix} X = 0.9504 \\ Y = 1.0000 \\ Z = 1.0888 \end{bmatrix} = C \begin{bmatrix} R_{WP} < 1 \\ G_{WP} = 1 \\ B_{WP} < 1 \end{bmatrix}_{D65} $$

Provided white balance was disabled when performing the characterization, C can be used with arbitrary scene illumination. However, optimal results will be obtained for illumination that has a white point matching that of the characterization illuminant.


Output-Referred Color Spaces

Conventional monitors/displays that use three primaries are incapable of displaying all possible colors since they necessarily use nonnegative amounts of real primaries. A further transformation is needed from CIE XYZ to a smaller output-referred color space such as sRGB or Adobe RGB designed to view images.

(Figure: xy chromaticity diagram comparing CIE XYZ (E), ProPhoto RGB (D50), Adobe RGB (D65), and sRGB (D65); the reference white of each color space is marked.)

The primaries associated with each color space are defined by the (x, y) chromaticities at the corners of each triangle. The sRGB color space uses real primaries and is additive, and so it can be visualized as a cube using a 3D Cartesian coordinate system.

(Figure: the sRGB cube with black at (0,0,0), the D65 reference white at (1,1,1), and the red, green, and blue primaries at the corners adjacent to black.)


Raw to sRGB

The type of matrix algebra required to transform from the camera raw space to an output-referred color space can be illustrated using the raw-to-sRGB transformation. Consider illumination that has a white point matching that of illuminant D65, the sRGB reference white. The sRGB and XYZ color spaces are related as follows:

$$ \begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{D65} = M_{sRGB}^{-1} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}_{D65} $$

The "L" indicates linear sRGB tristimulus values. The "D65" subscript indicates that the illumination white point (WP) matches that of illuminant D65. The sRGB to XYZ color transformation matrix M_sRGB valid for illumination with a D65 white point is

$$ M_{sRGB} = \begin{bmatrix} 0.4124 & 0.3576 & 0.1805 \\ 0.2126 & 0.7152 & 0.0722 \\ 0.0193 & 0.1192 & 0.9505 \end{bmatrix} $$

By using the raw to XYZ matrix C determined by camera characterization (p. 59), the transformation between the camera raw space and sRGB can be written

$$ \begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{D65} = M_{sRGB}^{-1} \, C \begin{bmatrix} R \\ G \\ B \end{bmatrix}_{D65} $$

sRGB tristimulus values outside [0,1] are clipped. The final form of sRGB includes a nonlinear gamma curve (p. 67) to be applied after all linear operations. The above transformation is valid only for illumination that has a D65 white point. For other scene illumination, the matrix M_sRGB would need to be appropriately adapted using a chromatic adaptation transform (p. 62). In practice, the HVS naturally adjusts its perception of the scene illumination WP, and so a white balance procedure (p. 62) must accompany the transformation.
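A sketch of the linear part of the raw-to-sRGB transformation; the characterization matrix C is a made-up placeholder (a real one comes from the procedure on p. 59), and the piecewise sRGB gamma curve is approximated by a simple 1/2.2 power law.

```python
import numpy as np

# Inverse of the sRGB-to-XYZ matrix quoted above (valid for a D65 white point).
M_sRGB = np.array([[0.4124, 0.3576, 0.1805],
                   [0.2126, 0.7152, 0.0722],
                   [0.0193, 0.1192, 0.9505]])
M_sRGB_inv = np.linalg.inv(M_sRGB)

# Placeholder characterization matrix; a real C comes from ISO 17321-1 fitting.
C = np.array([[0.60, 0.30, 0.10],
              [0.25, 0.65, 0.10],
              [0.05, 0.15, 0.90]])

raw = np.array([0.40, 0.50, 0.30])           # demosaiced raw triple in [0, 1]
rgb_linear = M_sRGB_inv @ (C @ raw)          # camera raw space -> XYZ -> linear sRGB
rgb_linear = np.clip(rgb_linear, 0.0, 1.0)   # clip out-of-gamut values

rgb = rgb_linear ** (1 / 2.2)                # stand-in for the piecewise sRGB gamma
print(rgb)
```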


White Balance

The HVS naturally adapts to ambient illumination by using a mechanism known as chromatic adaptation (CA) to adjust the eye cone responses. Stated simply, the HVS aims to discount the chromaticity of the illuminant so that a perfect diffuse reflecting object will continue to appear as neutral white (perfectly achromatic with 100% relative luminance) under varying illumination. The adapted white is the color stimulus that an adapted observer considers to be neutral white. Camera imaging sensors cannot naturally perform CA, and so it is performed manually using a chromatic adaptation transform (CAT). CATs are used in color science to adapt the WP of an SPD to a different value. The following method is used in digital SLR cameras:
1) An algorithm is used to identify the adapted white by analyzing the raw data. This camera estimate is defined as the adopted white (AW) or camera neutral. Alternatively, the photographer can select a preset CCT suitable for the observed scene.
2) The scene illumination chromaticity is discounted by applying raw white balance multipliers to scale the raw channels. These serve to adapt the AW to [1 1 1], which is the reference white of the camera raw space. This is typically a green color since the camera raw channel primary units differ from the normalized primary units used by CIE color spaces (p. 54).
3) A color rotation matrix is applied to transform the raw data into an output-referred color space. This matrix can be derived from the characterization matrix, and each row sums to unity. Therefore, the camera raw space reference white is adapted to the output-referred color space reference white (encoding white), which is D65 for sRGB. The output image should be viewed under the recommended viewing conditions for the chosen output-referred color space.

Adapting the AW to the encoding white is known as white balance (WB).

Color

63

White Balance: Matrix Algebra

The raw to sRGB transformation (p. 61) for scene illumination with a D65 white point can be rewritten:

$$ \begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{D65} = M_R \, D_{D65} \begin{bmatrix} R \\ G \\ B \end{bmatrix}, \quad \text{where} \quad M_R \, D_{D65} = M_{sRGB}^{-1} \, C $$

M_R is a color rotation matrix (each row sums to 1):

$$ M_R = M_{sRGB}^{-1} \, C \, D_{D65}^{-1} $$

D_D65 is a diagonal white balance matrix of raw WB multipliers for D65 scene illumination:

$$ D_{D65} = \begin{bmatrix} 1/R_{WP} & 0 & 0 \\ 0 & 1/G_{WP} & 0 \\ 0 & 0 & 1/B_{WP} \end{bmatrix}_{D65} $$

1/G_WP = 1 provided C has been appropriately normalized (p. 59). Typically, 1/R_WP > 1 and 1/B_WP > 1. For scene illumination with a different white point, D_D65 can be replaced by a matrix suitable for the adopted white:

$$ \begin{bmatrix} R_L \\ G_L \\ B_L \end{bmatrix}_{D65} = M_R \, D_{AW} \begin{bmatrix} R \\ G \\ B \end{bmatrix}, \quad D_{AW} = \begin{bmatrix} 1/R_{WP} & 0 & 0 \\ 0 & 1/G_{WP} & 0 \\ 0 & 0 & 1/B_{WP} \end{bmatrix}_{AW} $$

Better accuracy can be achieved using M_R derived from a characterization performed with illuminant CCT closely matching that of the AW. Camera manufacturers typically use a set of rotation matrices, each optimized for use with an associated WB preset. Two presets from the Olympus® E-M1 raw metadata are tabulated below (divide by 256).

CCT      Scene      Multipliers      M_R (×256)
3000 K   Tungsten   296, 256, 760    [324, −40, −28; −68, 308, 16; 16, −248, 488]
6000 K   Cloudy     544, 256, 396    [380, −104, −20; −40, 348, −52; 10, −128, 374]
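A sketch applying the Cloudy preset from the table: the diagonal matrix of raw WB multipliers is applied first, followed by the rotation matrix. The demosaiced raw triple is a placeholder, and the target output space is assumed to be sRGB.

```python
import numpy as np

# Olympus E-M1 "Cloudy" preset from the table above (entries divided by 256).
multipliers = np.array([544, 256, 396]) / 256.0
M_R = np.array([[ 380, -104,  -20],
                [ -40,  348,  -52],
                [  10, -128,  374]]) / 256.0

D_AW = np.diag(multipliers)          # diagonal matrix of raw WB multipliers
raw = np.array([0.20, 0.45, 0.30])   # placeholder demosaiced raw triple

print(M_R @ D_AW @ raw)              # white balance, then rotate to the output space

# A raw triple proportional to the adopted white maps to equal output values,
# since D_AW scales it to [1 1 1] and each row of M_R sums to 1:
neutral = 1.0 / multipliers
print(M_R @ D_AW @ neutral)          # [1. 1. 1.]
```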


White Balance: Practice

A photo will appear too red/warm if the AW is higher than the real scene illumination WP and too blue/cold if lower.

(Figure: example photographs with the AW set too high and too low.)

Rather than use a scene preset CCT or the auto WB function, a more reliable AW can be obtained by taking a photo of a neutral card placed at the scene, ideally filling the field of view. The in-camera custom WB feature can use this photo to calculate and store the AW as a preset. Raw WB multipliers calculated by the camera are stored in the raw file metadata and do not directly affect the raw data. The multipliers are applied when the in-camera raw converter produces the output JPEG file. If the output image is obtained using an external raw converter, the applied raw WB multipliers can be easily changed. Such software may opt to apply its own transformation matrices and CAT models instead. If a photo of a neutral card placed at the scene is available, a custom WB selection tool may be available that can be applied to estimate the scene illumination WP. A white balance filter placed in front of the lens can give better results when setting a custom WB. The camera should be positioned at the subject location and pointed in the direction of the dominant light source.

Digital Images


Raw Conversion

Fundamental stages in the conversion of raw data into a viewable output image file such as JPEG include
1) Dark frame subtraction: The average signal from optical black photosites positioned around the edges of the sensor and shielded from light is used to estimate the dark signal to be subtracted before the raw data is written. More sophisticated methods will compensate for shading gradients as a function of operating temperature. When the dark signal is large, LENR can be used instead for better accuracy (p. 47).
2) White balance: WB is typically performed by applying multipliers to the raw channels in order to adapt the estimated illumination WP to the camera raw space reference white (p. 62). The multipliers are stored as raw metadata and do not affect the raw data.
3) Color demosaic: This is performed to estimate the missing color information at each photosite in the presence of a CFA (p. 38). In terms of image quality, it is beneficial to perform the color demosaic after WB.
4) Color space transformation: Transformation from the camera raw space to an output-referred color space such as sRGB is typically performed using a color rotation matrix derived from camera characterization. This matrix also adapts the illumination WP from the camera raw space reference white to the encoding white.
5) Image processing: Operations performed in-camera include noise reduction and sharpening.
6) Tone curve: For preferred tone reproduction, in-camera processing engines apply a nonlinear tone curve (p. 69) that differs from the gamma curve (p. 67) of the output-referred color space, often through use of a lookup table. This step is typically applied in conjunction with bit depth reduction, and the new quantized values are saved as digital output levels (DOLs, p. 66).
7) JPEG encoding: The DOLs are converted into luma Y′, which is a weighted sum of DOLs, and chroma (color difference) values Cb, Cr, which can be compressed by sub-sampling.


Digital Output Levels

Digital images are ordinarily viewed on 8-bit-per-channel displays. These have three color channels, each with 2⁸ = 256 discrete levels. Therefore, a total of 2²⁴ colors of the chosen output-referred color space can be shown. For display and storage efficiency, it is advantageous to produce 8-bit digital images from the raw data. Since the bit depth of linear raw data is typically between 10 and 16, bit depth reduction needs to be performed. The entire raw DR can be represented by 256 levels, but visible banding or posterization artifacts would appear if the levels were to be evenly allocated over the luminance range of the raw data.

(Figure: a smooth tonal gradient compared with a posterized one.)

Raw data does not suffer from posterization since the raw bit depth is chosen so that the minimum noise level exceeds the quantization step (1 DN). This has a smoothing or dithering effect on the tonal transitions. To prevent posterization when encoding 8-bit images, bit depth reduction is instead performed in a nonlinear manner using gamma encoding, which aims to efficiently utilize the 8 bits by allocating more levels to tonal transitions that the HVS can more easily perceive. Denoting linear relative tristimulus values of the chosen output-referred color space normalized to the range [0,1] by R_L, G_L, B_L, the required nonlinearity can be represented by a power law with exponent γ_E that denotes the encoding gamma:

$$ R' = R_L^{\gamma_E}, \quad G' = G_L^{\gamma_E}, \quad B' = B_L^{\gamma_E}, \quad \text{where} \quad 1/3 \leq \gamma_E \leq 1/2.2 $$

The gamma-encoded values R′, G′, B′ are subsequently quantized to the range [0,255]. These 8-bit quantized values are nonlinearly related to scene luminance and are referred to as digital output levels (DOLs). The nonlinearity introduced by gamma encoding must ultimately be canceled out by the display device.
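A minimal sketch of gamma encoding to 8-bit DOLs with a pure power-law curve (real output spaces such as sRGB use a piecewise curve):

```python
def encode_dol(value_linear, gamma_e=1 / 2.2):
    """Gamma-encode a linear tristimulus value in [0, 1] and quantize to an 8-bit DOL."""
    return round((value_linear ** gamma_e) * 255)

for Y in (0.0, 0.01, 0.18, 0.5, 1.0):
    print(Y, encode_dol(Y))
# 18% relative luminance (middle grey) maps to a DOL of ~117 rather than ~46,
# so more of the 256 levels are spent on the darker tones the HVS distinguishes well.
```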


Gamma

The physiological brightness perception of luminance by the HVS is nonlinear, and so a low relative luminance appears brighter than its value would suggest. For example, 18% relative luminance is known as middle grey since it lies midway at 50% on the lightness L* (relative brightness) scale defined by the CIE.

(Figure: 8-bit DOL and lightness L* plotted against relative luminance (%), comparing the L* curve with a γ_E = 1/2.2 encoding gamma curve.)

When reducing bit depth to 8, posterization can be minimized by using an encoding that linearly allocates DOLs according to lightness L*. Output-referred color spaces generally use a gamma curve (p. 66) with encoding gamma 1/3 ≤ γ_E ≤ 1/2.2 that is similar to L*. A gamma curve is incorporated into the definition of output-referred color spaces such as Adobe RGB (γ_E = 1/2.2) and sRGB (piecewise gamma curve). Since luminance must be presented to the viewer in a linear manner as it is in nature (γ = 1), the γ_E nonlinearity must be canceled out. This is achieved through gamma decoding by the display device.

(Figure: encoding gamma γ_E = 1/2.2, overall γ = 1, and decoding gamma γ_D = 2.2 curves.)

In principle, the decoding gamma γ_D = 1/γ_E. However, several nonlinearities may need to be compensated for in a general imaging chain, and an overall gamma slightly higher than unity is preferred to account for environmental viewing factors such as flare.


Tone Curves

An encoding gamma is a specific type of tone curve designed to yield photometrically accurate images on a display with a compensating decoding gamma. In-camera processing engines apply tone curves that differ from the gamma curve of the output color space. Output images are no longer photometrically accurate but are designed to be visually more pleasing.

(Figure: an example s-curve compared with a γ_E = 1/2.2 encoding gamma curve, plotted on linear axes (left) and on gamma-encoded axes (right).)

An s-curve boosts mid-tone contrast at the expense of highlight/shadow contrast (left figure). An example s-curve (blue) is shown in relation to an encoding gamma curve with γ_E = 1/2.2 (black). When the same data is plotted using gamma-encoded axes (right figure), the gamma curve becomes a straight line, and the characteristic shape of the s-curve is seen. The tone-curve adjustment control in raw converters typically uses axes that represent DOLs so that the default tone curve appears as a straight line. An inverted s-curve boosts contrast in the shadows and highlights at the expense of mid-tone contrast. The resulting image appearance can be described as "flat." An s-curve generally lowers image DR (p. 70) relative to the raw DR since it raises the relative scene luminance that clips to DOL = 0. Conversely, an inverted s-curve increases image DR. Although camera manufacturers typically apply s-curves, an appropriate custom tone curve can be applied in an external raw converter to transfer the entire raw DR to the output image file.


Histograms

A luminance histogram uses a linear gamma (γ_E = 1). The horizontal axis approximately represents relative scene luminance, and the vertical axis the pixel count. Although a luminance histogram is a true representation of the scene, it is typically skewed to the left and difficult to use. The corresponding image would appear correctly on a linear display but too dark on a conventional display with nonlinear decoding gamma. A raw histogram of the raw data approximately relates to a luminance histogram (p. 44).

(Figure: a linear-gamma luminance histogram with relative luminance (%) on the horizontal axis.)

An image histogram, seen on the back of a digital camera when reviewing an image, indicates the number of pixels as a function of the image DOL. It therefore represents the output JPEG image and not the scene luminance or raw data. An image histogram can be seen directly through the viewfinder in real time on a mirrorless camera or a camera with live-view capability. This can greatly help when making exposure decisions. Since an image histogram has been gamma encoded according to the gamma curve of the output-referred color space, it is much easier to interpret as it more closely relates to lightness (relative brightness). For example, the same data used for the above luminance histogram appears much more symmetrical when encoded using default DOLs of the sRGB color space.

[Figure: the same data encoded as an sRGB image histogram; pixel count versus digital output level from 0 to 255.]
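
A luminance histogram and the corresponding image histogram can be generated from the same linear data, as sketched below; the synthetic scene values are invented purely for illustration.

```python
# Sketch: binning the same linear data as a luminance histogram and as an
# sRGB-encoded image histogram. The synthetic scene data are illustrative only.
import numpy as np

def srgb_encode(y):
    y = np.asarray(y, dtype=float)
    return np.where(y <= 0.0031308, 12.92 * y, 1.055 * y ** (1 / 2.4) - 0.055)

rng = np.random.default_rng(0)
luminance = np.clip(rng.lognormal(np.log(0.18), 0.6, size=100_000), 0.0, 1.0)

# Luminance histogram: linear horizontal axis, typically skewed to the left.
lum_hist, _ = np.histogram(luminance, bins=50, range=(0.0, 1.0))

# Image histogram: 8-bit DOLs after sRGB encoding, appears more symmetrical.
dol = np.round(255 * srgb_encode(luminance)).astype(int)
img_hist = np.bincount(dol, minlength=256)
```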


Image Dynamic Range

Raw DR is the maximum scene DR that can be represented by the raw data (p. 45). Image DR is the maximum scene DR that can be represented by the DOLs of the output image file (p. 66):

  image DR (stops) = log2[ Y(max. DOL) / Y(min. non-clipped DOL) ]

The minimum non-clipped DOL is the minimum non-zero DOL that results from a non-zero relative scene luminance Y. For a standard gamma curve with encoding gamma γE, the image DR for an 8-bit output image is given by

  image DR (stops) = log2[ Y(DOL = 255) / Y(DOL = 1) ] = log2[ 1 / (1/255)^(1/γE) ]

e.g., γE = 1/2.2 can accommodate 17.6 stops of scene DR. The sRGB color space uses a piecewise gamma curve with a linear segment close to zero. The image DR is

  image DR (stops) = log2( 255 × 12.92 ) ≈ 11.7

Any tone curve that differs from the gamma curve of the output-referred color space can alter the image DR. An s-curve decreases image DR by raising Y (minimum non-clipped DOL), whereas an inverted s-curve increases image DR by lowering Y (minimum non-clipped DOL). An appropriate custom tone curve applied using an external raw converter can transfer the entire raw DR to the output image file, in which case image DR = raw DR. The image DR cannot be greater than the raw DR when the image file is produced from the raw data. When using an external raw converter, it is beneficial to work with lossless 16-bit TIFF files, which provide 2^16 = 65536 DOLs. Although the image DR again cannot be greater than the raw DR and the extra tonal transitions cannot be seen on 8-bit displays, rounding precision will improve, and possible posterization artifacts that could arise will be minimized.
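
The expressions above can be checked numerically; a short sketch:

```python
# Sketch: evaluating the image DR expressions above.
import numpy as np

def image_dr_gamma(gamma_e, max_dol=255):
    """Image DR (stops) for a pure power-law encoding gamma."""
    return np.log2(1.0 / (1.0 / max_dol) ** (1.0 / gamma_e))

def image_dr_srgb(max_dol=255):
    """Image DR (stops) for the piecewise sRGB curve (linear slope 12.92 near zero)."""
    return np.log2(max_dol * 12.92)

print(round(image_dr_gamma(1 / 2.2), 1))   # 17.6 stops
print(round(image_dr_srgb(), 1))           # 11.7 stops
```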


Image Display Resolution

The pixel count of a digital image is the total number of pixels. It is often expressed as n = nh × nv, where nh and nv are the horizontal and vertical pixel counts. The image display resolution is the number of displayed image pixels per unit distance, e.g., pixels per inch (ppi). It is a property of the display device/medium:
1) For a monitor/screen, the image display resolution is defined by the screen resolution, which typically has a value such as 72 ppi, 96 ppi, etc.
2) For a hard-copy print, the image display resolution can be chosen by the user; 300 ppi is considered enough for high-quality prints viewed at Dv.
Note that image display resolution is independent of the printer resolution, which is the number of ink dots per unit distance used by a printer, e.g., dots per inch (dpi). For a given printer technology, a larger dpi can yield a better-quality print. The image display size is defined as

  image display size = pixel count / image display resolution

The print size is the image display size for a print. Example: the image display size for a digital image viewed on a 72-ppi monitor will be 300/72 ≈ 4.17 times larger than that of a 300-ppi print of the same image. Images cannot be "saved" at a specific ppi since image display resolution is a property of the display, but a tag can be added to the image metadata indicating a desired ppi for the printer software. This tag does not affect the image pixel count, and the value can be overridden. In order to match a desired image display size for a given image display resolution (or vice versa), the pixel count needs to be changed through image resampling. Printer software will automatically perform the required resampling when printing an image. Notably, this does not change the pixel count of the stored image file. Image editing software can be used to resample an image. In contrast to printing, the resampled image needs to be resaved, which will alter its pixel count.
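
The display-size relation is simple to evaluate; a minimal sketch with arbitrary example numbers:

```python
# Sketch: image display size = pixel count / image display resolution.
def display_size_inches(pixel_count, ppi):
    return pixel_count / ppi

width_px = 6000                               # example horizontal pixel count
print(display_size_inches(width_px, 300))     # 20.0-inch-wide print at 300 ppi
print(display_size_inches(width_px, 72))      # ~83.3 inches wide on a 72-ppi monitor
```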


Color Management: Raw Conversion

When an image is transferred between different devices, color management aims to preserve the real-world color appearance as much as possible. A prerequisite for successful color management is the use of an external hardware profiling device to create an International Color Consortium (ICC) display profile that describes the color response of the monitor/display device via a mapping to the Profile Connection Space (CIE LAB or XYZ). The display profile can be utilized by all color-managed software.

In-camera raw conversion: A tag is added to the JPEG metadata indicating the appropriate ICC profile for the chosen output-referred color space, e.g., sRGB or Adobe RGB. This choice does not affect the raw data. The ICC profile allows the image DOLs to be correctly interpreted. sRGB is an appropriate choice for displaying images on standard-gamut monitors and on the internet. Adobe RGB is more suitable than sRGB for displaying images on wide-gamut monitors and for printing. Colors that can be displayed by a hardware device define its gamut. Multi-ink printers can have large gamuts. Wide-gamut monitors can display colors outside of the sRGB color space; this is particularly useful when using a raw converter or editing software.

External raw conversion: Raw converters use an internal working space, which is usually a camera raw space ICC profile or a reference color space. An output-referred color space needs to be selected for displaying and saving the output image, and only colors contained within that color space that are also contained within the display gamut can be seen on the monitor/display. Wide Gamut RGB and ProPhoto RGB are larger than Adobe RGB and are suitable for editing/displaying images on wide-gamut monitors and for high-quality printing. ProPhoto RGB cannot be seen in its entirety on a conventional three-channel monitor since two of its primaries are imaginary.


Color Management: Optimal Workflow

When using a raw converter followed by image editing software for postprocessing, an optimal workflow that aims to preserve as many colors as possible from the captured raw data requires an appropriate color strategy:
1. Use a wide-gamut monitor/display and ICC profile.
2. Select a large output-referred color space (e.g., ProPhoto RGB) when saving the output image from the raw converter, ideally as a 16-bit TIFF file to prevent posterization. Unlike JPEG files, lossless TIFF files can be repeatedly saved without loss of information.
3. The image editing software working space that defines the accessible colors when editing an image should be set to be the same color space as that chosen in step 2.
4. Select the same color space chosen in step 3 when saving the processed image, ideally as 16-bit TIFF.
The file from step 4 can be archived and a JPEG file produced whenever required. Different applications require specific adjustments such as resampling, sharpening, or conversion to a suitable color space. Only colors contained in both the working space from step 3 and the monitor gamut can be seen when editing. Even if there are colors in the working space that cannot be seen on the monitor/display, those colors will be visible on a print if they lie within the printer gamut. When converting the image file from step 4 into a more suitable color space, software such as Adobe Photoshop® (via "Convert to Profile") will perform gamut mapping with an applied rendering intent. Perceptual intent is recommended when converting to a smaller color space, e.g., conversion to sRGB before saving as an 8-bit JPEG for the internet. Although any out-of-gamut colors will be clipped, perceptual intent will shift the in-gamut colors to preserve color gradations. Use of a working space larger than the final chosen output-referred color space allows more of the scene colors to be preserved by minimizing clipping until the final gamut-mapping stage.


Standard Exposure Strategy

Average Photometry

Although appropriate exposure settings ultimately depend upon the strategy of the photographer, the Camera & Imaging Products Association (CIPA) DC-004 and International Standards Organization (ISO) 12232 standards provide a standard exposure strategy based upon average photometry that can be used as a starting point for further adjustments. This requires the following:
1) A reflected-light meter for measuring the arithmetic average scene luminance 〈L〉.
2) A measure of the sensitivity of the DOLs (of the camera JPEG output) to incident photometric exposure at the SP. This is provided by the ISO setting S.
3) Knowledge of either the lens f-number N or the required exposure duration t.

[Figure: lightness (0–100) versus relative luminance (%); 18% relative luminance corresponds to 50% lightness.]

A "typical scene" is assumed to have a scene luminance distribution that approximately averages to 18% relative luminance, i.e., 〈L〉/Lmax ≈ 18%. This value is equivalent to 50% lightness or middle grey (p. 67). Provided standard output sensitivity (SOS) defines S, standard exposure strategy ensures that 〈L〉 for a typical scene is correctly reproduced as middle grey in the JPEG output, irrespective of the camera JPEG tone curve. Japanese camera manufacturers are required to determine S with SOS or recommended exposure index (REI) introduced by CIPA in 2004, based on the camera JPEG output, not the raw data. Unlike the earlier noise- and saturation-based ISO speed introduced in ISO 12232, SOS does not guarantee a minimum SNR or stop highlights from clipping. Instead, it aims to ensure that a typical scene is reproduced with the correct mid-tone lightness.


Reflected-Light Meter Calibration

Standard exposure strategy uses reflected-light metering to estimate 〈H〉, the arithmetic average photometric exposure at the SP. The 〈H〉 value that is considered to yield a well-exposed photograph satisfies

  〈H〉 S = P

S is the ISO setting, which corresponds to the sensitivity of the DOLs in the JPEG output to exposure at the SP. P is the photographic constant; it was obtained by analyzing the acceptability of photographs covering a wide subject and luminance range to observers in terms of exposure. The ISO 12232 standard uses P = 10. It can be argued that P = 10 corresponds to 〈L〉/Lmax ≈ 18%, i.e., the 〈L〉 for a typical scene will be approximately 18% of the maximum scene luminance. Reflected-light meters measure 〈L〉, the arithmetic average scene luminance. In order to estimate 〈H〉 given 〈L〉, the camera equation (p. 33) is rewritten as

  〈H〉 = q 〈L〉 t / N²

The constant q = 0.65 ensures proportionality between 〈L〉 and 〈H〉. It combines various assumptions about a typical lens and includes an "effective" cos⁴θ value. The two equations above can be combined to give the reflected-light meter equation:

  t / N² = K / (〈L〉 S),  where  K = P/q = 10/0.65 = 15.4

K is the handheld reflected-light meter calibration constant. A tolerance is applicable, and K = 14 is often used in practice. (ISO 2720 uses 10.6 ≤ K ≤ 13.4 based upon an obsolete definition of film speed; this range should be multiplied by 10/8.) In-camera through-the-lens metering uses P directly. Any combination of t, N, S that satisfies the reflected-light meter equation is regarded as a suitable exposure setting. For a given metered scene, all such combinations have the same exposure value (p. 78).
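
A hedged sketch of the reflected-light meter equation in use; the metered luminance below is an assumed example value.

```python
# Sketch: solving t / N^2 = K / (<L> S) for the exposure duration t.
def metered_exposure_duration(N, S, avg_luminance, K=14.0):
    """t in seconds for f-number N, ISO setting S, and average luminance <L> in cd/m^2."""
    return K * N**2 / (avg_luminance * S)

# Assumed bright overcast scene with <L> ~ 1000 cd/m^2, ISO 100, f/8:
print(metered_exposure_duration(N=8, S=100, avg_luminance=1000))   # ~0.009 s (~1/112 s)
```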


Camera Sensitivity

The camera ISO setting S controls the sensitivity of the DOLs recorded in the camera JPEG output to incident exposure at the SP. The value depends upon three factors:
1) The sensitivity of the imaging sensor to light. This depends primarily upon its QE as this determines the rate at which the electron well of a photoelement fills with electrons in response to spectral exposure at the SP (pp. 39, 40). The QE remains fixed.
2) The adjustable analog ISO gain controlled by the PGA that amplifies the voltage signal before it is quantized by the ADC into a raw level (pp. 42, 43). Raising the analog gain can improve the SNR in the shadow regions but lower exposure headroom and raw DR (pp. 50, 51).
3) Digital gain applied when converting the raw levels into DOLs recorded in the JPEG file. This includes the gamma curve and any further tone curve manipulation.
For a given QE and fill factor, the photosite area (or sensor pixel count) does not affect the rate at which a photoelement fills with electrons, and so S does not depend upon photosite area. An analogy can be made with the fact that different-sized buckets fill with rainwater at the same rate.

Although S is not a performance metric, note that
• A higher QE and fill factor can raise the base ISO setting Sbase because the electron wells will fill more quickly.
• A higher FWC per unit sensor area can lower Sbase as more time will be needed for the electron wells to fill.
• A low Sbase provides greater exposure headroom. This can offer a higher peak SNR and raw DR provided the low base value arises from a high FWC per unit sensor area rather than poor QE.


Standard Output Sensitivity

Although meter calibration ensures that 〈H〉S = 10, digital gain is utilized when producing an output JPEG file, and methods to determine ISO settings must specify the nature of the corresponding JPEG digital output required. Standard output sensitivity (SOS) is based upon a measurement of the photometric exposure required to map 18% relative scene luminance to 50% lightness on the gamma curve of the output-referred color space. Example: 50% lightness (L*) on the gamma curve of the sRGB color space corresponds with DOL = 118.

[Figure: 8-bit sRGB DOL (left axis) and lightness L* (right axis) versus relative luminance (%); 50% lightness corresponds to DOL = 118.]

In this case, defining S using SOS ensures that 18% relative luminance always maps to DOL = 118, irrespective of the JPEG tone curve used by the camera. Since 〈L〉/Lmax ≈ 18% for a typical scene, all typical scenes will be reproduced at the same standard mid-tone JPEG lightness provided the metering is based on average photometry. SOS (and REI) give camera manufacturers the freedom to place metered 18% relative luminance at any position on the sensor response curve when defining S. For example,
• Highlight DR and shadow DR provided by the JPEG tone curve above and below middle grey can be freely adjusted since the measured S will adjust accordingly.
• S below the base value can be defined by overexposing at the base analog gain setting, and then adjusting the JPEG tone curve to obtain the correct mid-tone lightness.
• Recommended exposure index (REI) gives camera manufacturers the freedom to define their own requirements for the JPEG digital output.


Exposure Value

The reflected and incident-light metering equations (pp. 75, 82) can be rewritten as the APEX equation (additive system of photographic exposure) designed to simplify manual calculations:

  Ev = Av + Tv = Bv + Sv = Iv + Sv

• Ev is the exposure value (not to be confused with the illuminance Ev).
• Av = log2 N² is the aperture value.
• Tv = log2 (1/t) is the time value.
• Bv = log2 (〈L〉/(0.3K)) is the brightness value.
• Iv = log2 (E/(0.3C)) is the incident-light value.
• Sv = log2 (S/3.125) is the speed value.

These are associated with specific N, t, 〈L〉, S. For example,

  N    0.5   0.7   1    1.4   2     2.8   4      5.6    etc.
  Av   −2    −1    0    1     2     3     4      5
  t    4     2     1    1/2   1/4   1/8   1/16   1/32   etc.
  Tv   −2    −1    0    1     2     3     4      5

Bv and Iv depend upon the choice of reflected light and incident light meter calibration constants K and C. Suitable exposure settings for a typical scene are provided by any combination of N, t, S that yields the recommended Ev. A difference of 1 Ev defines a photographic stop (pp. 34, 45). An f-stop specifically relates to a change of Av. Exposure compensation (EC, p. 81) can be included in the APEX equation by modifying the brightness value:

  Bv → Bv − EC

Positive EC compensates when Bv is too high by reducing Ev and therefore increasing 〈H〉. Negative EC compensates when Bv is too low by raising Ev and therefore decreasing 〈H〉. The APEX equation is valid provided the metering is based on average photometry. It is not valid when using in-camera matrix/pattern metering modes (p. 82).
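
The APEX bookkeeping can be verified numerically; in the sketch below the scene luminance is an assumed example value and K = 14.

```python
# Sketch: APEX values for a bright sunny scene (assumed <L> ~ 4000 cd/m^2).
import math

def apex_values(N, t, avg_luminance, S, K=14.0):
    Av = math.log2(N**2)                        # aperture value
    Tv = math.log2(1.0 / t)                     # time value
    Bv = math.log2(avg_luminance / (0.3 * K))   # brightness value
    Sv = math.log2(S / 3.125)                   # speed value
    return Av, Tv, Bv, Sv

Av, Tv, Bv, Sv = apex_values(N=16, t=1 / 125, avg_luminance=4000, S=100)
print(round(Av + Tv, 1), round(Bv + Sv, 1))  # ~15.0 and ~14.9: the "sunny 16" Ev
```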

Practical Exposure Strategy


Aperture Priority Mode

Aperture priority mode is a useful exposure mode for general photography as it provides direct control over the f-number N. The camera provides the shutter speed t according to the metered Ev (p. 78) and the selected S. The photographer can select N according to the required DoF and then increase the ISO setting S if the t provided by the camera is too slow to prevent unwanted motion blur due to camera shake or if t is too slow to prevent subject motion blur. To prevent camera shake with the 35-mm full frame format, the reciprocal rule of thumb states that t (in seconds) should be quicker than the reciprocal of the focal length (in mm), i.e., t < 1/fE. With other formats with focal-length multiplier R (p. 19) and/or if M stops of image stabilization are available, the reciprocal rule is

  t < 2^M / (R fE)

The reciprocal rule originated before digital cameras. If the aim is to maximize resolution (p. 120) and a tripod is not available, a quicker t is needed for cameras with high pixel counts. This is because camera shake introduces a motion blur component MTF that will reduce the camera system resolving power (p. 120) if its cutoff frequency drops below the sensor Nyquist frequency (p. 116). Diffraction softening (pp. 111, 112) increases at higher f-numbers. If a photograph is viewed under standard viewing conditions (p. 22), then diffraction softening becomes noticeable at f-numbers higher than N = 11 with the 35-mm full frame format or the equivalent N with other formats (p. 125). Auto-ISO will automatically raise S if the metered t becomes slower than some minimum speed (i.e., maximum time). There may be an option to manually choose the minimum speed, or the camera may automatically choose the minimum speed according to the reciprocal rule or other proprietary algorithm.
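
The reciprocal rule is easy to evaluate for a given setup; the example values below are arbitrary.

```python
# Sketch: slowest handholdable t from the reciprocal rule t < 2^M / (R * f_E).
def reciprocal_rule_limit(focal_length_mm, R=1.0, M=0):
    """Upper limit on t (seconds) for focal length f_E, focal-length multiplier R,
    and M stops of image stabilization."""
    return 2**M / (R * focal_length_mm)

# 50-mm lens, focal-length multiplier 1.5, 3 stops of stabilization:
print(reciprocal_rule_limit(50, R=1.5, M=3))   # ~0.107 s
```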


Shutter Priority and Manual Mode

Shutter priority or time priority mode is best suited for fast action photography as it provides direct control over the shutter speed t for a selected S. The camera provides N according to the metered Ev (p. 78). Experience of the subject matter is required for choosing an appropriate t. The camera will notify the user if the metered Ev cannot be achieved at the selected t and lowest available N. In this case the user may wish to increase S. Auto-ISO will perform this task automatically.

Shutter priority mode with a fast shutter speed was used to freeze the appearance of the moving action.

Manual mode allows N, t, S to be freely and independently adjusted. The camera viewfinder will indicate the difference between the metered Ev and the user Ev defined by the selected N, t, S. If available, auto-ISO will automatically adjust S to achieve the metered Ev for the selected N and t. This is useful when N and t both need to remain fixed. If available, exposure compensation (p. 81) will replace the metered Ev with a new target value.


Exposure Compensation

Standard exposure strategy determines an appropriate exposure value (Ev) for typical scenes where the luminance distribution averages to 18% relative luminance.

[Figure: a model luminance histogram plotted against relative luminance (%) (left) and the corresponding sRGB image histogram plotted against DOL from 0 to 255 (right).]

A model luminance histogram that averages to 18% relative luminance is shown above left. The peak occurs slightly to the left of 18% since the histogram is skewed. The corresponding sRGB image histogram is shown above right. Since 18% relative luminance maps to DOL = 118 in sRGB, the peak occurs slightly to the left of DOL = 118. Since the DOLs are gamma encoded, the histogram appears much more symmetrical. For non-typical scenes where the scene luminance distribution does not average to 18% relative luminance, standard exposure strategy will produce an output JPEG with an incorrect mid-tone lightness, and so the image histogram will be shifted towards the left or right. Although this may be desirable for aesthetic reasons, an incorrect mid-tone lightness can be corrected by using exposure compensation (EC), and/or using an advanced form of exposure metering (p. 82).
• When 〈L〉/Lmax < 18%, standard exposure strategy will lead to overexposure. Negative EC will compensate by pushing the image histogram to the left. (An example is a scene with back lighting due to the excess shadows.)
• When 〈L〉/Lmax > 18%, standard exposure strategy will lead to underexposure. Positive EC will compensate by pushing the image histogram to the right. (An example is a snow scene due to the abundance of highlight regions.)
EC can be measured in stops as a modification to the metered Bv and Ev (p. 78).


Advanced Metering

Instead of performing a traditional arithmetic average of the whole scene luminance distribution, alternative metering methods can be used when a scene is not typical. Center-weighted metering gives greater weighting to the central area of the scene when calculating 〈L〉. Spot metering (either in-camera or with a handheld meter) uses only a small chosen area of the scene to calculate 〈L〉. Selecting an object known to be middle grey will ensure the object is recorded as middle grey. Most manufacturers have stopped offering an in-camera traditional average metering mode; center-weighted or matrix metering is typically the default.

Matrix metering (or pattern/evaluative metering) is an in-camera reflected-light metering mode that evaluates multiple parts of the scene. It does not function using average photometry and instead may compare scenes with a database. Consequently, the reflected-light meter equation does not apply. Matrix metering essentially provides its own EC to the results of average photometry; however, this EC value cannot be predicted.

Incident-light metering follows a different strategy from reflected-light metering. A handheld incident-light meter is used at the subject location and pointed in the direction of the camera. The meter records the illuminance E incident on the subject. Suitable combinations of t, N, S satisfy the incident-light meter equation

  t / N² = C / (E S)

C is the incident-light meter calibration constant. Incident-light metering is useful when an important subject must be exposed correctly. It is more reliable than reflected-light spot metering since the subject luminance does not need to be middle grey; the subject will be exposed correctly according to its own % reflectance.


Clipping

Even if a scene is typical in that its luminance distribution averages to 18% relative luminance, modern standard exposure strategy (p. 74) aims to produce a photograph that has the correct mid-tone lightness only and does not attempt to preserve highlight/shadow detail. This can lead to clipping of scene luminance data. The following image histogram shows an output JPEG image where both the shadows and highlights are clipped.

[Figure: image histogram (pixel count versus digital output level, 0–255) of an output JPEG with both the shadows and highlights clipped.]

The scene DR that can be reproduced in the output JPEG file is restricted first by the raw DR and second by the raw DR that can be accommodated by the default JPEG tone curve. If the raw DR is sufficient but the default JPEG tone curve clips the highlights or shadows, then
1) An external raw converter can be used to apply a custom tone curve that will transfer more of the raw DR to the output image file. This can be performed in conjunction with the expose-to-the-right technique for maximizing raw DR (p. 84). Some of the latest cameras allow a custom tone curve to be applied in-camera.
If the scene DR exceeds the raw DR, then standard exposure strategy will lead to a loss of shadow or highlight detail even if the scene is typical. The photographer can
2) Use EC and/or advanced metering to preserve the luminance range considered to be more important in portraying the scene, e.g., the highlights or shadows. The lightness of the over/underexposed regions of the image can be corrected afterwards using editing software or, preferably, by using an external raw converter if working with the raw file.
3) For scenes with a bright sky, use a graduated neutral density filter (p. 89) to compress the scene DR.
4) Preserve more of the scene DR by using high dynamic range techniques (p. 85).


Expose to the Right

Rather than produce an output JPEG file with the correct mid-tone lightness, the expose-to-the-right (ETTR) method ensures that the raw histogram (p. 68) is pushed as far to the right as possible without the highlights clipping. This maximizes raw DR by utilizing as much of the sensor response curve as possible. The method should be implemented in the following order:
1) Maximize the photometric exposure distribution over the SP, i.e., select the base ISO setting and then use the lowest N and longest t that the photographic conditions allow without the highlights clipping. Since higher exposure yields a better SNR (pp. 34, 121), this maximizes the overall SNR and raw DR.
2) Increase the ISO setting S if the exposure distribution has been maximized but the raw histogram can still be pushed farther to the right before the highlights clip. Each doubling of S will remove access to the uppermost stop of the sensor response curve (pp. 51, 121) and push the histogram to the right by 1 Ev, but raw DR will not be reduced. Provided S remains below the ISO-less setting, raw DR will increase (pp. 50, 51).
After taking a photo using ETTR, an output image file can be obtained using an external raw converter. The mid-tone lightness may need to be reduced to the correct level, but doing so will preserve the improved SNR. The improved raw DR can also be preserved provided a custom tone curve is applied (pp. 69, 70). Although third-party custom firmware can show raw histograms, camera manufacturers only provide image histograms for the JPEG output (p. 68). In order to use ETTR with a JPEG histogram, an external raw converter can reveal if there is any extra highlight headroom available on the sensor response curve that is not utilized by the default camera JPEG tone curve. If the difference between the raw and JPEG clipping points is found to be M stops, the photographer can allow the JPEG highlights to clip by M stops when implementing ETTR.


High Dynamic Range: Theory

When the scene DR exceeds the raw DR, high dynamic range (HDR) imaging can be used to capture more of the scene DR by combining multiple frames of the same scene taken at different Ev by using different time values (p. 78). Ignoring offsets and quantization, a raw level is proportional to the average irradiance at the corresponding sensor pixel (p. 40):

  nDN ∝ t Ẽe

The proportionality factor depends upon SPD, camera responsivity, photosite area, and elementary charge. Dividing the raw levels by t yields a range of scaled or relative irradiance values. The range can be extended by taking multiple frames, each using a different t. The relative irradiance ranges from each frame will overlap so that the irradiance at a given pixel could be deduced from any of the frames covering that value. However, each frame will have a different noise level since each was taken using a different Ev. For a given pixel, rather than use the frame with the lowest noise, a gain in the SNR can be achieved by frame averaging (p. 46) but with appropriate weighting factors. The optimum weighting depends upon all noise sources. If only photon shot noise is included, the optimum weighting for frame i is simply its exposure duration t(i):

  Êe = [ Σi t(i) Ẽe(i) ] / [ Σi t(i) ]

All relative irradiance values obtained from any clipped nDN will be incorrect. These will ideally be omitted using techniques that consider the noise distribution. The HDR relative irradiance distribution for all pixels will be proportional to the relative scene radiance distribution (neglecting natural vignetting, p. 32) and relative scene luminance, provided the linearity conditions are satisfied (p. 44). Absolute luminance can be estimated from maximum and minimum measurements taken using a light meter at the scene. Linear HDR data needs to be efficiently encoded. File formats include Radiance (.hdr) and OpenEXR (.exr).
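
A minimal sketch of the exposure-weighted merge described above; the array shapes, clipping threshold, and function name are illustrative assumptions.

```python
# Sketch: exposure-weighted HDR merge. Clipped raw levels are excluded; the
# remaining per-frame relative irradiances (raw / t) are averaged with weights t(i).
import numpy as np

def merge_hdr(raw_frames, exposure_times, clip_level=16383):
    """raw_frames: sequence of 2D arrays of raw levels; exposure_times: t(i) in seconds."""
    num = np.zeros_like(raw_frames[0], dtype=float)
    den = np.zeros_like(raw_frames[0], dtype=float)
    for raw, t in zip(raw_frames, exposure_times):
        valid = raw < clip_level                           # omit clipped raw levels
        num += np.where(valid, raw.astype(float), 0.0)     # t * (raw / t) = raw
        den += np.where(valid, t, 0.0)
    return num / np.maximum(den, 1e-12)                    # weighted relative irradiance
```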


High Dynamic Range: Tone Mapping

HDR images are linear, and their DR far exceeds that of conventional displays. HDR images can be shown directly on a true linear HDR display. In the future, such displays will be available on the consumer market. To show an HDR image on a conventional low dynamic range (LDR) display, the DR of the HDR image needs to be compressed into that of the display device through application of a tone-mapping operator (TMO). Tone-mapping an HDR image will preserve the extra highlight/shadow scene information, but the resulting tone-mapped image will not be photometrically correct as it cannot appear as a linear representation of the scene luminance distribution. TMOs applied in the context of HDR imaging generally aim to reduce the visual effect of this disparity by rendering a visually pleasing image.

Global TMOs: These operate identically on all image pixels, irrespective of their position in the image. Simple examples include general tone curves (p. 69) and linear contrast reduction by the display device. (Unlike a general tone curve, gamma encoding/decoding aims to efficiently encode the scene luminance distribution as 8-bit DOLs rather than compress the scene DR.)

Local TMOs: These operate locally so that the effect at a given pixel depends upon the surrounding pixels. Although global contrast must ultimately be reduced in order to compress the scene DR, local TMOs preserve contrast locally. This technique takes advantage of the fact that the HVS is more sensitive to the correct appearance of local contrast in comparison with the (photometrically incorrect) luminance levels throughout the tone-mapped image. A drawback of local TMOs is that they can produce undesirable artifacts, for example, halos around edges that are already of high contrast.

Tone-mapped images are encoded using DOLs (p. 66) and stored using conventional formats such as JPEG. They should not be confused with linear HDR images.


High Dynamic Range: Example

Due to back lighting, negative EC was required to preserve the highlights in the sun. This rendered the shadows much too dark. The shutter speed was t = 1/2500 s.

A frame taken at +2 Ev using a shutter speed t = 1/640 s reveals more shadow detail but clips the highlights.

A frame taken at +4 Ev using a shutter speed t = 1/160 s completely clips the highlights in order to reveal the full shadow detail.

Merging the three raw files into a linear HDR image and then applying a local TMO yields an output image with both highlight and shadow detail visible.


Neutral Density Filters

A neutral density filter (ND filter) reduces the light transmitted by the lens and therefore the irradiance at the SP. Equivalently, an ND filter reduces the exposure value Ev. This enables a longer exposure duration to be used than would ordinarily be possible. Labeling systems include
• Filter factor (transmittance). An "NDn" or "ND ×n" filter has fractional transmittance T = 1/n. This reduces Ev by log2 n stops. For example, ND32 has T = 1/32, which reduces Ev by 5 stops.
• Optical density. An "ND d" filter has an optical density d = −log10 T. This is a base-10 logarithmic measure, and so Ev is reduced by log2(10^d) stops. For example, ND 1.5 reduces Ev by 5 stops.
• ND 1-number, where the digits after the 1 give the reduction in stops. Example: ND 105 reduces Ev by 5 stops.
Situations where ND filters are useful include
• Shallow DoF in bright daylight conditions: A low N may require a shutter speed t faster than that achievable by the shutter. An ND filter can prevent overexposure and associated clipping by enabling a slower t to be used.
• Creative motion blur effects: Landscape photographers often use ND filters to add creative motion blur, for example, to smooth the flow of water.
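
The label-to-stops conversions listed above, evaluated in a short sketch:

```python
# Sketch: converting ND filter labels to the reduction in Ev (stops).
import math

def stops_from_filter_factor(n):        # e.g., ND32 -> 5 stops
    return math.log2(n)

def stops_from_optical_density(d):      # e.g., ND 1.5 -> ~5 stops
    return math.log2(10**d)

print(stops_from_filter_factor(32))                 # 5.0
print(round(stops_from_optical_density(1.5), 2))    # 4.98
```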


Graduated Neutral Density Filters

A graduated neutral density filter (GND) reduces the exposure distribution for the sky relative to the foreground. This has two main uses:
1) To darken the sky or reveal detail in the foreground for artistic effect. Although these effects can also be created using a digital filter, a physical GND filter can give an improved SNR in the region with increased exposure.
2) To prevent clipping when the scene DR exceeds the raw DR. Note that this effect cannot be recreated using a digital filter after the raw data has been recorded.
Since the raw data will no longer be a photometrically linear representation of the scene, the use of a graduated ND filter can be viewed as a form of tone mapping (p. 86) that compresses more of the scene DR into the raw DR. This can circumvent the need to construct an HDR image by taking multiple frames.
• Hard-edge GND filter. This provides a hard transition between the dark and clear areas and is useful when the scene has a horizon. The transition line is typically halfway. The lightness of foreground objects projected above the horizon may need to be corrected digitally.
• Soft-edge GND filter. This provides a more gradual transition between the dark and clear areas. The effective transition line is again typically halfway.
• Attenuator or blender GND filter. This provides a gradual transition over the entire filter and is useful when the scene does not contain a horizon.
• Reverse GND filter. The darkest area is located midway, and the lightest is at the top. This is useful for sunrise and sunset with a horizon.
• Center GND filter. This is a reverse GND filter mirrored at the transition line and is useful when the sunrise or sunset is reflected onto water.
Although circular GND filters are convenient, composition may be restricted since the transition line is fixed. Rectangular GND filters require a lens adaptor and holder, but the transition line can be shifted vertically.


Polarizing Filters: Theory

Light is a transverse wave with electric field oscillations perpendicular to the direction of propagation z and oriented according to the electric field vector E. This vector can be resolved into x and y components with amplitude A:

  Ex = A cos θ,  Ey = A sin θ

Direct sunlight is unpolarized, and so the angle θ fluctuates randomly. All orientations are equally likely. Light reflected from a dielectric surface will in general be partially plane polarized. Orientations closer to a preferred θ perpendicular to the reflected beam and surface normal will occur with greater probability. Light incident upon a dielectric surface at Brewster's angle will be completely plane polarized upon reflection. The electric field oscillates in the single plane with fixed orientation θ. In a general coordinate system, plane-polarized light will be oriented at a fixed angle that defines the plane of polarization.

[Figure: the electric field vector E resolved into Ex and Ey components, and a wave propagating along z with its plane of polarization along the y axis.]

A polarizing filter only transmits oscillations along a single plane defined as the plane of transmission. If unpolarized light is incident upon a filter with a plane of transmission oriented in the x direction, then Ey = 0, but θ can fluctuate. The time-averaged irradiance is halved:

  Ee = 〈|Ex|²〉 = A² 〈cos² θ〉 = A²/2

If the incident light is completely plane polarized, then (Malus' law)

  Ee = |Ex|² = A² cos² θ

θ is the angle between the planes of polarization and transmission. An ideal filter can now transmit between 0% and 100%.


Polarizing Filters: Practice

The utility of the polarizing filter is that the ratio between unpolarized light and any partially or completely plane polarized light entering the lens can be altered. Dielectric surfaces from which partially or completely plane polarized light emerges include glass, wood, leaves, paint, and water. Rotating the filter plane of transmission to eliminate plane polarized light emerging from glass or water can eliminate the observed reflections. Eliminating reflections from leaves can reveal their true color. Light arriving from clouds will in general be unpolarized due to repeated diffuse reflections (p. 99) and will be unaffected by a polarizing filter. Light arriving from a blue sky will in general be partially plane polarized due to Rayleigh scattering from air particles (pp. 99, 101). Therefore, rotating the filter to reduce the partially plane polarized light will darken the blue sky relative to any sources of unpolarized light. The darkening effect is most noticeable along the vertical strip of sky that forms a right angle with the sun and the observer since maximum polarization occurs at a 90° scattering angle from the sun (or other light source).
• A plane or linear polarizing filter should not be used with digital SLR cameras as it will interfere with the autofocus and metering systems. These utilize a beamsplitter that functions using polarization.
• A circular polarizing filter (CPL) can be used with digital SLR cameras in place of a linear polarizing filter. A CPL functions as an ordinary linear polarizer but modifies the transmitted (plane polarized) light so that E rotates as a function of time and traces out a helical path. This prevents the transmitted light from entering the beamsplitter in a plane polarized state.
A polarizing filter should be removed from the lens in low light conditions since an ideal polarizing filter only transmits 50% of all unpolarized light, equivalent to use of a 1-stop neutral density filter (p. 88).


Polarizing Filters: Example

Here, the polarizing filter has darkened the blue sky relative to the clouds and swan.

Here, the polarizing filter has eliminated the reflections from the red leaves and revealed their true color.

Lighting


Front Lighting

The direction from which the light source illuminates the subject or scene has a profound effect on its appearance. The direction can be broadly categorized as from the front, back, or side. A combination of these may be present. Frontal or front lighting causes shadows to fall behind the subject since the light source is behind or above the camera. This leads to characteristics such as
• Lower scene DR.
• Accurate colors.
• Limited information about 3D form or surface texture.
• Shadows under facial features. Depending on the subject, these may need to be removed using fill flash.

Scenes illuminated by front lighting are most likely to be "typical scenes" that do not require exposure compensation (EC) when metered using average photometry (p. 74). Obtaining a satisfactory exposure is straightforward, even under harsh lighting (p. 99).


Front Lighting: Example

Lashi Lake, Yunnan, China

Yongzhou, Hunan, China


Side Lighting

Side lighting illuminates certain areas of the scene while other areas are cast under shadow. This leads to characteristics such as
• Shadows are directed sideways and will appear longer when the sun is lower in the sky.
• Dramatic appearance.
• Higher scene DR.
• 3D form or depth and surface texture will be revealed.
Side lighting is commonly used for product and still-life photography. In order to prevent major subjects from inappropriately falling under shadow, side lighting is best suited for simple compositions. Soft diffused side lighting (p. 99) is more appropriate for complex scenes.

Shangri-La, Yunnan, China

When using on-camera frontal flash, an improvement in the 3D form and appearance of shadows and surface texture can be gained by mounting the flash on a side bracket. Remotely triggered off-camera flash can also act as a side light.


Side Lighting: Example

Luoping, Yunnan, China

Side lighting reveals the depth and 3D form.

Llyn Gwynant, Snowdonia, Wales

Side lighting reveals the surface texture.


Back Lighting

Back lighting from a light source behind the main subject casts the main subject under shadow. The following are characteristics of back lighting:
• Dramatic scene appearance overall.
• Very high scene DR.
• Very limited color information about the main subject.
• Very limited information about the 3D form or surface texture. The main subject may appear as a silhouette.
• Flare may arise. This can be reduced by a lens hood.
• The light source can appear through translucent objects.
• Object edges can be revealed.
A back light can be added to enhance edges and illuminate translucent features for product and still-life photography. Fashion/portrait photography uses a back light to add highlight edges to a model's hair.

Four Mile Bridge, Holy Island, Wales

The abundance of shadows typically requires negative EC to be applied, particularly if the highlights of a bright light source must be preserved. If detail in the shadows is also needed, HDR techniques can be employed (pp. 83, 85).


Back Lighting: Example

Trearddur Bay, Holy Island, Wales


Diffuse Lighting

Light that reaches the subject directly is described as hard or harsh lighting. It leads to the following:
• Strong and distinct shadows and highlights.
• Increased subject contrast and scene DR.
Light that reaches the subject indirectly is described as soft or diffuse lighting. It leads to the following:
• Shadows and highlights with softer edges since the subject is illuminated more evenly.
• Lower subject contrast and scene DR.
Natural light can be diffused in three main ways:
1) Diffuse reflection from surfaces. Diffuse reflecting materials have an irregular molecular structure. When the incoming light causes the molecules to vibrate, the light will be emitted in many different directions.
2) Scattering from small objects such as cloud droplets (e.g., Mie scattering).
3) Rayleigh scattering from air molecules in the atmosphere. Since these are much smaller than the wavelength of the incoming light, the waves will be diffracted and emerge as spherical waves.

[Figure: the Sun and the Earth, with observer positions A (sun overhead) and B, C (sun low in the sky, sunlight traveling a longer path through the atmosphere).]

Although harsh lighting is desirable in certain situations, diffuse lighting is generally preferred. Light is naturally softer in the early morning and evening as it will have been scattered to a greater extent over the longer distance required to reach the observer (B and C above).

Flash lighting is direct unless softened by using a dedicated diffuser or by bouncing from a diffuse surface. Larger diffusers produce greater softening.


Diffuse Lighting: Example

Four Mile Bridge, Holy Island, Wales

Light cloud acts as a natural diffuser.

Trearddur Bay, Holy Island, Wales

Diffused evening side lighting due to scattering.


Sunrise and Sunset

[Figure: the Sun and the Earth with observer positions A, B, and C; the atmospheric layer is denser closer to the Earth's surface.]

• A significant portion of the sunlight that reaches an observer on the Earth does so through Rayleigh scattering (p. 99) by air molecules in the sky.
• Unlike ordinary scattering, Rayleigh scattering is found to be proportional to 1/λ⁴, where λ is the wavelength.
• Light at the violet end of the visible spectrum is scattered to a much greater extent due to its shorter wavelength. During the daytime when the observer is at a position corresponding to A in the figure above, the sky appears pale blue due to the overall mixture of scattered colors. (These are mainly blue and green since violet itself is partly absorbed by ozone in the stratosphere. Also, the photopic eye response falls off rapidly into the violet.)
• In the early morning or late evening, the observer will be at position B or C. In these cases, the sunlight must travel a much greater distance through the atmosphere to reach the observer. Furthermore, the atmosphere is denser closer to the Earth's surface. These combined occurrences cause the blue end of the spectrum to scatter away by the time the sunlight reaches the observer, and so the remaining mixture of wavelengths appears orange, and eventually red.
• Rayleigh scattering can be enhanced by pollutants such as aerosols and sulfate particles close to the Earth's surface provided their size is less than λ/10.
• Mie scattering (p. 99) from dust and a mild amount of cloud providing diffuse reflection (p. 99) can both enhance the appearance of any colored light that has already undergone Rayleigh scattering, thus creating a more dramatic sunrise or sunset.


Sunrise and Sunset: Example

Sunrise in Puzhehei, Yunnan, China

Sunset in Holy Island, Wales


Flash Lighting

Low-light flash can be defined as the use of flash as the main light source. Without flash, a handheld sharp photo could not easily be taken. The exposure is affected by
1) f-number (N) and ISO setting (S).
2) Flash duration. This becomes the effective shutter speed; it is controlled by the flash output power, which can be set to a fraction of the maximum.
Studio flash photography in the absence of continuous lighting belongs in the low-light category.

Fill flash can be defined as the use of flash in the presence of enough ambient light such that a handheld sharp photo could be taken without flash. The exposure is affected by
1) f-number (N) and ISO setting (S).
2) Flash duration.
3) Shutter speed.
Fill flash can fill in shadows when the lighting is harsh, for example under the midday sun. Conversely, fill flash can increase subject contrast when the lighting is too diffuse, for example on an overcast day. The flash duration to shutter speed ratio affects the flash to ambient light ratio. A longer flash duration increases the influence that the flash light has on the main subject. A longer exposure duration increases the effect of the ambient light. Depending on the photographic scene, any of the subject, background, or foreground may or may not be under illumination by ambient light.

Advantages of flash: Flash is much more powerful and versatile than continuous lighting, and it offers much greater control over lighting ratios. Multiple flash units can provide additional side and back lighting to achieve the desired aesthetic look. These can be triggered wirelessly by the on-camera flash or command unit. It is often beneficial to improve flash quality by softening (p. 99) through a dedicated diffuser or by bouncing from a diffuse surface.


Flash Guide Number

Flash duration is short and of fixed value for a given flash output power, and flash reach depends only upon
1) f-number N.
2) ISO setting S.
3) Flash output power.
Flash reach can be increased by lowering N, raising S, or raising flash output power since these all increase the image DOL corresponding to a fixed subject location. If the flash head has a flash zoom level in mm that can match the lens focal length and camera AFoV, flash reach will also increase at higher zoom levels.

Flash guide number (GN) is the maximum distance (specified in either meters or feet) at which a subject can be satisfactorily illuminated by flash when N = 1, S = 100, and the flash is set at full output power. A flash zoom level may also be specified. The ISO 1230 standard for determining guide numbers is based upon the calibration for an incident light meter reading at the subject location (p. 82). According to the inverse square law, illuminance at a subject location is inversely proportional to the square of the distance from the subject to the light source. Since photometric exposure at the camera SP similarly has an inverse square relationship with N (strictly Nw), the guide number can be expressed in the following way:

  GN = d × N

d is the maximum subject distance from the flash source. d and N have a reciprocal relationship; GN is a constant. The GN at other S or flash output power is given by

  GN(S, PF) = d × N × √(S/100) × √PF

PF ≤ 1 is the power factor corresponding to the chosen power setting (a fraction of the maximum output power). Guide numbers are independent of shutter speed as this only affects the digital output due to ambient light.
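
A sketch of flash reach from the guide-number relations above; the GN, ISO setting, and power factor are example values.

```python
# Sketch: maximum subject distance d = GN(S, PF) / N from the relations above.
import math

def max_flash_distance(gn_iso100_full_power, N, S=100, PF=1.0):
    gn = gn_iso100_full_power * math.sqrt(S / 100) * math.sqrt(PF)
    return gn / N

# Example: a GN 40 (metres, ISO 100, full power) flash at 1/4 power, ISO 400, f/4:
print(max_flash_distance(40, N=4, S=400, PF=0.25))   # 10.0 m
```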


Manual Flash

Manual flash refers to the manual determination of suitable exposure settings when using flash.

Low light: When a single flash unit is used manually in low light, the GN equation can be utilized when selecting suitable N, S, and PF. The flash duration ordinarily becomes the effective shutter speed t, and very long t are required for artistic slow-sync effects.

Fill flash: The GN is useful for ensuring the subject receives sufficient illumination. However, t is typically selected on a trial-and-error basis in order to achieve a suitable flash to ambient light ratio. This is because
1) The camera metering does not account for the flash.
2) The subject may already be under illumination by ambient light.
3) A darker or brighter background/foreground may be desired for artistic effect.
The camera is typically set in manual mode when using manual flash. If set in aperture or shutter priority modes, EC will alter the influence of the ambient light. Manual settings are required to control studio flash units. Trial and error is inevitable when using multiple units, and an incident-light flash meter can help.

Use of flash in low light with a low guide number produces a dark background.


Digital TTL Flash Metering

Digital TTL flash metering refers to recent "through the lens" automatic flash metering systems offered by camera manufacturers as an alternative to manual flash. Information is transmitted between the camera and flash unit. After the camera has metered the scene, digital TTL flash typically performs additional metering in the presence of a brief pre-flash fired after the shutter has been activated and before the main flash fires. Information from both readings is combined to determine the appropriate flash output power. The system may consider the lens to subject distance and focus point. Unlike manual flash, changing the ISO setting may also change the flash to ambient light ratio. The flash to ambient light ratio chosen by the system can be overridden. Whereas conventional EC can be used to alter the influence of the ambient light, flash exposure compensation (FEC) can independently alter the influence of the flash by adjusting its output power away from the metered value. EC and FEC are in principle equivalent in low light. Guide numbers are not needed, but they remain useful for evaluating flash unit capability.

Digital TTL flash is convenient but has several drawbacks. The reflected-light nature of the measurement leads to the same issues faced by standard reflected-light metering when the scene is non-typical (pp. 81, 82). Different proprietary systems may not offer the same functionality and may be implemented differently, and so the result varies amongst the camera manufacturers. Flash is conventionally fired as soon as the shutter fully opens, but this can cause the flash-lit subject to appear to overtake itself when using a long shutter speed for slow-sync creative effects. Rear-curtain sync mode solves the issue by firing the flash just before the shutter starts to close. In this mode, the pre-flash used by digital TTL flash metering may become perceptible due to the delay between the pre-flash and main flash.


Sync Speed

The shutter speed t is typically set much slower than the flash duration itself when using flash. Shutter speeds quicker than the sync speed should not be used, otherwise image artifacts can arise due to conflict between the flash and the shutter method of operation.

Mechanical focal plane shutter: Since very quick t are obtained only when the second shutter curtain starts to close before the first curtain fully opens (p. 35), the sync speed is the quickest t at which the first curtain can fully open before the second curtain needs to start closing, on the order of 1/250 s. This ensures that the flash can be fired when both curtains are fully open. Quicker t would shield part of the scene from the flash and cause a dark band to appear in the image.

Mechanical leaf shutter: Since this is positioned next to the lens aperture, the sync speed is limited only by the quickest possible t, typically on the order of 1/2000 s.

Electronic global shutter and CCD sensor: The sync speed is limited only by the quickest electronic shutter speed available or by the flash duration.

Electronic rolling shutter and CMOS sensor: Although quicker shutter speeds t are available compared to a mechanical shutter, exposing all sensor rows to the flash requires limiting the sync speed to the total frame readout time due to the rolling nature of the shutter (p. 35). This is typically too slow to be useful.

Electronic global shutter and CMOS sensor: Available on scientific cameras, the sync speed is limited only by the quickest electronic t or by the flash duration.

If a t faster than the sync speed is required with a focal plane shutter, e.g., when a low N is needed for shallow DoF, the high-speed sync (HSS) mode fires the flash continuously at low power throughout the entire duration t in order to eliminate shielding artifacts. In this case, the effective flash duration and t are the same. Nevertheless, "high speed" refers to shutter speed t and not the effective flash duration; high-speed photography requires a single flash of very short duration, such as 1/32000 s.


Image Quality: Resolution

Linear Systems Theory

The radiometric form of the camera equation (p. 33) for the spectral irradiance at position (x, y) on the SP is

  Ee,λ(x, y) = (π/4) Le,λ(x/m, y/m) T cos⁴θ / Nw²

This describes the ideal spectral irradiance distribution since a point on the OP will not in reality be imaged as a point on the SP. Various camera system components will introduce blur so that light originating from a point on the OP will be distributed over an area surrounding the ideal image point on the SP. The blur due to a given camera component is described by its point spread function (PSF). The camera system PSF denoted by h(x, y) describes the total point spread for the camera system. For a camera system to be treated as linear shift invariant (LSI), the following two conditions must hold:
1) A linear mapping must exist between the input and output functions f(x, y) and g(x, y), where
• f(x, y) = Ee,λ(x, y), the ideal spectral irradiance.
• g(x, y) is the real spectral irradiance.
2) The camera system PSF denoted by h(x, y) must not change functional form when shifted over the SP.
For an LSI system, g(x, y) is obtained as a convolution:

  g(x, y) = f(x, y) ∗ h(x, y)

Convolving the ideal spectral irradiance distribution with h(x, y) can be understood by making an analogy with the sliding of a blur filter over a digital image. Mathematically, the convolution operation is defined by

  g(x, y) = ∫∫ f(x′, y′) h(x − x′, y − y′) dx′ dy′

with both integrals taken from −∞ to ∞.

In practice, h(x, y) will vary over the SP, and so the system must be treated as a set of subsystems, each corresponding to a region that is approximately LSI.
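
The convolution g = f ∗ h can be sketched numerically; here a normalized Gaussian stands in for the camera system PSF purely for illustration (the actual contributions are discussed on the following pages).

```python
# Sketch: blurring an ideal irradiance distribution by convolution with a PSF.
# A unit-volume Gaussian is used as a stand-in PSF for illustration only.
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size=15, sigma=2.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    h = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return h / h.sum()                          # volume normalized to unity

f = np.zeros((64, 64))
f[32, 32] = 1.0                                 # ideal image of a point on the OP
g = fftconvolve(f, gaussian_psf(), mode="same") # real (blurred) irradiance
```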


Point Spread Function

Mathematically, a point in 1D can be represented by a delta function. This is zero everywhere except at x₀:

  δ(x − x₀) = 0 when x ≠ x₀

A delta function defines an area of 1 so that its height approaches infinity as its width becomes infinitesimal:

  ∫ δ(x − x₀) dx = 1  (integral from −∞ to ∞)

Delta functions are denoted graphically by an upward arrow of height equal to 1. A point in 2D is defined by δ(x − x₀) δ(y − y₀). Hypothetically, an ideal camera system PSF is given by

  h(x − x₀, y − y₀) = δ(x − x₀) δ(y − y₀)

In this case, there would be no point spread, and the input and output functions would be identical, g(x, y) = f(x, y). There will always be point spread in a camera system. The shape and extent of the camera system PSF determines the nature of the total blur. The camera system PSF is obtained as a convolution of the individual component PSFs:

  h(x, y) = h₁(x, y) ∗ h₂(x, y) ∗ ... ∗ hₙ(x, y)

The main contributions to the camera system PSF are
1) Lens PSF (p. 113). The main contributions to the lens PSF are from aperture diffraction described by the diffraction PSF (p. 111) and from lens aberrations.
2) Sensor PSF. The main contribution to the sensor PSF is the detector-aperture PSF (p. 115), which arises from the fact that the photosites are not point objects.
3) Optical low-pass filter PSF (p. 118).
The positional coordinates x, y appearing in the output function g(x, y) (the real spectral irradiance) will later be replaced by a set of sampling points that correspond to the photosite centers (p. 116). The sampled g(x, y) replaces Ẽe,λ when calculating the electron count (p. 40).


Modulation Transfer Function

The optical image at the SP can equivalently be treated as a linear combination of sinusoids of various spatial frequencies, phases, and amplitudes. The spatial frequency representation is obtained as the Fourier transform (FT) from real space:

$$F(\mu_x, \mu_y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, \exp\!\left[-2\pi i(\mu_x x + \mu_y y)\right] dx\, dy$$

The spatial frequency components μx, μy are in units of cycles per unit distance on the SP, e.g., cycles per mm. A convolution in the real domain is equivalent to a multiplication in the Fourier domain. This provides a simple relationship between F(μx, μy) and G(μx, μy), the FTs of the ideal and real irradiance distributions at the SP (p. 108):

$$G(\mu_x, \mu_y) = F(\mu_x, \mu_y)\, H(\mu_x, \mu_y)$$

The system optical transfer function (OTF) H(μx, μy) is the FT of the camera system PSF and also the product of the individual component OTFs, H_n(μx, μy).

If a target sinusoid at a specified μx on the OP is imaged, the ideal and real image sinusoids at the SP will both have the same spatial frequency μx/m and dc bias (mean value). However, the modulation depth (amplitude ÷ dc bias) of the real sinusoid will be reduced compared to that of the ideal sinusoid.

(Figure: ideal and real image sinusoids of spatial period 1/μx, showing the amplitude and dc bias.)

The depth reduction is a function of μx, μy described by the modulation transfer function (MTF):

$$\mathrm{MTF}(\mu_x, \mu_y) = |H(\mu_x, \mu_y)| \quad \text{with} \quad 0 \le \mathrm{MTF}(\mu_x, \mu_y) \le 1$$

The MTF describes sinusoidal contrast transfer. Since most of the point spread is over a small blur spot surrounding the ideal image point, the MTF will drop from 1 only slightly at low μx, μy but significantly at high μx, μy as the spatial period approaches the size of the blur spot.
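Since the OTF is the FT of the PSF and the MTF is its modulus, the MTF of any sampled PSF can be sketched with a 2D FFT; the Gaussian test PSF, grid size, and sample spacing below are assumptions for illustration:

```python
import numpy as np

# Sampled test PSF (Gaussian stand-in), grid spacing dx in mm.
dx = 0.001                                    # 1-um sample spacing on the SP
x = (np.arange(256) - 128) * dx
X, Y = np.meshgrid(x, x)
psf = np.exp(-(X**2 + Y**2) / (2 * 0.003**2))
psf /= psf.sum()

# OTF = FT of the PSF; MTF = |OTF|, normalized to 1 at zero frequency.
otf = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(psf)))
mtf = np.abs(otf)
mtf /= mtf.max()

mu = np.fft.fftshift(np.fft.fftfreq(256, d=dx))   # cycles per mm
print(mu[128:133])                                # first few mu_x values
print(mtf[128, 128:133])                          # MTF along the mu_x axis
```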


Diffraction PSF

An ideal camera system MTF that remains at MTF = 1 for all μx, μy corresponds to a delta-function system PSF (p. 109). This is not achievable in practice. A major contribution to the lens PSF arises from the wave phenomenon of diffraction, which causes light to spread out upon passing through the lens aperture. The diffraction PSF can be derived using wave optics; diffraction cannot be described using ray optics. For incoherent illumination, the diffraction PSF due to a circular lens aperture in air with focus set at infinity is

$$h_{\mathrm{diff}}(x, y; \lambda) \propto \left[\frac{2 J_1(\alpha)}{\alpha}\right]^2, \quad \text{where } \alpha = \frac{\pi\sqrt{x^2 + y^2}}{\lambda N} = \frac{\pi r}{\lambda N}$$

J₁ is a Bessel function of the first kind. The central spot of diameter 2.44λN is known as the Airy disk (p. 120). The Airy disk diameter increases as the f-number N increases, which leads to increased diffraction softening. The f-number at which the softening becomes noticeable on the viewed output photograph depends upon the viewing conditions, which can be specified in terms of the CoC diameter (p. 22).

The volume under each component PSF is normalized to unity. (Each OTF is normalized to unity at (0,0)). A lens is described as diffraction limited if its performance is limited only by diffraction. Residual aberrations lead to a total lens PSF that differs from the diffraction PSF (p. 113).
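A numerical sketch of the incoherent diffraction PSF above, using scipy.special.j1; the wavelength and f-number values are assumptions chosen only for illustration:

```python
import numpy as np
from scipy.special import j1

def airy_psf(r_um, wavelength_um=0.55, N=8.0):
    """Incoherent diffraction PSF [2*J1(a)/a]^2 for a circular aperture,
    with a = pi*r/(lambda*N); the value at r = 0 is 1 by the limit."""
    a = np.pi * np.asarray(r_um, dtype=float) / (wavelength_um * N)
    out = np.ones_like(a)
    nz = a != 0
    out[nz] = (2 * j1(a[nz]) / a[nz]) ** 2
    return out

r = np.linspace(0, 20, 401)          # radial distance on the SP (um)
psf = airy_psf(r)

# First zero of the Airy pattern at r = 1.22*lambda*N (Airy disk radius).
print(1.22 * 0.55 * 8.0)             # ~5.4 um -> Airy disk diameter ~10.7 um
```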


Diffraction MTF

The diffraction MTF corresponding to h_diff is defined by

$$|H_{\mathrm{diff}}(\mu_r; \lambda)| = \frac{2}{\pi}\left[\cos^{-1}\!\left(\frac{\mu_r}{\mu_{c,\mathrm{diff}}}\right) - \frac{\mu_r}{\mu_{c,\mathrm{diff}}}\sqrt{1 - \left(\frac{\mu_r}{\mu_{c,\mathrm{diff}}}\right)^2}\,\right] \quad \text{for } \mu_r \le \mu_{c,\mathrm{diff}}$$

Here μr = √(μx² + μy²) is the radial spatial frequency, and μc,diff = (λN)⁻¹ is the diffraction cutoff frequency.

(Figure: diffraction MTF at λ = 550 nm plotted against spatial frequency μr from 0 to 800 cycles/mm for N = 1.4, 2.8, 5.6, 11, and 22.)

• μc,diff defines where the diffraction MTF drops to zero. Scene information corresponding to μx, μy above μc,diff on the SP cannot be imaged by the lens, and so μc,diff places a fundamental limit on resolution (p. 120). The achievable resolution decreases as the f-number increases.
• Since the system MTF is the product of the component MTFs, the diffraction MTF reduces the system MTF at all μx, μy below μc,diff. This is known as diffraction softening, as it can lower perceived image sharpness (p. 121). The effect is greater as the f-number N increases.
• μc,diff depends upon λ. The overall cutoff depends upon the nature of the SPD when integrating over the spectral passband; polychromatic MTF curves have been weighted according to a specified SPD.
• Diffraction softening becomes noticeable at around N = 16 on the 35-mm full-frame format when a photograph is viewed under standard viewing conditions (p. 22). The same level of diffraction softening is observed at the equivalent f-number on other formats (p. 125).
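The diffraction MTF and its cutoff can be evaluated directly from the expression above; the 550-nm wavelength matches the figure, while the f-numbers and the 40 lp/mm evaluation frequency are example values:

```python
import numpy as np

def diffraction_mtf(mu_r, wavelength_mm, N):
    """Diffraction MTF vs radial spatial frequency mu_r (cycles/mm);
    zero above the cutoff mu_c = 1/(wavelength*N)."""
    mu_c = 1.0 / (wavelength_mm * N)
    s = np.clip(mu_r / mu_c, 0.0, 1.0)
    return (2 / np.pi) * (np.arccos(s) - s * np.sqrt(1 - s**2))

lam = 550e-6                                   # 550 nm expressed in mm
for N in (1.4, 2.8, 5.6, 11, 22):
    mu_c = 1.0 / (lam * N)
    print(f"N = {N:>4}: cutoff = {mu_c:7.1f} cycles/mm, "
          f"MTF at 40 lp/mm = {diffraction_mtf(40.0, lam, N):.3f}")
```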


Lens PSF

Residual lens aberrations lead to a total lens PSF that differs from the diffraction PSF.

(Figure: eight measured lens PSFs, labeled (1)–(8).)

The above figure shows real lens PSFs obtained using a microscope (reproduced from Nasse, 2008). The white square represents an 8.5-μm photosite.
• PSF (7) shows diffraction-limited performance, as the PSF arises mostly from lens aperture diffraction alone.
• PSFs (1)–(6) show typical effects of aberrations at large EP diameters and at the edge of the lens circle.
• PSF (8) includes the effects of an optical low-pass filter (p. 117). Mathematically, this PSF is a convolution of the lens PSF with the optical low-pass filter PSF (p. 118).


Aberrations vary with image height (ih), the radially symmetric field position between the OA and lens circle. Rotational symmetry dictates that the longest or shortest elongation of a non-circular PSF will always be parallel or perpendicular to the radial direction. Distinct sagittal (radial) and meridional (tangential) lens MTF curves are obtained (p. 114), depending on whether the target sinusoid is oriented parallel or perpendicular, respectively, to the radial direction.


Lens MTF

Lens MTF curves originate either from lens design calculations or, preferably, from direct measurements. The effects of aberrations on the diffraction MTF are typically illustrated in two ways:
• Lens MTF plotted as a function of spatial frequency for a set of three chosen image heights. This is suitable for illustrating the cutoff frequency μc. Separate sagittal (radial) and meridional (tangential) curves are usually shown (p. 113).
• Lens MTF plotted as a function of image height for a set of three important spatial frequencies. On 35-mm full frame, the maximum chosen is 40 lp/mm on the SP since higher frequencies cannot be resolved by the HVS when the photograph is viewed under standard viewing conditions (p. 22). The maximum value scales with the focal-length multiplier on other formats (pp. 19, 125).

The lens MTF at any given spatial frequency cannot be greater than the diffraction MTF at that frequency. Aberrations do not formally alter μc; however, they can reduce the lens MTF at high spatial frequencies to such an extent that the effective cutoff is lower.

(Figure: diffraction MTF × ATF plotted against normalized spatial frequency for RMS wavefront errors of 0.00, 0.07, 0.12, and 0.18.)

The general effect on the cutoff can be shown graphically by the product of the diffraction MTF and the aberration transfer function (ATF), which describes the overall effect of aberrations at a chosen image height in terms of a single number: the RMS error along the wavefront emerging from the XP. Values <0.07 indicate diffraction-limited performance.
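The Field Guide does not reproduce the ATF expression here; a widely used empirical form attributed to R. Shannon, ATF(ν) ≈ 1 − (W_rms/0.18)²[1 − 4(ν − 0.5)²] with ν the spatial frequency normalized to the diffraction cutoff, can be used to sketch curves like those in the figure (the W_rms values below simply mirror the figure legend and are otherwise arbitrary):

```python
import numpy as np

def diffraction_mtf_norm(v):
    """Diffraction MTF vs spatial frequency v normalized to the cutoff."""
    v = np.clip(v, 0.0, 1.0)
    return (2 / np.pi) * (np.arccos(v) - v * np.sqrt(1 - v**2))

def atf(v, w_rms):
    """Empirical aberration transfer function (Shannon approximation);
    w_rms is the RMS wavefront error in waves."""
    v = np.clip(v, 0.0, 1.0)
    return 1.0 - (w_rms / 0.18) ** 2 * (1.0 - 4.0 * (v - 0.5) ** 2)

v = np.linspace(0, 1, 6)
for w in (0.00, 0.07, 0.12, 0.18):
    print(w, np.round(diffraction_mtf_norm(v) * atf(v, w), 3))
```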


Detector Aperture

A major contribution to the total camera system PSF and MTF is due to the photosites not being point objects. Irradiance is mixed (averaged) over the photosensitive flux detection area A_det of each photosite, referred to as the detector aperture. The blur increases as A_det increases. The detector-aperture PSF for a rectangular A_det is

$$h_{\mathrm{det}}(x, y) = \frac{1}{A_{\mathrm{det}}}\, \mathrm{rect}\!\left(\frac{x}{d_x}\right) \mathrm{rect}\!\left(\frac{y}{d_y}\right), \quad \text{where } A_{\mathrm{det}} = d_x\, d_y$$

The 2D rectangle function takes the value 1 when |x| ≤ dx/2 and |y| ≤ dy/2, and 0 otherwise. The optical low-pass filter PSF (p. 118), which splits each image point into four displaced points, has the form

$$h_{\mathrm{OLPF}}(x, y) = \frac{1}{4}\left[\delta(x - a_0)\,\delta(y - b_0) + \delta(x - a_1)\,\delta(y - b_1) + \delta(x - a_2)\,\delta(y - b_2) + \delta(x - a_3)\,\delta(y - b_3)\right]$$
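The MTF of a rectangular detector aperture is the modulus of a product of sinc functions (the FT of the rect functions above); this sketch evaluates it for an assumed photosite width:

```python
import numpy as np

def detector_aperture_mtf(mu_x, mu_y, dx_mm, dy_mm):
    """MTF of a rectangular detector aperture of dimensions dx_mm x dy_mm.
    np.sinc(u) = sin(pi*u)/(pi*u), so the first zero is at mu = 1/dx."""
    return np.abs(np.sinc(dx_mm * mu_x) * np.sinc(dy_mm * mu_y))

d = 0.004                                    # assumed 4-um photosite width, in mm
mu = np.array([0, 50, 100, 125, 250])        # cycles/mm along mu_x
print(detector_aperture_mtf(mu, 0.0, d, d))  # first zero at 1/d = 250 cycles/mm
```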